Rat sensitivity to multipoint statistics is predicted by efficient coding of natural scenes

  1. Riccardo Caramellino
  2. Eugenio Piasini
  3. Andrea Buccellato
  4. Anna Carboncino
  5. Vijay Balasubramanian (corresponding author)
  6. Davide Zoccolan (corresponding author)
  1. Visual Neuroscience Lab, International School for Advanced Studies, Italy
  2. Computational Neuroscience Initiative, University of Pennsylvania, United States

Abstract

Efficient processing of sensory data requires adapting the neuronal encoding strategy to the statistics of natural stimuli. Previously, in Hermundstad et al., 2014, we showed that local multipoint correlation patterns that are most variable in natural images are also the most perceptually salient for human observers, in a way that is compatible with the efficient coding principle. Understanding the neuronal mechanisms underlying such adaptation to image statistics will require performing invasive experiments that are impossible in humans. Therefore, it is important to understand whether a similar phenomenon can be detected in animal species that allow for powerful experimental manipulations, such as rodents. Here we selected four image statistics (from single- to four-point correlations) and trained four groups of rats to discriminate between white noise patterns and binary textures containing variable intensity levels of one of such statistics. We interpreted the resulting psychometric data with an ideal observer model, finding a sharp decrease in sensitivity from two- to four-point correlations and a further decrease from four- to three-point. This ranking fully reproduces the trend we previously observed in humans, thus extending a direct demonstration of efficient coding to a species where neuronal and developmental processes can be interrogated and causally manipulated.

Editor's evaluation

This work will be of interest to neuroscientists who want to understand how visual systems are tuned to and encode natural scenes. It reports that rats share phenomenology with humans in sensitivity to spatial correlations in scenes. This shows that an earlier paper's hypothesis about efficient coding may be more broadly applicable. This work also opens up the possibility of studying this kind of visual tuning in an animal where invasive techniques can be used to study the neural origins of this sensitivity and its development.

https://doi.org/10.7554/eLife.72081.sa0

Introduction

It is widely believed that the tuning of sensory neurons is adapted to the statistical structure of the signals they must encode (Sterling and Laughlin, 2015). This normative principle, known as efficient coding, has been successful in explaining many aspects of neural processing in vision (Atick and Redlich, 1990; Fairhall et al., 2001; Laughlin, 1981; Olshausen and Field, 1996; Pitkow and Meister, 2012), audition (Carlson et al., 2012; Smith and Lewicki, 2006) and olfaction (Teşileanu et al., 2019), including adaptation (Młynarski and Hermundstad, 2021) and gain control (Schwartz and Simoncelli, 2001). In Hermundstad et al., 2014, we reported that human sensitivity to visual textures defined by local multipoint correlations depends on the variability of such correlations across natural scenes. This allocation of resources to features that are the most variable in the environment, and thus more informative about its state, is accounted for by efficient coding, demonstrating its role as an organizing principle also at the perceptual level (Hermundstad et al., 2014; Tesileanu et al., 2020; Tkacik et al., 2010). However, it remains unknown whether this preferential encoding of texture statistics that are the most variable across natural images is a general principle underlying visual perceptual sensitivity across species. Although some evidence exists for differential neural encoding of multipoint correlations in macaque V2 (Yu et al., 2015) and V1 (Purpura et al., 1994), the sensitivity ranking we previously reported in Hermundstad et al., 2014 has not been investigated in any species other than humans (Hermundstad et al., 2014; Tesileanu et al., 2020; Tkacik et al., 2010; Victor and Conte, 2012). Moreover, while monkeys are standard models of advanced visual processing (DiCarlo et al., 2012; Kourtzi and Connor, 2011; Lehky and Tanaka, 2016; Nassi and Callaway, 2009; Orban, 2008), they are less amenable than rodents to causal manipulations (e.g. 
optogenetic or controlled rearing) to interrogate how neural circuits may adapt to natural image statistics. On the other hand, rodents have emerged as powerful model systems to study visual functions during the last decade (Glickfeld et al., 2014; Glickfeld and Olsen, 2017; Huberman and Niell, 2011; Katzner and Weigelt, 2013; Niell and Scanziani, 2021; Reinagel, 2015; Zoccolan, 2015). Rats, in particular, are able to employ complex shape processing strategies at the perceptual level (Alemi-Neissi et al., 2013; De Keyser et al., 2015; Djurdjevic et al., 2018; Vermaercke and Op de Beeck, 2012), and rat lateral extrastriate cortex shares many defining features with the primate ventral stream (Kaliukhovich and Op de Beeck, 2018; Matteucci et al., 2019; Piasini et al., 2021; Tafazoli et al., 2017; Vermaercke et al., 2014; Vinken et al., 2017). More importantly, it was recently shown that rearing newborn rats in controlled visual environments allows causally testing long-standing hypotheses about the dependence of visual cortical development on natural scene statistics (Matteucci and Zoccolan, 2020). Establishing the existence of a preferential encoding of less predictable statistics in rodents is therefore crucial to understand the neural substrates of efficient coding and its relationship with postnatal visual experience.

Results

To address this question, we measured rat sensitivity to visual textures defined by local multipoint correlations, training the animals to discriminate binary textures containing structured noise from textures made of white noise (Figure 1A). The latter were generated by independently setting each pixel to black or white with equal probability, resulting in no spatial correlations. Structured textures, on the other hand, were designed to enable precise control over the type and intensity of the correlations they contained. To generate these textures we built and published a software library (Piasini, 2021) that implements the method developed in Victor and Conte, 2012. Briefly, for any given type of multipoint correlation (also termed a statistic in what follows), we sampled from the distribution over binary textures that had the desired probability of occurrence of that statistic, but otherwise contained the least amount of structure (i.e. had maximum entropy). The probability of occurrence of the pattern was parametrized by the intensity of the corresponding statistic, determined by a parity count of white or black pixels inside tiles of 1, 2, 3, or 4 pixels (termed gliders) used as the building blocks of the texture (Victor and Conte, 2012). When the intensity is zero, the texture does not contain any structure–it is the same as white noise (Figure 1A, left). When the intensity is +1, every possible placement of the glider across the texture contains an even number of white pixels, while a level of –1 corresponds to all placements containing an odd number of white pixels. Intermediate intensity levels correspond to intermediate fractions of gliders containing the even parity count. The structure of the glider and the sign of the intensity level dictate the appearance of the final texture. 
For instance (see examples in Figure 1A, right), for positive intensity levels, a one-point glider produces textures with increasingly high luminance, a two-point glider produces oriented edges and a four-point glider produces rectangular blocks. A three-point glider produces L-shaped patterns, either black or white depending on whether the intensity is negative or positive.
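The parity-count construction described above can be sketched in code. The published library (Piasini, 2021) is the actual implementation used in the study; the snippet below is only an illustrative re-derivation for the horizontal two-point and the four-point gliders, in which each glider placement is given even parity with probability (1 + s)/2 (the function names and this NumPy formulation are our own assumptions, not the library's API):

```python
import numpy as np

def two_point_texture(rows, cols, intensity, rng=None):
    """Maximum-entropy binary texture with the horizontal two-point statistic.

    For intensity s, each horizontal pixel pair has even parity (same color)
    with probability (1 + s) / 2. Pixels are 0 (black) or 1 (white).
    """
    rng = np.random.default_rng(rng)
    tex = np.empty((rows, cols), dtype=int)
    tex[:, 0] = rng.integers(0, 2, size=rows)          # first column: fair coin
    same = rng.random((rows, cols - 1)) < (1 + intensity) / 2
    for j in range(1, cols):
        # copy the left neighbour where 'same' holds, flip it otherwise
        tex[:, j] = np.where(same[:, j - 1], tex[:, j - 1], 1 - tex[:, j - 1])
    return tex

def four_point_texture(rows, cols, intensity, rng=None):
    """Maximum-entropy binary texture with the four-point (2x2 glider) statistic.

    The first row and column are white noise; each remaining pixel is chosen
    so that the 2x2 glider ending at it has even parity with prob. (1 + s) / 2.
    """
    rng = np.random.default_rng(rng)
    tex = rng.integers(0, 2, size=(rows, cols))
    even = rng.random((rows, cols)) < (1 + intensity) / 2
    for i in range(1, rows):
        for j in range(1, cols):
            parity = (tex[i - 1, j - 1] + tex[i - 1, j] + tex[i, j - 1]) % 2
            tex[i, j] = parity if even[i, j] else 1 - parity
    return tex
```

At intensity 0 both functions reduce to white noise, while at intensity 1 every glider placement has even parity; the intensity realized in a given sample can be checked by counting even-parity glider placements and computing 2 × (fraction even) − 1.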

Figure 1 with 1 supplement
Visual stimuli and behavioral task.

(A) Schematic of the four kinds of texture discrimination tasks administered to the four groups of rats in our study. Each group had to discriminate unstructured binary textures containing white noise (example on the left) from structured binary textures containing specific types of local multipoint correlations among nearby pixels (i.e. 1-, 2-, 3-, or 4-point correlations; examples on the right). The textures were constructed to be as random as possible (maximum entropy), under the constraint that the strength of a given type of correlation matched a desired level. The strength of a correlation pattern was quantified by the value (intensity) of a corresponding statistic (see main text), which could range from 0 (white noise) to 1 (maximum possible amount of correlation). The examples shown here correspond to intensities of 0.85 (one- and two-point statistics) and 0.95 (three- and four-point statistics). (B) Schematic representation of a behavioral trial. Left and center: animals initiated the presentation of a stimulus by licking the central response port placed in front of them. This prompted the presentation of either a structured (top) or an unstructured (bottom) texture. Right: in order to receive the reward, animals had to lick either the left or right response port to report whether the stimulus contained the statistic (top) or the noise (bottom). Figure 1—figure supplement 1 shows the performances attained by four example rats (one per group) during the initial phase of the training (when the animals were required to discriminate the stimuli shown in A), as well as the progressively lower statistic intensity levels that these rats learned to discriminate from white noise during the second phase of the experiment.

Notably, two-point and three-point gliders are associated with multiple distinct multipoint correlations, corresponding to different spatial glider configurations. For instance, two-point correlations can arise from horizontal (-), vertical (|) or oblique gliders (/, \), while three-point correlations can give rise to L-shaped patterns with four different orientations. In our previous study with human participants (Hermundstad et al., 2014), we tested all these two-point, three-point, and four-point configurations, as well as 11 of their pairwise combinations, for a total of 20 different texture statistics. In that set of experiments, we did not test textures defined by one-point correlations because, by construction, the method we used to measure the variability of texture statistics across natural images could not be applied to the one-point statistic. In our current study, practical and ethical constraints prevented us from measuring rat sensitivity to a large number of statistic combinations, because a different group of animals had to be trained with each tested statistic (see below), meaning that the number of rats required for the experiments increased rapidly with the number of statistics studied. Therefore, we chose to test the four-point statistic, as well as one each of the two-point and three-point statistics (those shown in Figure 1A). The three-point glider was randomly selected among the four available orientations, since in our previous study no difference was found among the variability of distinct three-point textures across natural images, and aggregate human sensitivity to three-point correlations was measured without distinguishing among glider configurations.
As for the two-point statistic, we selected the horizontal glider, one of the two that yielded the largest sensitivity in humans, so as to include in our stimulus set at least one instance of both the most discriminable (two-point horizontal) and least discriminable (three-point) textures. In addition, we also tested the one-point statistic because, given the well-established sensitivity of the rat visual system to luminance changes (Minini and Jeffery, 2006; Tafazoli et al., 2017; Vascon et al., 2019; Vermaercke and Op de Beeck, 2012), performance with this statistic served as a useful benchmark against which to compare rat discrimination of the other, more complex textures. Finally, while in Hermundstad et al., 2014, both positive and negative values of the statistics were probed against white noise, here we tested only one side of the texture intensity axis (either positive, for one-, two-, and four-point configurations, or negative, for three-point ones) — again, with the goal of limiting the number of rats used in the experiment (see Materials and methods for more details on the rationale behind the choice of statistics and their polarity, and see Discussion for an assessment of the possible impact of these choices on our conclusions).

For each of the four selected image statistics, we trained a group of rats to discriminate between white noise and structured textures containing that statistic with nonzero intensity (Figure 1A). Each trial of the experiment started with the rat autonomously triggering the presentation of a stimulus by licking the central response port within an array of three (Figure 1B). The animal then reported whether the texture displayed over the monitor placed in front of it contained the statistic (by licking the left port) or white noise (by licking the right port). The rat received liquid reward for correct choices and was subjected to a time-out period for incorrect ones (Figure 1B). In the initial phase of the experiment, the intensity of the statistic was set to a single level, close to the maximum (or minimum, in case of the three-point statistic, for which we used only negative values), to make the discrimination between structured textures and white noise as easy as possible for naive rats that had to learn the task from scratch. The learning curves of four example rats, one per group, are shown in Figure 1—figure supplement 1A. In the following phase of the experiment, the intensity of the statistic was gradually reduced using an adaptive staircase procedure (see Materials and methods) to make the task progressively harder. The asymptotic levels of the statistics reached across consecutive training sessions by four example rats, one per group, are shown in Figure 1—figure supplement 1B. Following this training, rats were subjected to: (1) a main testing phase, where textures were sampled at regular intervals along the intensity level axis and were randomly presented to the animals; and (2) a further testing phase, where rats originally trained with a given statistic were probed with a different one (see Materials and methods for details on training and testing).
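The exact staircase rule is specified in Materials and methods; purely as a generic illustration of how an adaptive procedure of this kind makes the task progressively harder while converging near a fixed-accuracy threshold, here is a simple n-down/1-up update (the function name, step size, and parameters are assumptions for illustration, not the study's actual settings):

```python
def staircase_update(intensity, correct_streak, was_correct,
                     step=0.05, n_down=2, floor=0.0, ceil=1.0):
    """One update of a generic n-down / 1-up adaptive staircase.

    The statistic intensity is lowered after `n_down` consecutive correct
    trials (making the discrimination harder) and raised after any error
    (making it easier again). Returns the new intensity and streak count.
    """
    if was_correct:
        correct_streak += 1
        if correct_streak >= n_down:
            intensity = max(floor, intensity - step)   # harder
            correct_streak = 0
    else:
        intensity = min(ceil, intensity + step)        # easier
        correct_streak = 0
    return intensity, correct_streak
```

Run over many trials, such a rule oscillates around the intensity at which the animal's accuracy matches the level implied by the up/down ratio.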

The main test phase yielded psychometric curves showing the sensitivity of each animal in discriminating white noise from the structured texture with the assigned statistic (example in Figure 2A, black dots). To interpret these results, we developed an ideal observer model, in which the presentation of a texture with a level of the statistic equal to s produces a percept x sampled from a truncated Gaussian distribution centered on the actual value of the statistic (s) with a fixed standard deviation σ (Fleming et al., 2013; Geisler, 2011). Here, σ measures the ‘blurriness’ in the animal's sensory representation for a particular type of statistic (i.e. the perceptual noise) and, consequently, its inverse 1/σ captures its resolution, or sensitivity: the larger 1/σ, the lower the perceptual threshold for discriminating a structured texture from white noise. As detailed in the Materials and methods, our ideal observer model yields the psychometric function giving the probability of responding ‘noise’ at any given level of the statistic s as

$$p(\text{report noise} \mid s) = \frac{\Phi\left(\frac{x^{*}(\alpha,\sigma)-s}{\sigma}\right) - \Phi\left(\frac{-1-s}{\sigma}\right)}{\Phi\left(\frac{1-s}{\sigma}\right) - \Phi\left(\frac{-1-s}{\sigma}\right)}$$
Figure 2 with 1 supplement
Rat sensitivity to multipoint correlations.

(A) Psychometric data for an example rat trained on two-point correlations. Black dots: fraction of trials in which a texture with the corresponding intensity of the statistic was correctly classified as ‘structured’. Empty black circle: fraction of trials in which the rat judged a white noise texture as containing the statistic. Blue line: psychometric function corresponding to the fitted ideal observer model (see main text). (B) Psychometric functions obtained for all the rats tested on the four statistics (n indicates the number of animals in each group). (C) Values of the perceptual sensitivity 1/σ to each of the four statistics. Filled dots: individual rat estimates. Empty diamonds: group averages. The dashed line emphasizes the sensitivity ranking observed for the four statistics. Significance markers ** and *** indicate, respectively, p < 0.01 and p < 0.001 for a two-sample t-test with Holm-Bonferroni correction. The same analysis was repeated in Figure 2—figure supplement 1 including only the rats that reached a certain performance criterion during the initial training.

where Φ(x) is the standard Normal cumulative density function, α captures the animal’s prior choice bias and x*(α,σ) is the decision boundary used by the animal to divide the perceptual axis into ‘noise’ and ‘structured texture’ regions. The two free parameters of the model (α and σ) parameterize the psychometric function (example in Figure 2A, blue curve) and can be estimated from behavioral data by maximum likelihood. Prior bias (α) and sensitivity (1/σ) are related, respectively, to the horizontal offset and slope of the curve.
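The psychometric function above can be evaluated directly. Below is a minimal sketch using SciPy; for simplicity the decision boundary x_star is passed in as a precomputed number, whereas in the full model it is derived from the bias α together with σ:

```python
import numpy as np
from scipy.stats import norm

def p_report_noise(s, x_star, sigma):
    """Ideal-observer probability of reporting 'noise' at statistic level s.

    The percept is Gaussian around s with s.d. sigma, truncated to [-1, 1];
    the observer reports 'noise' whenever the percept falls below the
    decision boundary x_star.
    """
    lo = norm.cdf((-1 - s) / sigma)           # mass below the lower truncation
    hi = norm.cdf((1 - s) / sigma)            # mass below the upper truncation
    return (norm.cdf((x_star - s) / sigma) - lo) / (hi - lo)
```

Given per-trial stimulus levels and binary choices, the two free parameters can then be estimated by maximizing the corresponding binomial log-likelihood, as described in the text.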

Fitting this model to the behavioral choices of rats in the four groups led to psychometric functions with a characteristic shape, which depended on the order of the multipoint statistic an animal had to discriminate (Figure 2B). In particular, the sensitivity 1/σ followed a specific ranking among the groups (Figure 2C), being higher for one- and two-point than for three-point (p1 < 0.001 and p2 < 0.001, two-sample t-test with Holm-Bonferroni correction) and four-point (p1 < 0.001, p2 < 0.001) correlations, and larger for four-point than three-point correlations (p < 0.01). When focusing on the texture statistics that had also been tested in our previous study (i.e. two-point horizontal, three-point, and four-point correlations), this sensitivity ranking was the same as the one observed in humans and as the variability ranking measured across natural images (Hermundstad et al., 2014): two-point horizontal > four-point > three-point. Moreover, for the set of statistics that were studied both here and in Hermundstad et al., 2014, the actual values of the rat sensitivity matched, up to a scaling factor, both the human sensitivity and the standard deviation of the statistics in natural images (Figure 3). This match was quantified with the ‘degree of correspondence’, defined in Hermundstad et al., 2014, which takes on values between 0 and 1, with one indicating perfect quantitative match (see Materials and methods for details). The degree of correspondence was 0.986 between rat sensitivity and image statistics (p-value: 0.07, Monte Carlo test), and 0.990 between rat sensitivity and human sensitivity (p-value: 0.05, Monte Carlo test). For reference, Hermundstad et al., 2014 reported values between 0.987 and 0.999 for the degree of correspondence between human sensitivity and image statistics. This indicates not only a qualitative but also a quantitative agreement between our findings and the pattern of texture sensitivity predicted by efficient coding.
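The Holm-Bonferroni procedure used for these pairwise comparisons is a step-down rule: the m p-values are sorted, the i-th smallest is compared against α/(m − i + 1), and testing stops at the first failure. A minimal sketch (returning rejection decisions at a given family-wise α):

```python
def holm_bonferroni(pvals, alpha=0.05):
    """Holm-Bonferroni step-down correction for multiple comparisons.

    Sorts the p-values, compares the smallest against alpha/m, the next
    against alpha/(m-1), and so on; stops rejecting at the first failure.
    Returns a list of booleans (True = null hypothesis rejected), in the
    original order of `pvals`.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if pvals[i] <= alpha / (m - rank):   # rank 0 -> alpha/m, etc.
            reject[i] = True
        else:
            break                            # step-down: stop at first failure
    return reject
```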

Quantitative match between rat sensitivities, human sensitivities and texture variability across natural images.

For the three texture statistics that were tested both in our current study with rats and in our earlier study with humans (Hermundstad et al., 2014), rat average sensitivities are compared to human average sensitivities and to the variability of these statistics across natural scenes (data from Hermundstad et al., 2014). The three sets of data points have been scaled so that each triplet of sensitivity values has Euclidean norm = 1, to allow an easier qualitative comparison (for a quantitative comparison see main text).
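The precise definition of the degree of correspondence is given in Hermundstad et al., 2014. Purely as an illustration of the general idea of comparing unit-norm sensitivity vectors, the sketch below uses a normalized inner product and a Monte Carlo null built from random positive-valued vectors; both choices are our own assumptions for illustration, not the published definition:

```python
import numpy as np

def correspondence(u, v):
    """Cosine similarity between two sensitivity vectors: 1 means the
    vectors are proportional (a stand-in for the 'degree of correspondence'
    of Hermundstad et al., 2014, whose exact definition is given there)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def monte_carlo_p(u, v, n=100_000, seed=0):
    """Fraction of random positive unit vectors that match v at least as
    well as u does -- an assumed form of the Monte Carlo null."""
    rng = np.random.default_rng(seed)
    obs = correspondence(u, v)
    rand = np.abs(rng.standard_normal((n, len(u))))    # random positive directions
    v_unit = np.asarray(v, dtype=float)
    v_unit = v_unit / np.linalg.norm(v_unit)
    sims = rand @ v_unit / np.linalg.norm(rand, axis=1)
    return float(np.mean(sims >= obs))
```

A low p-value under this null indicates that two triplets are more closely aligned than random sensitivity profiles would be.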

To further validate these findings, we performed additional within-group and within-subject comparisons. To this end, each group of animals was either tested with a new statistic or was split into two subgroups, each tested with a different statistic. Results of these additional experiments are reported in Figure 4, comparing the sensitivity to the new statistic(s) with the sensitivity to the originally learned statistic (colored symbols without and with halo, respectively) for each group/subgroup. Rats trained on one- and two-point statistics (the most discriminable ones; see Figure 2C) performed poorly with higher-order correlations (compare the green and purple star with the red star, and the green and purple cross with the blue cross in Figure 4), while animals trained on the four-point statistic performed as well on two-point correlations as rats that were originally trained on those textures (compare the blue square to the blue cross). This shows that the better discriminability of textures containing lower-order correlations is a robust phenomenon, which is independent of the history of training and observable within individual subjects. Moreover, performance on four-point correlations was higher than performance on three-point correlations for each group of rats (compare the green to the purple symbols connected by a line). This was true, in particular, not only for rats trained on four-point and switching to three-point (green vs. purple square, p < 0.01, paired one-tailed t-test) but even for rats trained on three-point and switching to four-point (green vs. purple triangle, p < 0.05, paired one-tailed t-test). This means that the larger discriminability of the four-point statistic, as compared to the three-point one, is a statistically robust phenomenon within individual subjects.

Rat sensitivity to multipoint correlations – dependence on training history and within-subject analysis.

Colored points with halo show the average sensitivities (with SEM) of the four groups of rats to the statistics (indicated by the symbols in the key) they were originally trained on (i.e. same data as the colored diamonds in Figure 2C). The other colored symbols connected by a line show the average sensitivities (with SEM) obtained when subgroups of rats originally trained on a given statistic (as indicated by the symbol in the key) were tested with different statistics (as indicated in abscissa). Specifically: (1) out of the nine rats originally trained/tested with one-point correlations (star), four were tested with three-point (purple star) and four with four-point (green star) correlations (one rat did not reach this test phase); (2) out of the nine rats originally trained/tested with two-point correlations (cross), five were tested with three-point (purple cross) and four with four-point (green cross) correlations; (3) out of the eight rats originally trained/tested with three-point correlations (triangle), seven were tested with four-point (green triangle) correlations (one rat did not reach this test phase); and (4) out of the eleven rats originally trained/tested with four-point correlations (square), eight were tested with three-point (purple square) and three with two-point correlations (blue square). Sensitivities achieved by individual animals are represented as shaded data points with the corresponding symbol/color combination.

Discussion

Overall, our results show that rat sensitivity to multipoint statistics is similar to the one we previously observed in humans and to the variability of multipoint correlations we previously measured across natural images (Hermundstad et al., 2014; Tkacik et al., 2010). This agreement holds both qualitatively and quantitatively (Figures 2–4). Importantly, we found the expected sensitivity ranking (two-point horizontal > four-point > three-point) to be robust not only across groups (Figure 2C) but also for animals that were sequentially tested with multiple texture statistics (Figure 4), and even at the within-subject level for the crucial three-point vs. four-point comparison. Moreover, we found a high degree of correspondence between rat and human sensitivities (Figure 3).

A potential limitation of our study is related to our stimulus choices, both in terms of selected texture statistics and polarity (i.e. negative vs. positive intensity). A first possible issue is whether the three texture statistics that were tested in both the present study and in Hermundstad et al., 2014 are sufficient to allow a meaningful comparison between rat and human sensitivities, as well as rat sensitivity and texture variability in natural scenes. We addressed this matter at the level of experimental design, by carefully choosing the three statistics that, based on the sensitivity ranking observed in humans, would have yielded the cleanest signature of efficient coding (Hermundstad et al., 2014). That is, we selected two statistics that were, respectively, maximally and minimally variable across natural images, and yielded the largest and lowest sensitivities in humans: horizontal two-point correlations and one of the three-point correlations. The four-point correlation was then a natural choice as the third statistic, as it was the only one characterized by a differently shaped glider. Additionally, human sensitivity to this statistic, as well as its variability across natural images, is only slightly larger than for the three-point configurations. Therefore, finding a reliable sensitivity difference between three-point and four-point textures also for rats would have provided strong evidence for matching texture sensitivity across the two species. Due to the experimental limitations discussed in the Results and the Materials and methods sections, we were unable to analyze one of the oblique two-point statistics, for which human sensitivity takes on an intermediate value between the two-point horizontal and three-point correlations, and that in humans allows one to differentiate between the predictions of efficient coding and those stemming from an oblique effect for patterns that are rotated versions of each other (Hermundstad et al., 2014).

The second potential limitation is related to the choice of polarity (positive or negative intensity values for the examined statistics). This choice was guided by different considerations depending on the kind of statistic. For one-point correlations we chose positive intensity values because they yield patterns that are brighter than white noise. Since previous work from our group has shown that rat V1 neurons are very sensitive to increases of luminance (Tafazoli et al., 2017; Vascon et al., 2019), our choice ensured that one-point textures were highly distinguishable from white noise (as indeed observed in our data; see Figure 2B–C), which was the key requirement for our benchmark statistic. This enabled us to guard against issues in our task design: if the animals had failed to discriminate one-point textures, this would have suggested an overall inadequacy of the behavioral task rather than a lack of perceptual sensitivity to luminance changes. For two-point and four-point statistics we also used positive intensity values — a choice dictated by the need to test a rodent species that has much lower visual acuity than humans (Keller et al., 2000; Prusky et al., 2002; Zoccolan, 2015). Positive two-point and four-point correlations give rise to large features (thick oriented stripes and wide rectangular blocks made of multiple pixels with the same color), while negative intensities produce higher spatial frequency patterns, where color may change every other pixel (see Figure 2A in Hermundstad et al., 2014). Therefore, using negative two-point and four-point statistics would have introduced a possible confound, since low sensitivity to these textures could have been simply due to the low spatial resolution of rat vision. For three-point correlations, polarity does not affect the shape and size of the emerging visual patterns, but it determines their contrast. Positive and negative intensities yield L-shaped patches that are, respectively, white and black.
In this case, we chose the latter to make sure that the well-known dominance of OFF responses observed across the visual systems of many mammalian species would not play in favor of finding the lowest sensitivity for the three-point statistic. In fact, several studies have shown that primary visual neurons of primates and cats respond more strongly to black than to white spots and oriented bars (Liu and Yao, 2014; Xing et al., 2010; Yeh et al., 2009). A recent study has shown that this is the case also for the central visual field of mice, although in the periphery OFF and ON responses are more balanced (Williams et al., 2021). Indeed, the asymmetry begins already in the retina, where there are more OFF cells than ON cells (Ratliff et al., 2010). Since in our behavioral rigs rats directly face the stimulus display (Figure 1B) and keep their head oriented frontally during stimulus presentation (Vanzella et al., 2019), it was important that the L-shaped patterns produced by three-point correlations had the highest saliency. Choosing negative intensity values ensured that this was the case, thus excluding the possibility that the low sensitivity found for three-point textures (Figures 2–4) was partially due to presentation at a suboptimal contrast. Notwithstanding these considerations, one could wonder whether probing also the polarities opposite to those tested in our study would be desirable for a tighter test of the efficient coding principle. Previous studies, however, found human sensitivity to be nearly identical for negative and positive intensity variations of each of the statistics tested in our study: one-point, two-point, three-point, and four-point correlations (Victor and Conte, 2012), even in the face of asymmetries in the distribution of the corresponding statistics in natural images (see Figure 3—figure supplement 9 in Hermundstad et al., 2014).
In the present work, we have accordingly decided to focus the available resources on the differences between different statistics, rather than between positive and negative intensities of the same statistic.

In summary, our choices of texture types and their polarity were all dictated by the need to adapt, for use with a rodent species, texture stimuli that so far had only been used in psychophysics studies with humans (Hermundstad et al., 2014; Tesileanu et al., 2020; Tkacik et al., 2010; Victor and Conte, 2012) and neurophysiology studies in monkeys (Purpura et al., 1994; Yu et al., 2015). Our goal was to maximize the sensitivity of the comparison with humans and natural image statistics, while reducing the possible impact of phenomena (such as the low visual acuity of rats and the dominance of OFF responses) that could have acted as confounding factors. Thanks to these measures, our findings provide a robust demonstration that a rodent species and humans are similarly adapted to process the statistical structure of visual textures, in a way that is consistent with the computational principle of efficient coding. This attests to the fundamental role of natural image statistics in shaping visual processing across species, and opens a path toward a causal test of efficient coding through the altered-rearing experiments that small mammals, such as rodents, allow (Hunt et al., 2013; Matteucci and Zoccolan, 2020; White and Fitzpatrick, 2007).

Materials and methods

Psychophysics experiments

Subjects

A total of 42 male adult Long Evans rats (Charles River Laboratories) were tested in a visual texture discrimination task. Animals started the training at 10 weeks of age, after 1 week of quarantine upon arrival at our institute and 2 weeks of handling to familiarize them with the experimenters. Their weight at arrival was approximately 300 g, and they grew to over 600 g over the time span of the experiment. Rats always had free access to food, but their access to water was restricted on behavioral training days (5 days a week). They received 10–20 ml of diluted pear juice (1:4) during the execution of the discrimination task, after which they were also given free access to water for the time needed to reach at least the recommended 50 ml/kg daily intake.

The number of rats was chosen to yield meaningful statistical analyses (i.e. about 10 subjects for each of the texture statistics tested in our study), under the capacity constraint of our behavioral rig, which allows six rats to be tested simultaneously over the course of 1–1.5 hr (Zoccolan, 2015; Djurdjevic et al., 2018). Given the need to test four different texture statistics, we started with a first batch of 24 animals (i.e. six per statistic), which required about 6 hr of training per day. This first batch was complemented with a second one of 18 more rats, again divided among the four statistics (see below for details), so as to reach the planned number of about 10 animals per texture type. The first batch arrived in November 2018 and was tested throughout most of 2019; the second batch arrived in September 2019 and was tested throughout most of 2020. In the first batch, four animals did not reach the test phase (i.e. the phase yielding the data shown in Figure 2A and B), because three of them did not achieve the criterion performance during the initial training phase (see below) and one died shortly after the beginning of the study. In the second batch, one rat died before reaching the test phase and two more died before the last test phase with switched statistics (i.e. the phase yielding the data of Figure 2C).

All animal procedures were conducted in accordance with the international and institutional standards for the care and use of animals in research and were approved by the Italian Ministry of Health and after consulting with a veterinarian (Project DGSAF 25271, submitted on December 1, 2014 and approved on September 4, 2015, approval 940/2015-PR).

Experimental setup


Rats were trained in a behavioral rig consisting of two racks, each equipped with three operant boxes (a picture of the rig and a schematic of the operant box can be found in previous studies [Zoccolan, 2015; Djurdjevic et al., 2018]). Each box was equipped with a 21.5” LCD monitor (ASUS VEZZHR) for the presentation of the visual stimuli and an array of three stainless-steel feeding needles (Cadence Science), serving as response ports. Each needle was connected to an LED-photodiode pair to detect when the nose of the animal approached and touched it (a Phidgets 1203 input/output device was used to collect the signals of the photodiodes). The two lateral feeding needles were also connected to computer-controlled syringe pumps (New Era Pump System NE-500) for delivery of the liquid reward. In each box, one of the walls bore a 4.5 cm-diameter viewing hole, through which a rat could extend its head outside the box, face the stimulus display (located 30 cm from the hole), and reach the array of response ports.

Choice of image statistics to be used in the experiment


As mentioned in the main text, in our experiment we studied the one-point and four-point statistics, as well as one of the two-point and one of the three-point statistics. In the nomenclature introduced by Victor and Conte, 2012, these are, respectively, the γ, α, β, and θ statistics. By comparison, in humans, Victor and Conte, 2012 studied a total of five statistics (the same we tested, plus the diagonal two-point statistic β∕), while Hermundstad et al., 2014 tested many more, including combinations of statistic pairs, although they did not investigate γ. Our choice of which statistics to test was constrained on practical and ethical grounds by the need to use the minimum possible number of animals in our experiments, which led us to study one representative statistic per glider order. We also note that we decided to test the γ statistic, even though this was omitted by Hermundstad et al., 2014 (as explained in that paper, the method used to assess the variability of all other multipoint correlation patterns in natural images cannot be applied to γ by construction, because the binarization threshold used for the images is such that γ=0 for all images in the dataset). The reason for including γ was that it provided a useful control on the effectiveness of our experimental design, as (unlike for the other visual patterns) we expected rats to be able to easily discriminate stimuli differing in average luminosity (Minini and Jeffery, 2006; Tafazoli et al., 2017; Vascon et al., 2019; Vermaercke and Op de Beeck, 2012). As mentioned in the Discussion, failure of the rats to discriminate one-point textures would have indicated a likely issue in the design of the task.

Human sensitivity to multipoint correlation patterns does not distinguish between positive and negative values of the statistics (Victor and Conte, 2012). Therefore, again in order to minimize the number of animals necessary for the experiment, we only collected data for positive values of the γ, β, and α statistics, and negative values of the θ statistic (see below for the specific values used). Unlike the two- or four-point statistics, the θ statistic changes contrast under a sign change (namely, positive θ values correspond to white triangular patterns on a black background, and negative θ values to black triangular patterns on a white background). On the other hand, dominance of OFF responses (elicited by dark spots on a light background) has been reported in mammals, including primates, cats, and rodents (Ratliff et al., 2010; Liu and Yao, 2014; Xing et al., 2010; Yeh et al., 2009; Williams et al., 2021). Therefore, we reasoned that if rats, unlike humans, were to have different sensitivities to positive and negative θ values, the sensitivity to negative θ would be the higher of the two.

Finally, for the sake of simplicity, whenever in the text we refer to the ‘intensity’ of a statistic, this should be interpreted as the absolute value of the intensity as defined by Victor and Conte, 2012. This has no effect when describing γ, β, or α statistics, and only means that any value reported for θ should be taken with a sign flip (i.e. negative instead of positive values) if trying to connect formally to the system of coordinates in Victor and Conte, 2012.

Visual stimuli


Maximum-entropy textures were generated using the methods described by Victor and Conte, 2012. To this end, we implemented a standalone library and software package that we have since made publicly available as free software (Piasini, 2021). In the experiment, we used white noise textures as well as textures with positive levels of four different multipoint statistics, as described above (see also Figure 1A). It should be noted that, with the exception of the extreme value of the γ statistic (γ=1 corresponds to a fully white image), the intensity level of a given statistic does not deterministically specify the resulting texture image. In our experiment, for any intensity level of each statistic, multiple random instances of the textures were generated for presentation to the rats during the discrimination task (see below for more details).
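For concreteness, the glider-based construction of Victor and Conte, 2012 can be sketched for the four-point (α) statistic: seed the texture with coin flips, then fill in each remaining pixel so that the product over every 2 × 2 glider equals +1 with probability (1 + α)/2. This is an illustrative sketch, not the actual library (Piasini, 2021); the function name and the ±1 pixel convention are our own.

```python
import numpy as np

def four_point_texture(rows, cols, alpha, rng=None):
    """Sketch of a maximum-entropy binary texture with four-point statistic
    alpha: the product over every 2x2 glider equals +1 with probability
    (1 + alpha) / 2, so its expected value is alpha. Pixels are -1 (black)
    and +1 (white)."""
    rng = np.random.default_rng() if rng is None else rng
    t = rng.choice([-1, 1], size=(rows, cols))  # seed everything at random
    for i in range(1, rows):
        for j in range(1, cols):
            # draw the sign of the 2x2 glider product with bias alpha
            parity = 1 if rng.random() < (1 + alpha) / 2 else -1
            t[i, j] = parity * t[i - 1, j] * t[i, j - 1] * t[i - 1, j - 1]
    return t

# a 22 x 39 texture at the high intensity level used in phase I
tex = four_point_texture(22, 39, 0.95, rng=np.random.default_rng(0))
# empirical alpha: average of all 2x2 glider products
emp = np.mean(tex[1:, 1:] * tex[:-1, 1:] * tex[1:, :-1] * tex[:-1, :-1])
```

The same scheme applies to the two-point (β) statistic with a 1 × 2 glider, and to the three-point (θ) statistic with an L-shaped three-pixel glider.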

Subjects had to discriminate between visual textures containing one of the four selected statistics and white noise. Each texture had a size of 39 × 22 pixels and occupied the entire monitor (full-field stimuli). Each pixel subtended about 2 degrees of visual angle. Given that the maximal resolution of rat vision is about one cycle per degree (Keller et al., 2000; Prusky et al., 2000; Prusky et al., 2002), this choice of pixel size guaranteed that the animals could discriminate between neighboring pixels of different luminance. Textures were shown at full contrast on LCD monitors that were calibrated to have a minimal luminance of 0.126 ± 0.004 cd/m² (average ± SD across the six monitors), a maximal luminance of 129 ± 5 cd/m², and an approximately linear luminance response curve.

Discrimination task


Each rat was trained to: (1) touch the central response port to trigger stimulus presentation and initiate a behavioral trial; and (2) touch one of the lateral response ports to report the identity of the visual stimulus and collect the reward (all the animals were trained with the following stimulus/response association: structured texture → left response port; white noise texture → right response port). The stimulus remained on the display until the animal responded or for a maximum of 5 s, after which the trial was considered ignored. In case of a correct response, the stimulus was removed, a positive reinforcement sound was played, and a white (first animal batch) or gray (second batch) background was shown during delivery of the reward. In case of an incorrect choice, the stimulus was removed and a 1–3 s time-out period started, during which the screen flickered from middle-gray to black at a rate of 10 Hz, while a ‘failure’ sound was played. During this period the rat was not allowed to initiate a new trial. To prevent the rats from making impulsive random choices, trials where the animals responded in less than 300 or 400 ms were considered aborted: the stimulus was immediately removed and a brief sound was played. In each trial, the visual stimulus had the same probability (50%) of being sampled from the pool of white noise textures or from the pool of structured textures, with the constraint that stimuli belonging to the same category were shown for at most n consecutive trials (with n varying between 2 and 3 depending on the animal and on the session), so as to prevent the rats from developing a bias toward one of the response ports.
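The constraint on consecutive same-category trials can be illustrated with a short sketch (hypothetical code, not the actual MWorks protocol; `max_run` plays the role of n):

```python
import random

def trial_categories(n_trials, max_run=3, rng=None):
    """Draw each trial's stimulus category with probability 0.5, but force a
    category switch whenever the same category has already been shown
    max_run times in a row (mirroring the constraint described above)."""
    rng = rng or random.Random()
    seq = []
    for _ in range(n_trials):
        if len(seq) >= max_run and len(set(seq[-max_run:])) == 1:
            # run limit reached: switch category deterministically
            seq.append('noise' if seq[-1] == 'texture' else 'texture')
        else:
            seq.append(rng.choice(['texture', 'noise']))
    return seq

seq = trial_categories(1000, max_run=3, rng=random.Random(1))
```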

Stimulus presentation, response collection, and reward delivery were controlled via workstations running the open-source suite MWorks (https://mworks.github.io; Starwarz and Cox, 2021).

Experimental design


Each rat was assigned to a specific statistic, from one- to four-point, for which it was trained in phases I and II and then tested in phase III. Generalization to a different statistic from the one the rat was trained on was assessed in phase IV. Out of the 42 rats, 9 were trained with one-point statistics, 9 with two-point, 12 with three-point, and 12 with four-point. The animals that reached phase III were 9, 9, 8, and 11, respectively, for the four statistics.

Phase I


Initially, rats were trained to discriminate unstructured textures made of white noise from structured textures containing a single high-intensity level of one of the statistics (0.85 for one-point and two-point; 0.95 for three-point and four-point). To make sure that the animals learned a general distinction between structured and unstructured textures (and not between specific instances of the two stimulus categories), in each trial both kinds of stimuli were randomly sampled (without replacement) from a pool of 350 different textures. Since the rats typically performed between 200 and 300 trials in a training session, no single texture was shown more than once within a session. A different pool of textures was used on each of the five days within a week of training. The same five texture pools were then used again (in the same order) the following week. Therefore, at least 7 days had to pass before a given texture stimulus was presented again to a rat.

For the first batch of rats, we moved to the second phase of the experiment all the animals that were able to reach an average performance of at least 65% correct choices over a set of 500 trials, collected across a variable number of consecutive sessions (the learning curves of four example rats from this batch, one per group, are shown in Figure 1—figure supplement 1A). Based on this criterion, two rats tested with three-point textures and one rat tested with four-point textures were excluded from further testing. For the second batch of rats, we decided to admit all the animals to the following experimental phases after a prolonged period of training in the first phase. In fact, we reasoned that, in case some texture statistic was particularly hard to discriminate, imposing a criterion performance in the first phase of the experiment would bias the pool of rats tested with that statistic toward including only exceptionally proficient animals. This, in turn, could lead to an overestimation of rats' typical sensitivity to such a difficult statistic. On the other hand, the failure of a rat to reach a given criterion performance could be due to intrinsic limitations of its visual apparatus (such as a malfunctioning retina or particularly low acuity). Therefore, to make sure that our results did not depend on including in our analysis some animals of the second batch that did not reach 65% correct discrimination in the first training phase, the perceptual sensitivities were re-estimated after excluding those rats (i.e. after excluding one rat from the two-point, three rats from the three-point, and one from the four-point groups). As shown in Figure 2—figure supplement 1, the resulting sensitivity ranking was unchanged (compare to Figure 2C) and all pairwise comparisons remained statistically significant (two-sample t-test with Holm-Bonferroni correction).

Phase II


In this phase, we introduced progressively lower levels of intensity of each statistic, bringing them gradually closer to the zero-intensity level corresponding to white noise. To this end, we applied an adaptive staircase procedure to update the minimum level of the statistic to be presented to a rat based on its current performance. Briefly, in any given trial, the level of the multipoint correlation in the structured textures was randomly sampled between a minimum level (under the control of the staircase procedure) and a maximum level (fixed at the value used in phase I). Within this range, the sampling was not uniform, but was carried out using a geometric distribution (with its peak at the minimum level), so as to make it much more likely for rats to be presented with intensity levels at or close to the minimum. The performance achieved by the rat on the current minimum intensity level was computed every ten trials. If this performance was higher than 70% correct, the minimum intensity level was decreased by a step of 0.05. By contrast, if the performance was lower than 50%, the minimum intensity level was increased by the same amount.

This procedure allowed the rats to learn to discriminate progressively lower levels of the statistic in a gradual and controlled way (the asymptotic levels of the statistics reached across consecutive training sessions by four example rats of the first batch, one per group, are shown in Figure 1—figure supplement 1B). At the end of this phase, the minimum intensity levels reached by the animals in the four groups were: 0.21 ± 0.12, 0.2 ± 0.2, 0.70 ± 0.22, and 0.56 ± 0.18 (group average ± SD) for, respectively, one-, two-, three-, and four-point correlations.
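The staircase update and the geometric sampling of intensity levels can be sketched as follows (an illustrative sketch: the intensity bounds `lo`/`hi` and the geometric success probability `p=0.5` are our own assumptions, not specified above):

```python
import random

def staircase_update(min_level, accuracy, step=0.05, lo=0.05, hi=0.95):
    """One staircase update, applied every ten trials: lower the minimum
    intensity after >70% correct, raise it after <50% correct (the bounds
    lo and hi are assumed for this sketch)."""
    if accuracy > 0.70:
        return max(lo, min_level - step)
    if accuracy < 0.50:
        return min(hi, min_level + step)
    return min_level

def sample_intensity(min_level, max_level, p=0.5, rng=None):
    """Sample a trial's intensity from a truncated geometric distribution
    over a grid of levels, peaked at the current minimum level."""
    rng = rng or random.Random()
    levels, x = [], min_level
    while x <= max_level + 1e-9:
        levels.append(round(x, 2))
        x += 0.05
    k = 0  # number of 'failures' before the first 'success'
    while k < len(levels) - 1 and rng.random() > p:
        k += 1
    return levels[k]
```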

Phase III


After the training received in phases I and II, the rats were finally moved to the main test phase, where we measured their sensitivity to the multipoint correlations they were trained on. In each trial of this phase, the stimulus was either white noise or a patterned texture with equal probability. If it was a patterned texture, the level of the statistic was randomly selected from the set {0.02, 0.09, 0.16, …, 0.93, 1} (i.e. from 0.02 to 1 in steps of 0.07) with uniform probability. The responses of each rat over this range of intensity levels yielded psychometric curves (see example in Figure 1B), from which rat sensitivity was measured by fitting the Bayesian ideal observer model described below (Figure 2A and B).

Phase IV


To verify the sensitivity ranking observed in phase III, we carried out an additional test phase, where each rat was tested on a new statistic, different from the one the animal had previously been trained and tested on. The two groups of rats that were originally trained with the statistics yielding the highest sensitivity in phase III (i.e. one- and two-point correlations; see Figure 2B) were split into approximately equal-sized subgroups, and each of these subgroups was tested with one of the less discriminable statistics (i.e. three- and four-point correlations; leftmost half of Figure 2C). This allowed us to verify that, regardless of training history, sensitivity to four-point correlations was slightly but consistently higher than sensitivity to three-point correlations. All the rats originally tested with the three-point statistic were switched to the four-point one (third set of points in Figure 2C). This allowed a within-subject comparison of the sensitivities to these statistics (notably, these rats were found to be significantly more sensitive to the four-point textures than to the three-point ones, despite the extensive training they had received with the latter). For the same reason, most of the rats (8/11) of the last group (i.e. the animals originally trained/tested with the four-point correlations; last set of points in Figure 2C) were switched to the three-point statistic, which again yielded the lowest discriminability. A few animals (3/11) were instead tested with the two-point statistic, thus verifying that the latter was much more discriminable than the four-point one (again, despite the extensive training the animals of this group had received with the four-point textures).


Ideal observer model

In this section we describe the ideal observer model we used to estimate the sensitivity of the rats to the different textures. The approach is a standard one and is inspired by that in Fleming et al., 2013. Because our intention is to use an ideal observer as a model for animal behavior, we will write interchangeably ‘rat’, ‘animal’, and ‘ideal observer’ in the following.

Preliminaries


The texture discrimination task is a two-alternative forced choice (2AFC) task, where the stimulus can be either a sample of white noise or a sample of textured noise, and the goal of the animal is to correctly report the identity of each stimulus. On any given trial, either stimulus class occurs with equal probability. The texture class is composed of K discrete, positive values of the texture statistic. In practice, K=14, and these values are {0.02,0.09,…,0.93,1}, but we will use a generic K in the derivations for clarity. The texture statistics are parametrised such that a statistic value of zero corresponds to white noise. Therefore, if we call s the true level of the statistic, the task is a parametric discrimination task where the animal has to distinguish s=0 from s>0.

Key assumptions

  1. each trial is independent from those preceding and following it (both for the generated texture and for the animal’s behavior);

  2. on any given trial, the nominal (true) value of the statistic is some value s. Because the texture has finite size, the empirical value of the statistic in the texture will be somewhat different from s. We lump this uncertainty together with that induced by the animal’s perceptual process, and we say that any given trial results in the production of a percept x, sampled from a truncated Normal distribution centered around the nominal value of the statistic and bounded between a=-1 and b=1:

     $$p(x\mid s,\sigma,a,b)=\frac{\frac{1}{\sigma}\,\phi\!\left(\frac{x-s}{\sigma}\right)}{\Phi\!\left(\frac{b-s}{\sigma}\right)-\Phi\!\left(\frac{a-s}{\sigma}\right)}$$

    where ϕ(·) is the probability density function of the standard Normal and Φ(·) is its cumulative distribution function. Setting the bounds to –1 and 1 allows us to account for the fact that the value of a statistic is constrained within this range by construction. We will keep a and b in some of the expressions below for generality and clarity, and we will substitute their values only at the end.

  3. we assume that each rat has a certain prior over the statistic level that we parametrise by the log prior odds:

    $$\alpha := \ln\frac{p(s=0)}{p(s>0)}$$

    where α depends on the rat. More specifically, we assume that each rat assigns a prior probability p(s=0)=1/(1+e^{−α}) to the presentation of a noise sample, and a probability of 1/[K(1+e^{α})] to the presentation of a texture coming from any of the K nonzero statistic values. In formulae:

    $$p(s)=\frac{\delta_{s,0}}{1+e^{-\alpha}}+\frac{1}{K}\sum_{k=1}^{K}\frac{\delta_{s,s_k}}{1+e^{\alpha}}$$

    where δ is Kronecker’s delta, and s_k>0, k∈{1,…,K} are the K possible nonzero values of the statistic. Note that this choice of prior matches the distribution actually used in generating the data for the experiment, except that α is a free parameter instead of being fixed at 0.

  4. we assume that the true values of α, σ, a and b are accessible to the decision making process of the rat.
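As a concrete illustration of assumption 2, the percept distribution can be simulated by rejection sampling (a sketch in Python; the original analysis used Matlab, and the function name is our own):

```python
import numpy as np

def sample_percept(s, sigma, a=-1.0, b=1.0, size=1, rng=None):
    """Draw percepts x from a Normal(s, sigma) truncated to [a, b], i.e. the
    distribution p(x | s, sigma, a, b) of assumption 2, by simple rejection."""
    rng = np.random.default_rng() if rng is None else rng
    out = np.empty(size)
    filled = 0
    while filled < size:
        draw = rng.normal(s, sigma, size=size)
        keep = draw[(draw >= a) & (draw <= b)]  # reject out-of-range draws
        take = min(size - filled, keep.size)
        out[filled:filled + take] = keep[:take]
        filled += take
    return out

x = sample_percept(0.3, 0.25, size=10_000, rng=np.random.default_rng(0))
```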

Derivation of the ideal observer


For a particular percept, the ideal observer will evaluate the posterior probability of noise vs texture given that percept. It will report ‘noise’ if the posterior of noise is higher than the posterior of texture, and ‘texture’ otherwise.

More in detail, for a given percept x we can define a decision variable D as the log posterior ratio:

(1) $$D(x):=\ln\frac{p(s=0\mid x)}{p(s>0\mid x)}=\ln\frac{p(x\mid s=0)}{p(x\mid s>0)}+\ln\frac{p(s=0)}{p(s>0)}$$

With this definition, the rat will report ‘noise’ when D>0 and ‘texture’ otherwise.

By plugging in the likelihood functions and our choice of prior, we get

(2) $$D(x)=\alpha+\ln\!\left[\frac{\frac{1}{\sigma}\phi\!\left(\frac{x}{\sigma}\right)}{\Phi\!\left(\frac{b}{\sigma}\right)-\Phi\!\left(\frac{a}{\sigma}\right)}\right]-\ln\!\left[\frac{1}{K}\sum_{k}\frac{\frac{1}{\sigma}\phi\!\left(\frac{x-s_k}{\sigma}\right)}{\Phi\!\left(\frac{b-s_k}{\sigma}\right)-\Phi\!\left(\frac{a-s_k}{\sigma}\right)}\right]$$

Now, remember that given a value of the percept x, the decision rule based on D is fully deterministic (maximum a posteriori estimate). But on any given trial we don’t know the value of the percept — we only know the nominal value of the statistic. On the other hand, our assumptions above specify the distribution p(x|s) for any s, so the deterministic mapping D(x) means that we can compute the probability of reporting ‘noise’ as,

(3) $$p(\text{report noise}\mid s)=p(D>0\mid s)=\int_{\{x:\,D(x)>0\}} p(x\mid s)\,\mathrm{d}x$$

We note at this point that D(x) is monotonically decreasing: indeed,

(4) $$\frac{\partial D(x)}{\partial x}=-\frac{x}{\sigma^{2}}+\left[\frac{1}{K}\sum_{k}\frac{\exp\!\left(-\frac{(x-s_k)^{2}}{2\sigma^{2}}\right)}{\Phi\!\left(\frac{b-s_k}{\sigma}\right)-\Phi\!\left(\frac{a-s_k}{\sigma}\right)}\right]^{-1}\frac{1}{K}\sum_{k}\frac{\exp\!\left(-\frac{(x-s_k)^{2}}{2\sigma^{2}}\right)\frac{x-s_k}{\sigma^{2}}}{\Phi\!\left(\frac{b-s_k}{\sigma}\right)-\Phi\!\left(\frac{a-s_k}{\sigma}\right)}=-\left[\frac{1}{K}\sum_{k}\frac{\exp\!\left(-\frac{(x-s_k)^{2}}{2\sigma^{2}}\right)}{\Phi\!\left(\frac{b-s_k}{\sigma}\right)-\Phi\!\left(\frac{a-s_k}{\sigma}\right)}\right]^{-1}\frac{1}{K}\sum_{k}\frac{\exp\!\left(-\frac{(x-s_k)^{2}}{2\sigma^{2}}\right)\frac{s_k}{\sigma^{2}}}{\Phi\!\left(\frac{b-s_k}{\sigma}\right)-\Phi\!\left(\frac{a-s_k}{\sigma}\right)}<0\ \text{ for all } x$$

where for the last inequality we have used the fact that s_k > 0 for all k, together with a < b, which guarantees that Φ((b−s_k)/σ) > Φ((a−s_k)/σ), so that all denominators are positive. This result matches the intuitive expectation that a change in percept in the positive direction (i.e. away from zero) should always make it less likely for the observer to report ‘noise’.

Because D(x) is monotonically decreasing, there will be a unique value x* of x such that D(x*)=0, and the integration region {x : D(x)>0} will simply consist of all values of x smaller than x*. More formally, if we define

(5) $$x^{*}=x^{*}(\alpha,\sigma)\ \text{ such that }\ D(x^{*})=0$$

we can write

(6) $$p(\text{report noise}\mid s)=\int_{a}^{x^{*}}\frac{\frac{1}{\sigma}\phi\!\left(\frac{x-s}{\sigma}\right)}{\Phi\!\left(\frac{b-s}{\sigma}\right)-\Phi\!\left(\frac{a-s}{\sigma}\right)}\,\mathrm{d}x=\frac{\Phi\!\left(\frac{x^{*}(\alpha,\sigma)-s}{\sigma}\right)-\Phi\!\left(\frac{a-s}{\sigma}\right)}{\Phi\!\left(\frac{b-s}{\sigma}\right)-\Phi\!\left(\frac{a-s}{\sigma}\right)}=\frac{\Phi\!\left(\frac{x^{*}(\alpha,\sigma)-s}{\sigma}\right)-\Phi\!\left(\frac{-1-s}{\sigma}\right)}{\Phi\!\left(\frac{1-s}{\sigma}\right)-\Phi\!\left(\frac{-1-s}{\sigma}\right)}$$

where in the last passage we have substituted a=-1 and b=1.
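The decision variable of Equation 2, its zero x* (Equation 5), and the resulting psychometric function (Equation 6) can be sketched numerically as follows (a Python stand-in for the Matlab analysis; the bisection exploits the monotonicity of D shown above, and the statistic levels follow the set used in phase III):

```python
import math

def phi(z):   # standard Normal pdf
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):   # standard Normal cdf
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def D(x, alpha, sigma, s_levels, a=-1.0, b=1.0):
    """Log posterior ratio of Equation 2 ('noise' vs 'texture')."""
    def trunc_pdf(x, s):  # truncated Normal likelihood p(x | s)
        return (phi((x - s) / sigma) / sigma) / (
            Phi((b - s) / sigma) - Phi((a - s) / sigma))
    like_tex = sum(trunc_pdf(x, s) for s in s_levels) / len(s_levels)
    return alpha + math.log(trunc_pdf(x, 0.0)) - math.log(like_tex)

def x_star(alpha, sigma, s_levels, a=-1.0, b=1.0, tol=1e-10):
    """Unique zero of D (Equation 5), found by bisection, since D is
    decreasing in x (assumes D changes sign on [a, b])."""
    lo, hi = a, b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if D(mid, alpha, sigma, s_levels, a, b) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def p_report_noise(s, alpha, sigma, s_levels, a=-1.0, b=1.0):
    """Equation 6: probability that the observer reports 'noise' at level s."""
    xs = x_star(alpha, sigma, s_levels, a, b)
    return (Phi((xs - s) / sigma) - Phi((a - s) / sigma)) / (
        Phi((b - s) / sigma) - Phi((a - s) / sigma))

levels = [0.02 + 0.07 * k for k in range(15)]   # 0.02, 0.09, ..., 1.0
p0 = p_report_noise(0.0, 0.0, 0.3, levels)      # unbiased observer, sigma = 0.3
p1 = p_report_noise(1.0, 0.0, 0.3, levels)
```

As expected, the probability of reporting ‘noise’ is high for white noise (s=0) and low for strongly structured textures (s=1).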

Example: single-level discrimination case


To give an intuitive interpretation of the results above, consider the case where K=1, so that the possible values of the statistic are only two, namely 0 and s_1. In this case,

$$D^{(1)}(x)=\alpha+\ln\frac{\Phi\!\left(\frac{b-s_1}{\sigma}\right)-\Phi\!\left(\frac{a-s_1}{\sigma}\right)}{\Phi\!\left(\frac{b}{\sigma}\right)-\Phi\!\left(\frac{a}{\sigma}\right)}-\frac{2xs_1-s_1^{2}}{2\sigma^{2}}=\alpha+\beta-\frac{2xs_1-s_1^{2}}{2\sigma^{2}}$$

where

$$\beta:=\ln\frac{\Phi\!\left(\frac{b-s_1}{\sigma}\right)-\Phi\!\left(\frac{a-s_1}{\sigma}\right)}{\Phi\!\left(\frac{b}{\sigma}\right)-\Phi\!\left(\frac{a}{\sigma}\right)}$$

so that we can write x* in closed form:

$$x^{*(1)}=\left(D^{(1)}\right)^{-1}(0)=\frac{s_1}{2}+\frac{\sigma^{2}}{s_1}(\alpha+\beta)$$

which can be read as saying that the decision boundary is halfway between 0 and s1, plus a term that depends on the prior bias and the effect of the boundaries of the domain of x (but involves the sensitivity too, represented by σ).

Simplifying things even further, if we remove the domain boundaries (by setting a→−∞ and b→+∞), we have that β→0. In this case, by plugging the expression above into Equation 6 we obtain,

(7) $$p(\text{report noise}\mid s)=\Phi\!\left[\frac{\sigma}{s_1}\alpha-\frac{s-s_1/2}{\sigma}\right]$$

and therefore we recover a simple cumulative Normal form for the psychometric function. By looking at Equation 7 it is clear how the prior bias α introduces a horizontal shift in the psychometric curve, and σ controls the slope (but also affects the horizontal location when α ≠ 0).

Fitting the ideal observer model to the experimental data


Independently for each rat, we infer a value of α and σ by maximising the likelihood of the data under the model above. More in detail, for a given rat and a given statistic value s (including 0), we call Ns the number of times the rat reported ‘noise’, and Ts the total number of trials. For a given fixed value of α and σ, under the ideal observer model the likelihood of Ns will be given by a Binomial probability distribution for Ts trials and probability of success given by the probability of reporting noise in Equation 6,

$$p_s(N_s\mid\alpha,\sigma)=\binom{T_s}{N_s}\,p(\text{rep. noise}\mid s,\alpha,\sigma)^{N_s}\left(1-p(\text{rep. noise}\mid s,\alpha,\sigma)\right)^{T_s-N_s}$$

Assuming that the data for the different values of s are conditionally independent given α and σ, the total log likelihood for the data of the given rat is simply the sum of the log likelihoods for the individual values of N_s,

$$\ln p\left(\{N_{s_k}\}_{k=1}^{K}\,\middle|\,\alpha,\sigma\right)=\sum_{k=1}^{K}\ln p_{s_k}(N_{s_k}\mid\alpha,\sigma)$$

We find numerically the values of α and σ that maximise this likelihood, using Matlab’s mle function with initial condition α=0.1, σ=0.4. Note that evaluating the likelihood for any given value of α and σ requires finding x*, defined as the zero of Equation 2. We do this numerically by using Matlab’s fzero function with initial condition x=0.
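As an illustration of the fitting procedure, α and σ can be recovered by maximising the binomial likelihood on synthetic data. For brevity, this sketch uses the boundary-free psychometric function of Equation 7 (a single texture level s₁) and a simple grid search instead of Matlab's mle; all numbers are made up for the example.

```python
import math
import numpy as np

def Phi(z):  # standard Normal cdf
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_noise(s, alpha, sigma, s1=0.5):
    """Boundary-free psychometric function of Equation 7."""
    return Phi(sigma * alpha / s1 - (s - s1 / 2.0) / sigma)

def log_lik(alpha, sigma, s_vals, n_noise, n_trials):
    """Binomial log likelihood of the 'noise'-report counts (the binomial
    coefficients are constant in alpha and sigma, so they are dropped)."""
    ll = 0.0
    for s, n, t in zip(s_vals, n_noise, n_trials):
        p = min(max(p_noise(s, alpha, sigma), 1e-12), 1.0 - 1e-12)
        ll += n * math.log(p) + (t - n) * math.log(1.0 - p)
    return ll

# synthetic counts from known parameters, then recovery by grid search
rng = np.random.default_rng(0)
true_a, true_s = 0.3, 0.25
s_vals = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
n_trials = [200] * len(s_vals)
n_noise = [int(rng.binomial(t, p_noise(s, true_a, true_s)))
           for s, t in zip(s_vals, n_trials)]
grid_a = np.linspace(-1.0, 1.0, 41)   # candidate alpha values
grid_s = np.linspace(0.05, 1.0, 39)   # candidate sigma values (> 0)
best_a, best_s = max(((a, s) for a in grid_a for s in grid_s),
                     key=lambda q: log_lik(q[0], q[1],
                                           s_vals, n_noise, n_trials))
```

With a few hundred trials per level, the grid maximiser lands close to the generating parameters, mirroring how α and σ are estimated per rat.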

Comparing the estimated sensitivity in rats to sensitivity in humans and variability in natural images


To compare quantitatively our sensitivity estimates in rats to those in humans and to the variance of the statistics in natural images reported in Hermundstad et al., 2014, we computed the degree of correspondence, as defined in Hermundstad et al., 2014, between these sets of numbers. Briefly, define s_r=(1/σ_β, 1/σ_θ, 1/σ_α) as the array containing the rat sensitivities for the three statistics that were tested both here and by Hermundstad et al., 2014 (our β and θ correspond, respectively, to β_− and to the matching three-point statistic in the notation used by Hermundstad et al., 2014), s_h as the array containing the corresponding values for humans, and v as that containing the standard deviations of the distributions of the corresponding statistics in natural images. For our comparisons, we use the values of v reported by Hermundstad et al., 2014 for the image analysis defined by the parameters N=2 and R=32 (i.e. the analysis used for the numbers reported in the table in Figure 3C of their paper). The degree of correspondence between any two of these arrays is their cosine similarity:

$$c(\text{rat},\text{human})=\frac{s_r\cdot s_h}{\lVert s_r\rVert\,\lVert s_h\rVert}\qquad c(\text{rat},\text{images})=\frac{s_r\cdot v}{\lVert s_r\rVert\,\lVert v\rVert}\qquad c(\text{human},\text{images})=\frac{s_h\cdot v}{\lVert s_h\rVert\,\lVert v\rVert}.$$

The degree of correspondence is limited by construction to values between 0 and 1, with 1 indicating a perfect correspondence up to a scaling factor. Hermundstad et al., 2014 report values of 0.987–0.999 for c(human,images), averaging over all texture coordinates and depending on the details of the analysis.

To assess the statistical significance of our values of c, we compare our estimated values with the null probability distribution of the cosine similarity of two unit vectors sampled randomly in the positive orthant of the 3-dimensional Euclidean space. If such vectors are described, in spherical coordinates, as

$$v_1=(\theta_1,\phi_1),\qquad v_2=(\theta_2,\phi_2)$$

with 0 ≤ θ_1, θ_2, ϕ_1, ϕ_2 ≤ π/2, the cosine of the angle they form with each other is

$$c_{\text{null}}=v_1\cdot v_2=\cos\theta_1\cos\theta_2+\cos(\phi_1-\phi_2)\sin\theta_1\sin\theta_2$$

The p-values reported in the text for c(rat,human) and c(rat,images) are computed by sampling 10⁷ values of c_null, and assessing the fraction of samples with values larger than the empirical estimates.
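This comparison can be sketched in a few lines (an illustrative sketch: the null sampler, via normalised absolute Gaussian vectors, is our own choice, since the text gives only the spherical-coordinate formula; we also draw 10⁵ rather than 10⁷ samples for speed):

```python
import numpy as np

def correspondence(u, v):
    """Degree of correspondence: cosine similarity of two sensitivity arrays."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def null_samples(n, rng):
    """Cosine similarities of pairs of unit vectors drawn uniformly from the
    positive orthant of the 3-d unit sphere (|Gaussian| trick)."""
    v1 = np.abs(rng.standard_normal((n, 3)))
    v2 = np.abs(rng.standard_normal((n, 3)))
    v1 /= np.linalg.norm(v1, axis=1, keepdims=True)
    v2 /= np.linalg.norm(v2, axis=1, keepdims=True)
    return np.sum(v1 * v2, axis=1)

rng = np.random.default_rng(0)
c_null = null_samples(100_000, rng)
# one-sided p-value for a hypothetical observed correspondence of 0.999
p = float(np.mean(c_null > 0.999))
```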

Data availability

Experimental data are available at (Caramellino et al., 2021).

The following data sets were generated
    1. Caramellino R
    2. Piasini E
    3. Buccellato A
    4. Carboncino A
    5. Balasubramanian V
    6. Zoccolan D
    (2021) Zenodo
    Data from "Rat sensitivity to multipoint statistics is predicted by efficient coding of natural scenes".
    https://doi.org/10.5281/zenodo.4762567

References

    1. Victor JD
    2. Conte MM
    (2012) Local image statistics: maximum-entropy constructions and perceptual salience
    Journal of the Optical Society of America. A, Optics, Image Science, and Vision 29:1313–1345.
    https://doi.org/10.1364/JOSAA.29.001313

Decision letter

  1. Stephanie E Palmer
    Reviewing Editor; The University of Chicago, United States
  2. Timothy E Behrens
    Senior Editor; University of Oxford, United Kingdom

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Rat sensitivity to multipoint statistics is predicted by efficient coding of natural scenes" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Timothy Behrens as the Senior Editor. The reviewers have opted to remain anonymous.

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

All reviewers thought that the paper was exciting, but needed some revision to clarify the results and presentation.

1) It would be useful if the manuscript scaled back claims about the alignment between the human and rat data slightly to make them more comparable with the results presented in the paper, and also to emphasize the ranking of sensitivities, qualitatively, is what's established, rather than a precise quantitative match to the full correlation matrix. Some specific points along these lines are:

– It is unclear why the ranking in rat sensitivity is evidence for efficient coding. In Hermundstad et al., 2014, efficient coding was established by comparing the image-based precision matrix with the human perceptual isodiscrimination contours. There is no such comparison here.

– Revision prompt (1a) Claims should be softened slightly.

– The previous paper emphasized that the difference of perceptual sensitivity between horizontal/vertical edges and diagonal edges is not merely an "oblique effect": Horizontal and vertical pairwise correlation share an edge, while pixels involved in diagonal pairwise correlations only share a corner. One wonders whether rats show any sensitivity difference between horizontal/vertical edges and diagonal edges. The manuscript in its current form misses this important comparison. Without showing this, the rat sensitivity does not fully reproduce the trend previously observed in humans. It seems like acquiring new data from the rats is prohibitively time-consuming, so again, the claims of the paper should be softened a bit.

Revision prompt (1b) – If possible, it would be useful to see a comparison of the rat sensitivity to different 2-point correlations, and a note about whether it matches the human data or not.

Revision prompt (1c) – It would be very helpful if the authors can generate analysis as in Figure 3B or 3C in Hermundstad et al., 2014 (3C is maybe easier?). If such analysis is possible, then it shows that the rat sensitivity also quantitatively matches the results from efficient coding. Again, if this is prohibitive, claims should be softened.

2) One part of the analysis was unclear: Why does it work with this theory to find the sensitivity only to positive parity values?

It seemed surprising that one would not need to test negative pairwise correlations, negative 3-point correlations, or negative 4-point patterns. For the 3-point glider, in particular, the large correlated patches (triangles) change contrast when parity is inverted, so it is the only correlational stimulus in this set that inverts contrast under parity inversion (besides the trivial 1-point glider). Given the light-dark asymmetries of the natural world, it would seem possible that the three-point sensitivity depends (strongly?) on the parity. This seems to be true of some older point-statistic discrimination tasks in humans (from Chubb?), where the number of black pixels (rather than merely dark gray) seemed to account for human discrimination thresholds. The parity of 3-point gliders clearly makes an impact on motion perception when these are looked at in space-time (i.e., Hu and Victor and various subsequent work in flies and fish), and the percept strength is also different for positive vs. negative parity. So, given the contrast inversion asymmetry in 3-point gliders and prior work on light-dark asymmetries in discriminability, it seems one needs to test whether sensitivity is the same under positive and negative parity for these types of spatial correlations. If the authors contend that this is not necessary given the efficient coding hypothesis being tested, some discussion is warranted of light-dark asymmetries in natural scenes and in this suite of stimuli, and why they are neglected in this framework (if that's the case).

3) Figure 3 needs revision for clarity. All reviewers found the layout confusing. Perhaps the authors could find a clearer way to present the results, using more figure panels.

4) The luminance values listed for the visual stimuli seem rather odd, since the mean luminance is not the average of the max and min luminance (the light and dark pixels). This seems to imply that these patterns do not contain equal numbers of light and dark pixels, which they should for all the 2-, 3-, and 4-point glider stimuli. It's not clear how this is consistent with the described experiments. Please clarify this point in the text.

https://doi.org/10.7554/eLife.72081.sa1

Author response

Essential revisions:

All reviewers thought that the paper was exciting, but needed some revision to clarify the results and presentation.

1) It would be useful if the manuscript scaled back claims about the alignment between the human and rat data slightly to make them more comparable with the results presented in the paper, and also to emphasize the ranking of sensitivities, qualitatively, is what's established, rather than a precise quantitative match to the full correlation matrix. Some specific points along these lines are:

– It is unclear why the ranking in rat sensitivity is evidence for efficient coding. In Hermundstad et al., 2014, efficient coding was established by comparing the image-based precision matrix with the human perceptual isodiscrimination contours. There is no such comparison here.

– Revision prompt (1a) Claims should be softened slightly.

We thank the reviewers for underscoring the difference between, on the one hand, a quantitative comparison of the sensitivity to the variance of the statistics in natural images, and on the other hand a more qualitative comparison of their rank ordering. In our initial submission, we built our argument based on the rankings in order to better connect not only with Hermundstad et al., 2014, but also with earlier human psychophysics results on the same task (Victor and Conte 2012), where there was no comparison with natural image statistics and therefore only the qualitative ranking among sensitivities was examined. We also note that Hermundstad et al., do, in fact, make ample use of the rank-ordering agreement between natural image statistics and human sensitivity in order to support their argument (“rank-order”, or similar locutions, are used six times between the results and the discussion). In this sense, while it is true that “In Hermundstad et al., 2014, efficient coding was established by comparing the image-based precision matrix with the human perceptual isodiscrimination contours”, it is also true that the rank ordering was presented as part of the evidence for efficient coding.

Having said this, we nevertheless agree that our argument can be strengthened by presenting both approaches, qualitative and quantitative. We have now added a new figure (Figure 3), where we compare our estimates of psychophysical sensitivity in rats with the corresponding values for human psychophysics and natural image statistics reported in Hermundstad et al., 2014 (note that we were only able to compare three out of the four statistics that we tested, as Hermundstad et al., did not consider 1-point correlations). The comparisons in Figure 3 (and the related quantitative measures reported in the text, lines 146-160) reveal a strong quantitative match, similar to that between the human psychophysics and the image statistic data.

Finally, in response to a specific point raised by Reviewer 2, point 1 (“Hermundstad et al., 2014 did not include 1st-order at all.”), the 1st order statistic γ was indeed only studied in Victor and Conte 2012, which only contained psychophysics data, and not in Hermundstad et al., 2014, which connected psychophysics with natural image statistics. Indeed, it is not possible to analyze the variability of γ in natural images with the method established by Hermundstad et al., 2014, because each image is binarized in such a way as to guarantee that γ=0 by construction. In this sense, like the use of qualitative ranking discussed above, γ was included to better reflect the approach in Victor and Conte. Moreover, we wanted to include a sensory stimulus condition that we were sure the animals could detect well, in order to ensure that any failure to learn or perform the task was due to limitations in sensory processing and not in the learning or decision-making process. Before performing our experiments, the only statistic that we were confident the rats could be trained to distinguish from noise was γ [Tafazoli et al., 2017, Vascon et al., 2019], and therefore it made sense to include it in the experimental design. We have modified the Results (lines 90-93, 104-108), the Methods (316-321) and the Discussion (212-218) to express this point more clearly.
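To see why γ vanishes by construction under median binarization, consider the following minimal sketch (the grayscale patch values are hypothetical; the thresholding-at-the-median step mirrors the binarization procedure described in Hermundstad et al., 2014):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical grayscale image patch (any continuous-valued patch works)
patch = rng.normal(loc=10.0, scale=3.0, size=(64, 64))

# Binarize at the patch median: by construction, half the pixels become
# dark (-1) and half light (+1)
binary = np.where(patch > np.median(patch), 1, -1)

# The 1-point statistic gamma is the mean of the binarized pixels,
# which is ~0 regardless of the original patch content
gamma = binary.mean()
```

Because the median split forces equal numbers of light and dark pixels, γ carries no information about natural image content under this pipeline, which is why its variability cannot be analyzed with that method.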

– The previous paper emphasized that the difference of perceptual sensitivity between horizontal/vertical edges and diagonal edges is not merely an "oblique effect": Horizontal and vertical pairwise correlation share an edge, while pixels involved in diagonal pairwise correlations only share a corner. One wonders whether rats show any sensitivity difference between horizontal/vertical edges and diagonal edges. The manuscript in its current form misses this important comparison. Without showing this, the rat sensitivity does not fully reproduce the trend previously observed in humans. It seems like acquiring new data from the rats is prohibitively time-consuming, so again, the claims of the paper should be softened a bit.

Revision prompt (1b) – If possible, it would be useful to see a comparison of the rat sensitivity to different 2-point correlations, and a note about whether it matches the human data or not.

When designing our experiment, we prioritized collecting data for the other statistics, as they were closer to the extremes of the measured sensitivity values and therefore offered a clearer signal for a comparison with rat data. For instance, had we found better sensitivity to 3- or 4-point statistics than to (horizontal) 2-point statistics, this would have been a very clear sign that perceptual sensitivity in rats is organized differently than in humans. Conversely, we reasoned that a comparison based on 2-point diagonal instead of 2-point horizontal statistics would have been more easily muddled, and rendered inconclusive, by the experimental noise that we expected to observe in rats. We agree that, given the high precision of the quantitative match between rats, humans and image statistics now highlighted by the new Figure 3, it would be interesting to also test rats for their sensitivity to diagonal 2-point correlations and check whether it matches the pattern exhibited by humans. However, as the editor rightly surmises, acquiring new data at this stage would indeed be exceedingly time consuming. Therefore, we have modified the text to better highlight that we did not seek to replicate this particular result of Hermundstad et al., 2014 (and that, more generally, we could not test as many correlation patterns as in Hermundstad et al., 2014, due to practical and ethical constraints). We also note that, since we did not test the 2-point diagonal statistic, we cannot distinguish, for the specific 2-point horizontal vs. diagonal comparison, an effect due to efficient coding from a hypothetical oblique effect, as Hermundstad et al., 2014 did. These points are now all brought up in the Discussion of our revised manuscript (lines 189-208). It is also worth noting that the oblique effect was a minor point of the Hermundstad et al., paper and the main arguments did not hinge on it.

Revision prompt (1c) – It would be very helpful if the authors can generate analysis as in Figure 3B or 3C in Hermundstad et al., 2014 (3C is maybe easier?). If such analysis is possible, then it shows that the rat sensitivity also quantitatively matches the results from efficient coding. Again, if this is prohibitive, claims should be softened.

Thank you for the suggestion. As mentioned above, we have now added a new figure (Figure 3) where we compare rat sensitivity, human sensitivity, and image statistic data in a way similar to Figure 3B in Hermundstad 2014, for the image statistics that were tested in both our experiment and in Hermundstad et al., 2014. We have also computed the “degree of correspondence” c between rat and image data and between rat and human data, using the definition of this metric introduced by Hermundstad et al., and reported by them in Figure 3C and in the main text. The degree of correspondence captures quantitatively the excellent match between rat, human and image data, with c(rat, image)=0.986 and c(rat, human)=0.990, where 0≤c≤1, and c=1 indicates perfect match. These results are reported in the Results section of the updated manuscript (lines 146-160).
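The exact definition of the degree of correspondence c is given in Hermundstad et al., 2014. As a rough illustration of how such a bounded agreement score behaves, the sketch below uses a cosine similarity between non-negative sensitivity vectors as a stand-in (this is NOT the metric from Hermundstad et al., and the sensitivity values are hypothetical placeholders, not our measured data):

```python
import numpy as np

def correspondence(a, b):
    """Illustrative stand-in for a correspondence score: cosine
    similarity between two non-negative sensitivity vectors. It lies
    in [0, 1] for non-negative inputs, and equals 1 when the two
    vectors match up to an overall scale factor."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical sensitivities for the three shared statistics
# (2-, 4- and 3-point), for illustration only
rat   = [1.00, 0.45, 0.20]
human = [1.00, 0.50, 0.18]
c = correspondence(rat, human)
```

Any score of this kind approaches 1 when the two species' sensitivities scale together across statistics, which is the sense in which values like 0.986 and 0.990 indicate an excellent match.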

2) One part of the analysis was unclear: Why does it work with this theory to find the sensitivity only to positive parity values?

It seemed surprising that one would not need to test negative pairwise correlations, negative 3-point correlations, or negative 4-point patterns. For the 3-point glider, in particular, the large correlated patches (triangles) change contrast when parity is inverted, so it is the only correlational stimulus in this set that inverts contrast under parity inversion (besides the trivial 1-point glider). Given the light-dark asymmetries of the natural world, it would seem possible that the three-point sensitivity depends (strongly?) on the parity. This seems to be true of some older point-statistic discrimination tasks in humans (from Chubb?), where the number of black pixels (rather than merely dark gray) seemed to account for human discrimination thresholds. The parity of 3-point gliders clearly makes an impact on motion perception when these are looked at in space-time (i.e., Hu and Victor and various subsequent work in flies and fish), and the percept strength is also different for positive vs. negative parity. So, given the contrast inversion asymmetry in 3-point gliders and prior work on light-dark asymmetries in discriminability, it seems one needs to test whether sensitivity is the same under positive and negative parity for these types of spatial correlations. If the authors contend that this is not necessary given the efficient coding hypothesis being tested, some discussion is warranted of light-dark asymmetries in natural scenes and in this suite of stimuli, and why they are neglected in this framework (if that’s the case).

Before addressing the reviewer’s point, we should first clarify that the range of values of the 3-point statistic used in our experiment in fact spans the negative, rather than positive, half of the space of possibilities in the parameterization of Victor and Conte, 2012. We reported these as positive values in our initial submission due to an inversion of the 3-point axis in our code. This is simply a matter of convention and does not affect any of the arguments made in the paper, or the reviewer’s point, but we wanted to clarify this first. We have now explained in the Methods section (lines 334-338) that, although we still refer to 3-point intensities using positive numbers, if the reader is interested in connecting formally to the system of coordinates in Victor and Conte 2012, the sign of the values of 3-point statistics we report should be inverted.

In humans, the answer to the question raised by the reviewer is already known: Victor and Conte report that “consistently across subjects, thresholds for negative and positive variations of each statistic are closely matched” (Victor and Conte 2012, caption to Figure 7). Similarly, Hermundstad et al., 2014 remark on the same phenomenon and investigate it specifically (Hermundstad et al., 2014, Figure 3—figure supplement).

Moreover, even foregoing the above argument about humans’ equal sensitivity to positive and negative variations in the statistics, we note the following regarding the contrast-inversion asymmetry of 3-point gliders and the light-dark asymmetry of natural scenes. Dominance of OFF responses (elicited by dark spots on a light background) has been reported in mammals, including primates, cats, and, more recently, rodents (Liu and Yao 2014; Xing, Yeh, and Shapley 2010; Yeh, Xing, and Shapley 2009; Williams et al., 2021). Therefore, if rats, unlike humans, had different sensitivity to positive and negative 3-point statistics, one would expect the sensitivity to negative 3-point correlations to be the higher of the two (as negative intensities correspond to dark triangular patterns on a white background). Since we are interested in the hypothesis that the 3-point configuration is the statistic with the lowest sensitivity of those we tested, by testing negative intensities we are choosing the stricter test, whereas testing positive values would risk biasing the experiment towards the desired conclusion. Indeed, this was the reason why the negative half of the 3-point axis was chosen in the first place.

As for the reason we used positive intensity values of the 2-point and 4-point statistics, this choice was dictated by the need to test rats with textures containing features that were large enough to be processed by their low-resolution visual system. In fact, rat visual acuity is much lower than human acuity and, while positive 2-point and 4-point correlations give rise to, respectively, thick oriented stripes and wide rectangular blocks made of multiple pixels with the same color, negative intensities produce higher-spatial-frequency patterns, where color may change every other pixel (see Figure 2A in Hermundstad et al., 2014). Therefore, using negative 2-point and 4-point statistics would have introduced a possible confound, since low sensitivity to these textures could simply have been due to the low spatial resolution of rat vision. Finally, in the case of the 1-point statistic, positive intensity values were chosen because they yield patterns that are brighter than white noise and, as such (we reasoned), would be highly distinguishable from white noise, given the high sensitivity of rat V1 neurons to increases in luminance.
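The spatial-frequency difference between positive and negative 2-point textures can be seen in a minimal generative sketch. This is our own illustration of one standard Markov construction, assuming the horizontal 2-point statistic is defined as the expected product of horizontally adjacent ±1 pixels; it is not the exact texture-synthesis code used in the experiments:

```python
import numpy as np

def two_point_texture(h, w, beta, rng):
    """Binary (+1/-1) texture whose horizontal 2-point statistic is
    ~beta: each pixel copies its left neighbor with probability
    (1 + beta) / 2, so E[x_i * x_{i+1}] = beta. For beta > 0 this
    yields thick horizontal streaks (low spatial frequency, resolvable
    by a low-acuity visual system); for beta < 0, adjacent pixels tend
    to disagree, producing a fine, high-spatial-frequency pattern."""
    tex = np.empty((h, w), dtype=int)
    tex[:, 0] = rng.choice([-1, 1], size=h)        # random seed column
    copy = rng.random((h, w - 1)) < (1 + beta) / 2  # copy vs flip
    for j in range(1, w):
        tex[:, j] = np.where(copy[:, j - 1], tex[:, j - 1], -tex[:, j - 1])
    return tex

rng = np.random.default_rng(0)
tex = two_point_texture(256, 256, beta=0.8, rng=rng)
# Empirical 2-point statistic: mean product of horizontal neighbors
beta_hat = (tex[:, :-1] * tex[:, 1:]).mean()
```

With beta = -0.8 the same construction produces a pattern that alternates nearly every pixel, which is the kind of texture that rat acuity would struggle to resolve.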

All these explanations are now provided in the Methods section (lines 322-338) and a thorough discussion of the possible impact of our stimulus choices (both at the level of texture type and polarity) on our conclusions is now presented in the Discussion of our revised manuscript, including the rationale behind testing only on either positive or negative values of each given statistic (lines 189-249).

3) Figure 3 needs revision for clarity. All reviewers found the layout confusing. Perhaps the authors could find a clearer way to present the results, using more figure panels.

Thank you for this valuable suggestion. This figure (which has become Figure 4 in our revised manuscript) has been redesigned from scratch for better clarity. We have kept all data in a single panel in order to enable easy comparisons between conditions with the same test statistic but different training, or the same training and different test statistics.

4) The luminance values listed for the visual stimuli seem rather odd, since the mean luminance is not the average of the max and min luminance (the light and dark pixels). This seems to imply that these patterns do not contain equal numbers of light and dark pixels, which they should for all the 2-, 3-, and 4-point glider stimuli. It's not clear how this is consistent with the described experiments. Please clarify this point in the text.

Thank you for noticing this inconsistency in the luminance values reported in our Methods section. In fact, while the minimal and maximal luminance values of the display were correctly reported in our sentence, the luminance that we reported as corresponding to mid-gray was incorrect. The actual value is, as it should be, halfway between the maximum and minimum (the average among monitors is equal to 61±8 cd/m²). This error was due to the fact that we erroneously reported the luminance of pixel intensity level 128 without taking into account the linearization of the pixel/luminance curve that we carried out before presenting the stimuli. We have now corrected this error in our revised Methods (lines 355-357), where we simply report the average maximal and minimal luminance levels of the monitors.
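The discrepancy arises because, on a non-linearized display, pixel level 128 does not emit half the maximum luminance. A hedged sketch, assuming a simple power-law display response with exponent ≈2.2 and hypothetical luminance bounds (real monitor curves are measured, not assumed):

```python
# Power-law ("gamma") display response:
#   L(p) = L_min + (L_max - L_min) * (p / 255) ** g
L_min, L_max, g = 0.13, 122.0, 2.2  # hypothetical values, in cd/m^2

def luminance(p):
    """Luminance emitted at pixel level p (0-255) on this model display."""
    return L_min + (L_max - L_min) * (p / 255) ** g

mid = (L_min + L_max) / 2  # true mid-gray luminance

# Without linearization, pixel level 128 is much darker than mid-gray
naive = luminance(128)

# Linearization inverts the response curve to find the pixel level
# whose emitted luminance really is halfway between min and max
p_mid = 255 * ((mid - L_min) / (L_max - L_min)) ** (1 / g)
```

Under these assumed numbers, the pixel level needed for true mid-gray is well above 128, which is exactly the kind of mismatch that reporting the raw level-128 luminance would produce.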

https://doi.org/10.7554/eLife.72081.sa2

Article and author information

Author details

  1. Riccardo Caramellino

    Visual Neuroscience Lab, International School for Advanced Studies, Trieste, Italy
    Contribution
    Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Visualization, Writing - original draft
    Contributed equally with
    Eugenio Piasini
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0003-2201-8079
  2. Eugenio Piasini

    Computational Neuroscience Initiative, University of Pennsylvania, Philadelphia, United States
    Present address
    Neural Computation Lab, International School for Advanced Studies, Trieste, Italy
    Contribution
    Conceptualization, Formal analysis, Methodology, Software, Writing – review and editing
    Contributed equally with
    Riccardo Caramellino
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0003-0384-7699
  3. Andrea Buccellato

    Visual Neuroscience Lab, International School for Advanced Studies, Trieste, Italy
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  4. Anna Carboncino

    Visual Neuroscience Lab, International School for Advanced Studies, Trieste, Italy
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  5. Vijay Balasubramanian

    Computational Neuroscience Initiative, University of Pennsylvania, Philadelphia, United States
    Contribution
    Conceptualization, Funding acquisition, Supervision, Writing – review and editing
    For correspondence
    vijay@physics.upenn.edu
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0002-6497-3819
  6. Davide Zoccolan

    Visual Neuroscience Lab, International School for Advanced Studies, Trieste, Italy
    Contribution
    Conceptualization, Funding acquisition, Methodology, Project administration, Resources, Supervision, Writing – review and editing
    For correspondence
    zoccolan@sissa.it
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0001-7221-4188

Funding

FP7 Ideas: European Research Council (616803-LEARN2SEE)

  • Davide Zoccolan

National Science Foundation (1734030)

  • Vijay Balasubramanian

National Institutes of Health (R01NS113241)

  • Eugenio Piasini

Computational Neuroscience Initiative of the University of Pennsylvania

  • Vijay Balasubramanian

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We acknowledge the financial support of the European Research Council Consolidator Grant project no. 616803-LEARN2SEE (DZ), the National Science Foundation grant 1734030 (VB), the National Institutes of Health grant R01NS113241 (EP) and the Computational Neuroscience Initiative of the University of Pennsylvania (VB). These funding sources had no role in the design or execution of this study, in the analyses or interpretation of the data, or in the decision to submit the results.

Ethics

All animal procedures were conducted in accordance with international and institutional standards for the care and use of animals in research and were approved by the Italian Ministry of Health after consultation with a veterinarian (Project DGSAF 25271, submitted on December 1, 2014 and approved on September 4, 2015, approval 940/2015-PR).

Senior Editor

  1. Timothy E Behrens, University of Oxford, United Kingdom

Reviewing Editor

  1. Stephanie E Palmer, The University of Chicago, United States

Publication history

  1. Preprint posted: May 18, 2021 (view preprint)
  2. Received: July 13, 2021
  3. Accepted: November 18, 2021
  4. Version of Record published: December 7, 2021 (version 1)

Copyright

© 2021, Caramellino et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.


Cite this article

  1. Riccardo Caramellino
  2. Eugenio Piasini
  3. Andrea Buccellato
  4. Anna Carboncino
  5. Vijay Balasubramanian
  6. Davide Zoccolan
(2021)
Rat sensitivity to multipoint statistics is predicted by efficient coding of natural scenes
eLife 10:e72081.
https://doi.org/10.7554/eLife.72081