Introduction

Recent studies in mice challenge the traditional notion of the primary visual cortex (V1) receptive field (RF), in which the neural response is directly driven by feedforward inputs and merely modulated by stimuli in the surround. A substantial body of literature, primarily on primates and cats (Maffei and Fiorentini, 1976; Allman et al., 1985; Angelucci et al., 2002, 2017), supports this traditional notion. However, recent studies in mice have revealed that neurons in V1 can be activated by stimuli presented in the surround, even when a gray patch occludes the classical RF (Keller et al., 2020). To specifically investigate responses to the surround of the RF, the stimulation protocols in these studies consisted of presenting stimuli in the surround while occluding the RF center with a patch. While some inconsistencies exist in primate studies (Sillito et al., 1995; Cavanaugh et al., 2002; Gieselmann and Thiele, 2008; see Discussion), this phenomenon has been consistently characterized in mice (Schnabel et al., 2018; Keller et al., 2020; Kirchberger et al., 2023). Optogenetic manipulation suggests that the increase in firing rates for the gray patch depends on feedback connections from higher-order areas (Keller et al., 2020). However, it remains unclear how to functionally interpret the responses in the absence of direct stimulation of the classical RF and how they relate to the properties of stimuli in the surround.

A first possible interpretation is that the increase in neural firing in response to a stimulus in the surround reflects a prediction error response, i.e. a mismatch or omission signal that results from a comparison between the gray mask and its surrounding stimuli (Rao and Ballard, 1999; Keller et al., 2020). A second explanation for responses to surround stimuli in the absence of bottom-up inputs is a fill-in effect, where neurons interpolate information from the surrounding context to infer the occluded part of the image (Ramachandran and Gregory, 1991; Fiorani Júnior et al., 1992; Derrington, 1996; Pessoa et al., 1998; Murakami et al., 1997; Komatsu, 2006; Muckli et al., 2015). Such an inference may reflect a process similar to, e.g., the response to an illusory contour, which also does not require direct stimulation of the classical RF (von der Heydt et al., 1984; Lee and Nguyen, 2001; Pak et al., 2020; Nieder, 2002; Redies et al., 1984). A third possible explanation for increased firing rates to the surround stimulus is that the surround stimulus effectively creates a gray achromatic surface stimulus, which is represented by enhanced neural firing. In macaque V1, Zweig et al. (2015) have shown that responses to black or white achromatic surfaces are delayed at the center compared to the edge, with a systematic increase in response latency depending on the size of the surface. Zweig et al. (2015) suggest that this reflects the perceptual inference of the achromatic surface information, requiring information transfer from the edge towards the center of the surface. Fourth, Schnabel et al. (2018) and Kirchberger et al. (2023) explain responses to distal surround stimuli as figure-ground modulation. At present, it is unclear which of these interpretations is supported by empirical data.

An important factor to consider when describing responses to surround-only stimulation is the size of the gray patch. For example, Keller et al. (2020) show a maximum response to a 15° gray patch. However, it is possible that for such a small gray patch, combined with the inclusion of neurons with RF distances to the patch center of up to 10° of visual angle, the surrounding stimuli might still stimulate the classical RF (see Discussion). The response to this bottom-up input can then be amplified due to a center-surround mismatch. It is therefore important to demonstrate responses for large gray patches where a direct feedforward input can be excluded. Although some figures in Keller et al. (2020) show a small increase relative to baseline for gray patches larger than 50°, this effect was not consistent in the study, which may, e.g., reflect the specific fluorescence normalization used with two-photon imaging. The size of the patch may also be an important factor in determining the latency of the neural responses. As noted above, Zweig et al. (2015) have shown that responses to black or white achromatic surfaces are delayed at the center compared to the edge, with a latency that increases systematically with the size of the surface. This effect was interpreted as a gradual perceptual inference of the surface information, traveling from the edge toward the center of the surface.

In the present study, we recorded V1 and lateral geniculate nucleus (LGN) neurons using Neuropixels in awake mice. We demonstrate that neural responses in V1 can increase with stimulation of the distal surround, up to 90° diameter, whereas LGN responses decrease for the same stimuli. Based on this observation, we perform a detailed investigation of the neural responses to distal surround stimuli with a large gray patch covering the RF. We systematically investigate the dependence of the neural response on the properties of the surrounding stimuli using single-unit and population decoding analyses. We presented different types of stimuli: (1) stationary and drifting gratings that were spatially continuous and moved coherently; (2) surround stimuli that were divided into two gratings that lacked motion coherence and could be spatially discontinuous; (3) noisy textures; and (4) luminous surfaces (black/white).

Results

Responses to occluded grating stimuli

We used Neuropixels probes to record neural activity across all layers of V1 in head-fixed mice placed on a running disk (Figure 1a,b; see Methods). We centered the visual stimuli on the RFs of the recorded neurons (Figure 1c; see Methods). In the first experimental paradigm, we presented grating stimuli. Stimuli were presented in two main conditions, namely “Gratings” (Gr) and “Gray-center/Grating-surround” (GcGr) (Figure 1d). In the Gratings condition, hereafter also called the classical condition, gratings of different sizes were presented with direct visual stimulation of the neurons’ classical RFs. By contrast, in the Gray-center/Grating-surround condition, gray circular patches of different sizes were positioned at the center of the stimulus, similar to Keller et al. (2020). The gray patches thus occluded the stimulus behind the patch and had the same intensity as the gray screen in the inter-trial interval. Grating stimuli were presented either in a drifting or in a stationary condition (i.e., DrGr and StGr for classical, DrGcGr and StGcGr for the gray patch condition). We included only single units that met several criteria in terms of visual responses (see Methods) and, importantly, whose RF center was less than 10° (absolute distance) from the center of the grating stimulus.

V1 responses to gray patches with gratings in the surround. a) Extracellular recordings across V1 layers in awake head-fixed mice on a running disk. b) Example session of current source density analysis to identify cortical layers. c) Sparse noise protocol (top) for RF (receptive field) mapping. Example RF for MUA (multi-unit activity). d) Illustration of main stimulus conditions: In the classical condition (Gr), grating stimuli of different sizes were presented, either drifting (Dr) or stationary (St). In the gray-center/grating-surround condition (GcGr), a gray patch was centered on the neuronal RF and had the same luminance as the background during the inter-trial interval. Hence, at stimulus onset, only the surround stimulus changes. Shown here is a grating (top) or gray center patch (bottom) of 70°. The dashed circle represents an RF of 20° diameter. e) Average firing rates of single units, normalized to baseline, shown in logarithmic scale. The left panel corresponds to the early (0.04 s to 0.15 s) and the right panel to the late stimulus period (0.2 s to 1 s after stimulus onset) (number of neurons n=335, 6 animals). f) Statistical analysis for all conditions from e). Sizes are separated into small < 45° and large ≥ 45° (*p-values < 0.01 comparing drifting vs. stationary per size, Wilcoxon signed-rank test). g) Average spike density normalized to baseline in logarithmic scale. Solid lines represent drifting conditions and dashed lines represent stationary conditions. The black line on top of each subplot represents the stimulus period. h) Histogram of rise times of neural responses for Gr (black) or GcGr (red) (sizes ≥ 45°), PDF is the probability density function. Solid lines are a Kernel smoothing function of the histogram. Difference between drifting and stationary gratings: *p-value < 0.01. i) Each point represents the peak response time of the average spike density function as a function of response magnitude.
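As an illustration of the normalization described in panel e), the following minimal sketch computes a baseline-normalized firing rate on a logarithmic scale for the early and late stimulus periods. The pre-stimulus baseline window, the use of log10, the small constant `eps`, and all function names are assumptions made for this example; they are not taken from the Methods, and the spike times are synthetic.

```python
import numpy as np

EARLY = (0.04, 0.15)     # s after stimulus onset, as stated in the caption
LATE = (0.20, 1.00)      # s after stimulus onset, as stated in the caption
BASELINE = (-0.30, 0.0)  # ASSUMED pre-stimulus window (not given in the text)


def mean_rate(spike_times, onsets, window):
    """Trial-averaged firing rate (spikes/s) in `window` relative to each onset."""
    spike_times = np.sort(np.asarray(spike_times, dtype=float))
    onsets = np.asarray(onsets, dtype=float)
    start = np.searchsorted(spike_times, onsets + window[0], side="left")
    stop = np.searchsorted(spike_times, onsets + window[1], side="left")
    return np.mean(stop - start) / (window[1] - window[0])


def log_normalized_rate(spike_times, onsets, window, baseline=BASELINE, eps=1e-9):
    """log10 of the ratio between the rate in `window` and the baseline rate.

    `eps` only guards against log(0) for silent units (an assumption of this sketch).
    """
    response = mean_rate(spike_times, onsets, window)
    base = mean_rate(spike_times, onsets, baseline)
    return np.log10((response + eps) / (base + eps))


# Synthetic unit: 3 baseline spikes, 5 early spikes and 4 late spikes per trial.
onsets = np.arange(5) * 2.0
offsets = np.array([-0.25, -0.15, -0.05,
                    0.05, 0.07, 0.09, 0.11, 0.13,
                    0.3, 0.5, 0.7, 0.9])
spikes = (onsets[:, None] + offsets[None, :]).ravel()

early = log_normalized_rate(spikes, onsets, EARLY)
late = log_normalized_rate(spikes, onsets, LATE)
print(f"early: {early:.3f}, late: {late:.3f}")
```

With this convention, values above zero indicate firing above baseline and values below zero indicate suppression below baseline.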

In the classical condition, we found that neurons showed maximum firing rates for gratings of sizes around 15°-25° of diameter, with surround suppression for larger patch sizes (Figure 1e-g, S1a-c). In the Gray-center/Grating-surround condition, neural firing rates also reached a maximum value when the gray center patch had a size of around 15° of diameter, with a decrease in firing for larger patch sizes (Figure 1e-g). In agreement with Keller et al. (2020), we observed an increase in neural firing rates during the late stimulus period for a 15° diameter gray patch, as compared to the response to full-field drifting grating stimuli (Figure S2d). Keller et al. (2020) have described this firing increase as an “inverse receptive field”. We did not observe such an increase in firing, on average, when the grating in the surround was stationary (Figure S2c,d).

Increased firing rates relative to baseline for gray patch sizes of 15°-25° diameter (Figure 1e-g) might potentially be explained by the presence of an edge in the neural receptive field. Across neurons, the stimulus was not exactly centered on the neuronal RF but could be up to 10° away from the RF center, which might place part of the surround stimulus in the RF. This explanation does not apply to larger patch sizes, such that we can be certain that the surround stimulus does not induce a direct bottom-up drive. In our experiment, we included gray patches with sizes up to 90°. Strikingly, we found that, on average, firing rates were increased relative to baseline (i.e. the gray screen) even for large gray patch sizes up to 90° (Figure 1e-h). That is, a stimulus presented in the distal surround induced a reliable increase in firing rates across neurons. We hereafter refer to this effect as the surround-induced response. Surround-induced responses were stronger in the early stimulus period than in the late stimulus period (Figure 1h). Furthermore, surround-induced responses were found for both drifting and stationary grating stimuli presented in the distal surround. However, we observed a stronger rate increase for drifting gratings in the late stimulus period (Figure 1e,f).
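The geometric argument above can be made concrete with a small worked example. The sketch below treats the RF as a hard-edged disc, which is a simplifying assumption (real RFs have graded borders), and uses the 20° RF diameter of the dashed circle in Figure 1d together with the 10° inclusion criterion as the worst-case offset. The function name is hypothetical.

```python
def surround_reaches_rf(patch_diameter, rf_diameter, rf_offset):
    """Return True if any part of the surround falls inside the RF.

    The gray patch is a disc centered on the stimulus. The RF is modeled as a
    hard-edged disc (ASSUMPTION) whose center lies `rf_offset` degrees from
    the patch center. The surround reaches the RF when the far edge of the RF
    extends beyond the patch radius.
    """
    return rf_offset + rf_diameter / 2.0 > patch_diameter / 2.0


RF_DIAMETER = 20.0  # deg, dashed circle in Figure 1d
MAX_OFFSET = 10.0   # deg, inclusion criterion for single units

overlap = {
    size: surround_reaches_rf(size, RF_DIAMETER, MAX_OFFSET)
    for size in (15, 25, 45, 70, 90)
}
for size, reaches in overlap.items():
    print(f"{size:>2} deg patch: surround inside RF = {reaches}")
```

Under these assumptions, the far edge of the RF lies at most 20° from the patch center, so patches of 45° diameter (22.5° radius) and larger fully cover the RF, whereas patches of 15° and 25° do not.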

We further analyzed how the temporal structure of the surround-induced response differed from responses in the classical stimulus condition. Keller et al. (2020) have shown that firing responses in the gray-center/grating-surround condition are delayed as compared to the classical grating condition, suggesting a potential dependence on feedback processing. However, Keller et al. (2020) did not study the dependence of this delay on the size of the gray patch. To study the latencies of firing responses, we first quantified, for each unit, the rise time of the neural response based on the spike-density function. Comparing classical stimuli with stimuli covered by large gray center patches of 45° diameter and larger, we observed that response latencies were delayed for the gray-center/grating-surround condition, both for drifting and stationary stimuli. Response latencies showed a systematic dependence on the size of the gray center patch. Response latencies were comparable between the gray-center/grating-surround and classical grating conditions up to gray patch sizes of about 45° diameter (Figure 1h). For larger gray patch sizes, firing responses in the gray-center/grating-surround condition were delayed by about 50 ms compared to the classical grating condition (Figure 1h). This resulted in a negative correlation between the magnitude of the neural response and the response latency (Figure 1i).
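To illustrate how a rise time can be read out from a spike-density function, the following sketch smooths spike times with a Gaussian kernel and reports the first post-onset time at which the density crosses a fraction of the baseline-to-peak range. The kernel width, the half-height criterion, and the function names are assumptions for this example; the exact definition used in the study is given in the Methods and may differ. The two spike trains are synthetic and shifted by 50 ms.

```python
import numpy as np


def spike_density(spike_times, t, sigma=0.01):
    """Gaussian-kernel spike density (spikes/s) evaluated at times `t`."""
    spike_times = np.asarray(spike_times, dtype=float)
    diffs = t[:, None] - spike_times[None, :]
    kernel = np.exp(-0.5 * (diffs / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return kernel.sum(axis=1)


def rise_time(density, t, fraction=0.5):
    """First time at or after onset (t >= 0) where the density reaches
    baseline + fraction * (peak - baseline). ASSUMED definition."""
    baseline = density[t < 0].mean()
    post = t >= 0
    peak = density[post].max()
    threshold = baseline + fraction * (peak - baseline)
    above = np.flatnonzero(post & (density >= threshold))
    return t[above[0]] if above.size else np.nan


t = np.arange(-0.2, 0.5, 0.001)                 # time axis in seconds
classical_spikes = np.linspace(0.05, 0.30, 60)  # synthetic response from 50 ms
patch_spikes = classical_spikes + 0.05          # same response, 50 ms later

rt_classical = rise_time(spike_density(classical_spikes, t), t)
rt_patch = rise_time(spike_density(patch_spikes, t), t)
print(f"latency difference: {1000 * (rt_patch - rt_classical):.0f} ms")
```

A threshold defined relative to each unit's own baseline and peak makes the latency estimate largely independent of response magnitude, which matters here because surround-induced responses are weaker than classical responses.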

In addition, we recorded single neurons in the LGN using the same paradigm (Figure 2a,b). Similar to V1 neurons, LGN neurons responded maximally to grating stimuli in the classic RF of around 15°-25° diameter. In contrast to V1 neurons, LGN neurons did not show an increase in firing rates relative to baseline for larger patch sizes in the gray-center/grating-surround condition. Both during the early and late stimulus period, LGN firing rates decreased below baseline levels, with maximum suppression around 45° diameter (Figure 2c). Even at 25° diameter, the firing response of LGN neurons was weak and did not differ from baseline levels in the late stimulus period. This analysis indicates that the increase in V1 firing rates for patch sizes of 25° up to 90° diameter is not inherited from the LGN, but rather depends on horizontal or top-down cortical feedback.

LGN responses to gray patches with a grating in the surround. a) Stimuli, as in Figure 1. b) Representative scheme of extracellular recordings in LGN. c) Average firing rates normalized to baseline (30 single units, 2 animals).

Responses to discontinuous surround stimuli

As described above, grating stimuli presented in the distal surround can increase V1 firing rates relative to baseline (i.e. surround-induced response). These grating stimuli were spatially coherent, i.e. they had a continuous spatial structure that was interrupted by the gray center patch. Such a continuous grating stimulus allows for prediction, through interpolation, of the stimulus behind the gray center patch. We thus wondered to what extent the surround-induced response depends on the spatial continuity of the surround stimuli. To investigate this, we disrupted the continuity of the surround stimulus by dividing it into two separate gratings of orthogonal orientations (Figure 3a,b). These two gratings were placed next to each other, with the dividing line centered on the neuronal RF. In the drifting grating condition, the stimuli moved in orthogonal directions. Gratings were presented in two conditions, either vertical and horizontal (0° and 90°) or diagonal (45° and 135°) (Figure 3a,b). In the vertical-horizontal condition, the two gratings were spatially discontinuous at the location of the patch, whereas in the diagonal condition, the gratings were spatially continuous (although motion was non-coherent in both conditions). For the gray-center/grating-surround condition, a rectangular gray patch was superimposed onto the grating stimuli. We varied the width of this rectangular gray patch up to 90°. Altogether, we defined four conditions for each of the gratings (Gr) and gray-center/grating-surround (GcGr) conditions: drifting/stationary (Dr/St) and continuous/non-continuous (C/NC), i.e. DrC, DrNC, StC, and StNC.

Neural responses to rectangular gray patches with orthogonal grating stimuli in the surround. a) Spatially continuous gratings (C) and a gray rectangular patch covering the gratings (GcC). The stimulus could be presented as either drifting or stationary. b) Non-continuous gratings in the surround (NC), which were covered by the gray patch in a subset of trials (GcNC). c) Population size tuning shown as the logarithm of firing rates normalized to baseline during early (0.04 s to 0.15 s) and late (0.2 s to 1 s) stimulus periods. “Dr” and “St” refer to drifting and stationary conditions. d) Statistical analysis of data in (c). Sizes were divided into small (< 45°) and large (≥ 45°) (*p-values < 0.01, Wilcoxon signed-rank test). e) Average spike density function for different sizes of the rectangular gray patch. Solid line represents the stimulus period. f) Histogram of response latencies (rise time) for Gr (black) and GcGr (red) conditions. Latency was computed for sizes of ≥ 45°. The black and red lines are (kernel) smoothing estimates. Drifting and stationary conditions are pooled together. PDF corresponds to the probability density function (Wilcoxon signed-rank test, *p-value < 0.01). g) Scatter plot of rise time for the Gr vs. GcGr condition (sizes ≥ 45°, r-Pearson correlation value).

We found surround-induced responses for gratings with orthogonal orientations (Figure 3c-e), with delays in response latencies for both drifting and stationary gratings (Figure 3f,g).

The surround-induced response was found both for drifting and stationary gratings and for both spatially continuous (i.e. diagonal) and non-continuous (vertical-horizontal) gratings. In fact, surround-induced responses were greater for the non-continuous than continuous surround stimuli (comparing StGcNC vs. StGcC) during the late stimulus period (Figure 3d).

The observation that the continuity of the stimulus in the surround is not necessary to generate a surround-induced response predicts that surround-induced responses might also occur for a noise stimulus in the distal surround. We therefore presented pink noise stimuli either in the classical condition or with a circular gray patch centered on the neuronal RF (Figure 4a). In the gray-center/noise-surround condition, surround-induced responses were observed for gray patches up to 90° diameter (Figure 4b,c). Surround-induced responses were strongest in the early stimulus period, and their magnitude decreased with the size of the gray patch (Figure 4c). We also observed a difference of ∼50 ms in response latency between the classical (i.e. a noise image centered on the RF) and the gray-center/noise-surround condition (Figure 4d,e).

Neural responses to gray patch with pink noise background in the surround. a) Illustration of stimuli for 70° gray patch: Pink noise patch with a gray background (PN), or a gray patch with a pink noise background (GcPN). The black line is the PSTH for the PN condition. b) Average normalized firing rates (relative to baseline; logarithmic scale) for different sizes of the gray patch in the gray-center/noise-surround (GcPN) condition. For comparison, we include the pink noise condition at the size of 0 (PN; black dot). c) Average (normalized to baseline) spike density function for different sizes of the gray patch. d) Probability density function (PDF) of the rise time of the response. The line highlighted shows the Kernel smoothing function estimate from the PDF (Wilcoxon signed-rank test, *p-value < 0.01). e) Scatter plot of the rise time for the PN or GcPN conditions (sizes ≥ 45°, r-Pearson correlation value).

Achromatic surface stimuli

Next, we studied conditions in which gray center patches of different sizes were presented on either a white or black background (Figure 5a,b) (Gray-center/White-surround, GcWs; Gray-center/Black-surround, GcBs). In the classical condition, we presented white or black patches of different sizes on a gray background (White-center/Gray-surround, WcGs; Black-center/Gray-surround, BcGs) (Figure 5a,b). The onset of a white or black surround stimulus (GcWs and GcBs) led to an increase in V1 firing rates above baseline levels. This firing increase was strongest for gray patches around 5°-15° of diameter but was still significant for large gray patches (Figure 5c-f). V1 firing rates were also increased for white or black surface stimuli centered on the neuronal RF (WcGs and BcGs, Figure 5c-f). Differences in firing rates were relatively small when comparing the Gray-center/White-surround and White-center/Gray-surround conditions, as well as the Gray-center/Black-surround and Black-center/Gray-surround conditions. In the late stimulus period, opposite patterns were found for white and black: Firing rates were slightly higher in the White-center/Gray-surround than in the Gray-center/White-surround condition (Figure 5e). However, firing responses were stronger for the Gray-center/Black-surround than Black-center/Gray-surround condition. Hence, firing rates were generally higher in the condition with the brighter surface in the center. Furthermore, we found that latencies were not significantly different between conditions and peaked around 120 ms (Figure 5g,h). Thus, firing responses to gray surface stimuli on a white/black background tend to have comparable magnitudes and latencies as white/black surface stimuli on a gray background. This differs from the case of grating stimuli, where we found a strong latency difference and stronger responses for grating stimuli on a gray background, as compared to gray patch stimuli on a grating background (see Figure S3 for a direct comparison).

Neural activity for a gray patch with a black or white surround. a) White stimuli: white patches with a gray surround (WcGs) or a gray patch with a white surround (GcWs). b) Black stimuli: black center with a gray surround (BcGs) or a gray patch with a black surround (GcBs). c) Average firing rates normalized to baseline and in logarithmic scale for early (0.04 s to 0.15 s) and late (0.2 s to 1 s) stimulus period. d) Same as (c) for black stimuli shown in (b). e) Mean and SEM of neural responses for small and large patch sizes, separately for early and late stimulus periods. (*p-values < 0.01). f) Average spike density for different stimulus sizes. Spike densities are normalized to baseline and shown in a logarithmic scale. g) Histogram of response latencies across neurons (response rise time). The line highlighted shows the estimated probability density (kernel smoothing). Response latencies were computed for patch sizes of 45° and larger. PDF corresponds to the probability density function. h) Scatter plot of the rise time for the classical and gray patch condition per unit (n=247, 6 animals, r-Pearson correlation value).

Decoding of population firing rate vectors

We further investigated differences in neural population vectors between different kinds of surround stimuli. To this end, we computed the Euclidean distance between two firing rate vectors for all pairs of trials (Figure 6a-b). Based on these distance matrices, we computed low-dimensional embeddings via t-SNE (Figure 6c). In addition, we performed supervised classification via support vector machines (Figure 6d). We performed these analyses on gray patches or stimuli of 70° and 90° diameter for the protocol with stationary and drifting gratings (Figure 1), the protocol with a rectangular gray patch and orthogonal gratings (Figure 3), and the protocol with black/white backgrounds (Figure 5).
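The distance computation described above can be sketched as follows. This is only an illustration on synthetic firing-rate vectors: the condition means, noise level, trial and neuron counts, and function names are assumptions for the example, and the t-SNE embedding step is not shown.

```python
import numpy as np


def euclidean_distance_matrix(rates):
    """Pairwise Euclidean distances between trials.

    rates: array of shape (n_trials, n_neurons), one firing-rate vector per trial.
    Returns an (n_trials, n_trials) dissimilarity matrix.
    """
    diff = rates[:, None, :] - rates[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def between_condition_distance(dist, labels, cond_a, cond_b):
    """Mean and SEM of the distances between trials of two conditions."""
    block = dist[np.ix_(labels == cond_a, labels == cond_b)].ravel()
    return block.mean(), block.std(ddof=1) / np.sqrt(block.size)


# Synthetic population (ASSUMED values): strong classical responses,
# weak and mutually similar responses in the two patch conditions.
rng = np.random.default_rng(0)
condition_means = {"StGr": 5.0, "DrGr": 8.0, "StGcGr": 1.0, "DrGcGr": 1.5}
n_trials, n_neurons = 20, 50
rates = np.vstack([
    mean + 0.5 * rng.standard_normal((n_trials, n_neurons))
    for mean in condition_means.values()
])
labels = np.repeat(list(condition_means), n_trials)

dist = euclidean_distance_matrix(rates)
patch_mean, patch_sem = between_condition_distance(dist, labels, "StGcGr", "DrGcGr")
classical_mean, classical_sem = between_condition_distance(dist, labels, "StGr", "DrGr")
print(f"GcSt-GcDr: {patch_mean:.2f} +/- {patch_sem:.2f}")
print(f"St-Dr:     {classical_mean:.2f} +/- {classical_sem:.2f}")
```

Because Euclidean distance is sensitive to overall rate differences, conditions with very different response magnitudes will separate even when their tuning patterns are similar; this is worth keeping in mind when comparing classical and patch conditions.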

Population analyses to investigate stimulus specificity. a) Dissimilarity matrices of firing rate vectors across trials. The distance between firing rate vectors was computed using Euclidean distances. For visualization purposes, the diagonal shows the maximum value. StGr: Stationary grating (classical condition). DrGr: Drifting grating (classical condition). StGcGr indicates a gray-center/grating-surround with a stationary grating (i.e. patch condition). b) Mean distances between protocols based on dissimilarity matrices. Black lines show the standard deviation divided by √n (i.e., SEM), where n is the number of samples (i.e., distances). St-GcSt indicates the distance between stationary grating (classical) and gray-center/stationary-grating (patch) conditions. For gratings with a circular gray patch during early periods, only (St-GcSt, St-GcDr) were not distinguishable (p-value = 0.396). For late periods, all the distances were statistically distinguishable. For gratings with a rectangular patch during early periods, (St-GcSt, Dr-GcSt), (St-GcSt, Dr-GcDr), and (Dr-GcSt, Dr-GcDr) were not significantly distinguishable (p-values of 0.4585, 0.9402, and 0.9478, respectively). For late periods, all comparisons yielded significance. Finally, for B&W, all comparisons were statistically significant except for the distances between (BcGs-GcWs, GcBs-GcWs) (p-value = 0.0247) for early periods. For late periods, the comparisons (BcGs-GcBs, BcGs-WcGs) and (GcBs-GcWs, GcBs-WcGs) were not significantly distinct (p-values of 0.0338 and 0.0308, respectively). c) 2D t-SNE embedding based on dissimilarity matrices shown in a). d) Support Vector Classifier (SVC) based on matrices in a). Classification score across 20 repetitions. 40% of trials were used for training and 60% for testing.

Across protocols, we refer to the condition with the gray patch in the center as the “patch” condition and the condition with the stimulus (grating, or black surface) centered on the RF as the “classical” condition. We first examined whether the patch and classical condition could be distinguished. In all three protocols, firing rate vectors formed distinct clusters for the patch condition and the classical condition, with classification performance above 90% for all stimulus conditions and for both early and late stimulus periods (Figure 6d). Next, we analyzed to what extent the surround stimulus in the patch condition could be decoded from the surround-induced response. For all protocols, the surround stimulus could be decoded with high accuracy during both the early and late stimulus periods (Figure 6d). Decoding performances were comparable between the patch and classical conditions. That is, the surround-induced response contained about the same amount of information about the surround stimulus as the activity in the classical condition (i.e. when the same stimulus was presented on the RF) (Figure 6d).
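A decoding analysis of this kind can be sketched with scikit-learn as below. The 20 repetitions and the 40%/60% train/test split follow the Figure 6 caption; the linear kernel, the stratified splitting, training on firing-rate vectors rather than on the distance matrices, and the synthetic data are assumptions of this example and may not match the actual pipeline.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def decoding_accuracy(rates, labels, n_repeats=20, train_fraction=0.4, seed=0):
    """Mean and standard deviation of test accuracy over repeated random splits.

    rates: (n_trials, n_neurons) firing-rate vectors; labels: condition per trial.
    """
    scores = []
    for rep in range(n_repeats):
        x_train, x_test, y_train, y_test = train_test_split(
            rates, labels,
            train_size=train_fraction,
            stratify=labels,           # ASSUMED: keep class balance in each split
            random_state=seed + rep,
        )
        clf = SVC(kernel="linear")     # ASSUMED kernel; not specified in the text
        clf.fit(x_train, y_train)
        scores.append(clf.score(x_test, y_test))
    return float(np.mean(scores)), float(np.std(scores))


# Synthetic surround-induced responses for two surround stimuli (ASSUMED values).
rng = np.random.default_rng(1)
n_trials, n_neurons = 40, 30
stationary = 1.0 + 0.5 * rng.standard_normal((n_trials, n_neurons))
drifting = 2.0 + 0.5 * rng.standard_normal((n_trials, n_neurons))
rates = np.vstack([stationary, drifting])
labels = np.repeat(["StGcGr", "DrGcGr"], n_trials)

mean_acc, sd_acc = decoding_accuracy(rates, labels)
print(f"decoding accuracy: {mean_acc:.2f} +/- {sd_acc:.2f}")
```

Repeating the split with different random seeds gives a spread of scores, so that the reported accuracy does not hinge on one particular partition of the trials.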

Nevertheless, there were differences between the grating and rectangular protocols compared to the black and white protocol. The t-SNE and dissimilarity matrices showed two main clusters in the grating and rectangular protocols, one for the classical and one for the patch condition. This was reflected by the fact that the distance between stationary and drifting gratings in the patch condition was substantially smaller than the distance between the other conditions. However, in the black-white protocol, we did not observe a distinct cluster for the gray patch condition, and the distance within the patch condition was not consistently lower than distances between the other conditions (Figure 6a-b). Finally, we examined if the surround-induced responses in the gray patch condition, for a given surround stimulus, tended to be similar to the responses in the classical condition for the same stimulus. This was generally not the case. For example, the distance between a stationary grating in the classical and patch condition (St-GcSt) was larger than the distance between a drifting grating in the classical condition and a stationary grating in the patch condition (Dr-GcSt).

Discussion

Recent studies suggest that V1 neurons can be driven by a surround stimulus when a gray patch covers the classical RF, which likely depends on feedback (Schnabel et al., 2018; Keller et al., 2020; Kirchberger et al., 2023). This effect has been interpreted as a prediction of the occluded content or a prediction error (Keller et al., 2020). Alternatively, it may reflect the representation of the uniform achromatic surface itself (Zweig et al., 2015). We recorded V1 and LGN neurons using Neuropixels in awake mice and show that V1 firing rates increase when a grating stimulus is presented in the background while the RF is covered by a gray patch of up to 90° of visual angle. LGN firing rates decreased for larger gray patches, suggesting that the increase in V1 firing rates for large gray patches derives from horizontal or top-down feedback. V1 response latencies increased systematically with the size of the gray center patch. Increased firing at the gray center patch did not require spatial continuity or motion coherence of the surround stimulus and generalized to noisy textures and luminous surfaces. Responses to black/white surfaces had a similar magnitude and response latency as responses to a gray patch with black/white stimuli in the surround. We suggest that increased V1 firing for a gray patch following the presentation of a distal surround stimulus primarily reflects the representation of the achromatic surface.

Our findings further suggest that the increase in V1 firing is due to horizontal or top-down feedback because surround stimuli induced decreases in LGN firing rates, which had RF sizes comparable to V1. Furthermore, mice make infrequent and small eye movements, while our surround stimuli were up to 45° away from the RF center.

Surround-induced responses

Our results demonstrate that V1 neurons are driven (i.e. show increased firing rates relative to baseline) by various kinds of distal surround stimuli. We refer to this effect as the “surround-induced response”. We further observed that response latency increased with the size of the gray patch, by up to about 50 ms. In previous experiments, Keller et al. (2020) also presented drifting gratings masked by circular gray patches with sizes up to 90°. Some (their Figure 1) but not all (their Figure 4) of their figures showed increased ΔF/F activity above zero.

We further show that the surround-induced response generalizes to moving and stationary stimuli, continuous and non-continuous stimuli, noisy textures, and achromatic surfaces. Thus, the surround-induced response can occur when the surround stimulus cannot be interpolated inside the region of the gray center patch (i.e. lack of spatial continuity). Our findings further suggest that surround-induced responses also occur in the absence of motion coherence and do not require a perceptual inference that an object is moving behind the gray center patch. The surround-induced response also occurs when the gray center patch is a circular object on a uniform achromatic background, in which there is no salient object behind the gray center patch. We also observed that the surround-induced response was stimulus-specific: it was possible to decode with high accuracy if the surround stimulus was drifting or stationary. Likewise, it was possible to decode if the surround stimulus was black or white. It is possible that e.g. the difference between a stationary and drifting grating reflects the strength of the surround input, with less adaptation for drifting surround stimuli.

The inverse receptive field

The surround-induced response should be distinguished from the concept of an inverse RF that was recently put forward by Keller et al. (2020). The surround-induced response reflects an increase in firing rates relative to baseline levels, while the inverse RF refers to an increase in firing for a full-field stimulus masked by a gray patch relative to the same full-field stimulus without the patch. Thus, the interpretation of the “inverse RF” is that V1 firing rates increase due to the omission of classic RF stimuli.

However, it is unclear what mechanism underlies the inverse RF. Keller et al. (2020) reported preferred inverse RF sizes of about 15°. Similarly, we found an increase in V1 firing rates as compared to the full-field grating when it was masked by gray circular patches of 15°. We observed this increase specifically in the late stimulus period and only for drifting, but not for static gratings (Figure S2). We further find that the neural response for stimuli around 15° to 25° does not have a delayed latency as compared to classic RF inputs (note that Keller et al. (2020) did not analyze the dependence of response latencies on the gray patch size). Given that the strongest inverse response occurs for circular gray patches around 15° without a delayed response latency, it is possible that for such a stimulus there is some remaining bottom-up input into the classical RF. We note that V1 RFs are approximately 15° in size and that the analysis includes units (also in Keller et al., 2020) whose classic RF lies within 10° of the stimulus center. Hence, some of the grating in the surround may still cover the classical RF, which is consistent with the finding that LGN neurons show increased firing relative to baseline in the gray-center/grating-surround condition for gray patches of 15°. We posit that when a small circular gray patch is superimposed onto a grating stimulus, there are two factors determining the neural response: (1) The gray patch changes the spatial frequency content of the classical RF input. Consequently, the bottom-up drive may decrease compared to a grating stimulus. (2) The mismatch between surround and classic RF input induced by the gray patch can increase the response strength. If the influence of the second factor exceeds the influence of the first factor, the overall neural response may increase compared to a full-field grating stimulus, giving rise to an “inverse RF”. As the second factor depends on feedback, inverse RFs may be observed specifically in superficial layers (Keller et al., 2020).

Differences between species

Our study suggests that there are surround-induced responses in mice, i.e. an increase in firing relative to baseline caused by stimuli in the distal surround. However, it is unclear whether this phenomenon also occurs in primates. Several studies did not report surround-induced V1 firing responses for grating stimuli, in either the anesthetized or the awake monkey (Gieselmann and Thiele, 2008; Sillito et al., 1995; Cavanaugh et al., 2002). By contrast, other studies did observe surround-induced V1 firing responses in primates. Rossi et al. (2001) showed an increase in V1 firing rates in the awake monkey for an oriented textured surround with a gray patch of 4° centered on the neuronal RF. Similar to our study, this response was substantially weaker than the classical response and delayed in time, and it was induced by distal surround stimulation, considering that macaque RFs are about 1° wide. Papale et al. (2023) found increased V1 firing rates for natural scenes. In their study, however, the surround stimulation occurred relatively close to the neuronal RFs. In humans, similar stimuli have been shown to increase V1 BOLD responses (Muckli et al., 2015).

It remains to be investigated what explains these discrepancies between non-human-primate studies. A possible explanation is that the studies reporting surround-induced responses used a full-field stimulus in the surround (Rossi et al., 2001; Papale et al., 2023), similar to our study, while the studies that did not report surround-induced responses used a smaller surrounding annulus (Sillito et al., 1995; Cavanaugh et al., 2002; Gieselmann and Thiele, 2008). It is possible that many forms of surround stimulation induce subthreshold activity in V1 neurons, composed of a mixture of excitatory and inhibitory conductances, but that only a subset of stimuli induce suprathreshold activity. Indeed, studies of the cortical point-spread function demonstrate that a visual stimulus elicits a wave of activity that propagates over a region up to 10 times larger than the retinotopic start point (Grinvald et al., 1994). Moreover, studies in cats have shown that postsynaptic integration fields in V1 are up to five times larger than the integration fields of suprathreshold spiking activity (Bringuier et al., 1999). The latencies of the subthreshold potentials increased with the distance of the stimulus from the center of the integration field.

There may be genuine differences between mice and monkeys. In contrast to macaque V1 (Talluri et al., 2023), neurons in mouse V1 can be strongly driven by many factors not related to visual stimulation (Vinck et al., 2015; Stringer et al., 2019). Consequently, a modulatory or weak input caused by surround stimulation may lead to changes in suprathreshold activity in mouse V1 but not in monkey V1. In monkeys, surround stimuli may also have a different effect on inhibitory and excitatory neurons than in mice. One characteristic feature of primates is their high-acuity vision as compared to rodents. Theoretical models of predictive and efficient coding entail that stimuli with high precision should induce stronger inhibitory feedback, whereas lower precision should lead to more pooling (Huang and Paradiso, 2008; Coen-Cagli et al., 2012). This is illustrated by the increased spatial summation for low-contrast stimuli compared to high-contrast stimuli (Sceniak et al., 1999). Therefore, there may be more spatial summation in mouse V1 than in monkey V1.

Mechanisms

One of the central motivations of this study was to understand the functional significance of surround-induced responses. Our data indicate that the surround-induced response in V1 does not depend on enhanced feedforward activation of LGN neurons, whose firing rates decreased with the presentation of distal surround stimuli. A plausible mechanism is therefore horizontal or top-down feedback (Marques et al., 2018).

We will discuss several possible interpretations of surround-induced responses, which are not necessarily exclusive.

  1. A first possibility is that the surround-induced response represents a prediction error or an omission signal (Keller et al., 2020), i.e. a mismatch between the circular gray patch and the surround (Spratling, 2010; Rao and Ballard, 1999). We argue that this explanation may account for the inverse RF, but not for the surround-induced responses. For the latter, there is no mismatch when a large gray patch is centered on the neural RF, because the bottom-up input into the RF (i.e. a homogeneous gray surface) is the same as the near (proximal) surround (i.e. also a homogeneous gray surface). Furthermore, contrary to our observations, one would expect a mismatch response to depend on the continuity of the surround stimuli, i.e. on whether a consistent prediction can be generated from the surround.

  2. A second possibility is that the surround-induced response represents an interpolation (i.e. prediction or fill-in) from the distal surround stimulus into the region of the circular gray patch (Derrington, 1996; Komatsu, 2006; Muckli et al., 2015). Such interpolation can be conceptualized as a perceptual inference of the content behind the occluder based on the surround stimulus. We argue that this is not a plausible explanation for the surround-induced response for the following reasons: (i) We found equally strong surround-induced responses when the surround stimulus was not spatially continuous. For such a stimulus, the rectangle is not perceptually interpreted as an occluder of a “hidden” object. Likewise, we did not find stronger surround-induced responses for moving stimuli in the early period, even though moving stimuli should facilitate the inference that there is an object behind the gray patch. (ii) We found equally strong surround-induced responses when the gray patch appeared as a salient object over a uniform, non-salient achromatic background. In this case, it is unclear why surround-induced responses would represent the uniform background behind the salient object rather than the salient and directly visible object itself.

  3. We argue that the surround-induced response most likely reflects a representation of the gray surface stimulus (patch) itself. A previous study in macaque V1 has shown that for achromatic surfaces (e.g. a black patch on a gray background), neural firing increases at the center of the achromatic surface with a delay compared to the response at the edge (Zweig et al., 2015; Peter et al., 2019). This delay increased with the size of the achromatic surface (Zweig et al., 2015). This effect was interpreted as the inference of the surface information itself. Importantly, this surface information may not be available from the direct feedforward input, considering that a uniform achromatic surface has zero power at all spatial frequencies (Zweig et al., 2015). Thus, according to Zweig et al. (2015), the V1 representation of the center of a uniform achromatic surface derives from neural responses at the edge of the surface. We argue that the surround-induced response with a gray patch in the RF has a similar mechanistic origin. That is, presenting a distal surround stimulus activates neurons around the edge of the gray patch, which then leads to a transient and delayed increase in V1 firing (i.e. a surround-induced response) at the center of the gray patch. This interpretation is compatible with several observations: (i) The response magnitude and latency of the surround-induced response were very similar to the neural response when a black or white patch was presented on a gray background. In fact, in the late stimulus period, surround-induced responses (i.e. with a gray patch) were stronger than responses to a black patch on a gray background. Furthermore, we did not observe that the population vectors for the gray patch on a black or white background formed a separate cluster (in the t-SNE embedding) as compared to a black or white patch on a gray background. (ii) Similar to Zweig et al. (2015), we observed a systematic increase in the latency of surround-induced responses as a function of the surface (patch) size.

    It is possible that surround-induced responses do not merely encode the surface information of the gray patch, but in addition, encode information about the properties of the distal surround stimulus. Such a scenario would be consistent with the finding that human fMRI activity contains information about the predicted content behind the occluder (Muckli et al., 2015). Stimulus specificity, however, may also be due to the strength of neural activation in the cortical regions directly driven by the distal surround stimuli. For example, stronger surround-induced responses for moving than stationary gratings could simply reflect less adaptation of neurons in the surround.

  4. A closely related explanation is that the surround-induced responses represent a figure-ground effect (Self et al., 2013; Schnabel et al., 2018; Kirchberger et al., 2023). In this interpretation, the surround-induced response occurs because the gray patch appears as the figure (i.e. the foreground) on a background, and thus draws bottom-up attention (i.e. is salient). While figure-ground modulation may have contributed to the increase in V1 firing, we argue that it alone does not account for all of our observations. The reason is that figure-ground modulation assumes that there is some representation of the gray patch to begin with, which raises the question of how this representation emerges. Following Zweig et al. (2015), we argue that the representation of a uniform surface, with information traveling from the edge to the center, forms a mechanism through which the surface is seen as an object, leading to perceptual grouping and image segmentation. These signals can then be further boosted when the surface appears as a foreground. However, they may also occur when, for example, the patch is large and flanked by two salient stimuli, as observed here.

In sum, the most consistent explanation for our empirical observations of increased V1 firing due to a distal surround stimulus is that the presentation of a distal surround stimulus leads to the representation of the gray center patch covering the classical RF.

Methods and Materials

Materials availability

Further information and requests for resources should be directed to Martin Vinck (martin.vinck@esi-frankfurt.de).

Data and code availability

The open-source MATLAB toolbox Fieldtrip (Oostenveld et al., 2011) was used for data analysis. Data and custom MATLAB scripts are available upon request from Martin Vinck (martin.vinck@esi-frankfurt.de). For the population analysis, we used Scikit-Learn 0.22.1, Numpy 1.18.1 and Numba 0.51.2 for data cleaning and multi-CPU processing, SciPy 1.5.4 for statistics, and Matplotlib 3.1.3 for visualizations.

Animals

The experiments were conducted in compliance with the European Communities Council Directive 2010/63/EC and the German Law for Protection of Animals, ensuring that all procedures were ethical and humane. All procedures were approved by local authorities, following appropriate ethics review. The study used three- to eight-month-old C57BL/6 mice of both sexes. Mice were maintained on an inverted 12/12 h light cycle and recordings were performed during their dark (awake) cycle.

Head Post Implantation Surgery

One day before the surgery, we handled the mice to reduce stress on the surgery day. We administered an analgesic (Metamizole, 200 mg/kg, sc) and an antibiotic (Enrofloxacin, 10 mg/kg, sc, Bayer, Leverkusen, Germany) and waited for 30 minutes. Anesthesia was then induced by placing the mice in an isoflurane-filled chamber (3% in oxygen, CP-Pharma, Burgdorf, Germany) and maintained throughout the surgery with isoflurane (0.8-1.5% in oxygen). We regulated the animal’s body temperature using a heating pad preset to body temperature. We regularly applied eye ointment (Bepanthen, Bayer, Leverkusen, Germany) to prevent eye dryness. Before making an incision, the skin was disinfected three times with chlorhexidine, followed by ethanol each time. After exposing the skull, we cleaned it with 3% peroxide three times, followed by iodine each time. The animal was positioned on a stereotaxic frame (David Kopf Instruments, Tujunga, California, USA). The skull was then aligned, and we measured and marked the coordinates for V1 bilaterally, using the transverse sinus as a reference point as previously described (Wang et al., 2011) (V1, AP: 1.1 mm anterior to the anterior border of the transverse sinus, ML: 2.0-2.5 mm). We positioned a screw in the frontal part of the skull to stabilize the implant. A custom-made titanium head-post was placed at the level of bregma and secured with dental cement (Super-Bond C & B, Sun Medical, Shiga, Japan). The area designated as V1 was covered with cyanoacrylate glue (Insta-Cure, Bob Smith Industries Inc, Atascadero, CA, USA). We closely monitored the animal’s recovery for 3-5 days, administering antibiotics for two consecutive days and providing metamizole in drinking water. We acclimated the animals to the running disk over five days. On the first day, we placed the mice on the disk for 5 minutes in complete darkness. We gradually increased the duration of exposure over the following days.

Extracellular Recordings

On the day of the recording session, we performed a circular craniotomy of approximately 0.8-1 mm diameter over V1 while the animals were under anesthesia (isoflurane). We administered dexamethasone and metamizole thirty minutes before the procedure. We covered the craniotomy with Kwik-Cast (World Precision Instruments, Sarasota, USA) and inserted two pins into the cerebellum for grounding. We waited for at least 2 hours before the recording session. For the recording sessions, awake animals were head-fixed and placed on a running disk. We used Neuropixels probes. For V1 recordings, the probe was inserted to a depth of around 1100-1300 µm at a 15° angle and we recorded simultaneously from ∼150 channels. For LGN recordings, we recorded simultaneously from 384 channels. We isolated single units with Kilosort 2.5 (Steinmetz et al., 2021) and manually curated them with Phy2 (Rossant et al., 2021).

Visual stimuli

The experiment was run on a Windows 10 computer, and stimuli were presented on an Asus PG279Q monitor set to a 144 Hz refresh rate, racing mode, 50% contrast and 25% brightness. We used Psychtoolbox-3 (Brainard, 1997) to create the stimuli. Throughout the study, we positioned the screen at a 30° angle to the eye contralateral to the recording hemisphere, at a distance of 15 cm. For all protocols, the stimulus duration was 1 s, followed by an inter-trial interval of 1.3 s, unless specified otherwise.

Sparse Noise and Receptive Field Mapping

We employed a locally sparse noise protocol to find the center of the RFs, modified from Allen Brain (see https://observatory.brain-map.org/visualcoding/stimulus/l The protocol consisted of black and white squares of 4.65 degrees, arranged in a 23×42 array. The stimulus was presented for 0.25 s, during which black and white squares were randomly positioned on a gray background. The total session duration was 15 minutes. We averaged the response over all trials and positions of the screen for the black or white squares, separately. A heatmap of the response was created, and we obtained the position and dimension of the response peak. We fitted an ellipse in the center of the response. To center the visual stimuli during the recording session, we averaged the multiunit activity across the responsive channels and positioned the stimulus at the response peak previously found. For all the following analyses based on the neuronal response to visual stimuli, we performed RF mapping using single-unit responses. We performed a permutation test of the responses inside the RF detected vs a circle from the same area where the screen was gray for the same trials. We included RFs of single units that met the following criteria: z-score of the response > 4, a permutation test p-value < 0.03, and an RF diameter within the range of 10° to 30°. We only included units in which the center of the RF was < 10° of visual angle from the center of the stimulus.
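The unit-level RF inclusion criteria listed above can be sketched as a simple filter. This is a minimal illustration rather than the analysis code used in the study: the `RFFit` container, its field names, and the function name are hypothetical, and treating the 10° to 30° diameter range as inclusive is an assumption.

```python
import math
from dataclasses import dataclass


@dataclass
class RFFit:
    """Hypothetical container for the RF mapping result of one unit."""
    z_score: float        # z-score of the response at the RF peak
    p_value: float        # permutation-test p-value (RF vs. gray area)
    diameter_deg: float   # RF diameter in degrees of visual angle
    center_deg: tuple     # RF center (azimuth, elevation) in degrees


def passes_rf_criteria(rf, stim_center_deg):
    """Return True if the unit meets the RF inclusion criteria of the text."""
    # Euclidean distance between RF center and stimulus center
    distance = math.hypot(rf.center_deg[0] - stim_center_deg[0],
                          rf.center_deg[1] - stim_center_deg[1])
    return (rf.z_score > 4
            and rf.p_value < 0.03
            and 10.0 <= rf.diameter_deg <= 30.0   # inclusive range (assumption)
            and distance < 10.0)
```

For example, a unit with a 15° RF centered 5° from the stimulus center passes, whereas the same unit centered about 11° away is excluded.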

Sinusoidal gratings

We presented drifting (2 cycles/s) and static sinusoidal gratings, with a spatial frequency of 0.04 cycles per degree, with randomized orientations (0°, 45°, 90°, 135°, 180°, 225°) and sizes (5°, 10°, 15°, 25°, 45°, 55°, 70°, and 90°), equally balanced between gratings and gray patches over gratings. All stimuli were displayed at full contrast on a gray background. The patch had the same gray value as the screen during the inter-stimulus interval. For the patch condition, we displayed gratings covering half the size of the x-axis of the screen. We presented 10-20 repetitions of each condition (2 motion conditions, 6 orientations, 8 sizes, and 2 patch conditions, with or without a patch, for a total of 192 conditions per session). The luminance of all stimuli was measured with a Flame UV-VIS Miniature Spectrometer sensor placed at the center of the visual stimulus patch. Luminance was constant across all stimulus conditions (100 cd/m²).
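As a check on the condition count, the factorial design described above can be enumerated directly. The sketch below is illustrative only: the variable names, the fixed number of repetitions, and the random seed are assumptions, and the original stimuli were generated with Psychtoolbox-3 rather than Python.

```python
import random
from itertools import product

# Factors of the grating protocol as listed in the text
motions = ["drifting", "static"]
orientations_deg = [0, 45, 90, 135, 180, 225]
sizes_deg = [5, 10, 15, 25, 45, 55, 70, 90]
with_patch = [False, True]

# Full factorial design: 2 x 6 x 8 x 2 conditions
conditions = list(product(motions, orientations_deg, sizes_deg, with_patch))

# Hypothetical randomized trial list with a fixed number of repetitions
n_repetitions = 10                 # the text reports 10-20 repetitions
rng = random.Random(0)             # seed chosen only for reproducibility
trials = conditions * n_repetitions
rng.shuffle(trials)

print(len(conditions), len(trials))
```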

Orthogonal gratings with elongated patch

We presented gratings with orthogonal orientations in each half of the screen (Figure 3a-b). The drifting or static gratings (0.04 cycles per degree) were presented in randomized orientations (0°, 45°, 90°, 135°, 180°). We randomized conditions between full-field gratings without a patch and gratings with a rectangular gray patch of different sizes (5°, 10°, 15°, 25°, 45°, 55°, 70°, 90°). For continuous gratings, the orientations of the gratings on the two sides of the screen allowed for the completion of a pattern (one half of the screen with 45° gratings and the other half with 135° gratings). By contrast, for the non-continuous condition, the orientations of the gratings in the two halves of the screen did not allow pattern completion, as one side was horizontal and the other side was vertical (0° vs. 90°). We presented 10-15 repetitions of each condition (8 sizes, 5 orientations and 2 stimulus conditions, gratings only or gratings with patch, for a total of 80 conditions per session).

Pink Noise

We showed a pink-noise background with a gray patch in randomized diameter sizes (0°, 5°, 15°, 25°, 35°, 45°, 55°, 70°, 80°, and 90°). We used two (high/low) contrast values of the pink noise, randomized, and each size of the patch was presented in 10-20 repetitions per session.

Black and white stimuli with patch

We showed 2 sets of stimuli: (1) White patches (centered on the RF) with a gray surround (WcGs) or a gray patch with a white surround (GcWs). (2) A black patch with a gray surround (BcGs) or a gray patch with a black surround (GcBs). The diameter size of the center patch was randomized (5°, 15°, 25°, 35°, 45°, 55°, 70°, 80°, and 90°), as well as the color (black or white) of the patch or the background. We presented around 10-15 repetitions per condition (9 sizes, and 4 conditions of the patch, either gray patch with white/black background or black/white patch with gray surround).

Assignment of cortical layers in V1

The assignment of superficial, L4, and deep cortical layers was based on the current source density (CSD) of the average LFP signal during whole screen flash stimulation. The protocol consisted of a 100 ms long white screen period with a 2 s gray screen for the inter-stimulus period. To increase the spatial sampling rate, we interpolated the LFP traces with an interpolation factor of 4. CSD analysis was computed by taking the second discrete spatial derivative across the different electrode recording sites. The step size of the discrete spatial derivative was 200 µm. Single units were assigned to a cortical layer based on the location of the channel with the highest amplitude during a spike.
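The CSD computation described above can be sketched as a second spatial difference of the LFP depth profile. This is a minimal sketch under stated assumptions: the function name and arguments are hypothetical, the site spacing is assumed uniform, tissue conductivity is omitted (so values are in arbitrary units), and the negative sign follows one common convention that may differ from the one used in the study.

```python
def csd_profile(lfp, spacing_um, step_um=200.0):
    """Second discrete spatial derivative of an LFP depth profile.

    lfp        : LFP values at one time point, ordered by depth.
    spacing_um : distance between neighbouring (interpolated) sites.
    step_um    : step of the discrete derivative (200 um in the text).

    Returns -(phi[z + h] - 2 * phi[z] + phi[z - h]) / h**2 for every site
    that has neighbours at both +h and -h.
    """
    n = round(step_um / spacing_um)      # derivative step in units of sites
    if n < 1:
        raise ValueError("step_um must be at least one site spacing")
    h = n * spacing_um
    return [
        -(lfp[i + n] - 2.0 * lfp[i] + lfp[i - n]) / h ** 2
        for i in range(n, len(lfp) - n)
    ]
```

A linear depth profile yields zero CSD everywhere, while a purely quadratic profile yields a constant value, which is a quick sanity check for the step size and sign.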

Inclusion criteria

We applied the following inclusion criteria to the spike-sorted units: 1) The ZETA-test (Montijn et al., 2021) was applied to the period around the onset of the classical gratings (0 to 250 ms) to test which neurons showed significantly modulated spiking activity (p-value < 0.05 and zeta responsiveness > 2). 2) V1 units: assignment of the layer with CSD analysis. 3) Units had to meet the selection criteria of a good RF, and the Euclidean distance from the center of the RF to the center of the stimulus had to be < 10° of visual angle. 4) Modulation of response to each protocol (gratings, black/white, pink noise, and rectangular patch). We included units that were positively modulated for the classical condition of each protocol. The modulation response was calculated as the average firing rate during the stimulus presentation (30 to 250 ms) minus the average firing rate during baseline (−250 to −30 ms), divided by the average firing rate during baseline, i.e. $(\mathrm{FR}_{\mathrm{stim}} - \mathrm{FR}_{\mathrm{base}})/\mathrm{FR}_{\mathrm{base}}$.
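The modulation response in criterion 4 can be illustrated with a short function. This is a sketch under stated assumptions: spike times are given in seconds relative to stimulus onset, one list per trial; the function names are hypothetical; and returning NaN for a silent baseline is an assumption, since the text does not specify how this case was handled.

```python
import math


def mean_rate(trials, window):
    """Mean firing rate (spikes/s) across trials within a time window."""
    t0, t1 = window
    counts = [sum(1 for t in spikes if t0 <= t < t1) for spikes in trials]
    return sum(counts) / (len(counts) * (t1 - t0))


def modulation_index(trials, stim_win=(0.03, 0.25), base_win=(-0.25, -0.03)):
    """(FR_stim - FR_base) / FR_base, with windows in seconds from onset."""
    fr_stim = mean_rate(trials, stim_win)
    fr_base = mean_rate(trials, base_win)
    if fr_base == 0:
        return math.nan   # undefined for a silent baseline (assumption)
    return (fr_stim - fr_base) / fr_base
```

A unit would count as positively modulated when the returned value is greater than zero, for example when it fires twice as many spikes in the stimulus window as in the equally long baseline window.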

Statistical Analysis

We obtained the average firing rate from 0.04 s to 0.15 s for the early period and from 0.2 s to 1 s for the late period in the size-tuning plots. For all analyses, we normalized the responses per unit to the baseline, calculated the logarithm of the normalized responses, and presented the mean and standard error of the mean (SEM). For the spike density functions of the different conditions, we used a Gaussian smoothing kernel with a time window from −0.05 to 0.05 s. The spike density functions of every unit were also normalized to the baseline, and we obtained the logarithmic values and presented them as mean responses and SEM. We defined the rise time (latency of responses) as the time at which the response of every unit (baseline subtracted) crossed a threshold (μbase + σ2), searching up to 0.5 s. We plotted the probability density function (PDF) of the rise times for patch or grating diameters ≥ 45°. We included values > 0.02 s and obtained a kernel density estimate of the PDF. Finally, we calculated Pearson’s correlation coefficient to correlate the rise time values of different conditions of visual stimuli. For all statistical comparisons, we used the Wilcoxon signed-rank test.
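The baseline normalization, log transform, and mean and SEM summary described above can be sketched as follows. The function names are hypothetical, the base-10 logarithm is an assumption (the text does not state the base), and units with a zero firing rate would need separate handling that is not shown here.

```python
import math
import statistics


def log_normalized_response(fr_stim, fr_base):
    """Logarithm of the baseline-normalized firing rate of one unit."""
    # log10 is an assumption; rates must be strictly positive
    return math.log10(fr_stim / fr_base)


def mean_and_sem(values):
    """Population mean and standard error of the mean."""
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Hypothetical (stimulus, baseline) firing rates in spikes/s for three units
rates = [(20.0, 2.0), (8.0, 4.0), (3.0, 3.0)]
log_responses = [log_normalized_response(s, b) for s, b in rates]
print(mean_and_sem(log_responses))
```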

Population Analysis

In total, the dataset yielded population spiking patterns that consisted of N = 344 neurons, which were pooled across multiple sessions as in previous studies (Kheradpezhouh et al., 2020; Deitch et al., 2021; Sotomayor-Gómez et al., 2023).

Additionally, for the occluded stimulus, we included patch sizes of 70° and larger.

We calculated firing rate vectors for each analysis period by dividing the spike count per neuron by the window length T. From each dissimilarity matrix, we computed a 2D representation of epochs using the t-Distributed Stochastic Neighbor Embedding (t-SNE) manifold algorithm. The perplexity depended on the stimulus condition: we used a perplexity of 20 for gratings with circular and rectangular occluders, and 100 for black and white stimuli. We compared the distances statistically with a two-sided Wilcoxon test, as for the previous analyses.
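The construction of firing rate vectors and of a dissimilarity matrix between epochs can be sketched as below. The function names are hypothetical, and the Euclidean distance is an assumption made only for illustration, since the dissimilarity measure is not specified in this section.

```python
import math


def rate_vector(spikes_per_neuron, t_start, t_stop):
    """Firing rate per neuron: spike count in the window divided by T."""
    window_length = t_stop - t_start
    return [
        sum(1 for t in spikes if t_start <= t < t_stop) / window_length
        for spikes in spikes_per_neuron
    ]


def dissimilarity_matrix(vectors):
    """Pairwise Euclidean distances between epochs (assumed metric)."""
    return [[math.dist(a, b) for b in vectors] for a in vectors]


# Two hypothetical epochs, each with spike times (s) of two neurons
epoch_a = rate_vector([[0.1, 0.5], [0.3]], 0.0, 1.0)
epoch_b = rate_vector([[0.2], [0.4, 0.6, 0.9]], 0.0, 1.0)
print(dissimilarity_matrix([epoch_a, epoch_b]))
```

A matrix of this form could then be passed to a t-SNE implementation that accepts precomputed distances, such as `sklearn.manifold.TSNE` with `metric="precomputed"` and `init="random"`.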

We trained a C-Support Vector Classifier based on dissimilarity matrices. We used 40% of trials for training and 60% for testing. We shuffled the assignment of trials to training and testing 20 times. The reported values thus correspond to the average accuracy across 20 iterations.
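The repeated 40/60 split with averaging can be sketched independently of the classifier. In this illustration the function name, the seed, and the `score_fn` callback are hypothetical: `score_fn` stands in for fitting the classifier on the training trials and returning its accuracy on the test trials, which is not implemented here.

```python
import random


def repeated_holdout(n_trials, score_fn, train_frac=0.4, n_iter=20, seed=0):
    """Average a score over repeated random train/test splits.

    score_fn(train_idx, test_idx) is expected to train on the first index
    list and return an accuracy measured on the second index list.
    """
    rng = random.Random(seed)
    n_train = round(train_frac * n_trials)
    scores = []
    for _ in range(n_iter):
        idx = list(range(n_trials))
        rng.shuffle(idx)                         # new random split each time
        scores.append(score_fn(idx[:n_train], idx[n_train:]))
    return sum(scores) / len(scores)
```

Because each iteration draws a fresh permutation of the trials, the training and test sets are disjoint within an iteration but overlap across iterations.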

Statistical significance

We compared the populations of distances, shown in Figure 6, using a two-sided Wilcoxon test. We computed the p-value for the statistical comparison for gratings, gratings with the rectangular patch, and black and white stimuli. We used a p-value < 0.01 as the threshold for statistical significance.

Acknowledgements

Conceptualization: NC, MV. Experiments: NC, AT, AB. Data analysis: NC, BSG. Supervision: MV. Writing of main draft: NC, BSG, and MV, with comments from other authors.

Firing rate across V1 layers to gratings in the far surround. a) Stimuli presented as in Figure 1: drifting and stationary gratings, and gratings covered with a gray patch. b) CSD analysis of the first 200 ms in response to 45° gratings (left) and gratings covered by a 45° gray patch (right). c) Population size tuning per layer. Firing rate during the early and late period of stimulation; every unit is normalized to baseline (superficial units n = 22, L4 units n = 213 and deep layer units n = 208, Wilcoxon signed-rank test p < 0.01). d) Probability density function of the rise time per unit, separated into layers, for drifting and stationary gratings. From top to bottom: superficial units, layer 4 units, and deep units (*p < 0.01, Wilcoxon signed-rank test). e) Scatter plot of the rise time for Gr or GcGr separated by layers (sizes ≥ 45°; r: Pearson correlation coefficient).

Firing rates to 90° gratings vs different sizes of the patches covering gratings. a) Scatter plots comparing 90° Gr (which we define as full-field) to different sizes of the gray patch for the drifting condition (r: Pearson correlation coefficient). b) Same as in a) for stationary conditions. c) Full-field gratings (90°) compared to different sizes of the gray patch covering gratings during the early stimulus period from 0.04 s to 0.15 s (mean and SEM; *p < 0.01; for sizes 5° and 10°, n = 117 neurons; for sizes > 10°, n = 335 neurons; 6 animals). d) Same as c) but for the late stimulus period from 0.2 s to 1 s.

Comparison of firing rate during the late period for larger sizes for all protocols. a-d) Stimuli on the top, plots represent the average of the population firing rate during the late period from 0.2 s to 1 s, each unit is normalized to baseline, for sizes ≥ 45°. Each protocol includes different units and is normalized to the baseline for each block (*p-values < 0.01, Wilcoxon signed-rank test, each group is compared within the same protocol).