Statistical learning attenuates visual activity only for attended stimuli

Abstract

Perception and behavior can be guided by predictions, which are often based on learned statistical regularities. Neural responses to expected stimuli are frequently found to be attenuated after statistical learning. However, whether this sensory attenuation following statistical learning occurs automatically or depends on attention remains unknown. In the present fMRI study, we exposed human volunteers to sequentially presented object stimuli, in which the first object predicted the identity of the second object. We observed a reliable attenuation of neural activity for expected compared to unexpected stimuli in the ventral visual stream. Crucially, this sensory attenuation was only apparent when stimuli were attended, and vanished when attention was directed away from the predictable objects. These results put important constraints on neurocomputational theories that cast perception as a process of probabilistic integration of prior knowledge and sensory information.

https://doi.org/10.7554/eLife.47869.001

Introduction

Previous experience constitutes a valuable source of information to guide perception and behavior. Extracting statistical regularities from past input in the environment to form expectations about the future has been shown to improve behavior in myriad ways (Bertels et al., 2012; Hunt and Aslin, 2001; Kim et al., 2009). Indeed, the acquisition of statistical regularities is thought to occur automatically (Turk-Browne et al., 2009) and affects behavior even in the absence of an intention to learn, or an awareness of, the regularities (Fiser and Aslin, 2002; Brady and Oliva, 2008). Given the significant behavioral and perceptual relevance of expectations, it is perhaps not surprising that the brain shows a remarkable sensitivity to statistical regularities. Many studies documented attenuated neural responses for expected compared to unexpected object stimuli in ventral visual regions subserving object recognition, both in terms of single unit spiking activity in monkeys (Meyer and Olson, 2011; Kaposvari et al., 2018) and in terms of non-invasively measured BOLD activity in humans (den Ouden et al., 2010; Egner et al., 2010; Richter et al., 2018; for a review see de Lange et al., 2018). This reduced response to expected stimuli has frequently been interpreted, within a predictive processing framework (Friston, 2005; Rao, 2005; Rao and Ballard, 1999), as signifying a reduction of prediction errors elicited by the stimulus when sensory input matches prior expectations. However, it remains largely unknown whether this sensory attenuation process to predicted visual stimuli is automatic, as its relation to statistical learning may suggest, or only apparent when the predictable stimuli are attended.

Indeed, research on visual statistical learning in monkeys has typically not manipulated attention, but only required monkeys to passively fixate in order to obtain reward (Meyer and Olson, 2011; Kaposvari et al., 2018), thereby precluding conclusions pertaining to the dependence of these predictive processes on attention. Many studies in humans, providing evidence for suppressed responses to expected stimuli, did require participants to attend the predictable stimuli (e.g., den Ouden et al., 2010; Egner et al., 2010; Kok et al., 2012a; Richter et al., 2018). On the other hand, den Ouden et al. (2009) demonstrated attenuated responses to task-irrelevant expected stimuli, suggesting the possibility that the sensory consequences of statistical learning may not depend on attention. Similarly, Kok et al. (2012a) showed that the sensory attenuation for grating stimuli with an expected orientation was independent of whether the orientation feature was attended or not. Importantly however, in both these studies the expected or unexpected stimulus was the only stimulus presented on the screen, so even though the stimuli were not relevant, attention was not effectively disengaged by other stimuli. Without competition, it is likely that even a task-irrelevant stimulus will receive some attention.

Thus, at present it remains unclear whether statistical learning automatically results in altered neural responses to expected compared to unexpected visual stimuli, or whether this process hinges on the stimuli being attended. In order to answer this question, we exposed participants to sequentially presented pairs of object images. The first image predicted the identity of the second image, thereby making an image expected depending on temporal context. We recorded responses to expected and unexpected object images using whole-brain fMRI while participants performed one of two tasks. Either participants categorized the predictable, second object image as (non-)electronic (rendering the object images attended), or they classified a concurrently shown character (letter or symbol), presented within the fixation dot, as (non-)letter (rendering the object images unattended).

In brief, our results demonstrate strong sensory attenuation for expected object images within the ventral visual stream. Crucially however, expectation suppression was only evident when objects were attended and vanished when participants attended the concurrently presented alphanumeric characters at fixation. This suggests that sensory attenuation induced by statistical learning is not the result of an automatic integration of prior knowledge with incoming information, but hinges on attention, thus constraining neurocomputational theories of perceptual inference.

Results

We exposed participants to statistical regularities by presenting object image pairs in which the leading image predicted the identity of the trailing image. During a learning session, participants performed a detection task of unpredictable upside-down images. On the next day, in the MRI scanner, participants were shown the same object image pairs, however unexpected trailing images were also presented; that is, images which were predicted by a different leading image. Crucially, participants either classified the trailing object as (non-)electronic, thus actively attending the predictable object, or classified a concurrently presented, but unpredictable, trailing character as (non-)letter, thus not attending the predictable object.

Attention is a prerequisite for perceptual expectations

First, we investigated whether the sensory attenuation for expected object stimuli was equally present whether or not participants attended the objects, focusing on our a priori defined ROIs (see Figure 1A): primary visual cortex (V1), object-selective lateral occipital complex (LOC), and temporal occipital fusiform cortex (TOFC). In all three regions, expectation suppression was robustly present when participants attended the objects (V1: t₍₃₃₎ = 3.573, p=0.001, d_z = 0.613; LOC: t₍₃₃₎ = 3.860, p=5.0e-4, d_z = 0.662; TOFC: t₍₃₃₎ = 5.133, p=1.2e-5, d_z = 0.880), but absent when participants attended the characters at fixation; that is, when the predictable objects were unattended (V1: t₍₃₃₎ = −0.216, p=0.830, d_z = −0.037; LOC: t₍₃₃₎ = −0.831, p=0.412, d_z = −0.143; TOFC: t₍₃₃₎ = 0.072, p=0.943, d_z = 0.012). Indeed, Bayesian analyses showed moderate support for the null hypothesis (BF₁₀ < 1/3) of no expectation suppression in all three regions during the character categorization task (V1: BF₁₀ = 0.188; LOC: BF₁₀ = 0.253; TOFC: BF₁₀ = 0.184). The robustness of this distinct pattern of expectation suppression for the two conditions was statistically confirmed by an interaction analysis (expectation by attention interaction, V1: F_(1,33) = 7.706, p=0.009, η²=0.189; LOC: F_(1,33) = 12.580, p=0.001, η²=0.276; TOFC: F_(1,33) = 16.955, p=2.4e-4, η²=0.339).
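
As a reading aid, the effect sizes reported above can be recomputed from the test statistics. The sketch below assumes that d_z is the t-value divided by the square root of the sample size (n = df + 1 = 34) and that the reported η² behaves like a partial eta squared, F·df_effect / (F·df_effect + df_error). Both formulas are standard, but their use here is an assumption, and the function names are hypothetical.

```python
import math


def cohens_dz(t_value: float, df: int) -> float:
    """Cohen's d_z for a paired or one-sample t-test: t / sqrt(n), with n = df + 1."""
    return t_value / math.sqrt(df + 1)


def partial_eta_squared(f_value: float, df_effect: float, df_error: float) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# t-values (objects attended) and interaction F-values as reported in the text.
reported_t = {"V1": 3.573, "LOC": 3.860, "TOFC": 5.133}
reported_f = {"V1": 7.706, "LOC": 12.580, "TOFC": 16.955}

for roi in reported_t:
    dz = cohens_dz(reported_t[roi], df=33)
    eta2 = partial_eta_squared(reported_f[roi], df_effect=1, df_error=33)
    print(f"{roi}: d_z = {dz:.3f}, eta^2 = {eta2:.3f}")
```

Under these assumptions the recomputed values agree with the reported d_z and η² to three decimals, which suggests that the statistics are internally consistent.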

Figure 1

Expectation suppression within the ventral visual stream depends on attention.

(A) Displayed are parameter estimates ± within-subject SE for responses to expected (blue) and unexpected (green) object stimuli during the objects attended task (attended) and objects unattended task (unattended). In all three ROIs, V1 (left), LOC (middle), and TOFC (right), BOLD responses were significantly suppressed in response to expected stimuli during the objects attended task. No difference was found between BOLD responses to expected and unexpected stimuli during the objects unattended task. The interaction effect between expectation and attention condition was significant in all three ROIs. (B) Expectation suppression in primary visual cortex is stimulus unspecific, and specific only in higher visual areas. Displayed is the average expectation suppression effect (BOLD responses, unexpected minus expected) split into stimulus-driven (light gray) and non-stimulus-driven (dark gray) gray matter voxels. Data are shown for the three ROIs, V1 (left bars), LOC (middle bars), and TOFC (right bars). Expectation suppression in LOC and TOFC was significantly larger for stimulus-driven than non-stimulus-driven voxels, while no such difference was evident in V1, indicating that expectation suppression in V1 was stimulus unspecific. Error bars indicate within-subject SE. Note that the ROI masks in panels A and B differ; for details see *ROI definition* and *Stimulus specificity analysis* in the Materials and methods section. *p<0.05. **p<0.01. ***p<0.001.

https://doi.org/10.7554/eLife.47869.002

Figure 1—source data 1. Expectation suppression within the ventral visual stream depends on attention. The source data file contains a separate JASP file per ROI, containing BOLD data for expected and unexpected stimuli for both attention conditions (objects attended and unattended tasks; Figure 1A). Also included is a JASP file showing expectation suppression per ROI, split into stimulus-driven and non-stimulus-driven voxels (Figure 1B). https://doi.org/10.7554/eLife.47869.003

Thus, in V1, LOC, and TOFC, there was a significant suppression of BOLD responses for expected compared to unexpected object stimuli exclusively during the object categorization task. No such modulation of BOLD responses by expectation was observed in the objects unattended condition in any of the three a priori ROIs, and in fact, there was moderate evidence for the absence of such a modulation when objects were unattended. We repeated all ROI analyses within the same ROIs but with different ROI sizes in order to ensure that our results were not dependent on the a priori but arbitrarily defined ROI mask size. Results were highly similar (i.e., the same effects showing statistically significant results) to those mentioned above within all three ROIs (V1, LOC, TOFC) for all tested ROI sizes, ranging from 100 to 400 voxels (800 mm³ - 3200 mm³) in steps of 100 voxels. Thus, our results do not depend on the exact ROI size but represent responses within the respective areas well.
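
The correspondence between mask size in voxels and in mm³ follows from the voxel volume. The short sketch below assumes isotropic 2 mm voxels, which is what the stated 100 voxels = 800 mm³ implies (8 mm³ per voxel). The function name is hypothetical.

```python
def roi_volume_mm3(n_voxels: int, voxel_edge_mm: float = 2.0) -> float:
    """Volume of an ROI mask in mm^3, assuming isotropic voxels with the given edge length."""
    return n_voxels * voxel_edge_mm ** 3


# ROI sizes tested in the control analysis: 100 to 400 voxels in steps of 100.
for n_voxels in range(100, 401, 100):
    print(f"{n_voxels} voxels -> {roi_volume_mm3(n_voxels):.0f} mm^3")
```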

We also examined how expectation modulated neural activity outside our predefined ROIs by performing a whole-brain analysis. Results of this whole brain analysis are illustrated in Figure 2A. The upper row in Figure 2A shows extensive clusters of expectation suppression throughout the ventral visual stream when objects were attended, but no difference when the objects were unattended (middle row), leading to a significant interaction (bottom row). These results complement our ROI-based analysis by showing that the observed expectation suppression effect is not unique to the a priori defined ROIs but evident throughout the ventral visual stream.

Figure 2

Expectation suppression across cortex for attended object stimuli only.

(A) Widespread expectation suppression across cortex in the objects attended condition. Displayed are parameter estimates for unexpected minus expected image pairs overlaid onto the MNI152 2 mm template. Color indicates unthresholded parameter estimates: red-yellow clusters represent expectation suppression. Opacity represents the z statistics of the contrasts. Black contours outline statistically significant clusters (GRF cluster corrected). Significant clusters included major parts of the ventral visual stream (early visual cortex, LOC, TOFC), anterior insula, and inferior frontal gyrus during the objects attended condition (upper row). No significant clusters were evident in the objects unattended condition (middle row). The interaction (attended > unattended; bottom row) showed significant clusters similar to those of the attended condition, albeit less extensive. (B) Expectation suppression across the ventral visual stream for attended objects, but with task-irrelevant predictions. Displayed are z statistics of the contrast unexpected minus expected of the conjunction: *attended task-relevant predictions* ∩ *task-irrelevant predictions*; data of task-irrelevant predictions from Richter et al. (2018). Only the ventral visual stream clusters showed significant expectation suppression in this conjunction, while all non-sensory area clusters were no longer significant. Thus, only the ventral visual stream clusters displayed a sensitivity to conditional probabilities, irrespective of whether predictions were task-relevant or task-irrelevant, as long as the predictable stimuli were attended.

https://doi.org/10.7554/eLife.47869.004

Figure 2—source data 1. Expectation suppression across cortex for attended object stimuli only. The source data file contains NIfTI images for the whole-brain contrast unexpected > expected (expectation suppression). Separate files are included for each attention condition, as well as their interaction (attended > unattended), in the form of unthresholded parameter estimate maps, z-maps, and thresholded z-maps (Figure 2A). The thresholded z-map of the conjunction analysis (Figure 2B) is also included. https://doi.org/10.7554/eLife.47869.005

Outside the ventral visual stream, additional clusters of expectation suppression are evident in anterior insula and the frontal operculum, the precentral and inferior frontal gyrus, superior frontal gyrus and supplementary motor cortex, superior parietal lobule, as well as parts of the cerebellum. All significant clusters are summarized in a table in Supplementary file 1. Again, all these non-sensory clusters showed reduced activity for expected objects only when the object stimuli were attended and categorized. There was no significant modulation of activity by expectation anywhere in the whole brain analysis when the objects were unattended.

Expectation suppression requires attention to the stimuli, but not their predictable relationship

During the object categorization task, the ability to form expectations about the trailing object stimulus was helpful for the participants, and indeed expected object stimuli were categorized more quickly and accurately (see Figure 5A and Expectations facilitate object classification). This raises the question of whether the expectation suppression effect that we observed throughout multiple brain areas during the object categorization task reflects differences in task engagement. Participants had an incentive to (implicitly or explicitly) use their knowledge of the predictable relationship between the leading and trailing image to prepare their object categorization response. In order to examine which brain regions exhibited expectation suppression irrespective of the relevance of the predictable relationship between stimuli, we performed a conjunction analysis that highlighted regions that showed significant expectation suppression both in the current study (during the object categorization task) and in a similar study that we published previously (Richter et al., 2018). During this latter study, participants also attended the object stimuli, but were asked to press a button whenever an object appeared that was flipped upside-down. Upside-down images occurred rarely, and importantly, were not related to the (implicitly learned) statistical regularities. Figure 2B shows the whole-brain results of this conjunction analysis. Significant, bilateral clusters of expectation suppression were evident throughout most of the ventral visual stream. However, none of the non-sensory clusters showed significant expectation suppression during both experiments. Thus, only in the ventral visual stream did we find strong and robust evidence for expectation suppression, regardless of whether the predictable relationship was task-relevant or task-irrelevant, as long as the predictable object pairs were attended.

Stimulus specificity of the neural modulation by expectation

Next, we investigated the stimulus specificity of expectation suppression. Stimulus specificity concerns the question whether only stimulus-driven voxels or also voxels that were not (strongly) driven by the object stimuli displayed expectation suppression. The rationale was that an unspecific suppression effect (i.e., expectation suppression that is also evident in not stimulus-driven voxels) may result from global non-sensory effects, such as changes in general arousal or global surprise signals. On the other hand, stimulus-specific suppression effects, being limited to stimulus-driven voxels, are rather suggestive of a more specific suppression mechanism that selectively operates on the neural populations that represent the expected stimulus; for example, the dampening of stimulus-specific prediction errors as a result of a match between prediction and input.

All three ROIs were split into two populations of gray matter voxels, according to their stimulus responsiveness (stimulus-driven: responding to the object images; not stimulus-driven: not significantly responding to the object images), using independent data from the localizer run. There were strong differences between the ROIs in terms of the stimulus specificity of expectation suppression (Figure 1B; ROI x drive interaction: F_(1.245,41.080) = 7.651, p=0.005, η²=0.188). Whereas there was clear evidence for a larger expectation suppression effect in stimulus-driven than not stimulus-driven voxels in higher visual areas (LOC: t₍₃₃₎ = 3.991, p=3.4e-4, d_z = 0.684; TOFC: t₍₃₃₎ = 4.654, p=5.1e-5, d_z = 0.798), suppression was not significantly different between stimulus-driven and not stimulus-driven voxels in V1 (t₍₃₃₎ = −1.057, p=0.298, d_z = −0.181). Indeed, a Bayesian analysis indicated moderate support for the absence of a difference between stimulus-driven and not stimulus-driven voxels in V1 (BF₁₀ = 0.307). Of note, all sub-populations in all three ROIs showed significant expectation suppression (all p<0.05), suggesting that there is a general suppression of activity for expected stimuli in visual cortex, irrespective of whether the visual cortical area is driven by the stimuli. However, in later visual cortical areas (LOC and TOFC) there was significantly more expectation suppression in neuronal subpopulations that were driven by the stimulus, implying a more selective suppression mechanism in these areas.

Surprising stimuli elicit a larger pupil dilation

In view of the suggestion that a global, stimulus unspecific response modulation may partially account for expectation suppression, we performed an exploratory analysis to examine whether surprising stimuli were associated with a stronger pupil dilation in our task. Pupil responses have been linked with changes in arousal (Reimer et al., 2014; Vinck et al., 2015), which in turn may account for the stimulus unspecific suppression component. Moreover, pupil dilation scales with surprise (Damsma and van Rijn, 2017; Kloosterman et al., 2015; Preuschoff et al., 2011). Thus, this account would predict enhanced pupil dilation to unexpected compared to expected stimuli when objects were attended.

There was indeed a larger pupil diameter for unexpected compared to expected trailing images during the objects attended task (Figure 3, left). This difference emerged gradually starting ~600 ms after the onset of the trailing object image, and was significant between 1.5–2.8 s, as assessed with a cluster permutation test (p_cluster = 0.017). When objects were unattended, no significant difference in pupil diameter was found between the expectation conditions, and in fact, no timepoint surpassed the cluster formation threshold (i.e., all timepoints p>0.05 uncorrected; Figure 3, right). However, the expectation induced difference in pupil diameter was not reliably different between attended and unattended stimuli (p_cluster = 0.393). Thus, the data showed that the pupil was significantly more dilated for unexpected than expected objects when the images were attended, mirroring the results of the neural data – albeit, without a reliable difference between attended and unattended stimuli. This tentatively suggests that the enhanced BOLD responses to unexpected stimuli might be partially accounted for by a global mechanism, such as increased arousal in response to surprising stimuli.
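
To make the logic of a cluster permutation test on pupil traces concrete, here is a minimal generic sketch. It is not the authors' implementation: it assumes a sign-flip permutation scheme on per-participant difference traces (unexpected minus expected), a cluster-mass statistic (summed t-values of adjacent supra-threshold timepoints), and a cluster-forming threshold passed in as a t-value. All function names and the toy data are hypothetical.

```python
import math
import random


def t_per_timepoint(diffs):
    """One-sample t-value at each timepoint for participants x timepoints difference traces."""
    n = len(diffs)
    t_values = []
    for samples in zip(*diffs):
        mean = sum(samples) / n
        var = sum((x - mean) ** 2 for x in samples) / (n - 1)
        t_values.append(mean / math.sqrt(var / n) if var > 0 else 0.0)
    return t_values


def max_cluster_mass(t_values, threshold):
    """Largest absolute summed t over runs of adjacent, same-sign, supra-threshold timepoints."""
    best = 0.0
    current = 0.0
    for t in t_values:
        if abs(t) > threshold:
            same_sign = current == 0.0 or (t > 0) == (current > 0)
            current = current + t if same_sign else t  # a sign change starts a new cluster
        else:
            current = 0.0
        best = max(best, abs(current))
    return best


def cluster_permutation_p(diffs, threshold, n_permutations=1000, seed=0):
    """Observed maximum cluster mass and its permutation p-value (random sign flips)."""
    rng = random.Random(seed)
    observed = max_cluster_mass(t_per_timepoint(diffs), threshold)
    exceed = 0
    for _ in range(n_permutations):
        flipped = []
        for trace in diffs:
            sign = rng.choice((-1.0, 1.0))  # flip the whole trace of one participant
            flipped.append([sign * x for x in trace])
        if max_cluster_mass(t_per_timepoint(flipped), threshold) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_permutations + 1)


# Toy data (hypothetical): 12 participants, 20 timepoints, effect only at timepoints 10-15.
toy_rng = random.Random(1)
toy_diffs = [
    [toy_rng.gauss(1.0 if 10 <= t <= 15 else 0.0, 0.5) for t in range(20)]
    for _ in range(12)
]
# 2.201 is approximately the two-sided 5% critical t-value for 11 degrees of freedom.
mass, p_cluster = cluster_permutation_p(toy_diffs, threshold=2.201)
print(f"largest cluster mass = {mass:.1f}, p_cluster = {p_cluster:.3f}")
```

Because the maximum cluster mass is compared across permutations, the resulting p-value controls for testing many timepoints at once. This is also why the text can report that no timepoint passed the cluster-forming threshold in the unattended condition: in that case there is no cluster to test.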

Figure 3 with 3 supplements

Larger pupil dilations in response to unexpected compared to expected stimuli during the objects attended task.

Displayed are pupil diameter traces over time, relative to trailing image onset. Pupil diameter data for expected (blue) and unexpected (green) image pairs are shown for the objects attended task (left) and objects unattended task (right). The black line on the abscissa denotes statistically significant differences in pupil dilations between expected and unexpected images (cluster permutation test, p<0.05). In the objects attended condition significantly larger pupil dilations in response to unexpected images are evident between 1.52 to 2.88 s after trailing image onset (left). No significant difference is found in the objects unattended condition (right), nor in the interaction between conditions. The first vertical dashed line indicates leading image onset, the second vertical line trailing image onset. Shaded areas denote within-subject SE. Timepoints from −1.0 to −0.5 s served as baseline period.

https://doi.org/10.7554/eLife.47869.006

Figure 3—source data 1. Larger pupil dilations in response to unexpected compared to expected stimuli during the objects attended task. The source data file contains the preprocessed pupil diameter traces (participants by timepoints) for each of the four experimental conditions separately (two attention by two expectation conditions). https://doi.org/10.7554/eLife.47869.010

Expectation suppression and pupil dilations to surprising stimuli are associated

We explored whether expectation suppression and pupil dilation differences between unexpected and expected objects were associated. In other words, we sought evidence of an association between the effect of expectations on pupil dilation and the expectation-induced neural response attenuation. For this analysis we rank correlated expectation suppression magnitudes with pupil dilation differences for each participant. Results, displayed in Figure 4A, suggest that, when objects were attended, expectation suppression in V1 was more pronounced for trailing images that also resulted in larger pupil dilation differences (t₍₃₁₎ = 2.464, p=0.019, d_z = 0.436). This association was not reliable in LOC (t₍₃₁₎ = 1.413, p=0.167, d_z = 0.250; BF₁₀ = 0.466) or TOFC (t₍₃₁₎ = 1.401, p=0.171, d_z = 0.248; BF₁₀ = 0.458). There was no correlation of pupil dilation differences and expectation suppression when stimuli were unattended in any of the ROIs (V1: t₍₃₁₎ = −0.159, p=0.875, d_z = −0.028; BF₁₀ = 0.191; LOC: t₍₃₁₎ = −0.125, p=0.901, d_z = −0.022; BF₁₀ = 0.190; TOFC: t₍₃₁₎ = 0.177, p=0.861, d_z = 0.031; BF₁₀ = 0.192). There was no significant overall difference in the correlation strength between attended and unattended stimuli (F_(1,31) = 1.892, p=0.179, η²=0.058), nor between ROIs (F_(1.558,48.293) = 0.134, p=0.823, η²=0.004), nor their interaction (F_(2,62) = 0.482, p=0.603, η²=0.015). Thus, when stimuli were attended there was evidence for an association of pupil dilation and expectation suppression in V1.
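
The two-level logic of this analysis (a rank correlation within each participant, then a group-level test on the Fisher z-transformed coefficients, as described for the Figure 4 source data) can be sketched as follows. This is an illustrative sketch only: the helper names, the clipping of ρ before the Fisher transform, and the toy numbers are assumptions, not the authors' pipeline.

```python
import math


def ranks(values):
    """Ranks starting at 1, with ties receiving their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation (undefined if either input is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def fisher_z(rho, limit=0.999999):
    """Fisher z-transform; rho is clipped (an assumption) so that |rho| = 1 stays finite."""
    return math.atanh(max(-limit, min(limit, rho)))


def one_sample_t(values):
    """t-value testing whether the mean of the values differs from zero."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return mean / (sd / math.sqrt(n))


# Toy data (hypothetical): per participant, one value per trailing image (six images).
suppression = [[0.2, 0.5, 0.1, 0.7, 0.4, 0.9], [0.3, 0.1, 0.6, 0.2, 0.8, 0.5], [0.4, 0.6, 0.2, 0.1, 0.9, 0.3]]
pupil_diff = [[0.1, 0.3, 0.2, 0.6, 0.4, 0.8], [0.2, 0.4, 0.5, 0.1, 0.7, 0.3], [0.5, 0.2, 0.1, 0.3, 0.6, 0.4]]

z_values = [fisher_z(spearman_rho(s, p)) for s, p in zip(suppression, pupil_diff)]
print("group-level t on Fisher z values:", round(one_sample_t(z_values), 3))
```

Testing the z-transformed coefficients against zero across participants asks whether, on average, images with larger pupil effects also show larger expectation suppression within a participant.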

Figure 4

Expectation suppression is associated with pupil dilation differences and behavioral benefits of expectations.

(A) Correlation of expectation suppression magnitude and pupil dilation differences due to expectation. When predictable objects are attended, trailing images that induce larger pupil dilation differences also show larger expectation suppression magnitudes in V1. No such association is evident when objects are unattended. (B) Correlation of expectation suppression magnitude and RT benefits due to expectation. When predictable objects are attended, larger RT benefits are associated with larger expectation suppression effects in V1 and TOFC. This association is absent when objects are unattended. Error bars indicate within-subject SEM. *p<0.05.

https://doi.org/10.7554/eLife.47869.011

Figure 4—source data 1. Neural effects of expectations are associated with pupil dilation differences and reaction time benefits. The source data file contains two JASP files of the analyses conducted on the correlation coefficients (Fisher z-transformed Rho), correlating expectation suppression (neural metric) with (A) pupil dilation differences due to expectations and (B) RT benefits due to expectations (behavioral measure). Correlation coefficients for data from the three ROIs (V1, LOC, TOFC) and both attention tasks are provided. https://doi.org/10.7554/eLife.47869.012

Expectations facilitate object classification

In order to assess whether, concurrent with the neural effects of expectations, behavioral benefits of expectations were evident, we analyzed behavioral responses during MRI scanning in terms of reaction times (RTs) and response accuracy. Overall, the objects attended (classify electronic items) and objects unattended task (classify characters at fixation) showed very similar response accuracies (attended: 94.3 ± 5.4% vs. unattended: 94.0 ± 6.6%, mean ± SD) and only minor differences in RTs (attended: 574 ± 150 ms vs. unattended: 602 ± 131 ms, mean ± SD). This supports the notion that both tasks were of approximately equal difficulty.

During the object categorization task, participants could benefit from the foreknowledge of the identity of the trailing object image, as they were asked to categorize the trailing image. Such a benefit would however not be expected during the character categorization task, as the participants could fully ignore the object stimuli during this task. This is precisely what we observed, both in terms of accuracy and RTs (Figure 5A). During the object categorization task, participants were more accurate (W = 457, p=3.2e-4, r_B = 0.536) and faster (W = 9, p=3.8e-9, r_B = −0.970) for expected compared to unexpected trailing object stimuli. Conversely, during the character categorization task, no such benefit was observed in terms of accuracy (t₍₃₃₎ = 1.600, p=0.119, d_z = 0.274; BF₁₀ = 0.582) or RT (W = 252, p=0.447, r_B = −0.153; BF₁₀ = 0.273). The robustness of this distinct pattern of behavioral advantage for expected stimuli for the two conditions was statistically confirmed by an interaction analysis (accuracy: F_(1,33) = 5.203, p=0.029, η²=0.136; RT: F_(1,33) = 37.543, p=6.6e-7, η²=0.532).
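
The rank-biserial correlations (r_B) reported alongside the Wilcoxon signed-rank statistics can be related to W with a one-line formula. The sketch below assumes that W is the sum of positive ranks and that all 34 paired differences entered the ranking (no zero differences dropped), so that the total rank sum is n(n+1)/2 = 595. The function name is hypothetical.

```python
def rank_biserial_from_w(w_plus: float, n_pairs: int) -> float:
    """Matched-pairs rank-biserial correlation: (W+ - W-) / (W+ + W-), with W+ + W- = n(n+1)/2."""
    total_rank_sum = n_pairs * (n_pairs + 1) / 2
    return 2 * w_plus / total_rank_sum - 1


# W statistics as reported in the text (n = 34 participants).
for label, w in {"accuracy, objects attended": 457, "RT, objects attended": 9, "RT, objects unattended": 252}.items():
    print(f"{label}: r_B = {rank_biserial_from_w(w, 34):.3f}")
```

Under these assumptions the recomputed values agree with the reported r_B to three decimals. A value near −1 for RTs in the objects attended task means that almost every participant responded faster to expected than to unexpected objects.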

Figure 5

Behavioral results demonstrate statistical learning.

(A) Behavioral benefits of expectations demonstrate statistical learning. Displayed are mean accuracy (left) and mean reaction time (right) ± within-subject SE. Responses to expected stimuli are significantly more accurate and faster, an effect exclusively observed during the objects attended condition. Thus, object identity expectations benefit behavioral performance during object classification and do not impact letter classification. (B) Pairs of both the objects attended image set and the objects unattended image set were classified significantly above chance, indicating a learning of the pairs for both conditions. Displayed are mean accuracy (left) and mean reaction time (right) during the post-scanning pair recognition task, ± within-subject SE. The dashed line indicates chance level. During the pair recognition task, no differences in either classification accuracy (left) or response speed (right) were observed between pairs previously belonging to the objects attended task compared to the objects unattended task. *p<0.05. ***p<0.001.

https://doi.org/10.7554/eLife.47869.013

Figure 5—source data 1. Behavioral results demonstrate statistical learning. The source data file contains separate JASP files containing the behavioral performance data and conducted analyses for the object and letter classification tasks (both in terms of RTs and response accuracy; Figure 5A), as well as data from the post-scanning object recognition task (Figure 5B). https://doi.org/10.7554/eLife.47869.014

Neural and behavioral effects of expectations are associated

In order to explore whether the observed expectation suppression is associated with the behavioral benefits due to expectations, we correlated the magnitude of expectation suppression and the expectation-induced RT benefits. Results, illustrated in Figure 4B, show that when the predictable objects were attended, behaviorally observed expectation RT benefits and neurally observed expectation suppression were associated in both V1 (t₍₃₃₎ = 2.442, p=0.020, d_z = 0.419) and TOFC (t₍₃₃₎ = 2.236, p=0.032, d_z = 0.384), but no reliable correlation was found in LOC (t₍₃₃₎ = 1.384, p=0.176, d_z = 0.237, BF₁₀ = 0.439). There was no association in any ROI when objects were unattended (V1: t₍₃₃₎ = −0.418, p=0.679, d_z = −0.072, BF₁₀ = 0.199; LOC: t₍₃₃₎ = −0.374, p=0.711, d_z = −0.064, BF₁₀ = 0.196; TOFC: t₍₃₃₎ = 0.179, p=0.859, d_z = 0.031, BF₁₀ = 0.186). On average, correlations were not reliably larger when objects were attended than when they were unattended (attention: F_(1,33) = 2.920, p=0.097, η²=0.081). The pattern of results was similar in all ROIs (F_(1.636,53.988) = 0.615, p=0.513, η²=0.018; interaction: F_(1.461,48.203) = 0.381, p=0.619, η²=0.011). Thus, there is some evidence that when the objects were attended, participants showed larger benefits (faster RTs) for expected trailing images for which they also showed larger magnitudes of expectation suppression in V1 and TOFC. These results suggest that the neural and behavioral effects of expectations are associated.

No differences in association strength between attended and unattended object pairs

An alternative explanation for the absence of sensory attenuation for expected object stimuli during the character categorization task is that statistical regularities for the objects that are presented during this condition have simply not been learned. This explanation may be unlikely, because the vast majority of exposure to the expected pairs takes places in the learning session, during which the same task (upside-down image detection) was used for all image pairs. However, it is nonetheless important to ensure that statistical regularities were learned for the image pair sets of the object and the character categorization task. To empirically address this, we tested whether participants had explicit knowledge of the statistical regularities for all object pairs. During this post-scanning pair recognition task, participants were asked to indicate which one of two trailing images was more likely given the leading image. Participants indicated the correct trailing image with above chance accuracy for both, the set of object pairs that was previously attended (Figure 5B; performance = 62.1 ± 1.8%, mean ± SE; t₍₃₃₎ = 6.803, p=4.6e-8, d_z = 1.167) and the set that was previously unattended (performance = 58.7 ± 2.2%; t₍₃₃₎ = 3.905, p=2.2e-4, d_z = 0.670). There was no statistically significant difference in accuracy on the pair recognition task between these sets of objects (W = 365, p=0.256, r_B = 0.227; BF₁₀ = 0.737). Reaction times were also similar for both sets of objects (objects previously attended: RT = 458.8 ± 25.4 ms; objects previously unattended: RT = 466.5 ± 25.9 ms; t₍₃₃₎ = −1.208, p=0.236, d_z = −0.207; BF₁₀ = 0.358). Thus, the image pairs belonging to both task conditions (objects attended and unattended tasks) were reliably learned, most likely during the extensive behavioral training session, and there was no evidence for a significant difference in the learning of associations for the two sets of object pairs. 
This strongly suggests that the differences in sensory attenuation between the two attention conditions are unlikely to be explained by differences in the strength of the association between the object pairs.
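As a quick consistency check on the effect sizes reported above, the following minimal sketch recomputes d_z from the t statistics. It assumes the standard relation d_z = t / √n for one-sample and paired t-tests, and a sample size of n = 34 inferred from the reported 33 degrees of freedom; neither is stated explicitly in this section. The function name `cohens_dz` is illustrative.

```python
import math


def cohens_dz(t_value: float, n: int) -> float:
    """Effect size for a one-sample or paired t-test: d_z = t / sqrt(n)."""
    return t_value / math.sqrt(n)


# Assumption: n = df + 1 = 34, inferred from the reported t(33).
N_PARTICIPANTS = 34

# (t statistic, reported d_z) taken from the pair recognition results above.
reported = {
    "accuracy, previously attended pairs": (6.803, 1.167),
    "accuracy, previously unattended pairs": (3.905, 0.670),
    "reaction time, attended vs. unattended": (-1.208, -0.207),
}

for label, (t_value, dz_reported) in reported.items():
    dz = cohens_dz(t_value, N_PARTICIPANTS)
    print(f"{label}: recomputed d_z = {dz:.3f}, reported d_z = {dz_reported}")
```

Under these assumptions the recomputed values agree with the reported ones to three decimal places.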

Visual processing continues in the absence of attention

Finally, one may wonder whether the lack of expectation suppression when objects were unattended is due to the fact that object stimuli simply did not elicit strong activity in the ventral visual stream, as they were not in the focus of attention. Although all three ROIs showed reliable above-baseline activity even when objects were unattended (Figure 1A), and activity in LOC and TOFC was of similar amplitude during both conditions, the overall activity level may partly represent stimulus-unrelated activity. Therefore, in an exploratory analysis, we assessed the strength of stimulus-specific activity in our three ROIs by means of a decoding analysis of the trailing images. In brief, a multi-class decoder was trained to differentiate between the six trailing images per attention condition. The classifier was trained on data obtained in an independent localizer run, during which participants performed a separate task (detection of dimming of the fixation dot). Performance of this decoder was tested on the mean parameter estimates per trailing image for each of the two attention conditions of the main MRI task data. Because each task comprised six trailing images, chance performance was 16.7%. One-sample t-tests or Wilcoxon signed-rank tests (as applicable) showed that in each of the three ROIs (V1, LOC, TOFC) and tasks (objects attended, objects unattended) object identity could be decoded above chance (V1 attended: 81.1%; W = 595, p=3.3e-7, r_B = 1; V1 unattended: 84.8%; W = 595, p=3.2e-7, r_B = 1; LOC attended: 37.3%; t₍₃₃₎ = 6.303, p=4.0e-7, d_z = 1.08; LOC unattended: 38.0%; W = 583, p=9.7e-7, r_B = 0.96; TOFC unattended: 25.0%; W = 476, p=0.002, r_B = 0.60), except in TOFC in the attended condition (TOFC attended: 19.6%; W = 383, p=0.143, r_B = 0.287; BF₁₀ = 0.388).

Moreover, decoding accuracy was not different between the objects attended and unattended conditions in any of the ROIs (V1: t₍₃₃₎ = −1.197, p=0.240, d_z = −0.205, BF₁₀ = 0.354; LOC: t₍₃₃₎ = −0.214, p=0.832, d_z = −0.037, BF₁₀ = 0.188; TOFC: t₍₃₃₎ = −1.726, p=0.094, d_z = −0.296, BF₁₀ = 0.697). This suggests that the object stimuli evoked a reliable stimulus-specific activity pattern in all three sensory regions, which was not significantly different in strength between the two tasks (object categorization and character categorization). Note that the participants’ task during the localizer run, which we used to train the classifier, was to detect a dimming of the fixation dot. As such, object stimuli were unattended during the localizer run, which may render the training data more similar in terms of attention allocation to the objects unattended task than to the objects attended task. This may explain why decoding accuracy is similar, or even higher, for unattended compared to attended objects. More importantly, overall visual processing of the object stimuli was clearly present even when the object stimuli were not attended, as the identity of the objects could be reliably decoded from neural activity patterns throughout the ventral visual stream when objects were unattended.
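The logic of the decoding analysis (train on independent data, test on one mean pattern per trailing image, compare accuracy to a chance level of one in six) can be illustrated with a minimal sketch. This section does not specify the classifier that was used, so a simple nearest-centroid classifier is substituted here purely for illustration. All data are synthetic, and every name and parameter (`N_VOXELS`, the noise levels, the number of training samples) is a hypothetical choice, not a value from the study.

```python
import random

N_CLASSES = 6            # six trailing images per attention condition
CHANCE = 1 / N_CLASSES   # chance accuracy of a six-way classifier
N_VOXELS = 20            # hypothetical number of voxels in an ROI


def train_centroids(patterns, labels):
    """Average the training patterns of each class into one centroid."""
    sums, counts = {}, {}
    for pattern, label in zip(patterns, labels):
        if label not in sums:
            sums[label] = [0.0] * len(pattern)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], pattern)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def predict(centroids, pattern):
    """Return the label of the centroid closest (squared Euclidean) to pattern."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda lab: sq_dist(centroids[lab], pattern))


def accuracy(centroids, patterns, labels):
    """Fraction of test patterns assigned to their true class."""
    hits = sum(predict(centroids, p) == lab for p, lab in zip(patterns, labels))
    return hits / len(labels)


# Synthetic data: one random prototype pattern per image, plus Gaussian noise.
rng = random.Random(0)
prototypes = [[rng.gauss(0, 1) for _ in range(N_VOXELS)] for _ in range(N_CLASSES)]


def noisy(prototype, sd):
    return [v + rng.gauss(0, sd) for v in prototype]


# "Localizer" training set: ten noisy samples per image (hypothetical count).
train_x = [noisy(prototypes[c], 1.0) for c in range(N_CLASSES) for _ in range(10)]
train_y = [c for c in range(N_CLASSES) for _ in range(10)]

# "Main task" test set: one mean pattern per trailing image.
test_x = [noisy(prototypes[c], 0.5) for c in range(N_CLASSES)]
test_y = list(range(N_CLASSES))

centroids = train_centroids(train_x, train_y)
acc = accuracy(centroids, test_x, test_y)
print(f"chance = {CHANCE:.1%}, toy decoding accuracy = {acc:.1%}")
```

In the actual analysis, one such accuracy per participant, ROI, and attention condition would then be compared against the chance level across participants.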

Discussion

In the present study, we set out to investigate how sensory attenuation following visual statistical learning is modulated by attention. In line with previous studies (Alink et al., 2010; den Ouden et al., 2010; Kok et al., 2012a; Richter et al., 2018; Summerfield et al., 2008) we found a significant and wide-spread attenuation of neural responses to expected compared to unexpected stimuli. Crucially, we showed that attending to the predictable stimuli is a prerequisite for this expectation suppression effect to arise. While unattended objects led to reliable and stimulus-specific increases in neural activity, and object pairs were equally learned for these stimuli, there was no differential activity depending on whether the trailing object was expected or unexpected. Additionally, we found that higher visual areas exhibited stimulus-specific expectation suppression, whereas early visual cortex showed a global, stimulus unspecific suppression, possibly arising from a general increase in arousal in response to surprising stimuli.

Attention is a prerequisite for expectation suppression

Our results show that a core neural signature of perceptual expectations, expectation suppression (Alink et al., 2010; den Ouden et al., 2010; Kok et al., 2012a; Richter et al., 2018), is only evident when attention is directed to the predictable object stimuli. Specifically, when participants engaged in an object categorization task, we found a wide-spread reduction of neural activity for expected compared to unexpected stimuli throughout the ventral visual stream (V1, LOC, TOFC), as well as several non-sensory areas (anterior insula, inferior frontal gyrus, precentral gyrus, and superior parietal lobule). Strikingly, no modulation of neural activity by expectation was found when attention was drawn away from the object stimuli.

Interestingly, by directly comparing our present data with a previous dataset, in which we used a similar design (reported in Richter et al., 2018), we established that expectation suppression is present throughout the ventral visual stream irrespective of whether predictions are task-relevant, as long as the object stimuli are attended. In contrast, the larger activity for surprising stimuli in non-sensory areas (insular, frontal and parietal cortex) was only observed in the context of task-relevant expectations. This suggests that neural activity in the ventral visual stream is modulated by conditional probabilities, as long as the stimuli are attended, while the modulations in non-sensory regions probably reflect differences in task demands, given that unexpected stimuli were more difficult to categorize (reflected by a cost in speed and accuracy). During the object classification task, unexpected objects may require response inhibition, reevaluation of the category, and thus a new response decision. Given that the anterior insula has been associated with task control, action evaluation (Brass and Haggard, 2010), as well as general attentional processes (Nelson et al., 2010), and inferior frontal gyrus with response inhibition (Aron et al., 2003; Aron et al., 2004), the interpretation that the expectation modulation in non-sensory clusters may reflect task-related aspects, but not conditional probabilities per se, appears well supported by previous research.

Finally, our results also demonstrate that larger expectation suppression effects in V1 and TOFC are associated with increased reaction time benefits afforded by expectations when people are judging the predictable objects. This suggests that the observed expectation suppression effect may not merely constitute an epiphenomenon of more resource efficient neural processing. Instead, given the present data, it is plausible that the behavioral advantage of predicting stimuli may partially be rooted in improved and more effective sensory processing already at the early stages of visual processing. Predictions may thus help in converging more rapidly on an interpretation of the current sensory input, thereby contributing to faster reactions to expected than unexpected stimuli.

No perceptual predictions without attention

Our results corroborate and extend earlier work by Larsson and Smith (2012), who observed that stimulus expectation only affected repetition suppression when the stimuli were attended. At the same time, they appear at odds with several previous studies that have reported expectation suppression in the visual system for stimuli that were not task-relevant and thus appeared unattended (den Ouden et al., 2009; Kok et al., 2012a; Kok et al., 2012b). However, in all these studies, while the predictable stimuli were task-irrelevant, attention was not effectively drawn away by a competing stimulus that required attention. While our attention manipulation is also based on task-relevance, we do engage attention elsewhere using a competing task. This is a crucial difference between the present and previous studies, because it is likely that any supraliminal stimulus, in the absence of competition, will be attended to some degree, even if it is not task-relevant, especially if the stimulus is surprising (Horstmann and Herwig, 2015). Indeed, synthesizing earlier and current findings, we can conclude that expectation suppression in the visual system occurs irrespective of exact task goals and relevance of the predictable objects and their predictable relationship, but it is abolished by drawing attention away from the stimuli. This suggests that the integration of prior knowledge and sensory input is gated by attention: prior knowledge only exerts an influence on stimuli that are in the current focus of attention, instead of automatically and pre-attentively modulating sensory input as an obligatory component of perceptual processing.

It is however possible that other, more ‘stubborn’ prior expectations (Yon et al., 2019) that are derived over longer (ontogenetic or phylogenetic) time scales may persist even when attention is drawn away, such as perceptual fill-in during the Kanizsa illusion (Kok et al., 2016). Therefore, it is crucial to discriminate between different types of predictions, as expectations of different sources may rely on different neural mechanisms and therefore have distinct properties. Similarly, for simple stimuli, such as oriented gratings (Kok et al., 2012a; Kok et al., 2012b) or simple sequences (Ekman et al., 2017), the resolution of expectations may depend less on recurrent processing throughout the visual hierarchy than for complex objects. Thus, it is conceivable that the automaticity of predictive processing partially depends on the complexity of the predictable stimuli and their association, with increasing complexity requiring increasing processing across the hierarchy, and in turn a focus of attention on the predictable stimuli.

Specific vs. unspecific surprise responses

In LOC and TOFC, expectation suppression was largest in neural populations that were driven by the stimuli. Surprisingly, this was not the case in V1, where the suppression was uniformly present in the population that was driven by the stimuli and the population that was not. This replicates the results of our previous study (Richter et al., 2018) and suggests that the expectation suppression we observe in V1 is not the result of a stimulus-specific reduction in prediction error responses of neurons processing the stimulus. Rather, it suggests that the observed expectation suppression effect in V1 may be accounted for by a more general response modulation. Widespread nonperceptual modulations of visual cortical activity have been documented in response to unexpected events (Jack et al., 2006; Donner et al., 2008) and have been suggested to be linked to the cholinergic or noradrenergic system (Aston-Jones and Cohen, 2005; Yu and Dayan, 2005a). Interestingly, both the cholinergic and noradrenergic systems have also been associated with fluctuations in pupil dilation (Reimer et al., 2016). In line with this, we found a significantly enhanced pupil dilation in response to unexpected stimuli when the objects were attended. This suggests two possible global mechanisms which may partially account for the observed unspecific expectation suppression effect. Given that both pupil dilation (Reimer et al., 2014; Vinck et al., 2015) and the noradrenergic system (Berridge et al., 2012) are associated with arousal changes, it is possible that expectation suppression is partially accounted for by an increased arousal in response to surprising stimuli.
A related explanation is that enhanced pupil dilation to surprising stimuli (Damsma and van Rijn, 2017; Kloosterman et al., 2015; Preuschoff et al., 2011) results in enhanced retinal illumination, which in turn leads to stronger responses in early visual areas (Haynes et al., 2004), which could potentially also contribute to stimulus unspecific expectation suppression in V1. These interpretations are further supported by the fact that expectation suppression and pupil dilation differences between unexpected and expected attended stimuli were associated, with trailing images that elicit larger pupil dilation differences also showing more pronounced expectation suppression in V1.

It is unlikely, however, that these explanations can fully account for the observed expectation suppression effect across the visual hierarchy, given the stimulus-specificity of suppression in LOC and TOFC. Also, it is important to bear in mind that earlier studies, using different stimuli and paradigms, did observe stimulus-specific expectation effects in V1 (Kok et al., 2012a; Gavornik and Bear, 2014). Combined, the evidence suggests that the resolution of prediction errors crucially depends on the visual areas that specifically code the feature that is diagnostic of an expectation confirmation or violation, while areas below this level may only witness an unspecific, global modulation in their response, signifying the binary expectation confirmation or violation.

Attention and prediction errors

Within the predictive coding framework, it has been suggested that attention modulates the gain of prediction error units (Feldman and Friston, 2010). At first glance, our results may not appear compatible with the suggestion that attention modulates the gain of prediction errors, because we observe a stimulus-specific bottom-up signal (prediction error) when stimuli are unattended, but no difference in the size of this prediction error between expected and unexpected stimuli. However, it is conceivable that the gain modulation of activity in prediction error units only occurs after the initial feedforward activity sweep, once the object predictions are strongly activated and start exerting an effect on the resolution of the prediction error. In particular, the response to unexpected attended stimuli may be upregulated by attention, while prediction errors for expected attended stimuli are rapidly resolved, thus resulting in the difference in activity for attended objects. On the other hand, when attention is drawn away from the object stimuli, a reduced gain on prediction error units results in the observed attenuation of overall BOLD responses, and an absence of a reliable difference between expected and unexpected stimuli. A closely related, but conceptually distinct, interpretation is that attention constitutes a (modulation of the) prior itself (Rao, 2005; Yu and Dayan, 2005b). On this account, attention boosts relevant predictions, as during the object classification task, thus leading to wide-spread expectation suppression, due to larger prediction errors for unexpected compared to expected stimuli. However, when attention is disengaged from the object stimuli, object predictions are not generated, and thus do not exert an effect on sensory processing.

Interpretational limitations

One may wonder whether the character categorization task at fixation may have drawn attention away from the objects so forcefully that the object stimuli were no longer processed by sensory cortex. It is important to note here that, although attention was engaged at fixation by the character categorization task, this task was of trivial difficulty. Thus, it seems unlikely that attentional resources were exhaustively engaged by the task, preventing any processing of the surrounding object stimuli, thereby causing the absence of predictive processing. Indeed, behavioral performance was at ceiling during both tasks. Furthermore, even when objects were unattended, reliable visual processing took place, as evidenced by strong responses and object-specific neural patterns in the ventral visual stream. This suggests that in-depth visual processing of object stimuli did occur in the absence of attention, but predictive processes in particular ceased.

Another alternative explanation of the present results could be that predictive relationships were not learned for the set of objects that were used during the character categorization task, thereby accounting for the absence of a prediction effect. The pair recognition task at the end of the experiment, however, showed that associations were learned for both image pair sets. Thus, neither a lack of visual processing nor an absence of learning can account for the observed results. Also, it is worth noting that the probabilistic associations used here (P(expected|cue)=0.5) may initially appear weaker than in some previous studies; for example, Egner et al. (2010), Kok et al. (2012a), and Summerfield et al. (2008) used P(expected|cue)=0.75. However, the likelihood ratio of expected/unexpected stimuli (0.5/0.1 = 5) used here is actually larger (i.e., each unexpected image is more surprising) than in the cited studies (0.75/0.25 = 3). Moreover, similar probabilistic associations have been successfully employed in studies investigating neural effects of statistical learning in both non-human primates (Meyer and Olson, 2011) and humans (Richter et al., 2018). In short, the utilized conditional probabilities are comparable to previous studies investigating statistical learning. Finally, it is worth emphasizing that neither adaptation nor familiarity effects can account for the observed results, because all trailing objects served both as expected and unexpected images, depending only on temporal context (i.e., the leading image).
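The comparison of likelihood ratios above can be made explicit with a short worked example. The sketch below uses exact fractions to reproduce the two ratios, and additionally expresses the surprise of a single unexpected image as Shannon surprisal in bits, a measure that is not used in the text and is added here only for illustration. It also assumes that the probability of 0.1 per unexpected image arises from five equally likely unexpected alternatives among six trailing images; this is an inference from the numbers given, not an explicit statement in this section.

```python
import math
from fractions import Fraction


def likelihood_ratio(p_expected: str, p_unexpected: str) -> Fraction:
    """Ratio of expected to unexpected stimulus probability, computed exactly."""
    return Fraction(p_expected) / Fraction(p_unexpected)


def surprisal_bits(p: str) -> float:
    """Shannon surprisal, -log2(p), of an event with probability p."""
    return -math.log2(float(Fraction(p)))


# Present study: P(expected|cue) = 0.5, each unexpected image 0.1.
present = likelihood_ratio("0.5", "0.1")
# Cited studies: P(expected|cue) = 0.75, unexpected 0.25.
earlier = likelihood_ratio("0.75", "0.25")

# Assumption: five unexpected alternatives at 0.1 each, so probabilities sum to 1.
total_present = Fraction("0.5") + 5 * Fraction("0.1")

print(f"likelihood ratio, present study: {present}")
print(f"likelihood ratio, cited studies: {earlier}")
print(f"surprisal of one unexpected image, present: {surprisal_bits('0.1'):.2f} bits")
print(f"surprisal of one unexpected image, cited:   {surprisal_bits('0.25'):.2f} bits")
```

Both measures point in the same direction: although the expected image is less probable here than in the cited studies, each individual unexpected image is rarer.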

Conclusion

In sum, our results suggest that visual statistical learning results in attenuated sensory processing for predicted input, but only when this input is attentively processed. Thus, attention seems to gate the integration of prior knowledge and sensory input. This places important constraints on neurocomputational theories that cast perceptual inference as a process of automatic integration of prior and sensory information.

Figure 1 (with source data 1): Expectation suppression within the ventral visual stream depends on attention.

Figure 2 (with source data 1): Expectation suppression across cortex for attended object stimuli only.

Figure 3 (with source data 1): Larger pupil dilations in response to unexpected compared to expected stimuli during the objects attended task.

Figure 4 (with source data 1): Expectation suppression is associated with pupil dilation differences and behavioral benefits of expectations.

Figure 5 (with source data 1): Behavioral results demonstrate statistical learning.

Experimental paradigm.

Author details

David Richter

Floris P de Lange
