Introduction

The involvement of brain oscillations in cortical computations and their role in laminar communication channels are highly debated in the neuroscience community. Although the significance of γ band oscillations for cortical computations remains uncertain (Hermes et al., 2019; Shirhatti et al., 2022; Dowdall et al. 2023; Schneider et al., 2021), they have been associated with bottom up stimulus feature specific processing (Fries, 2009, 2015). In line with anatomical findings, γ band oscillations are primarily associated with neuronal activity in granular and supra-granular layers (Bastos et al., 2012). Hence, nearly all current frameworks predict that γ band oscillations should be observed specifically during ongoing stimulus feature processing in supra-granular layers (Fries, 2015; Bonnefond et al., 2017; Bastos et al., 2012).

In contrast, α oscillations are thought to reflect a general inhibitory process (Klimesch et al., 2007; Pfurtscheller et al., 1996; Jensen and Mazaheri, 2010) (but see e.g. Bonnefond et al. 2017 or Bonnefond et al. 2024). A prevailing idea is that α is modulated at a global scale with low spatial specificity, possibly explaining the lack of α modulation observed in some animal studies using local references. However, an intracranial EEG study revealed that α band could be associated with surround suppression (Harvey et al., 2013). This suggests that α band oscillations can exhibit high spatial specificity and possibly be tightly linked to the activity of neuronal populations processing specific stimulus features in primary visual regions, or at least act with receptive field specificity. The discrepancies between results indicating high versus low spatial specificities leads to the assumption that α band activity reflects a family of low-frequency oscillations that serve functionally distinct roles. As proposed by Klimesch in the 90s (Klimesch, 1997; Klimesch et al., 1998; Klimesch, 1999), different frequency bands, lower and upper α band, could be associated with these distinct roles (see also Rodriguez-Larios et al. 2022; Bonnefond et al. 2017). Another debate relates to the laminar profile of α. Animal and human studies have reported that α oscillations are mainly linked to feedback-directed signaling, implying the involvement of top-down processes (van Kerkoerle et al., 2014), possibly coordinated with the involvement of the pulvinar (Saalmann et al., 2012). Stimulation of higher-order visual regions have been found to trigger α band activity in primary visual regions (van Kerkoerle et al., 2014). Accordingly, α oscillations have been reported to be the strongest in the deep layers which receive feedback connections from higher-order regions (van Kerkoerle et al., 2014; Spaak et al., 2012; Scheeringa et al., 2016). However, the laminar profile of α oscillations remains debated with other studies. Those mostly use current source density approaches, reporting a stronger power in superficial layers (Haegens et al., 2015; Halgren et al., 2019), receiving feedback projections, but also feed-forward input from granular layer generators. Further research therefore is needed to better understand the laminar profile of α oscillations in different frequencies and their spatial specificities.

Addressing these questions requires a thorough investigation of feature-specific cortical activity on the laminar level, which has traditionally posed a challenge in healthy human participants. Hence, most evidence has been obtained from animal models. Past studies have primarily focused on methodological obstacles (Scheeringa et al., 2016, 2023; Bonaiuto et al. Yang, 2021) or have relied on laminar level functional magnetic resonance imaging (fMRI) alone (Kok et al., 2016; Lawrence et al., 2019a; Self et al., 2019; Stephan et al., 2019; Yang et al., 2021). However, fMRI lacks the temporal resolution required and serves only as a proxy for neuronal activity. In contrast, electroencephalography (EEG) and magnetoencephalography (MEG) directly record neuronal activity, but have relatively low spatial resolution because the signals obtained stem from large populations of neurons. Recent studies combining high-resolution (laminar-level) fMRI with EEG recordings have successfully addressed these limitations (Scheeringa et al., 2011, 2016). In our study, we employed this approach to investigate, for the first time, feature-specific BOLD signals in the deep (infra-granular), middle (granular), and superficial (supra-granular) cortical layers in primary visual regions. In addition, we not only examined the relationship between EEG and positive BOLD signal deflections but also related EEG power changes to negative BOLD deflections, addressing the link between α and inhibition potentially interfering with unwanted information. Negative BOLD signal deflections have received less attention and mostly focused on the default mode network (Hinz et al., 2019; Mayhew et al., 2013; Raichle et al., 2001), however sparse evidence suggests that negative BOLD deflections contribute to understanding ongoing neuronal processing and specifically the inhibition of unwanted information (Tootell et al., 1998; Shmuel et al., 2006). In our analyses, we focused on feature-specific and feature-unspecific BOLD signal changes, highlighting the general importance of studying BOLD signal decreases for functional tasks.

The primary aim of this study was to investigate the relationship between cortical oscillations and feature-specific and feature-unspecific BOLD signals across multiple cortical layers. We employed a visual oddball task with stimuli consisting of two orthogonal gratings offset by 90° that served as features of interest (left or right oriented gratings). Our results demonstrate that γ band activity not only relates to the feature-specific superficial layer BOLD signal (Bonnefond et al., 2017; van Kerkoerle et al., 2014; Scheeringa et al., 2016; Van Kerkoerle et al., 2017) but also to deep layer BOLD activity. Additionally, we found that general modulatory, feature unspecific processes, potentially associated with attention related mechanisms, and feature-specific laminar activation profiles were linked to distinct α frequency bands.

This study provides the first direct evidence in healthy human participants that low-frequency oscillations serve multiple purposes in the visual cortex, associated with distinct cortical layer profiles.

Results

Behavioural and intermediate results

On average (SD) participant’s response accuracy was 94% (8%) with a false alarm rate of 2% (3%) to non-oddball stimuli and a miss rate of 5% (7%), indicating that participants performed the task adequately well and complied to the task instructions. The average (SD) reaction time was 759 ms (131 ms). See Figure 1 for a graphical representation of the task.

Experimental Procedure of the main experiment

1.2s prior to the stimulus onset, the fixation indicator would turn from green to red, indicating the subject to avoid blinking. After a period of 0.4s (0.8s before stimulus onset) the last fMRI volume of the previous trial finished recording. For a period of 3s no fMRI data was collected in order to avoid gradient artifacts in the EEG data. The stimulus presented is a left or right (± 45°from vertical axis) oriented grating. 16.7% of stimuli (2 × 8.3% for left and right respectively) were modified to have a slight wavy pattern (see example grating). Participants were asked to respond to those oddball trials with a button press. If no oddball trial was presented, the stimulus would remain on the screen for 1.6 s, followed by a 0.6s period where only the fixation indicator is shown. In case of a response, corresponding feedback (“correct”, “false alarm” or “miss”) is displayed instead. After that, the fixation indicator turns back to green, indicating to the subject that the period to avoid blinking has ended. Now, three consecutive 3D EPI volumes (TR : 3.3 s) are recorded, before the next trial starts. Overall, 240 trials (four blocks with 60 trials each) have been collected. Additionally, a high resolution (0.8 mm iso voxel size) T1 weighted full brain image has been collected before the main experiment to obtain the individual subject’s anatomy. Furthermore, three blocks of pRF mapping (128 trials each) have been performed after the main experiment using the same fMRI sequence as in the main experiment (without gaps).

The EEG signal, used to construct regressors for the final combined EEG-fMRI analyses, was obtained by a time frequency (TF) analysis of each virtual channel, computed for each voxel location of the gray matter in V1. In each hemisphere, one virtual channel for low and one for high frequencies was selected that showed the strongest α decrease and γ increase respectively. Figure 2 B depicts the average EEG response of the selected, TF transformed virtual channels averaged over all trials and participants. By visual inspection, the main α band decrease was determined to last from 0.2 s to 0.8 s after stimulus onset and comprised frequencies from 8 Hz to 14 Hz. The main γ band increase was determined between 0.1 s and 0.7 s after stimulus onset for a frequency range between 50 Hz and 70 Hz. A time window between 0.1 s and 0.8 s, was used for the main analyses, since both time windows for α and γ responses are covered. Furthermore, this time window ensures that only time bins went into the analyses that were related to the actual stimulus processing (μRT = 759 ms). To verify that the main TF response of interest originated at the occipital pole, a DICS beamformer full brain analysis had been conducted at the centre frequencies of the selected bands. As visible in Figure 2 A, the top 5% of the average decrease at 11 Hz or increase at 60 Hz in localized to occipital regions.

Intermediate results for the EEG and fMRI data.

A) Full brain DICS beamformer results. Subject average of log-ratios between stimulus and baseline for 11 Hz (α)and 60 Hz (γ) for both sets of separately filtered EEG data. This serves illustrative purposes only, since the virtual channels of interest have been selected from time-frequency transformed virtual channels obtained using LCMV beamforming. Here, the 5% vertices with the strongest decrease (top) or increase (bottom) are shown. B) Time-frequency representation of virtual EEG channels. Subject average of log-ratios between stimulus and baseline of time-frequency transformed virtual channels, obtained using LCMV beamforming (2 × 2 virtual channels for low and high frequencies and both hemispheres separately). Only the right hemispheric channels are shown. The white empty square indicates the data points that were included in the combined EEG-fMRI analyses. Average reaction time and stimulus onset are indicated by a continuous or dashed white line, respectively. C) Average t-value distribution. Surface projection of average t-map of the first level contrast for the general fMRI activation (top) and the contrast between left and right stimulus orientation (bottom) for illustrative purposes.

As expected, the strongest BOLD signal increase was found at the occipital pole, due to the central presentation of the stimulus (Engel et al., 1997; Dumoulin and Wandell, 2008). In addition, we observed a clear-cut t-value distribution pattern around the calcarine sulcus, with more negative tvalues located dorsal and more positive t-values ventral to the fissure. The first level contrast between both stimulus orientations did not reveal any clear pattern, which served as a sanity check. See Figure 2 C for a visualization.

Combined EEG-fMRI analyses

The relationship between the EEG and fMRI data has been investigated on a trial-by-trial basis by means of a general linear model (GLM) as was previously done (Scheeringa et al., 2011, 2016). A separate GLM model was computed for each TF bin by convolving the EEG response of the time-frequency transformed virtual channels with the standard hemodynamic response function as built into SPM12. Task and nuisance fMRI regressors served as control parameters and were fixed for each model. β coefficients for every voxel that were taken into consideration were multiplied by the layer weights for those respective voxels. This procedure was repeated for multiple activation thresholds, such that the most activated (or deactivated) 5%, 10% and 25% voxels have been considered. The data were averaged for a time window between 0.1 s and 0.8 s after stimulus onset.

Feature unspecific BOLD activation

Feature unspecific BOLD activity was defined as the response to both stimulus orientations combined. In addition to positive feature unspecific BOLD deflection, negative signal changes have been investigated as well. For all three sub-selection thresholds (5%, 10%, 25%) of positive and negative BOLD deflections, a negative relationship between α power changes and BOLD signal change has been found by means of a cluster permutation test (Maris and Oostenveld, 2007). The strongest negative relationship between α and the negative BOLD signal was found mainly in superficial layers. Neither for positive nor for negative voxel subselections, a significant relationship between the γ band EEG power regressors and the BOLD signal has been observed. See Figure 3 for a visualization of the results and Table 1 for p-values, adjusted using the Benjamini-Hochberg approach (Benjamini and Hochberg, 1995).

False discovery rate (FDR) adjusted p-values, computed using the Benjamini-Hochberg procedure (Benjamini and Hochberg, 1995) for the respective voxel sub-selections in V1.

FDR corrections where computed on the p-values resulting from either the cluster permutation tests (Maris and Oostenveld, 2007) or the corresponding aros tests Clausner and Gentili, (2022) across the respective selection thresholds (5%, 10% and 25%). The circle symbol (°) indicates a p-value smaller than 0.05 before multiple comparison correction was applied. Conditions are abbreviated as follows: fu act.: feature unspecific activation; fs cont.: feature specific contrast; fs incr.: feature specific BOLD increase. See also Table 1 (supplementary material) for more information and Figure 4 and Figure 3 for a visual representation.

Average β coefficients

(average contribution per voxel) for feature unspecific activation. For each frequency bin, a separate regressor has been used and results were weighted with the respective layer contribution weights. Frequency bins (0.5 Hz steps for α and 2.5 Hz steps for γ) are indicated at the out-most half circle. Layers are indicated by coloured abbreviations for superficial (S), middle (M) and deep (D) layers. Percentages indicate the threshold at which voxels were selected and refer to the respective sub-selection (e.g. 5% would refer to the 5% most positive or negative t-values given a respective subselection). More saturated areas indicate a significant cluster after a cluster permutation test (Maris and Oostenveld, 2007) and FDR correction (Benjamini and Hochberg, 1995). For each cted significant cluster an aros test (Clausner and Gentili, 2022) across all layers for the widest possible frequency range that the cluster spans. In case of a significant result after FDR correction, white circles indicate in which layer the respective effect was strongest (3 circles), medium strong (2 circles) and weakest (1 circle). The strongest effect in the α band corresponds to the most negative β values, whereas in the γ band the strongest effect refers to the most positive values. Corresponding cluster and aros p-values are shown in Table 1.

Feature contrast

Contrasting trials, where the respective stimulus orientation was left with those where the stimulus orientation was right, yields the feature specific contrast. Voxel were separated into those with a stronger response to left oriented stimuli (L) or those with a stronger response to right oriented stimuli (R). Voxel preferences have been combined with EEG power regressors either congruently (EEGco; stimulus orientation of EEG regressor matches voxel preference) or incongruently (EEGinco; stimulus orientation of EEG regressor does not match voxel preference). Furthermore, the contrast EEGcoinco has been computed for each respective sub-selection of voxels (5%, 10%, 25%). We found a significant negative relationship between α band power and the BOLD signal for EEGco (predominantly in superficial layers) and EEGcoinco (predominantly in superficial and deep layers). A result significant at the trend level (before correction for multiple comparisons) for EEGinco was observed as well. A mixed-effects model, accounting for variability across layers, has been conducted to compare upper frequency α (11 − 13 Hz) and lower frequency α (8 − 10 Hz) with respect to congruency. After correction for multiple comparison, a significant interaction (pFDR < 0.01) indicates a stronger negative relationship, specifically of high frequency α with congruently selected voxels. This indicates a more feature related contribution of high frequency α and a more general contribution of low frequency α to the overall effect. No significant relationship between γ band power and the BOLD signal has been observed after correction for multiple comparisons. However, a trend level result has been found for the 25% threshold, with the strongest positive relationship in deep and superficial layers. See Figure 4 A for a visualization of the results and Table 1 for exact p-values, corrected using the Benjamini-Hochberg procedure (Benjamini and Hochberg, 1995) to adjust false discovery rates (FDR).

Average β coefficients for V1

(average contribution per voxel) resulting from GLM using EEG power regressors to predict the BOLD signals for various voxel sub-selections. For each frequency bin a separate model has been computed and results were weighted with the respective layer contribution weights. Frequency bins (0.5 Hz steps for α and 2.5 Hz steps for γ) are indicated at the out-most half circle. Layers are indicated by coloured abbreviations for superficial (S), middle (M) and deep (D) layers. Voxel were selected based on the first level contrast as follows: (A) Voxel that respond stronger for one orientation over the other irrespective of the response to one condition alone; (B) Voxel that respond strictly positive for either orientation and stronger for one orientation over the other; See also Table 1 for an overview of how the conditions are constructed. Percentages indicate the threshold at which voxels were selected and refer to the respective sub-selection (e.g. 5% would refer to the 5% most positive or negative t-values given a respective sub-selection). Congruent thereby means that if a voxel e.g. responds stronger to left oriented gratings over right oriented gratings, it was paired with EEG based predictors for the left orientation trials (and vice versa). In turn, incongruent means that if a voxel e.g. responds stronger to left oriented gratings over right oriented gratings it was paired with EEG based predictors for the right orientation trials (and vice versa). More saturated areas indicate a significant cluster after a cluster permutation test Maris and Oostenveld, (2007) and FDR correction (Benjamini and Hochberg, 1995). Black dots indicate a cluster p-value of > .05 after, but < .05 before the correction was applied. For each significant cluster, an aros test (Clausner and Gentili, 2022) has been conducted across all layers for the widest possible frequency range that the cluster spans. In case of a significant result after FDR correction, white circles indicate in which layers the respective effect was strongest (3 circles), medium strong (2 circles) and weakest (1 circle). If the aros p-value was < .05 before, but > .05 after the correction, aros shapes are indicated using gray circles. The strongest effect in the α band corresponds to the most negative β values, whereas in the γ band the strongest effect refers to the most positive. Corresponding cluster and aros p-values are shown in Table 1. The right sub-panel indicates the average upper (11 − 13 Hz) and lower (8 − 10 Hz) α power. The p-value was calculated based on a linear mixed effect model controlling for individual layer activation and has been FDR corrected (Benjamini and Hochberg, 1995).

Feature specific BOLD increase

The Feature specific BOLD increase was defined similar to the feature specific contrast (see above), but limited to the sub-selection of voxels that respond strictly positive to either orientation compared to baseline (stimulusBL > 0). We observed a negative relationship between the α frequency band and the feature specific BOLD signal increase for EEGco. A trend level result (significant uncorrected for multiple comparisons) for EEGinco was observed as well. No significant effect was observed for the contrast EEGcoinco in the α band. The laminar profile for EEGco has been observed to be flat except for the most liberal threshold (25% most activated voxel) where the strongest relationship between α and the BOLD signal was located in superficial layers followed by deep layers. For the trend level clusters in EEGinco the strongest effect was found in deep layers. The mixed-effects model for lower frequency α (8 − 10 Hz) and upper frequency α (11 − 13 Hz) yielded, a significant interaction (pFDR < 0.01), indicating a stronger negative relationship for high frequency α with congruently selected voxels. Within the γ frequency range, we observed a positive relationship between the γ band EEG signal and the feature specific BOLD signal increases for the contrast between congruent and incongruent conditions. Only for the most liberal threshold (25%) a significant layer profile was obtained, with the strongest relationship between γ and the BOLD signal in superficial and deep cortical layers. See Figure 4 B for a visualization of the results and Table 1 for exact p-values, corrected using the Benjamini-Hochberg procedure (Benjamini and Hochberg, 1995) to adjust false discovery rates (FDR).

Discussion

In the present study, we combined laminar-level fMRI with simultaneously recorded EEG in healthy human participants to investigate how stimulus-induced changes in α and γ band power relate to the BOLD signal across cortical layers. We found that γ band activity was positively correlated with feature-specific increases of the BOLD signal in superficial and deep layers (see Figure 4).

Additionally, we observed a negative relationship between α band power and feature-specific BOLD signal activity in superficial and deep layers, primarily for frequencies between 11 and 13 Hz (see Figure 4). Furthermore, we found α band oscillations to be negatively related to BOLD signal increases and decreases for any stimulus (global modulation; see Figure 3)

The strongest BOLD increase (independent of the stimulus orientation) was found at the occipital pole, which was expected given the central stimulus presentation (Engel et al., 1997; Dumoulin and Wandell, 2008). In addition, we observed reduced BOLD activity dorsal to the calcarine sulcus (Figure 2 C top). Even though attention has not been manipulated as part of this experiment, the overall distribution of positive and negative BOLD reported here, closely resembles previous findings on the retinotopy of the attentional spotlight, where receptive fields at attended locations would exhibit higher BOLD activity and unattended parts lower BOLD activity (Tootell et al., 1998). One interpretation would be that the task could be solved by spatially attending a part of the stimulus only. In addition, negative deflections do probably not reflect a case of “blood stealing” (Smith et al., 2004), since it has been found that a decrease in attention led to a decrease in the BOLD signal (Tomasi et al., 2006), which has been related to decreased neuronal activity (Shmuel et al., 2006).

The combined EEG-fMRI analyses revealed a positive correlation between γ and feature-specific voxels in both superficial and deep layers. The relationship between γ and superficial layer BOLD is similar to what has been reported before and was predicted by the literature (Scheeringa et al., 2016; Bonnefond et al., 2017). Previous work in monkeys also attributed γ band activity to superficial layers (van Kerkoerle et al., 2014). Accordingly, oscillations in the γ band have been attributed mostly to a feed-forward flow of information via superficial anatomical connections (Fries, 2015; Bastos et al., 2012). However, we also found a relationship between γ and deep layer BOLD, which was not reported previously. A notable difference to previous work on laminar level EEG-fMRI is that the present study included a contrast between congruent and incongruent features. Since the here reported deep layer γ effect is stronger compared to middle layers, we do consider this finding functionally relevant and not related to higher SNR, which is higher for more superficial layers (Koopmans et al., 2010). Furthermore, even though van Kerkoerle et al.(2014) discuss the γ effect with respect to superficial layer neuronal activity only, their data shows a significant peak in layer 6 as well, which might even be stronger as compared to superficial layers and potentially even stronger modulated by task relevance (see figures 2F and 2G in the respective publication). In fact, an analysis of the relationship between the EEG signal and the BOLD signal that focused on the feature contrast only (LR; independent of the comparison to baseline) revealed a trend level result with an even stronger deep layer contribution as compared to superficial layers. This strongly implies an involvement of γ oscillations in feature specific processes (Fries, 2009). Since γ band oscillations are thought to be a marker of ongoing active processing of stimuli (Fries, 2009, 2015; Vinck et al., 2023; Spyropoulos et al., 2024) or directly reflect neuronal spiking activity (Ray and Maunsell, 2015), limiting the analysis strictly to voxels with a positive t-value might have increased the SNR substantially, resulting in a stronger effect for feature specific BOLD signal increases as compared to the feature contrast alone. Recent publications on the information exchange within and between primary visual cortex areas of macaques also reported deep layer γ band activity depending on the stimulus material (Gieselmann and Thiele, 2022; Ferro et al., 2021). Those publications challenge the feed-forward exclusivity of γ altogether by revealing intra-area feedback communication in V1 from layer 5 to layer 6 and layer 6 to supra-granular layers. Possibly, the relationship between γ and deep layer BOLD we observed is also related to similar processes. However, it was also shown that feed-forward connections to higher-order regions do exist in deep cortical layers of macaque monkeys too (Markov et al., 2014). Conclusive evidence about the directionality and source of the γ related deep layer effect thus remains uncertain. The correlative nature of the here presented approach cannot provide such insights and are likely better addressed in recordings in animal models (van Kerkoerle et al., 2014). Our data however, clearly supports the investigation of feature processing related deep layer γ band activity.

Within the α band, we found a general negative relationship between α band power and the orientation-independent BOLD signal (with a stronger response in superficial layers for the negative BOLD signal; see Figure 3). While the negative relationship between α band power changes and positive BOLD signal deflections has been reported previously (Zumer et al., 2014; Scheeringa et al., 2011, 2016), a similar relationship with BOLD signal decreases has not been studied as extensively. Whether α band power increases locally relative to baseline has not been tested. In general, α power decrease have been observed very prominently and thus would likely overshadow local increases (see Figure 2 A and B). Hence, we interpret the negative relationship between α power and the negative BOLD signal as a reduction of power decreases that would be linked to the locally reduced BOLD activity (Figure 2 C top). Future experiments manipulating attention might reveal a similar relationship between α and superficial layer cortical activity for suppressed (unattended) receptive fields. The negative relationship between α and this negative BOLD signal is in line with the hypothesized inhibitory nature of α (Klimesch et al., 2007; Jensen and Mazaheri, 2010; Bonnefond et al., 2017; Zumer et al., 2014).

Furthermore, we observed a significant relationship between superficial and deep layer BOLD and α (above 11 Hz) for the feature contrast voxel selection between congruent and incongruent EEG-fMRI regressor pairings (see Figure 4 A). This again could be considered as indirect evidence for the relationship between α band power changes and BOLD signal decreases. In addition, the significant relationship between α power and BOLD signal for the feature contrast, indicates that α is not exclusively linked to global signal modulations, which has been the traditional perspective (Klimesch et al., 2007; Jensen and Mazaheri, 2010). More recent theoretical frameworks have predicted that α would instead be related to feature specific processes too (Bonnefond et al., 2017, 2024), which would be in line with our findings. Interestingly, however, we found that the relationship between α and feature specific BOLD signal modulations is limited to the upper part of the α band (11 − 13 Hz). We observed that lower α frequencies (8 − 10 Hz) were negatively correlated with the BOLD signal for congruent and incongruent voxel selections (see Figure 4 A), while the negative relationship between upper frequency α was found to be much stronger in the congruent voxel selection, which is reflected in a significant interaction between congruency and α sub-band. Hence, we interpret the lower α effect as a source of general modulation. Thereby, α power would possibly decrease in task-relevant pools of neurons and increase in task-irrelevant pools of neurons (i.e. attention-related) to inhibit potentially distracting information (Klimesch et al., 2007; Jensen and Mazaheri, 2010). On the contrary, upper frequency α would reflect feature-specific processes, only observed in neurons responding specifically to that stimulus orientation. If the feature congruence contrast was limited to strictly positive voxel (Figure 4 B), such effect was not significant. This could be explained by the limited variability if the analysis focused solely on positive BOLD signal changes, where voxels with a very large variability in response (e.g. strongly positive to one but negative to the other orientation) are excluded. Nevertheless, the differential frequency response, with stronger effects for frequencies between 11 and 13 Hz persists.

The literature on differences in frequency within the α band depending on visual stimulus features is sparse. According to Klimesch (1997), lower frequency α power was assumed to reflect attentional processes, while upper α band power is related to semantic memory demands. Here, lower frequency α appears to be more related to the modulation of negative BOLD signal changes for general processes (see Figure 3). This would be in line with the findings on lower frequency α reported by Klimesch. In turn, our findings on upper frequency α oscillations are more often related to feature specific processes (see Figure 4), which could be related to the findings on upper frequency α by Klimesch. He related upper frequency α power to memory performance under high cognitive load, which in turn could be seen as a proxy for how well the stimuli (i.e. its features) were encoded. A second publication by Rodriguez-Larios et al. (2022) employed a single-subject analysis of independent components, which revealed two dissociable α rhythms, both of which were differentially modulated by visual distractors. The lower α component increased in power, while the upper frequency α component decreased in power under the presence of a visual distractor. Behavioural accuracy was positively related with lower frequency α power and negatively related with upper frequency α. Again, we interpret those findings as indirectly in line with the here presented results, with the lower frequency α being related to more general, possibly attentional processes, and the upper frequency α to the content of the memory (i.e. the encoding of the visual stimulus). Furthermore, it could be shown that individual α frequency (IAF) and task performance are related, such that higher IAF is linked to higher visual task accuracy (Di Gregorio et al., 2022; Trajkovic et al., 2024; Coldea et al., 2022), potentially via increased perceptual acuity through increased IAF (Tarasi and Romei, 2024). Since behavioural performance in the present study was consistently high at 94% on average and participants were instructed to respond quickly to potential oddball stimuli, a higher α frequency might reflect a more successful stimulus encoding and hence faster and more accurate behavioural performance. This would also be in line with recent theoretical work relating α frequency and the speed of visual feature sampling (Bonnefond et al., 2024). An increased “sampling rate” via perceptual windows of opportunity that occur in faster succession, would allow for the extraction of more information in a shorter time window, which would be crucial to minimize reaction times. In addition, α amplitude and frequency have been shown to reflect two distinct processes, while α frequency has been related to task performance, amplitude could be demonstrated to be related to visual awareness or confidence judgements about individual task performance (Benwell et al., 2017, 2019; Trajkovic et al., 2024) that could even be shown to depend on the exact cortical region (Samaha et al., 2017). Lastly, higher frequency α (extending into the β range) has been shown to be causally linked to feedforward processing, via phase-amplitude coupling (PAC) between feedback related α and stimulus induced, feed-forward related γ. In a simultaneous transcranial magnetic stimulation (TMS) and EEG study, Trajkovic et al.(2025) demonstrated that brief TMS pulses to the prefrontal cortex in order to activate attentional control, caused increased γ band activity over occipital sensor sites through high α / β PAC with γ. This suggests that indeed feedback directed power changes in the upper α band support bottom up stimulus processing. Taken together, the here observed stronger effect for upper α frequencies for feature specific BOLD is in line with those findings. Whether the here reported difference in α frequency indeed reflects computationally different processes, or whether the same process operates at multiple frequencies, remains to be investigated.

Not only did we find a dissociation in the frequency domain between the relationship of α and the BOLD signal, but furthermore found that the laminar activation patterns provide further evidence for potentially multiple α-related processes. Low frequency α is related predominantly to the BOLD signal in superficial layers (except feature specific BOLD increase voxels with incongruent EEG feature pairing where the SNR is limited due to the focus on positive signal changes only), while the upper frequency α could be associated predominantly to superficial and deep layer activity (see congruence contrast in Figure 4 A). Superficial layer activity has been shown to be under modulation of attention (Halgren et al., 2019; Scheeringa et al., 2016; Bastos et al., 2020; Lawrence et al., 2019b), but mostly feedback related cortical activity in deep layers has been associated with stimulus features (Kok et al., 2016) and (feature) predictions (Bastos et al., 2020). As brought forward in the context of stimulus predictability, α band activity in deep layers might be more related to the predictability of certain stimulus features, whereas superficial α is thought to be more generally modulated, irrespective of stimulus features (Bastos et al., 2020). This would further be in line with research in mice, showing that superficial layer activity acts suppressive on deep layers in order to fine tune stimulus feature selectivity (Pluta et al., 2019).

One remaining question would be, how anticipatory and actual stimulus processing compare. It has been suggested that pre-stimulus α band activity in the upper range is predictive for task performance (Di Gregorio et al., 2022; Trajkovic et al., 2024), but evidence about the laminar profile of anticipatory processes is still lacking. Due to the limited time window prior to the stimulus onset and gradient artifacts of the MRI machine that lasted until 300 ms before the onset of the stimulus, pre-stimulus α could not be investigated in depth here. Figure 2 B however, indicates that pre-stimulus α might play a role for the present task.

A second remaining question is the relationship between the observed α and γ band effects. For the congruence contrast of feature specific voxels the strongest effects have been observed in superficial layers, followed by deep layers, which again supports the hypothesis that this deep layer effect is at least partly related to feature related processing. If and how lower and upper-frequency α band oscillations across different cortical layers interact with each other and how each or both interact with γ band oscillations (Bonnefond et al., 2017), needs to be investigated in detail in future experiments.

Future publications will furthermore need to take β band oscillations into account as well, as those are hypothesized to play a crucial role for visual stimulus processing and might exert a potential top-down influence (Betti et al., 2021; Bastos et al., 2012). Here, we only looked at β band oscillations exploratory and found no significant correlation between β and the BOLD signal. This could be due to the selection process for the respective EEG virtual channels that were optimized specifically for α, but furthermore could be explained by the burst-like nature of β oscillations (Bonaiuto et al., 2020; Betti et al., 2021) which makes them harder to capture with the respective analysis strategies employed here.

Methods and Materials

Participants

A dataset consisting of 52 right-handed individuals (34 of whom identified as female) between the ages of 18 and 35 (μ = 24.0, σ = 4.0) was collected. We only included participants who did not need eye correction (due to practical reason concerning the scanning procedure) and did not have a history of neurological or psychiatric issues or had not undergone neurosurgery. All participants provided informed consent and were monetarily rewarded for their participation. The study received ethical approval from the local ethics committee.

Data Acquisition

Functional and anatomical magnetic resonance imaging (fMRI) data were collected using a Siemens MAGNETOM Prismafit 3T MRI scanner equipped with a 64-channel whole head and neck coil. Before entering the scanner, each subject received detailed instructions and was given the opportunity to practice the main experiment in a short block. Once prepared, the subject was placed inside the scanner. A T1-weighted scan was acquired in the sagittal orientation using a 3D MPRAGE sequence (Brant-Zawadzki et al., 1992) with the following parameters : TR/TI = 2.2/1.1s, 11° flip angle, FOV 256 × × 256 180 mm and an 0.8 mm isotropic resolution. Parallel imaging(iPAT = 2) was used to accelerate the acquisition, resulting in an acquisition time of 6 min and 31 s. For the functional data, we utilized a 3D gradient-EPI (Poser et al., 2010) with CAIPI acceleration capabilities (Narsude et al., 2016) as implemented by Stirnberg et al. (2017). A partial brain acquisition using a coronal slab was encoded with a FOV 208.8 × 208.8 × 39.6 mm covering most occipital and parietal lobes, including primary visual regions. The flip angle was set to 20°, resulting in a near isotropic voxel size of 0.9052 × 0.9052 × 0.9 mm (volume TR: 3.3 s; TE: 34 ms). The sequence was modified to allow an arbitrary time delay between every 3 consecutive volumes. Here, the delay was set to 3 s to ensure unperturbed EEG data acquisition during this delay and that the whole BOLD HRF could be subsequently sampled afterwards.

This protocol was used for both, the main experiment and the retinotopic field mapping. However, for the latter, the sequence gap was omitted because no EEG data was recorded. Each experimental block started with six dummy volumes to allow for the magnetization to reach a steady state, but only the last three of those dummy volumes were actually recorded (and later removed for the data analysis).

In addition, we simultaneously recorded EEG data using a 64 channel MR compatible EEG system (Brain Products Inc, GmbH, Munich, Germany, 2018) at a sampling rate of 5k Hz. Impedances were kept below 20k Ω during subject preparation. Electrode positions were recorded using a photogrammetry-based approach, as described in Clausner et al. (2017). A 3D model, computed from approximately 50 photographs of participants wearing an EEG cap, was aligned via facial features to a 3D representation of the anatomical MRI. Electrode positions were determined from the photogrammetry based 3D model, transformed into MRI space and projected along the vertex normals to the MRI scalp surface.

Furthermore, eye-tracking data was simultaneously collected using an EyeLink 1000+ (SR Research, 2018), but later omitted from the analysis protocol due to insufficient data quality and cumbersome handling in the scanner.

The full experimental protocol comprised a highresolution anatomical T1 scan lasting for 8 min, followed by four consecutive blocks of EEG-fMRI recordings for the main experiment. Each block lasted for 14 min, with a total duration of 4 × 14 = 56 min. Three blocks of population receptive field (pRF) mapping were recorded hereafter, each block lasting for 7 min, utilizing the same fMRI recording sequence but without the 3 s pause for clean EEG data recording (3 × 7 = 21 min). Additionally, 20 resting-state volumes of that sequence with an inverted flip angle lasting 1 min were acquired for distortion field estimation. However, this was not included in the final analysis protocol. Instead a non-linear recursive boundary estimation (Van Mourik et al., 2019a) was used that simultaneously provides the cortical layer estimation (explained in detail below). The total duration of the experiment was approximately 150 min, including ≈ 40 min preparation time, a 5 − 10 min break between the two main experimental parts, and 15 min for participants to wash and dry their hair after the experiment.

Stimulus presentation

Stimuli were projected onto a screen behind the subject’s head using an EIKI LC XL100 projector (https://www.eiki.com/) with a resolution of 1024 × 768 px and a maximum brightness of 5,000 ANSI-lumen, and a contrast ratio of 1000 : 1. The effective field of view comprised a 24 × 18° visual angle at a distance of 855 mm relative to the subject’s eyes. Throughout the entire experiment, stimuli were presented in an otherwise dark scanner room. During the anatomical scan, participants were able to read the experiment instructions again and were asked to remain still with their eyes either opened or closed for the rest of the recording.

Main experiment

Participants performed a demanding visual attention task, using central stimulus presentation. The stimuli could either be left (counterclockwise) or right (clockwise) oriented gratings (± 45° relative to the vertical axis of the screen). A subtle wavy pattern was incorporated as oddball stimuli and participants were instructed to respond to them using their right index finger. The non-oddball to oddball ratio was set to 5 : 1. Stimuli were presented on a gray background with 50% luminance, using the “Presentation” software (Neurobehavioral Systems, Inc., 2018). A fixation indicator was designed based on the findings of Thaler et al. (2013), which consisted of a black, filled circle overlaid with a white cross (also known as the “Greek cross”), containing a central fixation dot (see Figure 1). This design was found to yield higher fixation performance compared to traditional fixation stimuli, such as simple crosses or dots. In our experiment, the central fixation dot at the center of the fixation indicator was either red or green, indicating to the participant whether they should avoid blinking (red = avoid blinking).

Feature specific (left or right oriented) stimuli were constructed as Tukey-filtered gratings of 8° visual angle in diameter and a spatial frequency of alternating bright and dark lines of 3.125 cycles per 1° = 25 cycles that were presented at the central screen location. The contrast between bright and dark components was set to 70% luminance change. An area of 0.8° visual angle in diameter was cut out centrally to house the fixation mark. Gratings could be presented in either left or right orientation, deviating ± 45°from the vertical axis. Additionally, oddball trials were constructed similarly, but with a slightly wavy pattern of an amplitude of 0.3571° visual angle and a frequency of 0.6526 cycles per degree visual angle (≈ 4 cycles across the diameter of the stimulus area). Furthermore four different phase offsets were used in a pseudo-randomized, counter-balanced manner. The outer edge of the stimulus, as well as the edge towards the inner cut-out where the fixation was placed, was filtered using a Tukey filter, to avoid sharp edges.

All settings concerning stimulus appearance were extensively piloted to obtain a satisfactory trade-off between difficulty and accuracy. An example for an oddball stimulus can be found in Figure 1 A.

Receptive field mapping

To identify visual cortex regions of interest (ROIs), a population receptive field (pRF) mapping was conducted to obtain structural locations of V1, V2, and V3 for both hemispheres (Arcaro et al., 2009). The experiment consisted of three blocks, with 128 volumes recorded for each block. The stimulus presentation was implemented using the VistaDisp software package (Wandell et al., 2000) in PsychToolbox (Brainard, 1997). A sequence of fullcontrast chequered bars was presented, moving in different directions (W → E, SE → NW, N → S, and SW → NE) and their reverse directions in front of an otherwise empty screen of 50% luminance. The bars were 2.25°visual angle wide and up to 18°visual angle long, filling a circular area of 18°visual angle in diameter. The overlap with neighbouring bar locations was 1.125° (half a bar’s width). Sixteen different locations along the directional axis for each moving direction were sampled, resulting in 128 trials per block, with each location being sampled twice. Each location had an alternating full-contrast black and white pattern, presented five times for 0.66 s per cycle (= 0.66 s for 2 alternations). One volume was recorded for one of the sets of five consecutive pattern repetition cycles. For each diagonal moving direction, the pattern disappeared for the last eight locations (40 cycles) of that direction to allow for the BOLD response to fall back to baseline. The procedure is further described in the publication by Alvarez and colleagues (Alvarez et al., 2015).

Experimental Procedure

The sequence of events that forms a trial, as well as how the interleaved data acquisition sequence was constructed, is illustrated in Figure 1 A. After participants were instructed and informed consent was collected, the EEG cap was fitted, and electrode positions were recorded. Hereafter electrode housings were filled with an electrically conducting gel to bridge the gap between the electrodes and the skull. Afterwards participants were placed inside the scanner. Foam and pillows helped to keep the participant’s head stable and to remain comfortable throughout the experiment. A strap of tape across the forehead provided additional tactile feedback of any head motion and contributed to minimizing head movements. The eye tracking device was set up and calibrated after that. Before the main experiment started during the high resolution T1 weighted anatomical scan was - a practice block was performed for the actual task that followed hereafter. The practice block was a slightly modified version of the main task, such that the inter-stimulus-interval (ISI) was shortened and the ratio of oddball over non-oddball trials was increased to 1 : 3 to facilitate the training effect. The main experiment consisted of four blocks of 60 trials each, ten of which were oddball trials that were excluded from the later analysis. Participants were instructed to respond as fast as possible to the occurrence of such a trial by pressing the response button with their right index finger.

A trial was defined as the following sequence of events: 1.2 s before the stimulus onset the central dot of the fixation mark would turn red (indicating the subject to avoid blinking). With a probability of p = 0.167 an oddball trial would be shown and otherwise a regular stimulus for 1.6 s. If no oddball was shown and the subject did not respond by a button press, the red fixation stayed for additional 0.6 s before turning back to green, which would end the trial. In all other cases the subject would receive feedback in form of a centrally presented text indicating hit, miss or false alarm, followed by the green central fixation. Hence, each trial would in any case last 1.2+ 1.6+ 0.6 = 3.4 s of which the last 3 s went into the EEG analysis. During the last 3 s before the trial ended, MRI gradients and RF pulses were switched off, such that no MR data could be collected. This was done to ensure good EEG data quality. After each trial, three partial brain 3D EPI volumes (TR = 3.3 s) were recorded, sampling the BOLD response for a single stimulus presentation of that length (Liu and Gao, 2000).

For each of the four experimental blocks, 60 trials were presented, resulting in 240 trials per participant in total. Since trials could be constructed as left or right oriented gratings and oddball or nonoddball trials, four possible trial types could occur. Since the ratio between non-oddball and oddball trials was fixed at 5 : 1, each block consisted of five oddball and 25 non-oddball trials and therefore in total 20 oddball and 100 non-oddball trials for each orientation respectively (Σ= 240). The four different phase-offsets were unevenly distributed within blocks but counterbalanced across the experiment.

After the main experiment, participants could voluntarily rest for some minutes before the population receptive field (pRF) mapping (Dumoulin and Wandell, 2008) was performed. Central fixation during pRF mapping was ensured by a fixation dot that randomly changed colour between red and green at an average rate of 1 change per 3.3 s. This corresponds to 1 TR, since the same fMRI protocol as for the main experiment has been used, only without the gaps that allow more noise free EEG recordings during the main experiment. Participants were instructed to indicate a colour change by a button press with their right index finger. All three blocks for pRF mapping were recorded consecutively without a break, except for subject S11, where only the first two blocks were recorded due to the subject feeling uncomfortable in the scanner.

Data processing

Data analyses were performed using the following software packages and toolboxes: analyzePRF (Dumoulin and Wandell, 2008; Kay et al., 2013), ANTs (Avants et al., 2011), FieldTrip (Oostenveld et al., 2011), Freesurfer (Fischl, 2012), FSL (Smith et al., 2004), janus3D (Clausner et al., 2017), Metashape (Agisoft, St Petersburg, Russia, LLC, 2014), MRICron (Rorden et al., 2012), MRI Volume Masker 3000 TM (Clausner, 2022), MrVista (Wandell et al., 2000), OpenFmriAnalysis (Van Mourik et al., 2019a), SPM12 (Friston, 2007) and Workbench (Washington-University, 2018), including respective dependencies in either Bash, Python or MATLAB.

The dataset and analysis scripts are available at: https://data.ru.nl/login/reviewer2964305110/NSH73GXOEPYIQEXJMSKAZ6RIU4KY3EGXAYMDNXY

No participants have been excluded from any of the analyses. For the main experiment, trials were selected for both non-oddball conditions (= 25 trials per orientation per block = 200 trials per subject). Trials were only excluded in case of a false alarm response or if major EEG artifacts such as significant muscle artifacts during the period of interest have been detected, which was the case in

≈ 2% of all trials.

fMRI Motion Correction and Co-registration

A critical step in laminar fMRI is the correction for motion and co-registration of functional and anatomical data since even sub-millimeter misalignments affect the laminar segmentation substantially. Motion parameter estimation and correction was done using ANTs (Avants et al., 2011). As a first step, the first three volumes were removed from each set of volumes (for the three blocks of pRF mapping and the four blocks of the main experiment). Afterwards a manually drawn brain mask was created for every first volume of the first main experiment block and the first pRF mapping block using MRI Volume Masker 3000 TM (Clausner, 2022). Automatically generated masks were manually “fine tuned”, such that the outer boundary was enclosing the gray matter as close as possible. Extensive parts of cerebrospinal fluid, fatty components, arteries and other tissue were carefully excluded from the masks. The resulting masks were used to constrain motion parameter estimation and to correct the anatomical segmentation performed by Freesurfer (Fischl, 2012) within the respective region of interest (the field of view of the functional scans). The actual motion parameter estimation was then performed in two stages. In the first stage all volumes of one recording block were registered to the within-average over time of that block. During the second stage all newly computed within-block averages were registered to the first volume of the first block of the main experiment. Thus all blocks, including pRF mapping, used the first volume of the first block of the main experiment as the final reference. While for the first stage a rigid body transformation was used, an affine transformation was computed for the second stage. The initial linear transformation in that stage was followed by a non-linear transformation using symmetric normalization (SyN) (Avants et al., 2008).

A similar approach was used for functional to anatomical partial volume co-registration, including a rigid body, affine and non-linear transformation using symmetric normalization. Note, that the T1 weighted image was registered to the functional data (and not vice versa) to map the laminar segmentation that is obtained from the anatomical T1 (see below) to functional data space. All estimated motion parameters were combined and applied in a single operation to ensure that functional data was interpolated only once (Esteban et al., 2019).

Anatomical Segmentation

In a first step, the manually drawn functional masks were registered to native T1 space. This was done in order to ensure proper anatomical segmentation performed using FreeSurfer (Fischl, 2012). This procedure greatly improved the later nonlinear boundary registration of pial and white matter boundaries of functional and anatomical data, which are used for the construction of cortical layers and correct for field distortions. In detail, after the initial full brain segmentation, respective functional brain masks obtained as described above were fit to the full brain mask and replaced the respective parietal parts that were covered by the functional data. Hereafter the estimation of pial and white matter surface boundaries was recomputed. Corrected pial surfaces and uncorrected white matter surfaces were used as boundaries for the later laminar segmentation as this procedure makes use of the surface’s boundaries.

Population Receptive Field Mapping

Population receptive field (pRF) mapping was performed as implemented in the open source toolbox analyzePRF. More detailed information about the algorithmic implementation can be found in the reference literature (Dumoulin and Wandell, 2008; Kay et al., 2013). Binarized versions of each stimulation frame served as spatial regressors for the underlying general linear model (GLM). Each of the presented 64 unique bar locations (including blanks) were thresholded such that the background received a value of 0 and the entire bar irrespective of the chequerboard pattern received a value of 1. In order to save computation time, stimuli were downsampled from screen resolution to a resolution of 192 × 192 px. A Savitzky–Golay filter with a filter window of 61 TR s (201 seconds) was applied to the data. The data then was converted into percent signal change relative to the median. Whereas the Savitzky–Golay filter was applied for each experimental block separately, percent signal change was computed over all blocks combined. Based on a GLM including third order polynomials parameters were estimated for orientation (angle), distance to the center of the screen (eccentricity) and the explained variance per voxel (R2). The gray matter mask, obtained from the anatomical segmentation, was applied and only gray matter voxel locations were fed into the pRF analysis. Based on those maps, regions of interest (V1-3) were manually labelled using Freeview. To facilitate the manual drawing process, a functional atlas (Wang et al., 2015) containing all regions of interest was anatomically fitted to the functional data beforehand. Fitted regions from the atlas were overlaid together with the results of the pRF mapping onto the inflated pial surface as obtained from Freesurfer. Marked labels were then transformed into volumetric data and into functional data space using previously computed co-registration transformation matrices and volumes. It should be noted however, that regions V2 and V3 could not be reliably separated in all participants due to poor data quality. Hence, all analyses will focus on V1 only, but results for V2 and V3 are provided within the supplementary material (see Figures S2, S3 and S4 and Tables S2 and S3).

Estimation of Cortical Layers

Laminar segmentation was performed using coregistered gray and white matter boundaries as references for upper and lower bounds of the segmentation. In order to resolve cortical depth precisely, the curvature of the anatomical boundaries was taken into account. This is necessary since the relative thickness of cortical layers varies depending on the cortical curvature (Shafee et al., 2015). Each voxel covered by the gray matter mask, received a weight as a function of its volume belonging to each of the shell-like meshes forming the boundaries. If a layer boundary would cut the voxel exactly in half, adjacent layers would receive a weight of 0.5 each. Hence, voxels were not separately treated as belonging to different layers, but rather their signal was seen as a weighted mixture coming from different layers. Thus, a voxel located towards the white matter boundary would contribute more to the signal generated in deeper layers receiving a higher weight as compared to a voxel being closer to the surface, which would receive a lower weight at the reference location (Van Mourik et al., 2019b). Layer weights were computed using the open source toolbox OpenFmriAnalysis. As a result, five layer weights per voxel were obtained (CSF, superficial, middle, deep and white matter layer).

EEG data processing

The major goal of the EEG data preprocessing was to optimize noise suppression for each frequency band of interest (α, γ) in order to extract EEG signal components of interest as clean as possible. In previously published literature a supervised signal decomposition based on ICA has been utilized (Scheeringa et al., 2011, 2016). This approach requires the manual selection of target components for each frequency band. By removing all non-target components, noise can be suppressed and the resulting signal will only contain the data of interest. However, beamforming offers the major advantage of being able to perform unsupervised noise suppression (Veen and Buckley, 1988; Adjamian et al., 2009). During the piloting phase, beamformer methods were successfully applied to the EEG data collected in an (f)MRI environment and yielded accurate source reconstruction results, in line with previous studies using a similar approach (Brookes et al., 2008, 2009).

Low (α) and high (γ) frequency bands were processed separately to extract the desired response patterns. The data was filtered for the lower frequencies using a pass band between 2 and 32 Hz and for the high frequencies between 20 and 120 Hz respectively. A 50 Hz dft filter to suppress power line noise was applied to the latter as well. EEG electrode locations were obtained from the photogrammetry based 3D model and coregistered using the face shape to the anatomical MRI using janus3D, as described in Clausner et al. (2017). A finite element model (FEM) was computed from the high resolution anatomical T1 based on the FieldTrip-SimBio pipeline (Vorwerk et al., 2018). The leadfield was computed from the EEG electrode positions and the FEM model. Sources were modelled as equivalent current dipoles at locations limited to the respective coordinates of voxels included in the gray matter and ROI masks. Dipole orientations were derived from the cortical curvature and thickness, since this is crucial for a precise mapping especially in EEG (Baillet et al., 2001). Workbench (WashingtonUniversity, 2018) was used to compute the surface normals that connect pial and white matter surfaces. The orientation of the resulting vectors then served as the dipole orientation for each respective location. The described procedure was done separately for the left and right hemisphere in order to obtain separate filter weights, since distinct source activity for both hemispheres was to be expected (Michel et al., 2001). The resulting weight matrices were applied to the band pass filtered data in order to obtain virtual channels at the corresponding equivalent current dipole locations. A spectral analysis was performed on each virtual channel separately. Thereby the exact settings for low and high (time-) frequency decomposition varied slightly. While for low frequencies, a Hanning taper was applied using a data time window length of 400 ms (time steps: 20 ms), zero-padded to achieve a frequency resolution of 0.5 Hz and smoothed with a kernel width of ± 2 Hz; for high frequencies the data was processed using the same time related settings, applying a multi-taper (DPSS; Slepian 1978) approach (seven tapers) with a resolution of 2.5 Hz. The frequency domain was smoothed using a kernel width of ±10 Hz. Afterwards the virtual channel with the highest average amplitude change between 8 − 12 Hz (α) or 50 − 70 Hz (γ) relative to baseline, was selected. As a baseline period, a time window from − 0.3 to − 0.1 s for high and at − 0.3 for low frequencies was chosen. Gradient artifacts caused by ringing of MRI machine’s amplifier after the gradient coils were switched off were observed prior to − 0.3 s relative to stimulus onset which was unexpected. The initially planned start of the baseline period was − 0.5 s, however due to those artifacts − 0.3s relative to stimulus onset served as the new lower bound for the baseline period. For this reason, the low frequency baseline period comprised only a single time point at − 0.3 s, because a pre-stimulus α decrease was expected starting around 0.25 s prior to stimulus onset. Pre-stimulus α has often been related to visual detection performance (Van Diepen et al., 2019) and might thus reflect different processes than the actual target time interval. In fact a small but visible α decrease could be observed 0.25 s prior to stimulus onset (see Figure 2 B, top in supplementary material). The data was transformed into the log10 ratio between the baseline and each sample point in a time window between 0.1 and 1.6 s after stimulus onset for each hemisphere. The time-frequency (TF) transformed virtual channels with the highest average response were chosen to be the “best” channels that were later used to build the regressors for the combined EEG-fMRI analyses. In total four dipole locations (i.e. the TF transform of corresponding virtual channels) were selected for each subject individually: One for each of the two hemispheres and one for each of the two separate frequency bands. Figure 2 A (supplementary material) depicts the result of a full-brain DICS beamformer analysis (Gross et al., 2001) for illustrative purposes. Since the LCMV beamformer approach was limited to specific ROIs obtained from pRF mapping, the DICS full brain scan was performed in order to verify that indeed visually induced activity yields the strongest effects at the occipital pole. All steps previously mentioned were implemented in MATLAB R2021a (MathWorks, 2021) using the open source toolbox FieldTrip. See Figure 2 B (supplementary material) for a depiction of the average TF transformed virtual channel response over participants for one hemisphere and both frequency bands. Since the main hypotheses of the presented study do not directly address the frequency responses (i.e. power changes) themselves, which are well established response patterns, but rather their relation to the BOLD signal, a pre-selection of EEG data with the strongest response, still acts as valid scientific strategy and does not result in a case of “double dipping”.

Combined EEG-fMRI analysis

The general logic of fitting trial-by-trial based EEG time-frequency regressors individually for each TF bin to the BOLD signal for different cortical regions and across layers, follows largely what is described in Scheeringa et al. (2016). Several steps have been undertaken to prepare the EEG and fMRI data for the later combined analysis.

Nuisance regressors Contained all trial and response combinations that were not included in the task regressors (e.g. false alarm trials). Additional regressors contained blink or artifact trials, button presses and reaction times. All the aforementioned regressors were convolved with the hemodynamic response function as built into SPM12. Reaction time regressors were treated as a parametric modulation. The onset of the modulation was set to the average reaction time for each individual block and the actual reaction time as the modulating value. This procedure was chosen in accordance with previous literature (Scheeringa et al., 2016). Furthermore the average white matter signal and the average residual signal (average signal after regressing out gray and white matter signals) were included in the nuisance regressor matrix. All motion parameters (translation along and rotation around x, y, z) and their first derivatives were included as well as a set of high pass filters modelled as five sines and five cosines. Those five sine and cosine waves were constructed, such that they would span one to five full cycles across one experimental block. Task related fMRI regressors were built separately for left and right oriented gratings or both combined.

EEG data regressors Were built on the TF resolved virtual channel data, obtained as described above. Frequency bin span 0.5 Hz for low and 2.5 Hz for high frequency data and time bins were set to 0.4 s that were shifted by 0.1 s intervals (sliding window). One regressor was built for each frequency and time bin separately. This was done by convolving z-transformed trial-by-trial power changes of the EEG data with the hemodynamic response function that comes with SPM12, such that each regressor served as parameter modulator. Thereby the time onset was set to the mid data point of the time domain data bin and shifted by one to three TRs matching the corresponding three volumes that have been recorded after the corresponding trials. In the later general linear model (GLM) each TF based regressor was fit in a separate model. Task and nuisance fMRI regressors however were kept fixed for each model. Similar to fMRI based task-regressors, three sets of task-related regressors were built for left or right oriented trials and both combined. All analyses were performed for a (EEG) time window ranging from 0.1 s to 0.8 s after stimulus onset, for which corresponding time bins were averaged. This was done because the average reaction time to oddball stimuli was 759 ms. Since the main stimulus processing is assumed to take place before the response - but in order to include as many data points as possible - the time bin including 0.8 s after stimulus onset was included as well.

Before the combined EEG-fMRI analysis could be conducted, a normalization step has been applied in order to counteract the signal drop-off resulting from the interleaved sequence. Since in each trial the gap was followed by the collection of 3 consecutive volumes, each voxel of such volumes within one block was divided by the average BOLD signal in that voxel of every 1st, 2nd or 3rd volume in that block. Furthermore, a first level fMRI analysis on the motion corrected data was performed in order to identify voxels of interest. Thereby multiple selection criteria were applied to the first level t-maps: The feature unspecific BOLD response (fu) is defined as the response to any stimulus (irrespective of orientation). The feature specific BOLD response (fs) in turn is defined as the response of a voxel to a specific stimulus orientation compared to baseline or the contrast between both (leftright stimulus orientation). A positive t-value from the contrast thereby indicates a stronger response to left over right oriented gratings, whereas a negative t-value reflects a voxel’s response preference to right oriented gratings. This allows the formation of a joint selection to group voxels based on the feature specific response compared to baseline and the contrast. A voxel with e.g. feature specific BOLD increase (t > 0 for the right stimulus compared to baseline) and negative contrast value (t < 0) to that same stimulus would be said to be a) generally activated by the right stimulus and also stronger compared to the left stimulus (explained in detail below). As the threshold used for voxel selection was inconsistent in previous fMRI-EEG publications, we decided to adopt a transparent approach. Each selection was made such that the top 5%, 10% and 25% of voxels (according to t-value) were included. We corrected for multiple comparisons accordingly (see below). See Figure 5 for a visual representation of the analysis strategy and Table S1 for a summary of the used nomenclature (supplementary material).

Data preparation and combined EEG-fMRI analysis for three cortical layers

A) Layer specificity. Cortical layers are constructed as shell-like meshes, taking the gray and white matter boundaries as reference. Between those two reference shells two additional shells divide the space between gray and white matter into three layers. The area outwards relative to the gray matter boundary is assigned to the CSF layer, whereas the area inwards from the white matter boundary is assigned to the white matter layer. Each voxel contributes a fraction of its signal to those layers, depending of the proportional volume of a voxel within each shell like mesh of each layer (see example). Respective fractions are later used as weights to split the β coefficients resulting from GLM into different layer contributions. B) EEG based regressors. After transforming the EEG data into virtual channel data for each grid point within the gray matter (using LCMV beamforming), virtual channels are time-frequency transformed. For each hemisphere and frequency band separately (2 Hz to 32 Hz for α and 20 Hz to 120 Hz for γ) the virtual channel with the highest γ power increase or α power decrease after stimulus onset is selected respectively (Σ = 4 virtual channels). Regressors are build for each time-frequency bin separately. Time bins are averaged to boost SNR. Power values over trials are convolved with the HRF as built into SPM12 resulting in one parameter modulation regressor for each TF bin. C) Voxel selection. Voxels are selected based on t-maps resulting from first level contrasts. Thereby, both stimulus orientations vs baseline, each stimulus orientations vs baseline or the contrast between both orientations are considered. D) Statistical inference. EEG based regressors are entered into a general linear model (GLM) as predictors for the BOLD signal in each selected voxel. The resulting β coefficients for each voxel are multiplied with the respective layer weights (excluding white matter and CSF layers) in each voxel and averaged for the respective time window of interest (0.1s to 0.8s after stimulus onset) to obtain the final depth by frequency resolved data. This data was tested against the hypothesis that there was no significant relationship between the EEG and fMRI data (β coefficients do not differ from zero) for each respective condition, using a cluster permutation test (Maris and Oostenveld, 2007). Resulting clusters were averaged in each layer over frequencies for the widest possible window selected across layers. The layer profiles of the averaged clusters were tested against the hypothesis that the layer profile is as likely as any other layer profile - under the assumption of interchangeability of the data - using an auto-regressive rank order permutation (aros) test (Clausner and Gentili, 2022).

Importantly, fMRI based voxel selection and EEG based trial-type related regressors could either be combined congruently or incongruently (since feature specific response and the feature contrast can be divided into responding stronger to the left or right oriented stimuli). This means that results could be selected for data of all voxels that respond feature specific to one orientation with EEG based regressors built on trials of exactly this orientation (congruent; short: EEGco) or the respective other orientation (incongruent; short: EEGinco). A congruent pairing (EEGco) would e.g. combine data of each voxel with a stronger BOLD increase to left with EEG based regressors based on trials where the stimulus orientation was left. In the case of EEGinco, the same feature specific voxels would rather be combined in a selection with EEG-based regressors for right oriented stimuli. If voxels for both orientations have been selected (feature unspecific response) an EEG regressor that was built on both trial types was used in the result selection (no congruence separation).

Irrespective of voxel selection, a higher t-value selection threshold indicates higher specificity for the respective stimulus and a lower threshold increases the signal-to-noise ratio (SNR) by including more data. This changes the number of voxels selected. In previous publications, employing a similar experimental setup, 500 voxels with highest activation (Lawrence et al., 2018, 2019b) or the top 5%, 10% or 25% activated voxels (Scheeringa et al., 2016) were selected. A study by Markuerkiaga et al. (2021), specifically designed to assess the number of voxels required for optimal SNR in the context of laminar level fMRI, finds 250 voxels (for 3T fMRI) to yield the best contrast-to-noise ratio (CNR). However, since all previously mentioned publications set a more or less arbitrary threshold and the last did not take the correlation with EEG into account and hence potential changes in SNR and CNR for those cases, three thresholds have been selected to eliminate eventual uncertainties: 5%, 10% and 25% of most activated voxels.

A general linear model (GLM) has been computed with predictors for each TF bin separately for all voxel in V1 that later have been sub-selected according to the respective condition. Afterwards, each of the resulting regression coefficients (β coefficients) was multiplied with the voxel specific layer weights that have been obtained as described above. The general strategy rests on the idea that the predicted fMRI activity can be seen as a mixture of signal contributions from each layer in each voxel. For this analysis, contributions from white matter and CSF layers have been excluded and only signal contributions from superficial, middle and deep layers have been taken into account. After averaging results from different hemispheres, experimental blocks, percentiles of interest (e.g. the top 5% voxels of a respective contrast) and time bins for a time interval between 0.1 and 0.8 s after stimulus onset, the final data contains the average regression coefficients for depth × frequency × subject.

Subsequently, separate analyses were done for two frequency of interest (FOI) ranges centerd around the α and γ band responses observed in the EEG. Within these frequency ranges inferential statistics based a cluster level approach were computed. For low frequencies the FOI range was set to frequencies between 8 Hz and 14 Hz (α band), whereas for the high frequencies the FOI was defined as frequencies between 50 Hz and 70 Hz (γ band), which covers the peak response frequencies found in the average EEG data (see Figure 2 B in supplementary material). Within the respective range - and if not otherwise indicated - a single tailed cluster permutation test (Maris and Oostenveld, 2007) has been conducted separately for each threshold and FOI for α and γ in V1 since the expected direction of the effect was derived from the literature (Scheeringa et al., 2016; Zumer et al., 2014; Murta et al., 2015). Each significant cluster has been further processed by means of an auto-regressive rank order similarity (aros) test (Clausner and Gentili, 2022). The fundamental idea behind the aros test is whether group averages (i.e. averages of the signal of cortical layer in the present case), can be ranked such that the rank order is explained significantly better by the data than it would if the average data could not be meaningfully sorted (i.e. is shuffled). This is achieved by transforming the group averages into unique rank order values (e.g. superficial > deep > middle) and computing the average fit of the data to this rank order. In a second step data points are shuffled between the groups and the same procedure is applied (i.e. computing the rank order of the mean and the average fit of the now shuffled data to the new rank order). Repeating this permutation step many times (here 25000 times) yields a permutation distribution, to which the initially computed fit value of the un-shuffled data is compared. Rejecting the null hypothesis would result in the assumption that the rank order of the group averages indeed can be explained by the data significantly better than it would if the data points could not be meaningfully sorted into those groups. Thus, in the present case it could reveal how the correspondence between the EEG and fMRI signals could be sorted across layers. However, statements about the magnitude of the difference between two (or more) layers cannot be made. This approach provides insight about the specific activation profile across layers for specific conditions within a significant depth × frequency cluster. Since clusters span layer and frequency bins unevenly, the data that was used for the later aros test was collapsed over frequencies, such that the lowest and highest significant frequency bin served as the boundary over which the frequency domain was averaged. This was done irrespective of the respective cluster size within a specific layer (see Figure 5 D in supplementary material). Testing for layer specificity is not straightforward due to issues related to multiple comparison and non-normal distributed data. To circumvent this, Scheeringa et al. (2016) tested for EEG-fMRI layer specificity by fitting layer profiles for α and γ to each other using an ordinary regression and tested whether β coefficients differed to zero. While this approach is suitable for demonstration purposes, it does not reveal the exact nature of those differences between layers. While Scheeringa et al. proved the concept of layer specific feature extraction, the present paper aims to determine the relational activity across layers depending on the feature specific response as well for which reason the aros test has been developed. For the present case this means that while neglecting the effect size of the difference across layers, the overall profile of the differential activation across layers can be obtained without compromising statistical power due to multiple comparison.

The described procedure of selecting the respective voxels of interest, fitting EEG based regressors jointly with task and nuisance fMRI regressors to the data, weighting the resulting β regression coefficients with the corresponding layer contribution weights and applying a cluster permutation test, followed by an aros test on significant clusters, has been applied to each of the contrasts of interest. To account for multiple comparison across the respective selection thresholds, cluster and aros p-values have been adjusted using the false discovery rate (FDR) adjustment procedure proposed by Benjamini and Hochberg (1995). In the next section, we describe the different voxels selections applied (see Table S1 in supplementary material).

Feature unspecific BOLD activation

While previous literature mostly focused on the relationship between BOLD signal increases and different EEG frequency bands, we extended our analysis to include visual cortex BOLD signal decreases. In general, the sparse corpus of literature investigating negative BOLD signal deflections in conjunction with the presentation of visual stimuli mainly focused on attention related effects (Tootell et al., 1998; Tomasi et al., 2006) (especially for foveal presented stimuli and a demanding task; see Crespi et al. 2011). Furthermore, it has been related to decreased neuronal activity in monkey V1 (Shmuel et al., 2006). Previous literature suggests that attention related α power increases could be associated with BOLD-signal decreases (Zumer et al., 2014; Shmuel et al., 2006; Tootell et al., 1998) and hence might provide a meaningful insight into feature unspecific signal contributions. Note however that attention was not manipulated as part of this experiment. As described above, for each contrast, multiple thresholds have been applied such for voxels with the 5%, 10% and 25% most positive and most negative t-values as a response to any stimulus were selected.

Feature specific BOLD response

Previous literature on the relationship between laminar level fMRI and EEG in the visual domain mainly focused on proofs of concept and related visual cortex fMRI activity to EEG without considering different stimulus features (Scheeringa et al., 2011, 2016), which has been accomplished here. In general, the response to the two orthogonal stimulus features has been assessed two fold. Based on the t-value obtained from the first level feature contrast (leftright stimulus orientation), and by comparing the t-values of one orientation to baseline. For the first case, voxels respond preferentially to one over the other orientation. The result of the contrast and the response compared to baseline then create a joint selection. This is the response of a voxel to one stimulus orientation compared to baseline, relative to the response of that voxel to the other stimulus. Hence, voxels with a contrast value that is based on the difference between one positive and one negative or two negative values compared to baseline are systematically excluded. This joint selection between feature specific voxels with strictly positive response compared to baseline, can be interpreted as feature specific BOLD increase.

Each selection can be separated into voxels that respond preferentially to left or right stimuli. In addition, the EEG regressors can as well be separated into left and right by the trials they were constructed from. Using this strategy, the result of the GLM can be separated into congruent or incongruent parts, where feature specific signal changes were assessed for congruent (EEGco) or incongruent (EEGinco) combinations of selected voxels and EEG based regressors. Beside separately looking at congruent and incongruent combinations, contrasts have been computed as the difference of β coefficients between EEGco and EEGinco (EEGcoinco). The underlying set of voxels for EEGco and EEGinco is identical. Each condition only differs in which EEG regressors were used (built from trials with left or right oriented stimuli, respectively). This means EEGcoinco represents differences in estimated response amplitudes between regressors. Each set of results was followed by a single tailed cluster permutation test (Maris and Oostenveld, 2007). The final results structure encompasses the layer by frequency resolved data for each voxel selection threshold and combination with EEG power regressors, significant clusters and aros results for α and γ separately. Figure S1 in the supplementary material provides a dummy result figure with explanations on the arrangement of plots.

Lower and upper α

It has been proposed that α band activity might be composed of at least two different sub-bands with potentially different functional roles and underlying sources (Sokoliuk et al., 2019; Rodriguez-Larios et al., 2022; Zhao and Liu, 2025). To investigate whether different α sub-bands contribute differentially to the negative relationship with the BOLD signal, we additionally conducted multiple analyses for lower (8 − 10 Hz) and upper (11 − 13 Hz) α bands. Those bands were selected based on the time-frequency transformed average virtual channel EEG data (see Figure 2 B). After averaging over the time interest (0.1 − 0.8 s), the median power of the frequency window of interest (8 − 14 Hz) was computed. The centre of the lower or upper α band was defined as the power value closest to the median with a frequency below or above the peak frequency (10.5 Hz) respectively. This procedure is similar but not equal to a full width half maximum boundary selection. Lower and upper α bins were then defined as ± 1 Hz around the lower or upper centre values, which resulted in a lower α frequency range of 8 − 10 Hz and an upper α frequency range of 11 − 13 Hz. Beta coefficients for feature unspecific and feature specific analyses were averaged within those bins and fed into a mixed-effects model accounting for individual differences across layers and participants. For the feature unspecific analysis, separate models for positive and negative sub-selections have been computed, whereas for the feature specific analysis the interaction between α frequency and congruence has been investigated. Since one model was computed for each selection threshold (5%, 10%, 25%) p-values have been adjusted using the false discovery rate (FDR) adjustment procedure proposed by Benjamini and Hochberg (1995).

Acknowledgements

This work was supported by the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007–2013)/ERC starting grant (grant number 716862) attributed to Mathilde Bonnefond and the Fondation pour la Recherche Medicale grant ID FDT202106013010 awarded to Tommy Clausner. This work was conducted in the framework of the LabEx Cortex (“Construction, Function and Cognitive Function and Rehabilitation of the Cortex,” ANR-10-LABX-0042) of Université de Lyon. The authors would like to thank Jan Mathijs Schoffelen, Simon Homölle and Robert Oostenveld for their support with EEG source modelling. Furthermore, we would like to thank Rüdiger Stirnberg for his support on setting up the interleaved fMRI sequence, as well as Tim van Mourik for his support on cortical layer segmentation and Laurentius Huber for providing the crucial hint on accurate fMRI sub-millimeter within and between block volume (co-) registration and motion parameter estimation. Furthermore, the authors would like to thank Koen Haak for fruitful discussions and vital support in setting up the pRF recording pipeline, as well as Matthias Ekman for fruitful discussions on the improvement of the pRF mapping analysis pipeline. Lastly, the authors would like to thank Ole Jensen, Floris de Lange, David Norris and Jérémie Mattout for fruitful discussions and Paul Gaalman and the DCCN technical group for outstanding support.

Additional files

Supplementary Figures and Tables

Additional information

Funding

European Research Council

https://doi.org/10.3030/716862

Fondation pour la Recherche Médicale (FDT202106013010)

LabEx Cortex (ANR-10-LABX-0042)