Abstract
Current models of scene processing in the human brain include three scene-selective areas: the parahippocampal place area (or the temporal place area; PPA/TPA), the retrosplenial cortex (or the medial place area; RSC/MPA) and the transverse occipital sulcus (or the occipital place area; TOS/OPA). Here, we challenged this simplistic model by showing that another scene-selective site can also be detected within the posterior intraparietal gyrus. Despite the smaller size of this site compared to the other scene-selective areas, the posterior intraparietal gyrus scene-selective (PIGS) site was detected consistently in a large pool of subjects (n = 59; 33 females). The reproducibility of this finding was tested based on multiple criteria, including comparing the results across sessions, utilizing different scanners (3T and 7T) and stimulus sets. Furthermore, we found that this site (but not the other three scene-selective areas) is significantly sensitive to ego-motion in scenes, thus distinguishing the role of PIGS in scene perception relative to other scene-selective areas. These results highlight the importance of including finer scale scene-selective sites in models of scene processing – a crucial step toward a more comprehensive understanding of how scenes are encoded under dynamic conditions.
1. Introduction
In human and non-human primates (NHPs), fMRI has been used for many decades to localize the cortical regions that are preferentially involved in scene perception (Epstein and Kanwisher, 1998; Tsao et al., 2008; Rajimehr et al., 2009; Nasr et al., 2011). Early studies focused mainly on larger activity sites that were more easily reproducible across sessions and individuals, ignoring smaller sites that were not easily detectable in all subjects and/or were not reproducible across scan sessions, based on the techniques available at that time. This led to relatively simple models of neuronal processing solely based on larger visual areas.
Specifically, these models suggested three scene-selective areas within the human visual cortex, with possible homologues in NHPs (Nasr et al., 2011; Kornblith et al., 2013; Li et al., 2022). The human cortical areas were originally named parahippocampal place area (PPA) (Epstein and Kanwisher, 1998), retrosplenial cortex (RSC) (Maguire, 2001) and transverse occipital sulcus (TOS) (Grill-Spector, 2003), based on the local anatomical landmarks. However, subsequent studies noticed the discrepancy between the location of these functionally-defined areas and the anatomical landmarks, and instead named those regions temporal, medial and occipital place areas or TPA, MPA and OPA (Nasr et al., 2011; Dilks et al., 2013; Silson et al., 2016).
The idea that scene-selective areas are limited to these three regions is based largely on group-averaged activity maps, generated after applying large surface/volume-based smoothing to the data from individual subjects. In such group-averaged data, originally based on fixed- rather than random-effects, thresholds tended to be high to reduce the impact of nuisance artifacts (Nasr et al., 2011). Thus, though well founded, this approach conceivably may not have identified smaller scene-selective areas (Fig. 1A).
However, at the single subject level, multiple smaller scene-selective sites can be detected outside these scene-selective areas, especially when drastic spatial smoothing is avoided (Fig. 1B). This phenomenon is highlighted in a recent neuroimaging study in NHPs (Li et al., 2022) in which authors took advantage of high-resolution neuroimaging techniques using implanted head coils. Their findings suggested that scene-selective areas are likely not limited to the three expected sites, and that other, smaller, scene-selective areas may also be detected across the brain. Still, the reliability in detection of these smaller sites, their spatial consistency across large populations and their specific role in scene perception that distinguishes them from the other scene-selective areas, remain unclear.
Here, we used conventional (based on a 3T scanner) and high-resolution (based on a 7T scanner) fMRI to localize and study additional scene-selective site(s) that were detected outside PPA/TPA, RSC/MPA and TOS/OPA. We focused our efforts on the intraparietal region mainly because multiple previous studies reported indirect evidence for scene and/or scene-related information processing within this region (Lescroart and Gallant, 2019; Pitzalis et al., 2020; Sulpizio et al., 2020; Park et al., 2022). Consistent with these studies, we found at least one additional scene-selective area within the posterior intraparietal gyrus, adjacent to the motion-selective area V6 (Dechent and Frahm, 2003; Pitzalis et al., 2009). This site was termed PIGS, reflecting its location (posterior intraparietal gyrus) and function (scene-selectivity). PIGS was detected consistently across individual subjects and populations and localized reliably across scan sessions. Moreover, it showed sensitivity to ego-motion within visual scenes, a phenomenon not detectable in other scene-selective areas.
2. Methods
2.1. Participants
Fifty-nine human subjects (33 females), aged 22–68 years, participated in this study. All subjects had normal or corrected-to-normal vision and radiologically normal brains, without any history of neuropsychological disorder. All experimental procedures conformed to NIH guidelines and were approved by Massachusetts General Hospital protocols. Written informed consent was obtained from all subjects before the experiments.
2.2. General procedure
This study consists of 7 experiments during which we used fMRI to localize and study the evoked scene-selective responses. During these experiments, stimuli were presented via a projector (1024 × 768 pixel resolution, 60 Hz refresh rate) onto a rear-projection screen. Subjects viewed the stimuli through a mirror mounted on the receive coil array. Details of these stimuli are described in the following sections.
During all experiments, to ensure that subjects were attending to the screen, they were instructed to report color changes (red to blue and vice versa) for a centrally presented fixation object (0.1° × 0.1°) by pressing a key on the keypad. Subject detection accuracy remained above 75%, and showed no significant difference in color change detection performance across experimental conditions (p>0.10). MATLAB (MathWorks; Natick, MA, USA) and the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997) were used to control stimulus presentation.
2.2.1. Experiment 1 – Localization of scene-selective areas
In fourteen subjects (6 females), we localized scene-selective areas PPA/TPA, RSC/MPA and TOS/OPA by measuring their evoked brain activity, using a 3T fMRI scanner, as they were presented with 8 colorful images of real-world (indoor) scenes vs. (group) faces. Scene and face images were retinotopically centered and subtended 20° × 26° of visual field without any significant differences between their root mean square (RMS) contrast (t(14) = 1.10, p = 0.29). Scene and face stimuli were presented in different blocks (16 s per block and 1 s per image). Each subject participated in 4 runs, and each run consisted of 10 blocks plus 32 s of blank presentation at the beginning and at the end of each block. Within each run, the sequence of blocks and the sequence of images within them was randomized.
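The contrast-matching check described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes RMS contrast is defined as the standard deviation of pixel luminance scaled to [0, 1] (the text does not state the exact definition), and it uses a pooled-variance two-sample t-test, which is consistent with the reported degrees of freedom (8 + 8 − 2 = 14). The function names are hypothetical.

```python
import math
from statistics import mean, variance

def rms_contrast(pixels):
    """RMS contrast of a flat list of luminance values in [0, 1].

    Assumed definition: population standard deviation of pixel intensity.
    """
    m = mean(pixels)
    return math.sqrt(sum((p - m) ** 2 for p in pixels) / len(pixels))

def pooled_t(a, b):
    """Student's two-sample t statistic with df = len(a) + len(b) - 2."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2
```

In use, one would compute `rms_contrast` once per image, collect the values for each category into two lists, and pass those lists to `pooled_t`.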
2.2.2. Experiment 2 – Reproducibility of PIGS across scan sessions (3T vs. 7T)
To localize PIGS with higher spatial resolution and to enhance the signal/contrast to noise ratio (relative to Experiment 1), four subjects were randomly selected from those who participated in Experiment 1 and were scanned in a 7T scanner. These individuals were presented with 300 grayscale images of scenes and 48 grayscale images of (single) faces other than those used in Experiment 1. Here, scene images included pictures of indoor (100 images), manmade outdoor (100 images) and natural outdoor (100 images) scenes, selected from the Southampton-York Natural Scenes (SYNS) dataset (Adams et al., 2016).
As in Experiment 1, all images were retinotopically centered, and subtended 20° × 26° of visual field and there was no significant difference between the RMS contrast across the two categories (t(346) = 0.75, p = 0.38). Scene and face images were presented across different blocks. Each block contained 24 stimuli (1 s per stimulus), with no blank presentation between the stimuli. The sequence of stimuli was randomized within the blocks. Each subject participated in 12 runs (11 blocks per run; 24 s per block; 1 s per stimulus), beginning and ending with an additional block (12 s) of uniform black presentation. In each run, the sequence of blocks and the sequence of images within them was randomized.
2.2.3. Experiment 3 – Localization of area V6
This experiment was designed to clarify the relative localization of PIGS vs. area V6 (Pitzalis et al., 2009). All fourteen subjects who participated in Experiment 1 were examined again in a separate scan session using a 3T scanner. During this scan session, we localized area V6 by contrasting the response evoked by coherent radially-moving (optic flow) vs. randomly-moving white dots (20° × 26°), presented against a black background. The experiment was block-designed, and each block took 16 s, beginning and ending with an additional block of 16 s uniform black presentation. Other details of the experiment were similar to those in Experiment 1.
2.2.4. Experiment 4 – Localization of PIGS in a larger population
Considering the small size of PIGS, it was important to show that this area could survive group-averaging over larger populations, compared to Experiment 1. Accordingly, Experiment 4 localized this area in a large pool of subjects, consisting of thirty-one individuals (19 females) other than those who participated in Experiment 1. The stimuli and procedure were identical to Experiment 1.
2.2.5. Experiment 5 – Response to two independent sets of scenes and non-scene objects
Experiments 1–4 used the response evoked by scenes vs. faces to localize PIGS. However, it remained unknown whether PIGS also showed a selective response to the ‘scenes vs. objects’ contrast. Accordingly, in two independent groups of subjects (no overlap), Experiment 5 tested the response evoked by scenes vs. nonscene objects in PIGS and the adjacent areas (i.e. V6, TOS/OPA and RSC/MPA).
Specifically, in Experiment 5a, thirteen subjects (7 females), other than those who participated in Experiment 1, were scanned in a 3T scanner. They were presented with 22 grayscale images of indoor/outdoor scenes, other than those presented in Experiments 1–4, and 88 grayscale images that included either a single or multiple everyday non-animate (non-face) objects. All stimuli were retinotopically centered and presented within a circular aperture (diameter = 20°). The RMS contrast of the objects was significantly higher than the scenes (t(108) = 3.72, p < 10⁻³). Scene and object images were presented in different blocks according to their category (22 s per block and 1 s per image). Each subject participated in 12 runs and each run consisted of 9 blocks, plus 16 s of blank presentation at the beginning and the end of each block. As in other experiments, the sequence of blocks and the sequence of images within them was randomized.
In Experiment 5b, fourteen subjects (8 females), other than those who participated in Experiments 1 and 5a, were scanned in a 3T scanner. Each subject was presented with 32 grayscale images of indoor/outdoor scenes, 32 images of everyday (non-face) objects plus their scrambled versions, and 32 images of single faces. Scene and non-scene stimuli were different from those used in Experiments 1–4 and 5a. In contrast to Experiment 5a, all non-scene images included only a single object and there was no significant difference between the RMS contrasts of scenes and the three object categories (F(3, 111) = 0.42, p = 0.74). Other details were similar to those in Experiment 5a.
2.2.6. Experiment 6 – Coherently- vs. incoherently-changing scenes
This experiment was designed to differentiate the role of PIGS in scene perception from TOS/OPA, RSC/MPA and PPA/TPA. Twelve subjects, from the fourteen subjects who participated in Experiment 1, participated in this experiment. The excluded two subjects could not participate further in our tests for personal reasons. Subjects were scanned in a 3T scanner on a different day relative to Experiments 1–3. During this scan, they were presented with rapidly ‘coherently- vs. incoherently-changing scenes’ (100 ms per image), across different blocks (16 s per block).
Coherently-changing scenes implied ego-motion (fast walking) along 3 different outdoor natural trails. Stimuli (20° × 26°) were generated as one of the experimenters walked through the trails while carrying a camera mounted on his forehead, taking pictures every 2 meters. Incoherently-changing scenes consisted of the same images as the coherently-changing blocks, but with randomized order. In other words, the only difference between the coherently- vs. incoherently-changing scenes was the sequence of stimuli within the block. For both coherently- and incoherently-changing scenes, images from different trails were presented across different blocks.
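The relationship between the two conditions can be illustrated with a short sketch. It assumes one block holds 160 frames (16 s per block at 100 ms per image, from the timing given above) and represents each frame by its index along the trail; the function name and the seed parameter are hypothetical and not taken from the original stimulus code.

```python
import random

def make_block_sequences(n_images=160, seed=0):
    """Return (coherent, incoherent) frame orders for one block.

    coherent:   frames in the order they were captured along the trail.
    incoherent: the same frames in a shuffled order, so the two
                conditions differ only in sequence, not in content.
    """
    coherent = list(range(n_images))
    incoherent = coherent[:]          # same image set
    random.Random(seed).shuffle(incoherent)
    return coherent, incoherent
```

Because both lists contain exactly the same frames, low-level image statistics are matched across conditions by construction and only the temporal order differs.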
In separate blocks, subjects were also presented with 80 images that included multiple faces (20° × 26°) with the same timing as the scene images (i.e., 100 ms per image; 16 s per block). All stimuli were grayscaled. Each subject participated in 6 runs and each run consisted of 9 blocks, plus 8 s of blank presentation at the beginning and the end of each block and 4 s of blank presentation between blocks.
On different runs (within the same session), subjects were also presented with concentric rings, extending 20° × 26° (height × width) in the visual field, presented against a light gray background (40 cd/m²). In half of the blocks (16 s per block), rings moved radially (centrifugally vs. centripetally; 4°/s) and the direction of motion changed every 4 s to reduce the impact of motion after-effects. In the remaining half of the blocks, rings remained stationary throughout the whole block. Each subject participated in 2 runs and each run consisted of 8 blocks, plus 16 s of uniform gray presentation at the beginning and the end of each run. The sequence of moving and stationary blocks was pseudo-randomized across runs.
2.2.7. Experiment 7 – Response to biological motion
To test whether PIGS also responds selectively to biological motion, twelve individuals were selected randomly and were scanned in a 3T scanner while they were presented with the moving point-lights that represented complex biological movements such as crawling, cycling, jumping, paddling, walking, etc. (Jastorff and Orban, 2009). Each action was presented for 2 s and the sequence of actions was randomized across the blocks (20 s per block). As a control, in different blocks, the subjects were shown the same stimuli when all of the point-lights moved in the same direction (i.e., translation motion). Each subject participated in 11 runs and each run consisted of 12 blocks, plus 10 s of blank presentation at the beginning and the end of each run.
2.3. Imaging
2.3.1. 3T scans
In Experiments 1 and 3–6, subjects were scanned in a horizontal 3T scanner (Tim Trio, Siemens Healthcare, Erlangen, Germany). Functional data were acquired using single-shot gradient echo EPI with nominally 3.0 mm isotropic voxels (TR=2000 ms; TE=30 ms; flip angle=90°; bandwidth (BW)=2298 Hz/pix; echo-spacing=0.5 ms; no partial Fourier; 33 axial slices covering the entire brain; and no acceleration). During the first 3T scan (see the General Procedure), structural (anatomical) data were acquired for each subject using a 3D T1-weighted MPRAGE sequence (TR=2530 ms; TE=3.39 ms; TI=1100 ms; flip angle=7°; BW=200 Hz/pix; echo-spacing=8.2 ms; voxel size=1.0×1.0×1.33 mm).
2.3.2. 7T scans
In Experiment 2, subjects were scanned in a 7T Siemens whole-body scanner (Siemens Healthcare, Erlangen, Germany) equipped with SC72 body gradients (maximum gradient strength, 70 mT/m; maximum slew rate, 200 T/m/s) using a custom-built 32-channel helmet receive coil array and a birdcage volume transmit coil. Voxel dimensions were nominally 1.0 mm, isotropic. Single-shot gradient echo EPI was used to acquire functional images with the following protocol parameter values: TR=3000 ms; TE=28 ms; flip angle=78°; BW=1184 Hz/pix; echo-spacing=1 ms; 7/8 phase partial Fourier; 44 oblique-coronal slices; and acceleration factor r=4 with GRAPPA reconstruction and FLEET-ACS data (Polimeni et al., 2015) with 10° flip angle. The field of view included the occipital-parietal brain areas to cover PIGS, RSC/MPA and TOS/OPA (but not PPA/TPA).
2.4. Data Analysis
2.4.1. Structural data analysis
For each subject, inflated and flattened cortical surfaces were reconstructed based on the high-resolution anatomical data (Dale et al., 1999; Fischl et al., 1999; Fischl et al., 2002), during which the standard pial surface was generated as the gray matter border with the surrounding cerebrospinal fluid or CSF (i.e., the GM-CSF interface). The white matter surface was also generated as the interface between white and gray matter (i.e. WM-GM interface). In addition, an extra surface was generated at 50% of the depth of the local gray matter (Dale et al., 1999).
2.4.2. Individual-level functional data analysis
All functional data were rigidly aligned (6 df) to each subject’s own structural scan, using rigid Boundary-Based Registration (Greve and Fischl, 2009), and then were motion corrected. Data collected in the 3T (but not 7T) scanner were spatially smoothed using a 3D Gaussian kernel (2 mm FWHM). To preserve the spatial resolution, data collected within the 7T scanner were not spatially smoothed.
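To make the amount of smoothing concrete, the sketch below converts a Gaussian FWHM to its standard deviation using the standard identity sigma = FWHM / (2·sqrt(2·ln 2)), and builds a normalized 1D kernel in voxel units. It is an illustration only: the helper names are hypothetical, and a separable 1D kernel stands in for whatever 3D implementation the analysis package uses internally.

```python
import math

def fwhm_to_sigma(fwhm_mm):
    """Convert full width at half maximum (mm) to Gaussian sigma (mm)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def gaussian_kernel_1d(fwhm_mm, voxel_mm, radius_vox=3):
    """Normalized 1D Gaussian weights sampled at integer voxel offsets."""
    sigma_vox = fwhm_to_sigma(fwhm_mm) / voxel_mm
    weights = [math.exp(-0.5 * (k / sigma_vox) ** 2)
               for k in range(-radius_vox, radius_vox + 1)]
    total = sum(weights)
    return [w / total for w in weights]
```

For the 3T data (2 mm FWHM, 3.0 mm voxels), sigma is roughly 0.85 mm, well under one voxel, so nearly all of the kernel weight stays on the center voxel.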
Subsequently, a standard hemodynamic model based on a gamma function was fit to the fMRI signal to estimate the amplitude of the BOLD response. For each individual subject, the average BOLD response maps were calculated for each condition (Friston et al., 1999). Finally, voxel-wise statistical tests were conducted by computing contrasts based on a univariate general linear model.
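The model-fitting step can be sketched for a single voxel and a single condition. The gamma-shaped response below uses delay = 2.25 s, tau = 1.25 s and exponent 2; these values are assumptions chosen for illustration, since the text does not report the parameters used. The fit is ordinary least squares with one regressor plus a constant, a simplification of the full univariate GLM, and all function names are hypothetical.

```python
import math

def gamma_hrf(t, delay=2.25, tau=1.25, alpha=2.0):
    """Gamma-shaped hemodynamic response at time t (s); parameters assumed."""
    if t <= delay:
        return 0.0
    s = (t - delay) / tau
    return (s ** alpha) * math.exp(-s)

def block_regressor(n_vols, tr, onsets, duration, hrf_len_s=30.0):
    """Boxcar for the given block onsets (s) convolved with the gamma HRF."""
    boxcar = [1.0 if any(o <= i * tr < o + duration for o in onsets) else 0.0
              for i in range(n_vols)]
    kernel = [gamma_hrf(k * tr) for k in range(int(hrf_len_s / tr) + 1)]
    norm = sum(kernel)
    kernel = [w / norm for w in kernel]   # unit-sum kernel
    return [sum(kernel[k] * boxcar[i - k]
                for k in range(len(kernel)) if i - k >= 0)
            for i in range(n_vols)]

def fit_amplitude(y, x):
    """Least-squares fit of y = baseline + beta * x; returns (beta, baseline)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return beta, my - beta * mx
```

A contrast such as ‘scenes > faces’ would then compare the fitted amplitudes of two such regressors at every voxel.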
The resultant significance maps based on 3T scans were sampled from the middle of cortical gray matter (defined for each subject based on their structural scan (see section 2.4.1)). For 7T scans, the resultant significance maps were sampled from deep cortical layers at the gray-white matter interface. This procedure reduced the spatial blurring caused by superficial veins (Koopmans et al., 2010; Polimeni et al., 2010; De Martino et al., 2013; Nasr et al., 2016). For presentation, the resultant maps were projected either onto the subject’s reconstructed cortical surfaces or onto a common template (fsaverage; Freesurfer (Fischl, 2012)).
2.4.3. Group-level functional data analysis
To generate group-averaged maps, functional maps were spatially normalized across subjects, then averaged using random-effects models and corrected for multiple comparisons (Friston et al., 1999). For Figure 1A and to replicate our original finding (Nasr et al., 2011), the group-average maps were generated using fixed-effects. The resultant significance maps were projected onto a common human brain template (fsaverage).
2.4.4. Region of interest (ROI) analysis
The main ROIs included area PIGS, the two neighboring scene-selective areas (RSC/MPA, TOS/OPA), and area V6. In Experiment 6, we also included area PPA/TPA in our analysis. These ROIs were localized in two different ways: (1) functionally, for each subject based on their own evoked activity (section 2.4.4.1), and (2) probabilistically, based on activity measured in a different group of subjects (section 2.4.4.2).
2.4.4.1. Functionally-localized ROIs
For those subjects who participated in Experiments 6 and 7, we localized scene-selective areas PIGS, TOS/OPA, RSC/MPA, and PPA/TPA based on their stronger response to scenes compared to faces at a threshold level of p < 10⁻², using the method described in Experiment 1. For subjects in Experiment 6, we also localized area V6 based on the expected selective response in this region to coherent radially- vs. incoherently-moving random dots (see Section 2.2.3). In those subjects in which PIGS and V6 showed partial overlap, the overlapping parts were excluded from the analysis.
2.4.4.2. Probabilistically-localized ROIs
For those subjects who participated in Experiments 4 and 5, we tested the consistency of PIGS locations across populations, using probabilistic labels for areas PIGS, TOS/OPA, RSC/MPA and V6. These labels were generated based on the results of Experiment 1 (for PIGS, TOS/OPA and RSC/MPA) and Experiment 3 (for V6). Specifically, we localized the ROIs separately for the individual subjects who participated in Experiments 1 and 3. Then the labels were overlaid on a common brain template (fsaverage). We computed the probability that each vertex within the cortical surface belonged to one of the ROIs. The labels for PIGS, TOS/OPA, RSC/MPA and V6 were generated based on those vertices that showed a probability higher than 20%. This method assured us that our measurements were not biased by those subjects who showed stronger scene-selective responses. Moreover, by selecting a relatively low threshold (i.e., 20%), we avoided confining our ROIs to the center of activity sites.
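The label-generation procedure can be sketched as follows, assuming each subject's ROI has already been resampled to the common template and is represented as a collection of template vertex indices. The function name and data layout are hypothetical; only the 20% threshold comes from the text.

```python
def probabilistic_label(subject_labels, n_vertices, threshold=0.20):
    """Build a probabilistic ROI from per-subject binary labels.

    subject_labels: one collection of template vertex indices per subject.
    Returns (prob, roi): the per-vertex fraction of subjects whose label
    includes that vertex, and the vertices whose fraction exceeds threshold.
    """
    counts = [0] * n_vertices
    for label in subject_labels:
        for v in set(label):          # count each subject at most once
            counts[v] += 1
    n_subjects = len(subject_labels)
    prob = [c / n_subjects for c in counts]
    roi = [v for v, p in enumerate(prob) if p > threshold]
    return prob, roi
```

Because every subject contributes a binary vote per vertex, a subject with unusually strong activation carries no more weight than any other, which matches the rationale given above.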
2.4.5. Statistical tests
To test the effect of independent parameters, we applied paired t-tests and/or a repeated-measures ANOVA, with Greenhouse-Geisser correction whenever the sphericity assumption was violated.
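For illustration, the tests named above can be written out in a few lines. The sketch assumes a one-way within-subject design with data arranged as one row per subject and one column per condition (for example, one column per ROI); it returns test statistics and degrees of freedom only, and the function names are hypothetical. The Greenhouse-Geisser epsilon is computed from the double-centered covariance matrix of the conditions; when sphericity is violated, both degrees of freedom are multiplied by epsilon before the p-value is looked up.

```python
import math
from statistics import mean, stdev

def paired_t(a, b):
    """Paired t statistic and degrees of freedom (n - 1)."""
    d = [x - y for x, y in zip(a, b)]
    return mean(d) / (stdev(d) / math.sqrt(len(d))), len(d) - 1

def rm_anova(data):
    """One-way repeated-measures ANOVA; data[s][c] = subject s, condition c."""
    n, k = len(data), len(data[0])
    grand = mean(v for row in data for v in row)
    cond = [mean(row[c] for row in data) for c in range(k)]
    subj = [mean(row) for row in data]
    ss_cond = n * sum((m - grand) ** 2 for m in cond)
    ss_err = sum((data[s][c] - cond[c] - subj[s] + grand) ** 2
                 for s in range(n) for c in range(k))
    df1, df2 = k - 1, (k - 1) * (n - 1)
    return (ss_cond / df1) / (ss_err / df2), df1, df2

def gg_epsilon(data):
    """Greenhouse-Geisser epsilon, bounded by 1/(k-1) and 1."""
    n, k = len(data), len(data[0])
    cm = [mean(row[c] for row in data) for c in range(k)]
    cov = [[sum((r[i] - cm[i]) * (r[j] - cm[j]) for r in data) / (n - 1)
            for j in range(k)] for i in range(k)]
    row_m = [mean(r) for r in cov]
    tot = mean(row_m)
    dc = [[cov[i][j] - row_m[i] - row_m[j] + tot for j in range(k)]
          for i in range(k)]
    trace = sum(dc[i][i] for i in range(k))
    return trace ** 2 / ((k - 1) * sum(v * v for r in dc for v in r))
```

With only two conditions the ANOVA reduces to the paired t-test (F equals t squared) and epsilon is exactly 1, which gives a quick sanity check of the implementation.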
2.5. Data sharing statement
All data, code and stimuli are available upon request.
MATLAB (RRID: SCR_001622; https://www.mathworks.com).
FreeSurfer (RRID:SCR_001847; https://surfer.nmr.mgh.harvard.edu/fswiki/FsFast).
Psychophysics Toolbox (RRID:SCR_002881; http://psychtoolbox.org/docs/Psychtoolbox).
3. Results
This study consists of seven experiments. Experiment 1 focused on localizing the scene-selective site (PIGS) within the posterior intraparietal region. Experiment 2 showed consistency in the spatial location of PIGS across sessions. Experiment 3 examined PIGS location relative to V6, an area involved in motion coherency and optic flow encoding. Experiment 4 showed that, despite its small size, PIGS is detectable in group-averaged maps in large populations. Experiment 5 showed that scene and non-scene objects are differentiable from each other based on the response evoked within PIGS. Experiment 6 tested the response in PIGS to ego-motion in scenes, yielding a result that differentiated PIGS from the other scene-selective regions. Finally, Experiment 7 showed that PIGS does not respond selectively to biological motion.
3.1. Experiment 1 – Small scene-selective sites are detectable within the posterior intraparietal gyrus
When the level of spatial smoothing is relatively low, scene-selective sites (other than PPA/TPA, TOS/OPA and RSC/MPA) are detectable across the brain, especially within the posterior intraparietal gyrus (Fig. 1B). To test the consistency in location of these scene-selective sites across individuals, fourteen subjects were presented with scene and face stimuli while we collected their fMRI activity. Considering the expected small size of the scene-selective sites within the intraparietal region, we used limited signal smoothing in our analysis (FWHM = 2 mm; see Methods) to increase the chance of detecting these sites.
Figure 2 shows the activity maps evoked by the ‘scenes > faces’ contrast in seven exemplar subjects. All activity maps were overlaid on a common brain template to clarify the consistency in location of scene-selective sites across individuals. In all tested individuals, besides areas RSC/MPA and TOS/OPA, we detected at least one scene-selective site within the posterior portion of the intraparietal gyrus, close to (but outside) the parieto-occipital sulcus (POS). Accordingly, we named this site the posterior intraparietal gyrus scene-selective site or PIGS.
When measured at the same threshold level (p < 10⁻²), the relative size of PIGS was 73.86% ± 49.01% (mean ± S.D.) of RSC/MPA, 28.26% ± 15.67% of TOS/OPA, and 19.45% ± 8.43% of PPA/TPA. Considering the proximity of PIGS to the skull and head coil surface (Fig. 1), the relatively small size of PIGS could not be ascribed to a lower signal/contrast to noise ratio in that region.
To better clarify the consistency of PIGS localization across subjects, we also generated group-averaged activity maps based on random effects, and after correction for multiple comparisons. As demonstrated in Fig. 3A, PIGS was also detectable in the group-averaged activity maps, in almost the same location as in the individual subject maps. Overall, these results suggest that, despite the relatively small size of this scene-selective site, PIGS is consistently detectable across subjects in the same cortical location.
3.2. Experiment 2 – PIGS reproducibility across scan sessions
To test the reproducibility of our results, four subjects were selected randomly among those who participated in Experiment 1. These subjects were scanned again (on a different day), using a 7T (rather than a 3T) scanner, and a different set of scenes and faces (Fig. 4A).
As demonstrated in Fig. 4, despite utilizing a different scanner and a different set of stimuli, PIGS was still detectable in the same location (Fig. 4B-D). Here again, PIGS was localized within the posterior portion of the intraparietal gyrus and close to the posterior lip of parieto-occipital sulcus. Considering the higher contrast/signal to noise ratio of 7T (compared to 3T) scans, this result strongly suggested that the PIGS evidence was not simply a nuisance artifact in fMRI measurements.
3.3. Experiment 3 – Localization of areas PIGS vs. V6
Posterior intraparietal cortex also accommodates area V6, which is involved in motion coherency (optic flow) encoding (Pitzalis et al., 2009). Recent studies have suggested that scene stimuli evoke a strong response within V6 (Sulpizio et al., 2020). To test whether PIGS overlaps with area V6, we localized V6 in all subjects who participated in Experiment 1, based on visual presentation of random vs. radially moving dots (see Methods).
Figure 4D shows the co-localization of V6 and PIGS in four individual subjects. Consistent with previous studies (Pitzalis et al., 2009; Pitzalis et al., 2015), V6 was localized within the posterior portion of the POS without any overlap between its center and PIGS.
To test the relative localization of these two regions at the group level, we generated probabilistic labels for PIGS and V6 (see Methods). As demonstrated in Fig. 5, the probabilistic label for PIGS was localized within the intraparietal gyrus and outside the POS (Fig. 5A), while V6 was located within the POS (Fig. 5B). We also did not find any overlap between area V6 and areas RSC/MPA and TOS/OPA (Fig. 5C). Thus, despite the low threshold level used to generate these labels (probability > 20%), the areas PIGS and V6 were located side-by-side (Fig. 5D), without any overlap between their centers.
3.4. Experiment 4 – PIGS localization in a larger population
The results of Experiments 1–3 suggest that PIGS can be localized consistently across individual subjects, and this area appears to be distinguishable from the adjacent area V6. However, considering the small size of this area, it appears necessary to test whether this area was detectable based on group averaging in a larger population. Accordingly, in Experiment 4 we scanned thirty-one individuals (other than those who participated in Experiments 1–3) while they were presented with the same stimuli as in Experiment 1 (Fig. 2).
As demonstrated in Fig. 3B, PIGS was also detectable in this new population in almost the same location as in Experiment 1. Specifically, PIGS was detected bilaterally within the posterior portion of the intraparietal gyrus, adjacent to the POS. We did not find a significant difference between the two populations in the size of PIGS when normalized either relative to the size of RSC/MPA (t(43) = 0.98, p = 0.33), or TOS/OPA (t(43) = 0.26, p = 0.80) or PPA/TPA (t(43) = 0.52, p = 0.61). Thus, the location and relative size of PIGS appeared to remain unchanged across populations.
These results suggest that one may rely on the probabilistically-generated labels to examine the evoked activity within PIGS. To test this hypothesis, we measured the level of scene-selective activity in PIGS, along with the areas TOS/OPA, RSC/MPA and V6, using the probabilistic labels generated based on the results of Experiments 1 and 3 (see Methods and Fig. 5). As demonstrated in Fig. 6A-B, results of this ROI analysis showed a significant scene-selective activity in PIGS (t(31) = 8.11, p < 10⁻⁸), TOS/OPA (t(31) = 7.91, p < 10⁻⁷) and RSC/MPA (t(31) = 9.11, p < 10⁻⁸). Importantly, despite the proximity of PIGS and V6, the level of scene-selective activity in PIGS was significantly higher than that in V6 (t(11) = 5.03, p < 10⁻⁴). Thus, it appears that the probabilistically-generated ROIs can be used to examine PIGS response, and to differentiate it from adjacent areas such as V6 (see also Experiment 5).
3.5. Experiment 5 – Selective response to scenes compared to non-scene objects in PIGS
Thus far, we localized PIGS in multiple experiments by contrasting the response evoked by scenes vs. faces. In Experiments 5a and 5b, we examined whether PIGS also showed a selective response to scenes compared to objects (not just faces). In Experiment 5a, twelve individuals, other than those who participated in Experiments 1–3, were scanned while viewing pictures of scenes (other than those used to localize PIGS) and everyday objects (Fig. 7A) (see Methods).
As demonstrated in Figs. 7B and 7C for one individual subject, ‘scenes vs. objects’ and ‘scenes vs. faces’ (Experiment 4) contrasts generated similar activity maps. Importantly, in both maps, PIGS was detectable in a consistent location adjacent to (but outside) the parieto-occipital sulcus. Moreover, results of an ROI analysis, using the probabilistically-generated labels based on the results of Experiments 1 and 3, yielded significant scene-selective activity within PIGS (t(11) = 6.57, p < 10⁻⁴), RSC/MPA (t(12) = 11.00, p < 10⁻⁶) and TOS/OPA (t(12) = 6.26, p < 10⁻³) (Figs. 8A and 8B). We also found that the level of scene-selective activity within PIGS is significantly higher than that in the adjacent area V6 (t(11) = 2.42, p = 0.03). Thus, scenes and (non-face) objects are differentiable from each other, based on the activity evoked within PIGS.
In Experiment 5b, fifteen individuals (other than those who participated in Experiments 1 and 5a), were scanned while viewing a new set of stimuli that included pictures of scenes, faces, everyday objects and scrambled objects (Fig. 7D). In contrast to Experiment 5a in which the number of objects within each image could vary, here, each image contained only one object (see Methods). Despite this change, contrasting the response to scene vs. non-scene images (averaged over objects, scrambled objects and faces) evoked an activity pattern similar to that of the ‘scenes vs. faces’ contrast (Figs. 7E and 7F). Moreover, the ROI analysis yielded a significant scene-selective activity within PIGS (t(14) = 2.37, p = 0.03), RSC/MPA (t(14) = 10.33, p < 10⁻⁷) and TOS/OPA (t(14) = 4.79, p < 10⁻³) (Fig. 8). Here again, the level of scene-selective activity within PIGS was higher than V6 (t(14) = 2.27, p = 0.04). Together, results of Experiments 1–5 suggest that PIGS responds selectively to a wide range of scenes compared to non-scene objects, and that the level of this activity is higher than in the adjacent area V6.
3.6. Experiment 6 – PIGS response to ego-motion
Experiments 1–5 clarified the location of PIGS, and its general functional selectivity for scenes. However, a more specific role of this area in scene perception remains undefined. Experiment 6 tested the hypothesis that area PIGS is involved in encoding ego-motion within scenes. This hypothesis was motivated by the fact that PIGS is located adjacent to V6 (Fig. 5D), an area involved in encoding optic flow. Other studies have also suggested that ego-motion may influence the scene-selective activity within this region, without clarifying whether this activity was centered either within or outside V6 (Pitzalis et al., 2020; Sulpizio et al., 2020).
Twelve individuals, from those who participated in Experiment 1, took part in this experiment (see Methods). These subjects were presented with coherently-changing scene stimuli that implied ego-motion across different outdoor trails (Fig. 9). In separate blocks, they were also presented with incoherently-changing scenes and faces. Figure 10 shows the group-averaged scene-selective activity, evoked by coherently- (Fig. 10A) and incoherently-changing scene stimuli (Fig. 10B). Consistent with our hypothesis, PIGS showed a significantly stronger response (bilaterally) to coherently- (compared to incoherently-) changing scenes that implied ego-motion (Fig. 10C). However, the level of activity within RSC/MPA and TOS/OPA did not change significantly between these two conditions.
Consistent with the group-averaged activity maps, results of an ROI analysis (Fig. 11) yielded a significantly stronger response to coherently- (vs. incoherently-) changing scenes in PIGS (t(11) = 5.97, p < 10⁻⁴) but not in RSC/MPA (t(11) = 0.12, p = 0.90) and TOS/OPA (t(11) = 0.48, p = 0.64). Interestingly, area PPA/TPA showed a stronger response to incoherently- (compared to coherently-) changing scenes (t(11) = 3.48, p < 0.01). To better clarify the difference between scene-selective areas, we repeated this test by applying a one-way repeated measures ANOVA to the differential response to ‘coherently- vs. incoherently-changing scenes’, measured across these four scene-selective areas. This test yielded a significant effect of area on the evoked differential activity (F(3, 11) = 53.89, p < 10⁻¹⁰). Post hoc analysis, with Bonferroni correction, showed that the level of differential activity evoked by ‘coherently- vs. incoherently-changing scenes’ was significantly higher within PIGS than all other scene-selective areas (p < 10⁻⁶). These results suggest a distinctive role for area PIGS in ego-motion encoding that differentiates it from the other scene-selective areas. The absence of activity modulation in the other scene-selective areas also ruled out the possibility that the activity increase in PIGS was simply due to attentional modulation during coherently- vs. incoherently-changing scenes (see Discussion).
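For readers unfamiliar with the test, the following is a minimal sketch of how a one-way repeated measures ANOVA partitions variance when every subject contributes one differential response per area. It is an illustration under stated assumptions rather than the analysis code used in the study: the function name and the data values are hypothetical, no sphericity correction is applied, and the p-value is omitted because it would require the F distribution from a statistics library.

```python
def rm_anova_oneway(data):
    """One-way repeated-measures ANOVA.

    data: list of rows, one per subject; each row holds one value per
    condition (here: one differential response per area).
    Returns (F, df_effect, df_error).
    """
    n = len(data)      # number of subjects
    k = len(data[0])   # number of conditions (areas)
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    # Partition the total sum of squares into condition, subject and error.
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_cond - ss_subj

    df_effect = k - 1
    df_error = (k - 1) * (n - 1)
    f_stat = (ss_cond / df_effect) / (ss_error / df_error)
    return f_stat, df_effect, df_error


# Hypothetical differential responses (coherent minus incoherent),
# columns ordered as PIGS, RSC/MPA, TOS/OPA, PPA/TPA.
responses = [
    [0.60, 0.10, 0.05, -0.20],
    [0.45, 0.00, 0.10, -0.10],
    [0.70, 0.05, -0.05, -0.25],
    [0.50, 0.15, 0.00, -0.15],
    [0.55, -0.05, 0.10, -0.30],
]
f_stat, df1, df2 = rm_anova_oneway(responses)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

Removing the between-subject sum of squares from the error term is what distinguishes this test from a between-subjects ANOVA, and it is the reason the test is sensitive to differences between areas that are consistent within subjects.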
In addition to PIGS, we also found a significantly stronger response to coherently- (rather than incoherently-) changing scenes in area V6 (t(11) = 3.57, p < 0.01). However, the level of this selectivity was significantly weaker in V6 compared to that in PIGS (t(11) = 2.63, p = 0.02). Moreover, in the group-averaged activity maps, the contrast between coherently- vs. incoherently-changing scenes yielded a stronger response outside (rather than inside) the POS and also in area MT, located at the tip of medial temporal sulcus (Fig. 10C). Together, these results suggest that the impact of ego-motion on scene processing is stronger in PIGS than in V6.
In the same session (but different runs), we also tested the selectivity of the PIGS response for simpler forms of motion. In different blocks, subjects were presented with radially moving vs. stationary concentric rings (see Methods). Consistent with the previous studies of motion perception (Pitzalis et al., 2009; Hacialihafiz and Bartels, 2015), the results of an ROI analysis did not yield any significant motion-selective activity within PIGS (t(11) = 1.84, p = 0.10), RSC/MPA (t(11) = 1.97, p = 0.08), PPA/TPA (t(11) = 1.93, p = 0.08) or V6 (t(11) = 2.03, p = 0.07). In contrast, we found a strong motion selectivity within area TOS/OPA (t(11) = 4.57, p < 10⁻³), likely due to its overlap with the motion-selective area V3A/B (Nasr et al., 2011). Thus, in contrast to optic flow and ego-motion, simpler forms of motion only evoke weak-to-no selective activity within PIGS and V6.
3.7. Experiment 7 – PIGS response to biological motion
The results of Experiment 6 showed that PIGS responds selectively to ego-motion in scenes, but not strongly to radially moving rings. However, it could be argued that PIGS may also respond to the other types of complex motion, e.g., biological motion. To test this hypothesis, we measured the PIGS response to biological vs. translational motion in twelve subjects (see Methods). As illustrated in Fig. 12, and consistent with the previous studies of biological motion (Puce et al., 1998; Beauchamp et al., 2003; Puce and Perrett, 2003; Pelphrey et al., 2005; Jastorff and Orban, 2009; Kamps et al., 2016), biological motion evoked a stronger response bilaterally within area MT and superior temporal sulcus but not within the posterior intraparietal gyrus. Consistent with the maps, an ROI analysis (based on the functionally-defined labels) showed no significant difference between the response to biological vs. translational motion within PIGS (t(11) = 1.27, p = 0.23), TOS/OPA (t(11) = 1.63, p = 0.13), RSC/MPA (t(11) = 1.40, p = 0.18), and PPA/TPA (t(11) = 0.41, p= 0.69). These results indicated that PIGS does not respond to all types of complex motion.
4. Discussion
These data suggest that selective scene processing is not limited to areas PPA/TPA, RSC/MPA and TOS/OPA, and that additional smaller scene-selective sites can also be found across the visual system. By focusing on one small scene-selective site, we showed that this site (PIGS) was consistently identifiable across individuals and groups. We also showed that inclusion of this site in the models of scene processing may clarify how ego-motion influences scene perception.
4.1. FMRI and all that “noise, noise, noise”!
The early fMRI studies dealt with a considerable amount of noise in measurements, partly due to using lower magnetic field scanners and imperfect hardware and software. This noise in measurements affected the reliability of the findings. Consequently, those early studies focused on larger activity sites that were more reliably detectable across subjects/sessions. The smaller sites were either ignored or eliminated by excessive signal smoothing, applied to enhance the contrast-to-noise ratio.
However, advances in neuroimaging techniques have now made it possible to detect and distinguish fMRI activity at the spatial scale of cortical columns (Yacoub et al., 2007; Zimmermann et al., 2011; Nasr et al., 2016). Although the reliability of the fMRI signal still depends on the number of trial repetitions, a spatially confined, but extensively repeated, evoked response can be detected reliably across different sessions (Nasr et al., 2016; Kennedy et al., 2023).
The present data show that PIGS could be localized consistently across multiple subjects and across different sessions and scanners. Furthermore, our results indicated that the probabilistic labels, generated based on one population, can be used to localize PIGS, and to distinguish its function from the adjacent regions (e.g., V6) in a second population. Together, these results highlight the reliability of current fMRI techniques in detecting smaller cortical regions at the level of individual subjects.
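One way a probabilistic label could be generated in one population and applied to another is sketched below. This is an illustration only: it assumes that each subject's ROI is available as a binary mask on a common (spatially normalized) set of vertices, and the inclusion threshold, the function names and the toy values are all hypothetical rather than taken from the Methods.

```python
def probabilistic_label(subject_masks, threshold=0.25):
    """Build a group label from binary per-subject ROI masks.

    subject_masks: list of equal-length lists of 0/1, one per subject,
        all defined on the same vertices.
    threshold: hypothetical minimum fraction of subjects whose ROI must
        include a vertex for it to enter the group label.
    Returns (probability_map, label_vertices).
    """
    n_subjects = len(subject_masks)
    n_vertices = len(subject_masks[0])
    prob = [
        sum(mask[v] for mask in subject_masks) / n_subjects
        for v in range(n_vertices)
    ]
    label = [v for v, p in enumerate(prob) if p >= threshold]
    return prob, label


def roi_mean(response, label):
    """Average a per-vertex response map over the vertices in a label."""
    return sum(response[v] for v in label) / len(label)


# Toy example: four subjects from a first group, four vertices.
masks = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
]
prob_map, group_label = probabilistic_label(masks, threshold=0.5)

# Apply the group label to the response map of a subject from a second group.
new_subject_response = [0.0, 2.0, 4.0, 6.0]
print(prob_map, group_label, roi_mean(new_subject_response, group_label))
```

Because the label is defined without using the second group's data, the ROI measurement in that group is independent of the data used to define the region.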
4.2. PIGS responds selectively to a variety of scene stimuli
To establish a true category-selective response, the stimulus set should sample enough variety to reflect the range and variability among the category members. Consistent with this are the many (and continuing) studies seeking to define the range and fundamental aspects of ‘place selective’ (Epstein and Kanwisher, 1998; Troiani et al., 2014) and ‘face selective’ (Kanwisher et al., 1997; Yue et al., 2011) stimuli in extrastriate visual cortex, decades after their first discovery.
Accordingly, here we tested five different scene stimulus sets across our experiments, including a wide variety of indoor/outdoor and natural/manmade scenes. In all cases, we were able to evoke a selective response within PIGS, and the level of this response was comparable to that in the adjacent scene-selective areas RSC/MPA and TOS/OPA. Thus, the scene-selective response in PIGS appeared not to be limited to a single subset of scenes. However, it remains unclear whether scene stimuli are differentiable from each other based on the pattern of evoked response in this region. More experiments are necessary to test this hypothesis (see also the Limitations).
4.3. PIGS is not just another scene selective area
Our results (Experiment 6) suggest that ego-motion can significantly influence the activity evoked within PIGS. This phenomenon distinguishes the role of PIGS in scene perception, relative to other scene-selective regions. Specifically, previous studies have shown that PPA/TPA and RSC/MPA show weak-to-no sensitivity to motion per se (Hacialihafiz and Bartels, 2015). In comparison, area TOS/OPA shows a stronger motion-selective response, presumably related to its (partial) overlap with area V3A/B (Tootell et al., 1997; Nasr et al., 2011). Nevertheless, the current data show that the ego-motion related activity within PIGS is stronger than in TOS/OPA.
This finding is consistent with the fact that PIGS is located adjacent to area V6 (Figs. 4 and 5), an area that contributes to encoding optic flow (Pitzalis et al., 2009). Considering the proximity of PIGS and V6, hypothetical inputs from V6 may contribute to the strong ego-motion selective response in PIGS. That said, the current data also suggest that the role of PIGS differs from that of V6 in terms of ego-motion encoding. Compared to V6, PIGS showed a stronger impact of ego-motion on scene processing, while V6 shows a stronger response to optic flow induced by random dot arrays. Thus, PIGS contributes to encoding scenes and ego-motion within scenes, while V6 is likely involved in detecting optic flow caused by ego-motion.
4.4. Ego-motion encoding in PIGS vs. TOS/OPA
We showed that PIGS and TOS/OPA are located on two different sides of the IPS, with TOS/OPA located more ventrally compared to PIGS. We also showed a stronger impact of ego-motion on activity within PIGS compared to TOS/OPA. In contrast, TOS/OPA (but not PIGS) responded selectively to simpler forms of motion. These results suggest that PIGS and TOS/OPA are likely two different visual areas, with PIGS being involved in encoding higher-level ego-motion cues.
However, at least two previous studies suggested that area TOS/OPA may also contribute to ego-motion encoding in scenes. Specifically, Kamps and colleagues have shown increased response in TOS/OPA during ego-motion vs. static scene presentation (Kamps et al., 2016). Jones et al. have also shown that ego-motion (and not other types of movements) enhances TOS/OPA activity when compared to scrambled scenes (Jones et al., 2023). In contrast to these findings, our tests showed weak-to-no ego-motion related activity enhancement in area TOS/OPA.
This difference may well reflect methodological discrepancies. Specifically, in the study by Kamps et al., the static and ego-motion stimuli were presented with two different refresh rates, whereas in our study the coherently- and incoherently-changing stimuli were refreshed with the same temporal frequency (see Methods). In the study by Jones et al., the response to scrambled scenes was used as a control condition, whereas our stimuli were more closely matched, differing only in the sequence of image presentation. Moreover, these studies used higher levels of spatial smoothing (FWHM = 5 mm), compared to the values we used here during pre-processing. Also, for understandable reasons, they limited their analysis to previously known scene-selective areas. These technical differences make it difficult to directly compare the two sets of results.
4.5. Ego-motion but not attention
Experiment 6 showed stronger scene-selective activity within PIGS when subjects were presented with coherently- (compared to incoherently-) changing scenes. It could be argued that coherently-changing scenes attract more attention compared to incoherently-changing scenes. On the face of it, this hypothesis appears to be consistent with the role of intraparietal cortex in controlling spatial attention (Behrmann et al., 2004; Szczepanski et al., 2010). However, multiple studies have shown that attention to scenes increases the level of activity within the scene-selective areas (O’Craven et al., 1999; Nasr and Tootell, 2012a; Baldauf and Desimone, 2014), yet we did not find any significant activity increases in response to coherently- (vs. incoherently-) changing scenes in PPA/TPA, RSC/MPA and TOS/OPA. Thus, modulation of attention, per se, could not be responsible for the enhanced activity within PIGS in response to coherently- (compared to incoherently-) changing scenes.
4.6. Direction-selective response within the intraparietal cortex
Motion-selective sites are expected to show at least some level of sensitivity to motion direction (Albright et al., 1984; Zimmermann et al., 2011). We did not test the sensitivity of PIGS to the direction of ego-motion. However, Pitzalis et al. have shown evidence for motion direction encoding within the V6+ region (Pitzalis et al., 2020). Furthermore, Tootell et al. reported evidence for motion direction (approaching vs. withdrawing) encoding within posterior intraparietal cortex (Tootell et al., 2022). Although none of these studies showed any evidence for a new scene-selective area, they raised the possibility that PIGS may also contribute towards encoding ego-motion direction, and even higher-level cognitive concepts such as detecting an intrusion to personal space (Holt et al., 2014).
4.7. Limitations
In the past, many studies have scrutinized the response function of scene-selective areas to numerous stimulus contrasts. According to these studies, scene-selective areas can differentiate many object categories based on their low-, mid-, and/or higher-level visual features such as their natural size (Konkle and Oliva, 2012), (non-)animacy (Yue et al., 2020; Coggan and Tong, 2023), rectilinearity (Nasr et al., 2014), spatial layout (Harel et al., 2013), orientation (Nasr and Tootell, 2012b), spikiness (Coggan and Tong, 2023), location within the visual field (Levy et al., 2001), and spatial content (Bar et al., 2008). Our findings are only a first step toward characterizing PIGS in greater detail. More tests are required before our knowledge of the response function of PIGS reaches the current (yet incomplete) level of knowledge about the other scene-selective areas.
5. Conclusion
Neuroimaging studies of scene perception have typically focused on linking scene perception to the evoked activity within PPA/TPA, TOS/OPA and RSC/MPA. Although other scene-selective sites are detectable across the visual cortex, they are largely ignored because of their relatively small size. Our data suggest that including these small sites in future models of scene perception may help clarify how scenes are processed in dynamic environments.
Acknowledgements
This work was supported by NIH NEI (grants R01 EY017081 and R01 EY030434), and by the MGH/HST Athinoula A. Martinos Center for Biomedical Imaging. Crucial resources were made available by an NIH Shared Instrumentation Grant S10-RR019371. We thank Ms. Azma Mareyam for help with hardware maintenance during this study. We also thank Dr. Claudio Galletti for his helpful comments.
References
- 1. The Southampton-York Natural Scenes (SYNS) dataset: Statistics of surface attitude. Sci Rep 6:1–17
- 2. Columnar organization of directionally selective cells in visual area MT of the macaque. J Neurophysiol 51:16–31
- 3. Neural mechanisms of object-based attention. Science 344:424–427
- 4. Scenes unseen: the parahippocampal cortex intrinsically subserves contextual associations, not scenes or places per se. J Neurosci 28:8539–8544
- 5. FMRI responses to video and point-light displays of moving humans and manipulable objects. J Cogn Neurosci 15:991–1001
- 6. Parietal cortex and attention. Curr Opin Neurobiol 14:212–217
- 7. The Psychophysics Toolbox. Spat Vis 10:433–436
- 8. Spikiness and animacy as potential organizing principles of human ventral visual cortex. Cereb Cortex
- 9. Cortical surface-based analysis. I. Segmentation and surface reconstruction. Neuroimage 9:179–194
- 10. Cortical depth dependent functional responses in humans at 7T: improved specificity with 3D GRASE. PLoS One 8
- 11. Characterization of the human visual V6 complex by functional magnetic resonance imaging. Eur J Neurosci 17:2201–2211
- 12. The occipital place area is causally and selectively involved in scene perception. J Neurosci 33:1331–1336
- 13. A cortical representation of the local visual environment. Nature 392:598–601
- 14. FreeSurfer. Neuroimage 62:774–781
- 15. Cortical surface-based analysis. II: Inflation, flattening, and a surface-based coordinate system. Neuroimage 9:195–207
- 16. Whole brain segmentation: automated labeling of neuroanatomical structures in the human brain. Neuron 33:341–355
- 17. Multisubject fMRI studies and conjunction analyses. Neuroimage 10:385–396
- 18. Accurate and robust brain image alignment using boundary-based registration. Neuroimage 48:63–72
- 19. The neural basis of object perception. Curr Opin Neurobiol 13:159–166
- 20. Motion responses in scene-selective regions. Neuroimage 118:438–444
- 21. Deconstructing visual scenes in cortex: gradients of object and spatial layout information. Cereb Cortex 23:947–957
- 22. Neural correlates of personal space intrusion. J Neurosci 34:4123–4134
- 23. Human functional magnetic resonance imaging reveals separation and integration of shape and motion cues in biological motion processing. J Neurosci 29:7315–7329
- 24. The occipital place area represents visual information about walking, not crawling. Cereb Cortex
- 25. The occipital place area represents first-person perspective motion information through scenes. Cortex 83:17–26
- 26. The fusiform face area: a module in human extrastriate cortex specialized for face perception. J Neurosci 17:4302–4311
- 27. Two fine-scale channels for encoding motion and stereopsis within the human magnocellular stream. Prog Neurobiol
- 28. Layer-specific BOLD activation in human V1. Hum Brain Mapp 31:1297–1304
- 29. A network for scene processing in the macaque temporal lobe. Neuron 79:766–781
- 30. Human scene-selective areas represent 3D configurations of surfaces. Neuron 101:178–192
- 31. Center–periphery organization of human object areas. Nat Neurosci 4
- 32. Submillimeter fMRI reveals an extensive, fine-grained and functionally-relevant scene-processing network in monkeys. Prog Neurobiol 211
- 33. The retrosplenial contribution to human navigation: a review of lesion and neuroimaging findings. Scand J Psychol 42:225–238
- 34. Role of fusiform and anterior temporal cortical areas in facial recognition. Neuroimage 63:1743–1753
- 35. A cardinal orientation bias in scene-selective visual cortex. J Neurosci 32:14921–14926
- 36. Thinking outside the box: rectilinear shapes selectively activate scene-selective cortex. J Neurosci 34:6721–6735
- 37. Interdigitated Color- and Disparity-Selective Columns within Human Visual Cortical Areas V2 and V3. J Neurosci 36:1841–1857
- 38. Scene-selective cortical regions in human and nonhuman primates. J Neurosci 31:13771–13785
- 39. fMRI evidence for objects as the units of attentional selection. Nature 401:584–587
- 40. Ramp-shaped neural tuning supports graded population-level representation of the object-to-scene continuum. Sci Rep 12
- 41. The VideoToolbox software for visual psychophysics: transforming numbers into movies. Spat Vis 10:437–442
- 42. Functional anatomy of biological motion perception in posterior temporal cortex: an fMRI study of eye, mouth and hand movements. Cereb Cortex 15:1866–1876
- 43. The human cortical areas V6 and V6A. Vis Neurosci
- 44. Human V6: the medial motion area. Cereb Cortex 20:411–424
- 45. Neural bases of self- and object-motion in a naturalistic vision. Hum Brain Mapp 41:1084–1111
- 46. Laminar analysis of 7T BOLD using an imposed spatial activation pattern in human V1. Neuroimage 52:1334–1346
- 47. Reducing sensitivity losses due to respiration and motion in accelerated echo planar imaging by reordering the autocalibration data acquisition. Magn Reson Med
- 48. Electrophysiology and brain imaging of biological motion. Philos Trans R Soc Lond B Biol Sci 358:435–445
- 49. Temporal cortex activation in humans viewing eye and mouth movements. J Neurosci 18:2188–2199
- 50. An anterior temporal face patch in human cortex, predicted by macaque maps. Proc Natl Acad Sci U S A 106:1995–2000
- 51. Scene-selectivity and retinotopy in medial parietal cortex. Front Hum Neurosci 10
- 52. A common neural substrate for processing scenes and egomotion-compatible visual motion. Brain Structure and Function 225:2091–2110
- 53. Mechanisms of spatial attention control in frontal and parietal cortex. J Neurosci 30:148–160
- 54. Interdigitated Columnar Representation of Personal Space and Visual Space in Human Parietal Cortex. J Neurosci 42:9011–9029
- 55. Functional analysis of V3A and related areas in human visual cortex. J Neurosci 17:7060–7078
- 56. Multiple object properties drive scene-selective regions. Cereb Cortex 24:883–897
- 57. Comparing face patch systems in macaques and humans. Proc Natl Acad Sci U S A 105:19514–19519
- 58. Robust detection of ocular dominance columns in humans using Hahn Spin Echo BOLD functional MRI at 7 Tesla. Neuroimage 37:1161–1177
- 59. Curvature processing in human visual cortical areas. Neuroimage 222
- 60. Lower-level stimulus features strongly influence responses in the fusiform face area. Cereb Cortex 21:35–47
- 61. Mapping the organization of axis of motion selective features in human area MT using high-field fMRI. PLoS One 6
Copyright
© 2024, Kennedy et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.