A geometric shape regularity effect in the human brain
Figures
Measuring and modeling the perceptual similarity of geometric shapes.
(A) The 11 quadrilaterals used throughout the experiments (colors are used consistently in all other figures). (B) Sample displays from the behavioral visual search task used to estimate the 11 × 11 shape similarity matrix; the right inset shows two trials, in which participants had to locate the deviant shape among nine. (C) Multidimensional scaling (MDS) of human dissimilarity judgments; the gray arrow indicates the projection onto the MDS space of the number of geometric primitives in a shape. (D) The behavioral dissimilarity matrix (left) was better captured by a geometric feature coding model (middle) than by a convolutional neural network (right). (E) General linear model (GLM) coefficients for each participant. An accompanying explainer video is provided in Figure 1—video 1.
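The analyses in panels C–E can be illustrated schematically: project an 11 × 11 dissimilarity matrix into a 2D MDS space, then regress its vectorized upper triangle on model dissimilarity matrices. The sketch below uses synthetic dissimilarities and scikit-learn stand-ins; the actual pipeline, models, and fitting details in the paper differ.

```python
# Schematic sketch of the Figure 1C-E analyses, with synthetic (hypothetical)
# dissimilarity data in place of the real behavioral and model RDMs.
import numpy as np
from sklearn.manifold import MDS
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_shapes = 11

# Hypothetical symmetric 11 x 11 dissimilarity matrix with zero diagonal.
d = rng.random((n_shapes, n_shapes))
behav_rdm = (d + d.T) / 2
np.fill_diagonal(behav_rdm, 0)

# Panel C analogue: embed the shapes in a 2D space from their dissimilarities.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(behav_rdm)

# Panels D-E analogue: regress the vectorized behavioral RDM (upper triangle)
# on two model RDMs (random stand-ins for the geometric-feature and CNN RDMs).
iu = np.triu_indices(n_shapes, k=1)
y = behav_rdm[iu]
X = np.column_stack([rng.random((n_shapes, n_shapes))[iu] for _ in range(2)])
glm = LinearRegression().fit(X, y)
print("GLM coefficients:", glm.coef_)
```

One coefficient per model RDM is obtained per participant, which is what panel E plots across participants.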
Additional convolutional neural network (CNN) encoding models of human behavior.
(A) Correlation matrix between all pairs of representational dissimilarity matrices (RDMs) generated by each CNN and layer considered. (B) Replication of Figure 1D with different CNNs. Stars indicate a significant difference between the geometric feature model and the respective CNN encoding model (p < 0.001). t and p values also indicate whether the CNN encoding model is a significant predictor of participant behavior (the geometric feature model is always highly significant). (C) Replication of B using different layers of CORnet, organized from early to late layers, from left to right. Note that the late layers are much more significant predictors of human behavior than the early ones – although still far inferior to the geometric feature model. (D) Replication of our general linear model (GLM) analysis including only shapes for which there is no obvious name in English – though we gave them names in this manuscript to refer to them: ‘kite’, ‘rightKite’, ‘hinge’, ‘rustedHinge’, and ‘random’.
Explainer video for Figure 1.
Explainer videos are not peer reviewed.
Localizing the brain systems involved in geometric shape perception.
(A) Visual categories used in the localizer. (B) Task: passive presentation of miniblocks of pictures from a consistent visual category. In some miniblocks, among a series of six pictures from a given category, participants had to detect a rare target star. (C) Statistical map associated with the contrast ‘single geometric shape > faces, houses, and tools’, projected on an inflated brain (top: adults; bottom: children; clusters significant at cluster-corrected p < 0.05 with a non-parametric two-tailed bootstrap test, as reported in the text). (D) BOLD response amplitude (regression weights, arbitrary units) within each significant cluster with subject-specific localization. Geometric shapes activate the intraparietal sulcus (IPS) and posterior inferior temporal gyrus (pITG), while causing reduced activation in broad bilateral ventral areas compared to other stimuli; see Figure 2—figure supplement 3 for an analysis of subject-specific ventral subregions. An accompanying explainer video is provided in Figure 2—video 1.
Overview of the stimuli used for the category localizer.
(A) Average pixel value (left) and average standard deviation across pixels (right) for stimuli within each category (y axis). An ANOVA indicated no significant effect of stimulus category on either the average pixel value or its standard deviation. (B) Average (top) and maximum (bottom) pixel value at each location, for each of the eight visual categories used in the localizer.
Details of the fMRI results in children.
(A) Statistical map associated with the contrast ‘single geometric shape > faces, houses, and tools’, projected on an inflated brain (top: adults; bottom: children; for illustration purposes, we display the uncorrected statistical map at the p < 0.01 level). Notice how similar the activations are in both age groups. (B) Same as A, but for the contrast ‘single geometric shape > all single-object visual categories (face, house, tools, Chinese characters)’. The activation maps are very similar to the previous contrast and very similar across age groups. (C) Whole-brain correlation of the BOLD signal with geometric regularity in children, as measured by the error rate in a previous online intruder detection task (Sablé-Meyer et al., 2021). Positive correlations are shown in red and negative ones in blue. Voxel threshold p < 0.001, no correction for multiple comparisons, but the p-value indicates the only cluster that was significant at the cluster-level corrected p < 0.05 threshold. (D) Results of RSA analysis in children. No cluster was significant at the p < 0.05 level for the geometric feature models; one right-lateralized occipital cluster reached significance for the convolutional neural network (CNN) encoding model (cluster-level corrected p = 0.019), and its symmetrical counterpart was close to the significance threshold (cluster-level corrected p = 0.062).
fMRI response of subject-specific voxels in the ventral visual pathway to geometric shapes and other visual stimuli.
The brain slices show the group-level clusters associated with various contrasts known to elicit a selective response in the ventral visual pathway, in both age groups: VWFA (words > others; green), FFA (faces > others; purple), tool-selective regions of interest (ROIs) (tools > others; red), and PPA (houses > others; light blue). Within each ROI, plots show the mean beta coefficients for the BOLD effect within a subject-specific selection of the 10% best voxels, using independent runs for selection and plotting to avoid circularity and ‘double-dipping’.
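The subject-specific ROI procedure described above can be sketched as follows: voxels are selected from one set of runs and the reported betas come from independent, held-out runs, so that selection and measurement never use the same data. This is a minimal illustration with synthetic values; variable names and the 50/50 run split are assumptions, not the paper's exact implementation.

```python
# Sketch of subject-specific ROI selection avoiding "double-dipping":
# select the top 10% of voxels on selection runs, measure on held-out runs.
import numpy as np

rng = np.random.default_rng(1)
n_voxels = 1000
betas_selection = rng.normal(size=n_voxels)  # contrast betas, selection runs
betas_heldout = rng.normal(size=n_voxels)    # independent runs for plotting

# Keep the 10% of voxels with the strongest contrast in the selection runs.
n_top = n_voxels // 10
top_voxels = np.argsort(betas_selection)[-n_top:]

# Report the mean beta from the held-out runs only, so the estimate is
# not biased by the selection step.
roi_mean = betas_heldout[top_voxels].mean()
print(f"mean beta in top-10% ROI (held-out runs): {roi_mean:.3f}")
```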
Overlap with math-responsive network and comparison with previous findings.
(A) Overlap in red between our geometry contrast in green (shape versus other single objects) and our number contrast in orange (numbers > words), in three slices: two intersecting the bilateral IPS areas (z = 60 and z = 52) and one through the rITG (z = 2), in both populations. To help visualize areas that coincide between populations but did not reach significance in one or the other, the maps are shown uncorrected at p < 0.01. (B) Statistical map from Pinel et al., 2001 showing areas where activation was correlated with numerical distance in a number comparison task, slice at z = 48 (p < 0.001, uncorrected). (C) Statistical map from Amalric and Dehaene, 2016 showing the overlap between three math-related tasks, including high-level mathematical judgments in mathematicians; slice at z = 52. (D) Statistical tests for the ‘shape > other categories’ contrast within the regions of interest (ROIs) identified in independent work (Amalric and Dehaene, 2016), in both populations; all ROIs showed a significant test at the p < 0.05 level except the LpITG.
Explainer video for Figure 2.
Dissociating two neural pathways for the perception of geometric shape.
(A) fMRI intruder task. Participants indicated the location of a deviant shape via left or right button presses. Deviants were generated by moving a corner by a fixed amount in four different directions. (B) Performance inside the fMRI scanner: both populations tested displayed an increasing error rate with geometric shape complexity, which significantly correlates with previous data collected online. (C) Whole-brain correlation of the BOLD signal with geometric regularity in adults, as measured by the error rate in a previous online intruder detection task (Sablé-Meyer et al., 2021). Positive correlations are shown in red and negative ones in blue. Voxel threshold p < 0.001, cluster-corrected by permutation at p < 0.05. Side panels show the activation in two significant regions of interest (ROIs) whose coordinates were identified in adults and where the correlation was also found in children (one-tailed test, corrected for the number of ROIs tested this way). (D) Whole-brain searchlight-based RSA analysis in adults (same statistical thresholds). Colors indicate the model which elicited the cluster: purple for convolutional neural network (CNN) encoding, orange for the geometric feature model, green for their overlap. An accompanying explainer video is provided in Figure 3—video 1.
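The whole-brain regularity correlation in panel C amounts to correlating, at each voxel, the per-shape BOLD responses with a behavioral regularity score. A minimal vectorized sketch with synthetic data (the voxel count, score values, and Pearson correlation are illustrative assumptions; the paper's statistics additionally involve group-level tests and cluster correction):

```python
# Sketch of a per-voxel correlation between BOLD responses to the 11 shapes
# and a behavioral geometric-regularity score (all data synthetic).
import numpy as np

rng = np.random.default_rng(3)
n_shapes, n_voxels = 11, 500
bold = rng.normal(size=(n_voxels, n_shapes))  # per-shape betas per voxel
regularity = rng.random(n_shapes)             # e.g. online error rates

# Pearson correlation of each voxel's 11-shape profile with the score.
bold_c = bold - bold.mean(axis=1, keepdims=True)
reg_c = regularity - regularity.mean()
r = (bold_c @ reg_c) / (np.linalg.norm(bold_c, axis=1) * np.linalg.norm(reg_c))
print("voxel-wise correlations:", r.shape)
```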
Additional models: fMRI.
t-values inside significant clusters at the p < 0.05 level for four models: geometric features (top left), convolutional neural network (CNN) encoding (top right), DINOv2 last layer (bottom left), and skeletal representations from Morfoisse and Izard, 2021 (bottom right). Skeletal representations from Ayzenberg and Lourenco, 2019 did not yield any significant clusters in adults. In children (bottom), only DINOv2 elicited a significant cluster.
Explainer video for Figure 3.
Using magnetoencephalography (MEG) to time the two successive neural codes for geometric shapes.
(A) Task structure: participants passively watch a constant stream of geometric shapes, one per second (presentation time 800 ms). The stimuli are presented in blocks of 30 shapes, identical up to scaling and rotation, with four occasional deviant shapes. Participants have no task to perform besides fixating. (B, C) Performance of a classifier using MEG signals to predict whether the stimulus is a regular shape or an oddball. Left: performance for each shape; middle: correlation with geometric regularity (same x axis as in Figure 3C); right: visualization of the average decoding performance over the cluster. In B, the classifier was trained on MEG signals from all 11 shapes; in C, 11 different classifiers were trained separately, one per shape. (D) Sensor-level temporal RSA analysis. At each timepoint, the 11 × 11 dissimilarity matrix of MEG signals was modeled by the two model representational dissimilarity matrices (RDMs) in Figure 1D, and the graph shows the time course of the corresponding whitened correlation coefficients. Below the time courses, we display the average empirical dissimilarity matrix across participants at two notable timepoints: when the correlations with the convolutional neural network (CNN) and geometric feature models are maximal (CNN: t = 84 ms; geometric features: t = 232 ms). (E) Source-level temporal-spatial searchlight RSA: same analysis as in D, but after reconstruction of cortical source activity. An accompanying explainer video is provided in Figure 4—video 1.
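The temporal RSA in panel D boils down to computing, at each timepoint, an empirical 11 × 11 dissimilarity matrix from the MEG sensor patterns and relating its upper triangle to a model RDM. The sketch below uses synthetic data and a plain Spearman correlation in place of the paper's whitened correlation coefficient; the array sizes and the model RDM are stand-ins.

```python
# Sketch of timepoint-by-timepoint RSA on (synthetic) MEG sensor patterns.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
n_shapes, n_sensors, n_times = 11, 50, 20
meg = rng.normal(size=(n_times, n_shapes, n_sensors))

# Hypothetical model RDM (stand-in for the geometric-feature model RDM),
# stored in condensed form (upper triangle, length 55).
model_rdm = pdist(rng.normal(size=(n_shapes, 5)))

time_course = []
for t in range(n_times):
    emp_rdm = pdist(meg[t])  # condensed pairwise distances between shapes
    rho, _ = spearmanr(emp_rdm, model_rdm)
    time_course.append(rho)
time_course = np.array(time_course)
print("model-fit time course:", time_course.shape)
```

Plotting `time_course` for each model RDM gives curves analogous to panel D.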
Additional models: behavior and magnetoencephalography (MEG).
(A) Correlation between several models and the average representational dissimilarity matrix (RDM) across participants. In particular, we added the last two layers of DINOv2 (Oquab et al., 2023) as well as two different implementations of distances in skeletal spaces (Ayzenberg and Lourenco, 2019; Morfoisse and Izard, 2021). (B) Figure 1D with the symbolic model replaced by the empirical RDM obtained from the last layer of DINOv2 using Euclidean distance. (C) Figure 4C with the same replacement of the symbolic model by the last layer of DINOv2. (D) Figure 4D with the same replacement. (E) Time course of the similarity between empirical RDMs and two additional neural networks: a vision transformer (ViT, top; Dosovitskiy et al., 2020) and a large convolutional neural network (ConvNeXt, bottom; Liu et al., 2022), both with many parameters (~1 billion and ~800 million, respectively) and trained on 2 billion images (Cherti et al., 2023; Schuhmann et al., 2022).
Explainer video for Figure 4.
Videos
Spatiotemporal dynamics of the magnetoencephalography (MEG) data compared to the models.
Searchlight-based timepoint-per-timepoint RSA analysis across shapes of the MEG data. Significant (p < 0.05) sources associated with the convolutional neural network (CNN) model are shown in purple, and significant sources associated with the geometric feature model in orange; overlap is shown in green. Cluster-corrected results are reported in Figure 4.
Tables
Coordinates and characteristics of significant fMRI clusters responding to geometric shapes in localizer runs.
For each age group, each line gives the peak coordinates, volume, and statistics of a cluster with p < 0.05 (whole brain, permutation test) for the contrast ‘single shape > other single visual categories’. The sign of the peak t-value and the shading indicate whether the contrast was positive (white background) or negative (gray background). Coordinates are given in MNI space.
| Age group | X | Y | Z | Peak t-value | Volume in cm³ | Cluster-corrected p-value |
|---|---|---|---|---|---|---|
| Adults | 51.5 | –54.5 | –8.5 | 5.12 | 4.4 | <0.01 |
| | 35.5 | –48.5 | 55.5 | 4.93 | 7.2 | <0.01 |
| | 39.5 | –76.5 | –12.5 | –6.33 | 51.4 | <0.01 |
| | –34.5 | –70.5 | –10.5 | –5.92 | 42 | <0.01 |
| Children | –12.5 | –76.5 | –48.5 | 5.51 | 3.9 | 0.01 |
| | 55.5 | –24.5 | 45.5 | 5 | 8.3 | <0.01 |
| | 21.5 | –80.5 | –46.5 | 4.9 | 4.4 | 0.01 |
| | –42.5 | –38.5 | 35.5 | 4.9 | 6.4 | <0.01 |
| | 21.5 | –94.5 | –8.5 | –6.34 | 40.1 | <0.01 |
| | –24.5 | –98.5 | –8.5 | –6.17 | 55.3 | <0.01 |
Coordinates and characteristics of significant fMRI clusters in the RSA analysis.
Same organization as Appendix 1—table 1, for the RSA analysis.
| Age group | Model | X | Y | Z | Peak t-value | Volume in cm³ | Cluster-corrected p-value |
|---|---|---|---|---|---|---|---|
| Adults | Geometric features | –0.5 | 13.5 | 49.5 | 5.4 | 64.4 | <0.01 |
| | | –22.5 | –54.5 | 51.5 | 5.38 | 13.6 | <0.01 |
| | | –14.5 | –66.5 | 5.5 | 5.28 | 6.1 | <0.01 |
| | | 31.5 | 31.5 | –8.5 | 4.86 | 2.2 | 0.03 |
| | | –26.5 | –2.5 | 49.5 | 4.79 | 5.3 | <0.01 |
| | | –8.5 | –72.5 | –38.5 | 4.62 | 7 | <0.01 |
| | | –34.5 | 23.5 | 5.5 | 4.15 | 3.2 | <0.01 |
| | | –10.5 | 71.5 | 3.5 | 4.03 | 1.9 | 0.03 |
| | | –46.5 | –0.5 | 33.5 | 3.88 | 1.6 | 0.04 |
| | | 21.5 | –98.5 | –6.5 | 3.72 | 2.3 | 0.02 |
| | CNN encoding | 23.5 | –14.5 | 57.5 | 5.4 | 2.4 | 0.03 |
| | | –48.5 | –82.5 | –0.5 | 4.96 | 4.4 | <0.01 |
| | | –30.5 | –80.5 | 25.5 | 4.55 | 2.9 | 0.02 |
| | | 1.5 | 21.5 | 45.5 | 4.54 | 3.8 | <0.01 |
| | | 27.5 | 33.5 | 5.5 | 4.51 | 3.7 | <0.01 |
| | | 45.5 | –80.5 | –2.5 | 4.38 | 2.2 | 0.03 |
| | | 53.5 | 13.5 | 31.5 | 4.05 | 3.9 | <0.01 |
| Children | CNN encoding | –22.5 | –84.5 | 11.5 | 4.59 | 1.4 | 0.06 |
| | | 43.5 | –82.5 | 11.5 | 4.29 | 2.2 | 0.02 |