Cross-movie prediction of individualized functional topography

  1. Guo Jiahui
  2. Ma Feilong
  3. Samuel A Nastase
  4. James V Haxby
  5. M Ida Gobbini (corresponding author)
  1. Center for Cognitive Neuroscience, Dartmouth College, United States
  2. Princeton Neuroscience Institute, Princeton University, United States
  3. Department of Medical and Surgical Sciences (DIMEC), University of Bologna, Italy
  4. IRCCS, Istituto delle Scienze Neurologiche di Bologna, Italy

Abstract

Participant-specific, functionally defined brain areas are usually mapped with functional localizers and estimated by making contrasts between responses to single categories of input. Naturalistic stimuli engage multiple brain systems in parallel, provide more ecologically plausible estimates of real-world statistics, and are friendly to special populations. The current study shows that cortical functional topographies in individual participants can be estimated with high fidelity from naturalistic stimuli. Importantly, we demonstrate that robust, individualized estimates can be obtained even when participants watched different movies, were scanned with different parameters/scanners, and were sampled from different institutes across the world. Our results create a foundation for future studies that allow researchers to estimate a broad range of functional topographies based on naturalistic movies and a normative database, making it possible to integrate high-level cognitive functions across datasets from laboratories worldwide.

Editor's evaluation

This valuable study presents a tool for hyperaligning functional brain topography between individuals, which is based on fMRI connectivity data gathered when participants watched different movies. The tool is validated through strong correlations between functional topographic maps generated from a participant's own localizer data and those derived from other participants' data based on this hyperalignment, even when the training and target participants were drawn from different datasets. The study will potentially be of interest to researchers working with a wide range of fMRI datasets.

https://doi.org/10.7554/eLife.86037.sa0

Introduction

Category-selective functional topographies are a prominent and consistent feature of lateral occipital, ventral temporal, and lateral temporal visual cortices (Downing et al., 2001; Epstein et al., 1999; Grill-Spector and Weiner, 2014; Kanwisher et al., 1997). Category-selective topographies are mostly similar across individuals but are idiosyncratic in terms of their precise conformation and location (Zhen et al., 2015; Zhen et al., 2017). Because of these idiosyncrasies, category-selective topographies and areas are typically mapped in each individual using a functional localizer fMRI scan (Fedorenko et al., 2010; Saxe et al., 2006). Functional localizers map individualized topographies with simple contrasts between responses to different categories, such as contrasting responses to faces versus objects to localize face-selective areas.

We reported an alternative approach to map category-selective topographies using fMRI data collected while participants view a naturalistic movie (Guntupalli et al., 2016; Haxby et al., 2011; Jiahui et al., 2020). With this approach, movie-viewing and functional localizer data are collected in a normative sample, and new participants need only be scanned during movie viewing. Movie data are used to calculate transformation matrices using hyperalignment (Guntupalli et al., 2016; Haxby et al., 2011; Jiahui et al., 2020; Feilong et al., 2018; Feilong et al., 2021; Feilong et al., 2021; Guntupalli et al., 2018) that afford projecting the localizer data from the normative sample into the idiosyncratic cortical topography of new participants. Using this hyperalignment procedure, we can estimate the idiosyncratic details of individual topographies with high fidelity based on localizer data from the normative sample. Unlike functional localizers, naturalistic stimuli (e.g., movies) evoke a rich variety of brain states and engage multiple brain systems in parallel. This makes it possible to efficiently map multiple functional topographies using data from a single movie and avoid the time and cost of running multiple localizers. Compared to controlled localizers, movies better simulate real-world cognition and better engage participants’ attention (Vanderwal et al., 2015; Vanderwal et al., 2017; Vanderwal et al., 2019), contributing to more ecologically valid and higher-quality maps. In addition, movies are more friendly and engaging for special populations, such as young children.

In previous work, we used response hyperalignment (RHA) to predict functional topographies in new participants. RHA requires that all participants watch the same movie to obtain time-locked responses to the same stimuli. It is often important, however, to tailor the movie to meet the specific needs of participants in different experiments. For example, participants from different countries may prefer movies that reflect their diverse backgrounds and are in their native languages (Hanke et al., 2016; Sengupta et al., 2016); movies for infants and young children are differently structured from those for adults (Vanderwal et al., 2015). Thus, it is unrealistic to limit all participants from diverse populations and backgrounds to watch the same movie. Additionally, experimenters may need to shorten or edit the stimuli to fit their data collection schedule. Finally, participants are often scanned with different parameters from one experiment to another, at different institutes across the world, and with different scanner models. Due to these factors, it is impractical to expect two laboratories to acquire the same movie scans across individuals.

Here, we test whether connectivity hyperalignment (CHA) (Guntupalli et al., 2018) can be used to map category-selective functional topographies. CHA, in contrast to RHA, affords calculation of transformation matrices using stimuli that are not the same for normative and index participants. We analyzed four different datasets collected with three different movies, three different scanners, and two different types of functional localizers that used dynamic or static stimuli. We first demonstrated that CHA based on participants’ connectomes that were calculated using their responses to movies was able to generate high-fidelity maps of category-selective topographies within datasets that were equivalent to maps estimated using RHA. Then, critically, we showed that cross-dataset predictions that used connectomes calculated from different movies for the normative and index brains were as good as those from participants in the same dataset. This means that different laboratories can use different movies to derive functional topographies from a normative sample.

In summary, we demonstrate that a target participant’s individualized category-selective topography can be accurately estimated using CHA, regardless of whether different movies are used to calculate the connectome and regardless of other data collection parameters. Movies engage multiple cognitive domains in parallel, such as visual perception, audition, language comprehension, theory of mind, and social interaction. In addition to estimating different functional topographies from a single movie, our approach allows us to estimate topographies from different movies. We provide a novel alternative for future data collection that can save time and money using rich and efficient movie scans.

Results

High-fidelity prediction with CHA

We predicted category-selective topographies by projecting other participants’ functional localizer data into each participant’s native cortical topography using a new, enhanced CHA algorithm. For each participant, we calculated transformation matrices based on functional connectivity estimated during movie viewing in an iterative way (see Materials and methods). These transformation matrices resample fMRI data from others’ brains into a given participant’s cortex. We then projected the functional localizer data for all other participants into the given participant’s native cortical space and calculated independent functional contrasts based on that participant’s own localizer data and based on other participants’ localizer data projected into that participant’s cortex. We also estimated functional topographies by projecting others’ localizer data into that participant’s cortex based on high-performing surface-based anatomical alignment as a control analysis. We calculated the correlations between topographies based on participants’ own localizer contrasts and on other participants’ data. Because the localizer task comprises several scanning runs, we calculated the reliability of the localizer across runs with Cronbach’s alpha to provide an estimate of the noise ceiling for these correlations. We repeated this procedure for all participants.
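
The two summary measures in this procedure, the correlation between an own-localizer map and a predicted map and Cronbach's alpha across localizer runs as a noise ceiling, can be illustrated with a minimal sketch. The data below are synthetic, the function names are hypothetical, and the alpha computation follows the textbook formula with runs treated as items and vertices as observations; the exact implementation used in the study is described in Materials and methods and may differ.

```python
import numpy as np


def cronbach_alpha(run_maps):
    """Cronbach's alpha across localizer runs.

    run_maps: array of shape (n_runs, n_vertices), one contrast map per run.
    Runs play the role of 'items' and vertices the role of 'observations'.
    """
    run_maps = np.asarray(run_maps, dtype=float)
    k = run_maps.shape[0]
    sum_of_run_variances = run_maps.var(axis=1, ddof=1).sum()
    variance_of_summed_map = run_maps.sum(axis=0).var(ddof=1)
    return k / (k - 1) * (1.0 - sum_of_run_variances / variance_of_summed_map)


def map_correlation(map_a, map_b):
    """Pearson correlation between two topographies on the same vertices."""
    return float(np.corrcoef(map_a, map_b)[0, 1])


# Synthetic data only: a shared "true" topography plus independent noise.
rng = np.random.default_rng(0)
true_map = rng.standard_normal(2000)
own_runs = true_map + 0.8 * rng.standard_normal((4, 2000))  # 4 localizer runs
own_map = own_runs.mean(axis=0)                              # own-localizer map
predicted_map = true_map + 0.5 * rng.standard_normal(2000)  # stand-in for a prediction

alpha = cronbach_alpha(own_runs)
r = map_correlation(own_map, predicted_map)
print(f"noise ceiling (alpha) = {alpha:.2f}, prediction r = {r:.2f}")
```

Because the own-localizer map is itself noisy, a prediction correlation that approaches alpha is about as good as the own-localizer data can confirm.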

We tested the estimation of visual category-selective functional topographies (faces, bodies, scenes, and objects) in four different datasets using three different movies, localizers with static or dynamic stimuli, different scanning sequence parameters, and three different scanner models (see Materials and methods).

Category-selective topographies estimated with CHA recovered the idiosyncrasies of individuals’ topographies, capturing fine details of the individual-specific configuration and extent. By contrast, topographies estimated with anatomical alignment generated highly blurred maps that were essentially the same for all participants, losing individual-specific idiosyncratic features (Figure 1A).

Figure 1 (with 1 supplement)
Predicting individual category-selective topographies using connectivity hyperalignment (CHA).

(A) Face-selective topographies (faces-vs-all) and zoomed-in views of an example participant estimated from this participant’s own localizer (Own Localizer), and other participants’ localizers using CHA, and surface anatomical alignment (AA). (B) Scatter plots display the Pearson correlation coefficients between estimated face-selective topographies based on own localizer data and other participants’ localizer data in individual participants in four different datasets. The y-axis corresponds to correlations between each target participant’s own localizer-based face-selective topographies and face-selective topographies estimated from other participants using CHA. The x-axis corresponds to correlations between each target participant’s own localizer-based face-selective topographies and face-selective topographies estimated from other participants with surface-based anatomical alignment. (C) Bar plots show the mean correlations across participants in four datasets (Budapest & Sraiders: n = 20; Forrest: n = 15; Raiders: n = 9. Same sample sizes in other figures for each dataset unless noted.) and for all four category-selective topographies. Black bars stand for the mean Cronbach’s alphas across participants. Error bars indicate ±1 standard error of the mean. Category topographies were defined based on contrasts between the target category and all other categories. (D) Scatter plots of Pearson correlation coefficients using CHA and response hyperalignment (RHA) for individual participants within four different datasets for the face-selective topography. Values on the y-axis stand for correlations between each target participant’s own localizer-based topographies and topographies estimated from other participants in the same dataset using RHA. Values on the x-axis stand for correlations between each target participant’s own localizer-based topographies and topographies estimated from other participants in the same dataset using CHA.

The superior performance of CHA-based estimation over anatomical-alignment-based estimation was consistent across participants, visual stimulus categories, and datasets. In all four category-selective topographies and in all four datasets, correlations between estimations based on hyperalignment and participants’ own localizer data were significantly higher than the correlations between estimations based on anatomical alignment and each participant’s own localizer (Fisher z-transformed, p<0.001, Bonferroni corrected). We compared these correlations between topographies estimated from a participant’s own localizer data and those from other participants’ data to the reliability of the localizer, calculated with Cronbach’s alpha. Predictions made with hyperalignment were close to, and sometimes even exceeded, the reliability values (Figure 1B), which indicates that the category-selective topographies predicted from other participants’ data using hyperalignment were as precise as, and sometimes even more precise than, the topographies estimated with participants’ own localizer data.
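
As an illustration of the statistical comparison, the sketch below Fisher z-transforms per-participant correlations, compares two alignment methods with a paired test, and applies a Bonferroni correction. The text does not name the specific test, so the paired sign-flip permutation test here is an assumption, as are the function names, the synthetic correlations, and the choice of 16 comparisons (four categories in four datasets).

```python
import numpy as np


def fisher_z(r):
    """Fisher z-transform: maps correlations in (-1, 1) onto an unbounded scale."""
    return np.arctanh(np.asarray(r, dtype=float))


def paired_sign_flip_p(z_a, z_b, n_perm=10000, seed=0):
    """Two-sided paired sign-flip permutation test on per-participant differences.

    This test is an assumption made for illustration, not the study's method.
    """
    rng = np.random.default_rng(seed)
    diff = np.asarray(z_a) - np.asarray(z_b)
    observed = abs(diff.mean())
    signs = rng.choice([-1.0, 1.0], size=(n_perm, diff.size))
    null = np.abs((signs * diff).mean(axis=1))
    return (np.sum(null >= observed) + 1) / (n_perm + 1)


def bonferroni(p, n_comparisons):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_comparisons)


# Synthetic per-participant correlations (not values from the study).
rng = np.random.default_rng(1)
r_aa = rng.uniform(0.40, 0.60, size=20)            # anatomical alignment
r_cha = r_aa + rng.uniform(0.10, 0.30, size=20)    # hyperalignment

p_raw = paired_sign_flip_p(fisher_z(r_cha), fisher_z(r_aa))
p_corrected = bonferroni(p_raw, n_comparisons=16)  # hypothetical comparison count
print(f"raw p = {p_raw:.4g}, Bonferroni-corrected p = {p_corrected:.4g}")
```

The z-transform is applied before averaging or testing because raw correlation coefficients are bounded and their sampling distribution is skewed near the bounds.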

Estimates using CHA to calculate transformation matrices were also equivalent to estimates using RHA (Figure 1D). RHA, however, requires that all subjects watch the same movie, whereas CHA can use connectivity matrices derived from responses to different movies, potentially making our new approach more flexible. Next we tested the validity of estimating topographies using transformation matrices that were based on functional connectivities calculated from responses to different movies for the test participant and other participants.

CHA enables cross-movie predictions

Experimental design considerations and constraints can make using the same stimulus across all studies and participants inadvisable, and datasets are often collected under diverse conditions. Here, we aim to test whether connectivity-based hyperalignment can predict category-selective topographies in new individuals even if their connectomes are estimated from data collected while they watched a different movie. Using this method, participants across datasets without matched time-locked functional series can benefit from those who have functional localizer data but were scanned with different naturalistic stimuli.
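
The reason connectivity profiles can substitute for time-locked responses can be sketched with a toy simulation. A connectivity profile has the shape targets by vertices regardless of movie length, so a transformation can be derived between participants who watched different movies. The sketch below is not the enhanced CHA algorithm: it uses a single searchlight, one orthogonal Procrustes step, and a hypothetical generative model in which both participants share a connectivity structure (`weights`) expressed through participant-specific rotations. All names and parameters are assumptions, and the connectivity targets are assumed to correspond across participants.

```python
import numpy as np


def connectivity_profile(vertex_ts, target_ts):
    """Correlate every connectivity target with every searchlight vertex.

    vertex_ts: (n_timepoints, n_vertices); target_ts: (n_timepoints, n_targets).
    Returns an (n_targets, n_vertices) matrix whose shape does not depend on
    the length of the movie.
    """
    v = (vertex_ts - vertex_ts.mean(axis=0)) / vertex_ts.std(axis=0)
    t = (target_ts - target_ts.mean(axis=0)) / target_ts.std(axis=0)
    return t.T @ v / vertex_ts.shape[0]


def procrustes_rotation(source, target):
    """Orthogonal R minimizing ||source @ R - target|| (Frobenius norm)."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


def random_rotation(n, rng):
    """Random orthogonal matrix standing in for an idiosyncratic topography."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q


rng = np.random.default_rng(0)
n_vertices, n_targets = 20, 60
weights = rng.standard_normal((n_targets, n_vertices))  # shared structure (assumed)
rot_src = random_rotation(n_vertices, rng)              # source participant
rot_tgt = random_rotation(n_vertices, rng)              # target participant


def simulate_movie(n_timepoints, rotation):
    """Simulate target and vertex time series for one participant and movie."""
    targets = rng.standard_normal((n_timepoints, n_targets))
    noise = rng.standard_normal((n_timepoints, n_vertices))
    return targets @ weights @ rotation + noise, targets


src_vertices, src_targets = simulate_movie(1000, rot_src)  # "movie A"
tgt_vertices, tgt_targets = simulate_movie(1400, rot_tgt)  # "movie B", different length

R = procrustes_rotation(connectivity_profile(src_vertices, src_targets),
                        connectivity_profile(tgt_vertices, tgt_targets))

# Localizer responses share content but live in each participant's own space.
shared_localizer = rng.standard_normal((200, n_vertices))
src_localizer = shared_localizer @ rot_src + 0.3 * rng.standard_normal((200, n_vertices))
tgt_localizer = shared_localizer @ rot_tgt + 0.3 * rng.standard_normal((200, n_vertices))

r_cha = np.corrcoef((src_localizer @ R).ravel(), tgt_localizer.ravel())[0, 1]
r_unaligned = np.corrcoef(src_localizer.ravel(), tgt_localizer.ravel())[0, 1]
print(f"projected with connectivity-based rotation: r = {r_cha:.2f}")
print(f"no functional alignment:                   r = {r_unaligned:.2f}")
```

In this toy setting the two movies never share a time point, yet the rotation estimated from their connectivity profiles carries the source participant's localizer data into the target participant's space.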

We estimated category-selective topographies for each participant in each dataset from participants in the other dataset that used the same type of localizer (dynamic or static) by calculating transformation matrices based on functional connectivities measured while watching different movies. We also estimated topographies based on anatomical alignment. The cross-movie predictions using CHA outperformed predictions based on anatomical alignment and were nearly as precise as within-movie predictions (Figure 2A). The superior performance was consistent across datasets and categories (p<0.001 for all comparisons, Figure 2B) and in all individual participants (Figure 2—figure supplement 2). Similarly, accuracies of these predictions matched and sometimes even exceeded the reliability measures of their own localizer runs (Figure 2B).

Figure 2 (with 2 supplements)
Predicting category-selective topographies using connectivity profiles across movies.

(A) Scatter plots of Pearson correlation coefficients for individual participants in four different datasets and for four categories. Values on the y-axis stand for correlations between each target participant’s own localizer-based topographies and topographies estimated from other participants in the same movie using connectivity hyperalignment (CHA). Values on the x-axis stand for correlations between each target participant’s own localizer-based topographies and topographies estimated from participants in another dataset based on cross-movie CHA. (B) Bar plots display the mean Pearson correlation coefficients (r) and Cronbach’s alphas across participants in all four datasets for all four categories. Error bars stand for ±1 standard error of the mean. S to B: Sraiders to Budapest, B to S: Budapest to Sraiders, R to F: Raiders to Forrest, F to R: Forrest to Raiders.

Cross-movie predictions of cortical topographies based on different localizer types (static to dynamic or dynamic to static) produced lower correlations than did cross-movie predictions based on the same localizer type (Figure 2—figure supplement 1), consistent with previous reports showing significant differences between topographies estimated by static and dynamic localizers, especially in superior temporal and frontal cortices (Fox et al., 2009; Pitcher et al., 2011).

To demonstrate how hyperalignment increased prediction performance for individual participants from a different dataset, we plotted topographies estimated using hyperalignment and anatomical alignment, as well as from their own localizer runs (Figure 3, Figure 3—figure supplement 1 and Figure 3—figure supplement 2). Topographies between datasets recovered similar idiosyncratic features as the topographies predicted within datasets.

Figure 3 (with 2 supplements)
Sample contrast maps and enlarged views of the ventral temporal cortex.

Contrast maps for face-selective topographies (faces-vs-all) and their zoomed-in views of the ventral temporal cortex were plotted in four sample participants in (A) Budapest, (B) Sraiders, (C) Forrest, and (D) Raiders. In all four subplots, the left-most panel shows faces-vs-all maps plotted on the sample participants’ own cortical surfaces. The two columns to the right display maps estimated from other participants’ data: the first presents face-selective topographies predicted from participants in the same dataset using connectivity hyperalignment (CHA), and the second presents face-selective topographies predicted from participants in another dataset (cross-movie CHA). The zoomed-in panels are displayed alongside the corresponding whole-brain maps. The color bar is the same as that in Figure 1. S to B: Sraiders to Budapest, B to S: Budapest to Sraiders, R to F: Raiders to Forrest, F to R: Forrest to Raiders.

To further examine the topographies predicted using different datasets and compare the prediction performances to reliability measures, we calculated local correlations between maps estimated from each participant’s own localizer runs and those estimated from other participants’ runs with a searchlight analysis. We also calculated Cronbach’s alpha across localizer runs in each searchlight. Generally, searchlights in the high-level visual areas and with strong category selectivity (e.g., ventral temporal cortex, lateral temporal cortex) showed the highest mean correlation values, which often exceeded 0.8 (Figure 4, Figure 4—figure supplement 1, Figure 4—figure supplement 3, and Figure 4—figure supplement 10). The lower mean correlations in other cortices (e.g., sensorimotor cortex) reflect low reliabilities of the localizer runs.
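
A local correlation map of this kind can be sketched as follows. For each vertex, the two maps are correlated over all vertices that fall within the searchlight. The sketch makes simplifying assumptions: it uses Euclidean distance on a flat toy sheet rather than a real cortical surface, the function name and the minimum-vertex threshold are hypothetical, and the 15 mm radius is taken from the Figure 4 caption.

```python
import numpy as np


def searchlight_correlation(coords, map_a, map_b, radius_mm=15.0, min_vertices=3):
    """Correlate two maps within a sphere centered on each vertex.

    coords: (n_vertices, 3) vertex coordinates in mm.
    Returns one local Pearson r per vertex (NaN if too few vertices in range).
    """
    coords = np.asarray(coords, dtype=float)
    local_r = np.full(len(coords), np.nan)
    for i, center in enumerate(coords):
        in_sphere = np.linalg.norm(coords - center, axis=1) <= radius_mm
        if in_sphere.sum() >= min_vertices:
            local_r[i] = np.corrcoef(map_a[in_sphere], map_b[in_sphere])[0, 1]
    return local_r


# Toy flat "cortical sheet": a 21 x 21 grid of vertices with 3 mm spacing.
axis = np.arange(0, 61, 3.0)
xx, yy = np.meshgrid(axis, axis)
coords = np.column_stack([xx.ravel(), yy.ravel(), np.zeros(xx.size)])

# Synthetic maps: an own-localizer map and a noisy prediction of it.
rng = np.random.default_rng(0)
own_map = rng.standard_normal(len(coords))
predicted_map = own_map + 0.5 * rng.standard_normal(len(coords))

local_r = searchlight_correlation(coords, own_map, predicted_map)
print(f"mean local correlation across searchlights = {local_r.mean():.2f}")
```

The same loop applied to per-run maps, with Cronbach's alpha in place of the correlation, would give a local reliability map for comparison.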

Figure 4 (with 10 supplements)
Searchlight analysis of Cronbach’s alphas and prediction performances.

(A, B, C, and D) The left-most column presents Cronbach’s alphas of the own-localizer-based face-selective topographies in each dataset using a searchlight analysis (15 mm radius). The next two columns present local correlations (correlation maps) using the searchlight analysis between face-selective maps estimated from participants’ own localizers and from other participants based on within-movie and between-movie connectivity hyperalignment (CHA) (hyperalignment [HA], top row) and surface alignment (AA, bottom row). Histogram plots present Cronbach’s alphas (dark gray) and coefficients for the correlation maps above (estimated with CHA in color, with AA in light gray). The left and right hemisphere histograms were plotted separately. B to S: Budapest to Sraiders, S to B: Sraiders to Budapest, R to F: Raiders to Forrest, F to R: Forrest to Raiders.

Discussion

In this study, using four datasets that contain three different movies, two different types of functional localizers, and collected with three different scanners, we showed that individualized category-selective topographies can be estimated with high fidelity using CHA. Unlike RHA, which requires the same ‘time-locked’ response time series in the normative sample and new participants, CHA affords the calculation of transformation matrices based on responses to completely different movies. By showing that CHA based on participants’ connectomes calculated using their responses to different movies generated high-fidelity mappings that were as good as those using RHA with participants in the same dataset, we demonstrated that CHA is able to effectively predict topographies across diverse situations. This study opens new possibilities connecting independent public and in-lab datasets for future data analysis so that researchers can derive multiple topographies at once for each individual with excellent performance based on the naturalistic movie data and the localizer data from another normative dataset. Our results also provide a novel alternative for new data collection to take better advantage of naturalistic stimuli.

We used a new, enhanced CHA in this study that optimized our previous CHA algorithm with iterative steps. In each step, transformation matrices to each index brain were calculated from other participants’ brains and the matrices were applied to both the movie and the localizer data. Because using dense connectivity targets (e.g., using all vertices as connectivity targets) with anatomically aligned data often leads to suboptimal alignment across participants (Hanke et al., 2014), we started with coarse connectivity targets and gradually increased the number of connectivity targets to form a denser representation of connectivity profiles. The iterations improved the prediction performance step by step, and at the final step in this analysis (step 6, in which all vertices were used as connectivity targets), the enhanced CHA performed comparably to RHA (Figure 4—figure supplement 4). We also investigated the influence of naturalistic movie length and the size of the training group on the prediction accuracy of individualized functional topographies. By incrementally increasing both the number of movie runs in the training and target datasets and the number of participants in the training group in the Budapest and Sraiders datasets, we observed enhanced prediction accuracy (Figure 4—figure supplement 5). Notably, even with just one movie run in the training or target dataset, or with only five participants in the training group, our prediction performance (Pearson r) ranged from about 0.6 to 0.7. This accuracy significantly outperformed results obtained using surface-based alignment. In addition, this study is based on the new optimized 1-step hyperalignment procedure (Jiahui et al., 2020).
The classic hyperalignment method (2-step) builds a common information model space at the initial step that is based on all normative group participants, then projects information encoded in idiosyncratic representational spaces to the common model space, and lastly projects the information back to the individual participant’s space based on the transpose of the transformation matrices from the former step. Unlike the 2-step method, the 1-step method directly projects the data for each normative sample brain to the index participant’s space without the intermediate step of building a common information model space. This method requires fewer steps and is free from the accumulation of errors across steps. The 1-step method consistently improved the prediction performances across all conditions and datasets (Figure 4—figure supplement 6). This method is particularly useful for estimating information encoded in each individual’s brain space. Our original algorithm is designed to apply transformation matrices to the time series of localizer data of training participants before generating contrast maps. To explore whether directly applying these matrices to pre-calculated contrast maps yields comparable results, we conducted an additional analysis across the four categories. Our findings indicate that the prediction outcomes were indeed quite similar between the two approaches for both the within- and across-dataset predictions (Figure 4—figure supplement 7). However, it is worth noting that the improvements observed with enhanced CHA were not as pronounced when applied directly to the contrast maps as opposed to the time series. In our study, we used fine-scale connectomes, noting that some participants are more similar to the target participant in specific searchlights. It is an interesting question whether predictions could be enhanced by exclusively selecting those more similar participants for the target participant.
To explore this option, we examined a searchlight in the right ventral temporal cortex, roughly at the location of the posterior fusiform area, using the nine participants most similar and the nine participants least similar to each target participant (the top and bottom nine), as measured by their fine-scale connectome similarities in the Budapest dataset. Generally, using all or only a subset of the participants for the prediction generated similar results (Figure 4—figure supplement 8). Compared to using all the participants, using only the top nine participants who are the most similar to the target participant did not significantly improve the prediction (Tukey test, z=–0.09, p=0.996), but using only the bottom nine participants generated significantly lower prediction accuracies (Tukey test, z=2.492, p=0.034). This suggests a trade-off between the number of participants included in the prediction and the similarity of the participants. Future studies are needed to explore the optimal threshold for the number of participants included for each searchlight to refine the algorithm.
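
The mechanics of the participant-selection analysis can be sketched as follows: training participants are ranked by the similarity of their searchlight connectome to the target's, and predictions are averaged over all participants, the top nine, or the bottom nine. Everything in this sketch is synthetic and hypothetical, including the function names and the noise model that ties connectome similarity to prediction quality. It shows only how such a comparison can be set up and does not reproduce the reported results.

```python
import numpy as np


def rank_by_connectome_similarity(target_connectome, training_connectomes):
    """Order training participants from most to least similar to the target."""
    sims = np.array([
        np.corrcoef(target_connectome.ravel(), c.ravel())[0, 1]
        for c in training_connectomes
    ])
    return np.argsort(sims)[::-1], sims


def predict_from(projected_maps, participant_idx):
    """Average the maps projected from the selected training participants."""
    return projected_maps[participant_idx].mean(axis=0)


rng = np.random.default_rng(0)
n_train, n_vertices = 19, 300  # 19 training participants for one target
target_connectome = rng.standard_normal((50, n_vertices))
target_map = rng.standard_normal(n_vertices)

# Assumed noise model: less similar participants also give noisier projections.
noise_sd = np.linspace(0.2, 2.0, n_train)
training_connectomes = [
    target_connectome + sd * rng.standard_normal((50, n_vertices)) for sd in noise_sd
]
projected_maps = np.stack(
    [target_map + sd * rng.standard_normal(n_vertices) for sd in noise_sd]
)

order, sims = rank_by_connectome_similarity(target_connectome, training_connectomes)


def accuracy_of(participant_idx):
    """Correlation between the averaged prediction and the target's map."""
    prediction = predict_from(projected_maps, participant_idx)
    return float(np.corrcoef(prediction, target_map)[0, 1])


accuracy = {
    "all": accuracy_of(order),
    "top 9": accuracy_of(order[:9]),
    "bottom 9": accuracy_of(order[-9:]),
}
for label, value in accuracy.items():
    print(f"{label:>8}: r = {value:.2f}")
```

The outcome of the toy comparison depends entirely on the assumed noise model; with real data, the benefit of averaging over more participants competes with the cost of including less similar ones.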

By leveraging transformation matrices obtained from hyperaligning participants based on movie-viewing data, we successfully mapped these relationships to the training participants’ localizer data, enabling robust predictions. Prior work employing diffusion-weighted imaging has underscored the link between anatomical connectivity and category selectivity across diverse visual fields (Osher et al., 2016; Saygin et al., 2012) and has established a notable congruence between structural and functional connectivities (Hermundstad et al., 2013). These findings suggest that the unique anatomical connectivity patterns of individuals may serve as a foundational mechanism, contributing to the stable fine-scale functional connectome that underpins our approach. The connectivity-based shared response model (cSRM) proposed by Nastase et al., 2020, used connectivity to functionally align individuals, similarly to the CHA algorithm. While both approaches share overarching goals, they diverge considerably in implementation and application. First and most important, cSRM used inter-subject functional connectivity rather than within-subject functional connectivity to initially estimate the connectome. As a result, cSRM requires participants to have time-locked fMRI time series. Therefore, unlike our algorithm, the cSRM approach does not support cross-content applications and is also not suitable for use with resting-state data. Second, cSRM is implemented based on a predefined cortical parcellation rather than the overlapping, regularly spaced cortical searchlights applied in our method, which are not constrained by areal borders. In terms of application, cSRM has mainly been used for ROI analysis rather than for estimating whole-brain topography, which requires broader coverage of the cortex with a searchlight analysis.
Third, our method is specifically designed to work in each individual’s space, while cSRM decomposes data across subjects into shared and subject-specific transformations, focusing on a communal connectivity space. In summary, although cSRM presents a promising alternative for similar aims, its current implementation precludes it from fulfilling the range of applications for which our method is optimized.

The within-movie and cross-movie CHA predictions generated highly similar topographies (Figure 3). This result raises a fascinating question of whether different movie inputs estimate similar fine-grained connectivity profiles in the brain. Previous studies reported that the coarse-grained connectome (based on coarse parcellations) varies across separate cognitive tasks (Shine et al., 2016; Telesford et al., 2016), and that naturalistic movies yield the most condition-specific functional atlases among other classic cognitive tasks (Salehi et al., 2020). In the Budapest and Sraiders datasets, the same group of participants watched the Grand Budapest Hotel and Raiders of the Lost Ark in different sessions in the same 3 T scanner. We built connectivity profiles for each participant separately for the two movies and correlated the two fine-grained connectomes in each searchlight. Results showed that the two fine-grained connectomes based on different movies were very similar in most of the brain regions (r>0.8, Figure 4—figure supplement 9A, B). We split each movie into two halves (Run 1–3/Run 4–5 for Budapest; Run 1–2/Run 3–4 for Sraiders) and averaged the connectome similarities across split halves over searchlights and participants. We found that the across-movie connectome similarities for split halves were high (r>0.74), and the within-movie similarities were even higher in both datasets (r>0.85, Figure 4—figure supplement 9C). Our analysis showed that although the fine-grained connectome was affected by the input naturalistic stimulus content, it was nonetheless highly stable. This result suggests that the brain may undergo shared cognitive processes across different movie free-viewing tasks. It could be because feature films sample a broad range of real-life statistics, and the rich information elicits overall similar representations and connectivities when the entire time series is considered.
Studies comparing movie-viewing and resting-state functional connectivity have shown that both paradigms yield overlapping macroscale cortical organizations (Samara et al., 2023), though naturalistic viewing introduces unique modality-specific hierarchical gradients. However, there remains a gap in research comparing the fine-scaled connectomes of naturalistic and resting-state paradigms. Guntupalli et al., 2018, revealed a shared fine-scale structure that coexists with the coarse-scale structure, and CHA successfully improved intersubject correlations across a wide variety of tasks. Feilong et al., 2021, noted that the fine-scaled connectivity profiles in both resting and task states are highly predictive of general intelligence. This suggests a reliable and biologically relevant fine-scale resting-state connectivity structure among individuals. Therefore, it is plausible that individualized functional topography could be effectively estimated using resting-state functional connectivity, expanding the applicability of our approach. Future studies are needed to explore this direction.
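
The connectome-similarity measure discussed above can be illustrated with a short sketch: a fine-grained connectome is computed separately from two viewing sessions with unrelated time series, and the two connectomes are then correlated. The simulation assumes that the coupling between connectivity targets and searchlight vertices is the same in both sessions, which is the property the reported analysis tests rather than a given. Function names and parameters are hypothetical.

```python
import numpy as np


def fine_grained_connectome(vertex_ts, target_ts):
    """Correlation of each connectivity target with each searchlight vertex."""
    v = (vertex_ts - vertex_ts.mean(axis=0)) / vertex_ts.std(axis=0)
    t = (target_ts - target_ts.mean(axis=0)) / target_ts.std(axis=0)
    return t.T @ v / vertex_ts.shape[0]


def connectome_similarity(connectome_a, connectome_b):
    """Pearson correlation between two vectorized connectomes."""
    return float(np.corrcoef(connectome_a.ravel(), connectome_b.ravel())[0, 1])


rng = np.random.default_rng(0)
n_targets, n_vertices = 30, 50
coupling = rng.standard_normal((n_targets, n_vertices))  # assumed stable across movies


def simulate_viewing(n_timepoints):
    """Simulate one viewing session with its own, unrelated stimulus time course."""
    targets = rng.standard_normal((n_timepoints, n_targets))
    noise = 2.0 * rng.standard_normal((n_timepoints, n_vertices))
    return targets @ coupling + noise, targets


movie_a = fine_grained_connectome(*simulate_viewing(600))  # first movie
movie_b = fine_grained_connectome(*simulate_viewing(800))  # second movie, other length
similarity = connectome_similarity(movie_a, movie_b)
print(f"across-movie connectome similarity: r = {similarity:.2f}")
```

Applied to split halves of real movie data, the same two functions would give within-movie and across-movie similarities of the kind compared in the text; with resting-state data they would provide a first check on whether the connectome is similarly stable without any stimulus.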

The four datasets in our study included two types of category-selective localizers (dynamic and static). The dynamic localizer used short video clips for each category and the traditional static localizer used still images. For all categories, the dynamic localizer elicited stronger and broader category-selective activations than the static localizer, and the searchlight analysis showed that the dynamic localizer had higher reliabilities across the cortex, especially in regions that were selectively responsive to the target category. Due to differences between topographies activated by the dynamic and the static localizers, predictions across localizer types generated lower correlations than those within localizer types. For example, for the face-selective topographies, the dynamic localizer activated more areas than the static localizer (e.g., in superior temporal and frontal cortices). In the ventral temporal cortex, especially in the right hemisphere, both dynamic and static localizers performed well in the cross-localizer-type predictions. But in cortical areas where the static localizer did not match the dynamic localizer, predictions from the same dynamic localizer always outperformed the predictions from a different static localizer (Figure 4—figure supplement 1, Figure 4—figure supplement 3, and Figure 4—figure supplement 10). The low correlations were not because the prediction method failed but reflected the difference in the topographies activated by different types of localizers.

This study successfully illustrated that accurate individualized predictions are both robust and applicable across a variety of conditions, including movie types, languages, scanning parameters, and scanner models. Importantly, the intricate connectivity profiles remain consistent even when participants view entirely different movies, as evidenced by Figure 4—figure supplement 9, reinforcing the prediction’s stability in various scenarios. However, all four datasets in this study only included typical participants with anatomically intact brains. An unanswered question is whether individualized topographies of neuropsychological populations with atypical cortical function (e.g., developmental prosopagnosics) or with lesioned brains (e.g., acquired prosopagnosics) could also be accurately predicted using the hyperalignment-based methods. To our knowledge, no previous study has investigated this question. Beyond neuropsychological groups, it is also valuable to investigate how well the predictions perform across a wide age range, from infants to the elderly. Future research is essential to adapt our algorithms to diverse populations.

In summary, our study demonstrated that accurate predictions of individualized category-selective topographies can be achieved with high fidelity using CHA across different naturalistic movie contents, across different scanners, and across different scanning parameters. Compared to traditional functional localizers, naturalistic stimuli are more ecologically valid, engage multiple cognitive systems in parallel, and are more friendly to participants. Our method can not only be applied directly to current public and in-lab datasets, but also has the potential to allow researchers to derive a broad range of topographies based on naturalistic movies and a normative database in the future. By building such a database that comprises various high-quality topographies and naturalistic stimuli, our study opens the gate to new research possibilities that could integrate high-level cognitive functions across datasets from laboratories worldwide.

Materials and methods

Datasets

The Budapest dataset

The Budapest dataset included 20 participants (mean age 27.2 years, 10 females) for this analysis. These participants were scanned while watching both The Grand Budapest Hotel and Raiders of the Lost Ark and were a subset of the dataset in Jiahui et al., 2020. The Grand Budapest Hotel dataset contained five movie runs (~50 min in total) and four dynamic localizer runs. Before entering the scanner, participants watched the first part of the movie (~45 min) outside the scanner. The rest of the movie was divided into five parts (each lasting 9–13 min), and participants watched each part in a separate run with audio. The dynamic localizer data were collected in a separate scanning session (Pitcher et al., 2011). The localizer comprised four block-design runs (3.9 min each), and each run comprised 10 blocks (18 s each), two per category (faces, bodies, scenes, objects, and scrambled objects). Each block comprised six 3-s-long video clips in random order. Participants did a one-back task during the localizer scan to maintain attention.

All scans in the Grand Budapest Hotel dataset were acquired using a 3 T Siemens Magnetom Prisma MRI scanner with a 32-channel head coil at the Dartmouth Brain Imaging Center. BOLD images were acquired in an interleaved fashion using gradient-echo echo-planar imaging with pre-scan normalization, fat suppression, multiband (i.e., simultaneous multi-slice) acceleration factor of 4 (using blipped CAIPIRINHA), and no in-plane acceleration (i.e., GRAPPA acceleration factor of 1): TR/TE = 1000/33 ms, flip angle = 59°, resolution = 2.5 mm isotropic voxels, matrix size = 96 × 96, FoV = 240 × 240 mm², 52 axial slices with full brain coverage and no gap, anterior-posterior phase encoding. See more details in Visconti di Oleggio Castello et al., 2020.

The Sraiders dataset

The same participants were included for analysis in the Sraiders dataset as in the Budapest dataset. The movie Raiders of the Lost Ark was split into eight parts (~15 min each), and the first four parts were watched outside of the scanner prior to scanning (~56 min). The latter four parts were watched in the scanner (57 min) with audio (Nastase, 2018). The Sraiders dataset and the Budapest dataset shared the same dynamic localizer data. The Sraiders dataset was collected with the same scan protocols as the Budapest dataset (Nastase, 2018; Feilong et al., 2022).

The Forrest dataset

This dataset contains scans from 15 adults (mean age 29.4 years, 6 females). Participants were scanned at the Otto-von-Guericke University in Germany and were native German speakers (Hanke et al., 2016; Sengupta et al., 2016). The dataset is publicly available at http://www.studyforrest.org/ (Hanke et al., 2014). A shortened version of the movie Forrest Gump was divided into eight parts, with each part lasting approximately 15 min. Participants watched each part in a separate run in the scanner with audio (Hanke et al., 2016). A category-selective localizer using still images was included in this dataset. This static localizer comprised four runs (5.2 min each). Each run comprised two 16 s blocks for each of the six categories (human faces, human bodies without heads, small objects, houses, outdoor scenes that include nature and street scenes, and phase-scrambled images). In each block, 16 images from one category were displayed (900 ms display + 100 ms intertrial interval each). Participants were asked to do a one-back task to maintain attention.

Scanning was carried out using a whole-body 3 T Philips Achieva dStream MRI scanner equipped with a 32-channel head coil. Data were collected with gradient-echo, 2 s repetition time (TR), 30 ms echo time (TE), 90° flip angle, 1943 Hz/px bandwidth, and parallel acquisition with sensitivity encoding (SENSE) reduction factor 2. Each volume comprised 35 axial slices with anterior-to-posterior phase-encoding direction that were collected in ascending order, which mostly covered the entire brain. Each slice was 3.0 mm thick with a 10% inter-slice gap, and had a 240 × 240 mm² field-of-view comprising 80 × 80 voxels (3 mm isotropic). More acquisition parameters can be found in Hanke et al., 2016, and Sengupta et al., 2016.

The Raiders dataset

Nine of the original eleven participants (7 men, mean age = 24.8 years) who took part in the face and object study at Dartmouth in Haxby et al., 2011, were included in this dataset. The audio-visual movie Raiders of the Lost Ark was split into eight parts (~15 min each), as in the Sraiders dataset. Participants watched all eight parts in the scanner with audio (one part per run). The Raiders dataset contains a static localizer that was designed similarly to the one in the Forrest dataset.

Brain images were acquired using a 3 T Philips Intera Achieva scanner with an eight-channel head coil at Dartmouth College. For the movie study, whole-brain volumes of 41 3-mm-thick sagittal images (TR = 2.5 s, TE = 35 ms, flip angle = 90°, 80 × 80 matrix, FOV = 240 × 240 mm², resolution = 0.938 × 0.938 × 1.0 mm³) were obtained in an interleaved slice order. For more details, see Haxby et al., 2011.

MRI preprocessing

All datasets were preprocessed with fMRIPrep (Esteban et al., 2019), using version 20.1.1 for the Budapest dataset, 20.2.0 for the Sraiders dataset, 20.1.1 for the Forrest dataset, and 20.1.1 for the Raiders dataset. After fMRIPrep, functional data were projected onto a standard cortical surface aligned to the fsaverage template (Fischl et al., 1999) based on cortical folding patterns. The datasets were further preprocessed following Jiahui et al., 2020; Feilong et al., 2018. The datasets were resampled to a cortical mesh with 18,742 vertices across both hemispheres (approximately 3 mm vertex spacing; 20,484 vertices before removing non-cortical vertices). Six motion parameters and their derivatives, global signal, framewise displacement (Power et al., 2014), six principal components from cerebrospinal fluid and white matter (Behzadi et al., 2007), and polynomial trends up to second order were regressed out from both movie and localizer data for each run independently.
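The nuisance regression described above amounts to an ordinary least-squares fit followed by keeping the residuals. The following is a minimal sketch of that idea, not the authors' pipeline: the function name `regress_out`, the array shapes, and the random stand-in confounds are all assumptions made for illustration.

```python
import numpy as np

def regress_out(data, confounds, poly_order=2):
    """Return residuals of `data` (time x vertices) after removing
    `confounds` (time x regressors) and polynomial trends up to `poly_order`."""
    n_t = data.shape[0]
    # Polynomial trends on a [-1, 1] time axis; order 0 is the intercept.
    t = np.linspace(-1.0, 1.0, n_t)
    trends = np.vander(t, poly_order + 1, increasing=True)
    design = np.column_stack([trends, confounds])
    # Fit all vertex time series at once and keep what the design cannot explain.
    beta, *_ = np.linalg.lstsq(design, data, rcond=None)
    return data - design @ beta

# Toy run: 200 time points, 10 vertices, 5 hypothetical confound regressors.
rng = np.random.default_rng(0)
confounds = rng.standard_normal((200, 5))
signal = rng.standard_normal((200, 10))
data = signal + confounds @ rng.standard_normal((5, 10)) + 3.0
cleaned = regress_out(data, confounds)
```

Applying the function to each run separately mirrors the per-run regression described in the text; the residuals are orthogonal to every confound column and have zero mean.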

Searchlight hyperalignment

CHA (step 1)

Each participant’s connectivity profile was built based on that participant’s movie data. We first defined the connectivity seeds and targets. In this analysis, the connectivity seeds were the same as the surface cortical vertices. The connectivity targets were defined using a sparser cortical surface with 642 vertices in each hemisphere before removing the medial wall. We then centered a 13 mm searchlight on each of these vertices and computed the average time series for the searchlight over vertices from the denser cortical model. The mean time series was assigned to the center vertex to serve as the connectivity target. For each hemisphere, the connectivity profile was calculated as the correlation between the connectivity seeds in this hemisphere and the whole-brain 1175 connectivity targets. The connectivity profile of each participant was normalized to zero mean and unit variance for each connectivity seed before hyperalignment.
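The construction of a connectivity profile can be sketched as follows. This is an illustrative toy example under stated assumptions, not the published code: the searchlights are given as precomputed lists of vertex indices (in practice they would come from the surface geometry), and the names `zscore`, `pearson_matrix`, and `connectivity_profile` are hypothetical.

```python
import numpy as np

def zscore(x, axis=0):
    """Standardize to zero mean and unit variance along `axis`."""
    return (x - x.mean(axis=axis, keepdims=True)) / x.std(axis=axis, keepdims=True)

def pearson_matrix(a, b):
    """Pearson r between every column of `a` and every column of `b`
    (both arrays are time x n)."""
    return zscore(a).T @ zscore(b) / a.shape[0]

def connectivity_profile(vertex_ts, searchlights):
    """vertex_ts: time x vertices; searchlights: list of vertex-index arrays.
    Returns a targets x seeds matrix, normalized per seed."""
    # Target time series: mean over the vertices inside each searchlight.
    target_ts = np.stack(
        [vertex_ts[:, idx].mean(axis=1) for idx in searchlights], axis=1
    )
    # Rows are connectivity targets, columns are seeds (every vertex is a seed).
    profile = pearson_matrix(target_ts, vertex_ts)
    # Normalize each seed's profile (each column) across targets.
    return zscore(profile, axis=0)

# Toy data: 100 time points, 30 vertices, 6 non-overlapping "searchlights".
rng = np.random.default_rng(0)
vertex_ts = rng.standard_normal((100, 30))
searchlights = [np.arange(start, start + 5) for start in range(0, 30, 5)]
profile = connectivity_profile(vertex_ts, searchlights)
```

In the real analysis the profile for a hemisphere would have 1175 rows (targets) and one column per cortical vertex in that hemisphere.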

We used an optimized hyperalignment method that directly transforms one participant’s connectivity profile to another participant’s cortical space, without the interim step of projecting the connectome into a common model space (Jiahui et al., 2020). In detail, for each 15 mm searchlight, a participant’s patterns of connectivity to targets were aligned to another participant’s connectivity patterns using the Procrustes transformation. The transformation matrices from each searchlight in a hemisphere were then aggregated into a single transformation matrix for each pair of participants.
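The Procrustes step within a single searchlight can be illustrated with a few lines of linear algebra. The sketch below assumes both participants have the same number of vertices in the searchlight and solves only the orthogonal Procrustes problem; the function name `procrustes` is hypothetical, and the aggregation of searchlight solutions into a hemisphere-wide matrix is not shown.

```python
import numpy as np

def procrustes(source, target):
    """Orthogonal matrix R that minimizes ||source @ R - target||_F.
    Both inputs are (connectivity targets x searchlight vertices)."""
    u, _, vt = np.linalg.svd(source.T @ target, full_matrices=False)
    return u @ vt

# Toy searchlight: 1175 connectivity targets, 20 vertices.
rng = np.random.default_rng(0)
target = rng.standard_normal((1175, 20))
# Build a "source participant" that carries the same information
# in a rotated vertex space.
true_rot, _ = np.linalg.qr(rng.standard_normal((20, 20)))
source = target @ true_rot.T
R = procrustes(source, target)
aligned = source @ R
```

Because the toy source is an exact rotation of the target, the recovered transformation maps it back onto the target; with real data the fit is approximate.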

Response hyperalignment

RHA was applied with the same steps as the CHA. The only difference is that instead of using connectivity profiles in each searchlight for each participant, we directly used the response pattern of the movie (time points of the movie × vertices in the searchlight) to align a pair of participants. In this method, response patterns in a pair of participants must be from neural responses to the same movie. Due to this restriction, RHA was only applied to participants from the same dataset.

Advanced CHA

Using dense connectivity targets (e.g., using all 18,742 vertices on the surface) with anatomically aligned data usually generates poor functional correspondence across participants (Busch et al., 2021). It is, however, beneficial to include more targets when calculating connectivity patterns after the first iteration of CHA: repeated iterations lead to a better solution by gradually aligning the information at finer scales.

We used six steps to further improve the CHA method. Step 1 was the initial CHA step as described above that was based on the raw anatomically aligned movie data. The resultant transformation matrices were applied to those movie runs, and the hyperaligned data were then used in step 2 to calculate new connectivity patterns and new transformation matrices. We repeated this procedure iteratively for a total of six steps and derived transformation matrices for each step. In steps 1, 2, and 3, 642×2 (icoorder 3, before removing the medial wall) connectivity targets were defined with 13 mm searchlights. In steps 4 and 5, 2562×2 (icoorder 4, before removing the medial wall) connectivity targets were used with 7 mm searchlights to calculate target mean time series. In the final step 6, all 18,742 vertices were included as separate connectivity targets, using each vertex’s time series rather than calculating the mean in a searchlight. Each step of this advanced CHA algorithm increased the prediction performance (Figure 4—figure supplement 2).
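The six-step schedule can be written down as a small configuration plus a driver loop. This is only a sketch under stated assumptions: `ChaStep`, `SCHEDULE`, and `run_schedule` are hypothetical names, the `fit` and `apply` callables stand in for the connectivity and Procrustes computations, and whether each step's transformation is composed with earlier ones is not specified in the text (the sketch simply feeds each step's output into the next).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ChaStep:
    step: int
    n_targets_per_hemi: Optional[int]  # None: every cortical vertex is its own target
    target_radius_mm: Optional[float]  # None: no searchlight averaging of targets

# Target resolution per step, as described in the text.
SCHEDULE = [
    ChaStep(1, 642, 13.0),
    ChaStep(2, 642, 13.0),
    ChaStep(3, 642, 13.0),
    ChaStep(4, 2562, 7.0),
    ChaStep(5, 2562, 7.0),
    ChaStep(6, None, None),
]

def run_schedule(data, fit, apply, schedule=SCHEDULE):
    """Run the steps in order. `fit(data, step)` returns a transformation
    and `apply(transform, data)` returns the newly aligned data."""
    transforms = []
    for step in schedule:
        transform = fit(data, step)
        data = apply(transform, data)
        transforms.append(transform)
    return data, transforms

# Toy usage: the "transformation" is just the step number and
# "applying" it records the step in a list.
final, transforms = run_schedule(
    [], fit=lambda d, s: s.step, apply=lambda t, d: d + [t]
)
```

Keeping the schedule as data makes it easy to try a different number of steps or different target resolutions without touching the loop.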

Predicting individual contrast maps

Estimating contrast maps from each participant’s own localizer data

We estimated each participant’s category-selective maps by calculating the unthresholded GLM univariate contrasts using his/her own localizer data in each run and averaging the t-values across all the localizer runs. We included face-, body-, scene-, and object-selective maps in the analysis. The contrast maps in each category were calculated based on the contrast of the target category vs. all the other categories. For example, the face-selective map was calculated using faces vs. all the other categories in the localizer data (e.g., bodies, objects).
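A target-versus-others contrast and the averaging of t-values across runs can be sketched as below. This is a simplified illustration, not the authors' GLM: the design uses plain boxcar regressors without hemodynamic convolution, the weighting of the other categories (equal weights of -1/(k-1)) is an assumption, and `contrast_weights` and `t_map` are hypothetical names.

```python
import numpy as np

def contrast_weights(categories, target):
    """+1 for the target category and -1/(k-1) for each of the other k-1."""
    n_other = len(categories) - 1
    return np.array([1.0 if c == target else -1.0 / n_other for c in categories])

def t_map(design, data, weights):
    """t-value of the contrast at every vertex.
    design: time x regressors, data: time x vertices."""
    beta, *_ = np.linalg.lstsq(design, data, rcond=None)
    resid = data - design @ beta
    dof = design.shape[0] - np.linalg.matrix_rank(design)
    sigma2 = (resid ** 2).sum(axis=0) / dof
    c_var = weights @ np.linalg.pinv(design.T @ design) @ weights
    return (weights @ beta) / np.sqrt(sigma2 * c_var)

categories = ["faces", "bodies", "scenes", "objects"]
n_t = 100
design = np.zeros((n_t, len(categories) + 1))
for i in range(len(categories)):
    design[i * 20:i * 20 + 10, i] = 1.0  # 10 time points on, then 10 off
design[:, -1] = 1.0                       # intercept

# Three toy vertices: face-selective, body-selective, unresponsive.
true_beta = np.zeros((5, 3))
true_beta[0, 0] = 2.0
true_beta[1, 1] = 2.0

weights = np.append(contrast_weights(categories, "faces"), 0.0)  # 0 for intercept
rng = np.random.default_rng(0)
run_maps = []
for _ in range(4):  # four localizer runs
    data = design @ true_beta + 0.2 * rng.standard_normal((n_t, 3))
    run_maps.append(t_map(design, data, weights))
face_map = np.mean(run_maps, axis=0)  # average t-values across runs
```

In this toy example the face-selective vertex receives a large positive value and the body-selective vertex a negative one, which is the expected behavior for a faces-versus-others contrast.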

Estimating contrast maps from other participants’ localizer data

Transformation matrices from each participant to a target participant derived from hyperalignment were applied to the localizer runs of all other participants to project their localizer data into that target participant’s cortical anatomy. These hyperaligned localizer runs and the anatomical surface-aligned localizer runs were used separately for GLM univariate analysis for each run in each other participant. The resulting t-maps were then averaged across all runs and all other participants to estimate the target participant’s contrast maps for each category.
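The projection-and-averaging procedure can be sketched with toy data. The example below is an assumption-laden illustration: transformations are taken to be (source vertices x target vertices) matrices, random permutations stand in for real hyperalignment solutions, and a simple on-minus-off difference (`on_minus_off`) replaces the GLM contrast. The function name `predict_target_map` is hypothetical.

```python
import numpy as np

def predict_target_map(localizer_runs, transforms, compute_map):
    """Average the maps computed from every run of every other participant
    after projecting the run into the target participant's space.
    localizer_runs: {participant: [time x vertices arrays]}
    transforms: {participant: (vertices x vertices) matrix into target space}
    compute_map: callable turning one run into a map of shape (vertices,)."""
    maps = []
    for pid, runs in localizer_runs.items():
        for run in runs:
            projected = run @ transforms[pid]
            maps.append(compute_map(projected))
    return np.mean(maps, axis=0)

def on_minus_off(run):
    """Stand-in for a GLM contrast: first half of the run minus second half."""
    half = run.shape[0] // 2
    return run[:half].mean(axis=0) - run[half:].mean(axis=0)

rng = np.random.default_rng(0)
n_v, n_t = 50, 20
pattern = rng.standard_normal(n_v)  # target participant's true map
boxcar = np.r_[np.ones(n_t // 2), np.zeros(n_t // 2)]
base_run = np.outer(boxcar, pattern)

localizer_runs, transforms = {}, {}
for pid in ["s1", "s2", "s3"]:
    perm = np.eye(n_v)[rng.permutation(n_v)]  # toy idiosyncratic topography
    localizer_runs[pid] = [base_run @ perm.T, base_run @ perm.T]
    transforms[pid] = perm

predicted = predict_target_map(localizer_runs, transforms, on_minus_off)

# Baseline without functional alignment: identity "transformations".
identity = {pid: np.eye(n_v) for pid in localizer_runs}
anatomical = predict_target_map(localizer_runs, identity, on_minus_off)
```

With the correct transformations the averaged map reproduces the target pattern, whereas the identity baseline blurs the scrambled topographies together.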

In summary, each participant’s category-selective map was estimated based on that target participant’s own localizer data and on all other participants’ localizer data that was projected into that participant’s cortical space using hyperalignment and anatomical surface alignment (see Figure 1—figure supplement 1). After obtaining these estimated maps, we calculated correlations between the target participant’s category-selective maps based on his/her own localizer data and the maps estimated from other participants’ data (hyperaligned or anatomically aligned). We also calculated Cronbach’s alpha values (Jiahui et al., 2020; Feilong et al., 2018; Jiahui et al., 2022) across the multiple runs to measure the reliability of the category-selective maps for each participant and compared the correlations to the reliability values. Cronbach’s alpha calculates the correlation score between localizer-based maps across the runs, and it reflects the amount of noise in maps based on individual localizer runs. Traditionally, the reliability was estimated based on split-half correlations. The common odd/even split measure underestimated reliability and necessitated recalculation of correlations between maps for only half the data to provide valid comparisons. In contrast, Cronbach’s alpha involves all localizer runs and provides a more accurate statistical estimate of the reliability of the topographies estimated with localizer runs. To measure the local estimation performance and compare that to local reliabilities, we calculated correlations and Cronbach’s alphas in searchlights with a radius of 15 mm.
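The two evaluation measures can be computed in a few lines. The sketch below uses the standard Cronbach's alpha formula with localizer runs as items and vertices as observations, which is an assumption about how the statistic is applied here; the simulated maps and the name `cronbach_alpha` are for illustration only.

```python
import numpy as np

def cronbach_alpha(run_maps):
    """Cronbach's alpha across runs. run_maps: runs x vertices."""
    run_maps = np.asarray(run_maps, dtype=float)
    k = run_maps.shape[0]
    item_var = run_maps.var(axis=1, ddof=1).sum()    # sum of per-run variances
    total_var = run_maps.sum(axis=0).var(ddof=1)     # variance of the summed map
    return k / (k - 1) * (1.0 - item_var / total_var)

# Toy data: a shared topography plus independent noise in each of four runs.
rng = np.random.default_rng(0)
signal = rng.standard_normal(1000)
run_maps = signal + rng.standard_normal((4, 1000))
alpha = cronbach_alpha(run_maps)

# Correlation between the participant's own map (mean over runs)
# and a simulated prediction from other participants.
own_map = run_maps.mean(axis=0)
predicted = signal + 0.5 * rng.standard_normal(1000)
r = np.corrcoef(own_map, predicted)[0, 1]
```

For a searchlight analysis, the same two functions would be applied to the subset of vertices inside each 15 mm searchlight.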

Data availability

All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data and materials that support the findings of this study can be found at https://github.com/GUO-Jiahui/CHA_Cross-Movie_Prediction (copy archived at Jiahui, 2023).

The following previously published data sets were used
  1. Oleggio Castello MVD, Chauhan V, Jiahui G, Gobbini MI (2020) An fMRI dataset in response to 'The Grand Budapest Hotel', a socially-rich, naturalistic movie. OpenNeuro, ID ds003017.

References

  1. Nastase SA (2018) The Geometry of Observed Action Representation During Natural Vision. Thesis, Dartmouth College.

Decision letter

  1. Ming Meng
    Reviewing Editor; South China Normal University, China
  2. Chris I Baker
    Senior Editor; National Institute of Mental Health, United States
  3. Ming Meng
    Reviewer; South China Normal University, China
  4. Zonglei Zhen
    Reviewer; Beijing Normal University, China

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Cross-movie prediction of individualized functional topography" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, including Ming Meng as the Reviewing Editor and Reviewer #1, and the evaluation has been overseen by Chris Baker as the Senior Editor. The following individual involved in the review of your submission has agreed to reveal their identity: Zonglei Zhen (Reviewer #2).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

1) Add discussion of the limit of the present hyper alignment approach: for example, to what extent the present hyper alignment approach would be applicable to individuals with atypical functional brain topography such as brain lesion patients with e.g., acquired prosopagnosia? Even in typical populations, while bilateral fusiform face areas can be identified in the majority through functional localizer scans, the left fusiform face area sometimes cannot be found. Moreover, many top-down factors are known to modulate functional brain topography. Due to these factors, brain responses and functional connectivity may be different even when the same subject watched the same movie twice (e.g., Cui et al., 2021).

2) Explain how the length of movie-viewing fMRI may affect the accuracy in predicting the idiosyncratic cortical topography? Similarly, how does the number of participants in the normative database affect the prediction of the category-selective topography? This information is important for the researchers who are interested in using the approach in their studies.

3) The data show that category-selective topography can be accurately estimated using connectivity hyper alignment, regardless of whether different movies are used to calculate the connectome and regardless of other data collection parameters. However, can the functional connectome from resting state fMRI accomplish the same as the movie-watching fMRI? If yes, this would expand the approach to much broader data.

4) The authors averaged the hyper-aligned functional localizer data from all of the subjects to predict individual category-selective topographies. As there is large spatial variability in the functional areas across subjects, averaging the data from many subjects may blur the boundaries of the functional areas. A better solution might be to average those subjects who show highly similar connectome to the target subjects.

5) Add discussion to clarify relations between the present hyperalignment approach and approaches in the literature that address the same question. Specifically, as reviewer #2 pointed out, 'Saygin and her colleagues have demonstrated that structural connectivity fingerprints can predict cortical selectivity for multiple visual categories across cortex (Osher DE et al., 2016, Cerebral Cortex; Saygin et al., 2011, Nat. Neurosci). I think there's a connection between those studies and the current study. If the author can discuss the connection between them, it may help us understand why CHA work so well.' And as reviewer #3 pointed out, 'the authors do not cite a paper that has already successfully demonstrated a functional alignment method that can address exactly this need: a connectivity-based Shared Response Model (cSRM; Nastase et al., 2020, NeuroImage). It would be relevant for the authors to consider the cSRM method in relation to their enhanced CHA method in detail. In particular, both the relative predictive performance as well as associated computational costs would be useful for researchers to understand in considering enhanced CHA for their applications.'

6) Justify the particular six step, iterative approach. That is: why were six steps chosen over any other number? At present, it is not clear if there is an explicit loss function that the authors are minimizing over their iterations. The relative computational cost of six iterations is also likely significant, particularly compared to previous hyperalignment algorithms. A more detailed theoretical understanding of why six iterations are necessary-or if other researchers could adopt a variable number according to the characteristics of their data-would significantly improve the transferability of this method.

7) The existing evaluations for enhanced CHA appear to be entirely based on image-derived correlations. That is, the authors compare the predicted image from CHA with the ground-truth image using correlation. While this provides promising initial evidence, correlation-based measures are often difficult to interpret given their sensitivity to image characteristics such as smoothness. Including Cronbach's α reliability as a baseline does not address this concern, as it is similarly an image-based statistic. It would be useful to see additional predictive experiments using frameworks such as time-segment classification, inter-subject decoding, or encoding models.

8) Make available the code for implementing CHA, or justify why this could not be done at the present.

Reviewer #1 (Recommendations for the authors):

In addition to adding more discussions on the limit of the present hyperalignment approach as I mentioned in the public review section, I would suggest more direct comparisons of the current CHA results and previous RHA results. I.e., perhaps consider moving Figure S2 to the main text?

Reviewer #3 (Recommendations for the authors):

– On L336 of The Raiders Dataset, the authors note that a subset of nine of the original eleven participants are included in the current experiments; however, from the current text it is not obvious why two participants were excluded.

– Please confirm the radius of the searchlights used throughout the experiments. For example, in L361 of Connectivity Hyperalignment (Step One) the searchlight is described as 13mm radius, while on L370 of the same section it is a 15mm radius.

– In Figure S2, I noted the following two typos: In section (A), The second axis description should read "Values on the x-axis stand for correlations between each target participant's own localized-based topographies and topographies from other participants in the same dataset using CHA." In section (B), "Conbach's alphas" should be "Cronbach's alphas."

– In Figure S3, the in-figure legend (e.g., F to B) does not appear to relate to the figure content and is not explained in the figure description.

– It seems that code for implementing CHA is not currently available, as the GitHub repository listed in the Data Availability (but not in-text) does not contain executable code as far as I can tell. This would be particularly useful for other authors hoping to apply this method in their own datasets!

https://doi.org/10.7554/eLife.86037.sa1

Author response

Essential revisions:

1) Add discussion of the limit of the present hyper alignment approach: for example, to what extent the present hyper alignment approach would be applicable to individuals with atypical functional brain topography such as brain lesion patients with e.g., acquired prosopagnosia? Even in typical populations, while bilateral fusiform face areas can be identified in the majority through functional localizer scans, the left fusiform face area sometimes cannot be found. Moreover, many top-down factors are known to modulate functional brain topography. Due to these factors, brain responses and functional connectivity may be different even when the same subject watched the same movie twice (e.g., Cui et al., 2021).

We thank the reviewer for the suggestion and agree that it would be fascinating if the predictions can be made with high fidelity in neuropsychological populations. Although we are optimistic that our algorithm is able to generalize across diverse populations, to date, no previous literature has provided empirical evidence to illustrate the effectiveness, including optimizations and special applications beyond typical brains. Besides the neuropsychological population, it would also be valuable to study the generalization across a broad age range, for example, from infants to the elderly. The brain changes across age both anatomically and functionally, so it is a challenge to predict functional topographies based on a normative group that only includes young participants. With all these potential applications in mind, future research is needed to illustrate the efficacy, build the pipeline, and construct the representative normative groups to meet the requirements of accurate individualized predictions in diverse populations.

In typical populations, participants show great individual variability in their functional topographies. For instance, some participants have distinguishable patches of activation in their left ventral temporal cortex while others do not. Our algorithms successfully captured these individual differences in the predictions. Author response image 1 shows, as an example, two individuals with markedly different face-selective topographies in the left ventral temporal cortex. The participant on the left has prominent face-selective areas in the left ventral temporal cortex that are similar in size to those on the right side, while the participant on the right has only a few small, scattered face-selective spots on the left side. Regardless of what their face-selective areas look like, our algorithm accurately recovered the individualized locations, shapes, and sizes, retaining the individual variability in the functional topographies.

Author response image 1

Functional connectivity profiles based on naturalistic stimuli are very stable across the cortex, even when participants watch different movies. In Figure 4—figure supplement 9, the mean correlations of the fine-scaled connectome for most searchlights (radius = 15 mm) when participants watched The Grand Budapest Hotel and Raiders of the Lost Ark were generally around 0.8. The mean correlations were about 0.9 between the first and second half of the same movie, although the stimulus contents were different between the two halves. Thus, the fine-grained functional connectivity profiles remain highly stable and reliable across movie contents, which contributes to the robustness of our algorithms' predictions across movies, time, and other parameters (e.g., scanner models, scanning parameters).

We added a paragraph in the Discussion section to address the concerns (page 18-19):

“This study successfully illustrated that accurate individualized predictions are both robust and applicable across a variety of conditions, including movie types, languages, scanning parameters, and scanner models. Importantly, the intricate connectivity profiles remain consistent even when participants view entirely different movies, as evidenced by Figure 4—figure supplement 9, reinforcing the prediction's stability in various scenarios. However, all four datasets in this study only included typical participants with anatomically intact brains. An unanswered question is whether individualized topographies of neuropsychological populations with atypical cortical function (e.g., developmental prosopagnosics) or with lesioned brains (e.g., acquired prosopagnosics) could also be accurately predicted using the hyperalignment-based methods. Up to now, as far as we know, no previous literature has investigated this question. Beyond neuropsychological groups, it is also valuable to investigate how well the predictions will be across a wide range of age, from infants to the elderly. Future research is essential to adapt our algorithms to diverse populations.”

2) Explain how the length of movie-viewing fMRI may affect the accuracy in predicting the idiosyncratic cortical topography? Similarly, how does the number of participants in the normative database affect the prediction of the category-selective topography? This information is important for the researchers who are interested in using the approach in their studies.

To investigate the influence of movie-viewing data length and the number of participants in the normative database on prediction performance, we systematically varied these parameters. Specifically, we altered the number of runs utilized in the analysis for both the normative and target data and experimented with varying the number of participants in the normative dataset using the Budapest and the Sraiders datasets. We have included a new Figure 4—figure supplement 5 to present a summary of these findings.

The results reveal that both within-dataset and between-dataset prediction performances are positively correlated with the length of movie-viewing fMRI data used for both the normative and target groups. A similar trend was observed with respect to the number of participants included in the normative dataset. It is important to highlight, though, that even when analyzing as little as one run of movie-viewing data (roughly 10-15 minutes), our hyperalignment-based prediction performance was significantly higher than that achieved using traditional surface alignment. This held true even when the normative dataset included as few as five participants.

In summary, our results show that prediction performance generally improves with longer movie-viewing sessions and larger normative datasets. However, it is noteworthy that even with minimal data—10 minutes of movie-viewing and a small number of participants in the normative dataset—our algorithm still outperforms traditional surface alignment methods significantly.

We also added sentences in the Discussion section (page 15):

“We investigated the influence of naturalistic movie length and the size of the training group on the prediction accuracy of individualized functional topographies. By incrementally increasing both the number of movie runs in the training and target dataset and the participants in the training group in the Budapest and Sraiders dataset, we observed enhanced prediction accuracy (Figure 4—figure supplement 5). Notably, even with just one movie run in the training or target dataset, or with a mere five participants in the training group, our prediction performance (Pearson r) ranged from about 0.6 to 0.7. This accuracy significantly outperformed results obtained using surface-based alignment.”

3) The data show that category-selective topography can be accurately estimated using connectivity hyper alignment, regardless of whether different movies are used to calculate the connectome and regardless of other data collection parameters. However, can the functional connectome from resting state fMRI accomplish the same as the movie-watching fMRI? If yes, this would expand the approach to much broader data.

We agree with the reviewer that demonstrating the applicability of the resting state data will expand the application scenarios of this approach. Most previous findings on resting state connectivity, including the comparison between the naturalistic and the resting state paradigms, focused on the macro-scale similarities and differences (e.g., Samara et al., 2023). Very few studies have investigated the fine-scaled connectome based on resting state data. The study on connectivity hyperalignment (Guntupalli et al., 2018) demonstrated a shared fine-scale connectivity structure among individuals that co-exists with the common coarse-scale structure and built the algorithm to successfully hyperalign individuals to the shared fine-scaled space. Another study from our lab (Feilong et al., 2021) revealed that the fine-scaled connectivity profiles in both resting and task states are highly predictive of general intelligence, indicating reliable and biologically relevant fine-scaled resting state connectome structures. Thus, it is highly plausible that our approach can be generalized to resting state data, generating significantly better predictions of individualized functional topographies than traditional surface alignment. However, the current datasets do not include resting state data, so we could not perform this analysis. We are in the process of collecting new data to explore this hypothesis in future work.

We added sentences to the Discussion section to discuss this idea (page 18):

“Studies comparing movie-viewing and resting state functional connectivity have shown that both paradigms yield overlapping macroscale cortical organizations (29), though naturalistic viewing introduces unique modality-specific hierarchical gradients. However, there remains a gap in research comparing the fine-scaled connectomes of naturalistic and resting state paradigms. Guntupalli and colleagues (14) revealed a shared fine-scale structure that coexists with the coarse-scale structure, and connectivity hyperalignment successfully improved intersubject correlations across a wide variety of tasks. Feilong et al. (13) noted that the fine-scaled connectivity profiles in both resting and task states are highly predictive of general intelligence. This suggests a reliable and biologically relevant fine-scale resting state connectivity structure among individuals. Therefore, it is plausible that individualized functional topography could be effectively estimated using resting state functional connectivity, expanding the applicability of our approach. Future studies are needed to explore this direction.”

4) The authors averaged the hyper-aligned functional localizer data from all of the subjects to predict individual category-selective topographies. As there is large spatial variability in the functional areas across subjects, averaging the data from many subjects may blur the boundaries of the functional areas. A better solution might be to average those subjects who show highly similar connectome to the target subjects.

We appreciate the reviewer’s insightful question about optimizing prediction performance by selecting participants most similar in functional connectivity to the target individuals. This is a promising direction, but also a difficult problem. Our approach uses the fine-scale connectome to hyperalign participants, so different groups of participants may be similar to the target participant in different searchlights. In addition, based on the results discussed in the response to Q2, the more participants included in the normative dataset, the better the prediction performance. Thus, there is a trade-off between the number of participants included in the normative dataset for the prediction and the overall similarity of those participants to the target participant.

To quantitatively explore this idea, we used a searchlight in the right ventral temporal cortex, roughly at the location of posterior fusiform face area (pFFA). We sorted participants by their connectome similarity to each target participant and then examined prediction performance based on either the top nine most similar participants or the bottom nine least similar participants. Our results, presented in Figure 4—figure supplement 8, reveal that hyperalignment consistently outperforms surface alignment regardless of the subset of participants used. Notably, using the nine most similar participants did not significantly alter prediction performance (Tukey Test, z = -0.09, p = 0.996), while using the least similar participants did negatively impact it (Tukey Test, z = 2.492, p = 0.034). Interestingly, the stability of hyperalignment-based predictions remained high even when only a subset of participants was used, contrasting with the variability observed in surface-alignment-based predictions.

Overall, these findings suggest that while selecting functionally similar participants is a promising avenue for future optimization, the process will require nuanced, searchlight-specific criteria. Each searchlight may necessitate its own set of optimal participants to balance between the performance boost from having more participants and the fidelity gained from participant similarity.
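As a rough illustration of the participant-selection step described above, the following minimal sketch ranks training participants by their similarity to a target participant within one searchlight. It is not the code used for the reported analysis. It assumes that each participant's searchlight connectome has been flattened into a vector and that similarity is measured with a Pearson correlation; the function name `rank_by_connectome_similarity` and the toy data are hypothetical.

```python
import numpy as np

def rank_by_connectome_similarity(target_profile, training_profiles):
    """Rank training participants from most to least similar to the target.

    target_profile: 1-D array, flattened connectivity profile of the target
        participant within one searchlight.
    training_profiles: 2-D array (n_participants, n_features) holding the
        same profile for every training participant.
    Similarity is the Pearson correlation between flattened profiles
    (an assumption made for this sketch).
    """
    target_z = (target_profile - target_profile.mean()) / target_profile.std()
    centered = training_profiles - training_profiles.mean(axis=1, keepdims=True)
    train_z = centered / training_profiles.std(axis=1, keepdims=True)
    similarity = train_z @ target_z / target_profile.size
    return np.argsort(-similarity), similarity

rng = np.random.default_rng(0)
target = rng.standard_normal(200)
# Toy training group: 20 participants whose profiles carry decreasing
# amounts of the target's signal plus independent noise.
weights = np.linspace(0.9, 0.0, 20)
training = weights[:, None] * target + rng.standard_normal((20, 200))

order, similarity = rank_by_connectome_similarity(target, training)
top_nine, bottom_nine = order[:9], order[-9:]
```

Because the ranking is computed per searchlight, the two subsets would in general differ from one searchlight to the next, which is the source of the searchlight-specific criteria mentioned above.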

We added the following to the discussion in the manuscript (page 16):

“In our study, we used fine-scale connectomes, noting that some participants are more similar to the target participant in specific searchlights. It is an interesting question whether predictions could be enhanced by exclusively selecting those more similar participants for the target participant. To explore this option, we examined a searchlight in the right ventral temporal cortex that was roughly at the location of the posterior fusiform face area (pFFA), using the nine participants most similar and the nine participants least similar to each target participant, as measured by their fine-scale connectome similarities in the Budapest dataset. Generally, using all or part of the participants for the prediction generated similar results (Figure 4—figure supplement 8). Compared to using all the participants, using only the top nine participants who are the most similar to the target participants did not significantly improve the prediction (Tukey Test, z = -0.09, p = 0.996), but using only the bottom nine participants generated significantly lower prediction accuracies (Tukey Test, z = 2.492, p = 0.034). This suggests a trade-off between the number of participants included in the prediction and the similarity of the participants. Future studies are needed to explore the optimal threshold for the number of participants included for each searchlight to refine the algorithm.”

5) Add discussion to clarify relations between the present hyperalignment approach and approaches in the literature that address the same question. Specifically, as reviewer #2 pointed out, 'Saygin and her colleagues have demonstrated that structural connectivity fingerprints can predict cortical selectivity for multiple visual categories across cortex (Osher DE et al., 2016, Cerebral Cortex; Saygin et al., 2011, Nat. Neurosci). I think there's a connection between those studies and the current study. If the author can discuss the connection between them, it may help us understand why CHA work so well.' And as reviewer #3 pointed out, 'the authors do not cite a paper that has already successfully demonstrated a functional alignment method that can address exactly this need: a connectivity-based Shared Response Model (cSRM; Nastase et al., 2020, NeuroImage). It would be relevant for the authors to consider the cSRM method in relation to their enhanced CHA method in detail. In particular, both the relative predictive performance as well as associated computational costs would be useful for researchers to understand in considering enhanced CHA for their applications.'

We thank the reviewer for raising this point, which gives us the chance to clarify how our approach differs from methods previously reported in the literature. The computational logic underlying our approach is that we derive the transformation matrices between the training and the target participants in the high-dimensional space based on functional connectivity calculated from the movie data. We then apply these transformation matrices to the training participant’s localizer data to accomplish the prediction. Saygin and colleagues, on the other hand, directly used diffusion-weighted imaging (DWI) data and predicted participants’ functional responses based on the anatomical-functional correspondence. They evaluated the prediction by calculating the mean absolute error (AE) of the difference between the actual and predicted contrast responses. Although AE varies linearly with the quality of the prediction, it is difficult to measure how well the shape, size, and location of the functional areas are predicted using this mean value alone. With our algorithm, we were able to predict the general location and size of the areas and recover the individualized shapes, generating more powerful predictions. We also used the searchlight analysis to evaluate the performance across the cortex systematically. In addition, in Osher et al. (2016) and Saygin et al. (2012), a few participants consistently failed to show better predictions based on connectivity than with the group-average method. Our algorithm is more stable, as all participants across all four datasets had better prediction performance using our algorithm than using the group average. However, although we did not directly use the anatomical-functional correspondence with DWI, the relationships between individual structural connectivity and cortical visual category selectivity could be one of the biological underpinnings that contribute to this robust and accurate prediction.
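The computational logic described above can be illustrated with a toy example. The sketch below is not the published implementation. It assumes a single searchlight, uses an orthogonal Procrustes solution as the transformation, and simulates the target participant as a rotated copy of the training participant. All names (`procrustes_rotation`, `conn_train`, `loc_train`, and so on) are hypothetical. It also computes both the mean absolute error and the Pearson correlation between the predicted and actual maps, the two kinds of evaluation contrasted in this response.

```python
import numpy as np

def procrustes_rotation(source, target):
    """Orthogonal matrix R minimizing ||source @ R - target|| (Frobenius norm)."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt

rng = np.random.default_rng(0)
n_targets, n_vertices = 50, 20

# Connectivity profiles (connectivity targets x searchlight vertices)
# estimated from movie data. The target participant is simulated as a
# rotated copy of the training participant (an assumption of this sketch).
conn_train = rng.standard_normal((n_targets, n_vertices))
true_rotation, _ = np.linalg.qr(rng.standard_normal((n_vertices, n_vertices)))
conn_target = conn_train @ true_rotation

# Localizer contrast map of the training participant, and the map of the
# target participant that would normally be unknown.
loc_train = rng.standard_normal(n_vertices)
loc_target = loc_train @ true_rotation

# Derive the transformation from movie connectivity only, then apply it
# to the training participant's localizer map.
rotation = procrustes_rotation(conn_train, conn_target)
loc_predicted = loc_train @ rotation

mean_abs_error = np.mean(np.abs(loc_predicted - loc_target))
pearson_r = np.corrcoef(loc_predicted, loc_target)[0, 1]
```

In this noiseless toy case the rotation is recovered almost exactly. With real data the connectivity profiles are noisy and the prediction is averaged over many training participants, so the correlation would be lower.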

The Connectivity-Based Shared Response Model (cSRM, Nastase et al., 2020) offers an alternative framework for aligning individuals through functional connectivity. While the overarching aims of cSRM and our methodology converge, the two methods differ substantially in implementation and application, and these differences make our approach more suitable for predicting individualized topographies. The most significant difference is that, instead of focusing on within-individual connectivity profiles, cSRM uses inter-subject functional connectivity (ISFC) in the initial step. This design requires all participants to have time-locked time series, making the algorithm unusable for cross-content prediction and incompatible with resting-state data. Our approach, on the other hand, does not require time-locked stimuli, thereby offering a more flexible framework that permits generalization across different types of stimuli and experimental settings and makes it possible to bring together data from laboratories across the world. Secondly, cSRM predominantly focuses on region of interest (ROI) analyses, whereas our model employs searchlight-based analyses designed to comprehensively cover the entire cortical sheet. Whole-brain coverage is needed to generate the topography that reflects the patterns across the cortex. Finally, with the optimized 1-step method, our approach directly hyperaligns the training and target participants, avoiding the accumulation of errors from the intermediate common space. cSRM, with an implementation similar to the classic connectivity hyperalignment, creates a shared information space and hyperaligns all participants to it. In summary, while our approach and cSRM share a similar theoretical foundation, our approach has been specifically optimized to address the challenges and complexities in predicting individualized whole-brain functional topographies. Moreover, our approach demonstrates a remarkable ability to generalize across a variety of contexts and stimuli, offering a significant advantage in dealing with diverse experimental settings and datasets.
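To make the point about time-locking concrete, the following minimal sketch computes a within-participant connectivity profile for one searchlight. It is an illustration under simplifying assumptions, not the code used in the study: the function name `connectivity_profile`, the random data, and the numbers of time points are all hypothetical. Because every correlation is computed within a single participant's own scan, two participants who watched different movies of different lengths still yield matrices of the same shape. ISFC, by contrast, correlates time series across participants and therefore needs matched, time-locked scans.

```python
import numpy as np

def connectivity_profile(vertex_ts, target_ts):
    """Correlate every vertex time series with every target time series.

    vertex_ts: (n_timepoints, n_vertices) responses in one searchlight.
    target_ts: (n_timepoints, n_targets) connectivity-target time series
        from the same participant and the same scan.
    Returns an (n_targets, n_vertices) matrix of Pearson correlations.
    """
    v = (vertex_ts - vertex_ts.mean(axis=0)) / vertex_ts.std(axis=0)
    t = (target_ts - target_ts.mean(axis=0)) / target_ts.std(axis=0)
    return t.T @ v / vertex_ts.shape[0]

rng = np.random.default_rng(0)
n_vertices, n_targets = 30, 12

# Two participants who watched different movies of different lengths
# (hypothetical numbers of time points).
profile_a = connectivity_profile(rng.standard_normal((400, n_vertices)),
                                 rng.standard_normal((400, n_targets)))
profile_b = connectivity_profile(rng.standard_normal((650, n_vertices)),
                                 rng.standard_normal((650, n_targets)))
```

Both `profile_a` and `profile_b` have shape `(n_targets, n_vertices)`, so they can be passed to the same alignment step even though the underlying scans share no common timeline.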

We have added the contents to the Discussion section (page 16-17):

“By leveraging transformation matrices obtained from hyperaligning participants based on movie-viewing data, we successfully mapped these relationships to the training participants’ localizer data, enabling robust predictions. Prior work employing diffusion-weighted imaging (DWI) has underscored the link between anatomical connectivity and category selectivity across diverse visual fields (22, 23) and has established a notable congruence between structural and functional connectivities (24). These findings suggest that the unique anatomical connectivity patterns of individuals may serve as a foundational mechanism, contributing to the stable fine-scale functional connectome that underpins our approach. The connectivity-based Shared Response Model (cSRM) proposed by Nastase and colleagues (25) used connectivity to functionally align individuals similarly to the connectivity hyperalignment algorithm. While both approaches share overarching goals, they diverge considerably in implementation and application. First and most important, cSRM used inter-subject functional connectivity (ISFC) rather than within-subject functional connectivity to initially estimate the connectome. As a result, cSRM requires participants to have time-locked fMRI time series. Therefore, unlike our algorithm, the cSRM approach does not support cross-content applications and also is not suitable for use with resting-state data. Second, cSRM is implemented based on a predefined cortical parcellation rather than the overlapping, regularly-spaced cortical searchlights applied in our method which are not constrained by areal borders. For the application, cSRM has mainly been used to do ROI analysis rather than the estimation of the whole-brain topography that requires broader coverage of the cortex with a searchlight analysis. Third, our method is specifically designed to work in each individual’s space, while cSRM decomposes data across subjects into shared and subject specific transformations, focusing on a communal connectivity space. In summary, although cSRM presents a promising alternative for similar aims, its current implementation precludes it from fulfilling the range of applications for which our method is optimized.”

6) Justify the particular six step, iterative approach. That is: why were six steps chosen over any other number? At present, it is not clear if there is an explicit loss function that the authors are minimizing over their iterations. The relative computational cost of six iterations is also likely significant, particularly compared to previous hyperalignment algorithms. A more detailed theoretical understanding of why six iterations are necessary-or if other researchers could adopt a variable number according to the characteristics of their data-would significantly improve the transferability of this method.

In the advanced connectivity hyperalignment implementation, we gradually increased the number of targets. The six steps were not intentionally chosen but were the result of the increase to the maximum number of fine-grained targets, namely single cortical vertices.

Our datasets were resampled to the cortical mesh with 18,742 vertices across both hemispheres (approximately 3 mm vertex spacing; icoorder 5; 20,484 vertices before removing non-cortical vertices). Step 1 was the classic connectivity hyperalignment implementation based on the anatomically aligned data. Since using dense connectivity targets (e.g., using all 18,742 vertices on the surface) with anatomically aligned data generates poor functional correspondence across participants (Busch et al., 2021), we used 1,284 vertices (icoorder 3, before removing the medial wall) as connectivity targets in step 1. After the first iteration of connectivity hyperalignment, however, it is beneficial to include more targets for calculating connectivity patterns and to repeat the iterations, which leads to a better solution by gradually aligning the information at finer scales. To better align across participants, we iterated the alignment two more times (step 2 and step 3) with the same 1,284 coarse connectivity targets to ensure improved alignment before increasing the number of targets in the later steps. In step 4, we increased the number of targets to 5,124 (icoorder 4, before removing the medial wall), and iterated with this number of vertices twice in total (step 4 and step 5) before using all vertices as targets. In the final step (step 6), all vertices were used as connectivity targets.

It is true that the multiple iteration steps substantially increased the computational cost compared to the classic connectivity hyperalignment, but the improvement in prediction was steady across all datasets, and performance became comparable to that of response hyperalignment, which requires time-locked stimuli. We did not use an explicit loss function in the algorithm, but followed the natural progression of the number of potential connectivity targets in the implementation. On the other hand, the difference between the performance of the improved and the classic connectivity hyperalignment was relatively small (difference of r < 0.05), which indicates the effectiveness of our classic algorithm. Researchers are free to choose the number of iterations and the pace at which the number of targets increases across steps. If computational resources are limited or if a shorter total computational time is the primary priority, using the classic connectivity hyperalignment may be the best option to balance the trade-offs.
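The target counts quoted in this response follow from the standard vertex count of a recursively subdivided icosahedron, 10 × 4^n + 2 vertices per hemisphere at order n. The short sketch below reproduces those counts and lays out the six-step schedule as data. It is only a bookkeeping illustration; the names `icosahedron_vertices`, `SCHEDULE`, and `n_connectivity_targets` are hypothetical, and the searchlight radii are the ones given in the Materials and methods passage.

```python
def icosahedron_vertices(icoorder):
    """Vertices of one hemisphere's icosahedral mesh at a given order."""
    return 10 * 4 ** icoorder + 2

# Both hemispheres at icoorder 5, after removing non-cortical vertices.
N_CORTICAL_VERTICES = 18742

# (step, icoorder of the connectivity targets, searchlight radius in mm used
#  to average target time series). None marks the final step, where every
#  cortical vertex is its own target and no averaging is done.
SCHEDULE = [
    (1, 3, 13), (2, 3, 13), (3, 3, 13),
    (4, 4, 7), (5, 4, 7),
    (6, None, None),
]

def n_connectivity_targets(icoorder):
    """Number of targets across both hemispheres for one step."""
    if icoorder is None:
        return N_CORTICAL_VERTICES
    # Two hemispheres, counted before removing the medial wall.
    return 2 * icosahedron_vertices(icoorder)

for step, icoorder, radius in SCHEDULE:
    print(step, n_connectivity_targets(icoorder), radius)
```

A researcher with limited computational resources could shorten `SCHEDULE`, for example by stopping after the first step, which corresponds to the classic connectivity hyperalignment.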

The Materials and methods section had the details of the implementation (page 22-23):

“Using dense connectivity targets (e.g., using all 18742 vertices on the surface) with anatomically-aligned data usually generates poor functional correspondence across participants (33). It is, however, beneficial to include more targets for calculating connectivity patterns after the first iteration of connectivity hyperalignment and repeated iterations to lead to a better solution by gradually aligning the information at finer scales.

We used six steps to further improve the connectivity hyperalignment method. Step 1 was the initial connectivity hyperalignment step as described above that was based on the raw anatomically aligned movie data. The resultant transformation matrices were applied to those movie runs, and the hyperaligned data were then used in step 2 to calculate new connectivity patterns and calculate new transformation matrices. We repeated this procedure iteratively six times and derived transformation matrices for each step. In steps 1, 2, and 3, 642 × 2 (icoorder 3, before removing the medial wall) connectivity targets were defined with 13 mm searchlights. In steps 4 and 5, 2562 × 2 (icoorder 4, before removing the medial wall) connectivity targets were used with 7 mm searchlights to calculate target mean time series. In the final step 6, all 18742 vertices were included as separate connectivity targets, using each vertex’s time series rather than calculating the mean in a searchlight. Each step of this advanced connectivity hyperalignment algorithm increased the prediction performance (Figure 4—figure supplement 2).”

But to help the readers understand the logic of the advanced connectivity hyperalignment algorithm used in this study, we expanded the Discussion section (page 15):

“Because using dense connectivity targets (e.g., using all vertices as connectivity targets) with anatomically aligned data often leads to suboptimal alignment across participants (33), we started with coarse connectivity targets and gradually increased the number of connectivity targets to form a denser representation of connectivity profiles. The iterations improved the prediction performance step by step, and at the final step in this analysis (step 6, in which all vertices were used as connectivity targets), the enhanced CHA generated comparable performance with RHA (Figure 4—figure supplement 4).”

7) The existing evaluations for enhanced CHA appear to be entirely based on image-derived correlations. That is, the authors compare the predicted image from CHA with the ground-truth image using correlation. While this provides promising initial evidence, correlation-based measures are often difficult to interpret given their sensitivity to image characteristics such as smoothness. Including Cronbach's α reliability as a baseline does not address this concern, as it is similarly an image-based statistic. It would be useful to see additional predictive experiments using frameworks such as time-segment classification, inter-subject decoding, or encoding models.

We appreciate the reviewer’s concern regarding the stability of local correlations in relation to image characteristics. To address this, we conducted an additional analysis using different searchlight sizes (with radii of 10 mm, 15 mm, and 20 mm) to evaluate the predicted category-selective maps, focusing specifically on the Budapest dataset. The local correlations between the predicted category-selective maps (obtained using enhanced CHA) and participants’ own maps based on classic localizer runs were calculated for each searchlight. We averaged these correlations across participants and plotted the resulting maps, as shown in Figure 4—figure supplement 10. Although using a larger searchlight radius is similar to employing a larger smoothing kernel, the results remained relatively stable across different searchlight sizes, particularly in regions selectively responsive to the specific category. This stability suggests that while the evaluation may be influenced by image-related features, the conclusion would remain consistent under varying parameters.
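The searchlight-wise evaluation described above can be sketched on a toy surface. The example below is a simplified illustration rather than the analysis code: it assumes a flat grid of vertices with Euclidean distances instead of a cortical mesh, a single simulated participant, and a hypothetical function name `searchlight_correlations`.

```python
import numpy as np

def searchlight_correlations(coords, predicted, actual, radius):
    """Pearson r between two maps within a disc around every vertex.

    coords: (n_vertices, n_dims) vertex coordinates in mm.
    predicted, actual: 1-D maps with one value per vertex.
    Euclidean distance on a flat toy surface is an assumption of this
    sketch; cortical searchlights are defined on the surface mesh.
    """
    out = np.full(len(coords), np.nan)
    for i, center in enumerate(coords):
        inside = np.linalg.norm(coords - center, axis=1) <= radius
        if inside.sum() > 2:
            out[i] = np.corrcoef(predicted[inside], actual[inside])[0, 1]
    return out

rng = np.random.default_rng(0)
# Toy flat "surface": a 20 x 20 grid with 3 mm vertex spacing.
grid = np.arange(20) * 3.0
coords = np.array([(x, y) for x in grid for y in grid])
actual = rng.standard_normal(len(coords))
predicted = actual + 0.5 * rng.standard_normal(len(coords))

# Mean local correlation for each of the three radii used in the response.
mean_r_by_radius = {
    radius: np.nanmean(searchlight_correlations(coords, predicted, actual, radius))
    for radius in (10, 15, 20)
}
```

In the actual analysis these local correlations would be computed per participant and then averaged across participants before plotting.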

As for the use of enhanced CHA, it serves as an optimized version of the classic CHA, specifically designed for predicting individualized functional topographies. Prediction performance in our study is evaluated on t-value contrast maps for each participant. Given this, it is unclear how time-segment classification or other decoding/encoding models could be appropriately implemented for performance evaluation. However, prior research from our lab has already established the effectiveness of classic CHA. Specifically, Guntupalli et al. (2018) showed that classic CHA significantly improved intersubject correlations (ISC) of connectivity profiles across the cortex. They also revealed that CHA captured fine-scale variations in connectivity profiles for nearby cortical nodes across participants and led to improved between-subject multivariate pattern classification accuracies (bsMVPC) of movie segments. These findings serve as robust evidence for the effectiveness of classic CHA, laying the groundwork for our enhanced CHA approach.

We added Figure 4—figure supplement 10 to the supplementary material.

8) Make available the code for implementing CHA, or justify why this could not be done at the present.

We will make the implementation code available once the article is accepted.

Reviewer #1 (Recommendations for the authors):

In addition to adding more discussions on the limit of the present hyperalignment approach as I mentioned in the public review section, I would suggest more direct comparisons of the current CHA results and previous RHA results. I.e., perhaps consider moving Figure S2 to the main text?

We thank the reviewer for pointing this out. To help readers better understand the comparison between CHA and RHA results, we moved Figure S2A (current Figure 2—figure supplement 1A) to the main text and combined it with Figure 1 as panel D.

The corresponding content in the manuscript is in the Results section (page 4):

“Estimates using CHA to calculate transformation matrices were also equivalent to estimates using RHA (Figure 1D). RHA, however, requires that all subjects watch the same movie, whereas CHA can use connectivity matrices derived from responses to different movies, potentially making our new approach more flexible.”

Reviewer #3 (Recommendations for the authors):

– On L336 of The Raiders Dataset, the authors note that a subset of nine of the original eleven participants are included in the current experiments; however, from the current text it is not obvious why two participants were excluded.

Nine of the eleven original participants have localizer data; thus, only these nine participants were included in this study.

– Please confirm the radius of the searchlights used throughout the experiments. For example, in L361 of Connectivity Hyperalignment (Step One) the searchlight is described as 13mm radius, while on L370 of the same section it is a 15mm radius.

The details are correct. We used slightly smaller sized (13 mm) searchlights when building the connectivity profiles and the common sized (15 mm) searchlights in the functional alignment.

– In Figure S2, I noted the following two typos: In section (A), The second axis description should read "Values on the x-axis stand for correlations between each target participant's own localized-based topographies and topographies from other participants in the same dataset using CHA." In section (B), "Conbach's alphas" should be "Cronbach's alphas."

We made the revision as suggested for section (B). We moved Figure S2A (current Figure 2—figure supplement 1A) to current Figure 1D and kept the figure caption to explicitly describe that the predicted topographies were estimated from other participants in the same dataset.

– In Figure S3, the in-figure legend (e.g., F to B) does not appear to relate to the figure content and is not explained in the figure description.

For each dataset, we included RHA, within-movie CHA, AA, and all possible cross-movie CHA predictions. For each individual participant in the figure, we therefore plotted one colored dot for each of the listed prediction types. We apologize for the confusion and have added an explanation of the legend to the figure caption.

– It seems that code for implementing CHA is not currently available, as the GitHub repository listed in the Data Availability (but not in-text) does not contain executable code as far as I can tell. This would be particularly useful for other authors hoping to apply this method in their own datasets!

We will make the implementation code available once the article is accepted.

References

Feilong, M., Guntupalli, J. S., and Haxby, J. V. (2021). The neural basis of intelligence in fine-grained cortical topographies. eLife, 10, e64058. https://doi.org/10.7554/eLife.64058

Guntupalli, J. S., Feilong, M., and Haxby, J. V. (2018). A computational model of shared fine-scale structure in the human connectome. PLOS Computational Biology, 14(4), e1006120. https://doi.org/10.1371/journal.pcbi.1006120

Guntupalli, J. S., Hanke, M., Halchenko, Y. O., Connolly, A. C., Ramadge, P. J., and Haxby, J. V. (2016). A Model of Representational Spaces in Human Cortex. Cerebral Cortex, 26(6), 2919–2934. https://doi.org/10.1093/cercor/bhw068

Jiahui, G., Feilong, M., Visconti di Oleggio Castello, M., Guntupalli, J. S., Chauhan, V., Haxby, J. V., and Gobbini, M. I. (2020). Predicting individual face-selective topography using naturalistic stimuli. NeuroImage, 216, 116458. https://doi.org/10.1016/j.neuroimage.2019.116458

Jiahui, G., Feilong, M., Visconti di Oleggio Castello, M., Nastase, S. A., Haxby, J. V., and Gobbini, M. I. (2023). Modeling naturalistic face processing in humans with deep convolutional neural networks. Proceedings of the National Academy of Sciences, 120(43), e2304085120. https://doi.org/10.1073/pnas.2304085120

Nastase, S. A., Liu, Y.-F., Hillman, H., Norman, K. A., and Hasson, U. (2020). Leveraging shared connectivity to aggregate heterogeneous datasets into a common response space. NeuroImage, 217, 116865. https://doi.org/10.1016/j.neuroimage.2020.116865

Osher, D. E., Saxe, R. R., Koldewyn, K., Gabrieli, J. D. E., Kanwisher, N., and Saygin, Z. M. (2016). Structural Connectivity Fingerprints Predict Cortical Selectivity for Multiple Visual Categories across Cortex. Cerebral Cortex (New York, NY), 26(4), 1668–1683. https://doi.org/10.1093/cercor/bhu303

Samara, A., Eilbott, J., Margulies, D. S., Xu, T., and Vanderwal, T. (2023). Cortical gradients during naturalistic processing are hierarchical and modality-specific. NeuroImage, 271, 120023. https://doi.org/10.1016/j.neuroimage.2023.120023

Saygin, Z. M., Osher, D. E., Koldewyn, K., Reynolds, G., Gabrieli, J. D. E., and Saxe, R. R. (2012). Anatomical connectivity patterns predict face selectivity in the fusiform gyrus. Nature Neuroscience, 15(2), 321–327. https://doi.org/10.1038/nn.3001

https://doi.org/10.7554/eLife.86037.sa2

Article and author information

Author details

  1. Guo Jiahui

    Center for Cognitive Neuroscience, Dartmouth College, Hanover, United States
    Contribution
    Conceptualization, Formal analysis, Investigation, Methodology, Software, Visualization, Writing – original draft, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-1528-9025
  2. Ma Feilong

    Center for Cognitive Neuroscience, Dartmouth College, Hanover, United States
    Contribution
    Writing – original draft, Writing – review and editing, Investigation, Methodology
    Competing interests
    No competing interests declared
  3. Samuel A Nastase

    Princeton Neuroscience Institute, Princeton University, Princeton, United States
    Contribution
    Writing – original draft, Methodology
    Competing interests
    No competing interests declared
  4. James V Haxby

    Center for Cognitive Neuroscience, Dartmouth College, Hanover, United States
    Contribution
    Conceptualization, Resources, Supervision, Funding acquisition, Investigation, Visualization, Methodology
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-6558-3118
  5. M Ida Gobbini

    1. Department of Medical and Surgical Sciences (DIMEC), University of Bologna, Bologna, Italy
    2. IRCCS, Istituto delle Scienze Neurologiche di Bologna, Bologna, Italy
    Contribution
    Conceptualization, Resources, Supervision, Funding acquisition, Investigation, Visualization, Methodology
    For correspondence
    mariaida.gobbini@unibo.it
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-6727-7934

Funding

National Science Foundation (1607845)

  • James V Haxby

National Science Foundation (1835200)

  • M Ida Gobbini

National Institute of Mental Health (MH127199)

  • James V Haxby

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This work was supported by NSF grants 1607845 (JVH) and 1835200 (MIG), and NIH grant R01 MH127199 (JVH and MIG).

Ethics

All participants gave their written informed consent to participate in the study. Data collection of the Forrest dataset was approved by the Ethics Committee of Otto-von-Guericke University (approval reference 37/13). Data collection of the other datasets (Raiders, Budapest, SRaiders) was approved by the Dartmouth Committee for the Protection of Human Subjects.

Senior Editor

  1. Chris I Baker, National Institute of Mental Health, United States

Reviewing Editor

  1. Ming Meng, South China Normal University, China

Reviewers

  1. Ming Meng, South China Normal University, China
  2. Zonglei Zhen, Beijing Normal University, China

Version history

  1. Preprint posted: November 22, 2022
  2. Received: January 9, 2023
  3. Accepted: November 9, 2023
  4. Version of Record published: November 23, 2023 (version 1)

Copyright

© 2023, Jiahui et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. Guo Jiahui
  2. Ma Feilong
  3. Samuel A Nastase
  4. James V Haxby
  5. M Ida Gobbini
(2023)
Cross-movie prediction of individualized functional topography
eLife 12:e86037.
https://doi.org/10.7554/eLife.86037
