A partially nested cortical hierarchy of neural states underlies event segmentation in the human brain

  1. Linda Geerligs (corresponding author)
  2. Dora Gözükara
  3. Djamari Oetringer
  4. Karen L Campbell
  5. Marcel van Gerven
  6. Umut Güçlü
  1. Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Netherlands
  2. Department of Psychology, Brock University, Canada

Abstract

A fundamental aspect of human experience is that it is segmented into discrete events. This may be underpinned by transitions between distinct neural states. Using an innovative data-driven state segmentation method, we investigate how neural states are organized across the cortical hierarchy and where in the cortex neural state boundaries and perceived event boundaries overlap. Our results show that neural state boundaries are organized in a temporal cortical hierarchy, with short states in primary sensory regions, and long states in lateral and medial prefrontal cortex. State boundaries are shared within and between groups of brain regions that resemble well-known functional networks. Perceived event boundaries overlap with neural state boundaries across large parts of the cortical hierarchy, particularly when those state boundaries demarcate a strong transition or are shared between brain regions. Taken together, these findings suggest that a partially nested cortical hierarchy of neural states forms the basis of event segmentation.

Editor's evaluation

This article addresses the question of how the brain segments naturalistic events and the relationship between perceived event boundaries and neural pattern shifts. By applying an innovative analysis to a large, publicly available dataset, the authors observe evidence of different timescales of neural state shifts that correspond with perceived event bounds. These results will be of interest to cognitive neuroscientists investigating the relationship between neural states and event segmentation.

https://doi.org/10.7554/eLife.77430.sa0

Introduction

Segmentation of information into meaningful units is a fundamental feature of our conscious experience in real-life contexts. Spatial information processing is characterized by segmenting spatial regions into objects (e.g., DiCarlo and Cox, 2007). In a similar way, temporal information processing is characterized by segmenting our ongoing experience into separate events (Kurby and Zacks, 2008; Newtson et al., 1977). Segmentation improves our understanding of ongoing perceptual input (Zacks et al., 2001a) and allows us to recall distinct events from our past (Flores et al., 2017; Sargent et al., 2013; Zacks et al., 2006). Recent work has shown that the end of an event triggers an evoked response in the hippocampus (Baldassano et al., 2017; Ben-Yakov and Henson, 2018), suggesting that events form the basis of long-term memory representations. Events that are identified in written, auditory, and audiovisual narratives (movies) are often very similar across individuals and can be segmented hierarchically on different timescales (Newtson and Rindner, 1979; Zacks et al., 2001a). According to event segmentation theory (EST), perceptual systems spontaneously segment activity into meaningful events as a side effect of predicting future information (Zacks et al., 2007). That is, event boundaries are perceived when predictions become less accurate, which can be due to a change in motion or features of the situation such as characters, causes, goals, and spatial location (Zacks et al., 2009). However, event boundaries are observed even when a change is predictable, suggesting that other mechanisms play a role (Pettijohn and Radvansky, 2016). One proposal is that experiences are grouped into categories (or event types), which we have learned previously. When a new event type is detected, an event boundary occurs (Shin and DuBrow, 2021).

While much is known about temporal event segmentation at a behavioral level, less is known about its neural underpinnings. A number of studies have investigated which brain regions show evoked responses around event boundaries. Although the exact regions vary across studies, commonly identified regions include the precuneus and medial visual cortex, as well as area V5 and the intraparietal sulcus (Kurby and Zacks, 2018; Speer et al., 2007; Zacks et al., 2001b; Zacks et al., 2010). Increased brain responses at event boundaries in these regions likely reflect updating processes that occur when shifting to a new event model (Ezzyat and Davachi, 2011). Recently, a different approach has been introduced to investigate the neural underpinnings of event segmentation (Baldassano et al., 2017). These authors applied a data-driven method, based on hidden Markov models (HMMs), to functional magnetic resonance imaging (fMRI) data obtained during movie watching to identify timepoints where brain activity in a particular region transitioned from one temporarily stable activity pattern to a different pattern. We refer to these periods of relative stability as neural states to distinguish them from subjectively perceived events (Geerligs et al., 2021; Zacks et al., 2007).

These neural states occur on different timescales across the cortical hierarchy, with short-lived states in early sensory regions and long-lasting states in higher-level regions such as the precuneus and angular gyrus (Baldassano et al., 2017), in line with previous observations of a temporal hierarchy of information processing in the brain (Hasson et al., 2008; Honey et al., 2012; Lerner et al., 2011; Stephens et al., 2013). Interestingly, for a set of four brain regions, Baldassano et al., 2017 showed that neural state boundaries overlapped across different regions and with subjectively experienced event boundaries. These results suggest that neural state segmentation could be the source of perceived event boundaries and that states may be organized in a nested cortical hierarchy, such that the boundaries of faster states in regions lower in the cortical hierarchy are nested within the boundaries of slower regions higher up in the hierarchy. In such a nested hierarchy, each brain region integrates information within discretized neural states that may align with sensory units in the external environment (e.g., phonemes, words, sentences) and provide its output to the brain regions higher in the cortical hierarchy (Nelson et al., 2017), until neural states at the highest level of the hierarchy align with subjectively experienced events. This way information traveling up the hierarchy is gradually integrated into complex and long-lasting multimodal representations (Hasson et al., 2015). Although this is an intriguing hypothesis, the evidence for it is limited, as it has only been investigated in one previous study using four predefined regions of interest (Baldassano et al., 2017). In addition, it remains unknown which brain regions show neural state boundaries that align with perceived event boundaries and the temporal hierarchy of state segmentation remains unexplored across large parts of the cortex.

This study had two main aims. First, to investigate whether event segmentation is indeed underpinned by neural state segmentation occurring in a nested cortical hierarchy. Second, to characterize the temporal hierarchy of neural state segmentation across the entire cortex. If the brain segments ongoing input in a nested hierarchical fashion, we would expect to find especially long-lasting neural states in the frontal cortex, which is often considered the top of the cortical hierarchy (Fuster, 2001). We would also expect to find overlap between neural state boundaries and event boundaries across all levels of the cortical hierarchy, although this overlap should be most consistent for areas at higher levels of the hierarchy where the timescales of neural state segmentation should closely match the experienced timescale of events. Finally, we would expect that state boundaries are most strongly shared between groups of brain regions involved in similar cognitive functions (i.e., networks) and to a lesser extent between more distinct sets of brain areas.

To test these hypotheses, we used a novel data-driven state segmentation method that was specifically designed to reliably detect boundaries between distinct neural states (Geerligs et al., 2021). By using a large movie fMRI dataset from the Cam-CAN project (Shafto et al., 2014) that shows reliable stimulus-driven activity (i.e., significant inter-subject correlations) over nearly all cortical brain regions (Campbell et al., 2015; Geerligs and Campbell, 2018), we were able to study neural state segmentation across the entire cortex for the first time. In comparison to previous work, we investigate state segmentation in a more focused and extensive way by identifying the degree of change moving from one neural state to the next (i.e., boundary strength) and examining relationships between neural state boundaries across functional networks.

Results

To identify neural state boundaries, we applied an improved version of the greedy state boundary search (GSBS; Geerligs et al., 2021) to a large fMRI dataset in which 265 participants (aged 18–50 years) viewed an 8 min Alfred Hitchcock movie (Shafto et al., 2014). After hyperaligning the data (Guntupalli et al., 2016) and hemodynamic deconvolution, GSBS was applied to multi-voxel brain activity time courses from overlapping spherical searchlights covering the entire cortex. GSBS identifies a set of neural state boundaries for each searchlight and for different numbers of states (k). GSBS then uses the t-distance metric to identify the optimal number of state boundaries in each searchlight (Geerligs et al., 2021). This metric identifies the optimal number of state boundaries such that the Pearson correlations (across voxels) of timepoints within a state are maximized and correlations of timepoints in consecutive states are minimized. To optimize the validity and reliability of the neural states detected by GSBS, we improved the algorithm in several ways, as shown in the ‘Supplementary methods’ section in Appendix 1. Searchlights in which we were unable to identify reliable neural state boundaries were excluded from further analysis (see Appendix 1—figure 5). Searchlight-level results were projected to the voxel level by averaging results across overlapping searchlights.
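
The idea behind scoring a candidate segmentation can be illustrated with a small sketch. It is not the GSBS implementation or the exact t-distance of Geerligs et al., 2021; it assumes a Welch-style t-statistic that contrasts correlations between timepoints within the same state against correlations between timepoints in consecutive states. The function names and the convention that a 1 in the boundary vector marks the first timepoint of a new state are hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def state_labels(boundaries):
    """Turn a 0/1 boundary vector into one state index per timepoint.

    Assumed convention: boundaries[t] == 1 means a new state starts at t.
    """
    labels, current = [], 0
    for t, b in enumerate(boundaries):
        if b and t > 0:
            current += 1
        labels.append(current)
    return labels


def separation_score(data, boundaries):
    """Welch-style t-statistic (an assumption, not the published metric).

    data: list of timepoints, each a list of voxel values.
    Higher scores mean that patterns are similar within states and
    dissimilar across consecutive states.
    """
    labels = state_labels(boundaries)
    within, between = [], []
    for i in range(len(data)):
        for j in range(i + 1, len(data)):
            r = pearson(data[i], data[j])
            if labels[i] == labels[j]:
                within.append(r)
            elif labels[j] - labels[i] == 1:
                between.append(r)
    se = sqrt(variance(within) / len(within) + variance(between) / len(between))
    return (mean(within) - mean(between)) / se


# Toy data: three timepoints around one pattern, then three around another.
toy = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.0, 2.9, 4.2],
    [0.9, 2.2, 3.0, 3.9],
    [4.0, 3.0, 2.0, 1.0],
    [4.1, 2.9, 2.2, 1.0],
    [3.9, 3.1, 1.9, 1.2],
]
good = separation_score(toy, [0, 0, 0, 1, 0, 0])     # boundary at the true change
shifted = separation_score(toy, [0, 0, 1, 0, 0, 0])  # boundary one timepoint early
```

In this toy case the segmentation with the boundary at the true pattern change scores higher than the shifted one, which is the property a search over the number of states (k) would exploit.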

The median duration of neural states differed greatly between brain regions, ranging from 4.5 s in the voxels with shortest states up to 27.2 s in the voxels with the longest states (see Figure 1A). Most voxels showed median state durations between 5.1 and 18.5 s per state. To determine whether regional differences in state duration were reliable, neural state boundaries were identified in two independent groups of participants. At the voxel level, there was a very high Pearson correlation between the median state durations of the two groups (r = 0.85; see Figure 1A). This correlation was lower when we computed it at the level of searchlights (i.e., before projecting to the voxel level; r = 0.62). This suggests that regional differences in neural state timescales are highly reliable across participant groups and that the variability present in specific searchlights can be reduced substantially by averaging across overlapping searchlights. The timing of neural state boundaries was not associated with head motion (see ‘Supplementary results’ in Appendix 1).

The cortical hierarchy of neural state durations.

(A) The optimal number of states varied greatly across regions, with many shorter states in the primary visual, auditory, and sensorimotor cortices and few longer states in the association cortex, such as the medial and lateral prefrontal gyrus. These results are highly consistent across two independent groups of participants. Parts of the correlation matrices for two selected searchlights are shown in the insets for each of the groups, representing approximately 1.6 min of the movie. The white lines in these insets are the neural state boundaries that are detected by greedy state boundary search (GSBS). (B) The variability in state durations, as quantified by the interquartile range (IQR)/median state duration, was particularly high in the middle and superior temporal gyri and the anterior insula.

Figure 1A shows that there were particularly short neural states in visual cortex, early auditory cortex, and somatosensory cortex. State transitions were less frequent in areas further up the cortical hierarchy, such as the angular gyrus, areas in posterior middle/inferior temporal cortex, precuneus, the anterior temporal pole, and anterior insula. Particularly long-lasting states were observed in high-level regions such as the medial prefrontal gyrus and anterior portions of the lateral prefrontal cortex, particularly in the left hemisphere.

We also investigated how the variability of state durations differed across the cortex. Because variability of state duration, as measured by the interquartile range (IQR), tends to increase as the median state duration increases, we used a nonparametric alternative to the coefficient of variation (IQR divided by the median). We found very pronounced variability in state durations in the middle and superior temporal gyri and the anterior insula, while the variability was consistently lower in all other cortical areas. This effect was highly reliable across the two independent groups of participants (r = 0.84 at voxel level; r = 0.54 at the searchlight level).
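
As a minimal sketch of this variability measure, the snippet below derives state durations from a list of boundary timepoints and divides their IQR by their median. The function names, the repetition time `tr`, and the example values are hypothetical, and the "inclusive" quantile method is an assumption; the text does not specify how quartiles were computed.

```python
from statistics import median, quantiles


def state_durations(boundary_times, n_timepoints, tr):
    """Durations in seconds of the states delimited by the boundaries.

    boundary_times: sorted timepoint indices at which a new state starts.
    tr: sampling interval in seconds (hypothetical value in the example).
    """
    edges = [0] + list(boundary_times) + [n_timepoints]
    return [(b - a) * tr for a, b in zip(edges[:-1], edges[1:])]


def relative_variability(durations):
    """Nonparametric analogue of the coefficient of variation: IQR / median."""
    q1, _, q3 = quantiles(durations, n=4, method="inclusive")
    return (q3 - q1) / median(durations)


durations = state_durations([2, 5, 6, 10], n_timepoints=12, tr=2.0)
variability = relative_variability(durations)
```

Dividing by the median makes the measure comparable between regions with short and long states, which is the motivation given above.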

Neural states and perceived event boundaries

In a nested cortical hierarchy, some boundaries at lower levels of the cortical hierarchy are thought to propagate to higher levels of the hierarchy until they are consciously experienced as event boundaries. Therefore, we would expect state boundaries to align with perceived event boundaries across all of the different levels of the hierarchy. The most consistent alignment would be expected in higher-level cortical areas where the number of states should more closely align with the number of perceived events. Event boundaries were determined by asking participants to indicate when they felt one event (meaningful unit) ended and another began (Ben-Yakov and Henson, 2018). To determine the similarity between neural state boundaries and perceived event boundaries, we computed the absolute boundary overlap. This is defined as the number of timepoints where neural state and perceived event boundaries overlapped, scaled such that the measure is one when all neural state boundaries align with a perceived event boundary and zero when the overlap is equal to the overlap that would be expected given the number of boundaries.
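
The scaling described above can be sketched as follows. The paragraph gives only the two anchor points of the measure, so the expected overlap used here (number of state boundaries times number of event boundaries, divided by the number of timepoints, i.e., assuming independent boundary placement) is an assumption, as are the function name and the binary vector representation.

```python
def absolute_overlap(state_bounds, event_bounds):
    """Absolute boundary overlap between two 0/1 boundary vectors.

    1.0 when every neural state boundary coincides with an event boundary,
    0.0 when the observed overlap equals the chance-level expectation.
    """
    n = len(state_bounds)
    n_state = sum(state_bounds)
    n_event = sum(event_bounds)
    observed = sum(s * e for s, e in zip(state_bounds, event_bounds))
    # Assumed chance level: boundaries placed independently of each other.
    expected = n_state * n_event / n
    return (observed - expected) / (n_state - expected)


states = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]   # three neural state boundaries
events = [0, 1, 0, 0, 0, 0, 0, 0, 1, 0]   # two perceived event boundaries
partial = absolute_overlap(states, events)
perfect = absolute_overlap(events, events)
chance = absolute_overlap([1, 1, 0, 0], [1, 0, 1, 0])
```

Because the denominator depends on the number of neural state boundaries, a region with many boundaries cannot reach a value of one unless there are at least as many event boundaries.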

We found that a large number of brain regions, throughout the cortical hierarchy, showed significant absolute boundary overlap between neural states and perceived event boundaries after false discovery rate (FDR) correction for multiple comparisons (see Figure 2A). In particular, we observed that the anterior cingulate cortex, dorsal medial prefrontal cortex, left superior and middle frontal gyrus, and anterior insula show the strongest absolute overlap between neural state boundaries and perceived boundaries. This suggests that neural state boundaries in these regions are most likely to underlie the experience of an event boundary.

Overlap between neural state and event boundaries.

(A) Absolute boundary overlap between neural state boundaries and the perceived event boundaries identified by a different group of participants outside the scanner. The metric is scaled between zero (expected overlap) and one (all neural state boundaries overlap with an event boundary). The medial prefrontal cortex, anterior insula, anterior cingulate cortex, and left superior and middle frontal gyrus show the strongest alignment between neural state boundaries and perceived event boundaries. (B) Relative overlap between neural state boundaries and perceived event boundaries (scaled w.r.t. the maximal possible overlap given the number of neural state boundaries). Regions in different parts of the cortical hierarchy (early and late) show a significant association between the neural state boundaries and perceived event boundaries. (C) Increase in absolute overlap between neural state boundaries and event boundaries when neural state boundaries are weighted by their strengths, in comparison to using binary boundaries (as in A). Regions with strong neural state boundaries (i.e., a large change between successive states) were more likely to overlap with perceived event boundaries than weak boundaries. Statistical analyses for (A–C) were based on data from 15 independent groups of participants, the depicted difference/overlap values were based on data in which all 265 participants were averaged together. All of the colored regions showed a significant association after false discovery rate (FDR) correction for multiple comparisons.

The absolute boundary overlap is partly driven by regional differences in the number of neural state boundaries. However, our hypothesis of a nested cortical hierarchy suggests that regions in early stages of the cortical hierarchy, with many neural state boundaries, should also show overlap between neural state and perceived event boundaries. To correct for regional differences in the possibility for overlap, due to the differing number of neural state boundaries in a region, we computed the relative boundary overlap. The relative boundary overlap is scaled by the maximum possible overlap given the number of state and event boundaries. It is one when all perceived event boundaries coincide with a neural state boundary (even if there are many more neural states than events) or when all neural state boundaries coincide with an event boundary. A value of zero indicates that the overlap is equal to the expected overlap. This metric gives a different pattern of results, showing that regions across different levels of the cortical hierarchy (early and late) have strong overlap between neural states and perceived event boundaries (see Figure 2B). Regions early in the cortical hierarchy, such as the medial visual cortex, the medial and superior temporal gyri, and the postcentral gyrus, show strong relative overlap. The same is true for regions later in the hierarchy, including the anterior insula, most areas of the default mode network (DMN), including the medial frontal gyrus, anterior parts of the precuneus and the angular gyrus, and large parts of the lateral frontal cortex. These results suggest that there is overlap between event and neural state boundaries throughout the cortical hierarchy.
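
A hedged sketch of the relative measure follows. It differs from the absolute measure only in the denominator, which uses the maximum possible overlap (the smaller of the two boundary counts). As before, the chance-level term assumes independent boundary placement, and the names are hypothetical.

```python
def relative_overlap(bounds_a, bounds_b):
    """Relative boundary overlap between two 0/1 boundary vectors.

    1.0 when the series with fewer boundaries is fully contained in the
    other; 0.0 when the observed overlap equals the expected overlap.
    """
    n = len(bounds_a)
    n_a, n_b = sum(bounds_a), sum(bounds_b)
    observed = sum(a * b for a, b in zip(bounds_a, bounds_b))
    expected = n_a * n_b / n            # assumed chance level
    max_overlap = min(n_a, n_b)         # best case given both counts
    return (observed - expected) / (max_overlap - expected)


# A region with many state boundaries that still captures every event.
states = [0, 1, 0, 1, 1, 0, 0, 1, 1, 0]   # five neural state boundaries
events = [0, 1, 0, 0, 0, 0, 0, 0, 1, 0]   # two perceived event boundaries
nested = relative_overlap(states, events)
chance = relative_overlap([1, 1, 0, 0], [1, 0, 1, 0])
```

In this example both event boundaries fall on a state boundary, so the relative overlap is one even though most state boundaries have no matching event. The same function applies to pairs of searchlights, since it is symmetric in its two arguments.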

To understand why some neural state boundaries are associated with event boundaries and some are not, we investigated the degree of neural state change at each boundary. Specifically, we define boundary strength as the Pearson correlation distance between the neural activity patterns of consecutive neural states. We weighted each neural state boundary by the boundary strength and then recomputed the absolute overlap between neural state and event boundaries. Like before, the absolute overlap is one when all neural state boundaries align with a perceived event boundary and zero when the overlap is equal to the overlap that would be expected given the strengths of all neural state boundaries. However, after weighting boundaries by strength, given the same number of boundaries that overlap, the absolute overlap will be higher when the weaker neural state boundaries do not overlap with events and strong boundaries do overlap with event boundaries.
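
The two steps above can be sketched as follows, assuming that each state's activity pattern is the voxel-wise mean over its timepoints and that the weighted overlap uses the same chance-level scaling as the binary case (both are assumptions for illustration; all names are hypothetical).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def boundary_strengths(data, boundary_times):
    """Correlation distance (1 - r) between consecutive state patterns.

    data: list of timepoints, each a list of voxel values.
    boundary_times: sorted timepoint indices at which a new state starts.
    """
    edges = [0] + list(boundary_times) + [len(data)]
    patterns = [
        [sum(col) / (b - a) for col in zip(*data[a:b])]  # mean pattern per state
        for a, b in zip(edges[:-1], edges[1:])
    ]
    return [1 - pearson(p, q) for p, q in zip(patterns[:-1], patterns[1:])]


def weighted_absolute_overlap(weights, event_bounds):
    """Absolute overlap with boundaries weighted (0 where no boundary)."""
    n = len(weights)
    total = sum(weights)
    observed = sum(w * e for w, e in zip(weights, event_bounds))
    expected = total * sum(event_bounds) / n   # assumed chance level
    return (observed - expected) / (total - expected)


data = [[1, 2, 3], [1, 2, 3], [3, 2, 1], [3, 2, 1], [3, 1, 2], [3, 1, 2]]
strengths = boundary_strengths(data, [2, 4])   # one strong, one weak boundary

events = [0, 0, 1, 0, 0, 0]                    # event at the strong boundary
binary = weighted_absolute_overlap([0, 0, 1, 0, 1, 0], events)
weighted = weighted_absolute_overlap([0, 0, strengths[0], 0, strengths[1], 0], events)
```

With the event placed at the stronger of the two boundaries, the weighted overlap exceeds the binary overlap, which mirrors the behavior described in the last sentence of the paragraph.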

When we took the strength of neural state boundaries into account in this way, the absolute overlap with event boundaries increased compared to when we used a binary definition of neural state boundaries. This was observed particularly in the middle and superior temporal gyri, extending into the inferior frontal gyrus (see Figure 2C), but also in the precuneus and medial prefrontal cortex. This means that boundaries that coincided with a larger shift in brain activity patterns were more often associated with an experienced event boundary.

Neural state networks

If neural state boundaries are organized in a nested cortical hierarchy, different brain regions would be expected to show substantial overlap in their neural state boundaries. Therefore, we investigated for each pair of searchlights whether the relative neural state boundary overlap was larger than would be expected by chance, given the number of boundaries in these regions. We observed that the overlap was significantly larger than chance for 85% of all pairs of searchlights, suggesting that neural state boundaries are indeed shared across large parts of the cortical hierarchy (see Figure 3A). Although the overlap was highly significant for the majority of searchlight pairs, we did not observe a perfectly nested architecture. If that were the case, the relative boundary overlap between searchlights would have been one. To make sure the observed relative boundary overlap between searchlights was not caused by noise shared across brain regions, we also computed the relative boundary overlap across two independent groups of participants (similar to the rationale of inter-subject functional connectivity analyses; Simony et al., 2016). We observed that the relative boundary overlap computed in this way was similar to the relative overlap computed within a participant group (r = 0.69; see Appendix 1—figure 6), suggesting that shared noise is not the cause of the observed regional overlap in neural state boundaries.

Neural state boundaries are shared within and across distinct functional networks that span the cortical hierarchy.

(A) The relative boundary overlap between each pair of searchlights, ordered according to the functional networks they are in. The black lines show the boundaries between functional networks. Searchlight pairs shown in white did not have significant relative boundary overlap after false discovery rate (FDR) correction for multiple comparisons. (B) Visualization of the detected functional networks. The network label at each voxel is determined by the functional network that occurs most often in all the searchlights that overlap with that voxel. (C) Visualization of the neural state durations within each network. Each searchlight is shown as a dot. The colored bars show the mean and 1 standard deviation around the mean for each network. The data shown in (A–C) are based on data averaged across all participants. The test for statistical significance in (A) was performed with data of 15 independent groups of participants. All of the colored regions in (A) showed a significant association after FDR correction for multiple comparisons.

To investigate the overlap between regions in more detail, we identified networks of brain regions that shared state boundaries by computing the relative boundary overlap between each pair of searchlights and using consensus partitioning based on Louvain modularity maximization to identify networks (Blondel et al., 2008; Lancichinetti and Fortunato, 2012). We found that state boundaries were shared within long-range networks. Some of these networks resembled canonical networks typically identified based on (resting state) time-series correlations (see Figure 3B). To quantify this, we computed the proportion of searchlights overlapping with each of the networks defined by Power et al., 2011 (see Appendix 1—table 1). We identified an auditory network that extended into regions involved in language processing in the left inferior frontal gyrus, a fronto-parietal control network (FPCN), a cingulo-opercular network (CON), and a motor network. The DMN we identified was fractionated into anterior, superior, and posterior components. It should be noted that all three of the DMN subnetworks include some anterior, superior, and posterior subregions; the names of these subnetworks indicate which aspects of the networks are most strongly represented. Appendix 1—table 2 shows the overlap of each of these subnetworks with the anterior and posterior DMN as identified in Campbell et al., 2013. The sensorimotor network (SMN) we identified was split into a lateral and medial component. We also identified a network overlapping with the dorsal attention network (DAN), although the network we identified only covered posterior parts of the DAN but not the frontal eye fields. While the visual network is typically identified as a single network in functional connectivity studies, we observed two networks, roughly corresponding to different levels of the visual hierarchy (early and late). 
Figure 3 visualizes for each voxel which functional network label occurs most frequently for the searchlights overlapping that voxel. In contrast, the full extent of each of the functional networks can be seen in Figure 4.

Separate visualizations for each of the identified functional networks.

The colors indicate the median state duration for each of the searchlights within the functional network. SMN, sensorimotor network; DMN, default mode network; FPCN, fronto-parietal control network; CON, cingulo-opercular network; DAN, dorsal attention network. The median state duration estimates are based on data averaged across all participants.

Figure 3C shows the average timescale within each of these functional networks. The networks with the longest state durations were the anterior DMN and the FPCN, while the early visual network, lateral SMN, and DAN had particularly short state durations. Although regions within functional networks tended to operate on a similar temporal scale, we also observed a lot of variability in state duration within networks, particularly in the auditory network (see Figure 3C). Many networks also showed a clear within-network gradient of timescales, such as the auditory network, the SMNs, the posterior DMN, and the FPCN (see Figure 4). These results suggest that the relative boundary overlap between regions is not simply driven by a similarity in the number of states, but rather by a similarity in the state boundary timings. This is also supported by the results in Appendix 1—figure 6, showing that the relative boundary overlap was highly distinct from the absolute pairwise differences in median state duration (r = −0.05).

Although the networks we identified show overlap with functional networks previously identified in resting state, they clearly diverged for some networks (e.g., the visual network). Some divergence is expected because neural state boundaries are driven by shifts in voxel-activity patterns over time, rather than by the changes in mean activity that we typically use to infer functional connectivity. This divergence was supported by the overall limited similarity with the previously identified networks by Power et al., 2011 (adjusted mutual information [aMI] = 0.39), as well as the differences between the correlation matrix that was computed based on the mean activity time courses in each searchlight and the relative boundary overlap between each pair of searchlights (Appendix 1—figure 6; r = 0.31). Interestingly, regions with strongly negatively correlated mean activity time courses typically showed overlap that was similar to or larger than the overlap expected by chance. Indeed, the relative boundary overlap between each pair of searchlights was more similar to the absolute Pearson correlation coefficient between searchlights (r = 0.39) than when the sign of the correlation coefficient was preserved (r = 0.31). This suggests that pairs of regions that show negatively correlated BOLD activity still tend to show neural state boundaries at the same time.

It should be noted that although the overlap in neural state boundaries was strongest for the searchlights that were part of the same functional networks, we also found a lot of evidence for the hypothesis that boundaries are shared across different levels of the cortical hierarchy (see Figure 3A). Overlap was particularly strong between all higher-order networks (DAN, CON, FPCN, and DMNs), as well as between the motor network and SMN. The sensorimotor, motor, and auditory networks also showed highly significant overlap with the higher-order networks. Lower levels of overlap were observed between the early visual network and all other networks (except the DAN), as well as between the auditory and the sensorimotor networks.

Shared neural state boundaries and event boundaries

Previous research on event segmentation has shown that the perception of an event boundary is more likely when multiple features of a stimulus change at the same time (Clewett et al., 2019). When multiple sensory features change at the same time, this could be reflected in many regions within the same functional network showing a state boundary at the same time (e.g., in the visual network when many aspects of the visual environment change), or in neural state boundaries that are shared across functional networks (e.g., across the auditory and visual networks when a visual and auditory change coincide). Similarly, boundaries shared between many brain regions within or across higher-level cortical networks might reflect a more pronounced change in conceptual features of the narrative (e.g., the goals or emotional state of the character). Therefore, we expect that in a nested cortical hierarchy neural state boundaries that are shared between many brain regions within functional networks, and particularly those shared widely across functional networks, would be more likely to be associated with the perception of an event boundary. To investigate this, we first weighted each neural state boundary in each searchlight by the proportion of searchlights within the same network that also showed a boundary at the same time. This is very similar to how we investigated the role of boundary strength above.
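
The co-occurrence weighting can be sketched as below. The text does not say whether a searchlight counts itself when computing the proportion; this sketch assumes that only the other searchlights in the group are counted, so a boundary that no other searchlight shares receives a weight of zero. The function name and the toy data are hypothetical.

```python
def cooccurrence_weights(group_bounds):
    """Weight each boundary by how widely it is shared within a group.

    group_bounds: list of 0/1 boundary vectors, one per searchlight in the
    same group (e.g., one functional network).
    Returns one weighted vector per searchlight: 0 where the searchlight has
    no boundary, otherwise the proportion of the *other* searchlights that
    have a boundary at the same timepoint (assumption, see lead-in).
    """
    n_searchlights = len(group_bounds)
    n_timepoints = len(group_bounds[0])
    counts = [sum(sl[t] for sl in group_bounds) for t in range(n_timepoints)]
    return [
        [sl[t] * (counts[t] - sl[t]) / (n_searchlights - 1)
         for t in range(n_timepoints)]
        for sl in group_bounds
    ]


network = [
    [0, 1, 0, 1],   # searchlight 1
    [0, 1, 0, 0],   # searchlight 2
    [0, 1, 1, 0],   # searchlight 3
]
weights = cooccurrence_weights(network)
```

Passing the searchlights of a single network gives within-network weights; passing all searchlights gives a whole-brain variant of the same weighting.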

We found that when we took the within-network co-occurrence of neural state boundaries into account in this way, the absolute overlap with event boundaries increased compared to when we used a binary definition of neural state boundaries. So when a particular neural state boundary is shared with more regions within the same network, it is more likely to coincide with an event boundary (see Figure 5A). This effect was observed across all networks, except the early visual network. It was strongest for regions in the anterior DMN, the FPCN, and the auditory network and slightly less pronounced for regions in the posterior and superior DMN, as well as the CON. On a regional level, the strongest effects were observed within the precuneus, angular gyrus, medial prefrontal cortex, temporal pole, insula, superior temporal gyrus, and the middle frontal gyrus.

Increase in absolute overlap between neural state boundaries and event boundaries when boundary co-occurrence is taken into account.

Increase in absolute overlap between neural state boundaries and event boundaries when neural state boundaries are weighted by the percentage of searchlights in the same functional networks or across the whole brain that also have a boundary at the same timepoint. (A) Within-network-weighted absolute overlap is compared to using binary boundaries (as in Figure 2A). (B) Whole-brain-weighted absolute overlap is compared to within-network-weighted absolute overlap. (C) The average absolute boundary overlap between events and neural states within each functional network. This is done for binary boundaries, boundaries weighted by within-network co-occurrence, and boundaries weighted by whole-brain co-occurrence. A red star indicates a significant difference between binary boundaries and boundaries weighted by within-network co-occurrence. A blue star indicates a significant difference between boundaries weighted by whole-brain co-occurrence and boundaries weighted by within-network co-occurrence. The data shown in (A–C) are based on data averaged across all participants. The tests for statistical significance were performed with data of 15 independent groups of participants. All of the colored regions in (A, B) showed a significant association after false discovery rate (FDR) correction for multiple comparisons.

Next, we weighted each neural state boundary by the proportion of searchlights across the whole brain that showed a boundary at the same time. We investigated where absolute overlap for this whole-brain co-occurrence was stronger than when the neural state boundaries were weighted by within-network co-occurrence (see Figure 5B). This was the case specifically for the early and late visual networks, the DAN, the lateral and medial SMN networks, and the motor network. For the auditory network, the opposite effect was observed; whole-brain co-occurrence showed lower absolute overlap with event boundaries than within-network co-occurrence. On a regional level, increases in overlap with event boundaries were most pronounced in the medial parts of the occipital lobe, the supplementary motor area and precentral gyri, as well as the superior parietal gyri.
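
The co-occurrence weighting described above can be sketched as follows. This is a minimal illustration, not the analysis code used in this study: the function name and the toy data are hypothetical, and counting the searchlight itself when computing the proportion is an assumption that the text does not settle.

```python
def cooccurrence_weights(boundaries, members):
    """Weight each boundary by the proportion of `members` sharing it.

    boundaries: list of binary time series, one per searchlight.
    members: indices of the searchlights in the reference set
             (one functional network, or the whole brain).
    Assumption: the searchlight itself is counted in the proportion.
    """
    n_tr = len(boundaries[0])
    proportion = [
        sum(boundaries[i][t] for i in members) / len(members)
        for t in range(n_tr)
    ]
    # A boundary keeps its timepoint but takes the shared proportion as its weight.
    return {
        i: [boundaries[i][t] * proportion[t] for t in range(n_tr)]
        for i in members
    }


# Toy example: three searchlights, four TRs (hypothetical data).
toy = [
    [0, 1, 0, 1],  # searchlight 0, network A
    [0, 1, 0, 0],  # searchlight 1, network A
    [1, 1, 0, 0],  # searchlight 2, another network
]
within_network = cooccurrence_weights(toy, members=[0, 1])
whole_brain = cooccurrence_weights(toy, members=[0, 1, 2])
print(within_network[0])
print(whole_brain[0])
```

In this toy case, the boundary of searchlight 0 at the last timepoint receives a weight of 0.5 within its network but only 1/3 across the whole brain, because no searchlight outside the network shares it.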

To investigate the role of boundary co-occurrence across networks in more detail, we investigated for each pair of searchlights whether boundaries that are shared have a stronger association with perceived event boundaries as compared to boundaries that are unique to one of the two searchlights. We found that boundary sharing had a positive impact on overlap with perceived boundaries, particularly for pairs of searchlights within the auditory network and between the auditory network and the anterior DMN (see Figure 6A and B). In addition, we saw that neural state boundaries that were shared between the auditory network and the early and late visual networks, and the superior and posterior DMN were more likely to be associated with a perceived event boundary than non-shared boundaries. The same was true for boundaries shared between the anterior DMN and the lateral and medial SMN network and the posterior DMN. Boundary sharing between the other higher-level networks (pDMN, sDMN, FPCN, and CON) as well as between these higher-level networks and the SMN networks was also beneficial for overlap with event boundaries. On a regional level, the strongest effects of boundary sharing were observed in the medial prefrontal cortex, medial occipital cortex, precuneus, middle and superior temporal gyrus, and insula (see Figure 6B). Analyses shown in the ‘Supplementary results’ section in Appendix 1 demonstrate that these increases in overlap for shared vs. non-shared boundaries cannot be attributed to effects of noise (see also Appendix 1—figure 7).
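
As a rough illustration of the pairwise comparison, the boundaries of a reference searchlight can be split into those shared with a second searchlight and those unique to the reference. The sketch below is a simplification under stated assumptions: the function names and toy data are hypothetical, and the plain fraction of boundaries coinciding with an event stands in for the scaled overlap measures used in the actual analysis.

```python
def split_shared(reference, other):
    """Split the reference searchlight's boundaries into shared and non-shared."""
    shared = [r * o for r, o in zip(reference, other)]
    unique = [r * (1 - o) for r, o in zip(reference, other)]
    return shared, unique


def hit_fraction(states, events):
    """Fraction of state boundaries that coincide with an event boundary.

    Simplified stand-in for the scaled overlap metrics; returns None
    when there are no state boundaries to evaluate.
    """
    n_states = sum(states)
    if n_states == 0:
        return None
    return sum(s * e for s, e in zip(states, events)) / n_states


# Toy example with eight TRs (hypothetical data).
events = [0, 1, 0, 0, 1, 0, 0, 0]
reference = [0, 1, 0, 1, 1, 0, 1, 0]
other = [0, 1, 0, 0, 1, 0, 0, 1]

shared, unique = split_shared(reference, other)
print(hit_fraction(shared, events), hit_fraction(unique, events))
```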

Increase in absolute overlap with event boundaries for shared vs. non-shared neural state boundaries.

(A) For each pair of searchlights, we compare the relative overlap with event boundaries between neural state boundaries that are shared and those that are not shared. In particular, for each pair of searchlights, the searchlight with the highest relative boundary overlap with events is used as the reference in the comparison. The white lines show the boundaries between functional networks. (B) shows the percentage of connections in (A) that show a significant increase in relative boundary overlap with events for shared vs. non-shared neural state boundaries. Here, the data are summarized to a network-by-network matrix for ease of interpretation. The data shown in this figure are based on data averaged across all participants. The tests for statistical significance were performed with data of 15 independent groups of participants. All of the colored regions in (A) showed a significant association after false discovery rate (FDR) correction for multiple comparisons.

So far, we have focused on comparing state boundary time series across regions or between brain regions and events. However, that approach does not allow us to fully understand the different ways in which boundaries can be shared across parts of the cortical hierarchy at specific points in time. To investigate this, we can group timepoints together based on the similarity of their boundary profiles; that is, which searchlights do or do not have a neural state boundary at the same timepoint. We used a weighted stochastic block model (WSBM) to identify groups of timepoints, which we will refer to as ‘communities.’ We found an optimal number of four communities (see Figure 7). These communities group together timepoints that vary in the degree to which their neural state boundaries are shared across the cortical hierarchy: timepoints in the first community show the most widely spread neural state boundaries across the hierarchy, while timepoints in the later communities show less widespread state transitions. We found that from community 1 to community 4, the prevalence of state boundaries decreased for all networks, but most strongly for the FPCN, CON, sDMN, aDMN, and auditory networks. However, the same effect was also seen in the higher visual, SMN, and motor networks. This might suggest that boundaries that are observed widely across lower-level networks are more likely to traverse the cortical hierarchy.

Communities of timepoints identified with weighted stochastic block model (WSBM).

This algorithm groups together timepoints that show similar boundary profiles (presence or absence of boundaries across searchlights). (A) Neural state boundaries are shown for each community per timepoint for each searchlight, grouped in functional networks. Boundaries shown in red coincide with an event boundary. (B) Per functional network, we show the ratio of the average state boundary occurrence within each community versus the average occurrence across all timepoints. The same is shown for the event boundaries. The data shown in this figure are based on data averaged across all participants.

We also found a similar drop in prevalence of event boundaries across communities, supporting our previous observation that the perception of event boundaries is associated with the sharing of neural state boundaries across large parts of the cortical hierarchy. We repeated this analysis in two independent groups of participants to be able to assess the stability of this pattern of results. Although group 1 showed an optimum of four communities and group 2 an optimum of five communities, the pattern of results was highly similar across both groups (see Appendix 1—figure 8).

Discussion

While event segmentation is a critical aspect of our ongoing experience, the neural mechanisms that underlie this ability are not yet clear. The aim of this article was to investigate the cortical organization of neural states that may underlie our experience of distinct events. By combining an innovative data-driven state segmentation method with a movie dataset of many participants, we were able to identify neural states across the entire cortical hierarchy for the first time. We observed particularly fast states in primary sensory regions and long periods of information integration in the left middle frontal gyrus and medial prefrontal cortex. Across the entire cortical hierarchy, we observed associations between neural state and perceived event boundaries and our findings demonstrate that neural state boundaries are shared within long-range functional networks as well as across the temporal hierarchy between distinct functional networks.

A partially nested cortical hierarchy of neural states

Previous findings have suggested that neural states may be organized in a nested cortical hierarchy (Baldassano et al., 2017). In line with this hypothesis, we observed that neural state boundaries throughout the entire hierarchy overlap with perceived event boundaries, but this overlap is particularly strong for transmodal regions such as the dorsal medial prefrontal cortex, anterior cingulate cortex, left superior and middle frontal gyrus, and anterior insula. In line with EST, the strong alignment in the anterior cingulate cortex suggests that the disparity between predicted and perceived sensory input may play a role in the experience of an event boundary (Holroyd and Coles, 2002; Kurby and Zacks, 2008), while the involvement of the dorsal medial prefrontal cortex is in line with previous studies linking this region to representations of specific events (Baldassano et al., 2018; Krueger et al., 2009; Liu et al., 2022).

Once we accounted for the maximal possible overlap given the number of neural states in a particular brain region, we also found strong overlap in unimodal areas such as the visual, auditory, and somatosensory cortices. This finding suggests that some of the neural state boundaries that can be identified in early sensory regions are also consciously experienced as an event boundary, potentially because these boundaries are propagated to regions further up in the cortical hierarchy. Which of the boundaries in lower-level areas propagate to higher-order cortical areas may be moderated by attentional mechanisms, which are known to alter cortical information processing through long-range signals (Buschman and Miller, 2007; Gregoriou et al., 2009). When participants are not attending to the sensory input (e.g., during daydreaming), there may be much lower correspondence between neural state boundaries in higher- and lower-level regions.

So, what do these neural states represent? Recent work by Chien and Honey, 2020 has shown that neural activity around an artificially introduced event boundary can be effectively modeled by ongoing information integration, which is reset by a gating mechanism, very much in line with the mechanism proposed to underlie event segmentation (Kurby and Zacks, 2008). Similarly, neural states may represent information integration about a particular stable feature of the environment, which is reset when that feature undergoes a substantial change (Bromis et al., 2022). This suggests that neural states in early visual cortex may represent short-lived visual features of the external environment, while states in anterior temporal cortex may contain high-level semantic representations related to the ongoing narrative (Clarke and Tyler, 2015). For transmodal regions such as the medial prefrontal cortex, or middle frontal gyrus, that have been associated with many different high level cognitive processes (Duncan, 2010; van Kesteren et al., 2012; Simony et al., 2016), it is not yet clear what a distinct neural state might represent. Just as perceived event boundaries can be related to changes in one or multiple situational dimensions, such as changes in goals or locations (Clewett et al., 2019; Zacks et al., 2009), neural state boundaries in transmodal cortical areas may not necessarily reflect one particular type of change. State boundaries in these regions are likely also dependent on the goals of the viewer (Wen et al., 2020).

We also investigated the factors that distinguish neural state boundaries that traverse the hierarchy from those that do not. It has previously been shown that changes across multiple aspects of the narrative are more likely to result in an experienced event boundary (Zacks et al., 2010). In line with this, we observed that boundaries that were represented in more brain regions at the same time were also more likely to be associated with the experience of an event boundary. The strength of the neural state boundary, as measured by the amount of change in neural activity patterns, was also identified as a factor that can to some degree distinguish neural states that appear in subjective experience from the neural states that do not, particularly in temporal cortex, inferior frontal gyrus, precuneus, and medial prefrontal gyrus. This suggests that a neural state boundary is not an all or none occurrence. Instead, the reset of representations at neural state boundaries (Chien and Honey, 2020) may differ based on what is happening in other brain regions, on the current attentional focus, or based on the degree of change in the representations of the environment in that particular brain region.

More evidence for the idea of a nested cortical hierarchy of neural state boundaries comes from our connectivity analyses, which show that neural state boundaries are shared both within and across groups of regions that partly resemble well-known functional brain networks. This sharing of boundaries across different cortical areas may suggest that neural states in higher-level cortical regions represent an overarching representation that corresponds to many distinct states in lower-level cortical areas, which all represent different features of that overarching representation (e.g., words spoken, characters on screen, or locations within a particular situation). This is in line with previous conceptualizations of events as partonomic hierarchies (Zacks et al., 2001a) and with other models of hierarchical neural representations, such as the hub-and-spokes model for semantic representations, which proposes that semantic knowledge is represented by the interaction between modality-specific brain regions and a transmodal semantic representational hub in the anterior temporal lobe (Lambon Ralph et al., 2010; Rogers et al., 2004). It is also in line with a recently proposed hierarchical representation of episodic memories, in which items that are linked within small-scale events are in turn linked within large-scale episodic narratives (Andermane et al., 2021).

Timescales of information processing across the cortex

While previous studies have been able to show regional differences in the timescale of information processing across parts of the cortex (Baldassano et al., 2017; Hasson et al., 2008; Honey et al., 2012; Lerner et al., 2011; Stephens et al., 2013), here we were able to reveal neural state timescales across the entire cortex for the first time. The validity of our results is supported by extensive validations using simulations (see ‘Supplementary methods’ in Appendix 1 and Geerligs et al., 2021) and the reliability of our observations across independent groups of participants. It is also supported by the similarity between our results and previous findings based on very different approaches, such as experiments with movies and auditory narratives that have been scrambled at different timescales (Hasson et al., 2008; Honey et al., 2012; Lerner et al., 2011), or resting-state fluctuations in electrocorticography (Honey et al., 2012) and fMRI data (Stephens et al., 2013).

Although we characterized brain areas based on their median state length, we observed that neural states within a region were not of equal duration, suggesting that regional timescales may change dynamically based on the features of the stimulus. This is also in line with the observed correspondence between neural state and perceived event boundaries. Event boundaries have previously been shown to align with changes in features of the narrative, such as characters, causes, goals, and spatial locations (Zacks et al., 2009). Therefore, the overlap between state boundaries and perceived event boundaries across the cortex also suggests that characteristics of the sensory input are driving the occurrence of neural state boundaries. Together, these findings show that the timescale of information processing in particular brain regions is not only driven by stable differences in the rate of temporal integration of information, which may be associated with interregional interactions in the neural circuitry (Honey et al., 2012), but also by the properties of the input that is received from the environment. Our results show that some of the areas that were not covered in previous investigations (Baldassano et al., 2017; Hasson et al., 2008; Honey et al., 2012; Lerner et al., 2011; Stephens et al., 2013), such as the medial prefrontal cortex and middle frontal gyrus, have the longest timescales of information processing. This suggests these regions at the top of the cortical hierarchy (Clarke and Tyler, 2015; Fuster, 2001) also have the slowest timescales of information processing, in line with expectations based on the hierarchical process memory framework (Hasson et al., 2015).

Functional networks of neural state boundaries

In line with previous work (Baldassano et al., 2017), we found that neural state boundaries are shared across brain regions. Our results show for the first time that these boundaries are shared within distinct functional networks. Interestingly, the networks we identify partially resemble the functional networks that are typically found using regular functional connectivity analyses (cf. Power et al., 2011; Yeo et al., 2011), though there are some differences. For instance, the visual network was segregated into two smaller subnetworks, and for other networks, the topographies deviated somewhat from those observed in prior work.

Our results show that functional networks defined by state boundaries differ in their timescales of information processing. While some networks have a particular temporal mode of information processing, other networks show a within-network gradient of neural state timescales. For the DMN, we observed a split into posterior, superior, and anterior subnetworks with markedly different timescales. The anterior and posterior subnetworks closely resemble previously observed posterior and anterior DMN subnetworks (Andrews-Hanna et al., 2010; Campbell et al., 2013; Lei et al., 2014), while the superior subnetwork resembles the right dorsal lateral DMN subnetwork (Gordon et al., 2020). The posterior/fast DMN is particularly prominent in the precuneus and angular gyri, which are thought to engage in episodic memory retrieval through connectivity with the hippocampal formation (Andrews-Hanna et al., 2010). The posterior DMN has also been proposed to be involved in forming mental scenes or situation models (Ranganath and Ritchey, 2012). Thus, neural states in this subnetwork may reflect the construction of mental scenes of the movie and/or retrieval of related episodic memories. The superior DMN (or right dorsal lateral DMN) showed timescales of neural states that were in between those of the anterior and posterior DMN. This network has previously been suggested to be a connector hub within the DMN, through which the FPCN exerts top-down control over the DMN. This is in line with the strong state boundary overlap we observed between searchlights in the sDMN and the FPCN. The anterior/slow DMN is particularly prominent in the medial prefrontal cortex that has been related to self-referential thought, affective processing, and integrating current information with prior knowledge (Benoit et al., 2014; Gilboa and Marlatte, 2017; van Kesteren et al., 2012; Northoff et al., 2006). The current results suggest that these processes require integration of information over longer timescales.

Real-life experience

Although event segmentation is thought to be a pivotal aspect of how information is processed in real life (Zacks et al., 2007), it is often not considered in experimental settings, where events are predetermined by the trial or block structure. This study and previous work (Baldassano et al., 2017) show that we are now able to investigate brain activity as it unfolds over time without asking participants to perform a task. This allows us to study brain function in a way that is much more similar to our daily life experience than typical cognitive neuroscience experiments (Hamilton and Huth, 2020; Lee et al., 2020; Willems et al., 2020). This opens the door for investigations of neural differences during narrative comprehension between groups of participants, such as participants with autism who may have trouble distinguishing events that require them to infer the state of mind of others (Baron-Cohen, 2000; Hasson et al., 2009), or participants with Alzheimer’s disease, who may have trouble with segmenting and encoding events in memory (Zacks et al., 2006).

It should be noted that this more naturalistic way of investigating brain activity comes at a cost of reduced experimental control (Willems et al., 2020). For example, some of the differences in brain activity that we observe over time may be associated with eye movements. Preparation of eye movements may cause activity changes in the frontal-eye-fields (Vernet et al., 2014), while execution of eye movements may alter the input in early sensory regions (Lu et al., 2016; Son et al., 2020). However, in a related study (Davis et al., 2021), we found no age difference in eye movement synchrony while viewing the same movie, despite our previous observation of reduced synchrony with age in several areas (particularly the hippocampus, medial PFC, and FPCN; Geerligs and Campbell, 2018), suggesting a disconnect between eye movements and neural activity in higher-order areas. In addition, reducing this potential confound by asking participants to fixate leads to an unnatural mode of information processing, which could arguably bias the results in different ways by requiring participants to perform a double task (monitoring eye movements in addition to watching the movie).

Conclusion

Here, we demonstrate that event segmentation is underpinned by neural state boundaries that occur in a nested cortical hierarchy. This work also provides the first cortex-wide mapping of timescales of information processing and shows that the DMN fractionates into faster and slower subnetworks. Together, these findings provide new insights into the neural mechanisms that underlie event segmentation, which in turn is a critical component of real-world perception, narrative comprehension, and episodic memory formation. What remains to be addressed is how timescales of different brain regions relate to the types of neural representations that are contained within these regions. For example, does the dissociation between the posterior and anterior DMN reflect relatively fast construction of mental scenes and slow integration with existing knowledge, respectively? Studying brain function from this perspective provides us with a new view on the organizational principles of the human brain.

Materials and methods

Participants

This study included data from 265 adults (131 females) who were aged 18–50 (mean age 36.3, SD = 8.6) from the healthy, population-derived cohort tested in stage II of the Cam-CAN project (Shafto et al., 2014; Taylor et al., 2017). Participants were native English speakers, had normal or corrected-to-normal vision and hearing, and had no neurological disorders (Shafto et al., 2014). Ethical approval for the study was obtained from the Cambridgeshire 2 (now East of England – Cambridge Central) Research Ethics Committee. Participants gave written informed consent.

Movie

Participants watched a black-and-white television drama by Alfred Hitchcock called Bang! You’re Dead while they were scanned with fMRI. The full 25 min episode was shortened to 8 min, preserving the narrative of the episode (Shafto et al., 2014). This shortened version of the movie has been shown to elicit robust brain activity, synchronized across participants (Campbell et al., 2015; Geerligs and Campbell, 2018). Participants were instructed to watch, listen, and pay attention to the movie.

fMRI data acquisition

The details of the fMRI data acquisition are described in Geerligs and Campbell, 2018. In short, 193 volumes of movie data were acquired with a 32-channel head-coil, using a multi-echo, T2*-weighted EPI sequence. Each volume contained 32 axial slices (acquired in descending order), with slice thickness of 3.7 mm and interslice gap of 20% (TR = 2470 ms; five echoes [TE = 9.4 ms, 21.2 ms, 33 ms, 45 ms, 57 ms]; flip angle = 78°; FOV = 192 mm × 192 mm; voxel size = 3 mm × 3 mm × 4.44 mm); the acquisition time was 8 min and 13 s. High-resolution (1 mm × 1 mm × 1 mm) T1- and T2-weighted images were also acquired.

Data preprocessing and hyperalignment

The initial steps of data preprocessing for the movie data were the same as in Geerligs and Campbell, 2018 and are described there in detail. Briefly, the preprocessing steps included deobliquing of each TE, slice time correction, and realignment of each TE to the first TE in the run, using AFNI (version AFNI_17.1.01; https://afni.nimh.nih.gov; Cox, 1996). To denoise the data for each participant, we used multi-echo independent component analysis (ME-ICA), which is a very promising method for removal of non-BOLD-like components from the fMRI data, including effects of head motion (Kundu et al., 2012; Kundu et al., 2013). Co-registration followed by DARTEL intersubject alignment was used to align participants to MNI space using SPM12 software (http://www.fil.ion.ucl.ac.uk/spm).

To optimally align voxels across participants in the movie dataset, we subsequently used whole-brain searchlight hyperalignment as implemented in the PyMVPA toolbox (Guntupalli et al., 2016; Hanke et al., 2009). Hyperalignment is an important step in the pipeline because the neural state segmentation method relies on group-averaged voxel-level data. Hyperalignment uses Procrustes transformations to derive the optimal rotation parameters that minimize intersubject distances between responses to the same timepoints in the movie. The details of the procedure are identical to those in Geerligs et al., 2021. After hyperalignment, the data were highpass-filtered with a cut-off of 0.008 Hz. For the analyses that included 2 or 15 independent groups of participants, we ran hyperalignment separately within each subgroup to make sure that datasets remained fully independent.

Data-driven detection of neural state boundaries

To identify neural state boundaries in the fMRI data, we used GSBS (Geerligs et al., 2021). GSBS performs an iterative search for state boundary locations that optimize the similarity between the average activity patterns in a neural state and the (original) brain activity at each corresponding timepoint. At each iteration of the algorithm, previous boundary locations are fine-tuned by shifting them by 1 TR (earlier or later) if this further improves the fit. To determine the optimal number of boundaries in each brain region, we used the t-distance metric. This metric identifies the optimal number of states, such that timepoints within a state have maximally similar brain activity patterns, while timepoints in consecutive states are maximally dissimilar. The validity of these methods has been tested extensively in previous work, with both simulated and empirical data (Geerligs et al., 2021). The input to the GSBS algorithm consists of a set of voxel time courses within a searchlight and a maximum value for the number of states, which we set to 100, roughly corresponding to half the number of TRs in our data (Geerligs et al., 2021).

Here we improved on the existing method in three ways to increase the validity and reliability of our results. First, GSBS previously placed one boundary in each iteration. We found that for some brain regions this version of the algorithm showed suboptimal performance: a boundary corresponding to a strong state transition was placed in a relatively late iteration of the GSBS algorithm. This led to a steep increase in the t-distance in this particular iteration, resulting in a solution with more neural state boundaries than might be necessary or optimal (for more details, see the ‘Supplementary methods’ section in Appendix 1 and Appendix 1—figure 1A). We were able to address this issue by allowing the algorithm to place two boundaries at a time. A 2-D search is performed, which allows the algorithm to determine the location of a new state, rather than identifying a boundary between two states. A restriction to the search is that both boundaries must be placed within a single previously existing state. In some cases, it may be better to place one new boundary than two, for example, when an existing state should be split into two (rather than three) substates. To accommodate this, we allow the algorithm to determine whether one or two boundaries should be placed at a time, based on which of these options results in the highest t-distance.

As a consequence of this change in the fitting procedure, we also adjusted the boundary fine-tuning. While we previously fine-tuned boundaries in the order they were detected (i.e., first to last), we now perform the fine-tuning starting from the weakest boundary and ending with the strongest boundary. These changes to the algorithm are all evaluated extensively in the ‘Supplementary methods’ section in Appendix 1. Code that implements the improved version of GSBS in Python is available in the StateSegmentation Python package (https://pypi.org/project/statesegmentation/).

The final change compared to our previous work entails the use of deconvolved data. We observed that the algorithm was often unable to differentiate short states from transitions between longer states due to the slow nature of the hemodynamic response. This issue can be resolved by first deconvolving the data. Simulations and empirical results demonstrate that these changes resulted in stark increases in the reliability of our results (see ‘Supplementary methods’ in Appendix 1). The data were deconvolved using Wiener deconvolution as implemented in the rsHRF toolbox (version 1.5.8), based on the canonical hemodynamic response function (HRF; Wu et al., 2021). Importantly, we did not use the iterative Wiener filter algorithm as we noticed that this blurred the boundaries between neural states. We also investigated the effects of estimating the HRF shape based on the movie fMRI data instead of using the canonical HRF and found that this did not have a marked impact on the results (see ‘Supplementary methods’ in Appendix 1).

Whole-brain search for neural state boundaries

We applied GSBS in a searchlight to the hyperaligned movie data. Spherical searchlights were scanned within the Harvard-Oxford cortical mask with a step size of two voxels and a radius of three voxels (Desikan et al., 2006). This resulted in searchlights with an average size of 97 voxels (max: 123; IQR: 82–115); this variation in searchlight size was due to the exclusion of out-of-brain voxels. Only searchlights with more than 15 voxels were included in the analysis.
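
The searchlight geometry can be illustrated with a short sketch. A full sphere with a radius of three voxels contains 123 voxels, which matches the maximum searchlight size reported above; smaller searchlights arise where the sphere extends beyond the mask. The code below is an illustration only: the function names and the toy cubic mask are hypothetical, and placing centers on a regular grid spanning the bounding box of the mask is an assumption.

```python
from itertools import product


def sphere_offsets(radius):
    """Integer voxel offsets within `radius` voxels of a searchlight center."""
    span = range(-radius, radius + 1)
    return [
        (x, y, z)
        for x, y, z in product(span, repeat=3)
        if x * x + y * y + z * z <= radius * radius
    ]


def build_searchlights(mask, radius=3, step=2, min_voxels=16):
    """Map each grid center to the in-mask voxels of its spherical searchlight.

    mask: set of (x, y, z) voxel coordinates inside the cortical mask.
    Assumption: centers lie on a regular grid over the mask's bounding box.
    """
    offsets = sphere_offsets(radius)
    lows = [min(v[d] for v in mask) for d in range(3)]
    highs = [max(v[d] for v in mask) for d in range(3)]
    axes = [range(lows[d], highs[d] + 1, step) for d in range(3)]
    result = {}
    for cx, cy, cz in product(*axes):
        voxels = [
            (cx + dx, cy + dy, cz + dz)
            for dx, dy, dz in offsets
            if (cx + dx, cy + dy, cz + dz) in mask
        ]
        # Keep only searchlights with more than 15 in-mask voxels.
        if len(voxels) >= min_voxels:
            result[(cx, cy, cz)] = voxels
    return result


# Toy mask: a 10 x 10 x 10 cube standing in for the cortical mask.
toy_mask = set(product(range(10), repeat=3))
toy_searchlights = build_searchlights(toy_mask)
sizes = [len(v) for v in toy_searchlights.values()]
print(len(sphere_offsets(3)), len(toy_searchlights), max(sizes))
```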

Previous analyses have shown that neural state boundaries cannot be identified reliably in single-subject data. Instead, data should be averaged across a group of at least ~17 participants to eliminate sources of noise from the data (Geerligs et al., 2021). As the group size increases, the reliability of the results also increases. Therefore, all the results reported here are with the maximal possible group size. To illustrate the reliability of the cortical hierarchy of state durations and the communities of timepoints, we randomly divided the data into two independent samples of ~135 participants each before identifying the optimal number of states. In all other analyses, the figures in the ‘Results’ section are derived from data with all participants averaged in a single group. Statistical testing to determine the significance of these results was done with data in which participants were grouped into 15 smaller independent subgroups of 17 or 18 randomly selected participants each.
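
The division into independent subgroups can be sketched as a random partition of the participant list. With 265 participants and 15 groups, this yields 10 groups of 18 and 5 groups of 17. The function name, participant identifiers, and seed below are hypothetical; the sketch only illustrates the grouping, not the subsequent hyperalignment and averaging within each group.

```python
import random


def split_into_groups(participant_ids, n_groups, seed=0):
    """Randomly partition participants into disjoint, near-equal groups."""
    ids = list(participant_ids)
    random.Random(seed).shuffle(ids)  # seeded for reproducibility
    # Taking every n_groups-th participant gives group sizes differing by at most one.
    return [ids[g::n_groups] for g in range(n_groups)]


participants = [f"sub-{k:03d}" for k in range(265)]  # hypothetical IDs
groups = split_into_groups(participants, n_groups=15)
group_sizes = sorted(len(g) for g in groups)
print(group_sizes)
```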

Defining event boundaries

Event boundaries in the Cam-CAN movie dataset were identified by Ben-Yakov and Henson, 2018 based on data from 16 observers. These participants watched the movie outside the scanner and indicated with a keypress when they felt ‘one event (meaningful unit) ended and another began.’ Participants were not able to rewind the movie. Ben-Yakov and Henson, 2018 referred to the number of observers that identified a boundary at the same time as the boundary salience. In line with their approach, we only included boundaries identified by at least five observers. This resulted in a total of 19 boundaries separated by 6.5–93.7 s, with a salience varying from 5 to 16 observers (mean = 10).

Comparison of neural state boundaries to event boundaries

To compare the neural state boundaries across regions to the event boundaries, we computed two overlap metrics: the absolute and the relative boundary overlap. Both overlap measures were scaled with respect to the expected number of overlapping boundaries. To compute these values, we define E as the event boundary time series and Si as the neural state boundary time series for searchlight i. These time series contain zeros at each timepoint t when there is no change in state/event and ones at each timepoint when there is a transition to a different state/event.

The overlap between event boundaries and state boundaries in searchlight i is defined as

$$O_i = \sum_{t=1}^{n} E_t \, S_{i,t}$$

where n is the number of TRs.

If we assume that there is no association between the occurrence of event boundaries and state boundaries, the expected number of overlapping boundaries is defined as in Zacks et al., 2001a as:

$$\mathrm{OE}_i = \frac{1}{n} \sum_{t=1}^{n} E_t \cdot \sum_{t=1}^{n} S_{i,t}$$

Because the number of overlapping boundaries will increase as the number of state boundaries increases, the absolute overlap (OA) was scaled such that it was zero when it was equal to the expected overlap and one when all neural state boundaries overlapped with an event boundary. The absolute overlap therefore quantifies the proportion of the neural state boundaries that overlap with an event boundary:

$$\mathrm{OA}_i = \frac{O_i - \mathrm{OE}_i}{\sum_{t=1}^{n} S_{i,t} - \mathrm{OE}_i}.$$

In contrast, the relative overlap (OR) was scaled such that it was one when all event boundaries overlapped with a neural state boundary (or when all neural state boundaries overlapped with an event boundary if there were fewer state boundaries than event boundaries). In this way, this metric quantifies the overlap without penalizing regions that have more or fewer state boundaries than event boundaries. The relative overlap is defined as

$$\mathrm{OR}_i = \frac{O_i - \mathrm{OE}_i}{\min\left\{\sum_{t=1}^{n} E_t,\ \sum_{t=1}^{n} S_{i,t}\right\} - \mathrm{OE}_i}.$$

For each searchlight, we tested whether the boundary overlap was significantly different from zero across the 15 independent samples.
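
As an illustration of the two overlap metrics defined above, the following minimal sketch computes the observed overlap, the expected overlap, and the absolute and relative overlap for a single searchlight. The function name and the toy time series are made up for the example; it is not the code used in the study.

```python
def boundary_overlaps(events, states):
    """Return (absolute, relative) overlap for two binary boundary time series."""
    n = len(events)
    n_events, n_states = sum(events), sum(states)
    observed = sum(e * s for e, s in zip(events, states))    # O_i
    expected = n_events * n_states / n                       # OE_i
    absolute = (observed - expected) / (n_states - expected)                 # OA_i
    relative = (observed - expected) / (min(n_events, n_states) - expected)  # OR_i
    return absolute, relative


# Toy example (made-up data): 10 TRs, 3 event boundaries, 4 state boundaries,
# 2 of which coincide with an event boundary
events = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]
states = [0, 1, 0, 1, 1, 0, 1, 0, 0, 0]
absolute, relative = boundary_overlaps(events, states)
```

Both metrics are zero when the observed overlap equals the expected overlap and can become negative when boundaries coincide less often than expected. The sketch does not guard against a zero denominator, which occurs, for example, when a series contains no boundaries.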

In addition to investigating the overlap between the event boundaries and the state boundaries, we also investigated the effect of boundary strength. We define boundary strength as the Pearson correlation distance between the neural activity patterns of consecutive neural states. We investigated whether taking the strength of state boundaries into account improved the absolute overlap compared to using the binary definition of state boundaries. To do this, we changed the neural state boundary time series for searchlight i (Si) such that, instead of ones, it contains the observed state boundary strength when there is a transition to a different state. After redefining Si in this way, we recomputed the absolute overlap and investigated which brain regions showed a significant increase in overlap when we compare the strength-based absolute overlap (OA-STi) to the binary absolute overlap (OAi), across the 15 independent samples.
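
The strength-based boundary series can be sketched as below. The sketch assumes that a one at timepoint t marks the first timepoint of a new state and that each state's activity pattern is the average over its timepoints; both are assumptions made for the example, as are the function names and the toy data.

```python
def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def state_patterns(data, boundaries):
    """Average the voxel patterns (rows of `data`) within each state."""
    states, current = [], []
    for row, is_boundary in zip(data, boundaries):
        if is_boundary and current:
            states.append(current)
            current = []
        current.append(row)
    states.append(current)
    return [[sum(col) / len(col) for col in zip(*rows)] for rows in states]


def strength_series(data, boundaries):
    """Replace ones by the correlation distance between consecutive states."""
    patterns = state_patterns(data, boundaries)
    strengths, k = [], 0
    for t, is_boundary in enumerate(boundaries):
        if is_boundary and t > 0:
            k += 1
            strengths.append(1 - pearson(patterns[k - 1], patterns[k]))
        else:
            strengths.append(0.0)
    return strengths


# Toy data (made up): 6 TRs x 3 voxels, three states of two TRs each
data = [[1, 2, 3], [1, 2, 3], [3, 2, 1], [3, 2, 1], [3, 2, 2], [3, 2, 2]]
boundaries = [0, 0, 1, 0, 1, 0]
strengths = strength_series(data, boundaries)
```

In this toy example, the first transition separates two anticorrelated patterns and therefore receives the maximal correlation distance of 2, whereas the second transition separates two similar patterns and receives a much lower strength.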

Quantification of boundary overlap between searchlights

In order to quantify whether the overlap of neural state boundaries between different brain regions was larger than expected based on the number of state boundaries, we computed the relative boundary overlap as described above. We used the relative, instead of the absolute, overlap to make sure that the overlap between regions was not biased by regional differences in the number of states. The relative boundary overlap allows us to quantify the degree to which state boundaries are nested. The overlap between the neural state time series of searchlights i and j is defined as

$$O_{i,j} = \sum_{t=1}^{n} S_{i,t} \, S_{j,t}.$$

Here, we used the binary definition of Si and Sj, containing ones when there was a transition between states and zeros when there was no transition.

The expected overlap between neural state boundaries was quantified as

$$\mathrm{OE}_{i,j} = \frac{1}{n} \sum_{t=1}^{n} S_{i,t} \cdot \sum_{t=1}^{n} S_{j,t}.$$

The relative boundary overlap metric (OR) was scaled such that it was zero when it was equal to the expected overlap and one when it was equal to the maximal possible overlap:

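$$\mathrm{OR}_{i,j} = \frac{O_{i,j} - \mathrm{OE}_{i,j}}{\min\left\{\sum_{t=1}^{n} S_{i,t},\ \sum_{t=1}^{n} S_{j,t}\right\} - \mathrm{OE}_{i,j}}.$$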

For each pair of searchlights, we tested whether the boundary overlap was significantly different from zero across the 15 independent samples.
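
To illustrate why the relative overlap captures nesting, the sketch below computes the pairwise relative boundary overlap for a few toy searchlights. The function names and data are made up for the example.

```python
def relative_overlap(si, sj):
    """Relative boundary overlap between two binary boundary time series."""
    n = len(si)
    observed = sum(a * b for a, b in zip(si, sj))
    expected = sum(si) * sum(sj) / n
    return (observed - expected) / (min(sum(si), sum(sj)) - expected)


def overlap_matrix(series):
    """Symmetric matrix of relative overlaps for a list of boundary series."""
    k = len(series)
    matrix = [[1.0] * k for _ in range(k)]  # a series fully overlaps with itself
    for i in range(k):
        for j in range(i + 1, k):
            matrix[i][j] = matrix[j][i] = relative_overlap(series[i], series[j])
    return matrix


# Toy data (made up): a fast region, a slow region nested within it,
# and a region whose boundaries never coincide with the other two
fast = [1, 0, 1, 0, 1, 0, 1, 0]
slow = [1, 0, 0, 0, 1, 0, 0, 0]
other = [0, 1, 0, 0, 0, 1, 0, 0]
matrix = overlap_matrix([fast, slow, other])
```

Because the denominator uses the smaller of the two boundary counts, the slow region reaches a relative overlap of one with the fast region even though the fast region has twice as many boundaries.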

Identification of functional networks

In order to identify networks of regions that contained the same neural state boundaries, we computed the boundary overlap between each pair of searchlights as described above. We used the data in which all 265 participants were averaged. Based on the boundary overlap between all searchlight pairs, functional networks were detected using a consensus partitioning algorithm (Lancichinetti and Fortunato, 2012), as implemented in the Brain Connectivity Toolbox (Rubinov and Sporns, 2010). The aim of the partitioning was to identify networks (groups) of searchlights with high boundary overlap between searchlights within each network and low(er) overlap between searchlights in different networks. Specifically, an initial partition into functional networks was created using the Louvain modularity algorithm (Blondel et al., 2008), which was refined using a modularity fine-tuning algorithm (Sun et al., 2009) to optimize the modularity. The fit of the partitioning was quantified using an asymmetric measure of modularity that assigns a lower importance to negative weights than positive weights (Rubinov and Sporns, 2011).

Because the modularity maximization is stochastic, we repeated the partitioning 100 times. Subsequently, all 100 repetitions for all of the groups were combined into a consensus matrix. Each element in the consensus matrix indicates the proportion of repetitions and groups in which the corresponding two searchlights were assigned to the same network. The consensus matrix was thresholded such that values less than those expected by chance were set to zero (Bassett et al., 2013). The values expected by chance were computed by randomly assigning module labels to each searchlight. This thresholded consensus matrix was used as the input for a new partitioning, using the same method described above, until the algorithm converged to a single partition (such that the final consensus matrix consisted only of ones and zeroes).
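
The construction and thresholding of the consensus matrix can be sketched as follows. This is an illustration only: the study used the Brain Connectivity Toolbox, whereas the function names here are hypothetical, and summarizing the chance level as the mean off-diagonal agreement after shuffling the module labels is an assumption made for the example.

```python
import random


def consensus_matrix(partitions):
    """Proportion of partitions in which each pair of nodes shares a module.

    partitions: list of label lists, one list per repetition.
    """
    k = len(partitions[0])
    counts = [[0] * k for _ in range(k)]
    for labels in partitions:
        for i in range(k):
            for j in range(k):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def chance_level(partitions, rng):
    """Mean off-diagonal agreement after randomly reassigning module labels."""
    shuffled = []
    for labels in partitions:
        permuted = list(labels)
        rng.shuffle(permuted)
        shuffled.append(permuted)
    matrix = consensus_matrix(shuffled)
    k = len(matrix)
    off_diagonal = [matrix[i][j] for i in range(k) for j in range(k) if i != j]
    return sum(off_diagonal) / len(off_diagonal)


def threshold(matrix, chance):
    """Set consensus values below the chance level to zero."""
    return [[v if v >= chance else 0.0 for v in row] for row in matrix]


# Toy example (made up): three repetitions of a partition of four searchlights
partitions = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1]]
consensus = consensus_matrix(partitions)
chance = chance_level(partitions, random.Random(0))
thresholded = threshold(consensus, 0.5)  # fixed threshold, for illustration only
```

The thresholded matrix would then serve as the input to a new round of partitioning, repeated until the consensus matrix contains only ones and zeros.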

The procedure described above was applied for different values of the resolution parameter γ (varying γ between 1 and 3; Reichardt and Bornholdt, 2006). Increasing the value of γ allows for the detection of smaller networks. We used the same values for γ across the initial and consensus partitioning. We selected the partition with the highest similarity to a previous whole-brain network partition (Power et al., 2011), as measured by adjusted mutual information (aMI; Xuan Vinh et al., 2010). We specifically chose the parcellation by Power et al., 2011 as a reference as it provides functional network labels per voxel, rather than per region of interest, making it more similar to our searchlight analyses. To compare our network labels for each searchlight to the voxelwise Power networks, we labeled each searchlight according to the Power network label that occurred most frequently in the searchlight voxels. The highest similarity was observed for γ = 1.8 (aMI = 0.39). We named each functional network we identified in accordance with the Power network that it overlapped most with, in addition to a descriptive term about the network location (e.g., ventral, posterior) or function (early, late).

Co-occurrence of neural state boundaries and events

On the level of functional networks, we investigated the association between neural boundary co-occurrence and event boundaries. Just like our investigation of the role of boundary strength, we investigated whether taking boundary co-occurrence into account would increase the absolute overlap with events. We did this for both boundary co-occurrence within the network that a given searchlight is part of, as well as the co-occurrence across all searchlights in the brain.

To do this, we changed the neural state boundary time series for searchlight i (Si) such that, for timepoints with state transitions, it does not contain ones but the proportion of searchlights within that searchlight’s network (or across all searchlights in the whole brain) that also show a neural state boundary at that timepoint. After redefining Si in these ways, we recomputed the absolute overlap. This resulted in three measures of absolute overlap: the binary overlap (OAi), the within-network co-occurrence overlap (OA-Ni), and the whole-brain co-occurrence overlap (OA-WBi). We then investigated which brain regions showed a significant increase in overlap when we compared OA-Ni and OA-WBi to OAi, across the 15 independent samples.
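
The co-occurrence weighting can be sketched as below. Whether the searchlight itself counts towards the proportion is not stated above; the sketch includes it, which is an assumption, as are the function name and the toy data.

```python
def cooccurrence_series(index, series, members):
    """Weight each boundary of searchlight `index` by its co-occurrence.

    series:  list of binary boundary time series, one per searchlight.
    members: indices of the searchlights over which co-occurrence is computed
             (the searchlight's own network, or all searchlights).
    """
    own = series[index]
    weighted = []
    for t, is_boundary in enumerate(own):
        if is_boundary:
            weighted.append(sum(series[m][t] for m in members) / len(members))
        else:
            weighted.append(0.0)
    return weighted


# Toy data (made up): four searchlights, the first two forming one network
series = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
]
within_network = cooccurrence_series(0, series, members=[0, 1])
whole_brain = cooccurrence_series(0, series, members=[0, 1, 2, 3])
```

The resulting weighted series take the place of the binary series when the absolute overlap with the event boundaries is recomputed.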

To look in more detail at how boundaries that are shared vs. boundaries that are not shared are associated with the occurrence of an event boundary, we performed an additional analysis at the level of pairs of searchlights. For each pair of searchlights i and j, we created three neural state boundary time series: boundaries unique to searchlight i or to searchlight j (Si\j and Sj\i) and boundaries shared between searchlights i and j (Si&j). More formally, using the binary definition of the neural state boundary time series Si and Sj, these are defined at each timepoint t as

$$S_{i\&j,t} = S_{i,t} \, S_{j,t},$$
$$S_{i\setminus j,t} = S_{i,t} - S_{i\&j,t},$$
$$S_{j\setminus i,t} = S_{j,t} - S_{i\&j,t}.$$

Then, we investigated the absolute overlap between each of these three boundary series and the event boundaries as described in the section ‘Comparison of neural state boundaries to event boundaries.’ This resulted in three estimates of absolute boundary overlap: for boundaries unique to searchlight i (OAi\j), for boundaries unique to searchlight j (OAj\i), and for the shared boundaries (OAi&j). Then we tested whether the absolute overlap for the shared boundaries was larger than the absolute overlap for non-shared boundaries, using the searchlight that showed the largest overlap in its unique boundaries as the baseline: OAi&j > max{OAi\j, OAj\i}. Because the absolute boundary overlap is scaled by the total number of neural state boundaries, it is not biased when there is a larger or smaller number of shared/non-shared states between searchlights i and j. It is only affected by the proportion of neural state boundaries that overlap with an event boundary. If that proportion is the same for shared and non-shared boundaries, the overlap is also the same.
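
The comparison between shared and non-shared boundaries can be illustrated with the following minimal sketch. The function names and toy time series are made up for the example, and the statistical test across independent samples is not shown.

```python
def split_boundaries(si, sj):
    """Split two binary boundary series into shared and unique boundaries."""
    shared = [a * b for a, b in zip(si, sj)]
    only_i = [a - s for a, s in zip(si, shared)]
    only_j = [b - s for b, s in zip(sj, shared)]
    return shared, only_i, only_j


def absolute_overlap(events, states):
    """Absolute overlap between event boundaries and state boundaries."""
    n = len(events)
    observed = sum(e * s for e, s in zip(events, states))
    expected = sum(events) * sum(states) / n
    return (observed - expected) / (sum(states) - expected)


# Toy data (made up): 10 TRs, three event boundaries
events = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]
s_i = [0, 1, 0, 1, 1, 0, 0, 0, 0, 0]
s_j = [0, 1, 0, 0, 1, 0, 1, 0, 0, 0]

shared, only_i, only_j = split_boundaries(s_i, s_j)
oa_shared = absolute_overlap(events, shared)
oa_only_i = absolute_overlap(events, only_i)
oa_only_j = absolute_overlap(events, only_j)
shared_is_larger = oa_shared > max(oa_only_i, oa_only_j)
```

In this toy example, both shared boundaries coincide with an event boundary while neither unique boundary does, so the shared boundaries show the larger absolute overlap.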

Finally, we performed an exploratory analysis to further investigate how neural state boundaries are shared across the cortical hierarchy. To this end, we used a weighted stochastic block model (WSBM) to identify groups of timepoints, which we will refer to as ‘communities’ (Aicher et al., 2015). The advantage of WSBM is that it can identify different types of community structures, such as assortative communities (similar to modularity maximization) or core-periphery communities. As the input to the WSBM, we computed the Euclidean distance between the neural state boundary vectors of each timepoint. These neural state boundary vectors contain zeros for searchlights with no boundary and ones for searchlights with a neural state boundary at a specific timepoint. We varied the number of communities from 2 to 10, and we repeated the community detection 1000 times for each number of communities with a random initialization. The WSBM can be informed by the absence or presence of connections and by the connection weights. The alpha parameter α determines the trade-off between the two. Because we used an unthresholded Euclidean distance matrix as the input to the WSBM, we based the community detection only on the weights and not on the absence or presence of certain connections (fixing α to 0). The optimal number of communities was based on the log-likelihood for each number of communities. After identifying the communities, we ordered them based on the average number of neural state boundaries per timepoint in each community. The algorithm was implemented in MATLAB using code made available at the author’s personal website (http://tuvalu.santafe.edu/waaronc/wsbm/).
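
The distance matrix that served as the input to the WSBM can be sketched as follows. Only the distance computation is shown, not the WSBM itself; the function name and toy data are made up for the example.

```python
from math import dist  # available from Python 3.8


def timepoint_distances(series):
    """Euclidean distance between the boundary vectors of all timepoints.

    series: list of binary boundary time series, one per searchlight.
    Returns a timepoints x timepoints distance matrix.
    """
    vectors = list(zip(*series))  # one boundary vector per timepoint
    return [[dist(u, v) for v in vectors] for u in vectors]


# Toy data (made up): three searchlights, three timepoints
series = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 0, 1],
]
distances = timepoint_distances(series)
```

For binary boundary vectors, the squared distance between two timepoints equals the number of searchlights that have a boundary at one of the timepoints but not the other.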

Statistical testing and data visualization

The results reported in the article are based on analyses in which data were averaged over all 265 participants (or two groups of ~127 participants for the results in Figure 1). To investigate the statistical significance of the associations within each searchlight, network, or network pair, we also ran separate analyses within 15 independent samples of participants. For each metric of interest, we obtained a p-value for each searchlight by testing whether this metric differed significantly from zero across all 15 independent samples using a Wilcoxon signed-rank test (Wilcoxon, 1945). p-Values were corrected for multiple comparisons using FDR correction (Benjamini and Hochberg, 2000). Results in which the sign of the effect in the one group analysis did not match the sign of the average effect across the 15 independent subgroups were considered non-significant.
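
The testing procedure can be illustrated with a minimal stdlib sketch of an exact two-sided Wilcoxon signed-rank test against zero, followed by a Benjamini-Hochberg step-up procedure. This is not the code used in the study: the function names are hypothetical, the handling of zeros and ties (zeros dropped, tied values given average ranks) is an assumption, and the FDR level q = 0.05 is an assumed value that is not stated above.

```python
def signed_rank_p(values):
    """Exact two-sided Wilcoxon signed-rank p-value for a test against zero."""
    x = [v for v in values if v != 0]
    n = len(x)
    order = sorted(range(n), key=lambda idx: abs(x[idx]))
    doubled_ranks = [0] * n  # twice the average rank, so ties stay integer
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(x[order[j + 1]]) == abs(x[order[i]]):
            j += 1
        for k in range(i, j + 1):
            doubled_ranks[order[k]] = (i + 1) + (j + 1)
        i = j + 1
    w_plus = sum(r for r, v in zip(doubled_ranks, x) if v > 0)
    total = sum(doubled_ranks)
    # Null distribution of the positive-rank sum over all 2**n sign patterns
    counts = {0: 1}
    for r in doubled_ranks:
        updated = dict(counts)
        for s, c in counts.items():
            updated[s + r] = updated.get(s + r, 0) + c
        counts = updated
    extreme = min(w_plus, total - w_plus)
    tail = sum(c for s, c in counts.items() if s <= extreme)
    return min(1.0, 2 * tail / 2 ** n)


def benjamini_hochberg(pvals, q=0.05):
    """Return a list of booleans marking p-values rejected at FDR level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda idx: pvals[idx])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= q * rank / m:
            cutoff = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[idx] = True
    return rejected


# Toy example (made up): a metric that is positive in all 15 samples
p_consistent = signed_rank_p([0.1 * k for k in range(1, 16)])
```

With 15 independent samples, the smallest attainable two-sided p-value is 2/2^15, which is reached when the metric has the same sign in every sample.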

For all searchlight-based analyses, p-values from the searchlights were projected to the voxel level and averaged across the searchlights that overlapped each voxel before they were thresholded using the FDR-corrected critical p-value (Benjamini and Hochberg, 2000). When projecting the results of the analyses to the voxel level, we excluded voxels for which less than half of the searchlights that covered that voxel were included in the analysis. These excluded searchlights had too few in-brain voxels (see section ‘Whole-brain search for neural state boundaries’). Data were projected to the surface for visualization using the Caret toolbox (Van Essen et al., 2001).

Appendix 1

Supplementary methods

Detecting states

Since the publication of our article describing the GSBS algorithm, we discovered some issues that have resulted in several improvements to the algorithm. First, we discovered that for specific brain regions the original GSBS algorithm performed suboptimally; the placement of one new boundary at a late stage in the fitting process resulted in a large increase in the t-distances (our measure of fit; see Appendix 1—figure 1A). This suggests that a strong neural state boundary (i.e., demarcating a large change in neural activity patterns) was detected only in a late iteration of the algorithm, which led to an overestimation of the number of neural states. To deal with this problem, we adapted the algorithm, such that it can place two boundaries at the same time, essentially demarcating the location of a new ‘substate’ within a previously defined state (as described in ‘Materials and methods’). In the following, we refer to this adapted version as states-GSBS. This change in the fitting procedure remedied the issues we experienced before (see Appendix 1—figure 1B) and resulted in more robust fitting behavior, which we observed across many brain regions. It also resulted in a change in the approach we used to fine-tune boundary locations; while we previously fine-tuned boundary locations based on their order of detection, the order is now determined by the strength of the boundaries (weakest – strongest).

Appendix 1—figure 1
T-distance curves and the optimal number of states for the different versions of the greedy state boundary search (GSBS) algorithm.

(A) The t-distance curve for the original GSBS implementation for an example brain region. (B) The t-distance curve for the same brain region with the new option to place two boundaries in one iteration of the algorithm. The dotted lines indicate the optimal number of states for the original GSBS algorithm (blue line) and the states-GSBS algorithm (red line).

To investigate how these changes to GSBS impacted reliability, we split the data in two independent groups of participants and looked at the percentage of overlapping boundaries between the groups for each searchlight. To make sure differences in number of states between methods did not impact our results, we fixed the number of state boundaries to 18 or 19. Because the states-GSBS algorithm can place one or two boundaries at a time, we cannot fix the number of state boundaries exactly, which is why it can be either 18 or 19. We found that the number of overlapping boundaries between groups was substantially higher for states-GSBS compared to the original GSBS implementation and also compared to the GSBS implementation with altered fine-tuning (see Appendix 1—figure 2A). This was also the case when we used the optimal number of states as determined by the t-distance, instead of fixing the number of states (see Appendix 1—figure 2B). We also investigated the reliability of regional differences in states duration by computing the correlations in median state duration across all searchlights between the two independent groups. Again we observed that reliability increased substantially for states-GSBS compared to the original GSBS implementation and also compared to the GSBS implementation with altered fine-tuning (see Appendix 1—figure 2C).

Appendix 1—figure 2
Comparing different implementations of the greedy state boundary search (GSBS) algorithm and different data preprocessing steps.

(A) The percentage of overlapping boundaries between two independent groups for each searchlight. The bar shows the mean across 5029 searchlights, while the error bar shows the standard error. The number of states was fixed to k = 18/19. (B) Same as (A) but now the number of states was determined by the optimal t-distance. (C) The correlation between the estimated median state lengths over all searchlights between two independent groups (correlation computed across 5029 searchlights). Bounds FTo = the original GSBS implementation; bounds FTs = the GSBS implementation with strength-ordered fine-tuning; states = the states-GSBS implementation that can place two boundaries at a time; states DC = states GSBS applied to data deconvolved with a canonical hemodynamic response function (HRF); states DE = states GSBS applied to data deconvolved with an estimated HRF.

Deconvolution

Another issue we discovered with GSBS is that it was unable to detect short states (of one or two TRs) in some cases. Specifically, we noticed that this happens when consecutive neural states are strongly anticorrelated. We observed such anticorrelated states in many of our searchlights. To investigate this issue, we simulated data with 15, 30, or 50 neural states within 200 TRs using the same setup as our previous work (Geerligs et al., 2021). However, instead of randomly generating an activity pattern per state, we used one activity pattern that we inverted when there was a state boundary. This resulted in strongly anticorrelated states (see Appendix 1—figure 3A). In this simulated setup, we found that as the number of states increased and there were more states with very short durations, the number of states was underestimated by states-GSBS. We hypothesized that this was due to the slow hemodynamic response, which obscures transitions between short states. Indeed, when we deconvolved the simulated data, the number of states was estimated correctly, even when there were 50 states (see Appendix 1—figure 4B).

Appendix 1—figure 3
Performance of states-GSBS on simulated data with anticorrelated states and (from left to right) 15, 30, or 50 states.

(A) The correlation matrices with the detected neural state boundaries in white. (B) The t-distance curves, where the black line indicates the simulated number of states. For k = 30 and k = 50, the t-distance peaks at a number of states that is below the simulated number of states, suggesting that some of the boundaries of short-lasting neural states are not detected. GSBS, greedy state boundary search.

Appendix 1—figure 4
Estimated hemodynamic response function (HRF) peak delays and the impact of HRF peak delay on the estimated number of states in simulated data.

(A) The estimates of the HRF peak delays for all searchlights are shown in a histogram. The values are averaged across all participants within each of the independent groups. The distribution is highly similar for both groups. (B) t-distance curves are shown for simulated data with different HRF delays that are deconvolved with a canonical HRF. There is no systematic under- or overestimation of the number of states when the HRF peak delays are in the range of the empirical data.

Because we wanted to be able to identify states with both short and long durations, we chose to apply HRF deconvolution with a canonical HRF to our data before running states-GSBS. We found that deconvolution resulted in a very large improvement in the reliability of regional differences in states duration and also substantially increased the boundary overlap between independent samples (see Appendix 1—figure 2). However, it should also be noted that deconvolution may not be optimal for every study interested in neural states. It is particularly important for studies that are interested in accurately identifying the number of neural states in particular brain regions and for studies that are interested in short-lasting states. For studies that are more interested in transitions between longer-lasting states, the deconvolution may actually result in a lower signal-to-noise ratio, in particular because we also observed that deconvolution reduces the similarity of timepoints within the same state and increases the similarity of timepoints between states.

Regional differences in HRF

One concern with deconvolution is that there are known differences between regions in the timing of the HRF (Taylor et al., 2018). To investigate whether such differences might impact our results, we estimated the HRF for each participant and each searchlight using the rsHRF toolbox that is designed to estimate HRFs in resting state data (Wu et al., 2021). In this case, we applied the algorithm to our fMRI data recorded during movie watching. Because HRF estimation is applied to single-subject data that contains many sources of noise, we performed some extra data denoising steps (as in Geerligs and Campbell, 2018), which included regressing out signals from the CSF and white matter as well as head motion signals. This denoised data was used to run the HRF estimation for each participant and for each voxel within a searchlight. Subsequently, the HRF shape was averaged across all voxels in a searchlight and the data were deconvolved for each participant. Importantly, we used the same data as before (without the extra denoising steps) as the input for the deconvolution to make sure results were comparable. Also, we observed that data cleaning removed some of the signal of interest, resulting in slightly decreased reliability of boundaries. After deconvolution, the data were again averaged within the two independent groups of participants and then we applied the states-GSBS algorithm.

Appendix 1—figure 4A shows the estimated HRF peak delays for each brain region after averaging the estimated peaks across all participants within each of the two independent groups. The differences between searchlights in their estimated HRF delay (averaged across participants) were highly reliable across the two independent groups (r = 0.87). The peak delays varied between 4.6 and 5.4 s, which is very similar to the delay of the canonical HRF (5 s). When we deconvolved the data with the estimated HRF instead of the canonical HRF, we found that this resulted in a slight decrease in the boundary overlap between independent samples and also slightly reduced the reliability of regional differences in states duration (see Appendix 1—figure 2). Regional differences in state duration were highly similar when we compared the deconvolution with the canonical HRF and the deconvolution with the estimated HRF (r = 0.92 and r = 0.93 for the two groups at the voxel level and r = 0.73 and r = 0.76 at the level of searchlights). These results suggest that regional differences in the HRF shape did not bias the estimated regional timescales.

To investigate this in more detail, we ran additional simulations to investigate the consequences of slight deviations in the HRF shape on the recovery of the state boundaries (see Appendix 1—figure 4B). Simulated data with HRF delays between 4.6 and 5.4 s, which were deconvolved with a canonical HRF, did not show an under- or overestimation in the number of states. Together, these results show that the regional differences we observe in the duration of neural states when we use data that is deconvolved with a canonical HRF cannot be explained by regional differences in the HRF shape. Furthermore, it is not clear that the extra step of estimating the HRF shape results in more accurate or reliable results. That is why we opted for the simpler approach of deconvolution with a canonical HRF throughout the article.

Sample size effects

To look at how replicable results are across samples and how this depends on the sample size, we computed the proportion of boundaries that was shared between each unique pair of participant groups. In line with our previous work, we observed that the boundary time courses were a lot more consistent between different pairs of participant groups when the data were split into two independent groups of 127/128 participants per group (63% of boundaries shared on average) than when the data were split into 15 groups of around 17/18 participants per group (49% of boundaries shared on average). That is why, throughout the article, we report the results with the largest possible sample size. To make sure there were enough unique data points to perform tests for statistical significance to show the consistency of effects across samples, all statistical analyses are performed on the data split into 15 independent groups.

Supplementary results

Reliability

As a first step, we determined in which searchlights neural state boundaries were sufficiently reliable for follow-up analyses when we looked at the smallest and therefore least reliable sample size (15 groups with 17/18 participants per group). Specifically, we investigated which searchlights showed a significantly positive Pearson correlation between state boundary time courses in each participant group and the average state boundary time courses across all other participant groups (similar to Geerligs et al., 2021). Reliable boundary time courses were observed in 5029 out of 5061 searchlights. The 32 regions without reliable boundary time courses were not included in any of the analyses reported in the article or Appendix 1 (‘Supplementary methods’ or ‘Supplementary results’; see Appendix 1—figure 5A for a map of these regions). In these 5029 regions, the reliability was highest for searchlights around the visual and auditory cortex and lowest around the paracentral lobule and the posterior parts of the orbitofrontal cortex (see Appendix 1—figure 5B).
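
The leave-one-group-out reliability measure described above can be sketched as follows. The function names and the toy boundary series are made up for the example, and the test for a significantly positive correlation is not shown.

```python
def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def leave_one_out_reliability(group_series):
    """Correlate each group's boundary series with the mean of all others.

    group_series: one binary boundary time series per participant group,
    all for the same searchlight.
    """
    scores = []
    for g, own in enumerate(group_series):
        others = [s for k, s in enumerate(group_series) if k != g]
        average = [sum(vals) / len(others) for vals in zip(*others)]
        scores.append(pearson(own, average))
    return scores


# Toy data (made up): boundary series of one searchlight in three groups
groups = [
    [1, 0, 1, 0, 0, 1],
    [1, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 1],
]
scores = leave_one_out_reliability(groups)
```

The correlation is undefined for a series without any boundaries or when the average of the other groups is constant; the sketch does not handle these cases.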

Appendix 1—figure 5
Reliability of neural states boundaries across the brain and an overview of searchlights that were excluded due to poor reliability.

(A) Searchlights that were excluded due to poor reliability. (B) A map of the reliability of neural state boundaries across the cortex.

Overlap between searchlights

To investigate whether the strong relative boundary overlap between brain regions could be caused by shared sources of noise across brain regions, we recomputed this overlap based on the data from two independent groups of participants. To make sure the resulting matrix remained symmetric, we averaged the results across the two possible orders of participant groups (i.e., comparing searchlight 1 in subgroup 1 to searchlight 2 in subgroup 2, as well as comparing searchlight 1 in subgroup 2 to searchlight 2 in subgroup 1). The results show that the relative boundary overlap computed in this way across independent datasets is highly similar to the boundary overlap within one subgroup (r = 0.69, see Appendix 1—figure 6A and B), showing that shared noise cannot be the cause of the strong overlap we observed.

Appendix 1—figure 6
Investigating the role of shared noise, ‘regular’ functional connectivity and regional differences in state length in shaping the boundary overlap between pairs of searchlights.

(A) The relative neural state boundary overlap between each pair of searchlights. (B) Same as (A), but computed between two independent groups of participants. This ensures that the overlap cannot be caused by noise shared across brain regions. (C) The correlation matrix based on the averaged brain activity time course in each searchlight (i.e., standard measure of functional connectivity). (D) The difference between each pair of searchlights in median state length was markedly different from the relative boundary overlap (shown in A), showing that the boundary overlap between different regions was not just due to regional differences in the optimal number of states.

To examine whether the relative boundary overlap is simply a proxy for ‘regular’ functional connectivity, we compared it to the correlation between mean activity time courses in each searchlight (see Appendix 1—figure 6A and C). We found that these correlation patterns could only explain a small part of the regional differences in boundary overlap. To investigate whether the boundary overlap is simply a result of regional similarities in state length, we compared the boundary overlap to regional differences in state duration (see Appendix 1—figure 6A and D). We found that differences in state duration cannot explain the overlap patterns we observed.

Effects of noise on overlap between neural states and events for shared boundaries

One concern is that identifying boundaries shared by two regions has a similar effect to averaging, which provides a better estimation of boundaries within each searchlight because it reduces noise. This noise reduction could be the cause of the increased overlap between events and neural states for shared boundaries vs. non-shared boundaries. To investigate this possibility, we examined the increase in overlap for shared vs. non-shared values in the data averaged across 265 participants as well as for each independent subgroup of 17/18 participants. If noise reduction is the cause of the increase in overlap with event boundaries, we should expect the difference between shared and non-shared boundaries to be largest in the smaller independent subgroups where there is the most to be gained from noise reduction. In contrast, if the increase in overlap with event boundaries is a real effect, not due to noise, its effect size should be larger in the data averaged across all participants, where estimates of boundary locations are more accurate. The results in Appendix 1—figure 7 show that the latter interpretation is correct, making it unlikely that the observed increase in overlap between neural state and event boundaries is related to noise.

Appendix 1—figure 7
Mean increase in absolute overlap for shared vs. non-shared boundaries for large and small groups of participants.

(A) Mean increase in absolute overlap for shared vs. non-shared boundaries across all pairs of searchlights and (B) for the pairs of searchlights that showed a significant increase in overlap. The effect size for the full sample of 265 participants is shown by the red line, and the effect sizes for the independent subgroups of 17/18 participants are shown in blue. The effect size is larger in the data averaged across all 265 participants, suggesting that the increase in overlap is not due to noise reduction.

Head motion

Another quality check we performed was to investigate the association between neural state boundaries and head motion. First, we computed the average amount of head motion for each TR across all the participants in each of the 15 independent groups. Second, we computed Pearson correlations between the neural state boundary time courses and the average head motion time courses. We investigated whether there was a consistently positive or negative association between state boundaries and head motion across the 15 samples after FDR correction for multiple comparisons. This was not the case for any of the searchlights.

Stability of communities of timepoints

In the main text, we identified different communities of timepoints that varied in the degree to which neural state boundaries were shared across the cortical hierarchy. To make sure that these findings are replicable, we repeated the same analysis in two independent samples of participants (see Appendix 1—figure 8). Even though the two subgroups did not have the same number of communities (four in group 1, five in group 2), the pattern of results was highly similar across the two. Both groups showed communities that differed in the degree to which boundaries propagated across the cortical hierarchy, and in both cases this was associated with the occurrence of event boundaries: timepoints in communities with more widespread boundaries were also more likely to coincide with event boundaries.

Appendix 1—figure 8
Stability of communities of timepoints across two independent groups of participants.
Appendix 1—table 1
For each network, the table lists the network defined by Power et al., 2011 that showed the highest overlap and the percentage of searchlights in the network that overlapped with that particular Power et al., 2011 network.

Network name | Power network | Percentage of searchlights
Motor | Sensorimotor | 49
Sensorimotor-medial | Sensorimotor | 45
Sensorimotor-lateral | Sensorimotor | 37
Auditory | Auditory | 24
Visual early | Visual | 58
Visual late | Visual | 20
Dorsal attention network | Dorsal attention network | 27
Cingulo-opercular network | Cingulo-opercular network | 23
Fronto-parietal control network | Fronto-parietal task control | 24
Posterior default mode network | Default mode network | 57
Superior default mode network | Default mode network | 24
Anterior default mode network | Default mode network | 68

Appendix 1—table 2
For the three separate default mode networks (DMNs) we identified, the table lists the overlap with the posterior and anterior DMN defined in Campbell et al., 2013.

Network name | Percentage of searchlights that overlap with anterior DMN | Percentage of searchlights that overlap with posterior DMN
Posterior DMN | 39 | 70
Superior DMN | 52 | 31
Anterior DMN | 77 | 65
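
Overlap summaries of the kind reported in these two tables can be computed from two label vectors with one entry per searchlight. The sketch below is an illustration under assumptions rather than the code used for the tables: it assumes each searchlight has already been assigned to at most one reference network (for example by its location), and the function and label names are hypothetical.

```python
from collections import Counter


def best_matching_network(network_labels, reference_labels):
    """For each data-driven network, find the reference network that covers
    most of its searchlights.

    Both arguments hold one label per searchlight; a reference label of None
    marks a searchlight that falls outside every reference network.
    Returns {network: (reference_network, percentage_of_searchlights)}.
    """
    if len(network_labels) != len(reference_labels):
        raise ValueError("label vectors must have the same length")
    counts_per_network = {}
    for network, reference in zip(network_labels, reference_labels):
        counts_per_network.setdefault(network, Counter())[reference] += 1
    result = {}
    for network, counts in counts_per_network.items():
        total = sum(counts.values())  # all searchlights in this network
        labelled = {ref: c for ref, c in counts.items() if ref is not None}
        if not labelled:
            result[network] = (None, 0.0)
            continue
        reference, hits = max(labelled.items(), key=lambda item: item[1])
        result[network] = (reference, 100.0 * hits / total)
    return result


# Toy example: four searchlights in network "A", two in network "B".
print(best_matching_network(
    ["A", "A", "A", "A", "B", "B"],
    ["Visual", "Visual", "Visual", None, "Default mode", "Default mode"],
))
```

The percentage is taken relative to all searchlights in the data-driven network, so searchlights outside every reference network lower the score.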

Data availability

The data used in this project can be requested via https://camcan-archive.mrc-cbu.cam.ac.uk/dataaccess/. The code used to generate the results in the paper is available at https://github.com/lgeerligs/NestedHierarchy (copy archived at swh:1:rev:9049f7500c6db1b90b539bcf859e59edb55f5fa6). The improvements to our GSBS algorithm that are presented in this paper are released in a Python package: https://pypi.org/project/statesegmentation/.

Decision letter

  1. David Badre
    Reviewing Editor; Brown University, United States
  2. Michael J Frank
    Senior Editor; Brown University, United States
  3. Charan Ranganath
    Reviewer; University of California at Davis, United States

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Decision letter after peer review:

[Editors’ note: the authors submitted for reconsideration following the decision after peer review. What follows is the decision letter after the first round of review.]

Thank you for submitting your work entitled "Timescales and functional organization of event segmentation in the human brain" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and a Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Charan Ranganath (Reviewer #2).

Our decision has been reached after consultation between the reviewers. Based on these discussions and the individual reviews below, we regret to inform you that this submission will not be considered further for publication in eLife.

The reviewers were in agreement that this is an important topic and that this research is both interesting and promising. However, the reviewers raised a number of significant concerns that centered around two themes. First, there were a number of points raised about the methodology itself and its validity. After some discussion, it was decided that these methodological points could be addressed through additional analysis and/or simulation, but would likely require considerably more work than would be usual for an eLife revision. The second set of concerns was with regard to the clear scientific advance over prior work; there was not consensus that these findings move the field forward in a clear way. Reviewer 1 suggested that the generalizability and impact might be improved by drawing direct links to the existing literature, including analysis of a secondary dataset as in the Baldassano et al. (2017) paper, though there might be other ways to clarify the impact as well. Regardless, this is a challenging concern to address in a straightforward way through revision.

As addressing these concerns would likely require more than is typically expected for an eLife revision, it was decided to reject this submission. This being said, if you were to undertake the work required to conclusively address these issues, there was sufficient enthusiasm among reviewers that they would be willing to consider this paper again, as a new submission.

I have appended the detailed reviews to this decision letter. I hope you find them constructive with this work.

Reviewer #1:

In this paper, Geerligs et al. focus on the alignment of event boundaries across brain regions. They examine the transitions between brain states using the method introduced by Baldassano et al. (2017), and how these state transitions are shared across nodes of large-scale brain networks. They introduce a method that enables them to map event-timescales in a broader set of regions than previously possible, and they use this method to reveal how functional networks of regions share time-aligned "event transitions".

This is a well-written manuscript on a timely and important question.

My main concerns relate to the validity (and potential sources of bias) in the methodology for identifying the event-rate of each region, and I also outline a number of other areas where the conceptual and methodological framing could be improved.

p.3 "This dataset, in combination with the application of hyperalignment to optimize functional alignment (Guntupalli et al., 2016), allowed us to study event segmentation across the entire cortex for the first time, because this dataset shows reliable stimulus-driven activity (i.e., significant inter-subject correlations) over nearly all cortical brain regions (Geerligs et al., 2018)."

A central methodological question, which affects almost every claim in this manuscript, is whether the inference of event boundaries from the HMM model (the methods in Figure 1) is valid, and in what ways it might be biased. The validity question is simple: does it measure what it is supposed to measure? In particular, I would like the authors to justify the final step, in which they compute the difference between the correlation for real boundaries and the correlation for random boundaries. Surely, this difference computation will be affected by the noise ceiling of the individual ROI being examined? I understand why using the random condition as a "reference" makes some sense, but I do not understand why the final decision is made based on the simple arithmetic difference of the mean value for the random boundaries and real boundaries? I suggest that the authors justify this procedure using a simulation procedure where the ground truth about event transitions is known, and the procedure should be compared against the method applied in the original Baldassano et al. (2017) paper.

The bias question is also fairly simple: which factors influence the "k" that is inferred? In particular, if a region has high reliability or low reliability of its response across subjects, does this affect the number of events that will be inferred for that region using the HMM procedure? As noted above, this simulation could additionally investigate how the "k" value varies as a function of the noise level (i.e. response reliability) of the ROI.
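
A minimal sketch of the kind of ground-truth simulation suggested here is given below. It is an illustration under stated assumptions, not the authors' or the HMM procedure: each state is simulated as a fixed random voxel pattern plus independent Gaussian noise, the fit measure is assumed to be the mean within-state minus across-state correlation of timepoint patterns, and random boundaries are drawn uniformly with the same number of boundaries. All function names and parameter values are hypothetical.

```python
import numpy as np


def state_labels(n_time, boundaries):
    """Convert boundary timepoints (first TR of each new state) to state labels."""
    starts = np.zeros(n_time, dtype=int)
    starts[np.asarray(boundaries, dtype=int)] = 1
    return np.cumsum(starts)


def simulate_region(n_time, n_vox, boundaries, noise_sd, rng):
    """One random voxel pattern per state plus independent Gaussian noise."""
    labels = state_labels(n_time, boundaries)
    patterns = rng.standard_normal((labels.max() + 1, n_vox))
    return patterns[labels] + noise_sd * rng.standard_normal((n_time, n_vox))


def within_minus_across(corr, boundaries):
    """Mean correlation of timepoint pairs within a state minus across states."""
    labels = state_labels(corr.shape[0], boundaries)
    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    return corr[same & off_diag].mean() - corr[~same].mean()


def real_minus_random(data, boundaries, n_random, rng):
    """Fit for the given boundaries minus the mean fit for random boundaries."""
    corr = np.corrcoef(data)  # timepoint-by-timepoint pattern correlations
    real = within_minus_across(corr, boundaries)
    candidates = np.arange(1, data.shape[0])
    random_fits = [
        within_minus_across(
            corr, rng.choice(candidates, size=len(boundaries), replace=False)
        )
        for _ in range(n_random)
    ]
    return real - np.mean(random_fits)


rng = np.random.default_rng(0)
true_boundaries = [20, 45, 60, 90]
for noise_sd in (0.5, 2.0, 4.0):  # higher noise_sd mimics lower response reliability
    data = simulate_region(120, 50, true_boundaries, noise_sd, rng)
    print(noise_sd, round(real_minus_random(data, true_boundaries, 100, rng), 3))
```

Because the true boundaries are known, the same simulated data could also be segmented with different numbers of states to check whether the inferred "k" drifts as the noise level changes.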

Additionally, although hyperalignment renders a larger swathe of cortex available to analysis, there will still be variability in the reliability of the signal across regions, and this might interact with the hyperalignment performance. In particular, the accuracy of the hyperalignment procedure (for each subject) will presumably also increase for regions whose reliability of response is higher; it is therefore very important to consider whether noise (in "space") introduced by the hyperalignment procedure (and varying across regions as a function of their reliability) could further bias the measurement of the event-timescale via the HMM procedure.

Finally, to better understand this method, the authors could also apply their approach to the freely available data from the Baldassano et al. (2017) paper. Does this method produce results that are at least qualitatively similar? This could help to resolve the question of why the event timescales in this paper are shorter than those observed in the Baldassano et al. paper.

p.7: Event networks: "We found that event boundaries are shared within long-range networks that resemble the functional networks that are typically identified based on (resting state) timeseries correlations (see figure 3A)".

This is one of the most intriguing aspects of this paper. However, it would be much more convincing if the authors would replace their qualitative language (e.g. "resemble") with quantitative metrics of overlap. The overlap could be measured between (a) networks defined based on event-timing and (b) networks defined based on functional connectivity. All of the major functional networks should be available in atlases (e.g. the Yeo lab atlases) or via data sharing repositories. Thus, the authors should be able to substantiate their broad claims of "resemblance" with quantitative demonstrations of how well the event-networks match the functional-connectivity-networks. All of the visual networks as well as the FPN and DMN should be quantitatively compared against standard networks defined elsewhere in the literature.

On the same point: p.13 "The fractionation of the DMN into a fast and slow subnetwork closely aligns with the previously observed posterior and anterior DMN subnetworks (Andrews-Hanna et al., 2010; Campbell et al., 2013; Lei et al., 2014)."

Again, please quantify the alignment when claiming spatial alignment with prior findings.

p.13 "Our results show for the first time that neural events are shared across brain regions in distinct functional networks."

The authors should consider re-wording this sentence to distinguish their findings from what was already shown in Figure 4B of Baldassano et al. (2017). In particular, note the commonality of event boundaries across early visual and late visual areas (part of the visual network), as well as the commonality of events across angular gyrus and posterior medial cortex (parts of the DMN).

On a related note, in the Abstract we read: "This work extends the definition of functional networks to the temporal domain" – I am unclear on how novel this extension is. To the best of my understanding, the concept of dynamic functional connectivity is not new (e.g. Hutchison et al., 2013), and even second-order pattern-transition methods have been employed to study functional networks (e.g. Anzellotti and Coutanche, 2018). I would like the authors to sharpen their argument for why this result is not entirely expected in light of prior work. Shouldn't members of the same functional networks be expected to exhibit state-transitions at rates higher than chance?

p.11. I struggled to follow the logic of the analysis employed in Figure 6. Why is event duration being predicted from individual frequency bands of the PSD? There is voluminous evidence for band-specific and region-specific artifact (e.g. Birn et al., 2013; Shmueli et al., 2007). Furthermore, distinct functional networks have distinct frequency profiles and coherence patterns (e.g. Salvador et al., 2008; Baria et al., 2011; Stephens et al., 2013). Finally, the frequency bands in the PSD are non-independent (because of the temporal smoothing in the BOLD signal). Therefore, the relationship between frequency band and event duration is confounded by (i) non-independence of frequencies and (ii) frequency covariation across brain regions which arises for a multitude of reasons. The results in Figure 6A seem rather noisy to me, and I imagine that this is because the regression procedure on the PSD is influenced by many interacting and confounding variables.

Another reason why this analysis produces (in my opinion) curious results is that it spans distinct sensory modalities which are already known to have opposite PSD-event relationships: along the auditory pathway, PSDs get flatter as event time-scales get longer, while in the visual pathway, PSDs in V1 are already very steep, even while the event timescales are short. It is not clear what is gained by fitting a single model to regions with obviously different relationships of PSD and event structure.

p.12. "These results suggest that visual and auditory stimulation are a prerequisite for observing the temporal hierarchy we describe in this paper and that this hierarchy only partly reflects an intrinsic property of brain function that is also present in the resting state."

I do not follow the logic supporting this claim. How can we know whether the (event-based) temporal hierarchy is preserved in the resting state unless we can measure the event transitions in the resting state data? Isn't this analysis just another way of saying that the PSDs have different shapes during rest and during movie viewing?

References

Anzellotti, S., and Coutanche, M. N. (2018). Beyond functional connectivity: investigating networks of multivariate representations. Trends in cognitive sciences, 22(3), 258-269.

Baria, A. T., Baliki, M. N., Parrish, T., and Apkarian, A. V. (2011). Anatomical and Functional Assemblies of Brain BOLD Oscillations. Journal of Neuroscience, 31(21), 7910-7919. https://doi.org/10.1523/JNEUROSCI.1296-11.2011

Birn, R. M., Diamond, J. B., Smith, M. A., and Bandettini, P. A. (2006). Separating respiratory-variation-related fluctuations from neuronal-activity-related fluctuations in fMRI. Neuroimage, 31, 1536-1548. https://doi.org/10.1016/j.neuroimage.2006.02.048

Coutanche, M. N., and Thompson-Schill, S. L. (2013). Informational connectivity: identifying synchronized discriminability of multi-voxel patterns across the brain. Frontiers in human neuroscience, 7, 15.

Hutchison, R. M., Womelsdorf, T., Allen, E. A., Bandettini, P. A., Calhoun, V. D., Corbetta, M., … Chang, C. (2013). Dynamic functional connectivity: Promise, issues, and interpretations. NeuroImage, 80, 360-378. https://doi.org/10.1016/j.neuroimage.2013.05.079

Salvador, R., Martínez, A., Pomarol-Clotet, E., Gomar, J., Vila, F., Sarró, S., … Bullmore, E. (2008). A simple view of the brain through a frequency-specific functional connectivity measure. NeuroImage, 39(1), 279-289. https://doi.org/10.1016/j.neuroimage.2007.08.018

Shmueli, K., van Gelderen, P., de Zwart, J. A., Horovitz, S. G., Fukunaga, M., Jansma, J. M., and Duyn, J. H. (2007). Low-frequency fluctuations in the cardiac rate as a source of variance in the resting-state fMRI BOLD signal. Neuroimage, 38(2), 306-320.

Stephens, G. J., Honey, C. J., and Hasson, U. (2013). A place for time: The spatiotemporal structure of neural dynamics during natural audition. Journal of Neurophysiology, 110(9), 2019-2026. https://doi.org/10.1152/jn.00268.2013

Reviewer #2:

In this paper, Geerligs and colleagues leverage a large, publicly-available dataset in order to assess shared and distinct timescales of neural pattern shifts at event boundaries across different areas of the brain. In line with prior work, the authors report a gradient of timescales in neural event segmentation, with sensory regions comprising the fastest-shifting areas and 'default mode' nodes such as precuneus and medial prefrontal cortex comprising the slowest-shifting areas. Importantly, the authors build on this previous research and demonstrate that canonical functional networks – such as the frontoparietal network, and the 'default mode' network – feature distinct subnetworks with corresponding faster and slower timescales of pattern shifts. Finally, a fairly novel analysis applied to these types of data examined power spectral density across regions, which could be used to predict event duration across regions (consistent with observed pattern shifts), and could partly, but not entirely, characterize resting-state fMRI data (suggesting that the audiovisual stimulus drove additional functional properties in brain networks not observed during rest).

Overall, this is an interesting and timely study. The question of how the brain segments naturalistic events is one of increasing popularity, and this manuscript approaches the question with a large sample size and fairly thorough analyses. That said, there are a number of questions and concerns, primarily regarding the analyses.

• Procedures such as hyperalignment, or the related shared response model used by Baldassano and colleagues, are typically implemented by training on one set of the data, and applying the alignment procedure to a separate, held-out dataset (i.e., training and testing sets). It is unclear whether this approach was taken in the current study, or whether the hyperalignment algorithm was trained and tested on the same dataset. In the latter case, there is a degree of circularity in the way across-participant alignment was conducted, potentially leading to biased correlation measures. The movie used in the CamCAN dataset is only 8 minutes long, which is probably not enough data for obtaining separate training and test datasets. However, this is still potentially a serious issue for this manuscript, and I am not sure if the use of hyperalignment is appropriate. If I have misunderstood the methodology, it perhaps warrants some clarification in how the training and application of the hyperalignment algorithm proceeded. (I will note that I am aware you used cross-validation for deriving the number of events, but that is unfortunately a separate issue from a train-test split in the hyperalignment routine itself.)

• A key finding from the study is that the FPN and DMN fractionate into different subnetworks that have fast and slow timescales. As noted above, the present results are based on an analysis of data from a relatively short period of time. Although the sample size is very large, one wonders whether this distinction would remain solid with a longer movie. With a very short movie, one can only sample a small number of real events, and this could lead to some instability in estimates of the timescale of representations in relation to the events. This might be an issue in relation to the differentiation of fast and slow subnetworks within the FPN and DMN. For instance, Figure 3B, suggests that the fit values for the slow FPN remain more or less stable across a range of event durations (which presumably reflect k values?). The slow FPN shows an interesting bimodal distribution (as do many of the networks) with the second peak coinciding with the peak for the fast FPN. The differentiation is a bit more convincing for the fast and slow DMN, but it is still not clear whether there are enough events and enough fMRI data from each subject to ensure reliable estimates of the timescales. Just to provide some context for this point, some estimates suggest that reliable identification of resting state networks requires at least 20 minutes of fMRI data.

• Throughout the paper, fMRI results are described in reference to event processing, but the relationship is underdeveloped. Much of the paper relies on the Hidden Markov Model, which assumes that there is a pattern that remains stationary throughout an event. Baldassano's data show a surprisingly strong correspondence in posterior medial cortex, but it is less clear whether this assumption is valid for other areas. In relation to this point, one can think of event processing as an accumulation of evidence. At the onset of an event, one might have a decent idea of what is about to happen, but as information comes in, the event model can be refined to make stronger predictions. These kinds of within-event dynamics would be lost in the Hidden Markov Model. A related point is that the paper conflates timescales of neural states with psychologically meaningful conceptions of events. Event segmentation theory (EST) suggests that event segmentation is driven by prediction error; by one interpretation of the model, sensory information can change considerably without leading one to infer an event boundary. However, change in incoming sensory information would almost certainly lead to the detection of "event boundaries" across short timescales in sensory cortical areas. Figure 5 makes it fairly clear that there is a pretty strong distinction to be made between data-driven event identification based on the fMRI data and psychologically meaningful events inferred by the subjects. It would be helpful for the authors to be clearer about what the data do and do not show in relation to putative event cognition processes.

• Why were voxels with an intersubject correlation of less than r=0.35 excluded from analyses? Is this based on prior studies or preliminary analyses? It is not necessarily a bad thing if this choice was made arbitrarily, but I imagine this threshold could have important impacts on the data as presented, so it is worth clarifying.

• Was ME-ICA the only step taken to account for head motion artifacts? If so, there is some concern about whether this step was sufficient to deal with the potential confound. This is especially critical given the fairly brief time series being analyzed here. It would be more compelling to see a quantitative demonstration that head motion is not correlated with the measures of interest.

• A related issue is that of eye movements. Eye movements are related to event processing (e.g., Eisenberg et al., 2018), so one can expect neural activity related to event prediction/prediction error to be confounded with lower-level effects related to eye movements. For instance, we might expect signal artifacts in the EPI data, as well as neural activity related to the generation of eye movements, and changes in visual cortex activity resulting from eye movements. It is unlikely that this issue can be conclusively addressed with the current dataset, and it's not a deal-breaker in the sense that eye movements are intrinsically related to naturalistic event processing. However, it would be useful for the authors to discuss whether this issue is a potential limitation.

• The power spectral analyses were a bit difficult to follow, but more importantly, the motivation for the analysis was not clearly described. The main take-home points from this analysis are nicely summarized at the end of p. 14, but it would be helpful to clarify the motivation for this analysis (and the need for doing it) on p.11 in the Results section. Relatedly, is Figure 6A an example spectrum from a particular voxel or region, or an average across regions?

• The take-home message appears to be that different brain networks have different timescales at which they seem to maintain event representations. Moreover, certain networks (e.g., the posterior medial/'default mode' network) do not have uniformly fast or slow timescales. The network-based analysis used here is indeed novel, but the impact of the work could be enhanced by clarifying the significance of the results in relation to what we know about event processing. The explicit demarcation of 'fast' and 'slow' subnetworks may be the key conceptual advance, as was the power spectral analysis, but it isn't clear whether these conclusions could also be ascertained from the maps shown in Baldassano et al., 2017 or other papers from the Hasson group.

This review was completed by Zach Reagh, Ph.D. in collaboration with Charan Ranganath, Ph.D. (I sign all reviews)

Reviewer #3:

Geerligs and colleagues conduct a thorough set of analyses aimed at identifying event segmentation timescales across the cortex in a large cohort of participants. They extend previous work by Baldassano et al. by covering the entire cortex, and nicely control for the power spectrum of different regions. In addition, they examine which regions share the same event boundaries, not just the same timescale, and relate these to functional connectivity networks. Overall, their work is impressive and rigorous, but there are a few points that make it somewhat difficult to assess how strong the contribution is to our understanding of processing timescales:

1. The authors divide the brain into functional networks based on boundary similarity and find that this division is very similar to functional networks defined using resting-state timeseries correlations. They further find increased similarity between regions of different networks that are interconnected. Wouldn't the similarity between boundary vectors be strongly linked to the timeseries correlations (both between regions in the same network and across networks)? While the similarity-based functional networks aren't completely identical to those identified in rest, perhaps the same results would be obtained by correlating timeseries in this specific dataset, using the movie data (altering the interpretation of the results).

2. It seems that the power spectrum analysis is run both on the resting-state data and on the movie data, whereas the timescale segmentation is run only on the movie data. I expect this is because hyperalignment is possible only when using a shared stimulus, and the HMM is run only on the hyperaligned data. However, this may bias the correlations presented in figure 6 – the movie PSD-based timescale estimation would be expected to be more similar to the HMM timescales than the rest, simply because the same data is used. A more convincing analysis would be to run the HMM on the rest data as well, and test for correlations between the two estimations of event timescales in the rest data, although this would entail substantial additional analyses (as HMM would also have to be run on non-hyperaligned movie data for comparability). It would also help with point 1, testing whether similarity in boundary vectors arises directly from timeseries correlations. I realize this adds quite a bit of analysis, and the authors may prefer to avoid doing so, but the conclusions arising from the power spectrum analysis should be softened in the Results and Discussion, clearly mentioning this caveat.

3. It would aid clarity to better separate the current contributions from previous findings, in the Results, and mainly in the Discussion. The authors do describe what has previously been found, citing all relevant literature, but it would be helpful to have a clear division of previous findings and novel ones. For example, in the first paragraph of the Discussion, and in general when discussing the interpretation of activity in the different regions (currently, regions that have already been found are somewhat intermixed with the new regions found).

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "A partially nested cortical hierarchy of neural states underlies event segmentation in the human brain" for further consideration by eLife. Your revised article has been evaluated by Michael Frank (Senior Editor) and a Reviewing Editor.

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below:

The reviewers were positive about the revisions you made to this submission and felt that extensive work had been done to improve the paper. There were a few remaining points raised by this review that could be addressed to further strengthen the paper. The Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

1. Reviewer 1 has raised some additional points for clarification in their review, as noted below. These should be clarified in a revision. Please refer to the comments below for these notes.

2. Some of the conclusions do not completely reflect the results. If additional analyses are not added, perhaps these conclusions could be rephrased, such as "some of the neural boundaries are represented throughout the hierarchy … until eventually reflected in conscious experience" (p. 14) and "boundaries that were represented in more brain regions at the same time were also more likely to be associated with the experience of an event boundary" (p. 15).

3. Since the GSBS algorithm was fine-tuned based on the data that was later used for analysis, it would be helpful to include additional information demonstrating that the choices in the optimization procedure are independent of the eventual results. For example, it isn't clear what 'important boundaries being detected late' means, or whether it indicates that event boundaries were being missed by the original algorithm. Given that part of the optimization was based on fixing the number of state boundaries to the number of event boundaries, could these choices have increased the chance of finding overlap between state boundaries and event boundaries?

4. Two small notes: the network defined as posterior DMN includes anterior regions, which is slightly confusing; were the regional differences in HRF assessed on the resting state data or the movie watching data?

Additional Suggestions for Revision (for the authors):

One of the reviewers had some suggestions for additional analyses that might strengthen the results. We pass them along to you here, but you should view these as optional. Only include them if you agree that they will strengthen the conclusions.

There are a few analyses that may help strengthen the conclusions; these are suggested as optional additional analyses, and the authors should feel free not to include them:

• To verify the overlap between searchlights is not due to various artifacts, it may be preferable to compare the searchlight in one region with the searchlights of other groups in the second region (following the rationale of intersubject functional connectivity vs. functional connectivity). It would also be interesting to further explore the nature of the overlap - to see whether there are specific state boundaries that drive most of the overlap or whether different pairs of regions have different overlapping boundaries. This could be used to explore the nature of the hierarchy between regions, beyond just finding that higher regions share boundaries with lower regions. For example, it could enable testing whether state shifts shared by multiple lower level regions are the ones that traverse the hierarchy.

• Further to this, it would be interesting to test whether event boundaries and non-event neural state boundaries form a similar hierarchy (though this may not be feasible with such a low number of event boundaries).

• To assess the effects of noise reduction on the overlap between neural state boundaries and event boundaries, it may be worth testing whether neural state boundaries shared across groups of participants are also more likely to be event boundaries (and specifically whether this effect is stronger in the same regions arising from the co-occurrence analysis). This analysis wouldn't provide an answer, but could help shed some light on the role of noise reduction.

Reviewer #1:

This work investigates timescales of neural pattern states (periods of time with a relatively stable activity pattern in a region) across the brain and identifies links between state shifts and perceived event boundaries. In multiple regions, the authors find significant overlap between state shifts and event boundaries, and an even stronger overlap for state shifts that occur simultaneously in more than one region. The results are interesting and timely and extend previous work by Baldassano et al. that found a similar hierarchy in a specific set of brain regions (here extended to the entire cortex).

Strengths

The question of whether neural state shifts form a hierarchy such that state shifts in higher regions coincide with state shifts in sensory regions, and the question of whether event boundaries occur at conjunctions of shifts in different regions are both very interesting.

The optimized GSBS method nicely overcomes limitations of previous methods, as well as a previous version of GSBS. In general, justification is provided for the different analysis choices in the manuscript.

The current work goes beyond previous work by extending the analysis to the entire cortex, revealing that state shifts in higher regions of the cortex overlap with state shifts in lower regions of the hierarchy.

Weaknesses

One of the important conclusions of the paper is that simultaneous neural state shifts in multiple brain regions are more likely to be experienced as boundaries. This finding fits in nicely with existing literature, but the analysis supporting it is not as compelling as the rest of the analyses in the paper:

1. The methods section describing the analysis is not entirely clear. Do Oi, Oj refer to the number of neural state boundaries in searchlights i, j? Or the number of neural state boundaries in each that overlap with an event boundary? If the former (which was my initial interpretation), then how is the reference searchlight chosen – max {Oi,Oj}, as indicated by the formula, or the searchlight with the larger overlap of its unique boundaries (and is the overlap calculated in numerical value or the proportion of overlap)? Given this lack of clarity, it is difficult to assess whether the degree of overlap between neural state boundaries and event boundaries in each of the searchlights (and/or the number of boundaries in each) could affect the results. It would be helpful to provide verification (either mathematically or with simulations) that higher overlap in one/both searchlights does not lead to a larger difference in overlap between shared and non-shared boundaries.

2. The analysis focuses on pairs of searchlights/regions, demonstrating that in a subset of regions there is a higher chance of an overlap with event boundaries for neural state boundaries that are shared between two regions. Yet the interpretation goes beyond this, suggesting that "boundaries that were represented in more brain regions at the same time were also more likely to be associated with the experience of an event boundary". Additional analyses would be needed to back this claim, demonstrating that overlap between a larger number of regions increases the chance of perceiving a boundary.

3. Could the effect be due to reduction of noise rather than event boundaries arising at neural state boundaries shared between regions? Identifying boundaries shared by two regions has a similar effect to averaging, which the authors have indeed found reduces noise and provides a better estimation of boundaries within each searchlight. This possibility should be discussed.

Recommendations for the authors:

1. As this is a revision of a previous version of the manuscript, and the authors have already conducted a great deal of work to address previous concerns, I am hesitant to suggest additional analyses. However, there are a few analyses that may help strengthen the conclusions – these are suggested as optional additional analyses, but the authors should feel free not to include them:

• To verify the overlap between searchlights is not due to various artifacts, it may be preferable to compare the searchlight in one region with the searchlights of other groups in the second region (following the rationale of intersubject functional connectivity vs. functional connectivity). It would also be interesting to further explore the nature of the overlap – to see whether there are specific state boundaries that drive most of the overlap or whether different pairs of regions have different overlapping boundaries. This could be used to explore the nature of the hierarchy between regions, beyond just finding that higher regions share boundaries with lower regions. For example, it could enable testing whether state shifts shared by multiple lower level regions are the ones that traverse the hierarchy.

• Further to this, it would be interesting to test whether event boundaries and non-event neural state boundaries form a similar hierarchy (though this may not be feasible with such a low number of event boundaries).

• To assess the effects of noise reduction on the overlap between neural state boundaries and event boundaries, it may be worth testing whether neural state boundaries shared across groups of participants are also more likely to be event boundaries (and specifically whether this effect is stronger in the same regions arising from the co-occurrence analysis). This analysis wouldn't provide an answer, but could help shed some light on the role of noise reduction.

2. Some of the conclusions do not completely reflect the results. If additional analyses are not added, perhaps these conclusions could be rephrased, such as "some of the neural boundaries are represented throughout the hierarchy.… until eventually reflected in conscious experience" (p. 14) and "boundaries that were represented in more brain regions at the same time were also more likely to be associated with the experience of an event boundary" (p. 15).

3. Since the GSBS algorithm was fine-tuned based on the data that was later used for analysis, it would be helpful to include additional information demonstrating the choices in the optimization procedure are independent of the eventual results. For example, it isn't clear what 'important boundaries being detected late' means, whether that indicates event boundaries were being missed by the original algorithm. Combined with the fact that part of the optimization was based on fixing the number of state boundaries to the number of event boundaries – could these choices have increased the chance of finding overlap between state boundaries and event boundaries?

4. Two small notes: the network defined as posterior DMN includes anterior regions, which is slightly confusing; were the regional differences in HRF assessed on the resting state data or the movie watching data?

https://doi.org/10.7554/eLife.77430.sa1

Author response

[Editors’ note: the authors resubmitted a revised version of the paper for consideration. What follows is the authors’ response to the first round of review.]

The reviewers were in agreement that this is an important topic and that this research is both interesting and promising. However, the reviewers raised a number of significant concerns that centered around two themes. First, there were a number of points raised about the methodology itself and its validity. After some discussion, it was decided that these methodological points could be addressable through additional analysis and/or simulation, but likely require considerably more work than would be usual for an eLife revision. The second set of concerns were with regard to the clear scientific advance over prior work; there was not consensus that these findings move the field forward in a clear way. Reviewer 1 suggested that the generalizability and impact might be improved by drawing direct links to the existing literature, including analysis of a secondary dataset as in the Baldassano et al. (2017) paper. There might be other ways to clarify the impact as well. Regardless, this is a challenging concern to address in a straightforward way through revision.

As addressing these concerns would likely require more than is typically expected for an eLife revision, it was decided to reject this submission. This being said, if you were to undertake the work required to conclusively address these issues, there was sufficient enthusiasm among reviewers that they would be willing to consider this paper again, as a new submission.

I have appended the detailed reviews to this decision letter. I hope you find them constructive with this work.

Reviewer #1:

In this paper, Geerligs et al. focus on the alignment of event boundaries across brain regions. They examine the transitions between brain states using the method introduced by Baldassano et al. (2017), and how these state transitions are shared across nodes of large-scale brain networks. They introduce a method that enables them to map event-timescales in a broader set of regions than previously possible, and they use this method to reveal how functional networks of regions share time-aligned "event transitions".

This is a well-written manuscript on a timely and important question.

We thank the reviewer for their compliments about our work.

My main concerns relate to the validity (and potential sources of bias) in the methodology for identifying the event-rate of each region, and I also outline a number of other areas where the conceptual and methodological framing could be improved.

p.3 "This dataset, in combination with the application of hyperalignment to optimize functional alignment (Guntupalli et al., 2016), allowed us to study event segmentation across the entire cortex for the first time, because this dataset shows reliable stimulus-driven activity (i.e., significant inter-subject correlations) over nearly all cortical brain regions (Geerligs et al., 2018). "

A central methodological question, which affects almost every claim in this manuscript, is whether the inference of event boundaries from the HMM model (the methods in Figure 1) is valid, and in what ways it might be biased. The validity question is simple: does it measure what it is supposed to measure? In particular, I would like the authors to justify the final step, in which they compute the difference between the correlation for real boundaries and the correlation for random boundaries. Surely, this difference computation will be affected by the noise ceiling of the individual ROI being examined? I understand why using the random condition as a "reference" makes some sense, but I do not understand why the final decision is made based on the simple arithmetic difference of the mean value for the random boundaries and real boundaries? I suggest that the authors justify this procedure using a simulation procedure where the ground truth about event transitions is known, and the procedure should be compared against the method applied in the original Baldassano et al. (2017) paper.

The bias question is also fairly simple: which factors influence the "k" that is inferred? In particular, if a region has high reliability or low reliability of its response across subjects, does this affect the number of events that will be inferred for that region using the HMM procedure? As noted above, this simulation could additionally investigate how the "k" value varies as a function of the noise level (i.e. response reliability) of the ROI.

Additionally, although hyperalignment renders a larger swathe of cortex available to analysis, there will still be variability in the reliability of the signal across regions, and this might interact with the hyperalignment performance. In particular, the accuracy of the hyperalignment procedure (for each subject) will presumably also increase for regions whose reliability of response is higher; it is therefore very important to consider whether noise (in "space") introduced by the hyperalignment procedure (and varying across regions as a function of their reliability) could further bias the measurement of the event-timescale via the HMM procedure.

Finally, to better understand this method, the authors could also apply their approach to the freely available data from the Baldassano et al. (2017) paper. Does this method produce results that are at least qualitatively similar? This could help to resolve the question of why the event timescales in this paper are shorter than those observed in the Baldassano et al. paper.

We thank the reviewer for these valid questions. In trying to answer them, we performed many simulations to determine the validity of our previous approach. These simulations revealed to us that the method we used before was indeed biased by the level of noise in different brain regions. While running these simulations we also discovered some problems with the HMM-approach in dealing with states of unequal length. In the end, all of these issues led us to develop a new method for detecting neural state boundaries and the optimal number of states, which does not suffer from these issues. This new method called greedy state boundary search (GSBS) was used to redo all the analyses in the paper and has now been published in Neuroimage – https://doi.org/10.1016/j.neuroimage.2021.118085.

In the Neuroimage paper, we performed an extensive set of simulations to demonstrate the validity of GSBS. We show that state boundaries can be identified reliably when the data are averaged across groups of ~17/18 or more participants. We also observed that high levels of noise in the data lead our algorithm to over-estimate the number of states in a region. In contrast, regions such as the medial prefrontal cortex, which show relatively low levels of inter-subject synchrony, have a small number of long-lasting neural states. Therefore, we can be confident that the regional differences in the number of states we observe are not due to regional differences in noise.

p.7: Event networks: "We found that event boundaries are shared within long-range networks that resemble the functional networks that are typically identified based on (resting state) timeseries correlations (see figure 3A)".

This is one of the most intriguing aspects of this paper. However, it would be much more convincing if the authors would replace their qualitative language (e.g. "resemble") with quantitative metrics of overlap. The overlap could be measured between (a) networks defined based on event-timing and (b) networks defined based on functional connectivity. All of the major functional networks should be available in atlases (e.g. the Yeo lab atlases) or via data sharing repositories. Thus, the authors should be able to substantiate their broad claims of "resemblance" with quantitative demonstrations of how well the event-networks match the functional-connectivity-networks. All of the visual networks as well as the FPN and DMN should be quantitatively compared against standard networks defined elsewhere in the literature.

On the same point: p.13 "The fractionation of the DMN into a fast and slow subnetwork closely aligns with the previously observed posterior and anterior DMN subnetworks (Andrews-Hanna et al., 2010; Campbell et al., 2013; Lei et al., 2014)."

Again, please quantify the alignment when claiming spatial alignment with prior findings.

To address this concern, we compared the measures of boundary overlap to the functional connectivity observed with correlation of time series across all pairs of searchlights. We show that there is only a medium sized correlation between the two (r=0.39) suggesting that overlap of neural state boundaries is not simply a result of ‘regular’ functional connectivity between brain regions. In fact, we observe that regions that are negatively correlated do tend to have neural state boundaries at the same time.

To determine the correspondence to previously identified networks, we now quantify the spatial alignment between the networks we detected and the networks defined by Power et al. (2011). We also compared the DMN subnetworks to the previously detected DMN subnetwork by Campbell et al. (2013). For the newly identified superior DMN, we were not able to quantify the alignment with the data from Gordon et al. (2020) because those data were in surface space, rather than MNI space.

The new results and comparisons can be found on pages 8-12 of the manuscript and in tables S1 and S2.

p.13 "Our results show for the first time that neural events are shared across brain regions in distinct functional networks. "

The authors should consider re-wording this sentence to distinguish their findings from what was already shown in Figure 4B of Baldassano et al. (2017). In particular, note the commonality of event boundaries across early visual and late visual areas (part of the visual network), as well as the commonality of events across angular gyrus and posterior medial cortex (parts of the DMN).

We have rephrased this to: “In line with previous work (Baldassano et al., 2017) we found that neural state boundaries are shared across brain regions. Our results show for the first time that these boundaries are shared within distinct functional networks.”

On a related note, in the Abstract we read: "This work extends the definition of functional networks to the temporal domain" – I am unclear on how novel this extension is. To the best of my understanding, the concept of dynamic functional connectivity is not new (e.g. Hutchison et al., 2013), and even second-order pattern-transition methods have been employed to study functional networks (e.g. Anzellotti and Coutanche, 2018). I would like the authors to sharpen their argument for why this result is not entirely expected in light of prior work. Shouldn't members of the same functional networks be expected to exhibit state-transitions at rates higher than chance?

We have removed this claim from the abstract. In addition, we now show in the Results section and figure S6 that the correlations between state boundaries cannot be directly explained from the correlations between the average searchlight timeseries (i.e., regular functional connectivity). The following text has been added to the manuscript:

“Although the networks we identified show overlap with functional networks previously identified in resting state, they clearly diverged for some networks (e.g., the visual network). Some divergence is expected because neural state boundaries are driven by shifts in voxel-activity patterns over time, rather than by the changes in mean activity that we typically use to infer functional connectivity. This divergence was supported by the overall limited similarity with the previously identified networks by Power et al. (2011; adjusted mutual information = 0.39), as well as the differences between the correlation matrix that was computed based on the mean activity time courses in each searchlight and the relative boundary overlap between each pair of searchlights (figure S6; r=0.31). Interestingly, regions with strongly negatively correlated mean activity time courses typically showed overlap that was similar to or larger than the overlap expected by chance. Indeed, the relative boundary overlap between each pair of searchlights was more similar to the absolute Pearson correlation coefficient between searchlights (r=0.39) than when the sign of the correlation coefficient was preserved (r=0.31). This suggests that pairs of regions that show negatively correlated BOLD activity still tend to show neural state boundaries at the same time.”
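The comparison in the quoted passage (boundary overlap against signed versus absolute functional connectivity) can be illustrated with a minimal sketch. The five searchlight pairs and their values below are hypothetical toy numbers chosen for illustration, not data from the study, and `pearson` is a hand-written helper rather than the authors' code.

```python
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical values for five searchlight pairs (illustration only).
# fc: correlation of mean activity time courses (signed).
# overlap: relative boundary overlap for the same pairs.
fc = [0.8, 0.5, 0.1, -0.4, -0.7]
overlap = [0.6, 0.4, 0.05, 0.3, 0.5]

r_signed = pearson(fc, overlap)
r_abs = pearson([abs(v) for v in fc], overlap)

print(f"signed FC vs overlap:   r = {r_signed:.2f}")
print(f"absolute FC vs overlap: r = {r_abs:.2f}")
```

In this toy example the strongly anticorrelated pairs still have high overlap, so the absolute correlation tracks overlap more closely than the signed one, which is the qualitative pattern the passage describes.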

p.11. I struggled to follow the logic of the analysis employed in Figure 6. Why is event duration being predicted from individual frequency bands of the PSD? There is voluminous evidence for band-specific and region-specific artifact (e.g. Birn et al., 2013; Shmueli et al., 2007). Furthermore, distinct functional networks have distinct frequency profiles and coherence patterns (e.g. Salvador et al., 2008; Baria et al., 2011; Stephens et al., 2013). Finally, the frequency bands in the PSD are non-independent (because of the temporal smoothing in the BOLD signal). Therefore, the relationship between frequency band and event duration is confounded by (i) non-independence of frequencies and (ii) frequency covariation across brain regions which arises for a multitude of reasons. The results in Figure 6A seem rather noisy to me, and I imagine that this is because the regression procedure on the PSD is influenced by many interacting and confounding variables.

Another reason why this analysis produces (in my opinion) curious results is that it spans distinct sensory modalities which are already known to have opposite PSD-event relationships: along the auditory pathway, PSDs get flatter as event time-scales get longer, while in the visual pathway, PSDs in V1 are already very steep, even while the event timescales are short. It is not clear what is gained by fitting a single model to regions with obviously different relationships of PSD and event structure.

In response to the comments of reviewers 1 and 3, we have removed the resting state analyses from the paper. We realised that using the power spectral density as a proxy for neural state timescales is suboptimal, given the variable duration of neural states within brain regions. In response to reviewer suggestions, we have shifted the focus of the manuscript to what neural states can tell us about the neural mechanisms underlying event segmentation.

p.12. "These results suggest that visual and auditory stimulation are a prerequisite for observing the temporal hierarchy we describe in this paper and that this hierarchy only partly reflects an intrinsic property of brain function that is also present in the resting state."

I do not follow the logic supporting this claim. How can we know whether the (event-based) temporal hierarchy is preserved in the resting state unless we can measure the event transitions in the resting state data? Isn't this analysis just another way of saying that the PSDs have different shapes during rest and during movie viewing?

This claim has been removed from the paper, in relation to the previous point made by the reviewer.

Reviewer #2:

In this paper, Geerligs and colleagues leverage a large, publicly-available dataset in order to assess shared and distinct timescales of neural pattern shifts at event boundaries across different areas of the brain. In line with prior work, the authors report a gradient of timescales in neural event segmentation, with sensory regions comprising the fastest-shifting areas and 'default mode' nodes such as precuneus and medial prefrontal cortex comprising the slowest-shifting areas. Importantly, the authors build on this previous research and demonstrate that canonical functional networks – such as the frontoparietal network, and the 'default mode' network – feature distinct subnetworks with corresponding faster and slower timescales of pattern shifts. Finally, a fairly novel analysis applied to these types of data examined power spectral density across regions, which could be used to predict event duration across regions (consistent with observed pattern shifts), and could partly, but not entirely, characterize resting-state fMRI data (suggesting that the audiovisual stimulus drove additional functional properties in brain networks not observed during rest).

Overall, this is an interesting and timely study. The question of how the brain segments naturalistic events is one of increasing popularity, and this manuscript approaches the question with a large sample size and fairly thorough analyses. That said, there are a number of questions and concerns, primarily regarding the analyses.

We thank the reviewer for their positive feedback.

• Procedures such as hyperalignment, or the related shared response model used by Baldassano and colleagues, are typically implemented by training on one set of the data, and applying the alignment procedure to a separate, held-out dataset (i.e., training and testing sets). It is unclear whether this approach was taken in the current study, or whether the hyperalignment algorithm was trained and tested on the same dataset. In the latter case, there is a degree of circularity in the way across-participant alignment was conducted, potentially leading to biased correlation measures. The movie used in the CamCAN dataset is only 8 minutes long, which is probably not enough data for obtaining separate training and test datasets. However, this is still potentially a serious issue for this manuscript, and I am not sure if the use of hyperalignment is appropriate. If I have misunderstood the methodology, it perhaps warrants some clarification in how the training and application of the hyperalignment algorithm proceeded. (I will note that I am aware you used cross-validation for deriving the number of events, but that is unfortunately a separate issue from a train-test split in the hyperalignment routine itself.)

We agree with the reviewer that hyperalignment parameters are typically estimated in a separate dataset. Hyperalignment can introduce dependencies between datasets from different participants, which can result in biased statistics. To avoid this issue, we ran hyperalignment separately in each of the participant subgroups reported in the manuscript. All statistical testing was performed on independent subgroups of participants (17/18 participants per group), which were hyperaligned separately (i.e. within each group).

• A key finding from the study is that the FPN and DMN fractionate into different subnetworks that have fast and slow timescales. As noted above, the present results are based on an analysis of data from a relatively short period of time. Although the sample size is very large, one wonders whether this distinction would remain solid with a longer movie. With a very short movie, one can only sample a small number of real events, and this could lead to some instability in estimates of the timescale of representations in relation to the events. This might be an issue in relation to the differentiation of fast and slow subnetworks within the FPN and DMN. For instance, Figure 3B suggests that the fit values for the slow FPN remain more or less stable across a range of event durations (which presumably reflect k values?). The slow FPN shows an interesting bimodal distribution (as do many of the networks) with the second peak coinciding with the peak for the fast FPN. The differentiation is a bit more convincing for the fast and slow DMN, but it is still not clear whether there are enough events and enough fMRI data from each subject to ensure reliable estimates of the timescales. Just to provide some context for this point, some estimates suggest that reliable identification of resting state networks requires at least 20 minutes of fMRI data.

To illustrate the reliability of our approach in identifying regional timescale differences, we now estimate the timescales separately in two independent samples of participants (see figure 1). The correlation between the results of these two groups is very high at the voxel level (r=0.85), suggesting that even in this short dataset we can reliably estimate regional differences in timescale.

Second, figure 4 shows that the timescale differences across the different DMN subnetworks are highly reliable across the searchlights in these networks. It should be noted that due to substantial improvements to our data analysis pipeline, we no longer observe two distinct FPN networks.

• Throughout the paper, fMRI results are described in reference to event processing, but the relationship is underdeveloped. Much of the paper relies on the Hidden Markov Model, which assumes that there is a pattern that remains stationary throughout an event. Baldassano's data shows a surprisingly strong correspondence in posterior medial cortex, but it is less clear whether this assumption is valid for other areas. In relation to this point, one can think of event processing as an accumulation of evidence. At the onset of an event, one might have a decent idea of what is about to happen, but as information comes in, the event model can be refined to make stronger predictions. These kinds of within-event dynamics would be lost in the Hidden Markov model. A related point is that the paper conflates timescales of neural states with psychologically meaningful conceptions of events. EST suggests that event segmentation is driven by prediction error; by one interpretation of the model, sensory information can change considerably without leading one to infer an event boundary. However, change in incoming sensory information would almost certainly lead to the detection of "event boundaries" across short timescales in sensory cortical areas. Figure 5 makes it fairly clear that there is a pretty strong distinction to be made between data-driven event identification based on the fMRI data and psychologically meaningful events inferred by the subjects. It would be helpful for the authors to be more clear about what the data do and do not show in relation to putative event cognition processes.

We strongly agree with the reviewer that it is important to distinguish between the transitions in brain activity patterns observed in the current paper and perceived event boundaries that have been described extensively in the behavioural literature. We have therefore changed the terminology in the paper and refer to ‘neural states’, rather than events when referring to the brain data. We now also discuss much more extensively what our findings can tell us about the mechanisms underlying event segmentation in both the introduction and Discussion sections.

• Why were voxels with an intersubject correlation of less than r=0.35 excluded from analyses? Is this based on prior studies or preliminary analyses? It is not necessarily a bad thing if this choice was made arbitrarily, but I imagine this threshold could have important impacts on the data as presented, so it is worth clarifying.

We investigated the effect of thresholding based on inter-subject correlation in a recent Neuroimage paper (Geerligs et al., 2021) and we observed that it did not result in more reliable estimates of neural state boundaries. Hence, we have removed this threshold from the analysis pipeline.

• Was ME-ICA the only step taken to account for head motion artifacts? If so, there is some concern about whether this step was sufficient to deal with the potential confound. This is especially critical given the fairly brief time series being analyzed here. It would be more compelling to see a quantitative demonstration that head motion is not correlated with the measures of interest.

ME-ICA denoising is currently the most effective method to deal with head motion (Power et al., 2018, PNAS). In addition, all our analyses are performed on group-averaged data. This means that head motion that is not synchronized across participants cannot affect the results. To further investigate this potential confound, we computed the correlation between scan-to-scan head motion and state boundaries for each searchlight within each of the 15 groups of participants. Next, we used a Wilcoxon signed-rank test to investigate whether these correlations were significantly different from zero across the 15 groups. We found that none of the searchlights showed a significant association between the average head motion in each group of participants and the occurrence of neural state boundaries. These results are reported in the supplementary Results section. Together, these results suggest that head motion did not confound the results reported in the current manuscript.
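The group-level test described above can be sketched for a single searchlight as follows. This is a minimal standard-library illustration, not the authors' pipeline: the 15 per-group correlations are hypothetical values, and `wilcoxon_signed_rank_exact` is an illustrative exact two-sided implementation that assumes no zero values and no tied absolute values.

```python
def wilcoxon_signed_rank_exact(values):
    """Exact two-sided Wilcoxon signed-rank test of the values against zero.

    Assumes no zeros and no ties among absolute values (illustrative only).
    Returns (W+, p), where W+ is the sum of ranks of the positive values.
    """
    n = len(values)
    # Rank observations by absolute value (rank 1 = smallest).
    order = sorted(range(n), key=lambda i: abs(values[i]))
    w_plus = sum(rank for rank, i in enumerate(order, start=1) if values[i] > 0)
    total = n * (n + 1) // 2
    # counts[s] = number of sign assignments whose positive ranks sum to s.
    counts = [0] * (total + 1)
    counts[0] = 1
    for r in range(1, n + 1):
        for s in range(total, r - 1, -1):
            counts[s] += counts[s - r]
    # The null distribution is symmetric, so double the smaller tail.
    w_min = min(w_plus, total - w_plus)
    p = min(1.0, 2 * sum(counts[: w_min + 1]) / 2 ** n)
    return w_plus, p

# Hypothetical correlations between head motion and state boundaries,
# one per participant group, for one searchlight (illustration only).
group_r = [0.02, -0.01, 0.03, -0.04, 0.05, -0.06, 0.07, -0.08,
           0.09, -0.10, 0.11, -0.12, 0.13, -0.14, 0.015]

w_plus, p_value = wilcoxon_signed_rank_exact(group_r)
print(f"W+ = {w_plus}, two-sided p = {p_value:.3f}")
```

In a whole-brain analysis this test would be repeated for every searchlight, so some correction for multiple comparisons would be needed; the source does not specify which one was used.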

• A related issue is that of eye movements. Eye movements are related to event processing (e.g., Eisenberg et al., 2018), so one can expect neural activity related to event prediction/prediction error to be confounded with lower-level effects related to eye movements. For instance, we might expect signal artifacts in the EPI data, as well as neural activity related to the generation of eye movements, and changes in visual cortex activity resulting from eye movements. It is unlikely that this issue can be conclusively addressed with the current dataset, and it's not a deal-breaker in the sense that eye movements are intrinsically related to naturalistic event processing. However, it would be useful for the authors to discuss whether this issue is a potential limitation.

We agree with the reviewer that eye movements may affect neural data in the frontal eye fields as well as sensory cortices. However, they are indeed intrinsically related to naturalistic stimulus processing. Fixating the eyes would result in a very unnatural mode of information processing which might bias neural activity in very different ways. In response to this comment from the reviewer, we added the following section to the discussion:

“It should be noted that this more naturalistic way of investigating brain activity comes at a cost of reduced experimental control (Willems et al., 2020). For example, some of the differences in brain activity that we observe over time may be associated with eye movements. Preparation of eye movements may cause activity changes in the frontal eye fields (Vernet et al., 2014), while execution of eye movements may alter the input in early sensory regions (Lu et al., 2016; Son et al., 2020). However, in a related study (Davis et al., 2021), we found no age difference in eye movement synchrony while viewing the same movie, despite our previous observation of reduced synchrony with age in several areas (particularly the hippocampus, medial PFC, and FPCN; Geerligs et al., 2018), suggesting a disconnect between eye movements and neural activity in higher-order areas. In addition, reducing this potential confound by asking participants to fixate leads to an unnatural mode of information processing which could arguably bias the results in different ways by requiring participants to perform a double task (monitoring eye movements in addition to watching the movie).”

• The power spectral analyses were a bit difficult to follow, but more importantly, the motivation for the analysis was not clearly described. The main take-home points from this analysis are nicely summarized at the end of p. 14, but it would be helpful to clarify the motivation for this analysis (and the need for doing it) on p. 11 in the Results section. Relatedly, is Figure 6A an example spectrum from a particular voxel or region, or an average across regions?

These analyses have now been removed from the paper. Based on the comments from reviewers 1 and 3, we realised that using the PSD as a proxy for neural state timescales is suboptimal, given the variable duration of neural states within brain regions.

• The take-home message appears to be that different brain networks have different timescales at which they seem to maintain event representations. Moreover, certain networks (e.g., the posterior medial/'default mode' network) do not have uniformly fast or slow timescales. The network-based analysis used here is indeed novel, but the impact of the work could be enhanced by clarifying the significance of the results in relation to what we know about event processing. The explicit demarcation of 'fast' and 'slow' subnetworks may be the key conceptual advance, as was the power spectral analysis, but it isn't clear whether these conclusions could also be ascertained from the maps shown in Baldassano et al., 2017 or other papers from the Hasson group.

We have completely rewritten the introduction and Discussion sections to clarify the significance of our work in relation to what we know about event processing.

This review was completed by Zach Reagh, Ph.D. in collaboration with Charan Ranganath, Ph.D. (I sign all reviews)

Reviewer #3:

Geerligs and colleagues conduct a thorough set of analyses aimed at identifying event segmentation timescales across the cortex in a large cohort of participants. They extend previous work by Baldassano et al. by covering the entire cortex, and nicely control for the power spectrum of different regions. In addition, they examine which regions share the same event boundaries, not just the same timescale, and relate these to functional connectivity networks. Overall, their work is impressive and rigorous, but there are a few points that make it somewhat difficult to assess how strong the contribution is to our understanding of processing timescales:

Thank you for your enthusiasm. We address your specific concerns below.

1. The authors divide the brain into functional networks based on boundary similarity and find that this division is very similar to functional networks defined using resting-state timeseries correlations. They further find increased similarity between regions of different networks that are interconnected. Wouldn't the similarity between boundary vectors be strongly linked to the timeseries correlations (both between regions in the same network and across networks)? While the similarity-based functional networks aren't completely identical to those identified in rest, perhaps the same results would be obtained by correlating timeseries in this specific dataset, using the movie data (altering the interpretation of the results).

We now show in the Results section and figure S6 that the correlations between state boundaries cannot be directly explained from the correlations between the average searchlight timeseries (i.e., regular functional connectivity). Although there are similarities between the two, as we would expect, the correlation between them is only r=0.31, suggesting that the same results could not have been obtained using ‘regular’ functional connectivity analyses in the movie dataset. The following text has been added to the manuscript:

“Although the networks we identified show overlap with functional networks previously identified in resting state, they clearly diverged for some networks (e.g., the visual network). Some divergence is expected because neural state boundaries are driven by shifts in voxel-activity patterns over time, rather than by the changes in mean activity that we typically use to infer functional connectivity. This divergence was supported by the overall limited similarity with the previously identified networks by Power et al. (2011; adjusted mutual information = 0.39), as well as the differences between the correlation matrix that was computed based on the mean activity time courses in each searchlight and the relative boundary overlap between each pair of searchlights (figure S6; r=0.31).

Interestingly, regions with strongly negatively correlated mean activity time courses typically showed overlap that was similar to or larger than the overlap expected by chance. Indeed, the relative boundary overlap between each pair of searchlights was more similar to the absolute Pearson correlation coefficient between searchlights (r=0.39) than when the sign of the correlation coefficient was preserved (r=0.31). This suggests that pairs of regions that show negatively correlated BOLD activity still tend to show neural state boundaries at the same time.”

2. It seems that the power spectrum analysis is run both on the resting-state data and on the movie data, whereas the timescale segmentation is run only on the movie data. I expect this is because hyperalignment is possible only when using a shared stimulus, and the HMM is run only on the hyperaligned data. However, this may bias the correlations presented in figure 6 – the movie PSD-based timescale estimation would be expected to be more similar to the HMM timescales than the rest, simply because the same data is used. A more convincing analysis would be to run the HMM on the rest data as well, and test for correlations between the two estimations of event timescales in the rest data, although this would entail substantial additional analyses (as HMM would also have to be run on non-hyperaligned movie data for comparability). It would also help with point 1, testing whether similarity in boundary vectors arises directly from timeseries correlations. I realize this adds quite a bit of analysis, and the authors may prefer to avoid doing so, but the conclusions arising from the power spectrum analysis should be softened in the Results and Discussion, clearly mentioning this caveat.

Unfortunately, it is not possible to detect neural states in the resting-state data, since detecting neural states requires averaging data across participants. Because resting-state fluctuations in brain activity are not stimulus driven, data cannot be meaningfully averaged across participants in resting state. Therefore, based on the comments from reviewers 1 and 3 and the altered focus of the paper on neural mechanisms underlying event segmentation, we have removed the PSD-based timescale estimation analyses from the paper.

3. It would aid clarity to better separate the current contributions from previous findings, in the Results, and mainly in the Discussion. The authors do describe what has previously been found, citing all relevant literature, but it would be helpful to have a clear division of previous findings and novel ones. For example in the first paragraph of the Discussion, and in general when discussing the interpretation of activity the different regions (currently regions that have already been found are somewhat intermixed with the new regions found).

We have rewritten the introduction and Discussion sections extensively to make more clear what is novel in the current study.

[Editors’ note: what follows is the authors’ response to the second round of review.]

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below:

The reviewers were positive about the revisions you made to this submission and felt that extensive work had been done to improve the paper. There were a few remaining points raised by this review that could be addressed to further strengthen the paper. The Reviewing Editor has drafted this to help you prepare a revised submission.

We are very happy to hear that the reviewers were positive about the extensive work we did for the revision. We have taken care to address all remaining points. A summary of the most important changes is provided below.

First, we have described our analyses more clearly, regarding how the overlap between neural state and event boundaries is computed and how we compare this overlap for shared vs. non-shared states. This also helped us streamline our analyses more and we now use the same metric (absolute overlap) throughout the manuscript, also for investigating the effect of boundary strength. This should make the paper easier to follow for readers. Second, we have added additional analyses to more clearly demonstrate the effects of boundary sharing across (large) parts of the cortical hierarchy in relation to perceiving event boundaries. These analyses provide stronger support for the claims we made in the previous version of the manuscript. Finally, we have added some analyses to make sure that the effects we see cannot be explained by shared confounds or noise.

Essential revisions:

1. Reviewer 1 has raised some additional points for clarification in their review, as noted below. These should be clarified in a revision. Please refer to the comments below for these notes.

We have clarified all the points that the reviewer raised. More details about these clarifications are provided in the answers to specific reviewer comments below.

2. Some of the conclusions do not completely reflect the results. If additional analyses are not added, perhaps these conclusions could be rephrased, such as "some of the neural boundaries are represented throughout the hierarchy.… until eventually reflected in conscious experience" (p. 14) and "boundaries that were represented in more brain regions at the same time were also more likely to be associated with the experience of an event boundary" (p. 15).

We rephrased the first sentence (originally p. 14) to: “This finding suggests that some of the neural state boundaries that can be identified in early sensory regions are also consciously experienced as an event boundary, potentially because these boundaries are propagated to regions further up in the cortical hierarchy.”

The second sentence (p. 15) is now supported more strongly by the results that we have obtained through new analyses. More details about these new results are provided in the answers to specific reviewer comments below.

3. Since the GSBS algorithm was fine-tuned based on the data that was later used for analysis, it would be helpful to include additional information demonstrating that the choices in the optimization procedure are independent of the eventual results. For example, it isn't clear what 'important boundaries being detected late' means, or whether that indicates event boundaries were being missed by the original algorithm. Combined with the fact that part of the optimization was based on fixing the number of state boundaries to the number of event boundaries – could these choices have increased the chance of finding overlap between state boundaries and event boundaries?

The primary concern with the performance of the original algorithm in some brain regions was that the placement of one new boundary resulted in a huge increase in the t-distance. This suggests there was a strong neural state boundary (i.e. a large transition in brain activity patterns) that was detected in a late iteration of the algorithm. Therefore, the peak of the t-distance was at a high value of k (number of states) and the inferred optimal number of states was higher than it should have been. By placing two boundaries at the same time (i.e. inferring the location of a state, rather than the location of one boundary), the algorithm behaves much more stably and no longer shows this kind of behavior.

We have not investigated whether those boundaries that are detected late tend to overlap with event boundaries. This is because it is not the problem that the original algorithm missed boundaries, but rather that too many boundaries were added, probably including many boundaries that did not overlap with events. We have now clarified this point in the methods section:

“First, GSBS previously placed one boundary in each iteration. We found that for some brain regions, this version of the algorithm showed sub-optimal performance. A boundary corresponding to a strong state transition was placed in a relatively late iteration of the GSBS algorithm. This led to a steep increase in the t-distance in this particular iteration, resulting in a solution with more neural state boundaries than might be necessary or optimal (for more details, see the supplementary methods and figure 1A in appendix 1).”

And in the supplementary methods in appendix 1:

“First, we discovered that for specific brain regions the original GSBS algorithm performed suboptimally; the placement of one new boundary at a late stage in the fitting process resulted in a large increase in the t-distances (our measure of fit; see appendix 1 – figure 1A). This suggests that a strong neural state boundary (i.e. demarcating a large change in neural activity patterns) was detected only in a late iteration of the algorithm, which led to an overestimation of the number of neural states.”

For completeness we also reran the analyses comparing states and events without fixing the number of states to the number of events. The results of these analyses support our original conclusions and are shown in appendix 1 – figure 2. The adapted text in the supplementary methods section of appendix 1 is copied below:

“To investigate how these changes to GSBS impacted reliability, we split the data in two independent groups of participants and looked at the percentage of overlapping boundaries between the groups for each searchlight. To make sure differences in number of states between methods did not impact our results, we fixed the number of state boundaries to 18 or 19. Because the states-GSBS algorithm can place one or two boundaries at a time, we cannot fix the number of state boundaries exactly, which is why it can be either 18 or 19. We found that the number of overlapping boundaries between groups was substantially higher for states-GSBS, compared to the original GSBS implementation and also compared to the GSBS implementation with altered finetuning (see appendix 1 – figure 2A). This was also the case when we used the optimal number of states as determined by the t-distance, instead of fixing the number of states (see appendix 1 – figure 2B).”

4. Two small notes: the network defined as posterior DMN includes anterior regions, which is slightly confusing; were the regional differences in HRF assessed on the resting state data or the movie watching data?

We have added the following sentence to clarify the issue about network naming: “It should be noted that all three of the DMN subnetworks include some anterior, superior and posterior subregions; the names of these subnetworks indicate which aspects of the networks are most strongly represented.”

Regional differences in HRF were assessed with movie data. This has now been clarified in the methods section:

“We also investigated the effects of estimating the HRF shape based on the movie fMRI data, instead of using the canonical HRF and found that this did not have a marked impact on the results (see supplementary methods in appendix 1).”

And also in the supplementary methods in appendix 1: “To investigate whether such differences might impact our results, we estimated the HRF for each participant and each searchlight, using the rsHRF toolbox that is designed to estimate HRFs in resting state data (Wu et al., 2021). In this case we applied the algorithm to our fMRI data recorded during movie watching.”

Additional Suggestions for Revision (for the authors):

One of the reviewers had some suggestions for additional analyses that might strengthen the results. We pass them along to you here, but you should view these as optional. Only include them if you agree that they will strengthen the conclusions.

There are a few analyses that may help strengthen the conclusions - these are suggested as optional additional analyses, but the authors should feel free not to include them:

• To verify the overlap between searchlights is not due to various artifacts, it may be preferable to compare the searchlight in one region with the searchlights of other groups in the second region (following the rationale of intersubject functional connectivity vs. functional connectivity).

To address this point, we have repeated the overlap analyses in data with two independent groups of participants, like the reviewer suggested. The results of these analyses are described in the results section and in appendix 1 - figure 6.

“To make sure the observed relative boundary overlap between searchlights was not caused by noise shared across brain regions, we also computed the relative boundary overlap across two independent groups of participants (similar to the rationale of inter-subject functional connectivity analyses; Simony et al., 2016). We observed that the relative boundary overlap computed in this way was similar to the relative overlap computed within a participant group (r=0.69; see appendix 1 - figure 6), suggesting that shared noise is not the cause of the observed regional overlap in neural state boundaries.”

And in the supplementary results section of appendix 1:

• It would also be interesting to further explore the nature of the overlap - to see whether there are specific state boundaries that drive most of the overlap or whether different pairs of regions have different overlapping boundaries. This could be used to explore the nature of the hierarchy between regions, beyond just finding that higher regions share boundaries with lower regions. For example, it could enable testing whether state shifts shared by multiple lower level regions are the ones that traverse the hierarchy.

To further explore the nature of the overlap we have now added an additional analysis in which we clustered time points with similar patterns of neural states. Below we copied the description from the results section:

“So far, we have focused on comparing state boundary timeseries across regions or between brain regions and events. However, that approach does not allow us to fully understand the different ways in which boundaries can be shared across parts of the cortical hierarchy at specific points in time. To investigate this, we can group timepoints together based on the similarity of their boundary profiles; i.e. which searchlights do or do not have a neural state boundary at the same timepoint. We used a weighted stochastic block model (WSBM) to identify groups of timepoints, which we will refer to as ‘communities’. We found an optimal number of four communities (see figure 7). These communities group together timepoints that vary in the degree to which their neural state boundaries are shared across the cortical hierarchy: timepoints in the first community show the most widely spread neural state boundaries across the hierarchy, while timepoints in the later communities show less widespread state transitions. We found that from community 1 to 4, the prevalence of state boundaries decreased for all networks, but most strongly for the FPCN and CON, sDMN, aDMN and auditory networks. However, the same effect was also seen in the higher visual and SMN and motor networks. This might suggest that boundaries that are observed widely across lower-level networks are more likely to traverse the cortical hierarchy. We also found a similar drop in prevalence of event boundaries across communities, supporting our previous observation that the perception of event boundaries is associated with the sharing of neural state boundaries across large parts of the cortical hierarchy. We repeated this analysis in two independent groups of participants to be able to assess the stability of this pattern of results. Although group 1 showed an optimum of four communities and group 2 an optimum of five communities, the pattern of results was highly similar across both groups (see appendix 1 - figure 8).”

• Further to this, it would be interesting to test whether event boundaries and non-event neural state boundaries form a similar hierarchy (though this may not be feasible with such a low number of event boundaries).

Unfortunately, this was indeed not feasible given the low number of event boundaries in our data. A longer movie would be required to answer this question.

• To assess the effects of noise reduction on the overlap between neural state boundaries and event boundaries, it may be worth testing whether neural state boundaries shared across groups of participants are also more likely to be event boundaries (and specifically whether this effect is stronger in the same regions arising from the co-occurrence analysis). This analysis wouldn't provide an answer, but could help shed some light on the role of noise reduction.

We have performed additional analyses to test whether these effects could indeed be due to noise reduction. These analyses have shown that noise reduction is an unlikely cause of our results. The relevant parts of the results section and supplementary results in appendix 1 are copied below:

Results section:

“Analyses shown in the supplementary results section in appendix 1 demonstrate that these increases in overlap for shared vs. non-shared boundaries cannot be attributed to effects of noise (see also appendix 1 - figure 7).”

Supplementary results in appendix 1:

“Effects of noise on overlap between neural states and events for shared boundaries

One concern is that identifying boundaries shared by two regions has a similar effect to averaging, which provides a better estimation of boundaries within each searchlight because it reduces noise. This noise reduction could be the cause of the increased overlap between events and neural states for shared boundaries vs. non-shared boundaries. To investigate this possibility we examined the increase in overlap for shared vs. non-shared values in the data averaged across 265 participants as well as for each independent subgroup of 17/18 participants. If noise reduction is the cause of the increase in overlap with event boundaries, we should expect the difference between shared and nonshared boundaries to be largest in the smaller independent subgroups where there is the most to be gained from noise reduction. In contrast, if the increase in overlap with event boundaries is a real effect, not due to noise, its effect size should be larger in the data averaged across all participants, where estimates of boundary locations are more accurate. The results in appendix 1 - figure 7 show that the latter interpretation is correct, making it unlikely that the observed increase in overlap between neural state and event boundaries is related to noise.”

Reviewer #1:

This work investigates timescales of neural pattern states (periods of time with a relatively stable activity pattern in a region) across the brain and identifies links between state shifts and perceived event boundaries. In multiple regions, the authors find significant overlap between state shifts and event boundaries, and an even stronger overlap for state shifts that occur simultaneously in more than one region. The results are interesting and timely and extend previous work by Baldassano et al. that found a similar hierarchy in a specific set of brain regions (here extended to the entire cortex).

Strengths

The question of whether neural state shifts form a hierarchy such that state shifts in higher regions coincide with state shifts in sensory regions, and the question of whether event boundaries occur at conjunctions of shifts in different regions are both very interesting.

The optimized GSBS method nicely overcomes limitations of previous methods, as well as a previous version of GSBS. In general, justification is provided for the different analysis choices in the manuscript.

The current work goes beyond previous work by extending the analysis to the entire cortex, revealing that state shifts in higher regions of the cortex overlap with state shifts in lower regions of the hierarchy.

Weaknesses

One of the important conclusions of the paper is that simultaneous neural state shifts in multiple brain regions are more likely to be experienced as boundaries. This finding fits in nicely with existing literature, but the analysis supporting it is not as compelling as the rest of the analyses in the paper:

1. The methods section describing the analysis is not entirely clear. Do Oi, Oj refer to the number of neural state boundaries in searchlights i, j? Or the number of neural state boundaries in each that overlap with an event boundary? If the former (which was my initial interpretation), then how is the reference searchlight chosen – max {Oi,Oj}, as indicated by the formula, or the searchlight with the larger overlap of its unique boundaries (and is the overlap calculated in numerical value or the proportion of overlap)? Given this lack of clarity, it is difficult to assess whether the degree of overlap between neural state boundaries and event boundaries in each of the searchlights (and/or the number of boundaries in each) could affect the results. It would be helpful to provide verification (either mathematically or with simulations) that higher overlap in one/both searchlights does not lead to a larger difference in overlap between shared and non-shared boundaries.

We agree that our initial description of this analysis was not sufficiently clear. We have now clarified the description of the approach we used. In the previous version of the paper, we used the relative boundary overlap to quantify the overlap for both the shared and non-shared boundaries, but after some simulations we did based on your suggestions, we realized that this metric was biased against pairs of regions with a high number of states (higher than the number of events). That is why we now use the absolute overlap in our analysis, which only depends on the proportion of neural state boundaries that overlap with events. This also led to much stronger evidence for the increased overlap between shared vs. non-shared neural state boundaries with event boundaries.

We have extensively revised our mathematical descriptions of both the overlap metrics as well as our explanations of how we computed the overlap difference between shared/non-shared pairs. These new descriptions clarify that higher overlap in one/both searchlights does not lead to a larger difference in overlap between shared and non-shared boundaries. The overlap metric only depends on the proportion of neural states that overlap with an event boundary. If that proportion is the same for shared/non-shared boundaries then the absolute overlap will also not show any difference.

The relevant sections of text are copied below:

“Comparison of neural state boundaries to event boundaries

To compare the neural state boundaries across regions to the event boundaries, we computed two overlap metrics; the absolute and relative boundary overlap. Both overlap measures were scaled with respect to the expected number of overlapping boundaries. To compute these values, we define E as the event boundary timeseries and Si as the neural state boundary timeseries for searchlight i. These timeseries contain zeros at each timepoint t when there is no change in state/event and ones at each timepoint when there is a transition to a different state/event.

The overlap between event boundaries and state boundaries in searchlight i is defined as:

$$O_i = \sum_{t=1}^{n} E_t \, S_{i,t}$$

where n is the number of TRs.

If we assume that there is no association between the occurrence of event boundaries and state boundaries, the expected number of overlapping boundaries is defined as in Zacks et al. (2001a) as:

$$OE_i = \frac{1}{n} \left( \sum_{t=1}^{n} E_t \right) \left( \sum_{t=1}^{n} S_{i,t} \right)$$

Because the number of overlapping boundaries will increase as the number of state boundaries increases, the absolute overlap (OA) was scaled such that it was zero when it was equal to the expected overlap and one when all neural state boundaries overlapped with an event boundary. The absolute overlap therefore quantifies the proportion of the neural state boundaries that overlap with an event boundary:

$$OA_i = \frac{O_i - OE_i}{\sum_{t=1}^{n} S_{i,t} - OE_i}.$$

Instead, the relative overlap (OR) was scaled such that it was one when all event boundaries overlapped with a neural state boundary (or when all neural state boundaries overlapped with an event boundary if there were fewer state boundaries than event boundaries). In this way, this metric quantifies the overlap without penalizing regions that have more or fewer state boundaries than event boundaries. The relative overlap is defined as:

$$OR_i = \frac{O_i - OE_i}{\min\left\{ \sum_{t=1}^{n} E_t, \; \sum_{t=1}^{n} S_{i,t} \right\} - OE_i}$$
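As a rough illustration of how these four quantities relate, they could be computed as sketched below. The function name `overlap_metrics` and the toy boundary timeseries are hypothetical and are not taken from the analysis code; the sketch also assumes that the denominators are non-zero (i.e. there is at least one state boundary and not every timepoint is an event boundary).

```python
def overlap_metrics(events, states):
    """Compute O, OE, OA and OR for one searchlight.

    `events` and `states` are binary boundary timeseries of equal length
    (1 = transition at that timepoint, 0 = no transition).
    """
    n = len(events)
    n_events = sum(events)
    n_states = sum(states)
    observed = sum(e * s for e, s in zip(events, states))            # O_i
    expected = n_events * n_states / n                               # OE_i
    absolute = (observed - expected) / (n_states - expected)         # OA_i
    relative = (observed - expected) / (min(n_events, n_states) - expected)  # OR_i
    return observed, expected, absolute, relative


# Toy example: 20 timepoints, 3 event boundaries, 6 state boundaries.
n_trs = 20
events = [1 if t in (4, 9, 14) else 0 for t in range(n_trs)]
states = [1 if t in (2, 4, 9, 14, 17, 19) else 0 for t in range(n_trs)]

observed, expected, absolute, relative = overlap_metrics(events, states)
print(observed, expected, round(absolute, 3), round(relative, 3))
```

In this toy case every event boundary coincides with a state boundary, so the relative overlap reaches its maximum of one, while the absolute overlap stays well below one because only half of the state boundaries coincide with an event boundary.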

And later in the methods section:

“To look in more detail at how boundaries that are shared vs. boundaries that are not shared are associated with the occurrence of an event boundary, we performed an additional analysis at the level of pairs of searchlights. For each pair of searchlights i and j, we created three sets of neural state boundary timeseries: boundaries unique to searchlights i or j, Si,j and Sj,i, and boundaries shared between searchlights i and j, Si&j. More formally, using the binary definition of the neural state boundary timeseries Si and Sj, these are defined at each timepoint t as:

$$S_{i\&j,t} = S_{i,t} \, S_{j,t}, \qquad S_{i,j,t} = S_{i,t} - S_{i\&j,t}, \qquad S_{j,i,t} = S_{j,t} - S_{i\&j,t}.$$

Then, we investigated the absolute overlap between each of these three boundary series and the event boundaries as described in the section ‘Comparison of neural state boundaries to event boundaries’. This resulted in three estimates of absolute boundary overlap; for boundaries unique to searchlight i (OAi,j) and searchlight j (OAj,i) and the shared boundaries (OAi&j). Then we tested whether the absolute overlap for the shared boundaries was larger than the absolute overlap for non-shared boundaries, using the searchlight that showed the largest overlap in their unique boundaries as the baseline: OAi&j>max{OAi,j,OAj,i}. Because the absolute boundary overlap is scaled by the total number of neural state boundaries, it is not biased when there is a larger or smaller number of shared/non-shared states between searchlights i and j. It is only affected by the proportion of neural state boundaries that overlap with an event boundary. If that proportion is the same for shared and non-shared boundaries, the overlap is also the same.”
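The pairwise comparison described in this passage can be sketched as follows. This is a minimal illustration with made-up boundary timeseries; `split_boundaries` and `absolute_overlap` are hypothetical helper names, and the sketch assumes each of the three boundary sets contains at least one boundary so that the absolute overlap is defined.

```python
def absolute_overlap(events, states):
    """Absolute overlap (OA) between event and state boundary timeseries."""
    n = len(events)
    n_states = sum(states)
    observed = sum(e * s for e, s in zip(events, states))
    expected = sum(events) * n_states / n
    return (observed - expected) / (n_states - expected)


def split_boundaries(s_i, s_j):
    """Split two binary boundary timeseries into shared and unique parts."""
    shared = [a * b for a, b in zip(s_i, s_j)]          # boundaries in both i and j
    unique_i = [a - c for a, c in zip(s_i, shared)]     # boundaries only in i
    unique_j = [b - c for b, c in zip(s_j, shared)]     # boundaries only in j
    return shared, unique_i, unique_j


# Toy example: 20 timepoints, event boundaries at t = 4, 9, 14.
n_trs = 20
events = [1 if t in (4, 9, 14) else 0 for t in range(n_trs)]
s_i = [1 if t in (4, 9, 12, 17) else 0 for t in range(n_trs)]
s_j = [1 if t in (4, 6, 9, 14, 18) else 0 for t in range(n_trs)]

shared, unique_i, unique_j = split_boundaries(s_i, s_j)
oa_shared = absolute_overlap(events, shared)
oa_unique = max(absolute_overlap(events, unique_i),
                absolute_overlap(events, unique_j))

# A positive difference means shared boundaries coincide with event
# boundaries more often than the better of the two unique sets.
difference = oa_shared - oa_unique
print(round(oa_shared, 3), round(oa_unique, 3), round(difference, 3))
```

Taking the maximum over the two unique sets makes the comparison conservative: shared boundaries only count as more informative when they outperform the better of the two searchlights' unique boundaries.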

In the Results section:

“To investigate the role of boundary co-occurrence across networks in more detail, we investigated for each pair of searchlights whether boundaries that are shared have a stronger association with perceived event boundaries as compared to boundaries that are unique to one of the two searchlights. We found that boundary sharing had a positive impact on overlap with perceived boundaries, particularly for pairs of searchlights within the auditory network and between the auditory network and the anterior DMN (see figure 6A and B). In addition, we saw that neural state boundaries that were shared between the auditory network and the early and late visual networks, and the superior and posterior DMN were more likely to be associated with a perceived event boundary than non-shared boundaries. The same was true for boundaries shared between the anterior DMN and the lateral and medial SMN networks and the posterior DMN. Boundary sharing between the other higher-level networks (pDMN, sDMN, FPCN and CON) as well as between these higher-level networks and the SMN networks was also beneficial for overlap with event boundaries. On a regional level, the strongest effects of boundary sharing were observed in the medial prefrontal cortex, medial occipital cortex, precuneus, middle and superior temporal gyrus and insula (see figure 6B). Analyses shown in the supplementary Results section in appendix 1 demonstrate that these increases in overlap for shared vs. non-shared boundaries cannot be attributed to effects of noise (see also appendix 1 – figure 7).”

2. The analysis focuses on pairs of searchlights/regions, demonstrating that in a subset of regions there is a higher chance of an overlap with event boundaries for neural state boundaries that are shared between two regions. Yet the interpretation goes beyond this, suggesting that "boundaries that were represented in more brain regions at the same time were also more likely to be associated with the experience of an event boundary". Additional analyses would be needed to back this claim, demonstrating that overlap between a larger number of regions increases the chance of perceiving a boundary.

We have performed two additional analyses that provide strong support for this conclusion. First, we modified the analyses to investigate co-occurrence within networks and across the whole brain in figure 5A. Second, we added an exploratory analysis in which we identify communities of time points; this analysis also supports the claim (see figure 7). The relevant sections from the Results section are copied below:

“Shared neural state boundaries and event boundaries

Previous research on event segmentation has shown that the perception of an event boundary is more likely when multiple features of a stimulus change at the same time (Clewett et al., 2019). When multiple sensory features change at the same time, this could be reflected in many regions within the same functional network showing a state boundary at the same time (e.g. in the visual network when many aspects of the visual environment change), or in neural state boundaries that are shared across functional networks (e.g. across the auditory and visual networks when a visual and auditory change coincide). Similarly, boundaries shared between many brain regions within or across higher-level cortical networks might reflect a more pronounced change in conceptual features of the narrative (e.g. the goals or emotional state of the character). Therefore, we expect that in a nested cortical hierarchy, neural state boundaries that are shared between many brain regions within functional networks, and particularly those shared widely across functional networks, would be more likely to be associated with the perception of an event boundary. To investigate this, we first weighted each neural state boundary in each searchlight by the proportion of searchlights within the same network that also showed a boundary at the same time. This is very similar to how we investigated the role of boundary strength above.”
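The within-network weighting step described in this passage could look roughly like the Python sketch below. The function name and the array layout (searchlights by timepoints) are hypothetical. The proportion is computed here over the other searchlights in the same network, excluding the searchlight itself; this is one reading of "also showed a boundary" and is an assumption, not a detail confirmed by the text.

```python
import numpy as np


def cooccurrence_weights(boundaries, networks):
    """Weight each boundary by within-network co-occurrence.

    boundaries: array of shape (n_searchlights, n_timepoints), binary.
    networks:   length n_searchlights sequence of network labels.
    Assumption: the weight is the proportion of OTHER searchlights in the
    same network that have a boundary at the same timepoint.
    """
    boundaries = np.asarray(boundaries, dtype=float)
    networks = np.asarray(networks)
    weighted = np.zeros_like(boundaries)

    for net in np.unique(networks):
        members = np.flatnonzero(networks == net)
        if members.size < 2:
            continue  # no other searchlights to share a boundary with
        counts = boundaries[members].sum(axis=0)          # boundaries per timepoint
        others = (counts[None, :] - boundaries[members]) / (members.size - 1)
        weighted[members] = boundaries[members] * others  # zero where no boundary

    return weighted


# Toy example: three searchlights in one network, one in another.
toy_boundaries = [[1, 0, 1],
                  [1, 0, 0],
                  [0, 0, 1],
                  [1, 1, 0]]
toy_networks = ["visual", "visual", "visual", "auditory"]
print(cooccurrence_weights(toy_boundaries, toy_networks))
```

The weighted series can then take the place of the binary boundary series when overlap with event boundaries is computed, analogous to weighting boundaries by their strength.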

Recommendations for the authors:

• It would be interesting to test whether event boundaries and non-event neural state boundaries form a similar hierarchy (though this may not be feasible with such a low number of event boundaries).

Unfortunately, this was indeed not feasible given the low number of event boundaries in our data. A longer movie would be required to answer this question.

https://doi.org/10.7554/eLife.77430.sa2

Article and author information

Author details

  1. Linda Geerligs

    Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
    Contribution
    Conceptualization, Data curation, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing - original draft, Project administration, Writing - review and editing
    For correspondence
    Linda.Geerligs@donders.ru.nl
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-1624-8380
  2. Dora Gözükara

    Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
    Contribution
    Software, Methodology, Writing - review and editing
    Competing interests
    No competing interests declared
  3. Djamari Oetringer

    Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
    Contribution
    Software, Methodology, Writing - review and editing
    Competing interests
    No competing interests declared
  4. Karen L Campbell

    Department of Psychology, Brock University, St. Catharines, Canada
    Contribution
    Writing - original draft, Writing - review and editing
    Competing interests
    No competing interests declared
  5. Marcel van Gerven

    Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
    Contribution
    Conceptualization, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-2206-9098
  6. Umut Güçlü

    Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
    Contribution
    Conceptualization, Software, Methodology, Writing - review and editing
    Competing interests
    No competing interests declared

Funding

Nederlandse Organisatie voor Wetenschappelijk Onderzoek (VI.Vidi.201.150)

  • Linda Geerligs

Natural Sciences and Engineering Research Council of Canada (RGPIN-2017-03804)

  • Karen L Campbell

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

LG was supported by a Vidi grant (VI.Vidi.201.150) from the Netherlands Organization for Scientific Research. KC was supported by the Natural Sciences and Engineering Research Council of Canada (grant RGPIN-2017-03804 to KC) and the Canada Research Chairs program. We thank Aya Ben-Yakov for providing data on the perceived event boundaries in the Cam-CAN movie dataset. Data collection and sharing for this project was provided by the Cambridge Centre for Ageing and Neuroscience (Cam-CAN). Cam-CAN funding was provided by the UK Biotechnology and Biological Sciences Research Council (grant number BB/H008217/1), together with support from the UK Medical Research Council and University of Cambridge, UK.

Ethics

Human subjects: This Cambridge Centre for Ageing Neuroscience study was conducted in compliance with the Helsinki Declaration, and has been approved by the local ethics committee, Cambridgeshire 2 Research Ethics Committee (now East of England - Cambridge Central; reference: 10/H0308/50). Participants gave written informed consent prior to participating in the study.

Senior Editor

  1. Michael J Frank, Brown University, United States

Reviewing Editor

  1. David Badre, Brown University, United States

Reviewer

  1. Charan Ranganath, University of California at Davis, United States

Publication history

  1. Preprint posted: February 5, 2021
  2. Received: February 3, 2022
  3. Accepted: September 14, 2022
  4. Accepted Manuscript published: September 16, 2022 (version 1)
  5. Version of Record published: October 4, 2022 (version 2)

Copyright

© 2022, Geerligs et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. Linda Geerligs
  2. Dora Gözükara
  3. Djamari Oetringer
  4. Karen L Campbell
  5. Marcel van Gerven
  6. Umut Güçlü
(2022)
A partially nested cortical hierarchy of neural states underlies event segmentation in the human brain
eLife 11:e77430.
https://doi.org/10.7554/eLife.77430