Abstract
Previously we showed that networkbased modelling of brain connectivity interacts strongly with the shape and exact location of brain regions, such that crosssubject variations in the spatial configuration of functional brain regions are being interpreted as changes in functional connectivity (Bijsterbosch et al., 2018). Here we show that these spatial effects on connectivity estimates actually occur as a result of spatial overlap between brain networks. This is shown to systematically bias connectivity estimates obtained from group spatial ICA followed by dual regression. We introduce an extended method that addresses the bias and achieves more accurate connectivity estimates.
https://doi.org/10.7554/eLife.44890.001Introduction
Resting state functional magnetic resonance imaging (rfMRI) can be used to characterise the rich intrinsic functional organisation of the brain (Biswal et al., 1995). Distributed networks of brain regions exhibit coordinated activity of neural populations both within and between networks, known as functional connectivity (Fox et al., 2005; Greicius et al., 2003; Raichle et al., 2001). Analytical approaches to functional connectivity can be broadly split into voxelbased methods (deriving mapbased connectivity estimates to study the spatial organisation of networks) and nodebased methods (approaches based on networkscience that describe connectivity in terms of ‘edges’ between functional brain regions) (Bijsterbosch et al., 2017; Rubinov and Sporns, 2010). Here, we focus on nodebased methods and ask how complex aspects of spatial organisation may influence estimated nodebased functional connectivity.
There is growing interest in interindividual differences in these nodebased functional connectivity patterns and their potential use as markers for health and disease (Finn et al., 2015; Kaiser et al., 2015). We previously showed that interindividual variability in estimated functional connectivity between brain regions is, to a large extent, driven by variability in the spatial organisation (i.e., the precise shape and location) of largescale brain networks (Bijsterbosch et al., 2018). Specifically, our previous results revealed that nodebased functional connectivity as normally estimated from rfMRI data is influenced by a mixture of spatial and temporal factors, with spatial information explaining up to 62% of interindividual variance. This is unexpected and arguably undesirable, because temporal correlationbased functional connectivity estimates are often considered as an accurate representation of temporal coupling strength between neural populations. Therefore, there is a need to better separate interindividual variability in spatial and temporal domains in analytical approaches to rfMRI data, in order to gain a better understanding of interindividual connectivity profiles and derive more interpretable associations with behaviour. To identify connectivity measures with the strongest unique association with behaviour (i.e., potential measures for clinical biomarkers), we need to understand and address the reason for the previously reported conflation of temporal and spatial sources of interindividual variability (Bijsterbosch et al., 2018).
In our earlier work we tested a number of parcellation methods (i.e., voxelbased methods used to parcellate the brain into a small set of largescale functional networks that consists of brain regions that exhibit similarities in their BOLD timeseries, or into a larger set of contiguous regions that consist of interconnected voxels). We tested a grouplevel functionallydefined hard parcellation (Yeo et al., 2011), an individualised multimodal hard parcellation (Glasser et al., 2016), and both high and low dimensional soft parcellations estimated with Independent Component Analysis (ICA) (Beckmann and Smith, 2004). We found that the crosssubject variability in spatial organisation strongly influenced nodebased functional connectivity estimates regardless of the parcellation type/method (Bijsterbosch et al., 2018). There are a number of possible explanations for this previously observed influence of spatial organisation on functional connectivity estimates, as discussed below.
Firstly, it is possible that our previous findings were driven by misalignment between the spatial boundaries of functional regions (‘nodes’) estimated at the grouplevel, and the true spatial organisation in an individual subject. This type of misalignment refers to the lack of voxeltovoxel correspondence across individuals that remains even after applying standard alignment approaches (here surfacebased multimodal alignment [Robinson et al., 2014]). A direct result of such misalignment, if not accounted for in the methodological approach used to estimate node timeseries, is that those timeseries will incorrectly contain a mixture of multiple ‘true’ node timeseries. The effect of such mixed timeseries is potentially profound, leading to a sharp reduction in the accuracy of subsequently estimated temporal correlations (‘edges’) (Smith et al., 2011). Parcellations are typically defined at the group level in order to ensure correspondence between nodes across subject, although some parcellation approaches described above include steps to account for misalignment (e.g. using a classifier for the multimodal parcellation, and dual regression for ICA [Hacker et al., 2013; Nickerson et al., 2017]). While these approaches are expected to appropriately account for misalignment, our previous findings suggested that spatial information strongly influenced estimated temporal correlations despite the use of dual regression or classifier steps to obtain subject specific maps (Bijsterbosch et al., 2018).
Secondly, it is expected that different parcellation methods preferentially represent connectivity information in either the temporal or spatial domain. For example, ‘nodes’ in a low dimensional soft parcellation take the form of spatially extended networks of brain regions, and therefore the connectivity between functional regions that are included in the same network are described in the spatial map. Conversely, these functional regions would be split into separate nodes when using a highdimensional (contiguous) hard parcellation, and consequently the same connectivity between functional regions must be described in the edges. In our earlier work, we tested to what extent our findings were explained by this type of withinnetwork connectivity information that may be represented in lowdimensional spatial maps. The findings showed that the direct mapping between lowdimensional extended networks and highdimensional node edges does not explain the influence of crosssubject variability in spatial organisation on crosssubject variability in edge estimates (see Supplementary file 1e in Bijsterbosch et al., 2018).
Thirdly, it is possible that the assumptions that underlie the estimation of the grouplevel parcellation are incorrect (e.g. the spatial independence constraint in spatial ICA). If the assumption of spatial independence is incorrect, this would result in misestimated group maps, which may affect how functional connectivity is represented in downstream dual regression estimates. Breaking the spatial independence assumption implies the presence of relatively extensive amounts of spatial overlap between nodes, such that spatial correlations between node maps are present. While this type of complex, overlapping spatial structure does not easily fit with the intuitive notion of nodes (and is precluded, by definition, in hard parcellations), the example of overlapping receptive fields with selectivity for stimulus orientation, length, width, or colour in V1 demonstrates the possibility for hierarchically overlapping functional systems (Van Essen and Maunsell, 1983). Indeed, taskbased functional activation patterns can be more accurately captured based on softparcellations with inherent scope for overlap, compared with hard parcellations (Bzdok et al., 2016). It is possible that spatial ICA may underestimate spatial overlap between nodes (as a result of enforcing spatial independence in the estimated maps), although previous work has shown that, in the presence of noise, this overlap can be recovered using thresholding (Beckmann et al., 2005).
An alternative parcellation method, designed to avoid the spatial independence constraint, is PROFUMO. This adopts a hierarchical Bayesian framework that includes spatial priors (for map sparsity and group map regularisation) and temporal priors (consistent with the hemodynamic response function) (Harrison et al., 2019; Harrison et al., 2015). The Probabilistic Functional Mode (PFM) maps obtained from PROFUMO commonly show relatively extensive amounts of spatial overlap (and hence spatial correlation) between nodes. These overlap regions may contain a spatial representation of complex betweennode patterns of functional connectivity. Therefore, it is possible that the presence of these spatial correlations in PFM maps, which are not present in unthresholded group spatial ICA maps (by design, as a result of spatial independence), may explain our previous results (Bijsterbosch et al., 2018).
The aim of this work is to disambiguate functional connectivity information in the temporal and spatial domains, and to determine the influence of different algorithms used for parcellation and connectivity estimation. Specifically, we focus on soft parcellations to determine the interaction between complex functional brain organisation and the estimation of spatial nodes and temporal edges. We adopt simulation approaches that allow direct control and knowledge of the ground truth, provide mathematical explanations, show examples observed in real data, and test associations with behaviour.
Results
Dual regression performance in the presence of misalignment
One potential explanation for our previously reported findings (Bijsterbosch et al., 2018), is spatial misalignment between the exact location of node boundaries in individual subjects compared with group maps. In a typical ICA pipeline, group ICA (using the temporal concatenation approach) is followed by dual regression (Nickerson et al., 2017). Dual regression aims to estimate subjectspecific spatial maps that accurately capture spatial organisation, accounting for any misalignment with the group maps. We will refer to group ICA followed by dual regression as the ‘ICADR’ pipeline throughout.
To assess the degree to which the ICADR pipeline is affected by spatial misalignment, we make use of the fact that data from the Human Connectome Project are available using two different surfacebased versions of multimodal surface matching (MSM) spatial alignment (Robinson et al., 2018; Robinson et al., 2014), and as volumetrically aligned data. Cortical alignment using ‘MSMSulc’ is driven by gyral folding patterns alone, whereas cortical alignment using ‘MSMAll’ incorporates multiple spatial features including folding patterns, myelin maps, functional resting state networks, and visuotopic maps for alignment. Previous work has shown substantial improvements in alignment when using MSMAll compared with MSMSulc, and both types of surface alignment are superior compared with volumetric alignment (Coalson et al., 2018).
Here, we directly compare spatial alignment using MSMAll, MSMSulc, and volumetric data at the single subject level. Firstly, we perform singlesubject ICA independently on each dataset (temporally concatenated across four runs; dimensionality = 25). Given that the data are identical apart from the alignment procedure, any differences in the spatial network boundaries extracted with ICA from the MSMAll or MSMSulc data should be driven exclusively by misalignment (although note that for volumetric data other tissue types such as white matter and CSF may influence the network decomposition). Furthermore, these singlesubject ICA results are not influenced by other subjects or by the group, and can therefore be viewed as the ‘ground truth’ spatial organisation in this subject for each respective alignment type. Secondly, we perform dual regression against group maps obtained from 1004 subjects aligned using MSMAll (dimensionality = 25). Matching group maps in volumetric space were obtained by regressing each subject’s MSMAll stage one dual regression timeseries into that subject’s volumetric data, these maps were then averaged across all subjects and entered into a spatial ICA with the same dimensionality (to ensure spatial independence, which may not be fully preserved in the regression and averaging steps). These group maps are expected to better represent the subjectspecific MSMAllaligned data compared with the subjectspecific MSMSulc and volumetrically aligned data. Therefore, we can directly test how well dual regression captures the ‘ground truth’ spatial organisation obtained using singlesubject ICA for each respective alignment type. This procedure was repeated separately in a subset of N = 22 subjects.
Firstly, we perform spatial correlations between group maps and singlesubject spatial ICA maps to determine how well the group maps represent the subjectspecific organisation. Correlations are transformed using Fisher’s RtoZ and entered into a oneway ANOVA with a factor for alignment (3 levels corresponding to MSMAll, MSMSulc, and volumetric). The main effect was significant (F(2,1098)=673.2, p<10^{−10}), and all posthoc paired tests were significant after Bonferroni correction (MSMAllMSMSulc ΔZ=0.045, p<10^{−10}; MSMSulcvolumetric ΔZ=0.152, p<10^{−10}; see Figure 1, Figure 1—figure Supplement 1A).
Next, we estimate the correlation between subjectspecific maps obtained with dual regression and “ground truth” maps, separately for MSMSulc, MSMAll, and volumetrically aligned data. Results from a oneway ANOVA with a factor for alignment (3 levels corresponding to MSMAll, MSMSulc, and volumetric), showed a significant main effect, F(2,1098)=842.2, p<10^{−10}. This effect was driven by a significantly lower correlation between dual regression and “ground truth” maps in volumetric data (MSMSulcvolumetric ΔZ=0.242, p<10^{−10}). Conversely, the difference in this correlation between MSMAll and MSMSulc did not reach significance (ΔZ=0.013, p=0.061; see Figure 1, Figure 1—figure Supplement 1B).These results show that, despite misalignment between MSMSulc subject data and MSMAll group maps, dual regression was largely able to overcome this misalignment to estimate the subjectspecific spatial organisation. However, dual regression was not able to correct for the more substantial amounts of misalignment observed in volumetrically aligned data.
An example of an individual ICA component is shown in Figure 1. These results qualitatively illustrate the extent to which dual regression corrected for minor misalignment between MSMAll and MSMSulc. There was a clear shift in the parietal node of the default mode network in MSMSulc data (Figure 1C) compared with MSMAll data (Figure 1A). Nevertheless, dual regression against the identical set of group maps is able to accurately estimate the shifted versions of the subject maps (Figure 1B and D).
Crosssubject relationship between temporal and spatial connectivity
As discussed in the introduction, there are multiple potential reasons for the previously observed influence of spatial information on estimated nodebased functional connectivity, including spatial misalignment and inappropriate parcellation assumptions. In the previous section we show that the contribution of spatial misalignment is likely to be relatively minor provided that surfacebased alignment is used in conjunction with the ICADR pipeline. Next, we aim to test the role of inappropriate parcellation assumptions.
Previous rfMRI research has utilised the spatial shape and amplitude of intrinsic networks (Filippini et al., 2009), or temporal correlation patterns between timeseries extracted from nodes (Smith et al., 2011). However, correlations between network/node spatial maps (“spatial edges”) are not commonly studied. A potential reason for this is that many parcellation approaches (by design) result in binary, nonoverlapping node maps (Glasser et al., 2016; Gordon et al., 2016; Yeo et al., 2011). For the purposes of this paper, we define temporal edges as the correlation between fMRI timeseries extracted from two separate nodes, i.e., $\frac{cov({T}_{X},{T}_{Y})}{{\sigma}_{TX}{\sigma}_{TY}}$, where ${T}_{X}$ and ${T}_{Y}$ are vectors (size timepoints x 1) representing the extracted timeseries of two different nodes ($X$ and $Y$; Figure 2). Spatial edges are mathematically defined as the correlation between the two node spatial maps, i.e., $\frac{cov({S}_{X},{S}_{Y})}{{\sigma}_{SX}{\sigma}_{SY}}$ (Figure 2). In the case of grayordinate data, ${S}_{X}$ and ${S}_{Y}$ (the spatial maps of nodes $X$ and $Y$) are represented as vectors (size 91282 x 1), but three dimensional (volumetric) spatial node maps can also simply be vectorised to allow straightforward calculation of the correlation coefficient. To begin to elucidate the interaction between estimated functional connectivity in the spatial and temporal domains, we first assess the relationship between temporal correlations and spatial correlations estimated with spatial ICA using HCP data.
Group ICA was performed using all data from 1,004 HCP subjects, and the dimensionality of decomposition was fixed at 50 nodes. Following group ICA (using the temporal concatenation approach), dual regression was performed to derive subjectspecific node timeseries and spatial maps. Dual regression comprises two stages, where stage one involves multiple spatial regression against group spatial ICA node maps (resulting in subjectspecific node timeseries), and stage two involves multiple temporal regression against the stage one timeseries (resulting in subjectspecific node spatial maps).
Subjectspecific node timeseries derived from 50 nodes are correlated to estimate a 50 × 50 network matrix of temporal edges, and subjectspecific node spatial maps from 50 nodes are correlated to estimate a 50 × 50 spatial overlap matrix of spatial edges. Pearson’s correlation coefficient (‘full correlation’) is used for both temporal and spatial edge estimates. Note that, while group spatial ICA maps are uncorrelated due to the spatial independence constraint, the subjectspecific node maps derived from stage 2 of dual regression can be correlated. We subsequently directly compare the relationship between spatial overlap and temporal network matrices across all edges and subjects.
The results reveal a significant negative correlation (across edges and subjects) between ICADR temporal network matrix edges and ICADR spatial overlap matrix edges (Figure 3). This means that when the spatial maps for two nodes are less correlated (e.g. there is less spatial overlap), then there is more positive functional connectivity between these nodes. This negative association is surprising, because one might expect the presence of less spatial overlap between two nodes to be associated with less functional connectivity (rather than more) between those nodes.
Temporal network and spatial overlap matrix estimation in ICA followed by dual regression
We now provide evidence that the negative association between node spatial map correlations and functional connectivity in Figure 3 could be the result of dual regression being used on spatial ICA maps that are incorrect, due to the assumption of spatial independence being wrong (e.g. when there is spatial overlap between nodes).
Consider that neuroimaging data $Y\left[timepointsxvoxels\right]$ can be summarised as a linear combination of a set of spatial maps $S\left[voxelsxN\right]$ (where N = number of extracted components) and a set of timeseries $T\left[timepointsxN\right]$ following the outer product model:
Here, $R=QQ\text{'}\left[NxN\right]$ is a square rotation matrix, which may be changed from the identity matrix in order to enforce independence in either the spatial or temporal domain (as is the case in ICA). The decomposition is often considered in terms of estimated component spatial maps $M\left[Nxvoxels\right]$, and component timeseries $A\left[timepointsxN\right]$.
If the rotation matrix $QQ\text{'}$ enforces spatial independence, $cov\left(M\text{'}\right)=cov\left(QS\right)\left[NxN\right]$ is equal to the identity matrix. Equation 4 therefore shows that all betweencomponent covariance information must be contained in $cov\left(A\right)=cov\left(TQ\right)\left[NxN\right]$ in the case of spatial independence. Hence, $cov\left(TQ\right)$ as estimated from spatial ICA (or equivalently from stage 1 dual regression) reflects a weighted combination of the ground truth temporal and spatial covariance. Generally speaking, each parcellation method sits somewhere along a continuum of how the total combined temporal and spatial covariance structure is represented, as determined by the form of the rotation matrix $QQ\text{'}$. For example, both nonoverlapping hard parcellations and spatial ICA represent all covariance structure temporally, whereas temporal ICA represents all covariance structure spatially (Smith et al., 2012). PROFUMO does not explicitly enforce orthogonality or independence in either domain, and sits along this continuum between both extremes based on model parameters and priors.
To explain the negative correlation between dual regression spatial overlap and temporal network matrices (Figure 3), we need to look at stage 2 of dual regression. Here, a multiple temporal regression is performed using the $A=TQ$ timeseries from Equation (4) to estimate node spatial maps:
For clarity, we provide the matrix dimensions as: $pinv\left(TQ\right)\left[Nxtimepoints\right]$ and $Y\left[timepointsxvoxels\right]$. Assuming that $TQ$ are zero mean and unit variance, the first part of the pseudoinverse $\left({\left(TQ\right)\text{'}TQ)}^{1}\right[NxN]$ is equal to the inverse of the covariance matrix (i.e., the partial correlation between stage 1 dual regression timeseries multiplied by 1). Therefore, the spatial maps obtained in Equation (5) (${S}_{DR}$) are negatively weighted by the partial correlation between the stage 1 timeseries ($TQ$). As we saw in Equation (4), $cov\left(TQ\right)$ represents a weighted combination of the ground truth spatial and temporal covariance due to the enforced spatial independence. Therefore, the inflated temporal correlations in $cov\left(TQ\right)$ will negatively weight the correlations between spatial maps, resulting in the negative correlation observed in Figure 3.
It is worth noting that the above theory holds in the absence of any added unstructured noise, which is expected to be present in realistic rfMRI data. Previous work has shown that thresholding can be used in the presence of noise to recover spatial overlap between group spatial ICA node maps (Beckmann et al., 2005). Therefore, using thresholded maps to extract timeseries may be an appropriate technique to provide more accurate measures of spatial overlap and temporal network matrices for the ICADR pipeline. This thresholding approach is discussed and tested in detail below in the section entitled ‘Using mixturemodel thresholding to improve ICA dual regression estimates’.
To show the effects of the theory described above, we performed a two dimensional simulation that includes two correlated maps (25% overlap; each map occupying 1 percent of total voxels). We generated data for 50 subjects by taking the outer product of the maps with two correlated timeseries, and entered simulated data from all 50 subjects into a spatial ICA using the temporal concatenation approach. Random noise was added to each map in each subject, but there was no systematic misalignment across simulated subject data. The full simulation was repeated across 10 instances to obtain robust results. Example group maps estimated using spatial ICA show somewhat weaker weights in the overlapping region, consistent with the independence constraint (Figure 4A and insert). The group maps are used in a dual regression pipeline to obtain estimated subject timecourses (and derived temporal correlations), and estimated subject maps (and derived spatial correlations). As described in the theory above and consistent with the results shown in real data in Figure 3, the derived spatial overlap and temporal network matrices are inversely correlated with each other (Figure 4B). As expected, the temporal network matrix edges were shifted positively compared to the ground truth (Figure 4C), and the spatial overlap matrix edges were shifted negatively compared to the ground truth (Figure 4D). Nevertheless, the individual maps and timecourses estimated with ICADR are highly correlated with the ground truth (Figure 4E). Hence, even though the firstorder map and timecourse estimates from ICADR may have good accuracy, secondorder spatial and temporal edge estimates can be more strongly affected by systematic shifts if the assumption of spatial independence is not met.
The theory and simulation above describe the effects of the presence of ‘true’ spatial overlap on estimated temporal and spatial correlations obtained from a traditional ICADR pipeline. Spatial overlap is shown to affect estimated temporal and spatial correlation such that: (i) temporal correlations between stage one dual regression timeseries are a weighted sum of ground truth spatial and temporal correlations, and (ii) spatial correlations between stage two dual regression node maps are inversely weighted by the partial correlation between timeseries.
In our previous work, we showed that simulated data containing *only* interindividual variation in node spatial maps resulted in a substantial amount of interindividual information in temporal network matrices estimated with ICADR (Bijsterbosch et al., 2018). There, the spatial information in the simulated data (i.e., the simulation ‘ground truth’) were subjectspecific PFM maps, which are known to contain spatial overlap (Harrison et al., 2019; Harrison et al., 2015). Therefore, the theory above provides a clean (and mathematical) explanation for how the ‘ground truth’ spatial correlations present between PFM spatial maps can contaminate temporal correlations estimated from traditional ICADR.
Evidence for the existence of overlap in real data
The theory and results in the previous sections show the potential effect of ‘true’ spatial overlap on results obtained from the ICADR pipeline. An important question is what level of overlap in the spatial organisation of largescale brain networks is present in rfMRI data. While it is not straightforward to know the ‘ground truth’ functional organisation in the human brain, we present several results that can provide insights.
The first approach is to simply take subjectspecific map estimates obtained from different parcellation methods (ICADR and PROFUMO), and to sum the grayordinatewise weights across all maps. Subjectspecific maps are first normalised (separately per node) to a maximum of 1 and thresholded such that voxel weights between −0.2 and 0.2 are set to zero to remove background noise. These steps are needed to avoid the resulting summary overlap maps being driven by either node maps with strong weights or by contributions from background weights. For each subject, a summary map is obtained by summing the absolute values at each grayordinate across all node maps. These summary maps are subsequently averaged across all subjects. The results of this simple approach reveal areas of overlap in the temporoparietaloccipital junction (Figure 5A). Overlap areas are spatially consistent between ICADR and PROFUMO approaches, although PFM maps show somewhat more extensive regions with high overlap.
In addition, it is of interest to compare maps obtained from ICA and PROFUMO. PROFUMO does not enforce the spatial independence constraint and is therefore well suited to capture overlap. Figure 5B shows a direct comparison of the overlap between two matching grouplevel components obtained from PROFUMO and ICA. The spatial correlation between unthresholded ICA and PFM maps is high (r = 0.84 between the red maps and r = 0.80 between the green maps). However, the spatial correlation between the two unthresholded PFM maps is strong (r = 0.46), whereas there is no correlation between the two unthresholded ICA maps. While the thresholded maps in Figure 5B on the right show that ICA captures some of the overlapping regions, the extent of these is qualitatively reduced compared to PFM maps on the left.
Using mixturemodel thresholding to improve ICA dual regression estimates
We have demonstrated that if the assumption of spatial independence is incorrect, then this induces negative correlations between node spatial maps, which in turn can contaminate functional connectivity through dual regression. We now consider an alteration to traditional ICA dual regression to alleviate this contamination. The proposed method works by reducing the problematic negative spatial correlations present in the node/component spatial maps following spatial ICA. This is known as thesholded dual regression, in which a Gaussian/Gamma mixture model can be fitted to the histogram of an ICA component map to determine a threshold used to zero the background. This allows better recovery of ground truth spatial correlations in simple simulations (see Figure 3 in Beckmann et al., 2005).
In order to estimate timecourses such that the temporal network matrices more accurately match the ground truth, we propose to perform thresholding of the subjectspecific spatial maps obtained from dual regression stage two using Gaussian/Gamma mixture modelling. Specifically, the proposed solution includes the following stages (Figure 6):
Multiple regression of subject data against group maps (identical to current stage 1)
Multiple regression of subject data against stage one timeseries (identical to current stage 2)
Thresholding of stage two spatial maps using mixture modelling (newly proposed stage 3)
Multiple regression of subject data against stage three thresholded maps (newly proposed stage 4)
These stage four timeseries will then be used for the estimation of temporal network matrices, instead of using the intermediate timeseries from dual regression stage 1. Below, we use simulations to test this thresholded ICADR procedure against the standard ICADR pipeline and against PROFUMO.
In stage three above, mixture modelling is used to approximate the distribution of voxelwise spatial weights in a given subjectspecific stage two map using a Gaussian for the background, and two Gamma distributions for positive and negative tails (Beckmann et al., 2005). The mean and standard deviation of the Gaussian background distribution are used to shift and rescale the map weights and a threshold of 2 is applied to set the background to zero.
Dual regression can be performed as part of two different analysis pipelines with relatively separate goals and interpretations. Firstly, dual regression can be performed in order to draw inferences regarding the spatial shape and local amplitude of networks. This analysis procedure is best performed on the unthresholded stage two spatial maps, because subjectspecific thresholding may introduce unwanted biases in the subsequent voxelwise inference performed on the maps. Secondly, dual regression can be performed in order to extract timecourses used to estimate temporal network matrices (and hence downstream inferences based on networkmatrix edge comparisons across subjects). For this second analysis pipeline we show here that using stage four timeseries instead of stage one timeseries is beneficial in order to more clearly separate spatial and temporal information (at least in the typical case of wholebrain analyses; spatial ICA performed within a small region of interest may behave differently if the number of background voxels is reduced relative to the number of ‘active’ voxels, such that the mixture model may not be valid). A third potential use of dual regression is to study spatial overlap matrices. While spatial relationships between networks are not typically studied, there is increasing interest in crosssubject differences in spatial network organisation. Spatial overlap matrices (which should be studied using stage three thresholded maps) reflect a relatively simple metric related to overlapping organisation and may become of increasing interest, particularly when studying associations with behaviour.
Direct comparisons of PROFUMO and ICA accuracy using simulations
To test the accuracy of estimates obtained from PROFUMO, traditional ICADR, and thresholded ICADR (described in the previous section), we simulated an rfMRI dataset containing a ground truth set of nodes. To generate the simulations, we followed the general framework described in detail in Harrison et al. (2019) and Harrison et al. (2015), with slight changes to some of the parameters. Briefly, the simulated data contained 10,000 voxels, 30 subjects, two runs per subject, and 600 timepoints per run. This full simulation is repeated 10 times in order to obtain wellsampled results.
In the spatial domain, a binary atlas of 100 nonoverlapping contiguous parcels was generated, and random subjectspecific warps were applied as a simplified representation of misalignment (with the maximum possible voxelwise displacement equal to the average parcel size). Then, 15 nonbinary spatial node maps were generated by specifying weights for each atlas parcel determining how strongly it contributes to each spatial node. While the parcel identities were fixed, the strength of the weights varied from subjecttosubject. In the temporal domain, sparse and correlated ‘neural’ timecourses were generated and convolved with subjectspecific haemodynamic response functions.
Following the generation of subject data, the full PROFUMO, traditional ICADR, and thresholded ICADR pipelines are performed to obtain group maps, subjectspecific maps, and subjectspecific timeseries. To test the performance of each of these approaches, we calculate (at a subject level): (i) the correlation between estimated and ground truth timeseries, (ii) the correlation between estimated and ground truth spatial maps, (iii) the correlation between estimated and ground truth results for each temporal network matrix edge (across subjects), and (iv) the correlation between estimated and ground truth results for each spatial overlap matrix edge (across subjects).
In addition, we focus on a specific subset of the edges (across all repeats of the simulation) with significantly positive ground truth spatial correlation based on a onetailed ttest against zero (after Bonferroni correction for multiple comparison across 15*14/2 edges * 10 iterations = 1050 comparisons). A total of 126 edges (out of 1050) showed significant ground truth positive spatial correlation. The simulations are designed to have reasonable, but not excessive, amounts of spatial overlap (consistent with the results in Figure 5A), and therefore the spatial correlations are relatively weak, despite being significantly different from zero. We investigate these specific edges further because the tests described above are correlationbased and it is possible that interindividual variance is captured well (leading to good results in correlationbased tests) even in the presence of a considerable bias in absolute network matrix values. We aim to test whether there is any absolute bias away from ground truth network matrix values (as measured with the positiveedge comparisons), and how well each method captures relative differences between individuals (as measured using the correlationbased tests).
The results show that the accuracy of estimated timeseries is relatively good in all of the analysis pipelines tested here (Figure 7A). The accuracy of spatial map estimation is superior in PROFUMO compared to both variations of the ICADR pipeline (Figure 7B). As expected, the accuracy of spatial map estimates was improved using the thresholded ICADR pipeline compared to the traditional ICADR pipeline (Figure 7B). Similar overall results are found for the accuracy of spatial overlap matrices (Figure 7D). These biases in secondary estimates for spatial and temporal edges are also found in the absence of any betweensubject spatial misalignment, confirming that this effect is independent of any potential misalignment problems (Figure 7, Figure 7—figure supplement 1). When focusing on edges with significantly positive ground truth spatial correlation, PROFUMO outperformed both ICADR pipelines for estimating spatial edges (Figure 7E), whereas thresholded ICADR results were closest to the ground truth for estimating temporal edges (Figure 7F).
Linking spatial overlap and temporal network matrices to behaviour
In our previous work we found a strong relationship between PROFUMO spatial maps and a behavioural mode that includes positive and negative traits (Bijsterbosch et al., 2018). Here, we repeat the canonical correlation analysis (CCA) on 1,001 HCP subjects to estimate multivariate relationships between a set of behavioural variables and a set of edges. For the edges, we test both spatial overlap and temporal network matrices obtained from one of the pipelines described above. Temporal network matrices are classically used as a proxy for neural coupling, and are commonly studied in the literature. On the other hand, spatial overlap matrices reflect spatial overlap between resting state networks, which is an aspect of functional connectivity that has not yet received attention in the existing literature. Here, we aim to test which of these different aspects of functional connectivity is most strongly and uniquely associated with individual differences in behaviour.
Details of the CCA procedure can be found in our previous work (Bijsterbosch et al., 2018). Briefly, CCA takes a set of behavioural measures and a set of edges as inputs and determines a linear combination of each such that the resulting scores are maximally correlated between the two sets of input variables. Following earlier work (Bijsterbosch et al., 2018; Smith et al., 2015), we use a total of 158 behavioural variables available in the Human Connectome Projects broadly covering demographic (e.g. education level), psychometric (e.g. IQ), lifestyle (e.g. substance use), and psychosocial (e.g. rulebreaking behaviour) measures. Previous work has shown a strong relationship between a behavioural population mode of variation that includes positive measures (such as IQ and selfreported life satisfaction) and negative measures (such as drug and alcohol use) (Smith et al., 2015).
To determine the unique variance contained in spatial overlap and temporal network matrices respectively, we adopted the same partial CCA approach that was used previously (Bijsterbosch et al., 2018; Smith et al., 2015). We regress spatial edges onto temporal edges within each method (after dimensionality reduction), and enter the residuals into a standard CCA against behavioural measures. Hence, the partial CCA results remove any shared behaviourallyrelevant variance that is present in both spatial overlap and temporal network matrices within the overall method (i.e., within PROFUMO/standard ICADR/thresholded ICADR).
All of the spatial overlap and temporal network matrix results obtained with any of the three pipelines show significant associations with behaviour (Figure 8, blue bars). The strongest relationship with behaviour is observed for spatial overlap matrices estimated with PROFUMO (R_{UV} = 0.72), which is significantly stronger than with the original ICADR temporal network matrix result (p=0.03), and compared with the thresholded ICADR temporal network matrix result (p=0.001). This finding is closely linked to our earlier work, where we reported a strong CCA result for PFM spatial maps, but there are key differences in the way brainbased inputs to the CCA were calculated between these two findings. Here, we use subjectspecific spatial overlap matrices as input (i.e., only including spatial correlations between maps), whereas our previous results were based on the full set of subject spatial maps (i.e., including all spatial features in all maps).
To facilitate the interpretation of these findings in light of our earlier work (Bijsterbosch et al., 2018), we include a movie of spatial overlap as a function of the most significant behavioural CCA mode (Figure 8—video 1). Here, we followed the same procedure described previously to generate interpolated movie frames for each of the 50 PFM maps, and subsequently summed the grayordinatewise weights across all 50 maps following normalisation (as for Figure 5A). The resulting movie (Figure 8—video 1) shows that spatial overlap systematically varies along with the networkspecific spatial variation that we showed previously (Bijsterbosch et al., 2018). Hence, these results show that spatial overlap (and resulting correlation) is a key behaviourallyrelevant aspect of spatial information.
Out of the results that reach significance in Figure 8, subject behavioural weights are significantly correlated with the previously reported positivenegative population mode of behaviour (Bijsterbosch et al., 2018; Smith et al., 2015) for all results except for PFM temporal netmats. The PFM temporal netmats are instead linked to variables such as blood pressure (diastolic and systolic), hematocrit values, and alcohol use, therefore representing a more physiological population mode. Correlation with the positivenegative mode was reduced for all partial CCA results compared with full CCA results (R_{full} = 0.14 to R_{partial} = 0.01 for ICADR thresholded temporal network matrices; R_{full} = 0.42 to R_{partial} = 0.21 for ICADR original temporal network matrices; R_{full} = 0.39 to R_{partial} = 0.16 for PFM spatial overlap matrices). Hence, the partial results were less similar to the positivenegative behavioural mode than the full CCA results, indicative of shared behaviourallyrelevant information between spatial overlap and temporal network matrices in all of the approaches that were tested. Specifically, the amount of overall shared variance between temporal network and spatial overlap matrices was 29.6% for original ICADR, 26.8% for thresholded ICADR, and 24.5% for PROFUMO. Hence, these results show reduced conflation of temporal and spatial information in both thresholded ICADR and PROFUMO compared with traditional ICADR. Interestingly, the partial CCA result for the ICADR thresholded temporal network matrix was no longer significantly linked to the positivenegative behavioural mode (R_{partial} = 0.01, p_{Bonferroni} = 1), supporting the interpretation that crosssubject behavioural variability in temporal network matrices is largely driven by spatial information.
Discussion
We previously showed that the spatial topographical organisation of functional networks is most strongly predictive of crosssubject variability in behaviour, and that functional connectivity (temporal) network matrices contain little unique traitlevel crosssubject information that is not also reflected in spatial maps (Bijsterbosch et al., 2018). Here, we aimed to determine and address the reason for this conflation of temporal and spatial sources of interindividual variability, focussing on the pipeline that uses ICA for parcellation and dual regression for the estimation of subject maps and timeseries. Our results reveal that functional networks are often spatially overlapping (Figure 5), and that individual differences in the amount of spatial overlap (and resulting spatial correlation) is likely to explain our previous results. These findings support the main results from our earlier article that estimated temporal correlations are (in part) driven by spatial information (Bijsterbosch et al., 2018). However, our new findings show that these results should not be interpreted as spatial misalignment (as we suggested in our earlier work), but are rather the result of spatially overlapping functional organization. Spatial overlap is underestimated in spatial ICA (due to the statistical independence constraint; Figure 4A), systematically affecting downstream dual regression steps, such that the resulting temporal network matrices are negatively weighted by any spatial correlations present in the underlying true networks. These effects, discussed mathematically in section ‘Temporal network and spatial overlap matrix estimation in ICA followed by dual regression’, and shown in simulated (Figure 4) and real data (Figure 3), directly explain our previous dual regressionbased results. Importantly, these methodological aspects are highly relevant for research into brainbehaviour relationships for two reasons. Firstly, they influence the interpretation of existing research using temporal network matrices (which likely reflect a combination of spatial and temporal information). Secondly, our findings suggest that patterns of spatial overlap may be highly sensitive to individual differences in behaviour and warrant further research.
As discussed in the introduction, one potential reason for our earlier findings could have been linked to residual spatial misalignment between group and subject maps (Bozek et al., 2018; Demirtas et al., 2018; Kong et al., 2019). The results in Figure 1 reveal that dual regression obtains relatively accurate estimates in the presence of minor misalignment (as modelled with MSMAll compared with MSMSulc), suggesting that spatial misalignment may not contribute strongly when care is taken to adopt surfacebased alignment methods and subjectspecific nodedefinition approaches. Therefore, our earlier findings (in HCP data) are more likely explained by the biases introduced as a result of spatial overlap, and not by residual spatial misalignment. However, dual regression was not able to account for more extensive misalignment which is seen when using volumebased alignment procedures (Coalson et al., 2018). Therefore, it is likely that misalignment may influence functional connectivity estimates for analyses performed in volumetric space (Smith et al., 2011). Similarly, misalignment is likely to play a larger role when using anatomicallyderived hard parcellations that are unlikely to match the subjectspecific functional organisation well.
In this work we have broadly considered spatial edges as a distinct form of functional connectivity (to be differentiated from temporal edges; see Figure 2). However, the correct interpretation of spatial edges is an important new research question that requires further investigation. On the level of neural circuitry, neural organisations that may give rise to observed spatial overlap in fMRI include: (i) independent colocated (at fMRI resolution) neural populations, (ii) fast dynamic switching between network membership, (iii) multiplicity of hierarchically organised functional representations, (iv) concurrent contributions to multiple networks (i.e., integration). Determining the neural basis of spatial overlap will inform the correct interpretation of spatial edges by addressing questions such as: ‘is spatial overlap indicative of coupling?”, and ‘do spatial edges fall within the definition of functional connectivity?”.
The detailed explanation of our previous findings presented here only applies to the results that were obtained using a dual regression pipeline. Mathematical differences in timeseries estimation imply that the theory described here cannot explain our earlier findings obtained using a masking procedure with hard parcellations (i.e., Yeo and HCP_MMP1.0) (Glasser et al., 2016; Yeo et al., 2011). We are currently undertaking similarly detailed investigations to further understand our previous results in these hard parcellations. Early qualitative observations suggest the presence of complex patterns of partial overlap between multiple extended networks. This complex overlap may not easily be modelled by increasing the dimensionality to allow separation of previously overlapping voxels into one new parcel, and nonoverlapping voxels into other parcel(s) (which would allow results to be unbiased, particularly at a sufficiently high dimensionality). As such, our preliminary results suggest that hard parcellations are affected by overlapping functional organisation due to the misestimation of spatial nodes and resulting mixing of extracted timeseries (Smith et al., 2011). Further work is needed to test this hypothesis and to develop parcellation methods that address this issue.
The biases described in this work can be addressed using a relatively straightforward extension to the existing dual regression pipeline. Specifically, we suggest a further regression step after standard dual regression, that uses thresholded subject maps to estimate timeseries. It has previously been shown that thresholded maps (using GammaGaussian mixture modelling) better capture spatial overlap (Beckmann et al., 2005; Bielczyk et al., 2018). Here, we explicitly test this approach and report greater accuracy in the estimation of timeseries, spatial maps, and both temporal and spatial overlap matrices when compared with traditional dual regression (Figure 7). When dual regression is performed with the aim of extracting timeseries and estimating temporal network matrices (as opposed to statistical tests of spatial map shape and amplitude), we recommend adopting this thresholded dual regression extension in order to accurately estimate node timeseries. While this approach may lead to somewhat lower associations with behaviour (Figure 8), it is driven more purely by uniquely temporal connectivity information, rather than conflating both temporal and spatial connectivity. Thresholded dual regression will be implemented in the next version of FSL dual_regression (https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/DualRegression).
Our results show that spatial correlations obtained from PROFUMO are most strongly linked to behaviour (Figure 8). This finding is an indirect replication of our earlier work (Bijsterbosch et al., 2018), but here it is driven purely by correlations between spatial maps, while we previously entered the crosssubject covariance matrix of the full weighted spatial maps. Hence, this result indicates that crosssubject variance in spatial correlations (rather than in more complex aspects of spatial organisation) are behaviourallyrelevant. Unfortunately, spatial correlations obtained from thresholded dual regression are not as strongly associated with behaviour (Figure 8), which is likely linked to the fact that PROFUMO outperforms thresholded dual regression in the accuracy of estimated spatial maps and correlations (Figure 7).
The methodological challenge of accurately modelling spatial overlap in network organisation that is addressed here has important implications for a number of key neuroscientific questions. Our findings show that spatial overlap was prominent in the temporoparietaloccipital junction and within higher visual areas along the lateral wall (Figure 5). The hierarchical functional organisation of V1 is relatively wellunderstood to represent overlapping receptive fields with selectivity for stimulus orientation, length, width, or colour (Van Essen and Maunsell, 1983). On the other hand, the complex functional organisation in the inferior parietal lobule, and its involvement in wideranging perceptual and cognitive functions, is the topic of ongoing research and debate (Carter and Huettel, 2013; Igelström and Graziano, 2017; Lin et al., 2018; Mars et al., 2012). Further evidence of the highly complex spatial nature of functional organisation can also be found in recent gradientbased analyses of cortical topography (Haak et al., 2018; Margulies et al., 2016; Marquand et al., 2017). These spatial complexities are largely ignored in most current connectomics approaches, where the use of groupbased anatomical hard atlases (such as the AAL) is still highly pervasive. This likely limits the ability to gain further insights into the organisation and function of cortical regions such as the temporoparietaloccipital junction. As such, the results presented here and elsewhere emphasise the need for modern connectomics research to acknowledge and appropriately account for the presence of highly complex, hierarchically overlapping functional organisation.
It is possible that the overlapping organisation identified here may be linked to complex patterns of interdigitation between largescale networks (Braga and Buckner, 2017). We did not find any evidence for interdigitated patterns here, although the results in Figure 5 (and similar observations at the individual subject level) do show adjacency of two separate networks in a number of distinct cortical zones, consistent with the findings of Braga et al. A larger number of data points per subject may be needed to estimate further detailed patterns of interdigitation in regions that we refer to here as ‘overlapping’ (e.g. there is a roughly threefold increase in the Braga et al dataset compared to HCP data). Alternatively, a truly overlapping hierarchical spatial organisation may be represented as interdigitation in seedbased correlation results (as opposed to multivariate decomposition methods), because correlations may effectively be reduced in regions of overlap relative to neighbouring nonoverlapping regions, creating interdigitated patterns. A further possibility is that individual differences in cortical folding associated with brain volume may influence both observed spatial overlap and interdigitation (Toro et al., 2008). Indeed, we previously showed a strong brainbehaviour relationship for fractional surface area (Supplementary file 1B in Bijsterbosch et al., 2018), and recent work has reported broad associations between structural brain measures and behaviour (Arenas et al., 2018). Hence, further work is needed to understand the relationship between structural morphology and overlapping or interdigitated functional organization.
In conclusion, we replicate our previous work showing that spatial topographical network organisation is most strongly linked to behaviour (Figure 8), and additionally show that crosssubject variability in spatial overlap between complex cortical networks is a key source of behaviourallyrelevant information. Furthermore, we show that this spatially overlapping network structure is underestimated when using a common parcellation technique (ICA), resulting in the conflation of temporal and spatial connectivity information in derived temporal network matrices (from dual regression). We present a solution that obtains more accurate estimates of temporal and spatial overlap matrices based on thresholded spatial maps.
Materials and methods
Dataset
Request a detailed protocolData from the Human Connectome Project S1200 release were used including 1004 subjects with 4800 resting state timepoints (Van Essen et al., 2013). The data were preprocessed using the HCP minimally preprocessing pipeline (Glasser et al., 2013), followed by FIX cleanup of artificial components obtained from singlerun ICA (Smith et al., 2013). Three subjects were excluded for CCA purposes due to incomplete genetic information (i.e., N = 1001 for CCA).
Simulation 1
Request a detailed protocolThe simulation for Figure 4 contained twodimensional spatial maps with 10,000 voxels. A total of 50 subjects with 200 timepoints each were simulated, and the full simulation was repeated 10 times. Two spatial nodes were generated using a Laplacian distribution ($\mu =0;\sigma =0.5$) for the background and linearly added uniformly distributed (ranging between 212) weights. Each spatial node included 100 voxels (out of 10,000), with 25% overlap between the two spatial nodes to ensure ground truth spatial correlation. Normally distributed timeseries were generated for each node, and a shared normally distributed timeseries was added to both to ensure ground truth temporal correlation. Simulated data for each subject was generated by taking the outer product between simulated timeseries and spatial maps, resulting in a 10,000 x 200 dataset. For each repeat, these datasets were concatenated across all 50 subjects before running group ICA, following by dual regression.
Simulation 2
A second simulation was performed to enable a direct comparison between the results from original ICADR, thresholded ICADR, and PROFUMO. The key differences between simulation 1 and simulation two are: (a) a spatial model that builds complex modes from an atlas of contiguous parcels, and (b) a temporal model of the ‘neural’ timeseries that is convolved with a hemodynamic response function. Simulation two was based on previous work (Harrison et al., 2019; Harrison et al., 2015), and is described in more detail below:
Atlas generation
Request a detailed protocolThe 10,000 voxels were split into 100 contiguous parcels, with the parcel widths drawn from a Dirichlet distribution. The parcel weights were then drawn from a Gamma distribution. The gradient of the warp field was generated by convolving random Gaussian noise with a boxcar function, passing through a nonlinearity to limit the range to [−1,1] (i.e., to ensure the warp remained invertible).
Node generation
Request a detailed protocolEach node was formed from a number of spatially contiguous regions. The number of regions followed a Poisson distribution, the total number of parcels followed a beta distribution, and the number of parcels per region followed a Dirichlet distribution. Several regions were made to be anticorrelated (i.e., were given negative weights). The parcel weights again followed a gamma distribution.
Time course generation
Request a detailed protocol‘Neural’ time courses were simulated at 0.1 Hz. The frequency spectra of these were randomly generated, with a bias towards low frequencies. Correlations were induced based on group, subject, and run covariance matrices each drawn from a Wishart distribution. Finally the time courses were sparsified by setting subthreshold time points to zero.
HRF convolution
Request a detailed protocolThe time courses were convolved with random draws from the FLOBS basis set, with a unique HRF being generated for every subject.
Outer product model
Request a detailed protocolThe time courses and spatial maps were combined by taking the outer product, and a weak nonlinearity was applied to the resulting voxelwise timecourses to simulate saturation of the HRF in regions exhibiting high levels of activity.
Noise model
Request a detailed protocolFinally, noise was added to the BOLD signal. This consisted of a structured and unstructured noise subspace: the structured subspace consisted of a set of ‘confounds’, which consisted of the outer product of Gaussian spatial maps and time courses. The unstructured noise was weakly nonGaussian, following a Student’s tdistribution.
The code, containing all used parameter values, is available from https://git.fmrib.ox.ac.uk/samh/PFM_Simulations (Harrison, 2019; copy archived at https://github.com/elifesciencespublications/PFM_Simulations).
Mixture modelling
Request a detailed protocolSpatial maps for resting state fMRI networks are commonly relatively sparse, with a large proportion of voxels or grayordinates considered to be part of the background (i.e., many voxels have relatively low weights and do not contribute to the network). In the presence of additive Gaussian noise, we can therefore model the distribution of spatial weights in any single ICA components using a mixture of one Gaussian distribution (for the background) and two Gamma distributions (for the positive and negative aspects of the ICA networks). Subsequently, the mean and standard deviation of the Gaussian distribution (background) are used to shift and rescale the distribution of spatial weights for each map, and a threshold of ±2 is used to threshold the spatial map such that background voxels are set to zero. Previous work has shown that this threshold procedure can accurately capture ground truth spatial correlations (Beckmann et al., 2005). Here, we adopt mixture model thresholding in a proposed extension to dual regression designed to estimate ICA timeseries and derived correlations with improved accuracy.
Data and code availability
Request a detailed protocolHCP data are distributed from the Connectome Coordination Facility (https://www.humanconnectome.org/). The simulation code is available from https://git.fmrib.ox.ac.uk/samh/PFM_Simulations. Brain data for Figures 1 and 5 are available on Balsa (https://balsa.wustl.edu/study/show/0Lwm6).
Data availability
Simulation analysis scripts are available on git (https://git.fmrib.ox.ac.uk/samh/PFM_Simulations; copy archived at https://github.com/elifesciencespublications/PFM_Simulations). Source data files for Figures 1 and 5 will be made available on BALSA: https://balsa.wustl.edu/study/show/0Lwm6.

BALSAID 0Lwm6. Study: The relationship between spatial configuration and functional connectivity of brain regions revisited.

Young AdultHuman Connectome Project: WUMinn HCP consortium.
References

Investigations into restingstate connectivity using independent component analysisPhilosophical Transactions of the Royal Society B: Biological Sciences 360:1001–1013.https://doi.org/10.1098/rstb.2005.1634

Probabilistic independent component analysis for functional magnetic resonance imagingIEEE Transactions on Medical Imaging 23:137–152.https://doi.org/10.1109/TMI.2003.822821

BookIntroduction to Resting State fMRI Functional ConnectivityJenkinson M, Chappell M, editors. Oxford University Press.

Functional connectivity in the motor cortex of resting human brain using echoplanar MRIMagnetic Resonance in Medicine 34:537–541.https://doi.org/10.1002/mrm.1910340409

Formal models of the network Cooccurrence underlying mental operationsPLOS Computational Biology 12:e1004994.https://doi.org/10.1371/journal.pcbi.1004994

A nexus model of the temporalparietal junctionTrends in Cognitive Sciences 17:328–336.https://doi.org/10.1016/j.tics.2013.05.007

Network modelling methods for FMRINeuroImage 54:875–891.https://doi.org/10.1016/j.neuroimage.2010.08.063

Brain size and folding of the human cerebral cortexCerebral Cortex 18:2352–2357.https://doi.org/10.1093/cercor/bhm261

Hierarchical organization and functional streams in the visual cortexTrends in Neurosciences 6:370–375.https://doi.org/10.1016/01662236(83)901674

The organization of the human cerebral cortex estimated by intrinsic functional connectivityJournal of neurophysiology 106:1125–1165.https://doi.org/10.1152/jn.00338.2011
Decision letter

Richard B IvrySenior Editor; University of California, Berkeley, United States

Chris HoneyReviewing Editor

Daniel S MarguliesReviewer; Max Planck Institute for Human Cognitive and Brain Sciences, Germany

Jakob SeidlitzReviewer
In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.
Thank you for submitting your article "The relationship between spatial configuration and functional connectivity of brain regions revisited" for consideration by eLife as a Research Advance. Your article has been reviewed by Daniel S Margulies (Reviewer #1) and Jakob Seidlitz (Reviewer #2), and the evaluation has been overseen by Chris Honey, the Reviewing Editor, and Richard Ivry as the Senior Editor.
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
Summary
This manuscript by Bijsterbosch and colleagues extends their prior work describing the impact of spatial confounds on the variation in functional connectivity across individuals, and its connection to variation in behavior. Here, the authors test several specific hypotheses to account for their prior findings. They do this by examining two axes of potential bias: i) algorithms to derive spatial and temporal functional connectivity domains, and ii) methods for generating "soft" (datadriven) and "hard" (a priori) parcellations. Furthermore, they provide both empirical evidence and quantitative support via simulations in support of these investigations. Ultimately, the authors conclude that interindividual behavioral variability is best explained by the spatial overlap between networks, which affects the extraction of soft parcellations. The manuscript is clearly written, and the analyses and results are presented in a manner that makes the logic of the study straightforward.
Overall, the reviewers were impressed by the rigor and logical progression of the manuscript, and the effort to include results from different surface alignment algorithms as well as a comparison with volumetric registration. The reviewers also appreciated the discussion and comparison of dual regression methodologies versus PROFUMO, and the clear differences in assumptions that are made with each.
Essential revisions
1A) The maps of network overlap, presented in Figure 4A, appear spatially consistent with prior findings of the relative degree of cortical folding (Toro et al., 2008). As cortical morphology may impact on functional connectivity (for example, differences between adjacent sulci and gyri), is it possible that the network overlap results are driven in part by cortical morphology? This issue is distinct from the point raised in the discussion regarding network interdigitation, but the impact of morphology on the results could be the same. A post hoc test of whether measures of cortical surface area and volume account for the brainbehavior relationship observed in the CCA analysis might be sufficient to assess whether this concern is relevant.
1B) Another possible interpretation of the topography in Figure 4: the areas of most overlap appear to be in areas that have been shown to be prone to problems in cortical reconstruction (mostly due to excess head motion). In light of the lack of influence of surface registration method (as discussed above and in the manuscript), can the authors speculate/discuss this pattern of overlap?
2) The Discussion mentions 'primary visual cortex' (V1) as being an area of high overlap (Figure 4). However the regions appear to be located predominantly within higher visual areas along the lateral wall (rather than within the calcarine sulcus). Please justify or correct this claim.
3) As the core brainbehavior finding from the CCA analysis is from Snet PFM, would there be any way to visualize the pattern of spatial overlap associated with the components (along the lines of the figures shown in Figure 4A)? This might help in interpreting the results in the context of prior findings.
4) The Discussion mentions that ongoing work explores why the brainbehavior relationships are also observed when hard parcellations are used (e.g. the Yeo et al. and Glasser et al. parcellations). The authors speculated that: "it is possible that [hard] parcellation methods are therefore unable to isolate overlap into a distinct parcel (which would allow results to be unbiased, particularly at a sufficiently high dimensionality), and instead parcel boundaries in regions of overlap are determined by the network with the strongest amplitude (or lowest crosssubject variance), leading to mixing of extracted timeseries". Are the authors now able to confirm this speculation? More generally, can the authors explain why the brainbehavior variance explained is so similar when using hard parcellations and when using dual regression approaches (Figure 1A of Bijsterbosch et al., 2018)? Should this be the case if the dual regression methods are subject to these spatial overlap / spatial independence effects, while hard parcellations are not?
https://doi.org/10.7554/eLife.44890.020Author response
Essential revisions
1A) The maps of network overlap, presented in Figure 4A, appear spatially consistent with prior findings of the relative degree of cortical folding (Toro et al., 2008). As cortical morphology may impact on functional connectivity (for example, differences between adjacent sulci and gyri), is it possible that the network overlap results are driven in part by cortical morphology? This issue is distinct from the point raised in the discussion regarding network interdigitation, but the impact of morphology on the results could be the same. A post hoc test of whether measures of cortical surface area and volume account for the brainbehavior relationship observed in the CCA analysis might be sufficient to assess whether this concern is relevant.
We agree with the suggestion that underlying cortical morphometry may impact on functional connectivity, and could potentially underlie some of the results reported here. We previously assessed the brainbehavior relationship for cortical surface area using CCA as part our earlier paper (Supplementary File 1B: https://doi.org/10.7554/eLife.32992.028). The findings showed that there is indeed a significant brainbehaviour relationship for Fractional Surface Area (based on the HCP_MMP1.0 parcellation), which is further supported by recent preprint findings showing brainbehaviour associations for multiple structural measures (Arenas, et al., 2018). Further research will be needed to directly relate morphometry and functional organisation in order to establish the relationship between functional overlap and cortical folding. These considerations are included in the Discussion:
“A further possibility is that individual differences in cortical folding associated with brain volume may influence both observed spatial overlap and interdigitation (Toro et al., 2008). Indeed, we previously showed a strong brainbehaviour relationship for fractional surface area (Supplementary File 1B in (Bijsterbosch et al., 2018)), and recent work has reported broad associations between structural brain measures and behaviour (Arenas et al., 2018). Hence, further work is needed to understand the relationship between structural morphology and overlapping or interdigitated functional organization.”
1B) Another possible interpretation of the topography in Figure 4: the areas of most overlap appear to be in areas that have been shown to be prone to problems in cortical reconstruction (mostly due to excess head motion). In light of the lack of influence of surface registration method (as discussed above and in the manuscript), can the authors speculate/discuss this pattern of overlap?
As the reviewer points out, research has shown relatively widespread correlations between cortical thickness estimates from Freesurfer and head motion (Reuter et al., 2015), including in some of the overlap areas reported in our manuscript. However, these effects are highly unlikely to play a role in the findings we report here. For HCP data, the quality of the structural data and of the Freesurfer pipelines is tightly controlled and checked manually for all subjects, and poor quality structural scans were flagged and repeated to achieve optimal data quality (Glasser et al., 2013; Marcus et al., 2013), leading to very high quality surface projections. The MSMAll alignment procedure incorporates functional information in addition to surface folding patterns, and is therefore expected to be minimally affected by errors in cortical projection. While there are substantial improvements in MSMAll alignment compared to MSMSulc alignment (Coalson, Van Essen, and Glasser, 2018), we observe highly similar CCA results against behaviour regardless of the alignment procedure and we show here that dual regression performs robustly in both types of surfacebased alignment. These findings are in line with the stateoftheart quality of HCP surface projections, suggesting that cortical reconstruction errors do not contribute to the findings reported in this manuscript.
2) The Discussion mentions 'primary visual cortex' (V1) as being an area of high overlap (Figure 4). However the regions appear to be located predominantly within higher visual areas along the lateral wall (rather than within the calcarine sulcus). Please justify or correct this claim.
Thank you for highlighting this, we have corrected this in the revised manuscript:
“Our findings show that spatial overlap was prominent in the temporoparietaloccipital junction and within higher visual areas along the lateral wall (Figure 4).”
3) As the core brainbehavior finding from the CCA analysis is from Snet PFM, would there be any way to visualize the pattern of spatial overlap associated with the components (along the lines of the figures shown in Figure 4A)? This might help in interpreting the results in the context of prior findings.
Thank you for this excellent suggestion. In the revised manuscript we include a video of overlap associated with the main positivenegative CCA mode (Figure 8—video 1). This video maps directly onto the videos included in our earlier work, and shows the change in overlap as a function of behaviour. We have included these results in the revised manuscript:
“To facilitate the interpretation of these findings in light of our earlier work (Bijsterbosch et al., 2018), we include a video of spatial overlap as a function of the most significant behavioural CCA mode (Figure 8—video 1). Here, we followed the same procedure described previously to generate interpolated video frames for each of the 50 PFM maps, and subsequently summed the grayordinatewise weights across all 50 maps following normalisation (as for Figure 5A). The resulting video (Figure 8—video 1) shows that spatial overlap systematically varies along with the networkspecific spatial variation that we showed previously (Bijsterbosch et al., 2018). Hence, these results show that spatial overlap (and resulting correlation) is a key behaviourallyrelevant aspect of spatial information.”
4) The Discussion mentions that ongoing work explores why the brainbehavior relationships are also observed when hard parcellations are used (e.g. the Yeo et al. and Glasser et al. parcellations). The authors speculated that: "it is possible that [hard] parcellation methods are therefore unable to isolate overlap into a distinct parcel (which would allow results to be unbiased, particularly at a sufficiently high dimensionality), and instead parcel boundaries in regions of overlap are determined by the network with the strongest amplitude (or lowest crosssubject variance), leading to mixing of extracted timeseries". Are the authors now able to confirm this speculation? More generally, can the authors explain why the brainbehavior variance explained is so similar when using hard parcellations and when using dual regression approaches (Figure 1A of Bijsterbosch et al., 2018)? Should this be the case if the dual regression methods are subject to these spatial overlap / spatial independence effects, while hard parcellations are not?
Our ongoing work to assess the effect of overlapping functional organization on hard parcellations suggests that the accuracy of temporal network matrix estimates is reduced in the presence of overlap, similarly to standard dual regression approaches. Hence, temporal network matrices obtained from both dual regression and hard parcellation methods are affected by spatial overlap, explaining their comparable associations with behaviour. On a mathematical level, the mechanisms by which spatial overlap influences the temporal network matrix estimates differs fundamentally between the two categories of methods. As shown in this current paper, the ICADR approach captures the spatial organisation with good accuracy, but minor errors in spatial estimation (resulting from spatial independence) introduce systematic biases when calculating temporal correlations between extracted timeseries. On the other hand, our preliminary findings show that hard parcellations likely misestimate the spatial organisation in the presence of spatial overlap. This affects the network matrices as a result of averaging together timeseries from different functional systems (timeseries mixing). While these are important methodological differences in terms of the mechanisms by which overlap influences network matrix estimation, the effects on brainbehaviour investigations are likely to be very similar. We have revised the wording of the resubmitted manuscript to clarify our preliminary findings:
“Early qualitative observations suggest the presence of complex patterns of partial overlap between multiple extended networks, that cannot easily be modelled by increasing the dimensionality, to allow separation of previously overlapping voxels into one new parcel, and nonoverlapping voxels into other parcel(s) (which would allow results to be unbiased, particularly at a sufficiently high dimensionality). As such, our preliminary results suggest that hard parcellations are affected by overlapping functional organisation due to the misestimation of spatial nodes and resulting mixing of extracted timeseries (Smith et al., 2011). Further work is needed to test this hypothesis and to develop parcellation methods that address this issue.”
https://doi.org/10.7554/eLife.44890.021Article and author information
Author details
Funding
Nederlandse Organisatie voor Wetenschappelijk Onderzoek (864.12.003)
 Christian F Beckmann
Wellcome (098369/Z/12/Z)
 Stephen M Smith
Wellcome (091509/Z/10/Z)
 Stephen M Smith
Wellcome (203139/Z/16/Z)
 Stephen M Smith
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
Data were provided by the Human Connectome Project, WUMinn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657) funded by the 16 NIH Institutes and Centers that support the NIH Blueprint for Neuroscience Research; and by the McDonnell Center for Systems Neuroscience at Washington University. CFB acknowledges support from The Netherlands Organization for Scientific Research (NWO, grant no 864.12.003). We are grateful for funding from the Wellcome Trust (grants 098369/Z/12/Z and 091509/Z/10/Z). The Wellcome Centre for Integrative Neuroimaging is supported by core funding from the Wellcome Trust (203139/Z/16/Z).
Ethics
Human subjects: HCP data were acquired using protocols approved by the Washington University institutional review board (Mapping the Human Connectome: Structure, Function, and Heritability; IRB # 201204036). Informed consent was obtained from subjects. Anonymised data are publicly available from ConnectomeDB (db.humanconnectome.org; Hodge et al., 2016). Certain parts of the dataset used in this study, such as the age of the subjects, are available subject to restricted data usage terms, requiring researchers to ensure that the anonymity of subjects is protected (Van Essen et al., 2013).
Senior Editor
 Richard B Ivry, University of California, Berkeley, United States
Reviewing Editor
 Chris Honey
Reviewers
 Daniel S Margulies, Max Planck Institute for Human Cognitive and Brain Sciences, Germany
 Jakob Seidlitz
Publication history
 Received: January 14, 2019
 Accepted: May 7, 2019
 Accepted Manuscript published: May 8, 2019 (version 1)
 Version of Record published: May 29, 2019 (version 2)
Copyright
© 2019, Bijsterbosch et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics

 2,071
 Page views

 318
 Downloads

 15
 Citations
Article citation count generated by polling the highest count across the following sources: Crossref, Scopus, PubMed Central.