Peer review process
Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.
Editors
- Reviewing Editor: Emma Sprooten, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, Netherlands
- Senior Editor: Andre Marquand, Radboud University Nijmegen, Nijmegen, Netherlands
Reviewer #1 (Public review):
Summary:
Gruskin and colleagues use twin data from a movie-watching fMRI paradigm to show how genetic control of cortical function intersects with the processing of naturalistic audiovisual stimuli. They use hyperalignment to dissect heritability into the components that can be explained by local differences in cortical-functional topography and those that cannot. They show that heritability is strongest at slower-evolving neural time scales and is more evident in functional connectivity estimates than in response time series.
Strengths:
This is a very thorough paper that tackles this question from several different angles. I very much appreciate the use of hyperalignment to factor out topographic differences, and I found the relationship between heritability and neural time scales very interesting. The writing is clear, and the results are compelling.
Weaknesses:
The only "weaknesses" I identified were some points where I think the methods, interpretation, or visualization could be clarified.
(1) On page 16, the authors compare heritability in functional connectivity (FC) and response time series, and find that the heritability effect is larger in FC. In general, I agree with the authors' diagnosis that this is in large part because FC captures the covariance structure across parcels, whereas response time series only diverge in terms of univariate time-point-by-time-point differences. Another important factor here is that (within-subject) FC can be driven by intrinsic fluctuations that occur with idiosyncratic timing across subjects and are unrelated to the stimulus (whereas time-locked metrics like ISC and time-series differences cannot, by definition). This makes me wonder how this connectivity result would change if the authors used intersubject functional connectivity (ISFC) analysis to specifically isolate the stimulus-driven components of functional connectivity (Simony et al., 2016). This, to me, would provide a closer comparison to the ISC and response time series results, and could allow the authors to quantify how much of the heritability in FC is intrinsic versus stimulus-driven. I'm not asking that the authors actually perform this analysis, as I don't think it's critical for the message of the manuscript, but it could be an interesting future direction. As the authors discuss on page 17, I also suspect there's something fundamentally shared between response time series and connectivity as they relate to functional topography (Busch et al., 2021) that drives part of the heritability effect.
(2) The observation that regions with intermediate ISC have the largest differences between MZ, DZ, and UR is very interesting, but it's kind of hard to see in Figure 1B. Is there any other way to plot this that might make the effect more obvious? For example, I could imagine three scatter plots where the x- and y-axes are, e.g., MZ ISC and UR ISC, and each data point is a parcel. In this kind of plot, I would expect to see the middle values lifted visibly off the diagonal/unity line toward MZ. The authors could even color the data points according to networks, like in Figure 3C. (They also might not need to scale the ISC axis all the way to r = 1, which would make the differences more visible.)
(3) On page 9, if I understand correctly, the authors regress the vector of ISC values across parcels out of the vector of heritability values across parcels, and then plot the residual heritability values. Do they center the heritability values (or include some kind of intercept) in the process? I'm trying to understand why the heritability values go from all positive (Figure 2A) to roughly balanced between positive and negative (Figure 2B). Important question for me: How should we interpret negative values in this plot? Can the authors explain this explicitly in the text? (I also wonder if there's a more intuitive way to control for ISC. For example, instead of regressing out ISC at the parcel/map level, could they go into a single parcel and then regress the subject-level pairwise ISC values out when computing the heritability score?).
(4) On page 4 (line 155), the authors say "we shuffled dyad labels". Is this equivalent to shuffling rows and columns of the pairwise subject-by-subject matrix combined across groups? I'm trying to make sure their approach here is consistent with the recommendations of Chen et al. (2016). Is this the same kind of shuffling used for the kinship matrix mentioned in line 189?
(5) I found panel A in Figure 4 to be a little bit misleading because the authors' parcel-wise approach to hyperalignment won't actually resolve topographic idiosyncrasies across a cortical distance as large as the one depicted in the illustration (hyperalignment is performed within parcels, so it can only act at that scale). Maybe just move the green and purple brain areas a bit closer to each other so they could feasibly be "aligned" within a large parcel. It is also worth keeping in mind when writing that hyperalignment is not actually going to yield a one-to-one mapping of functionally homologous voxels across individuals: it's effectively going to model any given voxel time series as a linear combination of time series across other voxels in the parcel.
(6) I believe the subjects watched different movies on each of the two days; however, for a moment I wondered whether Day 1 and Day 2 were repetitions of the same movies. Given that Day 1 and Day 2 are an organizational feature of several figures, it might be worth making this very explicit in the Methods and reminding the reader in the Results section.
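Points (3) and (4) above can be made concrete with a minimal NumPy sketch on synthetic data. It assumes an ordinary least-squares fit with an intercept for the ISC control, and a single subject-level permutation applied to rows and columns together for the shuffling. The function names `residualize` and `permute_subjects` and all values are hypothetical and are not taken from the manuscript.

```python
import numpy as np


def residualize(y, x):
    """Regress x out of y (OLS with an intercept) and return the residuals."""
    design = np.column_stack([np.ones_like(x), x])  # columns: intercept, ISC
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ beta


def permute_subjects(mat, rng):
    """Apply one random subject permutation to rows and columns together."""
    perm = rng.permutation(mat.shape[0])
    return mat[np.ix_(perm, perm)]


rng = np.random.default_rng(0)

# Point (3): synthetic parcel-level maps (400 parcels), all heritabilities > 0.
isc = rng.uniform(0.0, 0.6, size=400)
h2 = 0.2 + 0.5 * isc + rng.normal(0.0, 0.03, size=400)
h2_resid = residualize(h2, isc)

# Point (4): synthetic symmetric subject-by-subject similarity matrix.
a = rng.normal(size=(6, 6))
pairwise = (a + a.T) / 2.0
shuffled = permute_subjects(pairwise, rng)

print("mean of raw h2:      ", h2.mean())
print("mean of residual h2: ", h2_resid.mean())
print("shuffled matrix is symmetric:", np.allclose(shuffled, shuffled.T))
```

With an intercept in the model, the residuals necessarily average to zero, so an all-positive heritability map turns into a mix of positive and negative values. Under that reading, a negative residual only means "lower heritability than expected for a parcel with this ISC". Permuting subjects, rather than individual dyad entries, keeps all entries that share a subject moving together, which is my understanding of what Chen et al. (2016) recommend.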
References:
Busch, E. L., Slipski, L., Feilong, M., Guntupalli, J. S., di Oleggio Castello, M. V., Huckins, J. F., Nastase, S. A., Gobbini, M. I., Wager, T. D., & Haxby, J. V. (2021). Hybrid hyperalignment: a single high-dimensional model of shared information embedded in cortical patterns of response and functional connectivity. NeuroImage, 233, 117975. https://doi.org/10.1016/j.neuroimage.2021.117975
Chen, G., Shin, Y. W., Taylor, P. A., Glen, D. R., Reynolds, R. C., Israel, R. B., & Cox, R. W. (2016). Untangling the relatedness among correlations, part I: nonparametric approaches to inter-subject correlation analysis at the group level. NeuroImage, 142, 248-259. https://doi.org/10.1016/j.neuroimage.2016.05.023
Simony, E., Honey, C. J., Chen, J., Lositsky, O., Yeshurun, Y., Wiesel, A., & Hasson, U. (2016). Dynamic reconfiguration of the default mode network during narrative comprehension. Nature Communications, 7, 12141. https://doi.org/10.1038/ncomms12141
Reviewer #2 (Public review):
Summary:
The authors attempt to estimate the heritability of brain activity evoked from a naturalistic fMRI paradigm. No new data were collected; the authors analyzed the publicly available and well-known data from the Human Connectome Project. The paper has 3 main pieces, as described in the Abstract:
(1) Heritability of movie-evoked brain activity and connectivity patterns across the cortex.
(2) Decomposition of this heritability into genetic similarity in "where" vs. "how" sensory information is processed.
(3) Heritability of brain activity patterns, as partially explained by the heritability of neural timescales.
Strengths:
The authors investigate a very relevant topic: how heritable patterns of brain activity are among individuals subjected to the same kind of naturalistic stimulation. Notably, the authors complement their analysis of movie-watching data with resting-state data.
Weaknesses:
The paper has numerous problems, most of which stem from the statistical analyses. I also note the lack of mapping between the subsections within the Methods section and the subsections within the Results section. We can only assess results after understanding and confirming the methods are valid; here, however, Methods and Results, as written, are not aligned, so we can't always be sure which results are coming from which analysis.
(A) Intersubject correlation (ISC) (section that starts from line 143): "We used non-parametric permutation testing to quantify average differences in ISC for each parcel in the Schaefer 400 atlas for each day of data collection across three groups: MZ dyads, DZ dyads, and unrelated (UR) dyads, where all UR dyads were matched for gender and age in years." ... "some participants contributed to ISC values for multiple dyads (thus violating independence assumptions)"
This is an indirect attempt to demonstrate heritability. And it's also incorrect since, as the authors themselves point out, some subjects contribute to more than one dyad.
Permutation tests don't quantify "average differences"; they provide a measure of evidence about whether the differences observed are sufficient to reject a hypothesis of no difference.
Matching subjects is also incorrect as it artificially alters the sample; covarying for age and sex, as done in standard analyses of heritability, would have been appropriate.
It isn't clear why the authors went through the trouble of implementing their own non-parametric test if HCP recommends using PALM, which already contains the validated and documented methods for permutation tests developed precisely for HCP data.
The results from this analysis, in their current form, are likely incorrect.
(B) Functional connectivity (FC) (section that starts from line 159): Here the authors compute two 400x400 FC matrices for each subject, one for rest and one for movie-watching, then correlate the correlations within each dyad, and then compare the average correlation of correlations for MZ, DZ, and UR. In addition to the same problems as the previous analysis, here it is not clear what is meant by "averaging correlations [...] within a network combination". What is a "network combination"? Further, to average correlations, they need to be r-to-z transformed first. As with the above, the results from this analysis in its current form are likely incorrect.
(C) ISC and FC profile heritability analyses (section that starts from line 175): Here, the authors first use a valid method, remarkably similar to the old Haseman-Elston approach, to compute heritability, complemented by a permutation test. That is fine. But then they proceed with two novel, ill-described, and likely invalid methods to (1) "compare the heritability of movie and rest FC profiles" and (2) "determine the sample size necessary for stable multidimensional heritability results". For (1), they permute, seemingly under the alternative, rest and movie-watching timeseries; for (2), they drop subjects and estimate changes in the distribution.
Method (1) might be correct, but there are items that are not clearly described, so the reader cannot be sure of what was done. What are the "153 unique network combinations"? Why do the authors separate by day here, whereas the previous analyses concatenated both days? Were the correlations r-to-z transformed before averaging?
Method (2) is also not well described, and in any case, power can be computed analytically; it isn't clear why the authors needed to resort to this ad hoc approach, the validity of which is unknown. If the issue is the possibility that the multidimensional phenotypic correlation matrix is rank-deficient, it suffices that there are more independent measurements per subject than the number of subjects.
(D) Frequency-dependent ISC heritability analysis (from line 216): Here, the authors decompose the timeseries into frequency bands, then repeat earlier analyses, thus carrying over the same earlier problems and questions of non-exchangeability in the permutations given the dyad pattern, r-to-z transforms, and sex/age covariates.
(E) FC strength heritability analysis (from line 236): Here, the authors use the univariate FC to compute heritability using valid and well-established methods as implemented in SOLAR. There is no "linkage" being done here (thus, the statement in line 238 is incorrect in this application). SOLAR already produces SEs, so it's unclear why the authors went out of their way to obtain jackknife estimates. If the issue is non-normality, I note that the assumption of normality is present already at the stage in which parameters themselves are estimated, not just the standard errors; for non-normal data, a rank-based inverse-normal transformation could have been used. Moreover, typically, r-to-z transformed values tend to be fairly normally distributed. So, while the heritabilities might be correct, the standard errors may not be (the authors don't demonstrate that their jackknife SE estimator is valid). The comparison of h2 between dyads raises the same questions about permutations, age/sex covariates, and r-to-z transforms as above.
(F) Hyperalignment (from line 245): It isn't clear at this point in the manuscript in what way hyperalignment would help to decompose heritability into "where vs. how" (from the Abstract). That information and the references are only given much later, from around line 459. The description itself provides no references, and one cannot even try to reproduce what is described here in the Methods section. Regardless, it isn't entirely clear why this analysis was done: by matching functional areas, all heritabilities are going to be reduced because there will be less variance between subjects. Perhaps studying the parameters that drive the alignment (akin to what is done in tensor-based and deformation-based morphometry) could have been more informative. Plus, the alignment process itself may introduce errors, which could also reduce heritability. This could be an alternative explanation for the reduced heritability after hyperalignment and should be discussed. An investigation of hyperalignment parameters, their heritability, and their co-heritability with the BOLD phenotypes can inform on this.
(G) Relationships between parcel area and heritability (from line 270): As under F), how much the results are distorted likely depends on the accuracy of the alignment, and the error variance (vs heritable variance) introduced by this.
(H) Neural timescale analyses (from line 280): Here, a valid phenotype (NT) is assessed with statistical methods that have the same limitations as those discussed previously (exchangeability of dyads, age/sex covariates, and r-to-z transforms). NT values are combined across space and used as covariates in "some multivariate analyses". As a reader, I really wanted to see the results related to NT, something as simple as its heritability, but these aren't clearly shown, only differences between types of dyads.
(I) Significance testing for autocorrelated brain maps and FC matrices (from line 310): Here, the authors suddenly bring up something entirely different, the reliability of heritability maps, and then never return to the topic of reliability again. As a reader, I find this confusing. In any case, analyses with BrainSMASH on well-behaved, normally distributed data are fine. Whether the data are well behaved, or whether the authors ensured that they would be so that BrainSMASH is valid, is not described. Why Spearman correlations and Mantel tests are needed here, and whether the 1000 "surrogate" maps are valid realizations of the data under the null, remain undemonstrated.
(J) Global signal was removed, and the authors do not acknowledge that this could be a limitation in their analyses, nor offer a side analysis in which the global signal is preserved.
(K) FDR is used to control the error rate, but in many cases, as it's applied to multiple sets of p-values, the amount of false discoveries is only controlled across all tests, but not within each set. The number of errors within any set remains unknown.
(L) Generally, when studying the heritability of a trait, the trait must be defined first. Here, multiple traits are investigated, but are never rigorously defined. Worse, the trait being analyzed changes at every turn.
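The two transformations requested in points (B) and (E) are short to implement. The following is a minimal sketch, not the authors' pipeline: `mean_correlation` averages correlations on the Fisher z scale, and `rank_inverse_normal` applies a rank-based inverse-normal transformation with the Blom offset (c = 3/8). Both function names and the toy inputs are assumptions made for illustration; ties are broken by order of appearance here rather than averaged.

```python
import numpy as np
from statistics import NormalDist


def mean_correlation(r):
    """Average correlations via Fisher r-to-z, then back-transform to r."""
    r = np.asarray(r, dtype=float)
    return float(np.tanh(np.mean(np.arctanh(r))))


def rank_inverse_normal(x, c=3.0 / 8.0):
    """Map values to standard-normal quantiles of their (offset) ranks."""
    x = np.asarray(x, dtype=float)
    ranks = np.argsort(np.argsort(x)) + 1          # ranks 1..n, no tie averaging
    quantiles = (ranks - c) / (x.size - 2.0 * c + 1.0)
    normal = NormalDist()
    return np.array([normal.inv_cdf(q) for q in quantiles])


# Toy correlations: the naive mean and the Fisher-z mean differ.
r_values = [0.1, 0.5, 0.9]
naive_mean = float(np.mean(r_values))
fisher_mean = mean_correlation(r_values)

# Toy skewed phenotype: the transform preserves order and centres the data.
rng = np.random.default_rng(0)
skewed = rng.exponential(size=200)
transformed = rank_inverse_normal(skewed)

print("naive mean r: ", naive_mean)
print("Fisher mean r:", fisher_mean)
print("mean after transform:", transformed.mean())
```

The Fisher-z average is pulled toward the larger correlations relative to the naive average, which is why the order of transforming and averaging matters when correlations are pooled within a "network combination". The inverse-normal step addresses non-normality before parameter estimation, rather than only at the standard-error stage.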
Reviewer #3 (Public review):
Strengths:
Studying the heritability of movie-watching fMRI data is fairly novel. The methodology the authors use supports their findings, and the figures are nicely organized and plotted. The authors find that sensory processing in the human brain is under genetic control through stable aspects of brain function (here referring to neural timescale and resting-state connectivity).
Weaknesses:
What I am worried about most is the sample size and interpretation of heritability.
(1) Figure 1. I assumed that the authors just calculated the ISC within each group (MZ, DZ, and UR). Of course, you can get different variations between each group. Therefore, there is heritability. Why not calculate ISC across the whole sample, then separate MZ, DZ, and UR?
(2) Heritability scores in the paper are sort of small. If the sample size is small, please consider p-values, which will tell more about the trustworthiness of your heritability.
(3) I don't understand the use of high-frequency signals in fMRI data. They are usually regarded as noise, band 1 here in particular.
(4) The statement "we show that the heritability of brain activity patterns can be partially explained by the heritability of the neural timescale" should come from Figure 5. However, after controlling for NT, the heritability decreased by at most 0.025, in temporal areas. I am not sure this change supports the statement. If the visual cortex were outlined and considered together with the ISC changes in the visual cortex, I think this question could be addressed. Instead of delta h2, showing the h2 from the new model would be clearer to readers.
(5) In Figures 7 and 8, when computing the difference in heritability, please also consider the standard errors of the heritability estimates. Then you can compare across networks/regions.
(6) I think the movie vs. resting-state comparison is a really important result in this paper. However, there is almost no discussion of it. Discussing this part would be beneficial for understanding the genetic control over neuronal arousal and excitation circuits.
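For point (5), one simple way to take standard errors into account is a Wald-type z-test on the difference between two heritability estimates. The sketch below assumes the two estimates are independent and approximately normal; estimates obtained from the same subjects (for example movie vs. rest) would additionally need a covariance term. The function name `compare_h2` and the numbers are hypothetical and are not values from the paper.

```python
import math


def compare_h2(h2_a, se_a, h2_b, se_b):
    """Z-test for the difference of two independent heritability estimates."""
    # Standard error of the difference under the independence assumption.
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = (h2_a - h2_b) / se_diff
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical estimates for two networks (illustration only).
z_stat, p_value = compare_h2(0.40, 0.08, 0.25, 0.06)
print("z =", z_stat, "p =", p_value)
```

In this toy case the difference of 0.15 is 1.5 standard errors from zero, so it would not be called significant at the usual 0.05 level even though the point estimates look quite different.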