Peer review process
Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.
Read more about eLife's peer review process.

Editors
- Reviewing Editor: Tirin Moore, Stanford University, Howard Hughes Medical Institute, Stanford, United States of America
- Senior Editor: Tirin Moore, Stanford University, Howard Hughes Medical Institute, Stanford, United States of America
Reviewer #1 (Public review):
Summary:
In this article, Chunharas and colleagues compared how orientation information is represented during a sensory task and a working memory task. By reanalyzing data from a previous fMRI study and applying representational similarity analysis (RSA), they observed that orientation information was represented differently in the two tasks: during visual perception, orientation representation resembled the veridical model, which captures the known naturalistic statistics of orientation information, whereas during visual working memory a categorical model, which assumes different psychological distances between orientations, better explained the data, particularly in more anterior retinotopic regions. The authors suggest fundamental differences in the representational geometry of visual perception and working memory along the human retinotopic cortex.
Strengths:
Examining the differences in representational geometry between perception and working memory has important implications for the understanding of the nature of working memory. This study presents a carefully-executed reanalysis of previous data to address this question. The authors developed a novel method (model construction combined with RSA) to examine the representational geometry of orientation information under different tasks, and the control analyses provide rich, convincing support for their claims.
Weaknesses:
Although the control analyses are convincing, I still have alternative explanations for some of the results. I'm also concerned about the low sample size (n = 6 in the fMRI experiment). Overall, I think additional analyses may help to further clarify the issues and strengthen the claims.
(1) The central claim of the current study is that orientation information is represented in a veridical manner during the sensory task, and in a categorical manner during working memory. However, in the sensory task a third type of representational geometry was observed, especially in brain regions from V3AB and beyond. These regions showed a symmetric pattern in which oblique orientations (45 and 135 degrees) appeared more similar to each other. In fact, a similar pattern can even be found in V1-V3, although the effect looked weaker. The authors raised two possible explanations for this in the discussion: one is that participants might have used verbal labels (e.g., "diagonal") for both orientations, and the other is a lack of attention to orientation. Either way, this suggests that a veridical model may not be the best fit for these ROIs. How would this symmetric model explain the sensory data, in comparison to the veridical model?
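To make this point concrete, a symmetric model could be pitted against the veridical model within the same RSA framework. Below is a minimal sketch of one way to do this; the orientation grid, exponential kernel, kernel width, and use of Spearman correlation are all illustrative assumptions, not the authors' actual model construction.

```python
# Sketch: compare a hypothetical "symmetric" model RSM (mirror-image
# orientations such as 45 and 135 deg treated as similar) against a
# veridical model RSM. Not the authors' code; all parameters are assumed.
import numpy as np
from scipy.stats import spearmanr

n_ori = 12                                          # assumed orientation bins
theta = np.linspace(0, 180, n_ori, endpoint=False)  # orientations in degrees

def circ_dist(a, b, period=180.0):
    """Shortest distance between orientations on the 180-deg circle."""
    d = np.abs(a - b) % period
    return np.minimum(d, period - d)

ii, jj = np.meshgrid(theta, theta)

# Veridical model: similarity falls off smoothly with circular distance.
veridical = np.exp(-circ_dist(ii, jj) / 30.0)

# Symmetric model: an orientation is also similar to its mirror image about
# the cardinal axes, so 45 and 135 deg become maximally similar.
mirror = np.exp(-circ_dist(ii, 180.0 - jj) / 30.0)
symmetric = np.maximum(veridical, mirror)

def model_fit(data_rsm, model_rsm):
    """Spearman correlation over the off-diagonal RSM entries."""
    mask = ~np.eye(n_ori, dtype=bool)
    return spearmanr(data_rsm[mask], model_rsm[mask]).correlation

# data_rsm would be an observed ROI RSM; a noisy stand-in is used here.
rng = np.random.default_rng(0)
data_rsm = symmetric + 0.3 * rng.standard_normal((n_ori, n_ori))
print("veridical fit:", model_fit(data_rsm, veridical))
print("symmetric fit:", model_fit(data_rsm, symmetric))
```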
(2) If the symmetric model also explains the sensory data well, I wonder whether this result challenges the authors' central claim, or instead suggests that the sensory task is not ideal for the purpose of the study. One way to address this issue might be to use the sample period of the working memory task as the perception task, as some other studies have done (e.g., Kwak & Curtis, 2022). This epoch of data might function as a stronger version of the attention task that the authors raised in the discussion. What would the representational geometry look like in the sample period? I would also like to note that the current analyses used 5.6-13.6 s after stimulus onset for the memory task, which I think may reflect a mix of sample- and delay-related activity.
(3) When comparing the veridical and categorical models, it is important to first establish the significance of each model on its own before making comparisons. For instance, was the veridical model significant in different ROIs in the memory task? And was either model significant in IPS1-3 in the two tasks? I ask because both models appear to be significant in the memory task, whereas only the veridical model was significant in the sensory task (with correlation coefficients that were overall lower than those of the categorical model in the memory task).
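One standard way to establish per-model significance is a label-permutation test on each ROI's RSM. The sketch below assumes the fits are Spearman correlations over off-diagonal RSM entries; the permutation count and function names are illustrative.

```python
# Sketch: permutation test of whether a single model RSM fits a data RSM
# above chance. Assumes Spearman-based RSA fits; not the authors' code.
import numpy as np
from scipy.stats import spearmanr

def permutation_pvalue(data_rsm, model_rsm, n_perm=1000, seed=0):
    n = data_rsm.shape[0]
    mask = ~np.eye(n, dtype=bool)
    observed = spearmanr(data_rsm[mask], model_rsm[mask]).correlation
    rng = np.random.default_rng(seed)
    null = np.empty(n_perm)
    for k in range(n_perm):
        # Permute rows and columns together: this shuffles condition labels
        # while preserving the internal structure of the data RSM.
        p = rng.permutation(n)
        shuffled = data_rsm[np.ix_(p, p)]
        null[k] = spearmanr(shuffled[mask], model_rsm[mask]).correlation
    p_value = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return observed, p_value
```

Reporting this per model and per ROI (with appropriate multiple-comparison correction) would directly answer whether each model is significant on its own before the two are compared.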
(4) The current study has a low sample size of six participants. With such a small sample, it would be helpful to show results from individual participants. For example, I appreciate that Figures 2D and 3C showed individual data points, but additionally showing the representational geometry plot (i.e., Figure 1C) for each subject could better illustrate the robustness of the effect. Alternatively, the original paper from which the fMRI data were drawn actually had two fMRI experiments with similar task designs. I wonder if the authors could replicate these patterns using data from the second experiment with seven participants. This might provide even stronger support for the current findings with a more reasonable sample size.
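Producing the requested per-subject panels is straightforward once the subject-level RSMs are available. A minimal sketch, assuming `rsms` is a hypothetical subjects x orientations x orientations array (not a variable from the paper):

```python
# Sketch: one RSM panel per participant, in the spirit of Figure 1C.
import numpy as np
import matplotlib.pyplot as plt

def plot_subject_rsms(rsms, labels):
    """rsms: (n_subjects, n_ori, n_ori); labels: orientation bin labels."""
    n_sub = rsms.shape[0]
    fig, axes = plt.subplots(1, n_sub, figsize=(3 * n_sub, 3), squeeze=False)
    for s, ax in enumerate(axes[0]):
        im = ax.imshow(rsms[s], cmap="viridis")
        ax.set_title(f"S{s + 1}")
        ax.set_xticks(range(len(labels)))
        ax.set_xticklabels(labels, rotation=90)
        ax.set_yticks(range(len(labels)))
        ax.set_yticklabels(labels)
    fig.colorbar(im, ax=axes.ravel().tolist(), shrink=0.8)
    plt.show()
```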
Reviewer #2 (Public review):
Summary:
In this manuscript, the authors examined the representational geometry of orientation representations during visual perception and working memory along the visual hierarchy. Using representational similarity analysis, they found that similarity was relatively evenly distributed among all orientations during perception, but was higher around oblique orientations during WM. There were some noticeable differences along the visual hierarchy: IPS showed the most pronounced oblique orientation preferences during WM but no clear patterns during perception, likely due to the different task demands of the WM orientation task and the perception contrast-discrimination task. The authors proposed two models to capture the differences. The veridical model estimated the representational geometry in perception by assuming an efficient coding framework, while the categorical model estimated the pattern in WM using psychological distances to measure the differences among orientations (with estimates from a separate psychophysical study performed outside the scanner). Overall, I think this work is valuable and advances our understanding of the transition from perception to memory.
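For readers less familiar with the two constructions, the following sketch shows the general shape of each. The specific prior (p(theta) proportional to 2 - |sin(2*theta)|), the warping-by-cumulative-prior step, the exponential kernel, and the stand-in psychological coordinates are all assumptions for illustration; the authors derive the categorical distances from their psychophysical data.

```python
# Sketch of the two model families as described in the review: a veridical
# RSM from an efficient-coding warp of orientation space, and a categorical
# RSM from psychological coordinates. Parameters are illustrative only.
import numpy as np

theta = np.linspace(0, 180, 12, endpoint=False)
rad = np.deg2rad(theta)

# Natural-statistics prior: cardinal orientations are more frequent.
prior = 2.0 - np.abs(np.sin(2.0 * rad))
prior /= prior.sum()

# Efficient coding: warp orientation space by the cumulative prior, so more
# coding resolution is devoted to the frequent (cardinal) orientations.
warped = np.cumsum(prior) * 180.0

def rsm_from_positions(pos, width=30.0, period=180.0):
    """Similarity decays exponentially with circular distance in `pos`."""
    d = np.abs(pos[:, None] - pos[None, :]) % period
    d = np.minimum(d, period - d)
    return np.exp(-d / width)

veridical_rsm = rsm_from_positions(warped)

# Categorical model: in the paper, pairwise psychological distances come
# from a separate psychophysics study; here a crude stand-in collapses
# orientations toward their category (0 at cardinals, 90 at obliques).
psych_coord = np.abs(np.sin(2.0 * rad)) * 90.0
categorical_rsm = rsm_from_positions(psych_coord)
```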
Strengths:
The use of RSA to identify representational biases goes beyond simply relying on response patterns and helps identify how representational formats change from perception to WM. The study nicely leverages ideas about efficient coding to explain perceptual representations that are more veridical, while leaning on ideas about abstractions of percepts that are more categorical-psychological in nature (but see (1) below). Moreover, the match between memory biases of orientation and the patterns estimated with RSA was compelling (but see (2) below). I found the analyses showing how RSA and decoding (e.g., cross-generalization) are associated and how/why they may differ to be particularly interesting.
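As a point of reference for that last comparison, cross-generalization decoding typically means fitting a classifier on one task's voxel patterns and scoring it on the other's. A minimal sketch under assumed variable names (trials x voxels arrays with binned orientation labels); the estimator choice is illustrative, not the authors':

```python
# Sketch: cross-generalization decoding between perception and WM epochs.
# Estimator choice and all variable names are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def cross_generalize(X_train, y_train, X_test, y_test):
    """Fit on one task's patterns, score accuracy on the other's."""
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    clf.fit(X_train, y_train)
    return clf.score(X_test, y_test)

# Placeholder data: 120 trials, 200 voxels, 6 orientation bins.
rng = np.random.default_rng(0)
X_perc, y_perc = rng.standard_normal((120, 200)), rng.integers(0, 6, 120)
X_mem, y_mem = rng.standard_normal((120, 200)), rng.integers(0, 6, 120)
print("perception -> memory:", cross_generalize(X_perc, y_perc, X_mem, y_mem))
```

High within-task but low cross-task accuracy would be one signature of a change in representational format, complementary to the RSA result.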
Weaknesses:
(1) The idea that later visual maps (i.e., IPS0) encode orientation in a veridical form during perception and then in a categorical form during WM is an attractive one. However, the support is somewhat weakened by a few issues. The RSA plots in Figure 1C for IPS0 appear to show a similar pattern in both tasks, just at lower amplitude during perception. Yet in the model fits, whether based on orientation statistics or estimated from the psychophysics task, the Veridical model fits best for perception and the Categorical model fits best for memory in IPS0. To my eye, the modeled RSMs in Figures 2 and 3 do not look like the observed ones in Figure 1C: the modeled RSMs look far more categorical than the observed IPS0 data, which look like something in between the two models.
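That "in between" impression could be quantified directly by fitting the observed RSM as a weighted mixture of the two model RSMs. A minimal sketch using ordinary least squares over the off-diagonal entries (one could instead impose nonnegativity constraints); this is an illustration, not an analysis from the paper:

```python
# Sketch: how "in between" is an observed RSM? Fit
# data ~ w_v * veridical + w_c * categorical + intercept by least squares.
import numpy as np

def mixture_weights(data_rsm, veridical_rsm, categorical_rsm):
    n = data_rsm.shape[0]
    mask = ~np.eye(n, dtype=bool)
    A = np.column_stack([veridical_rsm[mask], categorical_rsm[mask],
                         np.ones(mask.sum())])          # intercept column
    w, *_ = np.linalg.lstsq(A, data_rsm[mask], rcond=None)
    return {"veridical": w[0], "categorical": w[1], "intercept": w[2]}
```

Roughly equal weights for IPS0 would substantiate the impression that neither pure model captures that ROI.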
(2) My biggest concern is the omission of the in-scanner behavioral data. Yes, on the one hand, they used the N = 17 psychophysics dataset collected outside the scanner for the analyses in Figure 3. On the other hand, they do not even mention the behavioral data collected in the scanner along with the BOLD data. Those data had clear oblique effects, if I recall correctly. Why use the data from the psychophysics experiment instead? Also, perhaps a missed opportunity: I wonder whether the Veridical/Categorical model fits to a single subject's RSA data match that subject's behavioral biases. Finding that would be really compelling.
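The suggested individual-differences test could look something like the sketch below: a categorical-minus-veridical fit difference per subject, correlated across subjects with a behavioral bias score. All inputs are hypothetical placeholders, and with n = 6 such a correlation would of course be very underpowered.

```python
# Sketch: does a subject's neural model preference track their behavioral
# bias? Variable names and inputs are hypothetical.
import numpy as np
from scipy.stats import spearmanr

def model_preference(data_rsm, veridical_rsm, categorical_rsm):
    """Categorical-minus-veridical RSA fit for one subject's RSM."""
    mask = ~np.eye(data_rsm.shape[0], dtype=bool)
    fit = lambda m: spearmanr(data_rsm[mask], m[mask]).correlation
    return fit(categorical_rsm) - fit(veridical_rsm)

# prefs = [model_preference(r, veridical_rsm, categorical_rsm)
#          for r in subject_rsms]            # one value per subject
# rho, p = spearmanr(prefs, behavioral_bias) # across-subject association
```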
The data were collected (a reanalysis of a published study) without consideration for the aims of the current study, and are therefore not optimized to test its goals. The biggest issue is that "the distractors are really distracting me": I'm somewhat concerned about how the distractors may have impacted the results. I honestly did not notice until well into the paper that the authors were using delay periods containing 11 s of distractor stimuli.

On the one hand, the patterns of the model fits across the ROIs appear to be qualitatively similar. That's good if you want to pool data, as the authors did. However, the authors state on line 350 that "...we also confirmed that the presence of distractors during the delay did not impact the pattern of results in the memory task (Supplementary Figure 5)." Looking at Supplementary Figure 5, I noticed a couple of exceptions to this. In the grating-distractor data, V1 shows a better fit to the Veridical model, while V4 and IPS0 show no better fit to either model. And in the noise-distractor data, neither model fits better for any ROI. At first glance I was concerned, but the no-distractor data show a pattern identical to that of the combined data. Thus, this can be seen as a glass half full/empty issue: almost all of the ROIs show a similar pattern, but it would still concern me if I were leading this study.

This gets me to my key question: why use the distractor trials at all, where the interpretation can get dicey? For instance, the authors have shown in this exact dataset that distraction affects the fidelity of representations differently along the visual hierarchy (Rademaker, 2019), consistent with several other studies (e.g., Bettencourt & Xu, 2016; Lorenc, 2018; Hallenbeck et al., 2022) and with one of the authors' preprints (Rademaker & Serences, 2024). My guess is that without the full dataset, some of the RSA analyses are underpowered. If that is the case, I'm fine with it, but it would be nice to state so.