Early Visual Cortex Supports One-Shot Episodic Memory via Spatially Tuned Reactivation

  1. Department of Psychology, New York University, New York, United States
  2. Center for Neural Science, New York University, New York, United States
  3. Department of Cognitive and Psychological Sciences, Brown University, Providence, United States

Peer review process

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.


Editors

  • Reviewing Editor
    Lila Davachi
    Columbia University, New York, United States of America
  • Senior Editor
    Joshua Gold
    University of Pennsylvania, Philadelphia, United States of America

Reviewer #1 (Public review):

Summary:

This paper reports the findings of a neuroimaging experiment that tested the hypothesis that the cortex, specifically early visual areas, reinstates the content from single events during our lives. The researchers tested this hypothesis by presenting to-be-remembered pictures of objects at spatial locations on the computer screen and then testing subjects with both recall and recognition. They show that during memory testing, the spatial location of the object can be decoded from the pattern of cortical BOLD responses measured with fMRI. They go on to show that the spatial tuning is higher during recognition than recall, that the tuning is correlated with memory retrieval accuracy, and that the retrieved precision is predicted by the encoded precision, particularly in the higher-level visual areas. Thus, the paper finds evidence of cortical reinstatement of details from a single event in a human life.

Strengths:

This is a strong manuscript that I have had the luxury of commenting on during a round of review at another prestigious journal. As a result, the authors have already made changes to address previous comments about highlighting the complementary learning systems approach more to motivate the alternative prediction that the cortex should only show evidence of reinstatement after repeated presentations. In addition, the authors have fleshed out the discussion of working memory in this task. They also revised their review of the literature to include citations suggesting spatial locations are normal parts of our episodic representations, likely obligatory in nature, as my group and others have argued in completely unrelated work. I applaud the authors for being responsive to a previous round of review and using the comments to address relatively minor issues with the paper, even though they moved on to a different journal. Thus, I found the paper even stronger than at first approach, and at first blush, the results were intriguing and the paper well written.

Weaknesses:

There is a logical perspective in the narrative that seems to unnecessarily weaken the paper. The paper shows evidence consistent with the conclusion that mnemonic representations are contained in early visual cortex, but then argues that those representations are not actually stored therein. See, for example, the first half of the last sentence of the conclusions (page 19 of the manuscript). I understand the perspective that subcortical mechanisms must be involved in the act of retrieval, given the neuropsychology and other evidence. But if storage is elsewhere with the same fidelity so as to code this information, then how would such a memory system work? The MTL neurons would need to have the real, precise representation of all the orientations encoded at all the retinotopic locations, a mirror to V1 in terms of precision, because that's the actual memory representation being retrieved, so its fidelity will be limited by what is stored in the file, so to speak. Then, at retrieval, the paper proposes that the brain just reactivates the encoding context in V1 to help with the response output and ensure the precision of the behavioral responses. This must mean that the hippocampus/MTL has cells and networks with tuning functions that match the precision in all the cortical sensory systems that they are integrating context across, given the episodic memory models like Polyn and colleagues (2009, Psych Rev). So, there are little MTL maps that are completely redundant with V1, M1, A1, S1, etc.? Why such redundancy?

Why not propose that what the subcortical systems do is to encode a unique pattern for that episode, that is separated from others, that just links (or provides pointers to, in computer science jargon) the contextual details stored in the cortical networks themselves? In this way, we can explain why patients with neglect also neglect their memories of the town square. This has always been my interpretation of the results of the Polyn et al. (2006, Science) paper and the models tested with those whole-brain results. That is, you see widespread cortical context reinstatement during (one-shot) free recall events that included visual selective cortex for faces when faces were being recalled, but included a broad network, probably V1, and activating sounds in A1, body posture in M1, etc., though the latter three examples did not discriminate between categories of memoranda in their experiments. Given that you show that activity in V1 during retrieval looks like it is being used, you should propose that the early cortex really participates in memory storage functions. V1 neurons are wired up to neurons of other selectivities in a competitive network with plastic synaptic connections. How would experience be prevented from changing activity in the cortex? Yes, cortical changes slow after the critical periods, as studied in the classic eye suturing experiments to study ocular dominance, but changes in cortical representations do not stop with maturity, with the pinwheel centers looking like they are context sensitive, thus, changing rapidly to events across time (Okamoto, Ikezoe, et al., 2011, Sci Reports). The brain would need a no-plasticity mechanism, and instead, it looks like the cortex can completely rewire even in adulthood (Buonomano & Merzenich, 1998, Annu Rev Neurosci).

I believe that the paper needs to describe the strong/radical interpretation of the current findings: that they are consistent with the view that the entire brain may be a memory structure, with encoding linking representations across sensory cortices while also activating semantic and lexical systems and emotional networks encoding those aspects of context that we know can sometimes strongly drive effects. This is a nice prediction that could be made in the discussion/conclusions. Here you are looking at how precise the visual reinstatement is in V1 during retrieval following one exposure. One parsimonious mechanism to explain this effect is that the brain stores details of events using the neurons that do the high-fidelity perception of the event. Given that our goal is to stimulate thinking among fellow scientists so that this paper can be a citation classic, I think the paper should be revised so that it paints a complete picture of the theoretical possibilities of its findings.

Reviewer #2 (Public review):

Summary:

The study aims to show that the early visual cortex is not merely a sensory-perceptual region that encodes stimuli while they are physically present, but also supports the formation and retrieval of long-term episodic memories. To that end, the authors demonstrate that spatially tuned reactivation of early visual cortex after a single encoding event supports memory-guided behavior, such as recalling an object's original location.

Strengths:

The study provides solid evidence that location information for single, trial-unique objects is reinstated in early visual cortex during both recognition and recall, even without explicit spatial demands, and the remembered vs. forgotten analyses link spatial tuning to behavior. The one-shot design and absence of explicit spatial instructions are important strengths that bring the paradigm closer to everyday, incidental episodic experiences and go beyond highly trained cue-target associations.

Weaknesses:

(1) Conceptually, the main findings would appear less surprising without a sharper theoretical contrast. Given basic retinotopic coding, it is natural that object identity and location are jointly encoded when an object is presented at a particular position, so spatially tuned reinstatement in V1-V3 can be interpreted as a reconfirmation of known properties unless more clearly contrasted with theories that emphasize more abstract, position-invariant cortical representations following hippocampal-cortical recoding. As currently framed, the introduction does not fully articulate what existing accounts might predict, or what pattern of results would have challenged those accounts, which somewhat weakens the perceived theoretical payoff.

(2) It also remains somewhat unclear why early visual cortex (V1-V3), specifically, is the critical locus for the spatial information of interest, as opposed to higher-level visual or parietal regions that could also provide a spatial scaffold; clearer rationale and, if possible, control analyses in additional regions would help here.

(3) Since gaze behavior is central to any spatial account, it would be helpful to report basic eye-tracking analyses comparing remembered versus forgotten trials, especially at encoding, to rule out systematic differences in fixation patterns that could contribute to the spatial tuning results.

Reviewer #3 (Public review):

Summary and Overall Evaluation:

This is an elegant paper addressing an important question: whether spatial location is automatically activated during the recall of object memories. Building on prior work that relied on trained or repeated stimuli, the present study uses unique objects with one-time encoding across four spatial locations - a meaningful advance in ecological validity. The experimental design is clean, the data analysis is well-executed, and the reported effects, while small, are intriguing and open up interesting questions about the role of spatial structure in visual memory. Overall, this is a solid contribution, and my comments below are intended to help the authors strengthen the paper further.

Major Comments

(1) Incidental encoding.
Was the memory task fully incidental - that is, were participants unaware that a subsequent memory test would follow encoding? This seems important for interpreting the automaticity claim that is central to the paper's contribution, and should be clarified explicitly.

(2) Spatial extent of the analysis - higher visual regions and negative pRFs.
The analysis appears restricted to regions V1-V3. Have the authors examined higher visual areas as well? This seems like an important omission given that object memory likely engages regions well beyond the early visual cortex. Relatedly, recent work by Adam Steel and colleagues suggests that spatially tuned negative pRFs may play an important role in memory. Have the authors considered examining these? Expanding the analysis in these directions could substantially enrich the findings.

(3) Mechanism - retinotopic or spatiotopic?
The paper makes a compelling case that spatial structure supports memory, but the nature of that spatial structure deserves more discussion. Are the effects retinotopic or spatiotopic in nature? The current design may not be able to fully dissociate these possibilities, but this distinction is theoretically important, and the authors should engage with it directly. Even a careful discussion of what the current data can and cannot tell us on this point would be valuable.

(4) Relationship between encoding failure and retrieval failure.
For trials where memory performance is worse, and the encoding models fail, is there a systematic relationship between how the pRFs fail at object retrieval versus spatial retrieval? In other words, are the pRFs wrongly tuned in the same way at both stages? This analysis could provide meaningful insight into whether object and location retrieval draw on shared spatial representations.

(5) Object shape and spatial mapping.
Real-world objects vary considerably in surface structure and shape, which may affect how cleanly they map onto a specific spatial location. Was this considered in the analysis? What was taken as the correct or peak location for each object, and how was this defined when objects extended across space? Apologies if this was addressed in the methods and I missed it.

(6) Time course of pRF activation.
Is there a way to examine the time course of pRF activation within a trial? Do the spatially tuned responses arise immediately upon retrieval, or do they build up over time? Even a preliminary analysis of this would be of considerable theoretical interest, as it would speak to whether spatial reinstatement is an early automatic process or a later, more deliberate one.

(7) Effect size and functional significance.
The authors acknowledge that the reported effects are very small, which I appreciate. However, this does raise genuine questions about functional significance that I think deserve a more direct response. One approach that would help contextualize the spatial effects would be to compare their magnitude to that of another feature - object identity, for example - to give readers a sense of the relative importance of spatial versus non-spatial information in memory representations. I recognize this may not be straightforward with the current design, but even a brief discussion of how one might benchmark the spatial effects would be helpful.

(8) The attention account.
I found the discussion of attention less than fully convincing. The authors appear to argue against an attentional interpretation of the spatial effects, but it is not clear why participants wouldn't attend to the encoded location during retrieval - particularly in a design with relatively few retrieval cues, where spatial location may be one of the most useful available. The attention account thus seems difficult to rule out on the basis of the current data, and the discussion should engage more seriously with this alternative rather than setting it aside.

(9) Later-remembered versus later-forgotten objects - BOLD signal.
Were later-remembered objects associated with stronger overall BOLD responses during encoding compared to later-forgotten objects, or was the effect specific to the pRF modelling? Clarifying this would help readers understand whether the spatial effects are part of a broader pattern of stronger encoding or something more specific to the spatial reinstatement mechanism.
