Peer review process
Revised: This Reviewed Preprint has been revised by the authors in response to the previous round of peer review; the eLife assessment and the public reviews have been updated where necessary by the editors and peer reviewers.
Read more about eLife's peer review process.

Editors
- Reviewing Editor: Tatyana Sharpee, Salk Institute for Biological Studies, La Jolla, United States of America
- Senior Editor: Tirin Moore, Stanford University, Howard Hughes Medical Institute, Stanford, United States of America
Reviewer #1 (Public review):
Summary:
Knudstrup et al. use two-photon calcium imaging to measure responses to image sequences in mouse primary visual cortex (V1). The authors presented mice with many repetitions of the same four-image sequence (ABCD) for four days. Then, on the fifth day, they presented unexpected stimulus orderings in which one stimulus was either omitted (ABBD) or substituted (ACBD). After analyzing trial-averaged responses of neurons pooled across multiple mice, they observed that stimulus omission (ABBD) caused a small but significant strengthening of neural responses, but observed no significant change in the response to stimulus substitution (ACBD). Next, they performed population analyses of this dataset, showing changes in the correlation structure of activity and that many features of sequence ordering could be reliably decoded. This second set of analyses is interesting and exhibits larger effect sizes than the initial predictive-coding results. However, concerns about the design of the experiment temper my enthusiasm.
The most recent version of this manuscript makes a few helpful changes (entirely in supplemental figures; the main-text figures are unchanged). It does not resolve any of the larger weaknesses of the experimental design, or even perform single-neuron tracking in the one case where it was possible (between the similar FOVs shown in Supplementary Figure 1).
Strengths:
(1) The topic of predictive coding in the visual cortex is exciting, and this task builds on previous important work by the senior author (Gavornik and Bear 2014) where unexpectedly shuffling sequence order caused changes in LFPs recorded from visual cortex.
(2) Deconvolved calcium responses were used appropriately here to look at the timing of the neural responses.
(3) Neural decoding results showing that the context of the stimuli could be reliably decoded from trial-averaged responses were interesting, but I have concerns about how the data were formatted for these analyses.
Weaknesses:
(1) All analyses were performed on trial-averaged neural responses that were pooled across mice (except for Supplementary Figure 6, see below). Owing to differences between subjects in behavior, experimental preparation quality, and biological variability, it seems important to perform most analyses on individual datasets to assess how behavioral training might affect each animal differently.
In the most recent draft, a single-mouse analysis was added for Figure 4C (Supplementary Figure 6). This effect of "representational drift" was not statistically quantified in either the single-mouse results or the main-text figure panel. Moreover, the apparent correlational drift could be accounted for by a reduction in SNR as a consequence of photobleaching.
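To make the request concrete, here is a minimal sketch (Python; the array shapes and names are illustrative, not the authors' pipeline) of one way to quantify within-session drift per mouse: test whether population-vector similarity declines with trial lag. Note that fully disentangling genuine drift from a photobleaching-driven SNR decline would additionally require conditioning on a per-trial noise estimate, which this sketch does not attempt.

```python
import numpy as np
from scipy.stats import pearsonr

def drift_slope(responses):
    """Slope of population-vector similarity vs. trial lag.

    responses: (n_trials, n_cells) single-trial population responses
    to one stimulus, for one mouse and one session (illustrative).
    A negative slope means similarity falls as trials elapse (drift).
    """
    n_trials = responses.shape[0]
    lags, sims = [], []
    for i in range(n_trials):
        for j in range(i + 1, n_trials):
            r, _ = pearsonr(responses[i], responses[j])
            lags.append(j - i)
            sims.append(r)
    return np.polyfit(lags, sims, 1)[0]

def drift_permutation_test(responses, n_perm=1000, seed=0):
    """One-sided permutation test: shuffling trial order destroys any
    true lag-dependence, giving a null distribution for the slope."""
    rng = np.random.default_rng(seed)
    observed = drift_slope(responses)
    null = np.array(
        [drift_slope(responses[rng.permutation(len(responses))])
         for _ in range(n_perm)])
    p = (1 + np.sum(null <= observed)) / (1 + n_perm)
    return observed, p
```

Run per mouse, this would yield a drift estimate and p-value for each animal rather than a single pooled, unquantified trend.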
(2) The correlation analyses presented in Figure 3 (mislabeled as a second Figure 2 in the text) should be conducted on a single-animal basis. Studying population codes constructed by pooling across mice, particularly when there is no behavioral readout to assess whether learning has had similar effects on all animals, appears inappropriate to me. If the results in Figure 3 hold up in single animals, that would definitely be an interesting result.
In the most recent draft, this analysis was still not performed on single mice. I was referring to the "decorrelation of responses" analysis in Figure 3, not the "representational drift" analysis in Figure 4. See my comments on Supplementary Figure 6 above.
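For what it's worth, the per-animal version of the decorrelation analysis seems straightforward. A sketch of the statistic I have in mind (Python; names are illustrative), yielding one value per mouse per session, suitable for a paired test across days:

```python
import numpy as np

def mean_signal_correlation(tuning):
    """tuning: (n_stimuli, n_cells) trial-averaged responses for one
    mouse and one session (illustrative). Returns the mean pairwise
    signal correlation across cells, i.e. a single decorrelation
    value per animal."""
    c = np.corrcoef(tuning.T)              # (n_cells, n_cells)
    upper = np.triu_indices_from(c, k=1)   # count each cell pair once
    return np.nanmean(c[upper])
```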
(3) On Day 0 and Day 5, the reordered stimuli are presented in trial blocks where each image sequence is shown 100 times. Why wasn't the trial ordering randomized, as was done in previous studies (e.g., Gavornik and Bear 2014)? Given this lack of randomization, did neurons show reduced predictive responses because the unexpected sequence was shown so many times in quick succession? This might change the results seen in Figure 2, as well as the decoder results showing a neural encoding of sequence order (Figure 4). It would be interesting if the Figure 4 decoder stopped working when the higher-order block structure of the task was disrupted; one such control is sketched below.
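A minimal version of that control (Python; the variable names and labels are illustrative, not the authors' code) would compare conventional interleaved cross-validation against a split that respects block order:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

def block_structure_control(X, y, block_half, seed=0):
    """X: (n_trials, n_cells) responses to stimulus A; y: sequence
    label per trial (e.g. 0=ABCD, 1=ABBD, 2=ACBD); block_half: 0 for
    trials from the first half of each block, 1 for the second half.
    All names are illustrative."""
    clf = LogisticRegression(max_iter=5000)
    # Conventional interleaved cross-validation, blind to block order.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    interleaved = cross_val_score(clf, X, y, cv=cv).mean()
    # Train on early-block trials, test on late-block trials. If
    # accuracy collapses relative to the interleaved estimate, the
    # decodability may reflect slow within-block drift or adaptation
    # rather than sequence identity per se.
    clf.fit(X[block_half == 0], y[block_half == 0])
    across_half = clf.score(X[block_half == 1], y[block_half == 1])
    return interleaved, across_half
```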
In the rebuttal letter for the most recent draft, the authors refer to recent work in press (Hosmane et al. 2024) suggesting that, because sleep may be important for plastic changes between sessions, they do not expect much change to be apparent within a session. However, they admit that the current study is too underpowered to know for sure, and they do not cite or mention this as-yet-unpublished work in the manuscript itself.
As a control, I would be interested to at least know how much variance in neural responses is observed between intermediate "training" sessions with identical stimuli, e.g., between Day 1 and Day 4, but this is not possible, as imaging was not performed on those days.
Despite being referred to as "similar", early and late responses are not clearly shown, aside from the histograms in Figure 5B and Figure 6A comparing "early traces" to "all traces" (the latter of which include the early traces). Showing the variance in single-cell responses would be a helpful addition to Supplementary Figures 3 and 4.
(4) A primary advantage of two-photon calcium imaging over other techniques like extracellular electrophysiology is that the same neurons can be tracked over many days. This is a standard approach that can be accomplished with many software packages, including Suite2p (Pachitariu et al. 2017), which the authors already used for the rest of their data preprocessing. The authors do not appear to have done this. Instead, it appears that different neurons were imaged on Day 0 (baseline) and Day 5 (test). This is a significant weakness of the current dataset.
In the most recent draft, this concern has not been mitigated. Despite Supplementary Figure 1 showing similar FOVs, mostly different neurons were still extracted. For all other sessions, it is not reported how far apart the recorded FOVs were from each other.
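Even without purpose-built tracking tools, a first-pass match between the two similar FOVs would be possible. A minimal sketch (Python; thresholds and names are illustrative), assuming the FOVs have already been registered into a common reference frame; dedicated cross-session registration tools would of course be preferable:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_rois(centroids_a, centroids_b, max_dist_um=10.0):
    """centroids_a, centroids_b: (n, 2) ROI centroids (in microns)
    from the two sessions, after FOV registration. Returns index
    pairs of putative same-neuron matches; pairs farther apart than
    max_dist_um are rejected as implausible."""
    dists = np.linalg.norm(
        centroids_a[:, None, :] - centroids_b[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(dists)  # min total distance
    keep = dists[rows, cols] < max_dist_um
    return rows[keep], cols[keep]
```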
The rebuttal comment that the PE statistic is computed on an individual-cell, within-session basis is reasonable. Moreover, the bootstrapped version of the PE analysis in Supplementary Figure 8 is an improvement over the main analysis in the paper. As a control, it would have been helpful to compute the stability of the PE ratio statistics between training days (e.g., between Day 1 and Day 4): how much change would have been observed when none is expected? Unfortunately, imaging was not performed on these training days, so this analysis is not readily possible. Moreover, the PE statistic requires averaging across cells and trials and is therefore very likely to wash out many interesting effects. Even if it is the population response that is changing, why would it be the arithmetic mean that changes in particular, rather than some other projection of the population activity? The experimental and analysis design of the paper remains weak in my mind.
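To illustrate the last point: the grand mean is simply the projection of population activity onto the all-ones direction, and other directions may separate oddball from control trials even when the mean does not. A sketch of the comparison I have in mind (Python; names and shapes are illustrative):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

def mean_vs_population(X, y):
    """X: (n_trials, n_cells) single-trial population responses at
    the oddball position; y: 1 for oddball trials (e.g. ABBD), 0 for
    control trials (ABCD). All names are illustrative.

    Comparing the decoding accuracy of the mean projection with that
    of the full population reveals effects the mean washes out."""
    lda = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
    mean_proj = X.mean(axis=1, keepdims=True)    # all-ones projection
    acc_mean = cross_val_score(lda, mean_proj, y, cv=5).mean()
    acc_full = cross_val_score(lda, X, y, cv=5).mean()
    return acc_mean, acc_full
```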
Reviewer #2 (Public review):
Knudstrup and colleagues investigate response to short and rapid sequences of stimuli in layer 2/3 of mouse visual cortex. To quote the authors themselves: "the work continues the recent tradition of providing ambiguous support for the idea that cortical dynamics are best described by predictive coding models". Unfortunately, the ambiguity here is largely a result of the choice of experimental design and analysis, and the data provide only incomplete support for the authors' conclusions.
The authors have addressed some of the concerns raised in the first round of review. However, many still remain.
(1) From the first review: "There appears to be some confusion regarding the conceptual framing of predictive coding. Assuming the mouse learns to expect the sequence ABCD, then ABBD does not probe just for negative prediction errors, and ACBD not just positive prediction errors. With ABBD, there is a combination of a negative prediction error for the missing C in the 3rd position, and a positive prediction error for B in 3rd. Likewise, with ACBD, there is negative prediction error for the missing B at 2nd and missing C at 3rd, and a positive prediction error for the C in 2nd and B in 3rd. Thus, the authors' experimental design does not have the power to isolate either negative or positive prediction errors. Moreover, looking at the raw data in Figure 2C, this does not look like an "omission" response to C, more like a stronger response to a longer B. The pitch of the paper as investigating prediction error responses is probably not warranted - we see no way to align the authors' results with this interpretation."
The authors acknowledge in their response that this is a problem, but do not appear to discuss this in the manuscript. This should be fixed.
(2) From the first review: "Recording from the same neurons over the course of this paradigm is well within the technical standards of the field, and there is no reason not to do this. Given that the authors chose to record from different neurons, it is difficult to distinguish representational drift from drift in the population of neurons recorded. "
The authors respond by pointing out that what they mean by "drift" is within-day changes. This has been clarified. However, the analyses in Figures 3 and 5 are still done across days. Figure 3: "Experience modifies activity in PCA space ..." and Figure 5: "Stimulus responses shift with training". Both rely on comparisons of population activity across days, so this concern remains unchanged. It would probably be best to remove any analysis done across days, or to use data where the same neurons were tracked. Performing chronic two-photon imaging experiments without tracking the same neurons is simply bad practice (assuming one intends to do any analysis across recording sessions).
(3) From the first review: "The block paradigm to test for prediction errors appears ill chosen. Why not interleave oddball stimuli randomly in a sequence of normal stimuli? The concern is related to the question of how many repetitions it takes to learn a sequence. Can the mice not learn ACBD over 100x repetitions? The authors should definitely look at early vs. late responses in the oddball block. Also the first few presentations after block transition might be potentially interesting. The authors' analysis in the paper already strongly suggests that the mice learn rather rapidly. The authors conclude: "we expected ABCD would be more-or-less indistinguishable from ABBD and ACBD since A occurs first in each sequence and always preceded by a long (800 ms) gray period. This was not the case. Most often, the decoder correctly identified which sequence stimulus A came from." This would suggest that whatever learning/drift could happen within one block did indeed happen and responses to different sequences are harder to interpret."
Again, the authors acknowledge the problem and state that "there is no indication that this is a learned effect". However, they provide no evidence for this and perform no analysis to mitigate the concern.
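The early-vs-late comparison suggested in the original review would be a direct way to address this. A sketch of the minimal version (Python; names and the edge size are illustrative):

```python
import numpy as np
from scipy.stats import wilcoxon

def early_vs_late(responses, n_edge=10):
    """responses: (n_trials, n_cells) responses to the oddball
    element across a 100-repeat block, in presentation order
    (illustrative). Compares each cell's mean response over the
    first n_edge vs. last n_edge presentations; a systematic shift
    would indicate rapid within-block learning or adaptation."""
    early = responses[:n_edge].mean(axis=0)   # per-cell early mean
    late = responses[-n_edge:].mean(axis=0)   # per-cell late mean
    stat, p = wilcoxon(early, late)           # paired across cells
    return (late - early).mean(), p
```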
(4) Some of the minor comments also appear unaddressed and without comment. For example, response amplitudes are still shown in "a.u." instead of dF/F, z-score, or spikes.
Reviewer #3 (Public review):
Summary:
This work provides insights into predictive coding models of visual cortex processing. These models predict that visual cortex neurons will show elevated responses when there are unexpected changes to learned sequential stimulus patterns. This model is currently controversial, with recent publications providing conflicting evidence. In this work, the authors test two types of unexpected pattern variations in layer 2/3 of the mouse visual cortex. They show that pattern omission evokes elevated responses, in favor of a predictive coding model, but find no evidence for prediction errors with substituted patterns, which conflicts both with prior results in L4 and with the expectations of a predictive coding model. They also report that with sequence training, responses sparsify and decorrelate, but, surprisingly, find no changes in the ability of an ideal observer to decode stimulus identity or timing.
These results are an important contribution to the understanding of how temporal sequences and expectations are encoded in the primary visual cortex.
Comments on revisions:
In this revision, the authors address several of the concerns raised about the original manuscript. However, the primary issue, raised by all three reviewers, was the block design of the experiments. This design makes disentangling the effects of any rapid (within-block) plasticity from any longer-term (across-days) plasticity, which is nominally the subject of the paper, extremely difficult.
Although re-running the experiments with an interleaved design may be beyond the scope of this paper, the revised manuscript unfortunately still does not adequately discuss this potential confound. The authors note that stimulus A in ABCD, ABBD, and ACBD could be distinguished on Day 0, indicating that within-block changes do occur. In both the original and revised manuscripts this finding is discussed in terms of representational drift, but the authors fail to discuss how such within-block plasticity may affect their primary findings of prediction-error effects.
This remains a significant concern with the revised manuscript.
Many of the other issues in the original manuscript have been addressed, and in these areas the revised manuscript is both clearer and more accurately reflects the presented data. The additional analyses and controls shown in the supplemental figures aid in the interpretation of the findings.