Dissociable dynamic effects of expectation during statistical learning

  1. Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany
  2. Berlin School of Mind and Brain, Berlin, Germany
  3. Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, the Netherlands
  4. Neural Circuits and Cognition Lab, European Neuroscience Institute Göttingen - A Joint Initiative of the University Medical Center Göttingen and the Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
  5. Perception and Plasticity Group, German Primate Center, Leibniz Institute for Primate Research, Göttingen, Germany

Peer review process

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.


Editors

  • Reviewing Editor
    Clare Press
    University College London, London, United Kingdom
  • Senior Editor
    Huan Luo
    Peking University, Beijing, China

Reviewer #1 (Public review):

Summary:

In this lovely paper, McDermott and colleagues tackle an enduring puzzle in the cognitive neuroscience of perceptual prediction. Though many scientists agree that top-down predictions shape perception, previous studies have yielded incompatible results - with some studies showing 'sharpened' representations of expected signals, and others showing a 'dampening' of predictable signals to relatively enhance surprising prediction errors. Deepening the paradox further, there seem to be good reasons why we would want to see both influences on perception in different contexts.

Here, the authors aim to test one possible resolution to this 'paradox' - the opposing process theory (OPT). This theory makes distinct predictions about how the time course of 'sharpening' and 'dampening' effects should unfold. The researchers present a clever twist on a leading-trailing perceptual prediction paradigm, using AI to generate a large dataset of test and training stimuli so that it is possible to form expectations about certain categories without repeating any particular stimulus. This provides a powerful way of distinguishing expectation effects from repetition effects - a perennial problem in this line of work.

Using EEG decoding, the researchers find evidence to support the OPT. Namely, they find that neural encoding of expected events is superior in earlier time ranges (sharpening-like) followed by a relative advantage for unexpected events in later time ranges (dampening-like). On top of this, the authors also show that these two separate influences may emerge differently in different phases of learning - with superior decoding of surprising prediction errors being found more in early phases of the task, and enhanced decoding of predicted events being found in the later phases of the experiment.

Strengths:

As noted above, a major strength of this work lies in important experimental design choices. Alongside removing any possible influence of repetition suppression mechanisms in this task, the experiment also allows us to see how effects emerge in 'real-time' as agents learn to make predictions. This contrasts with many other studies in this area - where researchers 'over-train' expectations into observers to create the strongest possible effects or rely on prior knowledge that was likely to be crystallised outside the lab.

Weaknesses:

This study reveals a great deal about how certain neural representations are altered by expectation and learning on shorter and longer timescales, so I am loath to describe certain limitations as 'weaknesses'. But one limitation inherent in this experimental design is that, by focusing on implicit, task-irrelevant predictions, there is not much opportunity to connect the predictive influences seen at the neural level to the perceptual performance itself (e.g., how participants make perceptual decisions about expected or unexpected events, or how these events are detected or appear).

The behavioural data that are displayed (from a post-recording behavioural session) show that these predictions do influence perceptual choice - leading to faster reaction times when expectations are valid. In broad strokes, such a result is consistent with a 'sharpening' view of perceptual prediction, and with the fact that sharpening effects are found in this study to be larger at the end of the task than at the beginning. But it strikes me that the strongest test of the relevance of these (very interesting) EEG findings would be some evidence that the neural effects relate to behavioural influences (e.g., are participants actually more behaviourally sensitive to invalid signals in earlier phases of the experiment, given that this is where the neural effects show the most 'dampening', a.k.a. prediction error advantage?).
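
To make this suggestion concrete, below is a minimal sketch of the kind of brain-behaviour analysis I have in mind: correlating a per-participant neural 'dampening' index with a per-participant reaction-time validity effect. All data, variable names, and effect sizes are hypothetical placeholders, not values from the manuscript.

```python
# Hypothetical sketch: relate a per-participant neural effect to behaviour.
# None of these numbers or variable names come from the manuscript.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_subjects = 30

# Per-participant neural "dampening" index, e.g. invalid-minus-valid decoding
# accuracy in the late window of the early task phase (hypothetical values).
neural_dampening = rng.normal(0.02, 0.01, n_subjects)

# Per-participant behavioural validity effect, e.g. invalid-minus-valid median
# reaction time in seconds (hypothetical values, loosely tied to the neural index).
rt_validity_effect = 0.5 * neural_dampening + rng.normal(0, 0.01, n_subjects)

# Across-participant correlation: do stronger neural prediction-error advantages
# go along with larger behavioural costs for invalid trials?
r, p = stats.pearsonr(neural_dampening, rt_validity_effect)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")
```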

Reviewer #2 (Public review):

Summary:

There are two accounts in the literature that propose that expectations suppress the activity of neurons that are (a) not tuned to the expected stimulus, to increase the signal-to-noise ratio for expected stimuli (sharpening model), or (b) tuned to the expected stimulus, to highlight novel information (dampening model). One recent account, the opposing process theory, brings the two models together and suggests that both processes occur, but at different time points: initial sharpening is followed by later dampening of the neural activity of the expected stimulus. In this study, the authors aim to test the opposing process theory in a statistical learning task by applying multivariate EEG analyses, and they find evidence for the theory based on within-trial dynamics.

Strengths:

This study addresses a very timely research question about the underlying mechanisms of expectation suppression. The applied EEG decoding approach offers an elegant way to investigate the temporal characteristics of expectation effects. A major strength of the study lies in the experimental design that aims to control for repetition effects, one of the common confounds in prediction suppression studies. The reported results are novel in the field and have the potential to substantially improve our understanding of expectation suppression in visual perception.

Weaknesses:

The strength in controlling for repetition effects by introducing a neutral (50% expectation) condition also adds a weakness to the current version of the manuscript, as this neutral condition is not integrated into the behavioral (reaction times) and EEG (ERP and decoding) analyses. The reasoning behind this omission remained unclear to me. The reported results would be strengthened by showing differences between the neutral and expected (valid) conditions on the behavioral and neural levels. This would also provide a more rigorous check that participants had implicitly learned the associations between the picture category pairings.

It is not entirely clear to me what is actually decoded in the prediction condition, or why the authors did not perform the prediction decoding over trial bins, as potential differences across time could be hidden by averaging the data. The manuscript would generally benefit from a more detailed description of the analysis rationale and methods.

Finally, the scope of this study should be limited to expectation suppression in visual perception, as the generalization of these results to other sensory modalities or to the action domain remains open for future research.

Reviewer #3 (Public review):

Summary:

In their study, McDermott et al. investigate the neurocomputational mechanism underlying sensory prediction errors. They contrast two accounts: representational sharpening and dampening. Representational sharpening suggests that predictions increase the fidelity of the neural representations of expected inputs, while representational dampening suggests the opposite (decreased fidelity for expected stimuli). The authors performed decoding analyses on EEG data, showing that expected stimuli could at first be better decoded (sharpening), followed by a reversal during later response windows where unexpected inputs could be better decoded (dampening). These results are interpreted in the context of opposing process theory (OPT), which suggests that such a reversal allows perception to be both veridical (i.e., initial sharpening to increase the accuracy of perception) and informative (i.e., later dampening to highlight surprising, but informative inputs).

Strengths:

The topic of the present study is of significant relevance to the field of predictive processing. The experimental paradigm used by McDermott et al. is well designed, allowing the authors to avoid several common confounds in investigating predictions, such as stimulus familiarity and adaptation. The introduction of the manuscript provides a well-written summary of the main arguments for the two accounts of interest (sharpening and dampening), as well as OPT. Overall, the manuscript serves as a good overview of the current state of the field.

Weaknesses:

In my opinion, several details of the methods, results, and manuscript raise doubts about the quality and reliability of the reported findings. Key concerns are:

(1) The results in Figure 2C seem to show that the leading image itself can only be decoded with ~33% accuracy (25% chance; i.e., ~8% above-chance decoding). In contrast, Figure 2E suggests that the prediction (whether valid or invalid) during the leading image presentation can, surprisingly, be decoded with ~62% accuracy (50% chance; i.e., ~12% above-chance decoding). Unless I am misinterpreting the analyses, it seems implausible to me that a prediction of an image that is not actually shown can be better decoded using EEG than an image that is presented on-screen.
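
For transparency, the rough arithmetic behind this comparison is spelled out below, using only the approximate accuracies and chance levels quoted above (values read off the figures, not exact numbers from the manuscript).

```python
# Rough arithmetic behind the comparison; the accuracies are approximate values
# read off Figures 2C and 2E, not exact numbers reported by the authors.
def above_chance(accuracy, chance):
    """Absolute margin above chance, plus a chance-normalised score in [0, 1]."""
    margin = accuracy - chance
    normalised = margin / (1.0 - chance)
    return margin, normalised

# Leading-image decoding: ~33% accuracy with 4 classes (25% chance).
print(above_chance(0.33, 0.25))   # margin ~0.08, normalised ~0.11

# "Prediction" decoding: ~62% accuracy with 2 classes (50% chance).
print(above_chance(0.62, 0.50))   # margin ~0.12, normalised ~0.24
```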

(2) The "prediction decoding" analysis is described by the authors as "decoding the predictable trailing images based on the leading images". How this was done is, however, unclear to me. For each leading image, decoding the predictable trailing images should be equivalent to decoding validity (as there were only 2 possible trailing image categories: 1 valid, 1 invalid). How is it then possible that the analysis is performed separately for valid and invalid trials? If the authors simply decode which leading image category was shown, but combine L1+L2 and L4+L5 into one class respectively, the resulting decoder would in my opinion not decode the prediction, but instead dissociate the representation of L1+L2 from L4+L5. This may also explain why the time-course of the prediction decoding peaks during the leading-image stimulus response, which is rather different from previous studies decoding predictions (e.g. Kok et al. 2017). Instead, for the prediction analysis to be informative about the prediction, the decoder ought to decode the representation of the trailing image during the leading image and inter-stimulus interval. Therefore, I am at present not convinced that the utilized analysis approach is informative about predictions.
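
To make the alternative concrete, below is a minimal sketch, on simulated data, of a cross-decoding analysis in the spirit of Kok et al. (2017): train a trailing-category decoder on the trailing-image response and test it against the predicted category during the leading image and inter-stimulus interval, separately for valid and invalid trials. The epoch layout, time windows, validity ratio, and all variable names are assumptions for illustration; this is not the authors' pipeline, only one way the prediction itself could be probed.

```python
# Illustrative cross-decoding sketch on simulated data (assumed windows and labels);
# not the authors' pipeline, only one way to probe pre-activation of the predicted
# trailing category before the trailing image appears.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n_trials, n_channels, n_times = 200, 64, 300          # hypothetical epoch layout
X = rng.normal(size=(n_trials, n_channels, n_times))  # epochs locked to the leading image

predicted = rng.integers(0, 2, n_trials)              # category cued by the leading image
valid = rng.random(n_trials) < 0.75                   # assumed validity ratio
shown = np.where(valid, predicted, 1 - predicted)     # trailing category actually shown

trailing_win = slice(200, 260)   # samples during the trailing image (assumed)
leading_win = range(0, 200)      # leading image + inter-stimulus interval (assumed)

# Train a trailing-category decoder on the trailing-image response of half the trials.
train = np.arange(0, n_trials, 2)
test = np.arange(1, n_trials, 2)
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X[train][:, :, trailing_win].mean(axis=2), shown[train])

# Test it time point by time point *before* the trailing image, scoring against the
# PREDICTED category; above-chance accuracy here, for valid and invalid trials alike,
# would indicate pre-activation of the prediction rather than of the shown stimulus.
for label, mask in [("valid", valid[test]), ("invalid", ~valid[test])]:
    idx = test[mask]
    acc = [clf.score(X[idx, :, t], predicted[idx]) for t in leading_win]
    print(label, round(float(np.mean(acc)), 3))
```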

(3) I may be misunderstanding the reported statistics or analyses, but it seems unlikely that more than 10 of the reported contrasts have the exact same statistic of Tmax = 2.76. Similarly, it seems implausible, based on visual inspection of Figure 2, that the Tmax for the invalid condition decoding (reported as Tmax = 14.903) is substantially larger than for the valid condition decoding (reported as Tmax = 2.76), even though the valid condition appears to have superior peak decoding performance. Combined, these details raise concerns about the reliability of the reported statistics.

(4) The reported analyses and results do not seem to support the conclusion that early learning results in dampening and later stages of learning in sharpening. Specifically, the authors appear to base this conclusion on the absence of a decoding effect in some time-bins, while in my opinion a contrast between time-bins, showing a difference in decoding accuracy, is required. Better yet, a non-zero slope of decoding accuracy over time should be shown (not contingent on post-hoc and seemingly arbitrary binning).
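
A minimal sketch of this slope-based alternative is given below, on simulated placeholder data rather than the study's recordings: fit a per-participant regression of single-trial decoding accuracy on trial number (no binning), then test the slopes against zero at the group level. All shapes and parameters are assumptions for illustration.

```python
# Sketch of the suggested slope analysis (simulated placeholder data): regress
# decoding accuracy on trial position per participant, then test whether the
# slopes differ from zero across participants.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_subjects, n_trials = 30, 400

# Hypothetical single-trial decoding outcomes (correct/incorrect classifier output),
# with a weak assumed upward trend over the course of the experiment.
trial_axis = np.arange(n_trials)
p_correct = 0.55 + 0.0001 * trial_axis
accuracy = rng.random((n_subjects, n_trials)) < p_correct   # subjects x trials

# Per-participant slope of accuracy over trials (least-squares fit, no binning).
slopes = np.array([
    stats.linregress(trial_axis, subj_acc.astype(float)).slope
    for subj_acc in accuracy
])

# Group-level test: is the mean slope reliably different from zero?
t, p = stats.ttest_1samp(slopes, 0.0)
print(f"mean slope = {slopes.mean():.6f}, t({n_subjects - 1}) = {t:.2f}, p = {p:.3f}")
```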

(5) The present results, both within and across trials, are difficult to reconcile with previous studies using MEG (Kok et al., 2017; Han et al., 2019), single-unit and multi-unit recordings (Kumar et al., 2017; Meyer & Olson 2011), as well as fMRI (Richter et al., 2018), which investigated similar questions but yielded different results: no reversal within or across trials, and dampening effects emerging after more extensive training. The authors do not provide a convincing explanation as to why their results should differ from these previous studies, arguably further compounding the doubts about the present results raised by the methods and results concerns noted above.

Impact:

At present, I find the potential impact of the study by McDermott et al. difficult to assess, given the concerns mentioned above. Should the authors convincingly address these concerns, the study could provide meaningful insights into the mechanisms underlying perceptual prediction. However, I am not yet entirely convinced of the quality and reliability of the results and manuscript. Moreover, the difficulty in reconciling some of the present results with previous studies highlights the need for more convincing explanations of these discrepancies and a stronger discussion of the present results in the context of the literature.
