Perceptual predictions track subjective, over objective, statistical structure

  1. Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, United Kingdom
  2. Functional Imaging Laboratory, Department of Imaging Neuroscience, University College London, London, United Kingdom
  3. School of Psychological Sciences, Birkbeck, University of London, London, United Kingdom

Peer review process

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.

Editors

  • Reviewing Editor
    Nathan Faivre
    Centre National de la Recherche Scientifique, Grenoble, France
  • Senior Editor
    Huan Luo
    Peking University, Beijing, China

Reviewer #1 (Public review):

Summary:

Press et al. test, in three experiments, whether responses in a speeded response task reflect people's expectations, and whether these expectations are best explained by the objective statistics of the experimental context (e.g., stimulus probabilities) or by participants' mental representation of these probabilities. The studies use a classical response time and accuracy task in which people (1) are asked to make a response (with one hand), (2) this response triggers the presentation of one of several stimuli (with different probabilities depending on the response), and (3) participants then make a speeded response to identify this stimulus (with the other hand). In Experiment 1, participants are asked to rate, after the experiment, the subjective probabilities of the different stimuli. In Experiments 2 and 3, they rated, after each trial, to what extent the stimulus was expected (Experiment 2), or whether they were surprised by it (Experiment 3). The authors test (using linear models) whether the subjective ratings in each experiment predict stimulus identification times and accuracies better than the objective stimulus probabilities (Experiment 1), or than objective probabilities derived from a Rescorla-Wagner model of prior stimulus history (Experiments 2 and 3). Across all three experiments, the pattern of results is identical: response times are best described by contributions from both subjective and objective probabilities, while accuracy is best described by subjective probability.

Strengths:

This is an exciting series of studies that tests an assumption that is implicit in predictive theories of response preparation (i.e., that response speed/accuracy tracks subjective expectancies), but has not been properly tested so far, to my knowledge. I find the idea of measuring subjective expectancy and surprise in the same trials as the response very clever. The manuscript is extremely well written. The experiments are well thought-out, preregistered, and the results seem highly robust and replicable across studies.

Weaknesses:

In my assessment, this is a well-designed, implemented, and analysed series of studies. I have one substantial concern that I would like to see addressed, and two more minor ones.

(1) The key measure of the relationship between subjective ratings and response times/accuracy is inherently correlational. The causal relationship between the two variables is therefore by definition ambiguous. I worry that the results don't reveal an influence of subjective expectancy on response times/accuracies, but the reverse: an influence of response times/accuracies on subjective expectancy ratings.

This potential issue is most prominent in Experiments 2 and 3, where people rate their expectations in a given trial directly after making their response. We can assume that participants have at least some insight into whether their response in the current trial was correct/erroneous or fast/slow. I therefore wonder whether the pattern of results can simply be explained by participants noticing that they made an error (or that they responded very slowly) and subsequently being more inclined to rate that they did not expect this stimulus (in Experiment 2) or that they were surprised by it (in Experiment 3).

The specific pattern across the two response measures might support this interpretation. Typically, participants are more aware of the errors they make than of their response speed. From the above perspective, it would therefore not be surprising that all experiments show stronger associations between accuracy and subjective ratings than between response times and subjective ratings, exactly as the three studies found.

I acknowledge that this problem is less severe in Experiment 1, where participants do not rate expectancy or surprise after each response, but make subjective estimates of stimulus probabilities after the experiment. Still, even here, the flow of information might be opposite to what the authors suggest. Participants might not have made more errors for stimuli they thought were least likely, but might instead have used the frequency of their own identification responses as a proxy for stimulus likelihood. For example, if they identified a stimulus as a square on 25% of trials, even though 5% of these responses were in error, it is perhaps no surprise if their rating of the stimulus likelihood tracks the rate at which they identified it as a square (25%) better than the actual stimulus likelihood (20%).

This potential reverse direction of effects would need to be ruled out to fully support the authors' claims.

(2) My second, more minor concern is whether the Rescorla-Wagner model is truly the best approximation of objective stimulus statistics. It is traditionally a model of how people learn. Isn't it, therefore, already a model of subjective stimulus statistics, derived from the trial history, rather than of objective ones? If this is correct, my interpretation of Experiments 2 and 3 would be (given my point 1 above is resolved) that subjective expectancy ratings predict responses better than this particular model of learning, meaning that it is not a good model of learning in this task. Comparing results against Rescorla-Wagner may even seem like a stronger test than comparing them against objective stimulus statistics: it shows that subjective ratings capture expectancies better even than this model of learning. The authors already touch upon this point in the General Discussion, but I would like to see it expanded and, ideally, comparisons against objective stimulus statistics (perhaps up to the current trial) included, so that the authors can truly support the claim that it is not the objective stimulus statistics that determine response speed and accuracy.
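
To make the requested comparison concrete, here is a minimal sketch, not taken from the manuscript, contrasting a delta-rule (Rescorla-Wagner) estimate with the objective stimulus frequency "up to the current trial". The function names, learning rate, and simulated sequence are all illustrative assumptions.

```python
import numpy as np

def rw_estimate(outcomes, alpha=0.1, v0=0.5):
    """Delta-rule (Rescorla-Wagner) estimate of a stimulus probability.
    Returns the prediction held *before* each trial."""
    v = np.empty(len(outcomes))
    v_t = v0
    for t, o in enumerate(outcomes):
        v[t] = v_t                  # prediction before observing trial t
        v_t += alpha * (o - v_t)    # delta-rule update from the prediction error
    return v

def running_frequency(outcomes, v0=0.5):
    """Objective stimulus frequency up to (but excluding) the current trial."""
    freq = np.cumsum(outcomes) / np.arange(1, len(outcomes) + 1)
    return np.concatenate([[v0], freq[:-1]])

# Example: a stimulus that occurs on 20% of trials
rng = np.random.default_rng(0)
outcomes = (rng.random(200) < 0.2).astype(float)
print(rw_estimate(outcomes)[-3:])        # history-weighted, keeps fluctuating
print(running_frequency(outcomes)[-3:])  # converges on the objective rate
```

With a stationary stimulus, the running frequency converges on the true probability, whereas the delta-rule estimate keeps fluctuating with recent history, which is precisely what makes it more a model of learning than of objective statistics.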

(3) There is a long history of research trying to link response times to subjective expectancies. For example, Simon and Craft (1989, Memory & Cognition) reported that stimuli of equal probability were identified more rapidly when participants had explicitly indicated that they expected that stimulus to occur in the given trial, and there is similar, more recent work trying to dissociate stimulus statistics and explicit expectations (e.g., Umbach et al., 2012, Frontiers; for a somewhat recent review, see Gaschler et al., 2014, Neuroscience & Biobehavioral Reviews). It has not become clear to me how the current results relate to this literature. How do they impact this discussion, and how do they differ from what is already known?

Reviewer #2 (Public review):

Summary:

This work by Clarke, Rittershofer, and colleagues used categorization and discrimination tasks with subjective reports of task regularities. In three behavioral experiments, they found that these subjective reports explain task accuracy and response times at least as well as, and sometimes better than, objective measures. They conclude that subjective experience may play a role in predictive processing.

Strengths:

This set of behavioral studies addresses an important question. The results are replicated three times, each with a different experimental design, which strengthens the claims. The design is preregistered, which further strengthens the results. The findings could inspire many studies in decision-making.

Weaknesses:

It seems to me that it is important, but difficult, to distinguish whether the objective and subjective measures stem from genuinely different mechanisms contributing to behavior, or whether they are simply two noisy proxies for the same mechanism, in which case it is not so surprising that both contribute to the explained variance. The authors acknowledge in the discussion that the type of objective measure that is chosen is crucial.

For instance, the RW model's learning rates were not fitted to participants but to the sequence of stimuli (see the sketch after this paragraph), so they represent the optimal parameter values, not the true ones that participants were using. Is the subjective measure just a readout of the RW model's true state when using the participants' parameters? Relatedly, would the authors consider RW predictions generated with a participant's sub-optimal alpha to be a subjective or an objective measure? Do the results truly show the importance of subjective measures, or is it another way of saying that humans are sub-optimal (Rahnev & Denison, 2018, BBS) ... or optimal for other goals? I see the difficulty of avoiding double-dipping on accuracy, but this seems essential to address. This relates to a more general question about the underlying mechanisms of subjective versus objective measures, which is alluded to in the discussion but could be interesting to develop a bit further.
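
To illustrate the distinction, below is a minimal sketch, under assumptions not stated in the manuscript, of fitting the learning rate to the stimulus sequence alone by maximizing the predictive likelihood of the outcomes, independently of any participant. Fitting to participants would instead maximize the likelihood of their responses or ratings.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def neg_log_lik(alpha, outcomes, v0=0.5):
    """Negative log-likelihood of a binary outcome sequence under a
    Rescorla-Wagner learner with learning rate alpha."""
    v, nll = v0, 0.0
    for o in outcomes:
        p = v if o == 1 else 1.0 - v   # predicted probability of this outcome
        nll -= np.log(max(p, 1e-9))    # guard against log(0)
        v += alpha * (o - v)           # delta-rule update
    return nll

rng = np.random.default_rng(1)
outcomes = (rng.random(300) < 0.75).astype(int)

# 'Objective' alpha: optimal for this sequence, independent of any participant.
# A participant-fitted (possibly sub-optimal) alpha would arguably make the
# same model a subjective measure.
fit = minimize_scalar(neg_log_lik, bounds=(1e-3, 1.0), args=(outcomes,),
                      method="bounded")
print(f"sequence-optimal alpha = {fit.x:.3f}")
```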

In terms of methods, I did not fully understand the 'RW model expectedness' objective metric in Experiments 2 and 3. V_T is defined as the model's expectation for the given tone T. A (signed?) prediction error is defined for the expectation update, but it seems that the RW model expectedness used in the figures and statistical models is V_T, sign-inverted for unexpected stimuli. So how do we interpret negative values, and how often do they occur? Shouldn't it be the unsigned value that is taken as objective surprise? This could be explained in a bit more detail. Could this be related to the quadratic effect that one can see in Figures 4E and 5E, which is not taken into account in the statistical model? Figures 4A and 5A also seem to show a combination of linear and quadratic effects. A more complete description of the objective measure could help determine whether this is a serious issue or just noise in the data.
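
For concreteness, a sketch of the two readouts contrasted above, assuming V_T follows a standard delta-rule update; the function names are illustrative, not the manuscript's.

```python
def rw_update(v_t, outcome, alpha=0.1):
    """One Rescorla-Wagner step; 'outcome' is 1 if tone T occurred, else 0."""
    delta = outcome - v_t          # signed prediction error
    return v_t + alpha * delta

def signed_expectedness(v_t, stimulus_expected):
    """The reviewer's reading of the manuscript's metric: V_T, sign-inverted
    for unexpected stimuli, so it is negative on every unexpected trial."""
    return v_t if stimulus_expected else -v_t

def unsigned_surprise(v_t, outcome):
    """The alternative raised in the review: |outcome - V_T|, never negative."""
    return abs(outcome - v_t)
```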

Gabor patches in Experiments 2 and 3 seem to have been presented at quite a high contrast (I did not find this information), and accuracy seems to saturate at 100%. What was the distribution of error rates, i.e., how many participants were so close to 100% that there was no point in including them in the analysis?

In the second preregistration, the authors announced that BIC comparisons between the full model and the objective model would test whether subjective measures capture additional variance [...] beyond objective prediction error. This is also the conclusion reached in sections 3.3 and 4.3. The model comparison, however, is performed by selecting the best of three models, excluding the null model. It seems that the full model still wins over the objective model, but sometimes quite marginally. Could the authors not test the significance of the model comparison, since the models are nested?
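
Because the objective model is nested in the full model, the significance test suggested here would be a standard likelihood-ratio test. A minimal sketch with statsmodels follows, using placeholder variable names rather than the manuscript's actual predictors, and ordinary least squares in place of whatever model structure the authors used:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

# Placeholder data; 'rt', 'objective', 'subjective' are illustrative names
rng = np.random.default_rng(2)
n = 500
df = pd.DataFrame({"objective": rng.normal(size=n),
                   "subjective": rng.normal(size=n)})
df["rt"] = 0.5 * df.objective + 0.3 * df.subjective + rng.normal(size=n)

reduced = smf.ols("rt ~ objective", df).fit()            # objective-only model
full = smf.ols("rt ~ objective + subjective", df).fit()  # full model

lr = 2 * (full.llf - reduced.llf)        # likelihood-ratio statistic
ddf = full.df_model - reduced.df_model   # extra parameters in the full model
p = stats.chi2.sf(lr, ddf)
print(f"LR = {lr:.2f}, df = {ddf:.0f}, p = {p:.4g}")
```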

Reviewer #3 (Public review):

Summary:

Clarke et al. investigate the role of subjective representations of task-based statistical structure on choice accuracy and reaction times during perceptual decision-making. Subjective representations of objective statistical structure are often overlooked in studies of predictive processing and, consequently, little is known about their role in predictive phenomena. By gauging the subjective experience of stimulus probability, expectedness, and surprise in tasks with fixed cue-stimulus contingencies, the authors aimed to separate subjective and objective (task-induced) contributions to predictive effects on behaviour.

Across three different experiments, subjective and objective contributions to predictions were found to explain unique portions of variance in reaction time data. In addition, choice accuracy was best predicted by subjective representations of statistical structure in isolation. These findings reveal that the subjective experience of statistical regularities may play a key role in the predictive processes that shape perception and cognition.

Strengths:

This study combines careful and thorough behavioral experimentation with an innovative focus on subjective experience in predictive processing. By collecting three independent datasets with different perceptual decision-making paradigms, the authors provide converging evidence that subjective representations of statistical structure explain unique variance in behavior beyond objective task structure. The analysis strategy, which directly contrasts the contributions of subjective and objective predictors, is conceptually rigorous and allows clear insight into how subjective and objective influences shape behavior. The methods are consistently applied across all three datasets and produce coherent results, lending strong support to the authors' conclusions. The study emphasizes the critical role of subjective experience in predictive processing, with implications for understanding learning, perception, and decision-making.

Weaknesses:

Despite these strengths, there are several conceptual and technical issues that should be addressed. In Experiments 2 and 3, the authors use a Rescorla-Wagner (RW) learning model to estimate trialwise expectedness (Experiment 2) and surprise (Experiment 3). While the RW model is well established for explaining learning behavior, it does not represent the objective 'ground truth' statistical structure of the environment, and treating RW trajectories as such imposes assumptions about learning that may not match participants' actual behavior. This assumption could strongly affect the comparison between subjective and 'objective' predictors. It would strengthen the primary conclusions of the manuscript if other implementations of the objective statistical structure, such as the true task-defined probabilities (i.e., 25% or 75%), were considered to provide a complementary 'ground truth' perspective.

Additionally, because objective statistical structure was predictive of subjective ratings in all three experiments, these predictors are likely collinear in the full model. Collinearity can lead to inflated standard errors and unstable coefficient estimates, even if the models converge. Currently, this potentially critical problem for the applied statistical models is not assessed, reported on, or controlled for (e.g., by residualizing predictors; see the sketch after this paragraph). RW trajectories are also not reported in the manuscript, limiting the ability to assess how the model evolves over time and whether it maps onto the task-induced probabilities in a sensible way. This is particularly relevant because participants' subjective estimates of the task-induced probabilities seem to converge to the ground truth after just a few trials, especially for the 75% stimuli (Figure 3C).
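
As one way to address this, a minimal sketch of a collinearity check (variance inflation factors) followed by the residualization proposed above; the data and variable names are placeholders, not the manuscript's:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Placeholder predictors, deliberately correlated
rng = np.random.default_rng(3)
n = 500
objective = rng.normal(size=n)
subjective = 0.8 * objective + rng.normal(scale=0.5, size=n)
X = pd.DataFrame({"objective": objective, "subjective": subjective})

# Collinearity check: VIFs above roughly 5 are commonly flagged as problematic
Xc = sm.add_constant(X)
vifs = {col: variance_inflation_factor(Xc.values, i)
        for i, col in enumerate(Xc.columns) if col != "const"}
print(vifs)

# Residualize: keep only the part of 'subjective' unexplained by 'objective',
# which is orthogonal to 'objective' by construction
aux = sm.OLS(X["subjective"], sm.add_constant(X["objective"])).fit()
X["subjective_resid"] = aux.resid
```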
