Statistical learning shapes pain perception and prediction independently of external cues

  1. Jakub Onysk  Is a corresponding author
  2. Nicholas Gregory
  3. Mia Whitefield
  4. Maeghal Jain
  5. Georgia Turner
  6. Ben Seymour
  7. Flavia Mancini  Is a corresponding author
  1. Computational and Biological Learning Unit, Department of Engineering, University of Cambridge, United Kingdom
  2. Applied Computational Psychiatry Lab, Max Planck Centre for Computational Psychiatry and Ageing Research, Queen Square Institute of Neurology and Mental Health Neuroscience Department, Division of Psychiatry, University College London, United Kingdom
  3. MRC Cognition and Brain Sciences Unit, University of Cambridge, United Kingdom
  4. Wellcome Centre for Integrative Neuroimaging, John Radcliffe Hospital, Headington, United Kingdom
  5. Center for Information and Neural Networks (CiNet), Japan


The placebo and nocebo effects highlight the importance of expectations in modulating pain perception, but in everyday life we don’t need an external source of information to form expectations about pain. The brain can learn to predict pain in a more fundamental way, simply by experiencing fluctuating, non-random streams of noxious inputs, and extracting their temporal regularities. This process is called statistical learning. Here, we address a key open question: does statistical learning modulate pain perception? We asked 27 participants to both rate and predict pain intensity levels in sequences of fluctuating heat pain. Using a computational approach, we show that probabilistic expectations and confidence were used to weigh pain perception and prediction. As such, this study goes beyond well-established conditioning paradigms associating non-pain cues with pain outcomes, and shows that statistical learning itself shapes pain experience. This finding opens a new path of research into the brain mechanisms of pain regulation, with relevance to chronic pain where it may be dysfunctional.

eLife assessment

This study presents a valuable insight into a computational mechanism of pain perception. The evidence supporting the authors' claims is compelling. The work will be of interest to pain researchers working on computational models and cognitive mechanisms of pain in a Bayesian framework.


Clinical pain typically varies over time; in most pain states, the brain receives a stream of volatile and noisy noxious signals, which are also auto-correlated in time. The temporal structure of these signals is important, because the human brain has evolved the exceptional ability to extract regularities from streams of auto-correlated sensory signals, a process called statistical learning (Dehaene et al., 2015; Schapiro and Turk-Browne, 2015; Fiser and Lengyel, 2019; Meyniel et al., 2016; Kourtzi and Welchman, 2019; Sherman et al., 2020; Turk-Browne et al., 2009). In the context of pain, statistical learning can allow the brain to predict future pain, which is crucial for orienting behaviour and maximising well-being (Mancini et al., 2022; Mulders et al., 2023). Statistical learning might also be fundamental to the ability of the nervous system to endogenously regulate pain. Indeed, statistical learning generates predictions about forthcoming pain. We already know that pain expectations can modulate pain levels by gating the reciprocal transmission of neural signals between the brain and spinal cord, as shown by previous work on placebo and nocebo effects (Tracey, 2010; Tinnermann et al., 2017; Eippert et al., 2009; Geuter and Büchel, 2013; Fields, 2018).

By using temporal sequences of noxious inputs, we have previously shown that the pain system supports the statistical learning of the basic rate of getting pain by engaging both somatosensory and supramodal cortical regions (Mancini et al., 2022). Specifically, both sensorimotor cortical regions and the ventral striatum encode probabilistic predictions about pain intensity, which are updated as a function of learning by engaging parietal and prefrontal regions. According to a Bayesian inference framework, both the predictive inference and its confidence should, in principle, modulate the neural response to noxious inputs and affect perception, as a function of learning. In support of this conjecture, there is evidence that the confidence of probabilistic pain predictions modulates the cortical response to pain (Mulders et al., 2023). The relationship is inverse: the lower the confidence, the higher is the early cortical response to noxious inputs (and vice versa), as measured by EEG. This is expected based on Bayesian inference theory: when confidence is low, the brain relies less on his prior beliefs and more on sensory evidence to respond to the input. Bayesian inference theory also predicts that prior expectations and their confidence scale perception (Knill and Richards, 1996). Thus, we hypothesise that the predictions generated by learning the statistics of noxious inputs in dynamically evolving sequences of stimuli modulate the perception of forthcoming inputs.

Previously, it was found that pain perception is strongly influenced by probabilistic expectations as defined by a cue that predicts high or low pain (Jepma et al., 2018). In contrast to such cue paradigm, the primary aim of our experiment was to determine whether the expectations participants hold about the sequence itself inform their perceptual beliefs about the intensity of the stimuli. To that end, we recruited 27 healthy participants to complete a psycho-physical experiment where we delivered four different, 80-trial-long sequences of evolving thermal stimuli, with four levels of temporal regularity. On each trial, a 2 s thermal stimulus was applied, following which participants were asked to either rate their perception of the intensity (Figure 1A) or to predict the intensity of the next stimulus in the sequence (Figure 1B). Participants also reported their response confidence.

Task design.

On each trial, each participant received a thermal stimulus lasting 2s from a sequence of intensities. This was followed by a perception (A) or a prediction (B) input screen, where the y-axis indicates the level of perceived/predicted intensity (0–100) centred around participant’s pain threshold, and the x-axis indicates the level of confidence in one’s perception (0–1). The inter-stimulus interval (ISI; black screen) lasted 2.5s (trial example in C). (D) Example intensity sequences are plotted in green, participant’s perception and prediction responses are in red and black, respectively. (E) Participant’s confidence rating for perception (red) and prediction (black) trials.

We contrasted four models of statistical learning, which varied according to the inference strategy used (i.e. optimal Bayesian inference or a heuristic) and the role of expectations on perception. All models used confidence ratings to weigh the inference. We anticipate that probabilistic learning weighted by confidence and expectations modulates pain perception. This provides behavioural evidence for a link between learning and endogenous pain regulation. One reason why this is important is that it might help understand individual differences in the ability to endogenously regulate pain. This is particularly relevant for chronic pain, given that endogenous pain regulation can be dysfunctional in several chronic pain conditions (Bushnell et al., 2013; Bruehl et al., 1999; Yarnitsky, 2015; King et al., 2009; Bannister and Dickenson, 2017), even before chronic pain develops (Tracey, 2016). Although there is ample evidence for changes in the functional anatomy and connectivity of endogenous pain modulatory systems in chronic pain, their computational mechanisms are poorly understood.


Model-naive performance

Prior to modelling, we first checked whether participant’s performance in the task was affected by the level of temporal regularity, i.e., the sequence condition. We varied the level of volatility and stochasticity across blocks (i.e. conditions), whilst we fixed their overall level within each block; the level of volatility was defined by the number of trials until the mean intensity level changes. The stochasticity is the additional noise that is added on each trial to the underlying mean, often referred to as the observation noise. The changes were often subtle and participants were not informed when they happened. We set two levels (low/high) of each type of uncertainty, achieving a 2×2 factorial design, with the order of conditions randomised across participants. A set of four example sequences of thermal intensities delivered to one of the participants can be found in Figure 1C, alongside their ratings of perception and predictions. Additionally, example confidence ratings for each type of response are plotted in Figure 1D. Appendix 1—figures 2 and 3 show the plots of each participant’s responses superimposed onto the sequences of noxious inputs.

As a measure of performance, we calculated the root mean square error (RMSE) of participants responses (ratings and predictions) compared to the normative noxious input for each condition as in Figure 2 (see also Materials and methods). The lower the RMSE, the more accurate participants’ responses are. Performance in different conditions was analysed with a repeated measures ANOVA, whose results are reported in full in Appendix 1—table 1. Although volatility did not affect rating accuracy (F(1,26)=0.96,p=0.336,ηp2=0.036), we found a two-way interaction between the level of stochasticity of the sequence (low, high) and the type of rating provided (perceived intensity vs. prediction) (F(1,26)=29.842,p<0.001,ηp2=0.534). We followed up this interaction in post hoc comparisons, as reported in Appendix 1—table 2. The performance score differences between all the pairs of stochasticity and response type interactions were significant, apart from the perception ratings in the stochastic environment as compared with perception and prediction performance in the low stochastic setting. Intuitively, the RMSE score analysis revealed an overall trend of participants performing worse on the prediction task, in particular when the level of stochasticity is high.

Participant’s model-naive performance in the task.

Violin plots of participant root mean square error (RMSE) for each condition for A: rating and B: prediction responses as compared with the input. Lower and upper hinges correspond to the first and third quartiles of partipants’ errors (the upper/lower whisker extends from the hinge to the largest/smallest value no further than 1.5 * ”Interquartile range” from the hinge); the line in the box corresponds to the median. Each condition has N=27 particpants.

Modelling strategy

Our models were selected a priori, following the modelling strategy from Jepma et al., 2018, and hence considered the same set of core models for clear extension of the analysis to our non-cue paradigm. The key question for us was whether expectations were used to weigh the behavioural estimates during sequence learning. Therefore, we compared Bayesian and non-Bayesian models of sequential learning that weighted their ratings based on prior expectations versus two corresponding models that assumed perfect perception (i.e. not weighted by prior beliefs). As a baseline, we included a random response model (please see Materials and methods for a formal treatment of the computational models).

According to an optimal Bayesian inference strategy, on each trial, participants update their beliefs about the feature of interest (thermal stimuli) based on probabilistic inference, maintaining a full posterior distribution over its values (Jepma et al., 2018; Särkkä, 2013). Operating within a Bayesian paradigm, participants are assumed to track and, following new information, update both the mean of the sequence of interest and the uncertainty around it (Hoskin et al., 2019). In most cases, such inference makes an assumption about environmental dynamics. For example, a common assumption is that the underlying mean (a hidden/latent state) evolves linearly according to a Gaussian random walk, with the rate of this evolution defined by the the variance of this Gaussian walk (volatility). The observed value is then drawn from another Gaussian with that mean, which has some observation noise (stochasticity). In this case, the observer can infer the latent states through the process of Bayesian filtering (Särkkä, 2013), using the Kalman filter (KF) algorithm (Kalman, 1960; Figure 3B).

Expectation weighted models.

Computational models used in the main analysis to capture participants’ pain perception (P^t) and prediction (E^t+1) ratings. Both types of ratings are affected by confidence rating (Ct)) on each trial. (A) In the reinforcement learning model, participant’s pain perception (Pt) is taken to be weighted sum of the current noxious input (Nt) and their current pain expectation (Et). Following the noxious input, participant updates their pain expectation (Et+1). (B) In the Kalman filter model, a generative model of the environment is assumed (yellow background) - where the mean pain level (xt) evolves according to a Gaussian random walk (volatility v2). The true pain level on each trial (πt) is then drawn from a Gaussian (stochasticity s2). Lastly, the noxious input, Nt, is assumed an imperfect indicator of the true pain level (subjective noise ϵ2). Inference and prediction steps are depicted in a blue box. Participant’s perceived pain is a weighted sum of expectation about the pain level (mt) and current noxious input (Nt). Following each observation, Nt, participant updates their expectation about the pain level (mt+1).

Sequence learning can also be captured by a heuristic to the Bayesian inference, i.e., a simple reinforcement learning (RL) rule. Here, participants maintain and update a point estimate of the expected value of the sequence in an adaptive manner, within a non-stationary environment. RL explicitly involves correcting the tracked mean of the sequence proportionally to a trial-by-trial prediction error - a difference between the expected and actual value of the sequence (Sutton and Barto, 2018; Figure 3A). Importantly, RL agents do not assume any specific dynamics of the environment and hence are considered model-free.

Both models perform a form of error correction about the underlying sequence. The rate at which this occurs is captured by the learning rate α[0,1] element. The higher the learning rate, the faster participants update their beliefs about the sequence after each observation. For the RL model, the learning rate α is a free parameter that is constant across the trials. On the other hand, the learning rate in the KF model αt (known as the Kalman gain) is calculated on every trial. It depends on participants’ trial-wise belief uncertainty as well as their overall estimation of the inherent noise in the environment (stochasticity, s). The belief uncertainty is updated after each observation and depends on participants’ sense of volatility (v) and stochasticity (s) in the environment.

Crucially, we also used participants’ trial-by-trial confidence ratings to measure to what extent confidence plays a role in learning. This is captured by the confidence scaling factor C, which defines the extent to which confidence affects response (un-)certainty. Intuitively, the higher the confidence scaling factor C, the less important role confidence plays in participant’s response. With relatively low values of C, when the confidence is low, participants’ responses are more noisy, i.e., less certain. We demonstrate this in Figure 4 by plotting hypothetical responses (A–F) and the effect on the noise scaling (G–L) as a function of C and confidence ratings.

Confidence scaling factor demonstration.

(A–F) For a range of values of the confidence scaling factor C, we simulated a set of typical responses a participant would make for various levels of confidence ratings. The belief about the mean of the sequence is set at 50, while the response noise at 10. The confidence scaling factor C effectively scales the response noise, adding or reducing response uncertainty. (G–L) The effect of different levels of parameter C on noise scaling. As C increases the effect of confidence is diminished.

To evaluate the effect of expectation on perceived intensity (on top of statistical learning modulating perception), we expanded the standard RL and KF models by adding a perceptual weighting element, γ[0,1] (similarly to Jepma et al., 2018). Essentially, γ governs how much each participant relies on the normative input on each trial, and how much their expectation of the input influences their reported perception - i.e., they take a weighted average of the two. The higher the γ, the bigger the impact of the expectation on perception. Again, in the case of the RL model (eRL - expectation weighted RL), γ is a free parameter that is constant across trials, while in the KF model (eKF - expectation weighted KF), γt is calculated on every trial and depends on: (1) the participants’ trial-wise belief uncertainty, (2) their overall estimation of the inherent noise in the environment (stochasticity, s), and (3) the participant’s subjective uncertainty about the level of intensity, ϵ.

Thus, in total we tested five models: RL and KF (perception not weighted by expectations), eRL and eKF (perception weighted by expectations), and a baseline random model. We then proceeded to fit these five computational models to participants’ responses. For parameter estimation, we used hierarchical Bayesian methods, where we obtained group- and individual-level estimates for each model parameter (see Materials and methods).

Modelling results

We fit each model for each condition sequence. Example trial-by-trial model prediction plots from one participant can be found in Appendix 1—figure 4. To establish which of the models fitted the data best, we ran model comparison analysis based on the difference in expected log point-wise predictive density (ELPD) between models. The models are ranked according to the ELPD (with the largest providing the best fit). The ratio between the ELPD difference and the standard error around it provides a significance test proxy through the sigma effect. We considered at least a 2 sigma effect as indication of a significant difference. In each condition, the expectation weighted models (eKF and eRL) provided better fit than models without this element (KF and RL; approximately 2–4 sigma difference, as reported in Figure 5A–D) and Appendix 1—table 5. This suggests that regardless of the levels of volatility and stochasticity, participants still weigh perception of the stimuli with their expectation. In particular, we found that the expectation weighted KF model offered a better fit than the eRL, although in conditions of high stochasticity this difference was short of significance against the eRL model. This suggests that in learning about temporal regularities in the sequences of thermal stimuli, participants’ expectations modulate the perception of the stimulus. Moreover, this process was best captured by a model that updates the observer’s belief about the mean and the uncertainty of the sequence in a Bayesian manner.

Model comparison for each sequence condition (A–D).

The dots indicate the expected log point-wise predictive density (ELPD) difference between the winning model (eKF - expectation weighted Kalman filter) and every other model. The line indicates the standard error (SE) of the difference. The non-winning models’ ELPD differences are annotated with the ratio between the ELPD difference and SE indicating the sigma effect, a significance heuristic.

We also found that as the confidence in the response decreases, the response uncertainty is scaled linearly with a negative slope ranging between 0.112 and 0.276 across conditions (Figure 6), confirming the intuition that less confidence leads to bigger uncertainty.

The effect of the confidence scaling factor on noise scaling for each condition.

(A–D) Each coloured line corresponds to one participant, with the black line indicating the mean across all participants. The mean slope for each condition is annotated.

As an additional check, for each participant, condition and response type (perception and prediction), we plotted participants’ ratings against model predicted ratings and calculated a grand mean correlation in Appendix 1—figure 5.

Next, we checked whether the parameters of the the winning eKF model differed across different sequence conditions. Given that volatility was fixed within condition, we treated it as a single-context scenario from the point of view of modelling (Heald et al., 2023), and we did not interpret its effect on the learning rate (Piray and Daw, 2021). There were no differences for the group-level parameters; i.e., we did not detect significant differences between conditions in a hypothetical healthy participant group as generalised from our population of participants (Appendix 1—figure 12).

However, we found some differences at the individual level of parameters (i.e. within our specific population of recruited participants), which we detected by performing repeated measures ANOVAs (see Appendix 1—figure 13 for visualisation). The stochasticity parameter s was affected by the interaction between the levels of stochasticity and volatility (F(1,26)=35.108,p<0.001,ηp2=0.575), and was higher in highly stochastic and volatile conditions as compared to conditions where either volatility (t=7.735,pbonf<0.001), stochasticity (t=9.396,pbonf<0.001), or both were low (t=8.826,pbonf<0.001). This suggests that, while participants’ performance was generally worse in highly stochastic environments, participants seem to have attributed this to only one source - stochasticity (s), regardless of the source of higher uncertainty in the sequence (stochasticity or volatility).

The response noise ξ was modulated by the level of volatility (F(1,26)=5.079,p=0.033,ηp2=0.163), where it was smaller in highly volatile conditions. Moreover, we detected a significant interaction between volatility and stochasticity on the confidence scaling factor C (F(1,26)=81.258,p<0.001,ηp2=0.758), where the values C were overall lower when either volatility (t=11.570,pbonf<0.001), stochasticity (t=6.165,pbonf<0.001), or both (t=4.575,pbonf<0.001) were high as compared to the conditions where both levels of noise were low. This indicates there may have been some trade-off between ξ and C, as lower values of C introduce additional uncertainty when participant’s confidence is low.

Lastly, we found the initial uncertainty belief w0 was affected by the interaction between volatility and stochasticity (F(1,26)=5275.367,p<0.001,ηp2=0.995) without a consistent pattern. All the other effects were not significant.

In summary, we formalised the process behind pain perception and prediction in noxious time-series within the framework of sequential learning, where the best description of participants’ statistical learning was captured through Bayesian filtering, in particular using a confidence weighted KF model. Most importantly, we discovered that, in addition to weighing their responses with confidence, participants used their expectations about stimulus intensity levels to form a judgement as to what they perceived. This mechanism was present across various levels of uncertainty that defined the sequences (volatility and stochasticity).


Statistical learning allows the brain to extract regularities from streams of sensory inputs and is central to perception and cognitive function. Despite its fundamental role, it has often been overlooked in the field of pain research. Yet, chronic pain appears to fluctuate over time. For instance Mayr et al., 2022; Baliki et al., 2012; Foss et al., 2006, reported that chronic back pain ratings vary periodically, over several seconds-minutes and in absence of movements. This temporal aspect of pain is important because periodic temporal structures are easy to learn for the brain (Dehaene et al., 2015; Mancini et al., 2022). If the temporal evolution of pain is learned, it can be used by the brain to regulate its responses to forthcoming pain, effectively shaping how much pain it experiences. Indeed, we show that healthy participants extract temporal regularities from sequences of noxious stimuli and use this probabilistic knowledge to form confidence weighted judgements and predictions about the level of pain intensity they experience in the sequence. We formalised our results within a Bayesian inference framework, where the belief about the level of pain intensity is updated on each trial according to the amount of uncertainty participants ascribe to the stimuli and the environment. Importantly, their perception and prediction of pain were influenced by the expected level of intensity that participants held about the sequence before responding. When varying different levels of inherent uncertainty in the sequences of stimuli (stochasticity and volatility), the expectation and confidence weighted models fitted the data better than models weighted for confidence but not for expectations (Figure 5A–D). The expectation weighted Bayesian (eKF) model offered a better fit than the expectation weighted, model-free RL model, although in conditions of high stochasticity this difference was short of significance. Overall, this suggests that participants’ expectations play a significant role in the perception of sequences of noxious stimuli.

Statistical inference and learning in pain sequences

The first main contribution of our work is towards the understanding of the phenomenon of statistical learning in the context of pain. Statistical learning is an important function that the brain employs across the lifespan, with relevance to perception, cognition, and learning (Sherman et al., 2020). The large majority of past research on statistical learning focused on visual and auditory perception (Dehaene et al., 2015; Fiser and Lengyel, 2019; Meyniel et al., 2016; Meyniel and Dehaene, 2017), with the nociceptive system receiving relatively little attention (Tabor et al., 2017). Recently, we showed that the human brain can learn to predict a sequence of two pain levels (low and high) in a manner consistent with optimal Bayesian inference, by engaging sensorimotor regions, parietal, premotor regions, and dorsal striatum (Mancini et al., 2022). We also found that the confidence of these probabilistic inferences modulates the cortical response to pain, as expected by hierarchical Bayesian inference theory (Mulders et al., 2023). Here, we tested sequences with a much larger range of stimulus intensities to elucidate the effect of statistical learning and expectations on pain perception. As predicted by hierarchical Bayesian inference theory, we find that the pain intensity judgements are scaled by both expectations and confidence.

Hence, our work highlights the inferential nature of the nociceptive system (Jepma et al., 2018; Tabor et al., 2017; Fardo et al., 2017; Seymour and Mancini, 2020; Büchel et al., 2014), where in addition to the sheer input received by the nociceptors, there is a wealth of a priori knowledge and beliefs the agent holds about themselves and the environment that need to be integrated to form a judgement about pain (Yoshida et al., 2013; Anchisi and Zanon, 2015; Wiech, 2016; Tabor and Burr, 2019). This has an immediate significance for the real world, where weights need to be assigned to prior beliefs and/or stimuli to successfully protect the organism from further damage, but only to an extent to which it is beneficial.

Secondly, our results regarding the effect of expectation on pain perception relate to a much larger literature on this topic. The prime example would be placebo analgesia (i.e. the expectation of pain relief decreasing pain perception) and nocebo hyperalgesia (i.e. the expectation of high level of pain increasing its perception; Tracey, 2010; Büchel et al., 2014; Colloca et al., 2008; Blasini et al., 2017). Recent work attempted to capture such expectancy effects within the Bayesian inference framework. For example, Hoskin et al., 2019, showed that in addition to expectation influencing perceived pain in general, higher level of uncertainty around that expectation attenuated its effect on perception. Similarly, Hird et al., 2019, demonstrated that when the discrepancy between the expectation and outcome (prediction error) is unusually large, the role of expectation is significantly reduced and so the placebo and nocebo effects are not that strong. An unusually large prediction error could be thought of as contributing to increased uncertainty about the stimuli, which mirrors the results from Hoskin et al., 2019 Bayesian framework. Nevertheless, the types of stimuli used in the above studies (i.e. noxious stimuli cued by non-noxious stimuli) differed from the more ecologically valid sequences of pain that are reported by chronic pain patients (Mayr et al., 2022), as we indicated above. Furthermore, Jepma et al., 2018, used a conditioning paradigm and also found that expectations influence both perception and learning, in a self-reinforcing loop. Our work has followed a similar modelling strategy to Jepma et al., 2018, but it goes beyond simple conditioning schedules or sequences of two-level discrete painful stimuli, showing expectancy effects even when the intensities are allowed to vary across a wider range of values and according to more complex statistical temporal structures. Additionally, given the reported role of confidence in the perception of pain (Mulders et al., 2023; Brown et al., 2008), we draw a more complete picture by including participants’ confidence ratings in our modelling analysis.

Future studies would need to determine whether statistical learning and its effect on pain is altered in chronic pain conditions. This is important because statistical learning could, in principle, influence how a pain state evolves. Once a pain state is initiated, how an individual learns and anticipates the fluctuating pain signals may contribute to determine how well it can be regulated by the nervous system, thus affecting the severity and recurrence of pain flares. This, in turn, would affect whether aversive associations with the instigating stimulus are extinguished or reinforced (Seymour and Mancini, 2020). In chronic pain, dysfunctional learning may promote the amplification and maintenance of pain signals, contributing to the reinforcement of aversive associations with incident stimuli, as well as the persistence of pain (Seymour, 2019; Baliki and Apkarian, 2015; Vlaeyen et al., 2016).

Our paper comes with open tools, which can be adapted in future studies on statistical learning in chronic pain. The key advantage of taking an hypothesis-driven, computational-neuroscience approach to quantify learning is that it allows to go beyond symptoms-mapping, identifying the quantifiable computational principles that mediate the link between symptoms and neural function.

In summary, we show that statistical expectations and confidence scale the judgement of pain in sequences of noxious stimuli as predicted by hierarchical Bayesian inference theory, opening a new avenue of research on the role of learning in pain.

Materials and methods


Thirty-three (18 female) healthy adult participants were recruited for the experiment. The mean age of participants was 22.4±2.7 years of age (range: 18–35). Participants had no chronic condition and no infectious illnesses, as well as no skin conditions (e.g. eczema) at the site of stimulus delivery. Moreover, we only recruited participants that had not taken any anti-anxiety, anti-depressive medication, nor any illicit substances, alcohol and pain medication (including NSAIDs such as ibuprofen and paracetamol) in the 24 hr prior to the experiment. All participants gave informed written consent to take part in the study, which was approved by the local ethics committee (Department of Engineering, University of Cambridge Ethics Committee).


Request a detailed protocol

The experimental room’s temperature was maintained between 20°C and 23°C. Upon entry, an infrared thermometer was used to ensure participants’ temperature was above 36°C at the forehead and forearm of the non-dominant hand, to account for the known effects of temperature on pain perception (Strigo et al., 2000). A series of slideshows were presented, which explained the premise of the experiment and demonstrated what the participant would be asked to carry out. Throughout this presentation, questions were asked to ensure participants understood the task. Participants were given multiple opportunities to ask questions throughout the presentation.

We used the Medoc Advanced Thermosensory Stimulator 2 (TSA2) (Medoc Advanced Medical Systems, 2022) to deliver thermal stimuli using the CHEPS thermode. The CHEPS thermode allowed for rapid cooling (40°C/s) and heating (70°C/s) so transitions between the baseline and stimuli temperatures were minimal. The TSA2 was controlled externally, via Matlab (Mathworks).

We then established the pain threshold, using the method of limits (Lue and Shih, 2017), in order to centre the range of temperature intensities used in the experiment. Each participant was provided with stimuli of increasing temperature, starting from 40°C going up in 0.5°C increments, using an inter-stimulus interval (ISI) of 2.5 s and a 2 s duration. The participant was asked to indicate when the stimuli went from warm to painful - this temperature was noted and the stimuli ended. The procedure was repeated three times, and the average was used as an estimate of the pain threshold.

During the experiment, four sequences of thermal stimuli were delivered. Due to the phenomenon of offset analgesia, where decreases in tonic pain result in a proportionally larger decrease in perceived pain (Hermans et al., 2016), we chose phasic stimuli, with a duration of 2 s and an ISI 2.5 s. In order to account for individual differences, the temperatures which the levels refer to are based upon the participants’ threshold. The median intensity level was defined as threshold, giving a max temperature of 3°C above threshold, which was found to be acceptable by participants. Before the start of the experiment each participant was provided with the highest temperature stimuli that could be presented, given their measured threshold, to ensure they were comfortable with this. Two participants found the stimulus too painful - the temperature range was lowered by 1°C and this was found to be acceptable.

After every trial of each sequence, the participant was asked for either their perception of the previous stimulus or their prediction for the next stimulus through a 2D VAS (Figure 1B), presented using PsychToolBox-3 (Kleiner, 2007). The y-axis encodes the intensity of the stimulus either perceived or predicted, ranging from 0 (no heat detected/predicted) to 100 (worst pain imaginable perceived/predicted); on this scale, 50 represents the pain threshold. This was done as a given sequence was centred around the threshold. The x-axis encodes confidence in either perception or prediction, ranging from 0 - completely uncertain (‘unsure’) - to 1 - complete confidence in the rating (‘sure’). Differing background colours were chosen to ensure participants were aware of what was being asked, and throughout the experiment participants were reminded to take care in answering the right question. The mouse movement was limited to be inside of the coloured box, which defined the area of participants’ input. At the beginning of each input screen, the mouse location was uniformly randomised within the input box.

The sequence of response types was randomised so as to retain 40 prediction and 40 perception ratings for each of the four sequence conditions. For an 80-trial-long sequence, this gave 80 participant responses. Each sequence condition was separated by a 5 min break, during which the thermode’s probe was slightly moved around the area of skin on the forearm to reduce sensitisation (i.e. a gradual increase in perceived intensity with repetitive noxious stimuli) (Hollins et al., 2011). In the middle of each sequence, there was a 3 min break. During the ISI, the temperature returned to a baseline of 38°C. One participant was unable to complete the sequence as their threshold was too low, and data from four participants was lost due to Medoc software issues (the remote control failed and the data of two out of four sessions were not saved). We excluded one participant’s whose ratings/predictions were inversely proportional to the noxious input. Thus, we analysed data from 27 participants.

Generative process of the painful sequences

Request a detailed protocol

We manipulated two sources of uncertainty in the sequence: the stochasticity (s) of the observation and the volatility (v) of the underlying sequence (Piray and Daw, 2021). Sequences were defined by two levels (high or low) of stochasticity and volatility, resulting in four different sequences conditions - creating a 2×2 factorial design. Each sequence was defined as a series of chunks, where the intensity for trial t, it, was sampled from N(I,σ2), where σ2 indicates the level of stochasticity (σ2=1.75 for high level of stochasticity, σ2=0.25 for low level of stochasticity). The mean of the chunk, I, was drawn from U(3.5,10.5). To ensure a noticeable difference in chunk intensity to the participant, concurrent chunk means were constrained to be at least two intensity levels different. Volatility was implemented by defining the length, or number of trials, of a chunk (l) drawn from U(La,L+a), where L is the mean o the chunk length (L=15 for high volatility level, L=25 for low volatility level). A jitter, a, was added around the mean to ensure the transition from one chunk to the next was not consistent or predictable. For both high and low volatility conditions, we set a=3. Sampled values were then discretised, where any intensities outside the valid intensity range [1,13] were discarded and re-sampled resulting in an 80-trial-long sequence for each condition. The mean of each sequence was centred around intensity level 7, i.e., the participants threshold. So defined, six sets of four sequences were sampled. Each participant received one set, with a randomised sequence order. See an example sequence (after subject-specific linear transformation) and one participant’s responses (including confidence ratings) in Figure 1C and D.

Data pre-processing

Request a detailed protocol

Since the intensity values of the noxious input were discretised between 1 and 13, while the participant’s responses (perception and prediction) were given on a 0–100 scale, we applied a linear transformation of the input to map its values onto a common 0–100 range. For each participant, for a set of inputs at perception trials from the concatenated sequence (separate sequence conditions in the order as presented), we fit a linear least squares regression using Python’s scipy.stats.linregress function. On rare occasions, when the transformed input was negative, we refit the line using Python’s non-linear least squares function scipy.optimize.curve_fit, constraining the intercept above 0 (Virtanen et al., 2020). We then extracted each participant’s optimised slope and intercept and applied the transformation to both the concatenated and condition-specific sequence of inputs. So transformed, the sequences were then used in all the analyses. Plots of each participant transformation can be found in Appendix 1—figure 1. We superimposed participant’s responses onto the noxious input condition sequences in Appendix 1—figure 2.

To capture participant’s model-naive performance in the task, both for the concatenated and condition-specific sequence, we calculated RMSE of each participant’s perception (Equation 1) and prediction (Equation 2) responses as compared to the input. The lower the RMSE, the higher the response accuracy.

(1) RMSEP=t=1TP(YtP^t)TP
(2) RMSEE=t=1TE(yt+1E^t+1)TE

where TP is the number of perception trials, P^t is participant’s perception response to the stimulus yt at trial t, TE is the number of prediction trials, and E^t+1 is participant’s prediction of the next stimulus intensity yt+1 at trial t+1.


Reinforcement learning

Request a detailed protocol
Request a detailed protocol

In RL models, learning is driven by discrepancies between the estimate of the expected value and observed values. Before any learning begins, at trial t=1, participants have an initial expectation, E1=E0, which is a free parameter that we estimate.

On each trial, participants receive a thermal input Nt. We then calculate the prediction error δt, defined as the difference between the expectation Et and the input Nt (Equation 3).

(3) δt=NtEt

Participant is then assumed to update their expectation of the stimulus on the next trial as in Equation 4:

(4) Et+1=Et+αδt

where α is the learning rate (free parameter), which governs how fast participants assimilate new information to update their belief.

On trials when participants rate their perceived intensity, we assume no effects on their perception other than confidence rating ct and response noise, so participants’ perception response P^t is drawn from a Gaussian distribution, with the mean Pt=Nt and a confidence-scaled response noise ξ (free parameter), as in Equation 5:

(5) P^tN(Pt,ξ2exp{C1(1ct)}2)

where C is the confidence scaling factor (free parameter), which defines the extent to which confidence affects response uncertainty. Please see Figure 4 for an intuition behind confidence scaling.

On trials when participants are asked to predict the intensity of the next thermal stimulus, we use the updated expectation Et+1 to model participants’ prediction response E^t+1. This is similarly affected by confidence rating and response noise and is defined as in Equation 6.

(6) E^t+1N(Et+1,ξ2exp{C1(1ct)}2)

To recap, the RL model has four free parameters: the learning rate α, response noise ξ, the initial expectation E0, and the confidence scaling factor C.

Request a detailed protocol

Additionally, where we investigate the effects of expectation on the perception of pain (Jepma et al., 2018), we included an element that allows us to express the perception as a weighted sum of the input and expectation (Equation 7):

(7) Pt=(1γ)Nt+γEt

where γ[0,1] (free parameter) captures how much participants rely on the normative thermal input vs. their expectation. When γ=0, the expectation plays no role and the model simplifies to that of the standard RL above. In total, the eRL model has five free parameters, with the other equations the same as in the RL model, with the exception of the prediction error, which now relies on the expectation weighted pain perception Pt (Equation 8).

(8) δt=PtEt

Kalman filter


Request a detailed protocol

To capture sequential learning in a Bayesian manner, we used the KF model (Jepma et al., 2018; Särkkä, 2013; Kalman, 1960). KF assumes a generative model of the environment where the latent state on trial t, xt (the mean of the sequences in the experiment), evolves according to a Gaussian random walk with a fixed drift rate, v (volatility), as in Equation 9.

(9) xtN(xt1,v2)

The observation on trial t, Nt, is then drawn from a Gaussian (Equation 10) with a fixed variance, which represents the observation uncertainty s (stochasticity).

(10) NtN(xt,s2)

As such the KF assumes stable dynamics since the generative process has fixed volatility and stochasticity.

For ease of explanation, we refer to the thermal input at each trial as Nt, we also use the N1:t notation, which refers to a sequence of observations up to and including trial t. The model allows to obtain posterior beliefs about the latent state xt given the observations. This is done by tracking an internal estimate of the mean mt and the uncertainty, wt, of the latent state xt.

First, following standard KF results, on each trial, the participant is assumed to hold a prior belief (indicated with (–) superscript) about the latent state, xt (Equation 11).

(11) xt|N1:t1N(mt(),wt2())

On the first trial, before any observations, we set m1()=E0,w1()=w0 (free parameters). In light of the new observation, Nt on trial t, the tracked mean and uncertainty of the latent state are reweighed based on the new evidence Nt and its associated observation uncertainty s as in Equation 12.

(12) xt|N1:tN(s2mt()+wt2()Nts2+wt2(),s2wt2()s2+wt2())

We can then define the learning rate αt (Equation 13),

(13) αt=wt2()s2+wt2()

to get the update rule for the new posterior beliefs (indicated with (+) superscript) about the mean (Equation 14) and uncertainty (Equation 15) of xt.

(14) mt(+)=mt()(1αt)+Ntαt
(15) wt2(+)=wt2()(1αt)

Following this new belief, and the assumption about the environmental dynamics (volatility), the participant forms a new prior belief about the latent state xt+1 for the next trial t+1 as in Equation 16.

(16) xt+1|N1:tN(mt+1(),wt+12())


(17) mt+1()=mt(+)
(18) wt+12()=wt2(+)+v2

We can simplify the notation to make it comparable to the RL models. We let mt+1=mt(+)=mt+1(), and wt+12=wt+12()=wt2(+)+v2. Following a new observation at trial t, we calculate the prediction error (Equation 19) and learning rate (Equation 20).

(19) δt=ytmt
(20) αt=wt2wt2+s2

We then update the belief about the mean (Equation 21) and uncertainty (Equation 22) of the latent state for the next trial.

(21) mt+1=mt(1αt)+Ntαt=mt+αt(Ntmt)
(22) wt+12=wt2(1αt)+v2

Now, mapping this onto the experiment, the mean of the latent state is participants’ expectation Et=mt, and so we have participant perception rating modelled as in Equation 23.

(23) P^tN(Pt,ξ2exp{C1(1ct)}2)

and the prediction rating for the next trial as in Equation 24.

(24) E^t+1N(Et+1,ξ2exp{C1(1ct)}2)

In total the model has six free parameters: s (environmental stochasticity), v (environmental volatility), ξ (response noise), E0 (initial belief about the mean), w0 (initial belief about the uncertainty), and C (confidence scaling factor).


Request a detailed protocol

We can introduce the effect of expectation on the pain perception, by assuming that participants treat the thermal input as an imperfect indicator of the true level of pain (Jepma et al., 2018). In this case, the input, Nt, is modelled as in Equation 25:

(25) NtN(πt,ϵ2)

which forms an expression for the likelihood of the observation and adds an additional level to the inference, slightly modifying the KF assumptions such that:

(26) πtN(xt,s2)

However, we can apply the standard KF results and Bayes’ rule to arrive at simple update rules for the participants’ belief about the mean and uncertainty of the latent state xtt. From this, we get a prior on the πt defined in Equation 27:

(27) πt|N1:t1N(mt(),wt2()+s2)

which, following a new input Nt, gives us the posterior belief about πt as in Equation 28.

(28) πt|N1:tN(ϵ2mt()+(s2+wt2())Ntϵ2+s2+wt2(),ϵ2(s2+wt2())ϵ2+s2+wt2())

Now, if we define γt as in Equation 29:

(29) γt=ϵ2ϵ2+s2+wt2()

We have that the posterior belief about the mean level of pain πt is calculated as:

(30) Pt(+)=γtmt()+(1γt)Nt

which is a weighted sum of the input Nt and participant expectation about the latent state xt, governed by the perceptual weight γt, analogously to the eRL model. Finally, the posterior belief about xt is obtained in Equation 31.

(31) xt|N1:tN((ϵ2+s2)mt()+wt2()Ntϵ2+s2+wt2(),(ϵ2+s2)wt2()ϵ2+s2+wt2())

Now, setting the learning rate as in Equation 32:

(32) αt=wt2ϵ2+wt2+s2

we get:

(33) mt(+)=mt()(1αt)+Ntαt
(34) wt2(+)=wt2()(1αt)

Next, following the same notation simplification as before, we get the update rules for the prior belief about the mean (Equation 35) and uncertainty (Equation 36) of the latent state xt+1 for the next trial.

(35) mt+1=mt(1αt)+Ntαt=mt+αt(Ntmt)
(36) wt+12=wt2(1αt)+v2

as well as the expression for subjective perception, Pt, at trial t (Equation 37).

(37) Pt=γtmt+(1γt)Nt

The perception and prediction responses are modelled analogously as the KF model. In total, the model has seven free parameters: ϵ (subjective noise), s (environmental stochasticity), v (environmental volatility), ξ (response noise), E0 (initial belief about the mean), w0 (initial belief about the uncertainty), and C (confidence scaling factor).

Random model

Request a detailed protocol

As a baseline, we also included a model that performs a random guess. The perceptual/prediction ratings were modelled as in Equation 38.

(38) P^tN(R,ξ2exp{C1(1ct)}2)

The model has three free parameters: R, ξ, and C, where R is a constant value that participants respond with.

Model fitting

Model parameters were estimated using hierarchical Bayesian methods, performed with RStan package (v. 2.21.0) (Stan Development Team, 2019) in R (v. 4.0.2) based on Markov Chain Monte Carlo techniques (No-U-Turn Hamiltonian Monte Carlo). For the individual-level parameters we used non-centred parametrisation (Papaspiliopoulos et al., 2007). For the group-level parameters we used N(0,1) priors for the mean, and the gamma-mixture representation of the Student’s-t(3,0,1) for the scale (Stan Development, 2022). Parameters in the (0, 1) range were constrained using Phi_approx - a logistic approximation to the cumulative Normal distribution (Bowling et al., 2009).

For each condition and each of the four chains, we ran 6000 samples (after discarding 6000 warm-up ones). For each condition, we examined R-hat values for each individual- (including the N(0,1) error term from the non-centred parametrisation) and group-level parameters from each model to verify whether the Markov chains have converged. At the group-level and individual-level, all R-hat values had a value <1.1, indicating convergence. In the random response model, 0.01–0.16% iterations saturated the maximum tree depth of 11.

Model comparison

Request a detailed protocol

For model comparison, we used R package loo, which provides efficient approximate leave-one-out (LOO) cross-validation. The package allows to estimate the difference in models’ expected predictive accuracy through the difference in ELPD (Vehtari et al., 2017). By looking at the ratio between the ELPD difference and the SE of the difference, we get the sigma effect - a heuristic for significance of such model differences. There’s no agreed-upon threshold of SEs that determines significance, but the higher the sigma difference, the more robust is the effect. The closeness of fit can also be captured with LOO information criterion (LOOIC), where the lower LOOIC values indicate better fit.

Parameter comparison

Request a detailed protocol

For the comparison of group-level parameters between conditions, we extracted 95% high-density intervals of the permuted and merged (across chains) posterior samples of each group-level parameter (Kruschke, 2023). To assess significant differences between conditions, we calculated a difference between such defined intervals. In the Bayesian scenario, a significant difference is indicated by the interval not containing the value 0 (Aylward et al., 2019; Ahn et al., 2017).

Parameter and model recovery

Request a detailed protocol

To asses the reliability of our modelling analysis (Wilson and Collins, 2019), for each model we performed parameter recovery analysis, where we simulated participants’ responses using newly drawn individual-level parameters from the group-level distributions.

We repurposed existing sequences of noxious inputs in the [1, 13] range (pre-transformation). When then applied a linear transformation to the input sequences using sampled slope and intercept coefficients from a Gaussian distribution of these coefficients that we estimated based on our dataset using R’s fitdistrplus package. Furthermore, we simulated the confidence ratings based on lag-1 auto-correlation across a moving window of the transformed input sequence.

We then fit the same model to the simulated data and calculated Pearson correlation coefficients r between the generated and estimated individual-level parameters. The higher the coefficient r, the more reliable the estimates are, which can be categorised as: poor (if r<0.5); fair (if 0.5<r<0.75); good (0.75<r<0.9); excellent (if r>0.9) (White et al., 2018). Results are reported in Appendix 1—table 3 and Appendix 1—figures 611.

We also performed model recovery analysis (Wilson and Collins, 2019), where we first simulated responses using each model and then fit each model-specific dataset with each model. We then counted the number of times a model fit the simulated data best (according to the LOOIC rule), effectively creating an M×M confusion matrix, where M is the number of models. In the case where we have a diagonal matrix of ones, the models are perfectly recoverable and hence as reliable as possible. Results are reported in Appendix 1—table 4.

In Appendix 1—Tables 6–9 we report bulk and tail effective sample size (ESS) for each condition, for each model and parameter.

Appendix 1

Behavioural results

Model-naive performance

For each sequence condition (volatility × stochasticity), we calculated the RMSE of participants’ responses (type: perception and prediction ratings) and compared to the normative noxious input, as a measure of performance in the task. We analysed the RMSEs with a repeated measures ANOVA, with the results reported in Appendix 1—table 1.

Given the significant interaction between stochasticity and response type, we further ran a post hoc comparison tests for this effect, as reported in Appendix 1—table 2.

Noxious inputs and responses

Input transformation

We linearly transformed participants’ responses to project them from the 1–13 range to 0–100 using a linear transformation we obtained from a regression of stimulus intensities onto pain ratings.

Plots of each participant’s transformation can be found in Appendix 1—figure 1.

We superimposed participants’ responses (perception and prediction ratings) onto the noxious input condition sequences in Appendix 1—figure 2. The black line marks the start of a new sequence condition.

Finally, we plotted participants’ confidence ratings throughout the task in Appendix 1—figure 3.

Model predictions

Following the model fitting procedure, in Appendix 1—figure 4, we plotted example model predicted ratings for both perception and prediction responses for each condition as compared with noxious input for one participant’s responses.

Lastly, for each condition and for each participant we plotted model responses (perception and prediction) against participant responses in Appendix 1—figure 5. The grand mean correlation across participants for each condition and response type was calculated and included in the figure.

Parameter recovery

The results of parameter recovery analysis for each parameter for each model are reported in Appendix 1—table 3. We recovered each individual (out of 27 participants) parameter ≈100 times and calculated the mean and SD of the correlation between the true and recovered parameter values.

Moreover, to assess the number of simulations needed, we calculated the average SD (and its error) of the correlation as a function of increasing number of simulation, as plotted in Appendix 1—figure 6. The average was obtained from the 500 randomly chosen permutations of different simulations at each n (out of ≈100).

Next, we include scatter plots from the parameter recovery for each model and parameter in Appendix 1—figures 711.

Model recovery

We also ran model recovery analysis as described in the Materials and methods. We report the confusion matrix of our analysis based on approximately 100 simulations (per model pair) in Appendix 1—table 4.

Condition-wise model comparison

For each condition, we ran model comparison procedure as described in the Materials and methods. The results are reported in Appendix 1—table 5. In each condition, the expectation weighted models provided significantly better fit than models without this element.

Model diagnostics

In Appendix 1—tables 6–9, we report bulk and tail ESS for each condition, for each model and parameter.

While some of the ESS values are below the recommended threshold of 100, indicating potential issues with parameter inference. This may be due to a low participant sample size, as well as small number of trials per condition, hinting limited statistical power. Given that the Rhat values are all around 1, and that there are no divergent transitions, as well as a fairly good parameter recovery, we see this as a minor issue.

Lastly, in Appendix 1—table 10 we model diagnostics for each condition, such as the estimated Bayesian fraction of missing information (E-BFMI), number of divergent transition, and E-BFMI values per chain.

Modelling results

Group-level differences between each condition

We plotted the estimate posterior distributions for each parameter of the model (including the across-trial average and the final learning rate and perceptual weighting term) in Appendix 1—figure 12. We found no group-level differences between conditions for any of the posterior distribution of the parameters in the winning eKF model.

Individual-level differences between conditions

We estimated the individual-level parameters for each condition, and include their violin plots in Appendix 1—figure 13.

Appendix 1—figure 1
Linear transformation of the input at perception trials.

Blue dots indicate participant’s perception responses for a given level of stimulus intensity, black dots indicate transformed intensity values, a linear least squares regression was performed to achieve the best fitting line through participant’s responses as shown in red, the intercept was constrained>0.

Appendix 1—figure 2
Participants’ responses (red - perception; green - prediction) to the noxious input (dotted line) sequences.

Vertical purple lines mark the end of each condition.

Appendix 1—figure 3
Participants’ confidence ratings (red - perception; green - prediction) during the task.

Vertical purple lines mark the end of each condition.

Appendix 1—figure 4
Example plot of the input sequences (black) for each condition, one participant’s responses (white) and the winning, expectation weighted Kalman filter (eKF), model predictions (blue) including 95% confidence intervals (shaded blue) for (A–D) perception and (E–H) prediction.
Appendix 1—figure 5
Model responses against participants’ responses for each condition and each response type (A–D) perception and (E–H) prediction.

The annotated value is the grand mean correlation across subjects for each condition and response type.

Appendix 1—figure 6
Parameter recovery average SD for: (A) eRL; (B) RL; (C) eKF; (D) KF; (E) Random model.

The average SD is plotted as a function of simulation number averaged across 500 permutations of ≈100 simulations. The coloured shading corresponds to 1 SD around the average error.

Appendix 1—figure 7
Parameter recovery scatter plot for expectation weighted reinforcement learning (eRL) model from ≈100 simulations for: (A) ɑ; (B) ɣ; (C) ξ; (D) E0; (E) C parameter.
Appendix 1—figure 8
Parameter recovery scatter plot for reinforcement learning (RL) model from ≈100 simulations for: (A) ɑ; (B) ξ; (C) E0; (D) C parameter.
Appendix 1—figure 9
Parameter recovery scatter plot for expectation weighted Kalman filter (eKF) model from ≈100 simulations for: (A) ε; (B) s; (C) v; (D) ξ; (E) E0; (F) w0; (G) C parameter.
Appendix 1—figure 10
Parameter recovery scatter plot for Kalman filter (KF) model from ≈100 simulations for: (A) s; (B) v; (C) ξ; (D) E0; (E) w0; (F) C parameter.
Appendix 1—figure 11
Parameter recovery scatter plot for random model from ≈100 simulations for: (A) ξ; (B) R; (C) C parameter.
Appendix 1—figure 12
Group-level distributions for parameters for each condition for the expectation weighted Kalman filter (eKF) model.
Appendix 1—figure 13
Violin plots (and box-plots) of individual-level parameters for each condition in the winning expectation weighted Kalman filter (eKF) model.

Lower and upper hinges correspond to the first and third quartiles of partipants’ errors (the upper/lower whisker extends from the hinge to the largest/smallest value no further than 1.5 * ”Interquartile range” from the hinge); the line in the box corresponds to the median. Each condition has N=27 particpants.

Appendix 1—table 1
Within-subjects effects from repeated measures ANOVA of participant’s RMSE scores with stochasticity, volatility, and response type factors.

SS - sum of squares, MS - mean square, RMSE - root mean square error

Volatility × stochasticity0.00610.0065.688e-40.9813.723e-62.188e-5
Volatility × type7.31317.3133.1960.0850.0050.109
Stochasticity × type63.662163.66229.842<0.001*0.0410.534
Volatility × stochasticity × type1.35611.3560.7040.4098.807e-40.026
  1. * indicates statistical significance at 0.05 level.

Appendix 1—table 2
Post hoc comparisons for the repeated measures ANOVA’s interaction effect of stochasticity × type.
95%CI for mean diff.
Mean diff.LowerUpperSEtpbonf
High, perceptionLow, perception0.367−0.6871.4210.3810.9631.000
High, prediction−3.686−4.636−2.7350.345−10.688<0.001*
Low, prediction−1.147−2.3290.0340.430−2.6650.062
Low, perceptionHigh, prediction−4.053−5.234−2.8710.430−9.415<0.001*
Low, prediction−1.514−2.464−0.5640.345−4.390<0.001*
High, predictionLow, prediction2.5391.4843.5930.3816.658<0.001*
  1. * indicates statistical significance at 0.05 level.

Appendix 1—table 3
Pearson correlation coefficient r (SD) from the parameter recovery analysis for each model.
r (SD)0.685 (0.113)0.92 (0.049)0.993 (0.005)0.723 (0.093)0.481 (0.131)
r (SD)0.842 (0.081)0.993 (0.004)0.625 (0.107)0.455 (0.133)
r (SD)0.742 (0.1)0.531 (0.13)0.745 (0.09)0.986 (0.075)0.849 (0.118)0.309 (0.179)0.472 (0.123)
r (SD)0.605 (0.129)0.589 (0.117)0.993 (0.005)0.585 (0.157)0.298 (0.18)0.442 (0.146)
Random model
r (SD)0.996 (0.004)0.999 (0.001)0.079 (0.206)
Appendix 1—table 4
Confusion matrix from the model recovery analysis based on ≈100 simulations.

The y-axis indicates which model simulated the dataset, while the x-axis indicates which model fit the data based on leave-one-out information criterion (LOOIC).

Appendix 1—table 5
Model comparison results for each condition.
ConditionModel nameELPD differenceSE differenceSigma effectLOOIC
Vol. high Stoch. higheKF - expectation weighted0.0000.00015748.389
eRL - expectation weighted–9.5605.0711.88515767.509
Random response–730.60077.0099.48717209.588
Vol. high Stoch. loweKF - expectation weighted0.0000.00015682.115
eRL - expectation weighted–17.4395.8962.95815716.993
Random response–824.34679.14810.41517330.807
Vol. low Stoch. higheKF - expectation weighted0.0000.00015990.114
eRL - expectation weighted–12.0277.0291.71116014.169
Random response–831.09684.5499.83017652.306
Vol. low Stoch. loweKF - expectation weighted0.0000.00015904.936
eRL - expectation weighted–11.0684.3092.56915927.072
Random response–901.792107.2448.40917708.519
Appendix 1—table 6
Bulk and tail effective sample size (ESS) values for vol. high - stoch. high.
ModelParam.ESS (bulk)ESS (tail)
Appendix 1—table 7
Bulk and tail effective sample size (ESS) values for vol. high - stoch. low.
ModelParam.ESS (bulk)ESS (tail)
Appendix 1—table 8
Bulk and tail effective sample size (ESS) values for vol. low - stoch. high.
ModelParam.ESS (bulk)ESS (tail)
Appendix 1—table 9
Bulk and tail effective sample size (ESS) values for vol. low - stoch. low.
ModelParam.ESS (bulk)ESS (tail)
Appendix 1—table 10
Model diagnostics for each condition - estimated Bayesian fraction of missing information (E-BFMI), number of divergent transition E-BFMI values per chain.
ConditionModel# chains low E-BFMI# div. transitionsE-BFMI values
HVHSeRL000.696 0.713 0.695 0.691
RL000.76 0.748 0.771 0.806
eKF000.755 0.767 0.771 0.759
KF000.633 0.596 0.547 0.563
Random000.842 0.851 0.843 0.835
HVLSeRL000.689 0.76 0.69 0.689
RL000.624 0.688 0.688 0.685
eKF000.741 0.764 0.753 0.779
KF000.654 0.734 0.689 0.674
Random000.883 0.779 0.836 0.833
LVHSeRL000.73 0.732 0.728 0.702
RL000.719 0.714 0.742 0.7
eKF000.753 0.755 0.792 0.766
KF000.75 0.768 0.729 0.754
Random000.864 0.849 0.883 0.845
LVLSeRL000.764 0.762 0.75 0.764
RL000.714 0.764 0.719 0.697
eKF000.783 0.751 0.772 0.77
KF000.705 0.695 0.702 0.726
Random000.835 0.829 0.852 0.847

Data availability

All code and data are openly available on Zenodo (

The following data sets were generated
    1. Onysk J
    2. Gregory N
    3. Whitefield M
    4. Jain M
    5. Turner G
    6. Seymour B
    7. Mancini F
    (2024) Zenodo
    Statistical learning shapes pain perception and prediction independently of external cues.


    1. Hermans L
    2. Calders P
    3. Van Oosterwijck J
    4. Verschelde E
    5. Bertel E
    6. Meeus M
    An Overview of Offset Analgesia and the Comparison with Conditioned Pain Modulation: A Systematic Literature Review
    Pain Physician 19:307–326.
    1. Kleiner M
    What’s new in psychtoolbox-3
    Perception 36:1–16.
  1. Book
    1. Knill DC
    2. Richards W
    Perception as Bayesian Inference
    Cambridge University Press.
  2. Book
    1. Kruschke JK
    Doing Bayesian data analysis: a Tutorial
    Jags R, editors. Academic Press.
    1. Schapiro A
    2. Turk-Browne N
    Statistical learning
    Brain Mapping 3:501–506.
  3. Book
    1. Sutton RS
    2. Barto AG
    Reinforcement Learning: An Introduction. Adaptive Computation and Machine Learning Series
    The MIT Press.

Article and author information

Author details

  1. Jakub Onysk

    1. Computational and Biological Learning Unit, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
    2. Applied Computational Psychiatry Lab, Max Planck Centre for Computational Psychiatry and Ageing Research, Queen Square Institute of Neurology and Mental Health Neuroscience Department, Division of Psychiatry, University College London, London, United Kingdom
    Conceptualization, Formal analysis, Investigation, Visualization, Methodology, Writing - original draft, Project administration, Writing - review and editing
    For correspondence
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-0876-5465
  2. Nicholas Gregory

    Computational and Biological Learning Unit, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
    Conceptualization, Formal analysis, Investigation, Writing - review and editing
    Competing interests
    No competing interests declared
  3. Mia Whitefield

    Computational and Biological Learning Unit, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
    Formal analysis, Methodology
    Competing interests
    No competing interests declared
  4. Maeghal Jain

    Computational and Biological Learning Unit, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
    Formal analysis
    Competing interests
    No competing interests declared
  5. Georgia Turner

    1. Computational and Biological Learning Unit, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
    2. MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, United Kingdom
    Conceptualization, Methodology
    Competing interests
    No competing interests declared
  6. Ben Seymour

    1. Wellcome Centre for Integrative Neuroimaging, John Radcliffe Hospital, Headington, Oxford, United Kingdom
    2. Center for Information and Neural Networks (CiNet), Osaka, Japan
    Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-1724-5832
  7. Flavia Mancini

    Computational and Biological Learning Unit, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
    Conceptualization, Supervision, Funding acquisition, Methodology, Writing - original draft, Writing - review and editing
    For correspondence
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-8441-9236


Medical Research Council (MR/T010614/1)

  • Flavia Mancini

Medical Research Council (MR/W027593/1)

  • Ben Seymour
  • Flavia Mancini

Wellcome Trust
  • Ben Seymour

Versus Arthritis (21537)

  • Ben Seymour

IITP (MSIT 2019-0-01371)

  • Ben Seymour

University of Cambridge Research Computing Service (EP/T022159/1)

  • Flavia Mancini

NIHR Oxford Health BRC (NIHR203316)

  • Ben Seymour

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication. The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care or other funders. For the purpose of Open Access, the authors have applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission.


The study was funded by an MRC Career Development Award to FM (MR/T010614/1) and a UKRI Advanced Pain Discovery Platform grant to both FM and BS (MR/W027593/1). BS was also funded by Wellcome (214251/Z/18/Z), Versus Arthritis (21537), and IITP (MSIT 2019-0-01371). This work has been performed using resources provided by the Cambridge Tier-2 system operated by the University of Cambridge Research Computing Service ( funded by EPSRC Tier-2 capital grant (EP/T022159/1). HPC access was additionally funded by an EPSRC research infrastructure grant to FM. We are grateful to Prof. Máté Lengyel and Prof. Deborah Talmi for helpful discussions about the study. For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising from this submission.


All participants gave informed written consent to take part in the study, which was approved by the the ethics committee of the Department of Engineering, University of Cambridge.

Version history

  1. Preprint posted: July 4, 2023 (view preprint)
  2. Sent for peer review: July 19, 2023
  3. Preprint posted: October 4, 2023 (view preprint)
  4. Preprint posted: May 28, 2024 (view preprint)
  5. Version of Record published: July 10, 2024 (version 1)

Cite all versions

You can cite all versions using the DOI This DOI represents all versions, and will always resolve to the latest one.


© 2023, Onysk et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.


  • 698
  • 37
  • 0

Views, downloads and citations are aggregated across all versions of this paper published by eLife.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Jakub Onysk
  2. Nicholas Gregory
  3. Mia Whitefield
  4. Maeghal Jain
  5. Georgia Turner
  6. Ben Seymour
  7. Flavia Mancini
Statistical learning shapes pain perception and prediction independently of external cues
eLife 12:RP90634.

Share this article

Further reading

    1. Cell Biology
    2. Neuroscience
    Jaebin Kim, Edwin Bustamante ... Scott H Soderling
    Research Article

    One of the most extensively studied members of the Ras superfamily of small GTPases, Rac1 is an intracellular signal transducer that remodels actin and phosphorylation signaling networks. Previous studies have shown that Rac1-mediated signaling is associated with hippocampal-dependent working memory and longer-term forms of learning and memory and that Rac1 can modulate forms of both pre- and postsynaptic plasticity. How these different cognitive functions and forms of plasticity mediated by Rac1 are linked, however, is unclear. Here, we show that spatial working memory in mice is selectively impaired following the expression of a genetically encoded Rac1 inhibitor at presynaptic terminals, while longer-term cognitive processes are affected by Rac1 inhibition at postsynaptic sites. To investigate the regulatory mechanisms of this presynaptic process, we leveraged new advances in mass spectrometry to identify the proteomic and post-translational landscape of presynaptic Rac1 signaling. We identified serine/threonine kinases and phosphorylated cytoskeletal signaling and synaptic vesicle proteins enriched with active Rac1. The phosphorylated sites in these proteins are at positions likely to have regulatory effects on synaptic vesicles. Consistent with this, we also report changes in the distribution and morphology of synaptic vesicles and in postsynaptic ultrastructure following presynaptic Rac1 inhibition. Overall, this study reveals a previously unrecognized presynaptic role of Rac1 signaling in cognitive processes and provides insights into its potential regulatory mechanisms.

    1. Neuroscience
    Qiaoli Huang, Huan Luo
    Research Article

    Daily experiences often involve the processing of multiple sequences, yet storing them challenges the limited capacity of working memory (WM). To achieve efficient memory storage, relational structures shared by sequences would be leveraged to reorganize and compress information. Here, participants memorized a sequence of items with different colors and spatial locations and later reproduced the full color and location sequences one after another. Crucially, we manipulated the consistency between location and color sequence trajectories. First, sequences with consistent trajectories demonstrate improved memory performance and a trajectory correlation between reproduced color and location sequences. Second, sequences with consistent trajectories show neural reactivation of common trajectories, and display spontaneous replay of color sequences when recalling locations. Finally, neural reactivation correlates with WM behavior. Our findings suggest that a shared common structure is leveraged for the storage of multiple sequences through compressed encoding and neural replay, together facilitating efficient information organization in WM.