Introduction

Despite its unpleasantness, pain serves a useful function. This is most clearly the case for phasic pain, where newly incoming nociceptive signals warn of impending tissue damage, elicit rapid defensive responses, and drive learning that reduces future chances of encountering harm. In this context, function can be objectively measured through conditioned responses and avoidance behaviour, which quantify pain in terms of its motivational value (Fields, 2006; Seymour, 2019). However, the adaptive function of tonic pain has been much harder to quantify. In ecological and ethological contexts, tonic pain is generally considered to serve protective and recuperative functions. For instance, in the context of injury, it has been proposed that tonic pain can aid recovery by reducing motivational vigour and hence suppress unnecessary activity until healing occurs (Bolles and Fanselow, 1980; Wall, 1979; Walters and Williams, 2019; Seymour et al., 2023). The goal of this study was to test whether we can formally and quantifiably dissociate these two distinct motivational functions of pain.

One of the challenges in studying adaptive functions of pain is the difficulty of embedding experiments within ecologically meaningful contexts. To solve this, we designed an immersive foraging task using virtual reality (VR), in which humans search a forest to collect fruits from the forest floor. Some fruits, distinguishable through colour, were associated with a brief electrical pain stimulus to the grasping hand (as if they had thorns), to capture how phasic pain can elicit learned avoidance. In experiment 1, we aimed to show that a basic learning model derived from computational reinforcement learning could quantify this aversive motivational value, inferred from a reduced tendency to pick these fruits. Experiment 2 extended this by simultaneously delivering tonic pain, via an inflated pressure-cuff, to the non-dominant arm, intending to emulate a mild injury. This allowed us to quantify both the aversive value of phasic pain, and the reduced motivational vigour of foraging induced by tonic pain. In this way, the experiment aimed to computationally dissociate these two fundamentally distinct components of adaptive pain behaviour.

Results

Experiment 1: Phasic pain avoidance in a free-operant foraging task

Twenty-five subjects performed a free-operant instrumental foraging task to accrue reward points and avoid pain. The task was embedded into a fully immersive VR context, in which subjects physically moved in a flat open space. A virtual boundary of 4 × 4 m was displayed when the subject approached the edge of the area. Within the VR context, this space resembled a flat forest (Fig. 1), with trees and vegetation locations randomly generated. The task was structured into one-minute blocks. At the beginning of each block, virtual fruits of varying heights and locations were generated within the virtual space. The spatial coordinates of subjects’ heads and hands were tracked, requiring them to physically move towards the fruit, pick it up by reaching with their dominant hand, clicking, and holding the button on the handset. Only the dominant hand was enabled in the virtual environment. Subsequently, subjects could drop the fruit into one of the baskets to earn points. Subjects were told that the total points accrued would be rewarded with an incentive of up to £10. All fruits used in both experiments were pineapples, as their spiky shape naturally matches the pricking pain induced by the Wasp pain stimulation electrodes.

Task description.

(A) An illustrative figure depicting the foraging task. (B) Subjects’ view (left eye) in the task. (C) A perspective view of the task environment. (D) An aggregated top-view heat map of head trajectories of all subjects’ data. Red dots denote baskets’ preset locations.

Pineapples were categorized into two visually distinguishable types: 50% were green and 50% were yellow. The green pineapple was aversive, and picking one up immediately elicited a brief painful cutaneous electric stimulation to the proximal medial side of the middle finger of the (grasping) dominant hand. In contrast, picking up a yellow pineapple was always pain-free. The intensity of the electric stimulation was held constant within each block. Experiment 1 consisted of 20 blocks, with the initial 10 blocks designated for practice purposes with a fixed order of stimulation intensities. The subsequent 10 blocks used randomized pain stimulation intensities across five different levels, determined by a pain calibration procedure. After completing the 10 training blocks, all subjects were aware that picking up green pineapples resulted in painful shocks, as confirmed through verbal questioning. However, subjects did not know the exact pain level of a new block until they tried to pick up one green pineapple. Only data from the latter 10 non-training blocks were analysed.

Avoidance increases with increasing phasic pain intensity

We found that the probability of choosing a fruit decreased as the pain associated with it increased: Figure 2A shows an approximately linear decrease in aversive choice probability as a function of visual analogue scale (VAS) rating of pain stimuli, whereby the aversive choice probability is defined as the number of painful fruits collected divided by the total number of fruits collected.

Experiment 1 behaviour and modelling results.

(A) Each coloured line represents one single subject data with the colour reflecting the value in pain-free condition (VAS=0). The dashed black line’s slope and intercepts are fixed effect estimates from a linear mixed model. The slope’s estimate shows an inverse relationship between pain choice probability and VAS rating, β = 0.0263, 95%CI[0.0383, 0.0144], t(21.27) = 4.40, p < .001. (B) As described in (A), the dashed line’s slope is the fixed effect’s estimate of choice distance bias, and it shows a positive relationship between the choice distance bias and VAS ratings, β = 0.0113, 95%CI[0.00294, 0.0197], t(29.36) = 2.68, p = .012. (C) The negative phasic pain coefficients (M = 0.592, SD = 0.261) showed the model captured the aversiveness of phasic pain stimuli in this free-operant decision-making task, t(23) = −10.89, p < .001. (D) The moving effort coefficient Cm was separated into a horizontal component (M = 0.115, SD = 0.0967) and a vertical component (M = −0.370, SD = 0.217). The fitted coefficients showed lower effort cost to move horizontally than vertically, t(23) = 6.72, p < .001. (E) VAS ratings at different electric shock intensities. (F) Phasic pain utility values as a function of electric shock intensity. Each coloured line represents one subject’s fitted curve. Black line is the average over all subjects. (G) Linear regression showed significant correlation between VAS pain ratings and model estimated phasic pain values, R2 = 0.146, F (1, 118) = 20.13, p < .001.

Additional cost of effort associated with movement

Next, since movement itself may carry a small cost related to time and effort (incurring opportunity costs), we investigated whether there is a trade-off relationship between moving distances and pain intensity (i.e., is it worth moving further to avoid a certain amount of pain). We measured the distance from the fruit to the subject at the moment the subject first fixated on the fruit, as estimated from eye-tracking data, after collecting the previous one. We found a significant correlation between the egocentric distance differences between painful and non-painful fruits, and the VAS ratings. As shown in Figure 2B, the vertical axis represents the egocentric distances from non-painful fruits subtracted by those from painful fruits. That is, as pain linked to the aversive fruits increased, subjects travelled a greater distance to pick up a pain-free fruit.

Choice-based computational modelling in RL framework

Above we showed the regression results analysed at the block level. Within each one-minute block, each pickup, which could result in an immediate shock, can be considered an individual trial. By analysing the data at the trial level, we can extract more detailed information and provide a parameterized behavioural description for each individual. This approach laid the groundwork for a robust, objective measure of phasic pain value within an ecologically valid context. In each trial, despite the presence of numerous lower-level controls required to complete the action, the primary decision involved choosing which fruit to pick up. We simplified the problem by assuming that there exists a set of decision points 𝒮 as states, where the subject took an action at a state s𝒮—choosing to pick up a fruit—and deterministically arrived at another state s ∈ 𝒮 if the block had not concluded. Based on the initial regression results, we assumed that the reward was a linear combination of an internal reward r for each fruit, a negative utility function u representing the phasic pain value, and an effort cost function d. Therefore, the Bellman optimality equation for the action-value function was

, where Cp and Cm were the parameters to be fitted for each subject.

As fruits were randomly distributed in our task, previous foraging studies suggested that animals may adopt a suboptimal greedy policy, opting for the closest fruit rather than solving the complex travelling salesman problem for a marginally better optimal solution (Anderson, 1983; Jeon et al., 2023). To initially demonstrate the basic effects, we further eschewed learning and future planning by setting the discount factor γ = 0. One of the key simplifications was assuming the existence of a set of decision points in this free-operant context. To fit the model, we again utilised the eye-tracking data, setting the decision point as the first time point when the subject saw a new fruit. Nevertheless, the task’s available actions were partially observable, and subjects could explore for more actions if they were not satisfied with the currently available ones. This exploration action was not modelled in our formulation. To proceed with model fitting, we retrospectively identified the fruits that were picked up and selected only those time points when the fruits chosen afterwards were first seen as the decision points. We also kept track of the previously seen fruits, storing them in a memory queue, so the decision involved choosing the best option from these items in memory. Figure 2(C, D) shows the parameters fitted to the subjects’ behavioural data. This shows the negative utility associated with pain and vertical and horizontal components of movement, and it illustrates how the computational modelling framework can be used to quantify behaviour in a free-operant VR context.

Skin conductance analysis dissociated decision values and subjective ratings for phasic pain

As an objective measure derived from empirical behavioural data, the phasic pain utility function generated by the model can be abstract. To understand its physical implications, we further examined its relationship with subjective pain ratings and their correlations with physiological responses, specifically skin conductance responses (SCRs). Figure 2(E, F) presents the subjective ratings and decision values of phasic pain across different conditions, while Figure 2G demonstrates a significant correlation between decision values and subjective pain ratings.

Two types of evoked SCRs were analysed: those elicited by the pick-up action (which triggered a shock for painful fruit) and those associated with the fixation action (looking at either painful or non-painful fruit). Figure 3A compares evoked SCRs for painful and non-painful fruit. To quantify the magnitude of evoked SCRs which may overlap in the time domain, given that changes in skin conductance can be approximated by a linear time-invariant system, we fitted the skin conductance data to a constructed time series by convoluting the event trigger with a known canonical response function (CRF) (Bach et al., 2010). We selected fixation and pick-up events for painful fruit and extracted their fitted coefficients. Consistent with the significant correlation between decision values and pain ratings, decision values also exhibited similar correlation with fixation coefficients as pain ratings (Fig. 3B). However, while pain ratings showed a significant correlation with shock-evoked SCRs, the correlation between decision values and shock-evoked SCRs was much weaker (Fig. 3C). Despite small R2 values, which suggest the presence of noise, this discrepancy suggests a notable dissociation between expected utility (at the decision level) and experienced utility (the actual pain experience).

Skin conductance changes and their correlation with decision values derived from the model.

(A) Evoked SCR for fixation and fruit pick-up events. Compared to seeing the visual cues, shock stimuli induced a greater SCR when the shock intensities were high. Shaded area is the 95% confidence interval. (B) Both decision values and subjective ratings showed a similar level of correlation with the coefficients of fixation-evoked SCR. For decision values: R2 = 0.040, F (1, 155) = 6.38, p = .013; and for subjective pain ratings: R2 = 0.040, F (1, 155) = 6.44, p = .012. (C) In contrast, subjective pain ratings showed a significant correlation with the coefficients of shock-evoked SCR, R2 = 0.080, F (1, 170) = 14.86, p < .001, whereas the correlation for decision values was weak, R2 = 0.006, F (1, 170) = 0.97, p = .325.

Experiment 2: Modulation of free-operant foraging by tonic pain stimulation

In the second experiment, we introduced a tonic pain condition by applying an inflatable blood pressure cuff around the non-dominant arm (Graven-Nielsen et al., 2017). We reduced the number of phasic pain levels from five in the first experiment to three (no pain, low pain, and high pain). These three phasic pain levels were then combined with two fixed tonic pain conditions (with and without tonic pain) to form a factorial design experiment with a total of six different combinations. Similar to experiment 1, the experiment was divided into one-minute blocks, and the pain conditions were fixed within each block. The six combinations were initially presented in a fixed order as training blocks, then the order was randomized and repeated three times, resulting in a total of 24 blocks. To obtain a neural measure of pain behaviour, we introduced concurrent EEG recording alongside the VR set-up.

No effect found for tonic pain on phasic pain

We found no significant modulation of phasic pain ratings by tonic pain (Fig. 4(A, B)). We found a basic decrease in aversive choice probability, in terms of the probability of selecting a painful fruit, as in experiment 1. The average aversive choice probabilities were similar in conditions with and without tonic pain (Fig. 4(C-E)), providing evidence that punishment sensitivity was not affected by the presence of tonic pain.

Effects of tonic pain on phasic pain ratings and aversive choice probabilities.

(A, B), Two-way repeated measures ANOVA showed no statistically significant difference in subject’s ratings of phasic pain intensity as a result of tonic pain in low (A) or high (B) phasic pain conditions, F(1,30)=2.01, p=.167, nor was there a significant interaction effect between tonic pain and phasic pain, F(1, 30)=0.87, p=.357. (C, D), Similar linear mixed model fitting results for aversive choice probability. No tonic pain condition: β = 0.0286, 95%CI = [0.0372, 0.0198], p < .001, intercept = 0.473. Tonic pain condition: β = −0.0280, 95%CI = [0.0349, − 0.0209], p < .001, intercept = 0.472. (E), Aversive choice probability averaged over phasic pain conditions. Two-way repeated measures ANOVA showed significant effect for phasic pain (including no pain condition), F (2, 60) = 36.872, p < .001, but no effect for tonic pain, F (1, 30) = 0.00, p = .998. The interaction effect between tonic and phasic pain was also not significant F (2, 60) = 0.07, p = .930.

In the neural data, we found that phasic pain intensity modulated the amplitude of phasic pain event-related potentials (ERPs), as would be expected. But as in the behavioural results, this was not modulated by the presence/absence of tonic pain (Fig. 5). That is, tonic pain neither facilitated nor inhibited the behavioural or neural responses to the phasic pain.

Phasic pain ERPs in different pain conditions.

(A, B) Phasic pain ERP comparison in the same tonic pain conditions. (C, D) Phasic pain ERP comparison in the same phasic pain conditions. (E, F) High phasic pain induced a significantly higher N1-P2 amplitude with or without tonic pain. (G, H) Tonic pain stimulation does not show a significant effect in phasic pain ERP’s N1-P2 amplitude. Two-way repeated measures ANOVA showed the effect for phasic pain was significant, F (1, 30) = 35.42, p < .001. The effect for tonic pain was not significant, F (1, 30) = 0.30, p = .589, and the interaction effect was also not significant, F (1, 30) = 0.58, p = .454.

Tonic pain reduced motivational vigour

By analysing the motion tracking data, we found that tonic pain was associated with a reduction in task-related movement velocities (Fig. 6(A, C)) and an associated reduction in fruit collection rates (Figure 6B).

Tonic pain reduced action velocity.

(A) Average hand speed. A one-tailed paired t-test was conducted to evaluate whether the hand speed was faster without tonic pain (M = 1.01, SD = 0.19) than with tonic pain (M = 0.99, SD = 0.18), t(30) = 2.09, p = .023, with a small to moderate effect size d = 0.37. (B) The average fruit collection rate over the one-minute block. A one-tailed paired t-test was conducted to evaluate whether the collection rate was higher without tonic pain (M = 18.22, SD = 3.45) than with tonic pain (M = 17.90, SD = 3.35), t(30) = 1.93, p = .031, with a small to moderate effect size d = 0.35. (C) A visualisation demonstrating the instantaneous hand speed at different percentages of total distance travelled in reaching to the fruit. Shaded area is the 95% confidence interval of the pairwise difference mean.

To quantify this effect in terms of motivational vigour, we extended the computational model to accommodate vigour effects. The model presented for experiment 1 does not consider time, and fitting the model to the collected data requires acausal simulation (see Methods - model-fitting section below). For experiment 2, we took account of the temporal factor and fitted the model causally with high temporal resolution. Niv et al. (2007) proposed that the delay between consecutive actions performed by animals can be formulated as a trade-off between opportunity costs and vigour costs. This framework was developed based on earlier works in average-reward reinforcement learning, showing that this trade-off approach maximises average reward (Niv et al., 2005). In this formulation, animals choose an action pair (a, τ). The first, a, is the common action in discrete decision-making models, and the separate τ represents the time delay between two actions. We built upon this model to suit the requirements of our study. Similar to the model in experiment 1, our action a represents the choice of which fruit to pick up. To account for the different effort costs incurred for each fruit, we again used a distance-based effort cost term Cm · d(a), as in experiment 1. Additionally, a time delay between actions is required to model vigour. We assumed that subjects estimated the time delay by estimating their speed Vspeed and calculate the delay based on the distance to the fruit and the estimated speed. Hence, for a decision time point state s ∈ 𝒮, if the effort cost is inverse proportional to the time delay, the optimal differential value of an (a, τ) pair is

, where τ is the additional delay accounting for the time waited before committing to execute the action. s∈ 𝒮 is the deterministic subsequent state after choosing (a, τ). V is the optimal differential value function. Cv is the vigour constant that scales the vigour cost term. is the average reward and assumed to be constant.

We fitted vigour constants Cv and other parameters for no tonic pain and with tonic pain conditions separately. We found the fitted vigour constants were significantly higher in tonic pain conditions (Fig. 7A). We further fitted the model with additional separate vigour constants for different levels of phasic pain. Consistent with the fitting results over tonic pain conditions only, tonic pain showed a strong effect on fitted vigour constants, while phasic pain did not show a significant effect (Fig. 7B).

Experiment 2 model-fitting results.

(A) Fitted vigour constants were greater in tonic pain conditions (M = 53.61, SD = 27.17) than no tonic pain conditions (M = 38.16, SD = 18.88), t(30) = 2.37, p = .024. (B) Separate vigour constants fitting for each phasic pain conditions. Repeated measures ANOVA showed the effect for tonic pain was significant, F (1, 30) = 6.33, p = .017, but the effect for phasic pain was not significant, F (2, 60) = 0.40, p = .673. The interaction effect was not significant, F (2, 60) = 0.67, p = .515. (C) Phasic pain utility function curves. Solid lines are the average values in each condition. (D) Phasic pain utility function values at 50% electric shock intensity. A paired t-test was conducted and found no significant differences between no tonic pain (M = 4.645, SD = 8.982) and with tonic pain (M = −4.807, SD = 8.146) conditions, t(30) = 0.08, p = .933. (E) Phasic pain utility function values at 50% electric shock intensity plotted against aversive choice probability. Both conditions show a significant correlation between the fitted model values and behavioural choice probability. In blocks without tonic pain, R2 = 0.337, F (1, 29) = 14.75, p < .001; in tonic pain conditions, R2 = 0.311, F (1, 29) = 13.07, p = .001.

A unified model for tonic and phasic pain for decision-making in free-operant task

This new decision model therefore includes both phasic pain and vigour costs, by which the tonic pain effects can be expressed as change in vigour. Experiment 2 replicated the phasic pain punishment sensitivity effect shown in experiment 1. Using this model, the fitted phasic pain parameters also showed no significant differences in punishment sensitivity (Fig. 7(C-E)), in line with the behavioural results presented above. The vigour constants and phasic pain function fitting results illustrate how we can combine two dissociable effects of pain motivation in a single model, validated by the behavioural data.

Neural activities link to tonic pain and motivational vigour

To probe the neural activity associated with the modulation of motivational vigour by tonic pain, we performed time-frequency analysis on the EEG data. The data were epoched based on key decision points. Linear mixed models (LMM) were employed to account for random effects across subjects and to control for motion artefacts by including instantaneous head movement speed as an additional predictor. We found that alpha band power overlaying central-parietal and temporal regions (CP6 and T8), as well as beta power in parietal scalp regions (P8), was strongly negatively correlated with tonic pain when contrasted with no pain. (Fig. 8A).

(A) Topography of t-values from the LMM, with binary tonic pain condition as the dependent variable and EEG band power 0–0.5 s after the decision point as the independent variable. Empty circles represent channel power showing a significant correlation with tonic pain that does not survive Bonferroni correction. Solid circles indicate significant results after correction. Central-parietal and temporal scalp regions (CP6 and T8) showed strong negative correlation with the alpha power, t = 3.52 for CP6 and t = 3.42 for T8. Significant negative correlation with the beta power was found in parietal scalp regions (P8), t = 3.58. (B) Topography of t-values from the LMM, with continuous vigour constants as the dependent variable and EEG band power 0–0.5 s after the decision point as the independent variable. Negative correlation was found with alpha band power at midline parietal electrode Pz, t = 2.59, p = .010.

To investigate how tonic pain impacts vigour, the vigour constants separately fitted to tonic pain conditions (Fig. 7A) were fit to the EEG data. We found that the alpha band power overlaying the midline parietal region (Pz) was significantly negatively correlated with the vigour constants, but it did not survive the multiple comparison correction (Fig. 8B).

Discussion

The experiments show that tonic and phasic pain serve different motivational functions during adaptive behaviour, in line with ecological and evolutionary theories of pain (Bolles and Fanselow, 1980; Walters and Williams, 2019). Phasic pain provides a punishment teaching signal that directs avoidance through value-based learning, balancing the cost of future harm alongside potential reward. In contrast, tonic pain reduces motivational vigour, which supports energy conservation and recuperation in the context of bodily damage. The experiments are the first to show that these two functions can be formally distinguished and quantified during ongoing behaviour. This illustrates how pain simultaneously acts in different ways to serve self-protection.

One notable aspect of our results is that we did not see interactions between tonic and phasic pain. There are two contexts in which these might be predicted. First, in ‘conditioned pain modulation’ paradigms (Kennedy et al., 2016), a tonic pain stimulus is sometimes seen to reduce ratings of phasic pain stimuli delivered somewhere else on the body. In contrast, our results were more compatible with a trend in the other direction. This raises the (testable) question as to what extent conditioned pain modulation depends on the task setting: notably, here we utilise an active in contrast to a passive task, which is known to be an important distinction in other contexts of coping behaviour (Bandler et al., 2000). A second context in which phasic-tonic interactions are possible is in punishment sensitivity, where it is proposed that tonic pain signals increase vulnerability to new phasic pain insults (Seymour et al., 2023). Punishment sensitivity is seen in terms of anxiety-like responses to novel threats in animals (Crook et al., 2014; Lister et al., 2020), and also in decision-making tasks in people with chronic pain (Mancini et al., 2024). However, we do not see an increase in the punishment value of phasic pain in relation to tonic pain here.

These two contexts highlight the potential difference between ratings and choice as measures of aversiveness, a concept that is sometimes referred to as experienced utility and decision utility (Kahneman and Tversky, 1984), or disliking and unwanting (the negative version of the liking-wanting distinction (Dayan and Seymour, 2009; Berridge et al., 2009), or put colloquially, ‘what you say’ versus ‘what you do’). In our task we get a further hint of this in the SCR measures in experiment 2, whereby a discrepancy exists between decision values and pain ratings in their correlations with phasic pain-evoked SCRs. Taken together, this indicates the composite nature of overall aversiveness of pain, and the fact that the most complete measures of pain should combine both subjective and objective measures, where possible.

The experiments show how probing objective motivational functions of pain is facilitated by VR. Conventionally, pain motivation is studied using desktop tasks in which decisions and physiological responses are made by a seated subject, responding via a keyboard with pain stimuli applied to a distant site. VR provides several useful advances. First, the ability to apply pain stimuli in a way that is appropriate to the action, i.e. phasic pain stimuli to a grasping hand, providing sensorimotor congruency. Second, the ability to embed pain within a free-operant and free-moving physical task allows evaluation of full-body responses to a topographic stimulus, and it is this in particular that allows assessment of motivational vigour. Third, the provision of a multisensory immersive environment entirely transforms a pain task into a naturalistic setting that matches the experience of pain in real-life. The importance of naturalistic contexts is well documented in fear learning, although less well studied for pain (Mobbs et al., 2015).

However, the use of VR requires an analysis framework that accommodates the complexity of free-operant and free-moving nature of behaviour. This challenge can be overcome with a computational approach and judicious use of simplifying assumptions where appropriate. The free operant model exploits a theoretical framework developed for animal studies that derives action delays from a trade-off between vigour cost and opportunity cost (Niv et al., 2007). This model has been applied to human studies (Nair et al., 2023), albeit in a much simpler setting involving perfect information and a single instantaneous action. In our task, the environment was partially observable and evolved as subjects interacted with it. By utilising eye-tracking to extract an abstract observation space and a modified version of the model that enabled simulation at high temporal resolution, we demonstrated that it is possible to predict free-operant behaviour in real-time. This approach revealed critical insights into a subject’s internal pain and motivational states.

A concern that is sometimes raised in pain experiments is to what extent the impact of pain on other behavioural and cognitive processes might be interpreted as an attentional phenomenon (e.g. distraction, or cognitive load), and therefore not specific to pain per se (Moont et al., 2010). One argument to this point posits that attention is an intrinsic part of what pain is, and therefore it does not make sense to consider pain as an operational system without attention (Van Damme et al., 2010): i.e. tonic pain may mediate vigour effects by using distraction as a functional mechanism, and therefore attention is a core part of pain, and not a confound. One way to test this would be to add an additional condition comprising a tonic salient but non-painful stimulus (such as a continuous salient auditory stimulus), although this introduces other confounds related to the incommensurability of such dissimilar types of stimulation. Attentional arguments have also been proposed in CPM paradigms, but here it has been noted that CPM (and diffuse noxious inhibitory control) operates at least partly at a spinal/brainstem level, at ‘lower’ level than that typically associated with attention (Youssef et al., 2016; Kucharczyk et al., 2023).

Despite the greater levels of noise in EEG recordings due to our highly mobile task, an LMM-based approach (that incorporates head-movement speed into the model) revealed that suppression of oscillatory power in the alpha and beta bands in contralateral parietal and temporal regions during tonic pain was consistent with previous studies on tonic pain (Ferracuti et al., 1994; Huishi Zhang et al., 2016; Dowman et al., 2008; Schulz et al., 2015). When comparing how the neural correlates of tonic pain impacted motivational vigour, no oscillatory changes survived correction for multiple comparisons—although a trend towards reduced alpha power was observed at parietal scalp regions. Changes in alpha band power have been shown to affect neural activity in areas associated with sensory integration, attention, and motor planning (Hu et al., 2013; Palva and Palva, 2011), all of which are linked to motivational vigour.

Finally, the methodological framework has clinical applicability as a way to objectively measure pain behaviour in chronic pain. For instance, this could be used to provide a behaviourally sensitive evaluation of pain as a function of diagnoses or treatment. In addition, it could be implemented using real-time analysis to provide a feedback signal for closed-loop interventions, such as interactive VR rehabilitation or deep brain stimulation.

Methods

The free-operant foraging task

Foraging theory is a broad and well-studied subject within animal behaviour research. Various foraging study paradigms have been established in both laboratory and natural environments (Constantino and Daw, 2015; Charnov, 1976; Lihoreau et al., 2011). A natural foraging task, while simple for a healthy human to perform, encapsulates key components relevant to pain research. Firstly, foraging is goal-directed; subjects actively search for food under varying levels of motivation, allowing for the examination of the interaction between motivation and pain. Secondly, a natural foraging task involves full-body movement. Given that many types of pain are closely linked to movement and exercise, motion capture technology—a fundamental component of immersive VR—allows for precise recording and analysis of movement under different pain conditions. Finally, foraging involves exploring the environment and learning an optimal strategy to maximize rewards, naturally linking pain to cognitive functions. Motivated by these considerations, we designed the free-operant foraging task aimed at simulating realistic foraging behaviours.

At the beginning of each block, virtual pineapples of varying heights from 0.33 to 1 m were generated within the virtual space. Five identical baskets were placed within the space. Spatial locations of trees and vegetation were generated using the game engine’s default tree painting tool (Unity Technologies, San Francisco, US).

Participants

A total number of fifty-eight healthy subjects with no significant mobility issues were recruited from community and a pool of university staff and students. Twenty-five subjects were recruited for experiment 1; one subject was excluded due to technical failure. The final sample included 24 subjects (17 female, mean age 25.71 years). Thirty-three subjects were recruited for experiment 2. Two subjects were excluded due to technical failure and compliance failure, respectively. The final sample included 31 subjects (15 female, mean age 24.81 years). The procedure was approved by the Research Ethics Committee of the University of Oxford (CUREC approval reference R58778), and all participants gave written informed consent at the start of the experiment following the Declaration of Helsinki. Subjects were reimbursed with £30-£45 depending on the duration of experiment, including a £10 performance incentive.

Statistical Analysis

Statistical analysis was carried out using custom scripts written in Python 3 and R 4.4.2. For LMMs, lme4 and lmerTest package was used (Bates et al., 2015; Kuznetsova et al., 2017). For LMMs presented in Figure 2 and Figure 4, fixed effects, random effects for intercepts and independent variables (IV) were estimated. The model also estimated the correlation between the intercept deviations and IV effect deviations across subjects. Supplementary materials provide additional details of the formula of the LMMs and their detailed outputs.

Model-fitting

Although the model descriptions in both experiments were not complicated, it requires additional consideration when fitting a discrete choice model to a realistic free-operant task data.

Experiment 1

As we disregarded planning, the model in experiment 1 compared Cpu(a) + Cm ·d(a) for each fruit at a particular time point. The first question was what were the fruits available for subjects to choose from. Fruits were abundant in the subjects’ reachable area, however, many were not directly in subjects’ vision field or hidden behind the trees. Subjects also needed to identify the fruits’ colour to know whether the fruit was painful or not. Therefore, there was an additional search (or exploration) process that was not modelled here. We utilized eye-tracking technology (HTC Vive Pro Eye), and assumed only fruits that were in the central vision field were considered by the subject. Fruits that were captured by eye-tracking were saved into a first-in-first-out memory queue (the available choice set). We heuristically set the memory size to be 7, and the memory was cleared each time the subject chose a fruit. Next question was when the subject made a decision. The first time point that the subject looked at a pineapple not seen before could be used as the decision point in this free-operant task. However, we could not evaluate the model when the chosen fruit was not in the memory yet, as the model would never make a correct prediction out of a set of wrong candidates. We could only retrospectively identify the fruit that the subject chose and only fitted the model when the correct choice was in the choice set. This acausal problem is difficult to solve without taking time into consideration. We will show in the experiment 2 model fitting, by quantifying the effort and opportunity costs as a function of time delay, the model-fitting process was made causal.

We had converted the problem into finding a set of hyperparameters so that at each decision point, the maximum value of all choice differential values Cpu(a) + Cm · d(a) predicts the choice that the subject choose. Readers might find the problem similar to the multinomial logistic regression. However, as a decision model, we chose the objective function for the model-fitting to be the error rate (number of correct predictions divided by number of total predictions) rather than weighted least squares. The model is also flexible to take arbitrary parameters other than linear coefficients. As for this experiment, we chose u to be a sigmoid shape function with two additional x scale and translation hyperparameters. Grid search was used as a general method to find the hyperparameters that minimizes the error rate. Grid search parameters can be found in supplementary materials.

Experiment 2

A key assumption in Niv et al. (2007) was that animals chose an action pair of an action and a delay. The delay included both the time waiting and the time executing the action. It was an approximation of real animal behaviour, and in fitting to real human subjects data in this task, we did not have direct access to subjects’ delay decision. Therefore, we tracked the delay and computed the differential value at each time point. The central assumption in this model fitting solution was all differential values for the choices that were not chosen should be less than the value for the final chosen choice. The model-fitting program tracked the model predicted value for each fruit that a subject fixated on. The model’s predicted choice was correct (i.e., fit the data) when the chosen fruit had the highest predicted value (compared to other fruits within the time interval before the last fixation on the chosen fruit). Otherwise, the model’s predicted choice was incorrect.

Experimental pain stimulation

Electric phasic pain

The electric stimulation pulses were generated by a DS5 isolated bipolar constant current stimulator (Digitimer, Letchworth Garden City, UK) and delivered using a Wasp pain stimulation electrode. The pulses were 200 Hz square waves with a 1 ms pulse width. Calibration of stimulation intensity was conducted based on subjects’ pain ratings before the task commenced. To maintain consistency in pain ratings, additional calibration could be performed at the end of the 10th block in experiment 1 or at the end of every six blocks in experiment 2. Subjects rated their pain on a 0-10 numerical rating scale while researchers adjusted the stimulation intensities following a staircase procedure. The calibration process was considered complete when subjects provided consistent pain ratings within the desired range, as determined by the experimental design (8/10 in experiment 1 for the max pain condition; 3/10 in experiment 2 for the low phasic pain condition and 7/10 for the high phasic pain condition). At the end of each block, subjects provided a pain rating for that block. Ratings in both procedures were given verbally. During calibration, researchers recorded the subjects’ verbal ratings, whereas in post-block ratings, verbal responses were processed using speech recognition services (Microsoft, Redmond, US) and automatically converted into a visual analogue scale bar displayed in VR. In case of speech recognition failure, researchers wrote down the ratings and manually updated the data for analysis.

Pressure tonic pain

We used pressure cuffs (VBM, Sulz, Germany) to safely induce tonic pain (Graven-Nielsen et al., 2015; Joseph et al., 2022). The cuff was wrapped around the non-dominant arm’s biceps. The cuffs were inflated prior to the start of each required block and immediately deflated after the block ended. Subjects were asked to report verbally when they felt discomfort and when they felt pain as the researcher manually inflated the cuff. Inflation stopped once the subject reported pain. To control for variables associated with wearing the cuff other than pain, pressure cuffs remained wrapped around the subjects’ arms even in blocks without tonic pain.

Skin conductance

Data collection processing

SCRs were collected using a custom built wireless sensor. The microcontroller was an Arduino Nano 33 IoT (Arduino SA, Chiasso, Switzerland). The skin conductance sensor was a Grove GSR sensor (Seeed Studio, Shenzhen, China). The firmware was written by the authors of this article with a designated sampling rate of 1000 Hz. Data was segmented by experimental blocks (60 seconds long each). Entire block of data was removed from further analysis if the data loss was more than 20% of the expected total number of samples due to temporary wireless transmission failure. Each block of data was down-sampled to 10 Hz, and re-centered by the average baseline. A median filter with a window size of 5 was applied to the down-sampled data to further remove artifacts. Further small gaps of missing data were handled in the GLM fitting in R with na.action set to na.omit. As subjects would only know the intensities of pain stimulation in new blocks after picking up a painful fruit, for fixation events, only events after picking up the first painful fruit were analysed.

Effect coefficient estimate

Generalised linear models (GLM) were used to fit the skin conductance time series data to the constructed time series data. The time series was generated by convolving the event trigger with the CRF, with hyperparameters taken from (Bach et al., 2010). Fixation events that occurred during picking up painful fruit were not included as shock-evoked SCR was strong. For computational practicality, convolutions were truncated between 0 and 60 seconds (the length of an experimental block) (Bach et al., 2009). The GLMs employed a Gaussian distribution with an identity link function. Each GLM was fitted separately to individual blocks, generating its own set of coefficient estimates. Fitted coefficients were thresholded based on a maximum p-value of 0.05. Additionally, coefficients with z-scores greater than 3 were excluded. The remaining coefficients, including negative ones, were used for further correlation analysis, as our focus was on the relationship between decision values, ratings, and coefficients.

EEG

Data collection and processing

Whole-scalp EEG data was continuously recorded with 32-channel LiveAmp system (BrainProducts GmbH, Munich, Germany) at 500Hz. We choose FCz as the reference. To remove ocular artifacts, recordings were first cleaned with EEGLAB (v2023.1) by removing channel drift (linear filter (FIR) transition band from 0.25 to 0.75 Hz). It also automatically removes bad data periods by thresholding max acceptable 0.5 second window standard deviation of 20, max acceptable channel RMS range of 7, and maximum out-of-bound channels of 25%. The ICA was then performed on the cleaned data and the weight matrix was computed and saved separately. Ocular artifacts components were identified manually, assisted by automated classifier ICLabel (Pion-Tonachini et al., 2019). We then cleaned ocular artifacts by removing these identified components back projected onto the raw data. By manually inspecting the data, we interpolated up to 10% (3 channels) of electrodes. The output data at this stage served as the input data for ERP and time-frequency analysis separately.

ERP analysis

ERP analysis was conducted on the preprocessed data in Python with MNE-Python 1.7.0 package. The data was re-referenced to the average amplitude of all electrodes. A bandpass filter from 1 to 30 Hz was applied. The high cutoff frequency was chosen because of the presence of strong motion artifacts in our mobile EEG data. Identical analysis was applied at 0.1Hz and 0.5Hz high-pass cutoff frequencies, and significant results remained unchanged (see Supplementary figures). Recordings were processed into 1s epoch with 0.3s baseline before the event trigger. Epochs with peak-to-peak amplitude exceeding 200 µV in Cz channel were excluded (average rate of excluded epochs was 2.81%). We search for N1 amplitude peaks from 100-170ms and P2 amplitude peak from 140-300ms at channel Cz (Favero et al., 2023; van den Broeke et al., 2010).

Time-frequency analysis

Time-frequency analysis was performed on preprocessed data in Python with custom scripts relying on MNE-Python 1.7.0 package. The data was re-referenced to the average amplitude of all electrodes. A notch filter with 4Hz width centered at 50Hz was applied. Data were epoched from 0-0.5s according to the decision points. The decision points were chosen when subjects fixated on the chosen fruit for the first time or a re-evaluation point. A re-evaluation point was a subsequent fixation on the chosen fruit when other fruit had higher value than the chosen fruit but the new fixation on the chosen fruit surpassed other fruits’ value. These time points were assumed to represent the key moments for decisions and thus chosen for analysing correlated neural activities.

The spectral power was estimated using Welch’s method. A Hamming-tapered window was used with the window length equal to the epoch length (250 samples, 0.5 second). The estimated spectral power was separated into four bands: theta 4-7 Hz, alpha 8-12 Hz, beta 13-29 Hz, gamma 30-100 Hz (Levy et al., 2023; Chrastil et al., 2022; Schulz et al., 2015), and averaged within each band range. For each power band, averaged band power was inputted into an LMM for each channel as the dependent variable (Schulz et al., 2015). For each epoch, the z-score of the averaged band power was computed for each channel within the subject, and the epoch would be removed from LMM fitting for this band if any channel had a z-score greater than 3 (Nolan et al., 2010). For tonic pain effect, the binary condition of tonic pain presence was used as an independent variable. Random effects for tonic pain and intercepts were estimated. The model also estimates the correlation between the intercept deviations and tonic pain effect deviations across subjects. Crucially, a single head movement speed variable was added as an additional independent variable to reduce motion artifacts. The head movement speed for a particular epoch was calculated by averaging the speed between 10 Unity frames before the decision point and 10 Unity frames after the decision point. For vigour constants, the LMM has the same structure except replacing the binary tonic pain condition by fitted vigour constants presented in Figure 7A. Source analysis for spectral power was included in the Supplementary materials.

Data and code availability

Complete code (without raw data) is currently available on Github: https://github.com/ShuangyiTong/Phasic-and-tonic-pain-serve-distinct-functions-during-adaptive-behaviour All code and data will be available open source, released upon acceptance of the paper.

Acknowledgements

ST would like to thank Pranav Mahajan, Yijia Yan, Suyi Zhang, and Eoin Kelleher for their feedback on the content of this paper during the work-in-progress stage. ST also thanks Kirsty Bannister, Joseph Taylor, Kristian Hennings for sharing valuable knowledge on the safe use of pressure cuffs to induce tonic pain. Special thanks to Eoin Kelleher for providing an exceptional visual representation of the task (Fig. 1A). ST would like to thank the reviewers of IASP 2022, CCN 2023, and COSYNE 2024 for their feedback.

The work was funded by Wellcome Trust (214251/Z/18/Z, 203139/Z/16/Z and 203139/A/16/Z), IITP (MSIT 2019-0-01371, RS-2023-00233251), JSPS (22H04998), and EPSRC (EP/W03509X/1). This research was also partly supported by the NIHR Oxford Health Biomedical Research Centre (NIHR203316) and the Royal Academy of Engineering. The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care. For the purpose of open access, the author has applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission.

Additional information

Author contributions

ST and BS designed the theory and experimental protocols. ST, BS, and TD developed the VR research platform used for the experiments. ST collected and analysed the data. ST and DH designed the EEG analysis pipeline and methodology. SL and TD provided critical feedback and theoretical insights throughout multiple iterations of the study. ST and BS wrote the first draft. ST, TD, DH, SL, and BS edited and approved the manuscript. TD, SL, and BS acquired funding for the study.

Funding

Wellcome Trust (10.35802/214251)

Wellcome Trust (10.35802/203139)

IITP (MSIT 2019-0-01371)

IITP (RS-2023-00233251)

JSPS (22H04998)

EPSRC (EP/W03509X/1)

NIHR Oxford Health Biomedical Research Centre (NIHR203316)

Royal Academy of Engineering

Additional files

Supplementary materials