The control of tonic pain by active relief learning

  1. Suyi Zhang (corresponding author)
  2. Hiroaki Mano
  3. Michael Lee
  4. Wako Yoshida
  5. Mitsuo Kawato
  6. Trevor W Robbins
  7. Ben Seymour (corresponding author)

  1. University of Cambridge, United Kingdom
  2. Advanced Telecommunications Research Institute International, Japan
  3. National Institute for Information and Communications Technology, Japan
7 figures, 6 tables and 1 additional file

Figures

Figure 1
Experimental paradigms.

(a) Example trial in Experiment 1, an instrumental relief learning task (Ins) with fixed relief probabilities, yoked within subject to an identical Pavlovian task (Pav). In instrumental trials, subjects saw one of two images (‘cues’) and then chose a left or right button press, with each action associated with a particular probability of relief. In the yoked Pavlovian session, subjects were simply asked to press the button matching the action shown on screen (appearing 0.5 s after CS onset). (b) Instrumental/Pavlovian session yoking and cue-outcome contingency in Experiment 1; arrows represent identical stimulus-outcome sequences. Note that in the contingency table, left and right button presses were randomised for both actions and cues. (c) Relief and no-relief outcomes. Individually calibrated, constant temperatures of around 44°C were used to elicit tonic pain; a brief drop in temperature of 13°C was used as the relief outcome (4 s in Experiment 1, 3 s in Experiment 2), whereas the temperature did not change for the same duration in no-relief outcomes. (d) Example trial in Experiment 2, in which subjects performed an instrumental paradigm (only) with unstable relief probabilities. The cue-action representation differed from Experiment 1: three cues were presented alongside each other, and subjects chose one of the three with a button press. The position of each cue varied from trial to trial, and the same three cues were presented throughout. Tonic pain ratings were taken before the outcome was experienced, not after as in Experiment 1. (e) Example traces of the dynamic relief probabilities for the three displayed cues across all trials in eight sessions of Experiment 2, which required a constant trade-off between exploration and exploitation throughout the task. Dynamic relief probabilities also provided varying uncertainty throughout learning.

https://doi.org/10.7554/eLife.31949.003
Figure 2 with 3 supplements
Experiment 1: behavioural results.

(a) Choice-fitted model comparison: the TD model fitted choices in instrumental sessions best (TD: action-learning model with fixed learning rate; Hybrid: action-learning model with associability as a changing learning rate; WSLS: win-stay-lose-shift model). Model frequency represents how likely it is that a model generated the data of a randomly chosen participant, while exceedance probability estimates the probability that one model is more likely than any other (Stephan et al., 2009). (b) SCRs in instrumental vs Pavlovian sessions (n = 15; sessions in which over 20% of trials had amplitudes <0.02 were excluded). (c) Associability from the hybrid model fitted trial-by-trial SCRs best in instrumental sessions (Assoc: associability; Hyb: hybrid model; RW: Rescorla-Wagner model). (d) Associability also fitted SCRs from Pavlovian sessions best. (e) Neither pain nor relief ratings differed significantly between instrumental and Pavlovian sessions (participants’ ratings were averaged for each of the four categories shown; mean = 8 ratings per person per category).
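The contrast between the fixed learning rate of the TD model and the associability-modulated learning rate of the hybrid model can be sketched as follows. This is a minimal illustration rather than the authors' fitting code: it assumes a single-step value update for the TD model and a commonly used Pearce-Hall style form for the hybrid update, the function names are hypothetical, and the parameter values are the Experiment 1 group means listed in Table 5, used purely as illustrative numbers.

```python
def td_update(q, outcome, alpha):
    """Fixed-learning-rate update; returns (new value, prediction error)."""
    delta = outcome - q
    return q + alpha * delta, delta


def hybrid_update(v, assoc, outcome, kappa, eta):
    """Assumed hybrid update: associability scales the learning rate.

    Returns (new value, new associability, prediction error).
    """
    delta = outcome - v
    new_v = v + kappa * assoc * delta
    # Associability tracks the recent magnitude of prediction errors.
    new_assoc = eta * abs(delta) + (1.0 - eta) * assoc
    return new_v, new_assoc, delta


def run(outcomes, alpha=0.401, kappa=0.527, eta=0.413):
    """Run both learners on a 0/1 outcome sequence (1 = relief).

    Initial states follow Table 5: Q0 = 0, V0 = 0, associability = 1.
    """
    q, v, assoc = 0.0, 0.0, 1.0
    trace = []
    for outcome in outcomes:
        q, _ = td_update(q, outcome, alpha)
        v, assoc, _ = hybrid_update(v, assoc, outcome, kappa, eta)
        trace.append((q, v, assoc))
    return trace


if __name__ == "__main__":
    for q, v, assoc in run([1, 1, 1, 0, 1]):
        print(f"TD value {q:.3f}  hybrid value {v:.3f}  associability {assoc:.3f}")
```

Under these assumptions, associability falls while outcomes are well predicted and rises again after a surprising outcome, which is why it can serve as a trial-by-trial uncertainty signal.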

https://doi.org/10.7554/eLife.31949.004
Figure 2—source data 1

Experiment 1: behavioural data, including SCRs, choices and ratings, can be found in the attached zip file.

https://doi.org/10.7554/eLife.31949.008
Figure 2—figure supplement 1
Experiment 1: raw skin conductance traces. Vertical lines mark the beginning of each trial, when cue display starts (n = 15; excluded participants not shown; the first non-excluded session from each participant is shown).
https://doi.org/10.7554/eLife.31949.005
Figure 2—figure supplement 2
Experiment 1: filtered skin conductance traces (band-pass at 0.0159–2 Hz, 1st-order Butterworth), averaged across all trials within participant (n = 15; excluded participants not shown; shaded region represents SEM across all participants).
https://doi.org/10.7554/eLife.31949.006
Figure 2—figure supplement 3
Experiment 1: Model protected exceedance probability.

Choice fitting remains similar to the original exceedance probability; however, the SCR fitting comparison becomes less clear regarding the best-fitting model.
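For orientation, an exceedance probability can be approximated by sampling model frequencies from a Dirichlet posterior and counting how often each model comes out as the most frequent. The protected version then shrinks those values towards chance in proportion to the probability that the models do not differ at all. The sketch below assumes hypothetical Dirichlet counts and a hypothetical value for that null probability (called `bor` here); neither number comes from the article, and the function names are illustrative.

```python
import random


def exceedance_probabilities(alpha, n_samples=20000, seed=0):
    """Monte Carlo estimate of P(model k is the most frequent model).

    alpha: Dirichlet parameters, one per model. Unnormalised gamma draws
    suffice because normalising does not change which draw is largest.
    """
    rng = random.Random(seed)
    wins = [0] * len(alpha)
    for _ in range(n_samples):
        draws = [rng.gammavariate(a, 1.0) for a in alpha]
        wins[draws.index(max(draws))] += 1
    return [w / n_samples for w in wins]


def protect(ep, bor):
    """Mix exceedance probabilities with chance level 1/K, weighted by bor."""
    k = len(ep)
    return [p * (1.0 - bor) + bor / k for p in ep]


if __name__ == "__main__":
    alpha = [12.0, 4.0, 3.0]  # hypothetical posterior counts for 3 models
    ep = exceedance_probabilities(alpha)
    print("exceedance:", [round(p, 3) for p in ep])
    print("protected: ", [round(p, 3) for p in protect(ep, bor=0.3)])
```

With a large null probability the protected values move towards 1/K, which is consistent with a comparison looking less clear-cut after protection.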

https://doi.org/10.7554/eLife.31949.007
Figure 3
Experiment 1: neuroimaging results, shown at p<0.001 uncorrected.

(a) TD model prediction errors (PE) as parametric modulators at outcome onset time (duration = 3 s). (b) Model PE posterior probability maps (PPMs) from group-level Bayesian model selection (BMS) within the PE cluster mask; warm colour: TD model PE, cool colour: hybrid model PE (shown at exceedance probability P>0.7). (c) Axiomatic analysis of hybrid model PEs in instrumental sessions; ROIs were 8 mm spheres from BMS peaks favouring TD model PEs, in left putamen and VMPFC. (d) Associability uncertainty generated by the hybrid model, as parametric modulators at choice time (duration = 0), in instrumental sessions. (e) Comparison of pgACC activations across instrumental/Pavlovian paradigms; the ROI was an 8 mm sphere at [−3, 40, 5], the peak from overlaying the pgACC clusters from Experiments 1 and 2.

https://doi.org/10.7554/eLife.31949.009
Figure 4 with 3 supplements
Experiment 2: behavioural results.

(a) Model comparison showed that the TD model fitted choices best (Bayesian: hierarchical Bayesian model; HMM: hidden Markov model; Hybrid: action-learning model with associability as a changing learning rate). (b) SCRs measured on the side with thermal stimulation (‘Stim side’, left hand) were lower than those on the side without stimulation (‘Non-stim side’, right hand), but the two were highly correlated. (c) Associability from the state-learning hybrid model fitted SCRs best, as in Experiment 1. (d) Trial-by-trial associability from the hybrid model fitted pain ratings best compared with other uncertainty measures (entropy: HMM entropy; surprise: TD model prediction error magnitude from the previous trial; null model: regression with no predictors). (e) Regression coefficients with associability as the uncertainty predictor were significantly negative across subjects.

https://doi.org/10.7554/eLife.31949.012
Figure 4—source data 1

Experiment 2: behavioural data, including SCRs, choices and ratings, can be found in the attached zip file.

https://doi.org/10.7554/eLife.31949.016
Figure 4—figure supplement 1
Experiment 2: raw skin conductance traces. Vertical lines mark the beginning of each trial, when cue display starts (n = 20; excluded participants not shown; the first non-excluded session from each participant is shown).
https://doi.org/10.7554/eLife.31949.013
Figure 4—figure supplement 2
Experiment 2: filtered skin conductance traces (band-pass at 0.0159–2 Hz, 1st-order Butterworth), averaged across all trials within participant (n = 20; excluded participants not shown; shaded region represents SEM across all participants).

In Experiment 2, pain ratings took place immediately after the cue display period, with a variable rating duration (participants terminated the rating whenever they finished). This increased time gap between cue display and outcome accounts for the second peak in the trial-averaged SCR trace.

https://doi.org/10.7554/eLife.31949.014
Figure 4—figure supplement 3
Experiment 2: model protected exceedance probability.

Choice, SCR and rating fits all remain similar to the original exceedance probability figures.

https://doi.org/10.7554/eLife.31949.015
Figure 5 with 2 supplements
Experiment 2: neuroimaging results, shown at p<0.001 uncorrected.

(a) TD model prediction errors (PE) at outcome onset time (duration = 3 s). (b) Model PE posterior probability maps (PPMs) from group-level Bayesian model selection; warm colour: TD model PE, cool colour: hybrid model PE (both shown at exceedance probability p>0.80). (c) Axiomatic analysis, separating trials according to outcomes and predicted relief values (bins 1–3 from low to high); the BOLD activity pattern in the striatum (putamen) satisfied the axioms for a relief PE. (d) Associability uncertainty generated by the hybrid model correlated with pgACC activity at choice time (duration = 0). (e) pgACC activation beta values across all subjects; the ROI was an 8 mm sphere at [−3, 40, 5], the peak from overlaying the pgACC clusters from Experiments 1 and 2.
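The binning behind the axiomatic test in panel (c) can be illustrated with a short sketch. It assumes two standard requirements for a prediction-error signal: responses are larger for relief than for no relief at every level of expectation, and responses fall as predicted relief value rises within each outcome type. The trial tuples, the function names and the synthetic signal are hypothetical and only illustrate the logic, not the authors' ROI analysis.

```python
from statistics import mean


def bin_means(trials, n_bins=3):
    """Average a signal per (outcome, predicted-value bin).

    trials: list of (predicted_value, got_relief, signal) tuples.
    Bins are equal-sized rank groups; bin 0 holds the lowest predictions.
    """
    ordered = sorted(trials, key=lambda t: t[0])
    size = len(ordered) / n_bins
    grouped = {}
    for rank, (_, got_relief, signal) in enumerate(ordered):
        b = min(int(rank / size), n_bins - 1)
        grouped.setdefault((got_relief, b), []).append(signal)
    return {key: mean(values) for key, values in grouped.items()}


def looks_like_pe(means, n_bins=3):
    """Check the two assumed axioms; every (outcome, bin) cell must exist."""
    better = all(means[(True, b)] > means[(False, b)] for b in range(n_bins))
    falling = all(
        means[(outcome, b)] > means[(outcome, b + 1)]
        for outcome in (True, False)
        for b in range(n_bins - 1)
    )
    return better and falling


# Synthetic check: a signal equal to outcome minus prediction passes,
# whereas a signal that only codes the outcome does not.
values = [i / 10 for i in range(1, 10)]
pe_trials = [(v, o, (1.0 if o else 0.0) - v) for v in values for o in (True, False)]
outcome_only = [(v, o, 1.0 if o else 0.0) for v in values for o in (True, False)]

if __name__ == "__main__":
    print(looks_like_pe(bin_means(pe_trials)))
    print(looks_like_pe(bin_means(outcome_only)))
```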

https://doi.org/10.7554/eLife.31949.018
Figure 5—figure supplement 1
Overlay of associability-associated pgACC responses from both experiments (displayed at p<0.001 unc., crosshair at [−3, 40, 5]).
https://doi.org/10.7554/eLife.31949.019
Figure 5—figure supplement 2
Overlay of prediction-error-associated responses from both experiments (displayed at p<0.001 unc., showing overlapping dorsal putamen and amygdala clusters).
https://doi.org/10.7554/eLife.31949.020

Tables

Table 1
Multiple comparisons correction for Experiment 1 (cluster-forming threshold of p<0.001 uncorrected, regions from Harvard-Oxford atlas). *FWE cluster-level corrected (showing p<0.05 only).
https://doi.org/10.7554/eLife.31949.010
                              MNI coordinates (mm)
p*      k    T      Z         x      y      z      Region mask

TD model PE, instrumental sessions
0.007   4    4.27   3.5       −21    −5     −14    Amygdala L
0.011   3    4.98   3.9       28     −1     −14    Amygdala R
0       28   5.31   4.07      −21    3      −7     Putamen L
             4.7    3.75      −28    −5     1
0.003   14   5.73   4.27      20     7      −7     Putamen R
0.034   2    3.75   3.18      28     −1     8
0.007   4    4.63   3.71      −17    3      −3     Pallidum L
0.003   9    5.2    4.01      17     7      −3     Pallidum R

Hybrid model PE, instrumental sessions
0.005   5    4.3    3.52      −21    −5     −14    Amygdala L
0.014   2    4.53   3.65      28     −1     −14    Amygdala R
0.004   12   5.02   3.92      −21    3      −7     Putamen L
0.012   6    4.55   3.66      −28    3      8
0.046   1    3.82   3.23      −28    11     −3
0.001   23   5.03   3.92      20     7      −7     Putamen R
             4.92   3.87      20     7      1
             4.39   3.57      24     −1     5
0.006   5    4.04   3.36      −17    3      −3     Pallidum L
0.005   6    4.82   3.81      17     7      1      Pallidum R

Hybrid model PE, Pavlovian sessions
None

Hybrid model associability, instrumental sessions
0.027   5    4.34   3.55      −2     37     5      Cingulate Anterior

Table 5
Experiment 1 learning model fitting results.
https://doi.org/10.7554/eLife.31949.011
Model (Options)              Data fitted (sessions)   Parameters             Mean    Std     Initial states
TD (*)                       choice (instrumental)    learning rate, α       0.401   0.087   Q0=0
WSLS (*)                     choice (instrumental)    pseudo Q (cue 1), p1   0.382   0.073   No hidden states
                                                      pseudo Q (cue 2), p2   0.458   0.075
Hybrid Action learning (*)   choice (instrumental)    free parameter κ       0.527   0.104   Q0=0
                                                      free parameter η       0.413   0.125   α0=1
RW - V (†)                   SCR (instrumental)       learning rate, α       0.492   0.013   V0=0
RW - V (†)                   SCR (Pavlovian)          learning rate, α       0.492   0.014   V0=0
Hybrid - Assoc (†)           SCR (instrumental)       free parameter κ       0.497   0.004   V0=0
                                                      free parameter η       0.495   0.004   α0=1
Hybrid - Assoc (†)           SCR (Pavlovian)          free parameter κ       0.498   0.003   V0=0
                                                      free parameter η       0.496   0.008   α0=1
Hybrid - V (†)               SCR (instrumental)       free parameter κ       0.492   0.012   V0=0
                                                      free parameter η       0.499   0.003   α0=1
Hybrid - V (†)               SCR (Pavlovian)          free parameter κ       0.494   0.005   V0=0
                                                      free parameter η       0.5     0.003   α0=1

  1. * Fitting options: muTheta, muPhi = 0, sigmaTheta, sigmaPhi = 1.
  2. † muTheta, muPhi = 0, sigmaTheta = 0.05, sigmaPhi = 1.


Table 6
Experiment 2 learning model fitting results.
https://doi.org/10.7554/eLife.31949.017
Model (Options)              Data fitted       Parameters                       Mean     Std     Initial states
TD (*)                       choice            learning rate, α                 0.577    0.28    Q0=0
Hybrid Action learning (*)   choice            free parameter κ                 0.774    0.381   Q0=0
                                               free parameter η                 0.14     0.139   α0=1
HMM (*)                      choice            state transition probability β   0.275    0.213   Q0=0.5
                                               relief outcome bias c            0.535    0.212
                                               no relief outcome bias d         0.027    0.072
Bayesian (‡)                 choice            level 2 (outcome) κ              0.331    0.239   Q0=0
                                               level 2 (outcome) ω              −0.423   1.396
                                               level 3 (belief) θ               0.45     0.03
RW - V (†)                   SCR (bilateral)   learning rate, α                 0.46     0.054   V0=0
Hybrid - Assoc (†)           SCR (bilateral)   free parameter κ                 0.49     0.01    V0=0
                                               free parameter η                 0.488    0.027   α0=1
Hybrid - V (†)               SCR (bilateral)   free parameter κ                 0.48     0.034   V0=0
                                               free parameter η                 0.496    0.013   α0=1

  1. * Fitting options: muTheta, muPhi = 0, sigmaTheta, sigmaPhi = 1.
  2. † muTheta, muPhi = 0, sigmaTheta = 0.05, sigmaPhi = 1.
  3. ‡ muTheta = [0, −2, 0], muPhi = 0, sigmaTheta, sigmaPhi = 1.

Table 2
Multiple comparisons correction for Experiment 2 (cluster-forming threshold of p<0.001 uncorrected, regions from Harvard-Oxford atlas). *FWE cluster-level corrected (showing p<0.05 only).
https://doi.org/10.7554/eLife.31949.021
                              MNI coordinates (mm)
p*      k    T      Z         x      y      z      Region mask

TD model PE
0.002   15   4.31   3.63      −25    −5     −22    Amygdala L
0.003   11   4.36   3.66      24     −8     −14    Amygdala R
0.018   1    3.97   3.41      28     −1     −26
0.002   22   5.9    4.52      −32    −8     5      Putamen L
0.021   4    4.55   3.78      32     −16    1      Putamen R

Hybrid model PE
0.001   16   4.36   3.66      −21    −12    −14    Amygdala L
             4.23   3.58      −21    −1     −18
0.002   13   4.95   4.01      24     −8     −18    Amygdala R
             4.34   3.65      28     −1     −26
0.003   17   5.49   4.31      −32    −8     5      Putamen L

Hybrid model associability
0.001   29   4.5    3.75      −6     40     12     Cingulate Anterior
             4.44   3.71      −2     33     23
             4.08   3.49      −2     44     5
             3.93   3.38      2      40     1

Table 3
Details of subjective ratings for Experiments 1 and 2.
https://doi.org/10.7554/eLife.31949.022
Experiment     Rating type           Rating timing                                                                 Avg # of ratings per subject
Experiment 1   Instrumental pain     After 3 s cue + choice window AND outcome (rating type depends on outcome)    8.2
               Instrumental relief                                                                                 7.7
               Pavlovian pain                                                                                      8.1
               Pavlovian relief                                                                                    7.7
Experiment 2   Instrumental pain     After 3 s cue + choice window, BEFORE outcome                                 70.9

Table 4
All learning models fitted (bold: winning model; AL - action-learning; SL - state-learning, F - variational Bayesian approximation to the model’s marginal likelihood, used for model comparison)
https://doi.org/10.7554/eLife.31949.023
Experiment 1 (Instrumental sessions)
Choice                  F (n = 19, sum [sem])   SCR                           F (n = 15, sum [sem])
TD                      −1330.920 [3.604]       RW - value                    −1079.153 [8.024]
Hybrid (AL)             −1345.667 [3.664]       Hybrid (SL) - value           −1077.911 [8.059]
WSLS                    −1486.723 [3.973]       Hybrid (SL) - associability   −1077.699 [8.003]

Experiment 1 (Pavlovian sessions)
Choice (not available)                          SCR                           F (n = 15, sum [sem])
N/A                                             RW - value                    −1101.079 [7.132]
                                                Hybrid (SL) - value           −1096.250 [7.195]
                                                Hybrid (SL) - associability   −1095.135 [7.106]

Experiment 2 (Instrumental sessions, Pavlovian not available)
Choice                  F (n = 23, sum [sem])   SCR                           F (n = 20, sum [sem])
TD                      −3572.476 [8.736]       RW - value                    −7867.834 [60.668]
Hybrid (AL)             −3626.478 [8.946]       Hybrid (SL) - value           −7857.341 [60.643]
HMM                     −3571.020 [9.067]       Hybrid (SL) - associability   −7841.864 [60.838]
Bayesian Hierarchical   −3784.372 [8.616]
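
Because F approximates the log model evidence, summed values can be compared directly: a higher (less negative) sum indicates a better fixed-effects fit, and the difference between two sums approximates a group log Bayes factor. The following is a minimal sketch using the Experiment 1 choice values from the table above. The function name is hypothetical, and this fixed-effects view is not the random-effects comparison (model frequency, exceedance probability) reported in the figures, so the two need not agree.

```python
# Summed F for choice models, Experiment 1 instrumental sessions (Table 4).
summed_f = {
    "TD": -1330.920,
    "Hybrid (AL)": -1345.667,
    "WSLS": -1486.723,
}


def rank_by_evidence(f_values):
    """Return (model, F minus best F) pairs, best model first."""
    best = max(f_values.values())
    ranked = sorted(f_values.items(), key=lambda item: item[1], reverse=True)
    return [(name, f - best) for name, f in ranked]


ranking = rank_by_evidence(summed_f)

if __name__ == "__main__":
    for name, diff in ranking:
        # diff is the summed log evidence relative to the best model.
        print(f"{name:12s} {diff:9.3f}")
```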

Cite this article

Suyi Zhang, Hiroaki Mano, Michael Lee, Wako Yoshida, Mitsuo Kawato, Trevor W Robbins, Ben Seymour (2018)
The control of tonic pain by active relief learning
eLife 7:e31949.
https://doi.org/10.7554/eLife.31949