
Punishment insensitivity emerges from impaired contingency detection, not aversion insensitivity or reward dominance

  1. Philip Jean-Richard-dit-Bressel
  2. Cassandra Ma
  3. Laura A Bradfield
  4. Simon Killcross
  5. Gavan P McNally (corresponding author)
  1. UNSW Sydney, Australia
  2. University of Technology Sydney, Australia
  3. St Vincent’s Centre for Applied Medical Research, Australia
Research Article
Cite this article as: eLife 2019;8:e52765 doi: 10.7554/eLife.52765

Abstract

Our behaviour is shaped by its consequences – we seek rewards and avoid harm. It has been reported that individuals vary markedly in their avoidance of detrimental consequences, that is, in their sensitivity to punishment. The underpinnings of this variability are poorly understood; they may be driven by differences in aversion sensitivity, motivation for reward, and/or instrumental control. We examined these hypotheses by applying several analysis strategies to the behaviour of rats (n = 48; 18 female) trained in a conditioned punishment task that permitted concurrent assessment of punishment, reward-seeking, and Pavlovian fear. We show that punishment insensitivity is a unique phenotype, unrelated to differences in reward-seeking and Pavlovian fear, and due to a failure of instrumental control. Subjects insensitive to punishment are afraid of aversive events; they are simply unable to change their behaviour to avoid them.

Introduction

Our behaviours, decisions, and choices are shaped by their consequences. When rewarded, they are likely to be repeated, but when punished they are not. Reward and punishment are among the most fundamental psychological building blocks of behaviour. They allow us to cope with a changing world, maximising our probability of survival by seeking utility and avoiding harm. Yet there is often pronounced variation between individuals in responsivity to the consequences of their behaviours. Notably, individuals differ significantly in their sensitivity to punishment (Corr, 2004; Corr, 2013; Gray, 1970; Gray, 1982; Marchant et al., 2018). Insensitivity to punishment is observed experimentally as impaired suppression of behaviours that cause aversive events. Punishment sensitivity plays an important role in normal learning, decision making as well as emotion (Corr, 2004). Differences in sensitivity to punishment have been implicated in the aetiology or maintenance of a range of psychopathologies including conduct disorder (Briggs-Gowan et al., 2014; Dadds and Salmon, 2003), drug and behavioural addictions (Vanderschuren et al., 2017), eating disorders (Monteleone et al., 2018), psychopathy (Blair et al., 2006; Gregory et al., 2015), and depression (Elliott et al., 1996; Eshel and Roiser, 2010). Moreover, punishment sensitivity is an increasingly popular measure of the motivation to engage in drug-seeking and drug-taking (Augier et al., 2018; Deroche-Gamonet et al., 2004; Kasanetz et al., 2013; Marchant et al., 2018; Pascoli et al., 2015; Vanderschuren and Everitt, 2004; Vanderschuren et al., 2017).

The cause(s) of differences in punishment sensitivity are poorly understood. Three main mechanisms have been proposed (Figure 1). First, individual differences in punishment learning may be due to temperamental differences in aversive valuation or aversion sensitivity (Corr, 2004; Gray, 1970). Successful punishment learning requires that the punisher be encoded as aversive. If individuals differ in the extent to which they are sensitive to the aversiveness of punishers, then they are likely to differ in the extent to which they will suppress any behaviour that produces a punisher. A second possibility is that punishment insensitive individuals show reward dominance, with choices and behaviour more strongly determined via the value of any rewards they earn rather than any punishment they incur (O'Brien and Frick, 1996; Robinson and Berridge, 2003). A final, and not mutually exclusive, possibility is that individual differences in punishment sensitivity emerge from individual differences in aversive instrumental learning and control (Seligman, 1970). Punishment learning involves encoding the instrumental contingency between behaviour and its adverse consequences. It is possible that punishment insensitive and sensitive individuals may encode the punisher as equally aversive but differ in their ability to detect or encode the contingency between their behaviour and the punisher and/or in their ability to control behaviour according to this instrumental knowledge.

Figure 1. Potential sources of punishment insensitivity.

Mechanistic behavioural assessment of differences in punishment sensitivity is difficult. One way of distinguishing between these different mechanisms is to examine responses to the punisher directly. However, the magnitude of the unconditioned response to a stimulus has little bearing on what is learned about that stimulus (Rescorla, 1988). An alternative approach is to use tasks that dissociate reward and aversion learning to reveal what relationships exist between them, thus allowing diagnosis of common origins. For example, if individual differences in punishment sensitivity are due to differences in aversive valuation or sensitivity (Corr, 2004; Gray, 1970), then this should be reflected in other forms of learning about the same aversive event in the same individuals, such as Pavlovian conditioning. So, insight into the origins of differences in punishment sensitivity could be obtained through comparisons of instrumental reward learning, instrumental punishment learning, and Pavlovian fear learning. However, there are methodological issues involved when making such assessments. First, in order to understand individual variation, these different forms of learning have to be assessed in the same individuals. Second, to avoid carry-over effects, they must be assessed concurrently. Finally, the same measure should be used to quantify each form of learning. Few tasks solve each of these methodological issues and none have been used to understand individual differences in punishment learning.

Here we used a conditioned punishment task that permitted us to concurrently identify and study individual differences in instrumental reward learning, instrumental punishment learning, and Pavlovian fear learning using the same behavioural measure (Killcross et al., 1997). Rats were trained to respond on two levers for food reward. We then introduced concurrent punishment and Pavlovian fear contingencies on one lever but not the other. We used rates of lever pressing as our measure of reward, punishment, and fear. In addition to direct comparisons of lever pressing performance between the three contingencies, we used a variety of data analytic strategies (multidimensional scaling, principal components analysis, factor analysis, and k-means clustering) to understand the relationship between individual differences in instrumental reward learning, instrumental punishment learning, and Pavlovian fear learning. If punishment insensitivity is attributable to differences in reward dominance, then individual differences in punishment learning should be related to differences in instrumental reward seeking. If punishment insensitivity is attributable to aversion insensitivity, then individual differences in punishment and fear should be related to each other. Finally, if individual differences in punishment insensitivity are attributable to punishment-specific deficits, then no relationship between the three forms of learning should be apparent.

Results

Individual differences in punishment and fear

Three contingencies were in effect within this task: the instrumental contingency of reward, which should maintain responding on both levers; the instrumental contingency of punishment, which should bias animals away from the punished response (i.e. punishment suppression); and the aversive Pavlovian contingency, which drives fear conditioning to predictive cues and suppresses ongoing behaviour (i.e. Pavlovian suppression). Each of these three effects was observed (Figure 2). Prior to aversive training, there was no preference between pressing the to-be-punished versus unpunished lever (F(1,47) = 0.071; p = 0.791, ηp² = 0.001) (Figure 2). Across the course of aversive training, reward learning was maintained and punishment learning as well as fear learning were observed. There was evidence for punishment learning because there was less lever pressing on the punished lever than the unpunished lever (the traditional measure of punishment learning in this task) (Figure 2A). Punishment avoidance increased across days (linear trend: F(1,47) = 7.49; p = 0.009, ηp² = 0.137). Follow-up analyses revealed significant punishment suppression for each session (1st session: F(1,47) = 5.48; p = 0.024, ηp² = 0.137; remaining sessions: F(1,47) > 23.4; p < 0.001, ηp² > 0.332). There was also robust evidence for Pavlovian fear (Figure 2B). Conditioned suppression elicited by presentations of the CS+ also increased across training (F(1,47) = 35.1; p < 0.001, ηp² = 0.427), with significant suppression being observed for each session (all F(1,47) > 54.7; p < 0.001, ηp² > 0.537). However, as expected, these group-averaged data obscured pronounced individual differences. Figure 2C and D show the same data plotted at subject level. Punishment suppression appeared to be bimodally distributed with some subjects showing strong punishment suppression (i.e. punishment sensitivity) and others weaker or no punishment suppression (i.e. punishment insensitivity) (Figure 2C). There was also variation in Pavlovian fear, albeit less pronounced (Figure 2D).
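The partial eta squared effect sizes reported alongside each F test can be related to the F value and its degrees of freedom. The sketch below assumes the standard identity for an F test with its own error term; the text does not state how the effect sizes were computed, and the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom.

    Assumes the standard identity
        eta_p^2 = (F * df_effect) / (F * df_effect + df_error),
    which is not stated in the text itself.
    """
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# Linear trend in punishment avoidance, F(1,47) = 7.49 (reported effect size 0.137)
trend_effect = partial_eta_squared(7.49, 1, 47)

# Increase in conditioned suppression, F(1,47) = 35.1 (reported effect size 0.427)
fear_effect = partial_eta_squared(35.1, 1, 47)
```

Because the effect size grows with F when the degrees of freedom are fixed, a statement such as "all F(1,47) > 54.7" implies a matching lower bound on the effect size.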

Figure 2. Lever preference and conditioned suppression across conditioned punishment.

(A) Mean ± SEM preference ratios showing evidence for punishment. (B) Mean ± SEM suppression ratios showing evidence for fear. (C) Violin plots and individual subject preference ratios. (D) Violin plots and individual subject conditioned suppression ratios.

The evidence for punishment is derived from a preference ratio (responses on the punished lever relative to total responses on both levers). This measure is simple and valid, but it obscures the degree to which preferences are driven by changes in punished responding, unpunished responding, or both. Moreover, subjects that suppress punished and unpunished lever-pressing equally would have ratios of 0.5, which might mistakenly be interpreted as an absence of punishment avoidance and hence punishment insensitivity. Therefore, we also assessed suppression of punished and unpunished responding separately against pre-punishment rates of responding (Figure 3A). Here a suppression ratio of 0.5 indicates no difference in rate of pressing relative to the last day of training (i.e. punishment insensitivity) whereas a ratio of 0 indicates complete suppression. At the group level, this analysis showed a main effect of lever (F(1,47) = 44.39; p < 0.001, ηp² = 0.485), a main effect of session (linear: F(1,47) = 9.476; p = 0.003, ηp² = 0.167), and a significant lever x session interaction (F(1,47) = 10.62; p = 0.002, ηp² = 0.184). This interaction was driven by a significant increase in the unpunished suppression ratio across sessions (linear: F(1,47) = 24.54; p < 0.001, ηp² = 0.343), but no significant change in punished lever suppression (linear: F(1,47) = 0.210; p = 0.649, ηp² = 0.004). Punished lever suppression was significantly greater than unpunished lever suppression for all sessions (F(1,47) > 9.22; p < 0.004, ηp² > 0.164). So, robust punishment was observed using this measure. However, this group level analysis again obscured pronounced individual differences (Figure 3B). Examination of individual subject performances showed that suppression of responding on the punished lever, but not the unpunished lever, appeared bimodal (Figure 3B and C).
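The two measures described above can be sketched as follows. The preference ratio follows the definition given in the text. The suppression ratio formula, current rate / (current rate + baseline rate), is an assumption chosen to match the stated anchors (0.5 for no change, 0 for complete suppression). Function names and press rates are hypothetical.

```python
def preference_ratio(punished, unpunished):
    """Responses on the punished lever relative to responses on both levers."""
    total = punished + unpunished
    if total == 0:
        return float("nan")  # no responding at all: ratio is undefined
    return punished / total


def suppression_ratio(current, baseline):
    """Assumed form: current / (current + baseline).

    0.5 means no change from the pre-punishment baseline,
    0.0 means responding was completely suppressed.
    """
    total = current + baseline
    if total == 0:
        return float("nan")
    return current / total


# Hypothetical rat: 20 presses/min on each lever before punishment,
# then 5 presses/min on each lever during conditioned punishment.
pref = preference_ratio(punished=5, unpunished=5)
supp_punished = suppression_ratio(current=5, baseline=20)
supp_unpunished = suppression_ratio(current=5, baseline=20)
```

In this hypothetical case the preference ratio sits at 0.5 even though both levers are strongly suppressed, which is the ambiguity that motivates analysing each lever against its own baseline.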

Figure 3. Lever-press suppression across conditioned punishment.

(A) Mean ± SEM suppression ratios for responding on the punished (red) and unpunished (green) levers relative to training. *p<0.05 punished vs. unpunished. (B) Violin plots and individual subject suppression ratios for the punished lever. *p<0.05 punished vs. null ratio (0.5). (C) Violin plots and individual subject suppression ratios for the unpunished lever. *p<0.05 unpunished vs. null ratio (0.5).

Punishment, reward, and fear are separate

To further examine the relationship between punishment, fear, and reward we examined the correlations between suppression on the punished lever (punishment), suppression on the unpunished lever (reward), and conditioned suppression elicited by the CS+ (fear) across training (Figure 4A). We observed strong positive correlations within each of these measures across sessions, showing that each subject's relative behaviour was stable across days. However, we observed few significant positive correlations between the measures. Notably, for fear and punishment, the only significant correlation was negative and present only on the first day of training.
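A correlation matrix of this kind can be sketched in a few lines. The example assumes Pearson correlations (the text does not name the coefficient) and uses made-up suppression ratios for six hypothetical rats; none of the values come from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant measure")
    return sxy / sqrt(sxx * syy)


def correlation_matrix(measures):
    """Map every (name_a, name_b) pair of measures to its correlation."""
    names = list(measures)
    return {(a, b): pearson_r(measures[a], measures[b]) for a in names for b in names}


# Hypothetical suppression ratios (one value per rat), not data from the study.
ratios = {
    "punished_s1": [0.10, 0.15, 0.40, 0.45, 0.20, 0.42],
    "punished_s6": [0.05, 0.10, 0.42, 0.44, 0.12, 0.40],
    "cs_plus_s6": [0.12, 0.20, 0.10, 0.15, 0.18, 0.08],
}
matrix = correlation_matrix(ratios)
```

In these made-up data the two punished-lever sessions correlate strongly with each other, mirroring the within-measure stability described above, while the CS+ measure is free to vary independently.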

Figure 4. Relationships between punishment, fear and reward.

(A) Correlation matrix for suppression ratios during CS+ presentations, punished lever, and unpunished lever across conditioned punishment sessions (1-6). (B) Multidimensional scaling showing suppression ratio distances for CS+, punished lever suppression, and unpunished lever suppression across sessions (1-6).

The correlation matrix is useful in visualising and understanding the relationship between measures but does not readily reveal meaningful, underlying dimensions that may explain overall similarities and dissimilarities. To visualise these, we used multidimensional scaling (Figure 4B). This showed that punishment, fear, and reward clustered in separate spaces. Punished lever suppression was clustered in a separate space from conditioned suppression, suggesting that punishment and fear are highly dissimilar to each other. Unpunished lever suppression was initially closely related to punished response suppression but became progressively different as training progressed.

These results show qualitative differences between punishment, reward, and fear. To better understand the shared variance between our measures we used Principal Components Analysis (PCA) to identify any shared underlying components in learning (Figure 5A). If there is a common aversion sensitivity that underpins punishment and fear learning, or any other common process, then PCA should identify it as a component with strong loadings from both punishment suppression and CS+ suppression. A 4-component solution was optimal, accounting for 75.9% of overall variance (Figure 5—figure supplement 1), with most measures well captured by these four components (Figure 5A, Figure 5—figure supplement 1). The first component captured the influence of punishment: punishment suppression across the course of training loaded strongly on this component. The second component captured the influence of contextual fear learning early during training: both initial punishment suppression and unpunished responding loaded positively on this component whereas CS+ suppression loaded negatively. The third component captured specific CS+ fear from later in training: only CS+ suppression loaded positively on this component. The fourth component captured reward: the remainder of the variance in unpunished responding loaded positively on this component. So, punishment and fear do not load positively on the same component. In fact, any relationship between them was largely negative, indicative of a competitive rather than complementary relationship between them.
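The logic of this test can be illustrated with a toy case. The sketch below extracts the first principal component from a hypothetical correlation matrix in which punishment and fear share variance, which is the pattern a common aversion sensitivity would produce. Power iteration is used here as a simple stand-in for a full PCA routine, and the correlation values are invented for illustration, not taken from the study.

```python
from math import sqrt


def first_component(corr, iters=200):
    """Largest eigenvalue and eigenvector of a correlation matrix (power iteration)."""
    n = len(corr)
    v = [1.0] * n  # starting vector; must not be orthogonal to the top eigenvector
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cv = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * cv[i] for i in range(n))  # Rayleigh quotient
    return eigenvalue, v


# Hypothetical correlations between punishment, fear, and reward measures.
measures = ["punishment", "fear", "reward"]
corr = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

eigenvalue, vector = first_component(corr)
# Loadings are the eigenvector scaled by the square root of the eigenvalue.
loadings = {name: x * sqrt(eigenvalue) for name, x in zip(measures, vector)}
variance_explained = eigenvalue / len(corr)
```

In this invented case the first component carries strong loadings from both punishment and fear and none from reward. The analysis reported here found no component of that kind.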

Figure 5 (with 2 supplements). Principal component and factor analysis of suppression during conditioned punishment.

(A) Loading heatmaps for principal component analysis of suppression ratios across conditioned punishment sessions (1-6). (B) Loading heatmaps for factor analysis of suppression ratios across conditioned punishment sessions (1-6). Bottom rows indicate the proportion of total variance (Var) accounted for by components/factors. The last column indicates the variance of each measure accounted for by components/factors (extraction). Loadings that account for the majority (>50%) or a substantial portion (>10%) of variance are indicated with ++ and +, respectively.

PCA is a dimension reduction procedure; it is less explicitly a means of identifying underlying latent variables in datasets. Therefore, we performed Factor Analysis to identify latent variables in the association between punishment, reward, and fear. The results from this analysis were similar to PCA (Figure 5B, Figure 5—figure supplement 1). Based on factor loadings, variation in aversive learning can be accounted for by an influence of punishment (Factor 1), contextual fear (Factor 2), CS+ fear (Factor 3), and reward (Factor 4). Taken together, these findings suggest punishment, reward, and fear are largely orthogonal to each other.

We also assessed the relationship between pre-punishment lever-pressing and behaviour in conditioned punishment. Training lever-pressing was correlated with unpunished (average r = 0.561, p<0.0001) but not punished lever-press rates during ITIs (average r = 0.198, p=0.18) or conditioned suppression (average r = 0.173, p=0.24). This relationship was further supported by PCA and multidimensional scaling (Figure 5—figure supplement 2). This implies responding during training predicts later unpunished responding but not punishment or conditioned suppression. Lever-press suppression ratios, which remove variability attributable to pre-punishment differences in reward-seeking, were not correlated with training lever-press rate (punished lever suppression: r = −0.06, p=0.69; unpunished lever suppression: r = −0.04, p=0.79), again indicating that punishment-driven changes in lever-pressing are unrelated to initial rates of lever pressing.

Cluster analysis reveals punishment sensitive vs. insensitive phenotypes

To further understand individual differences in aversive learning we used cluster analysis. This allowed us to identify clusters of subjects whose performances across training were more similar to each other than to those of subjects in other clusters. Silhouette analysis yielded positive silhouette values for k-means solutions with 2–4 clusters, each marginally higher than for solutions using more clusters.
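How silhouette values compare cluster solutions can be sketched with a small pure-Python example. It runs Lloyd's k-means from hand-picked starting centroids (an assumption; the text does not describe initialisation) on made-up suppression ratios for six hypothetical rats, then computes the mean silhouette for a 2-cluster and a 3-cluster solution.

```python
from math import dist


def kmeans(points, centroids, iters=100):
    """Lloyd's algorithm from the given starting centroids; returns cluster labels."""
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def mean_silhouette(points, labels):
    """Mean silhouette value; singleton clusters score 0 by convention."""
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points) if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(  # mean distance to the nearest other cluster
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == k) / labels.count(k)
            for k in set(labels)
            if k != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical punished-lever suppression ratios (two sessions per rat).
rats = [(0.10, 0.05), (0.12, 0.08), (0.08, 0.06), (0.40, 0.45), (0.42, 0.41), (0.38, 0.44)]

labels_2 = kmeans(rats, [rats[0], rats[3]])
labels_3 = kmeans(rats, [rats[0], rats[3], rats[4]])
silhouette_2 = mean_silhouette(rats, labels_2)
silhouette_3 = mean_silhouette(rats, labels_3)
```

With these invented data both solutions give positive mean silhouettes, and the 2-cluster solution scores higher because the third cluster only splits an already tight group.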

We examined punishment, reward, and fear behaviours for each of the cluster solutions. The groups produced by the 2-cluster solution did not differ in sex (χ²(1) = 0.782, p = 0.376; Figure 6—figure supplement 1). The two clusters did not differ in pre-punishment lever-pressing (all F(1,46) ≤ 0.659, p ≥ 0.421; Figure 6—figure supplement 1), showing that they did not differ in reward learning prior to aversive learning. They were, however, distinguishable by their punishment avoidance, regardless of whether this was measured via punished lever suppression or preference ratio (Figure 6A, Figure 6—figure supplement 1). Specifically, there was a significant overall difference in punished lever suppression (F(1,46) = 105.96, p < 0.001, ηp² = 0.697) (Figure 6A) and preference ratio (F(1,46) = 49.13, p < 0.001, ηp² = 0.517) (Figure 6—figure supplement 1) between clusters across sessions. However, there was no main effect of cluster on either unpunished lever suppression (F(1,46) = 0.215, p = 0.645) (Figure 6A) or conditioned suppression (F(1,46) = 1.008, p = 0.321, ηp² = 0.021) (Figure 6B). Thus, we refer to these clusters as punishment-sensitive (filled symbols, n = 15 [7 female]) and punishment-insensitive (empty symbols, n = 33 [11 female]). Further analyses showed that the punishment-sensitive group significantly suppressed punished (F(1,46) = 333.638, p < 0.001, ηp² = 0.878) but not unpunished (F(1,46) = 2.601, p = 0.114, ηp² = 0.053) responding relative to pre-punishment. In contrast, the punishment-insensitive group modestly suppressed both punished (F(1,46) = 75.318, p < 0.001, ηp² = 0.621) and unpunished (F(1,46) = 10.395, p = 0.002, ηp² = 0.184) responding relative to pre-punishment. This shows that the punishment-insensitive group were not simply showing attenuated punishment but were instead showing a distinct suppression phenotype.

Figure 6 (with 1 supplement). Behaviour of groups from 2-cluster solution.

(A) Mean ± SEM punished and unpunished lever suppression for punishment-sensitive (PunS; filled) and punishment-insensitive (PunIns; empty) groups from 2-cluster solution. (B) Mean ± SEM conditioned suppression ratios for groups from 2-cluster solution.

Due to this differential ITI suppression across groups, shock intensities for the punishment-insensitive cluster had been increased more than for the punishment-sensitive cluster (linear x cluster interaction: F(1,46) = 6.062, p = 0.018; Figure 6—figure supplement 1), although shock intensity did not differ overall across sessions (F(1,45) = 2.196, p = 0.145). Importantly, shock intensity was not a significant covariate for final punishment (F(1,45) = 1.042, p = 0.313) or conditioned suppression (F(1,45) = 0.389, p = 0.536), showing this was not a driving factor for cluster differences.

When a 3-cluster solution was derived, a significant effect of sex was found (χ2 (2)=7.416, p=0.025; Figure 7—figure supplement 1). The two larger clusters were most distinguishable by their suppression on the punished lever, that is punishment-sensitive (filled symbols; n = 17 [6 female]; Figure 7A) versus punishment-insensitive (empty symbols; n = 27 [8 female]; Figure 7B) and their behaviour was largely similar to the groups in the 2-cluster solution. The last cluster (half-filled symbols; n = 4 [4 female]; Figure 7C) was a small cohort that exhibited initial indiscriminate suppression on both the punished and unpunished levers as well as a counterintuitive increase in pressing during the CS+ during initial sessions. However, in later sessions, this cluster exhibited the greatest conditioned and punished lever suppression. Given these extreme responses, we will refer to this cluster as the hyper-sensitive group.
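The sex comparisons for the two cluster solutions can be checked from the counts given in the text (18 of 48 rats were female). The sketch below assumes a Pearson chi-square test of independence without continuity correction, which the text does not specify; the function name is hypothetical.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Rows are clusters, columns are (female, male), taken from the reported cluster sizes.
two_cluster = [
    [7, 8],    # punishment-sensitive: n = 15, 7 female
    [11, 22],  # punishment-insensitive: n = 33, 11 female
]
three_cluster = [
    [6, 11],  # punishment-sensitive: n = 17, 6 female
    [8, 19],  # punishment-insensitive: n = 27, 8 female
    [4, 0],   # hyper-sensitive: n = 4, 4 female
]

chi2_two, dof_two = chi_square(two_cluster)        # about 0.782 with 1 degree of freedom
chi2_three, dof_three = chi_square(three_cluster)  # about 7.416 with 2 degrees of freedom
```

Under this assumption the statistics agree with the reported values, and the table makes clear that the 3-cluster effect is carried by the small all-female cluster.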

Figure 7 (with 1 supplement). Behaviour of groups from 3-cluster solution.

(A) Mean ± SEM punishment suppression and conditioned suppression for punishment-sensitive cluster. (B) Mean ± SEM punishment suppression and conditioned suppression for punishment-insensitive cluster. (C) Mean ± SEM punishment suppression and conditioned suppression for hyper-sensitive cluster.

Once again, the clusters did not differ in lever-press rates across training (all F(2,45) ≤ 1.886, p ≥ 0.164; Figure 7—figure supplement 1). Compared to pre-punishment lever-pressing, all clusters showed significant punished lever suppression averaged across punishment sessions (F(1,45) ≥ 45.374, p < 0.001, ηp² ≥ 0.502) (Figure 7). However, there were differences between the clusters. Specifically, the punishment-insensitive cluster showed the least (F(1,45) ≥ 66.55, p < 0.001, ηp² ≥ 0.596) and the hyper-sensitive cluster showed the most (F(1,45) ≥ 4.386, p ≤ 0.0419, ηp² ≥ 0.089) punished lever suppression. The clusters also differed on unpunished lever suppression relative to pre-training (F(1,45) ≥ 15.54, p < 0.001, ηp² ≥ 0.257) (Figure 7). The punishment-sensitive cluster showed no suppression (F(1,45) = 1.164, p = 0.286, ηp² = 0.025), the punishment-insensitive cluster showed moderate suppression (F(1,45) = 24.828, p < 0.001, ηp² = 0.356), and the hyper-sensitive cluster showed the most suppression (F(1,45) = 61.932, p < 0.001, ηp² = 0.579).

The three clusters showed different profiles of learning across days. Both the punishment-sensitive (F(1,45) = 13.84, p < 0.001, ηp² = 0.235) and hyper-sensitive clusters (F(1,45) = 20.471, p < 0.001, ηp² = 0.137) increasingly differentiated between punished and unpunished lever suppression across sessions whereas the punishment-insensitive cluster did not (F(1,45) = 0.168, p = 0.684). The punishment-sensitive cluster exhibited differential suppression for all sessions (F(1,45) ≥ 18.585, p < 0.001, ηp² ≥ 0.292), whereas the hyper-sensitive cluster only showed significantly different lever suppression from session four onwards (F(1,45) ≥ 15.088, p < 0.001, ηp² ≥ 0.251). The punishment-sensitive cluster also initially suppressed (session 1–2: F(1,45) ≥ 5.934, p ≤ 0.019, ηp² ≥ 0.117) but subsequently elevated (session 3–6: F(1,45) ≥ 4.129, p ≤ 0.048) rates of unpunished responding relative to pre-punishment training. The hyper-sensitive cluster drastically suppressed unpunished responding early during aversive training (session 1–4: F(1,45) ≥ 19.060, p < 0.001, ηp² ≥ 0.298), but this recovered and they eventually pressed at rates equivalent to pre-punishment training (session 5–6: F(1,45) ≤ 0.070, p ≥ 0.793).

All clusters showed greater CS+ fear across sessions (F(1,45) ≥ 6.08, p ≤ 0.018) (Figure 7). There were no significant differences between punishment-sensitive and punishment-insensitive clusters in their conditioned suppression (overall: F(1,45) = 0.517, p = 0.476, ηp² = 0.011; linear: F(1,45) = 0.048, p = 0.828). However, the hyper-sensitive cluster had a significantly greater decrease in conditioned suppression across sessions than the other clusters (F(1,45) ≥ 41.255, p < 0.001, ηp² ≥ 0.478).

The clusters differed in shock increments across training (linear x cluster interaction: F(2,45) = 6.850, p=0.003; Figure 7—figure supplement 1). However, shock intensity was not a significant covariate for final punishment (F(1,44) = 1.691, p=0.200) or conditioned suppression (F(1,44) = .834, p = 0.366), indicating that differences in shock intensity were not a driving factor for group differences.

Discussion

Although punishment is highly conserved across species, it is far from robust across individuals. Here we studied individual differences in punishment sensitivity in rats, using a task permitting concurrent assessment of punishment, reward, and fear learning. We identified pronounced individual differences in punishment sensitivity. Using data-driven analytic approaches we show that these individual differences in punishment sensitivity cannot be predicted or explained by individual differences in fear or reward. Rather, across each analysis, punishment, fear and reward were remarkably independent. There was no evidence here to support the possibility that punishment insensitivity is due to reduced aversion sensitivity or reward dominance.

Instead, punishment insensitivity was a failure of instrumental learning. It could have multiple origins but is most likely due to a failure to encode the instrumental response-punisher association relative to the other associations in the task. Our task involved multiple instrumental (response-outcome) and Pavlovian (stimulus-outcome) contingencies; punishment-sensitive subjects parsed these contingencies to show Pavlovian and instrumental behavioral control whereas insensitive subjects were impaired in partitioning these different contingencies. The strongest evidence for this possibility comes from the cluster analyses. Punishment-insensitive subjects identified by cluster analysis exhibited modest suppression of both punished and unpunished responding. This profile of generalised behavioural suppression is incompatible with the reward dominance account, and is similar to the behaviour shown by subjects receiving response-independent aversive events (Hunt and Brady, 1951; Jean-Richard-Dit-Bressel et al., 2018). That is, punishment-insensitive animals behaved as though they were not causing the aversive events they were experiencing and instead expressed weak but generalised Pavlovian fear. Alternatively, the subjects may have encoded this instrumental association but been unable to inhibit their punished behaviour in accordance with this knowledge. However, any such failure of inhibition must have been specific to the punished response and not a failure of behavioural inhibition more generally (Gray, 1982; Gray and McNaughton, 2000) because punishment-insensitive subjects showed intact behavioural inhibition during Pavlovian fear.

It is worth noting that punishment-insensitive rats were a relatively large proportion of the sample. Given the evidence that impaired punishment contingency detection underpinned punishment insensitivity, it is likely the relatively lean punishment contingency applied here (VI60 sec CS+) was a key factor in determining the number of insensitive subjects. Future research examining the effect of this contingency on punishment sensitivity would be useful. Interestingly, insensitivity to punishment in drug seeking has been observed using tighter response-punisher contingencies (Marchant et al., 2018). An intriguing possibility is that drugs of abuse may promote punishment resistance by impairing punishment contingency detection. This is consistent with demonstrations that insensitivity to punishment might be reduced at high shock intensities (Golden et al., 2017) or after extended punishment training (Cooper et al., 2007). Further work is needed to assess this.

From a theoretical perspective, perhaps the most surprising finding here was the absence of any notable co-variance between instrumental and Pavlovian aversive learning, even when more powerful methods capable of detecting such underlying relationships were applied (e.g. PCA and FA). Historically, theories of associative learning and motivation have assumed that instrumental and Pavlovian determinants of behaviour share a common basis and that reinforcers have common motivational value that underpins these different forms of learning (Mackintosh, 1983; Rescorla and Solomon, 1967). These theories have derived strong support from the inter-changeability of outcomes as reinforcers for Pavlovian and instrumental learning. They remain a dominant approach to understanding aversive learning (Cain and LeDoux, 2008). However, it is now well understood that distinct processes govern instrumental versus Pavlovian reward value (Balleine and Dickinson, 1998; Dickinson and Balleine, 2002). Our findings extend this dissociation to aversive learning (see also Giuliano et al., 2018; Pelloux et al., 2007). We show that if there is any trans-contingency encoding of outcome value, then this contributes little to how animals learn punishment and fear. These two forms of aversive learning were independent of each other suggesting that the motivational underpinnings of Pavlovian fear and punishment are distinct.

Our findings have important implications for use of punishment sensitivity in assessing motivation. Punishment tasks are widely used to model the adverse consequences of drug seeking and measure motivation to engage in drug-seeking in the face of adverse consequences (Augier et al., 2018; Kasanetz et al., 2013; Pascoli et al., 2015; Vanderschuren and Everitt, 2004; Vanderschuren et al., 2017). The persistence of drug-seeking in the face of punishment (i.e. insensitivity to punishment) is invoked as an objective behavioural marker of addiction (Deroche-Gamonet et al., 2004; Vanderschuren and Everitt, 2004). This insensitivity is typically attributed to drug-induced plasticity promoting reward dominance or impulsivity. However, a key finding here is that insensitivity is a characteristic of punishment itself. Punishment insensitivity can emerge from a specific deficit in instrumental aversive learning and can be observed in studies using non-drug rewards. This suggests that punishment insensitivity can pre-exist any drug-induced plasticity promoting reward dominance or impulsivity and this pre-existing difference may provide one basis for persistence of reward seeking in the face of punishment.

Other clinical populations are characterised by differences in sensitivity to punishment. Increased sensitivity to punishment is characteristic of depressive disorders; these individuals show catastrophic, globalised reactions to punishment (Elliott et al., 1996; Eshel and Roiser, 2010). Cluster analyses here identified a hyper-sensitive phenotype that initially displayed pronounced and indiscriminate suppression of behaviour, commensurate with pronounced Pavlovian fear, before showing exaggerated punishment and appropriate discrimination between punished and unpunished behaviour. The transition from fear to punishment among hyper-sensitive animals was rapid and occurred within two sessions. Moreover, although there were no sex differences among the sensitive and insensitive phenotypes, the hyper-sensitive cluster was composed exclusively of females. The relatively small number of animals in this hyper-sensitive cluster precludes further analyses, but the data-driven/bottom-up approach used to identify this cluster of hyper-sensitive subjects could prove useful for further research.

In summary, we examined punishment-, fear- and reward-related learning and behaviour in a task that permits assessment of each of these processes concurrently in the same animals. We observed pronounced variations in punishment learning. We also identified clinically relevant phenotypes of insensitivity and hypersensitivity to punishment. In each case, these individual differences in punishment sensitivity could be explained by failures to encode the instrumental response-punisher association, not by aversion insensitivity or reward dominance. Subjects insensitive to punishment were afraid of the punisher but were unable to change their behaviour to avoid it.

Materials and methods

Subjects

Subjects were experimentally naive adult male and female Long-Evans rats (N = 48, 18 females) supplied by the University of New South Wales (Sydney, NSW, Australia). This was a single group experiment, so N = n = 48. This group size was chosen based on past research (Marchant et al., 2018) suggesting that it would be sufficient to identify individual differences in punishment. Animals were housed in groups of four in ventilated racks in a temperature- and humidity-controlled room with a 12–12 hr light/dark cycle (lights on 07:00). Experiments were conducted during the light cycle. Animals were food restricted from 3 days prior to the experiment onwards (10–15 g food per day for males, 7–12 g for females) to maintain them at ~90% of free-feeding weight, with ad libitum access to water. All procedures were approved by the UNSW Animal Ethics Committee (AEC) and in accordance with the code set out by the National Health and Medical Research Council (NHMRC) for the treatment of animals in research.

Apparatus

Behavioural procedures were conducted in standard operant chambers (24 [length] × 30 [width] × 21 cm [height]; Med Associates, St Albans, VT) housed within sound- and light-attenuating cabinets equipped with fans providing constant ventilation and low-level background noise. All events were controlled and recorded by MedPC IV software (Med Associates). CS+ and CS- were a 10 s 3 kHz tone or 5 Hz flashing light, counterbalanced. Pellets (Bioserve, Biotechnologies) were delivered from a dispenser to a recessed magazine cup (5 × 5 cm); magazine entries were detected using infrared beams at the magazine opening. Retractable levers were located on each side of the magazine. Shocks (0.5 s, 0.3–0.6 mA) were delivered via the grid floor. A 3 W house light was mounted at the top of the wall opposite the magazine and was turned on throughout each session.

Behavioural procedures (Table 1)

Table 1
Experimental design.
| Lever | End lever-press training | Conditioned punishment |
|---|---|---|
| Punished | Food (VI30s) | Food (VI30s); CS+ → Shock (VI60s) |
| Unpunished | Food (VI30s) | Food (VI30s); CS- (VI60s) |

CS+ and CS- were a 10 s 3 kHz tone or 5 Hz flashing light, counterbalanced. CS+ co-terminated with shock (0.5 s, 0.3–0.6 mA).

Magazine training

Rats received one session of magazine training, during which pellets were delivered on a variable 60 s interval (VI-60s) schedule until 20 pellets were delivered or 30 min had elapsed.

Lever-press training

Following magazine training, rats were trained to press two levers equally on an escalating reinforcement schedule. The first two sessions (30 mins) presented a single lever (left or right, order counterbalanced) to each rat and each lever-press was rewarded with a pellet (FR1). The session terminated after 20 presses or after 30 mins. Animals (n = 2) that did not acquire lever-pressing received extra magazine and FR1 training. This was followed by single-lever sessions (30 mins) that reinforced lever-pressing on VI-15s and VI-30s schedule (one session for each schedule on each lever). Rats were then given double-lever sessions (30 mins); both levers were extended and reinforced on a VI-15s (one session) and modified VI-30s schedule (two sessions). To counteract lever-preferences and equalise lever-pressing on both levers, double-lever VI-30s sessions dynamically adjusted the VI schedule as a ratio of relative lever-press rates, decreasing the reinforcement schedule on the preferred lever and increasing the reinforcement schedule on the non-preferred lever. The last lever-press training session (60 mins) presented both levers and pressing was reinforced on a standard VI-30s schedule for each lever.

Punishment and fear conditioning

Following lever-press training, rats received 6 days of conditioned punishment training. Both levers were extended for 60 min and pressing was reinforced on a standard VI-30s schedule. In these sessions, pressing the punished lever also yielded an aversive CS+ (VI-60s), while pressing the other, unpunished lever yielded a neutral CS- (VI-60s). The CS+ co-terminated with a 0.5 s footshock and the CS- terminated by itself. For the first session, footshock intensity was set at 0.3 mA. Shock intensity was intermittently incremented by 0.1 mA between sessions (up to 0.6 mA) if suppression of ITI lever-pressing was not observed. If a lever-press was scheduled to yield both a pellet and CS at the same time, only the CS was delivered due to its leaner schedule.
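The rule for resolving simultaneous outcomes can be illustrated with a minimal sketch. This is not the MedPC code used in the study: the function name, the string labels, and the representation of each variable-interval schedule as a boolean "armed" flag are all assumptions made for illustration.

```python
def outcome_for_press(lever, pellet_armed, cs_armed):
    """Return the outcome of one lever-press (hypothetical helper).

    lever        -- "punished" or "unpunished"
    pellet_armed -- True if the VI-30s food schedule has a pellet due
    cs_armed     -- True if the VI-60s CS schedule has a CS due
    """
    if lever not in ("punished", "unpunished"):
        raise ValueError("lever must be 'punished' or 'unpunished'")
    # The CS takes priority over the pellet because its schedule is leaner.
    if cs_armed:
        return "CS+ then shock" if lever == "punished" else "CS-"
    if pellet_armed:
        return "pellet"
    return None  # nothing scheduled for this press
```

The sketch only decides which outcome a press produces; what happens to a pellet that was due but not delivered is not described in the text and is not modelled here.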

Data analysis

Suppression/preference ratios were calculated using rates of lever-pressing on punished and unpunished levers during the inter-trial interval (ITI; non-CS periods), CS+, and CS-. Punishment learning was defined as suppression of punished lever-pressing during the ITI. This was captured in two ways. The traditional assessment is to measure rates of punished responding relative to unpunished responding during the ITI using a ‘preference ratio’ ([Pun ITI rate/total ITI rate]; previously termed a ‘punishment ratio’). Suppression of punished as well as unpunished responding was also assessed using ‘lever suppression ratios’ (session ITI rate/[training ITI rate + session ITI rate]), which capture rates of responding on each lever relative to rates on the final day of pre-punishment training. This allows separate assessment of punished vs. unpunished responding under punishment, clarifying lever preferences or lack thereof. Finally, CS suppression ratios were calculated to assess suppression of lever-pressing (both levers) during each CS relative to ITI (CS rate/[ITI rate + CS rate]). The CS+ ratio measures Pavlovian fear via conditioned suppression, while the CS- ratio acts as a control comparison.

All three ratios range from 0 to 1. A CS or lever suppression ratio below 0.5 indicates suppression of lever-pressing relative to baseline, a ratio above 0.5 indicates elevated lever-pressing, while a ratio of 0.5 indicates no change/suppression. In the case of the preference ratio, a ratio of 0.5 indicates no preference between levers during the ITI, while a ratio below 0.5 indicates avoidance of the punished lever relative to the unpunished lever.
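The three ratios defined above share one form, A/(A + B), which the following sketch makes explicit. The function names and example rates are hypothetical, "total ITI rate" is assumed to be the sum of punished and unpunished ITI rates, and returning `None` when both rates are zero is an assumption, since the text does not say how such cases were handled.

```python
def _ratio(a, b):
    """Return a / (a + b), or None when both rates are zero (undefined)."""
    total = a + b
    if total == 0:
        return None
    return a / total


def preference_ratio(pun_iti_rate, unp_iti_rate):
    """Pun ITI rate / total ITI rate; below 0.5 means avoidance of the punished lever."""
    return _ratio(pun_iti_rate, unp_iti_rate)


def lever_suppression_ratio(session_iti_rate, training_iti_rate):
    """Session ITI rate / (training ITI rate + session ITI rate), for one lever."""
    return _ratio(session_iti_rate, training_iti_rate)


def cs_suppression_ratio(cs_rate, iti_rate):
    """CS rate / (ITI rate + CS rate), with rates pooled over both levers."""
    return _ratio(cs_rate, iti_rate)


# Hypothetical rates in presses per minute, for illustration only.
example = {
    "preference": preference_ratio(5.0, 15.0),
    "punished_lever_suppression": lever_suppression_ratio(5.0, 20.0),
    "cs_plus_suppression": cs_suppression_ratio(2.0, 18.0),
}
```

With these made-up rates all three values fall below 0.5, the pattern expected of an animal that avoids the punished lever and suppresses responding during the CS+.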

Ratios were analysed using polynomial contrasts in PSY. Significant suppression/bias was determined via single mean tests against the null of 0.5. Differences in ratios between groups and levers, and how these developed over sessions, were assessed via between x within ANOVAs. Lever (punished vs. unpunished) identity and session (linear trend) were used as within-subject factors where applicable. Cluster was applied as a between-subjects factor where applicable.
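As an illustration of a single mean test against 0.5, the sketch below computes a one-sample t statistic and the equivalent F(1, n − 1) = t². It is not the PSY implementation used in the study; the function name and the example ratios are hypothetical, and no p-value is computed because the Python standard library has no F distribution.

```python
import math
import statistics


def single_mean_test(ratios, null_value=0.5):
    """One-sample test of the mean ratio against a null value (sketch)."""
    n = len(ratios)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.mean(ratios)
    sd = statistics.stdev(ratios)  # sample SD (n - 1 denominator)
    if sd == 0:
        raise ValueError("zero variance: test statistic is undefined")
    t = (mean - null_value) / (sd / math.sqrt(n))
    # A single-df contrast F equals the squared t statistic.
    return {"mean": mean, "t": t, "F": t * t, "df": (1, n - 1)}


# Hypothetical suppression ratios for five subjects in one session.
result = single_mean_test([0.2, 0.3, 0.4, 0.3, 0.3])
```

A negative t with a large F indicates a mean ratio reliably below 0.5, that is, suppression.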

All other analyses were conducted in SPSS 25. Relationships between punishment and conditioned suppression were assessed via correlations, principal component analysis (PCA) and factor analysis (FA). CS+ suppression, punished and unpunished lever suppression ratios across the 6 days of conditioned punishment were used as inputs to parsimoniously capture aversively-motivated changes in behaviour while controlling for non-aversion related differences in responding. PCA and FA results were varimax rotated to improve interpretability of components/factors. Relationships between lever-press rates and conditioned suppression across conditioned punishment and last day of lever-press training were also assessed using correlations and principal component analysis.

Similarity/dissimilarity of suppression ratios or lever-press rates across sessions were also conveyed using multidimensional scaling (SPSS PROXSCAL with the following parameters: simplex, interval transformation, squared Euclidean distance, z-scored). These parameters provided an excellent fit of the data (suppression ratios: normalized raw stress = 0.01831, S-Stress = 0.04746, DAF = 0.98169, Tucker’s coefficient for congruence = 0.99080; lever-press rates/conditioned suppression: normalized raw stress = 0.02288, S-Stress = 0.05471, DAF = 0.97712, Tucker’s coefficient for congruence = 0.98849).

K-means clustering was used to assess distinct suppression phenotypes. Silhouette values were obtained for 2–7 clusters, revealing marginally higher values for 2–4 cluster solutions. Distribution of sex across clusters was assessed via chi square. To determine the possible contribution of different shock intensities on suppression, a between (cluster) x within (session) ANOVA was conducted on shock intensities. To assess the role of shock intensity on suppression, a univariate ANOVA was conducted for punished lever and conditioned suppression on last day of conditioned punishment using shock intensity as a covariate and cluster as a fixed factor.
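The silhouette values used to compare cluster solutions can be sketched as follows. This assumes Euclidean distance and takes cluster labels as given (for example, from a k-means run at each candidate number of clusters); the function names and example data are hypothetical, and the distance measure used in the original analysis is not stated in the text.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_silhouette(points, labels):
    """Mean silhouette value over all points for a given clustering."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical one-dimensional suppression ratios for four subjects.
points = [(0.1,), (0.2,), (0.8,), (0.9,)]
good = mean_silhouette(points, [0, 0, 1, 1])
bad = mean_silhouette(points, [0, 1, 0, 1])
```

Values near 1 indicate compact, well-separated clusters; repeating the calculation for each candidate number of clusters gives the kind of comparison described above.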

References

  1. 1
  2. 2
  3. 3
  4. 4
  5. 5
  6. 6
  7. 7
  8. 8
  9. 9
  10. 10
  11. 11
    The Role of Learning in the Operation of Motivational Systems
    1. A Dickinson
    2. BW Balleine
    (2002)
    In: H Pashler, CR Gallistel, editors. Stevens' Handbook of Experimental Psychology. Hoboken: Wiley Online Library. pp. 497–533.
  12. 12
  13. 13
  14. 14
  15. 15
  16. 16
  17. 17
    The Neuropsychology of Anxiety: An Enquiry into the Functions of the Septo-Hippocampal System
    1. JA Gray
    (1982)
    Oxford University Press.
  18. 18
  19. 19
  20. 20
  21. 21
  22. 22
  23. 23
  24. 24
    Conditioning and Associative Learning
    1. NJ Mackintosh
    (1983)
    Oxford: Oxford University Press.
  25. 25
  26. 26
  27. 27
  28. 28
  29. 29
  30. 30
  31. 31
  32. 32
  33. 33
  34. 34
  35. 35

Decision letter

  1. Geoffrey Schoenbaum
    Reviewing Editor; National Institute on Drug Abuse, National Institutes of Health, United States
  2. Kate M Wassum
    Senior Editor; University of California, Los Angeles, United States
  3. Geoffrey Schoenbaum
    Reviewer; National Institute on Drug Abuse, National Institutes of Health, United States
  4. Michael A McDannald
    Reviewer; Boston College, United States
  5. Yavin Shaham
    Reviewer; National Institute on Drug Abuse, National Institutes of Health, United States

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

In this paper Jean-Richard-dit-Bressel and colleagues developed a task to dissociate individual variability in sensitivity to reward and punishment. They employ a variety of creative analyses of behavior using PCA and multidimensional scaling, cluster analyses, etc. to show that the individual variance in sensitivity to punishment (i.e., suppression of responding due to the production of an aversive outcome) is unrelated to variance in sensitivity to reward or perception of the aversive US. The reviewers agreed that the core idea behind the manuscript - to examine the interplay between reward, punishment, and fear - is an extremely important and hugely understudied question, of relevance to adaptive behavior generally and many psychiatric disorders in particular. The design, execution, and analysis were all excellent, convincingly showing diversity in individual responding. This paper is important to understanding both appetitive and aversive processes, their interaction, neural mechanisms, and especially potential contribution to mental illness.

Decision letter after peer review:

Thank you for submitting your article "Punishment insensitivity emerges from impaired contingency detection, not aversion insensitivity or reward dominance" for consideration by eLife. Your article has been reviewed by three peer reviewers, including Geoffrey Schoenbaum as the Reviewing Editor and Reviewer #1, and the evaluation has been overseen by a Reviewing Editor and Kate Wassum as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Michael A McDannald (Reviewer #2); Yavin Shaham (Reviewer #3).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

The authors developed a task to dissociate individual variability in sensitivity to reward and punishment. Hungry rats were trained to press a pair of levers for food reward for a number of days, culminating in a VI30 schedule. Subsequently a CS->shock punishment was instituted on one lever on a VI60 schedule. With training, the rats learned to press for reward, biased their behavior away from the punished lever, and also showed fear behaviors to the CS+ prior to shock. The authors employ a variety of creative analyses of behavior using PCA and multidimensional scaling, cluster analyses, etc to show that the individual variance in sensitivity to punishment (i.e., suppression of responding due to the production of an aversive outcome) is unrelated to variance in sensitivity to reward or perception of the aversive US.

The reviews were uniformly positive. The reviewers agreed that the core idea behind the manuscript – to examine the interplay between reward, punishment, and fear – is an extremely important and hugely understudied question, of relevance to adaptive behavior generally and many psychiatric disorders in particular. The design and execution were excellent as well, especially the creative analyses, which were thorough and captured the diversity in individual responding well. This paper will be of interest to those interested in appetitive and aversive processes, their interaction, neural mechanisms, or contribution to mental illness.

Essential revisions:

Although each reviewer raises concerns, on discussion none were deemed to be essential to address in any particular way. Thus, as long as the authors make a good faith effort to respond to the various points, we believe the manuscript will be publishable.

Reviewer #1:

In this study, the authors develop a task in which to dissociate individual variability in sensitivity to reward and punishment. Hungry rats were trained to press a pair of levers for food reward for a number of days, culminating in a VI30 schedule. Subsequently a CS->shock punishment was instituted on one lever on a VI60 schedule. With training, the rats learned to press for reward, biased their behavior away from the punished lever, and also showed fear behaviors to the CS+ prior to shock. The authors employ a variety of creative analyses of behavior using PCA and multidimensional scaling, cluster analyses, etc to show that the individual variance in behavioral measures of reward, punishment and fear were largely unrelated. They conclude that sensitivity to punishment (ie suppression of responding due to the production of an aversive outcome) is unrelated to variance in sensitivity to reward or perception of the aversive US.

Overall I thought the experiment and associated analyses were really excellent and exceptionally creative, and I think the results nicely support the authors' main point in the abstract that: "punishment insensitivity is (can be?) a unique phenotype, unrelated to differences in reward-seeking and Pavlovian fear".

As suggested by the added parentheses above, my main criticism is with the generalizability of the results. Specifically do the authors think their results mean that this is the only cause of behavior that ignores bad outcomes, or do they see their result as showing the clear and dissociable operation of one cause, without necessarily excluding others as potential causes? If the former, then I think they have a lot more work to do to rule out other possible causes, particularly when considering things such as addiction and depression.

But my guess is that they mean the latter, and currently they do use language in a number of places that suggest they are more circumspect. I think they just need to be more clear about this. In particular, it seems to me that they have a task that very nicely isolates a particular influence. And they show that influence can be dissociated from other potential causes of insensitivity. But I think they do not rule out a role for insensitivity in the other dimensions in the general ability to control behavior. Perhaps some of the features of their task minimize these influences? For instance, the rats are food deprived, well-trained on the food-seeking task, and on a particular schedule. They are also trained first for the food. These are all interesting variables that may make the reward-seeking more robust or dissociated from the subsequent effect of punishment. Likewise, the fear measure is taken during a CS that comes after the decision to respond. These are interesting decisions in arranging their test behavior, and in some regards, they might even make sense for modeling something like addiction where the reward is well-learned. But they might affect the results, I think. If shock was trained first or if the CS+ were presented prior to the choice, as in a transfer task, might the other mechanisms be important in the control of behavior by aversive outcomes?

Of course, I imagine the authors think what I am suggesting is obvious. But the discussion and paper do not come across like this. It seems to me that some consideration of these dimensions and pointing out the possibility of unique subgroups that could be revealed by other training protocols might be important. Otherwise I think the paper could be misunderstood by those less versed in behavior than the authors.

Reviewer #2:

In this study, Jean-Richard-dit-Bressel and colleagues examined the relationship between conditioned fear, punishment and reward in rats. This was done using a conditioning procedure in which rats were presented with two levers, each producing reward, but with one producing an auditory CS+ and the other an auditory CS-. Punishment was measured by the rat's propensity to press the lever producing the CS- over the CS+. Conditioned suppression was measured by the change in rate of pressing during cue presentation. The results were clear and compelling. All rats acquired conditioned suppression to the CS+, albeit with some individual variance. Punishment was also observed at the group level, but markedly more variability was observed. Subsequent analyses with multi-dimensional scaling and principal components demonstrated that punishment and conditioned suppression were largely unrelated, or even negatively correlated. Even more, individuals could be classified as punishment-sensitive or insensitive. All in all, the results reveal independent behavioral mechanisms for reward, punishment and suppression.

All aspects of this study were excellent. The core idea behind the manuscript – to examine the interplay between reward, punishment and suppression – is novel and simple. Yet testing these relationships is of obvious importance. Each process is essential to adaptive behavior and each is implicated in an array of psychiatric disorders. The execution and analysis of the resulting data were thorough and captured the diversity in individual responding well. Particularly elegant was the common scale on which punishment and suppression were measured. I believe the results are important and will garner considerable interest.

There are two areas in which I feel the manuscript could be strengthened and I have one comment concerning the language used to describe the procedure. These are provided below.

Measuring conditioned suppression

I had difficulty finding which lever was used to measure conditioned suppression. Were baseline and cue rates taken from both trial types? Or was only the CS- lever used because pressing was biased towards this lever? Specifically, I could not find a description of the lever used for the main result:

"There was also robust evidence for Pavlovian fear (Figure 2B). Conditioned suppression elicited by presentations of the CS+ also increased across training (F (1,47) = 35.1; p <.001, ηp2 = 0.427), with significant suppression being observed for each session (all F (1,47) > 54.7; p <.001, ηp2 = 0.537)."

I also could not determine which lever was being used for suppression in the main figure (Figure 2).

If the authors could specify exactly which lever(s) was used for calculating suppression ratio it would be helpful.

Examining reward

I was very impressed with the authors' treatment of conditioned suppression and punishment, both of which were well captured by use of a ratio. However, reward responding seemed less well captured by this measure. At times, I could not determine what measure was being used to assess reward responding. I realize this is difficult while the conditioning procedure is ongoing, as biases in lever pressing should be observed during cue and ITI periods.

However, a pure measure of lever-pressing is available in the lever-press training prior to the beginning of fear discrimination. I am curious if lever press rates observed during this time predict performance in punishment and suppression. For example, rats showing high press rates prior to discrimination may be punishment-insensitive rats OR these rats may show less conditioned suppression. These relationships could be initially examined with simple tests like Pearson's correlation coefficient. This would provide a clearer and more direct examination of the relationship between reward and conditioned punishment & conditioned suppression. If relationships are found, multi-dimensional scaling and principal components could be performed with this factor.

Punishment vs. Conditioned Punishment

The Abstract and Introduction describe the impetus of the study to disentangle reward, punishment and Pavlovian fear (suppression). For the most part, this is reasonable and sets up the reader for the study performed. As the authors are aware (indeed, Dr. Killcross is an author) this procedure was initially designed to dissociate conditioned suppression from conditioned punishment (Killcross et al., 1997). Punishment and conditioned punishment are likely to require independent + overlapping neural and behavioral mechanisms. For this reason, I think it would be prudent to state in the Abstract that conditioned punishment is measured. This should also be stated at the end of the Introduction – when the behavioral procedure is discussed. Indeed, when I first started reading the manuscript, I assumed direct punishment was going to be assessed. Ultimately, I think the use of conditioned punishment – as the authors performed – was more appropriate. Making this clear at the outset of the manuscript will better prepare the reader for the experiment that was performed.

Reviewer #3:

This is an excellent paper in which the authors used a creative two-lever operant procedure to study individual differences in punishment responding and the relationship between responding to punishment of food reward, conditioned fear responding to the punishment cue, and responding for food reward. The main finding is that punishment responding is unrelated to either conditioned fear or food reward responding. The main important general conclusion is that punishment insensitivity is not due to either reduced aversion sensitivity or higher reward value. The authors proposed that punishment insensitivity reflects a failure to learn instrumental control over punishment.

Overall, the behavioral procedure is elegant, the behavioral effects appear robust and reproducible, the experimental methodology is sound, and the statistical analyses are appropriate to the experimental design and research questions. The paper is also very well written and includes appropriate historical citations. I enclose below several comments.

1) The surprising finding in the study was the large number of punishment insensitive rats in the authors' procedure (22/30 males, 11/18 females). Typically, in punishment studies, with increased shock intensity all subjects eventually learn the punishment task. The authors should discuss this issue in the revision. In future studies, the authors should consider manipulating shock intensity parametrically to generate a more sensitive measure of punishment (the equivalent of ED50 in pharmacological dose-response curve) to characterize individual differences in punishment.

2) Subsection “Data analysis”: Change "inter-trial period" to "inter-trial-interval" to fit the abbreviation ITI.

3) Results section: Please add the final shock intensity value for the punishment sensitive and insensitive groups. I presume it was higher for the punishment insensitive group, but this was not described.

https://doi.org/10.7554/eLife.52765.sa1

Author response

Reviewer #1:

In this study, the authors develop a task in which to dissociate individual variability in sensitivity to reward and punishment. Hungry rats were trained to press a pair of levers for food reward for a number of days, culminating in a VI30 schedule. Subsequently a CS->shock punishment was instituted on one lever on a VI60 schedule. With training, the rats learned to press for reward, biased their behavior away from the punished lever, and also showed fear behaviors to the CS+ prior to shock. The authors employ a variety of creative analyses of behavior using PCA and multidimensional scaling, cluster analyses, etc to show that the individual variance in behavioral measures of reward, punishment and fear were largely unrelated. They conclude that sensitivity to punishment (ie suppression of responding due to the production of an aversive outcome) is unrelated to variance in sensitivity to reward or perception of the aversive US.

Overall I thought the experiment and associated analyses were really excellent and exceptionally creative, and I think the results nicely support the authors' main point in the abstract that: "punishment insensitivity is (can be?) a unique phenotype, unrelated to differences in reward-seeking and Pavlovian fear".

As suggested by the added parentheses above, my main criticism is with the generalizability of the results. Specifically do the authors think their results mean that this is the only cause of behavior that ignores bad outcomes, or do they see their result as showing the clear and dissociable operation of one cause, without necessarily excluding others as potential causes? If the former, then I think they have a lot more work to do to rule out other possible causes, particularly when considering things such as addiction and depression.

But my guess is that they mean the latter, and currently they do use language in a number of places that suggest they are more circumspect. I think they just need to be more clear about this. In particular, it seems to me that they have a task that very nicely isolates a particular influence. And they show that influence can be dissociated from other potential causes of insensitivity. But I think they do not rule out a role for insensitivity in the other dimensions in the general ability to control behavior. Perhaps some of the features of their task minimize these influences? For instance, the rats are food deprived, well-trained on the food-seeking task, and on a particular schedule. They are also trained first for the food. These are all interesting variables that may make the reward-seeking more robust or dissociated from the subsequent effect of punishment. Likewise, the fear measure is taken during a CS that comes after the decision to respond. These are interesting decisions in arranging their test behavior, and in some regards, they might even make sense for modeling something like addiction where the reward is well-learned. But they might affect the results, I think. If shock was trained first or if the CS+ were presented prior to the choice, as in a transfer task, might the other mechanisms be important in the control of behavior by aversive outcomes?

Of course, I imagine the authors think what I am suggesting is obvious. But the discussion and paper do not come across like this. It seems to me that some consideration of these dimensions and pointing out the possibility of unique subgroups that could be revealed by other training protocols might be important. Otherwise I think the paper could be misunderstood by those less versed in behavior than the authors.

We agree that protocol parameters are certainly a factor. A deeper examination of the effects that changing these parameters might have on learning and behaviour, including the subgroups they might reveal, is an interesting area for future research. This has been added to the Discussion section.

Reviewer #2:

In this study, Jean-Richard-dit-Bressel and colleagues examined the relationship between conditioned fear, punishment and reward in rats. This was done using a conditioning procedure in which rats were presented with two levers, each producing reward, but with one producing an auditory CS+ and the other an auditory CS-. Punishment was measured by the rat's propensity to press the lever producing the CS- over the CS+. Conditioned suppression was measured by the change in rate of pressing during cue presentation. The results were clear and compelling. All rats acquired conditioned suppression to the CS+, albeit with some individual variance. Punishment was also observed at the group level, but markedly more variability was observed. Subsequent analyses with multi-dimensional scaling and principal components demonstrated that punishment and conditioned suppression were largely unrelated, or even negatively correlated. Even more, individuals could be classified as punishment-sensitive or insensitive. All in all, the results reveal independent behavioral mechanisms for reward, punishment and suppression.

All aspects of this study were excellent. The core idea behind the manuscript – to examine the interplay between reward, punishment and suppression – is novel and simple. Yet testing these relationships is of obvious importance. Each process is essential to adaptive behavior and each is implicated in an array of psychiatric disorders. The execution and analysis of the resulting data were thorough and captured the diversity in individual responding well. Particularly elegant was the common scale on which punishment and suppression were measured. I believe the results are important and will garner considerable interest.

There are two areas in which I feel the manuscript could be strengthened and I have one comment concerning the language used to describe the procedure. These are provided below.

Measuring conditioned suppression

I had difficulty finding which lever was used to measure conditioned suppression. Were baseline and cue rates taken from both trial types? Or was only the CS- lever used because pressing was biased towards this lever? Specifically, I could not find a description of the lever used for the main result:

"There was also robust evidence for Pavlovian fear (Figure 2B). Conditioned suppression elicited by presentations of the CS+ also increased across training (F (1,47) = 35.1; p <.001, ηp2 = 0.427), with significant suppression being observed for each session (all F (1,47) > 54.7; p <.001, ηp2 = 0.537)."

I also could not determine which lever was being used for suppression in the main figure (Figure 2).

If the authors could specify exactly which lever(s) was used for calculating suppression ratio it would be helpful.

Response rates across both levers were used to determine conditioned suppression. Text in the Materials and methods section has been changed to make this clearer.
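As a rough illustration of this clarification, the sketch below pools presses across both levers before forming a suppression ratio. It assumes the common A / (A + B) form, where A is the press rate during the cue and B is the press rate outside it. The function name, argument layout, and example counts are hypothetical and are not taken from the manuscript.

```python
def suppression_ratio(cs_presses, cs_seconds, iti_presses, iti_seconds):
    """Suppression ratio A / (A + B), pooled across both levers.

    cs_presses / iti_presses: (lever_1_count, lever_2_count) tuples.
    cs_seconds / iti_seconds: total time sampled in each period.
    The A / (A + B) form is an assumption made for illustration.
    """
    # Pool presses across the two levers, then convert to rates.
    cs_rate = sum(cs_presses) / cs_seconds
    iti_rate = sum(iti_presses) / iti_seconds
    total = cs_rate + iti_rate
    if total == 0:
        # No responding in either period: the ratio is undefined.
        return None
    return cs_rate / total


# Hypothetical counts: 2 + 4 presses in 60 s of cue, 10 + 14 presses in 120 s of ITI.
ratio = suppression_ratio((2, 4), 60, (10, 14), 120)
print(round(ratio, 3))
```

On this scale a value of 0.5 indicates no change in responding during the cue, and a value of 0 indicates complete suppression.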

Examining reward

I was very impressed with the authors' treatment of conditioned suppression and punishment, both of which were well captured by use of a ratio. However, reward responding seemed less well captured by this measure. At times, I could not determine what measure was being used to assess reward responding. I realize this is difficult while the conditioning procedure is ongoing, as biases in lever pressing should be observed during cue and ITI periods.

However, a pure measure of lever-pressing is available in the lever-press training prior to the beginning of fear discrimination. I am curious if lever press rates observed during this time predict performance in punishment and suppression. For example, rats showing high press rates prior to discrimination may be punishment-insensitive rats OR these rats may show less conditioned suppression. These relationships could be initially examined with simple tests like Pearson's correlation coefficient. This would provide a clearer and more direct examination of the relationship between reward and conditioned punishment & conditioned suppression. If relationships are found, multi-dimensional scaling and principal components could be performed with this factor.

This is an important question. We had previously addressed this indirectly by assessing differences between clusters in lever-press rates at the end of training (they did not differ). However, we agree that this analysis was incomplete. To provide a more direct assessment of effects related to pre-punishment responding, we have added a paragraph to the Results section and Figure 5—figure supplement 2 analysing relationships between lever-pressing at end of training and behaviour under conditioned punishment. In summary, lever-press rates prior to punishment could not predict punishment or conditioned suppression but did predict unpunished responding. We have also updated the supplementary figures for cluster analyses (Figure 6—figure supplement 1, Figure 7—figure supplement 1) to show lever-press rates across lever-press training to give a more complete picture of potential differences between groups. Again, none were observed.
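The kind of simple test the reviewer suggests, Pearson's correlation between pre-punishment lever-press rates and a later behavioural ratio, could be sketched as follows. This is a minimal illustration only: the helper name and the example values are hypothetical, invented for demonstration, and are not data or code from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for illustration only (not from the study):
# presses per minute at the end of lever-press training, and a later ratio.
pre_punishment_rate = [12.0, 18.5, 9.0, 22.0, 15.5]
punishment_ratio = [0.41, 0.38, 0.47, 0.29, 0.44]
print(round(pearson_r(pre_punishment_rate, punishment_ratio), 3))
```

A coefficient near zero would be in line with the reported absence of a relationship, although a formal test would also need a p-value or confidence interval.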

Punishment vs. Conditioned Punishment

The Abstract and Introduction describe the impetus of the study to disentangle reward, punishment and Pavlovian fear (suppression). For the most part, this is reasonable and sets up the reader for the study performed. As the authors are aware (indeed, Dr. Killcross is an author) this procedure was initially designed to dissociate conditioned suppression from conditioned punishment (Killcross et al., 1997). Punishment and conditioned punishment are likely to require independent and overlapping neural and behavioral mechanisms. For this reason, I think it would be prudent to state in the Abstract that conditioned punishment is measured. This should also be stated at the end of the Introduction – when the behavioral procedure is discussed. Indeed, when I first started reading the manuscript, I assumed direct punishment was going to be assessed. Ultimately, I think the use of conditioned punishment – as the authors performed – was more appropriate. Making this clear at the outset of the manuscript will better prepare the reader for the experiment that was performed.

We agree; conditioned punishment has been added to the Abstract and the end of the Introduction.

Reviewer #3:

This is an excellent paper in which the authors used a creative two-lever operant procedure to study individual differences in punishment responding and the relationship between responding to punishment of food reward, conditioned fear responding to the punishment cue, and responding for food reward. The main finding is that punishment responding is unrelated to either conditioned fear or food reward responding. The main important general conclusion is that punishment insensitivity is not due to either reduced aversion sensitivity or higher reward value. The authors proposed that punishment insensitivity reflects a failure to learn instrumental control over punishment.

Overall, the behavioral procedure is elegant, the behavioral effects appear robust and reproducible, the experimental methodology is sound, and the statistical analyses are appropriate to the experimental design and research questions. The paper is also very well written and includes appropriate historical citations. I enclose below several comments.

1) The surprising finding in the study was the large number of punishment insensitive rats in the authors' procedure (22/30 males, 11/18 females). Typically, in punishment studies, with increased shock intensity all subjects eventually learn the punishment task. The authors should discuss this issue in the revision. In future studies, the authors should consider manipulating shock intensity parametrically to generate a more sensitive measure of punishment (the equivalent of ED50 in pharmacological dose-response curve) to characterize individual differences in punishment.

We agree, the number of punishment-insensitive animals was indeed surprising. We believe the relatively weak response-punisher contingency is a reason so many subjects failed to acquire punishment avoidance, despite the relatively high shock intensity. Text has been added to the Discussion section to note this.

2) Subsection “Data analysis”: Change "inter-trial period" to "inter-trial-interval" to fit the abbreviation ITI.

"Inter-trial period" changed to "inter-trial-interval".

3) Results section: Please add the final shock intensity value for the punishment sensitive and insensitive groups. I presume it was higher for the punishment insensitive group, but this was not described.

Yes. Correct. Shock intensity per group has been added to the cluster supplementary figures (Figure 6—figure supplement 1, Figure 7—figure supplement 1). This presents a slight confound, so we assessed the possible contribution of shock intensity as a covariate in punishment and conditioned suppression; shock intensity did not significantly co-vary with punishment or conditioned suppression.

https://doi.org/10.7554/eLife.52765.sa2

Article and author information

Author details

  1. Philip Jean-Richard-dit-Bressel

    School of Psychology, UNSW Sydney, Sydney, Australia
    Contribution
    Conceptualization, Resources, Data curation, Software, Formal analysis, Investigation, Visualization, Methodology, Project administration
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-0898-8987
  2. Cassandra Ma

    School of Psychology, UNSW Sydney, Sydney, Australia
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  3. Laura A Bradfield

    1. School of Psychology, UNSW Sydney, Sydney, Australia
    2. Centre for Neuroscience and Regenerative Medicine, Faculty of Science, University of Technology Sydney, Sydney, Australia
    3. St Vincent’s Centre for Applied Medical Research, Sydney, Australia
    Contribution
    Conceptualization, Supervision, Investigation, Methodology, Project administration
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-3921-0745
  4. Simon Killcross

    School of Psychology, UNSW Sydney, Sydney, Australia
    Contribution
    Conceptualization, Supervision, Funding acquisition, Project administration
    Competing interests
    No competing interests declared
  5. Gavan P McNally

    School of Psychology, UNSW Sydney, Sydney, Australia
    Contribution
    Conceptualization, Resources, Data curation, Supervision, Funding acquisition, Project administration
    For correspondence
    g.mcnally@unsw.edu.au
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-9061-6463

Funding

Australian Research Council (DP190100482)

  • Gavan P McNally

Australian Research Council (DP170100075)

  • Gavan P McNally

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This work was supported by grants from the Australian Research Council (DP190100482; DP170100075).

Ethics

Animal experimentation: All procedures were approved by the UNSW Animal Ethics Committee (AEC) (ACEC16/160B) and in accordance with the code set out by the National Health and Medical Research Council (NHMRC) for the treatment of animals in research.

Senior Editor

  1. Kate M Wassum, University of California, Los Angeles, United States

Reviewing Editor

  1. Geoffrey Schoenbaum, National Institute on Drug Abuse, National Institutes of Health, United States

Reviewers

  1. Geoffrey Schoenbaum, National Institute on Drug Abuse, National Institutes of Health, United States
  2. Michael A McDannald, Boston College, United States
  3. Yavin Shaham, National Institute on Drug Abuse, National Institutes of Health, United States

Publication history

  1. Received: October 16, 2019
  2. Accepted: November 21, 2019
  3. Accepted Manuscript published: November 26, 2019 (version 1)
  4. Version of Record published: December 3, 2019 (version 2)

Copyright

© 2019, Jean-Richard-dit-Bressel et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 612
    Page views
  • 118
    Downloads
  • 0
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
