
Functional heterogeneity within the rodent lateral orbitofrontal cortex dissociates outcome devaluation and reversal learning deficits

  1. Marios C Panayi  Is a corresponding author
  2. Simon Killcross
  1. The University of New South Wales, Australia
  2. University of Oxford, United Kingdom
Research Article
Cite this article as: eLife 2018;7:e37357 doi: 10.7554/eLife.37357

Abstract

The orbitofrontal cortex (OFC) is critical for updating reward-directed behaviours flexibly when outcomes are devalued or when task contingencies are reversed. Failures to update behaviour in outcome devaluation and reversal learning procedures are considered canonical deficits following OFC lesions in non-human primates and rodents. We examined the generality of these findings in rodents using lesions of the rodent lateral OFC (LO) in instrumental action-outcome and Pavlovian cue-outcome devaluation procedures. LO lesions disrupted outcome devaluation in Pavlovian but not instrumental procedures. Furthermore, although both anterior and posterior LO lesions disrupted Pavlovian outcome devaluation, only posterior LO lesions were found to disrupt reversal learning. Posterior but not anterior LO lesions were also found to disrupt the attribution of motivational value to Pavlovian cues in sign-tracking. These novel dissociable task- and subregion-specific effects suggest a way to reconcile contradictory findings between rodent and non-human primate OFC research.

https://doi.org/10.7554/eLife.37357.001

Introduction

The orbitofrontal cortex (OFC) in rodents and primates is critical for updating behaviour flexibly when outcome contingencies change (Murray et al., 2007). Compelling evidence for this view comes from studies using outcome devaluation procedures in which the value of a reward is reduced to test whether behaviour is updated to reflect changes in the outcome’s current value. In rodents, OFC lesions disrupt the appropriate reduction in anticipatory responding for a reward that has been paired with illness and has become aversive (Gallagher et al., 1999; Pickens et al., 2003, 2005). Similar conclusions have been reached in human fMRI studies (Gottfried et al., 2003) and non-human primate studies, where OFC function is disrupted by excitotoxic lesions and functional inactivation (Izquierdo and Murray, 2004, 2010; Izquierdo et al., 2004, 2005; Machado and Bachevalier, 2007; Rolls, 2000; West et al., 2011).

Similar to outcome devaluation, reversal learning procedures involve updating behaviour when rewarded and non-rewarded task contingencies change. Although OFC lesions do not impact initial acquisition of rewarded and non-rewarded cues or actions, they significantly disrupt the flexible updating of behaviour following the reversal of these contingencies (Boulougouris et al., 2007; Murray et al., 2007; Schoenbaum et al., 2003). Both outcome devaluation and reversal learning require flexibly tracking changes in learned contingencies and updating behaviour appropriately when contingencies or outcome values change, and both procedures are disrupted by damage to the OFC.

A key requirement of any theory of OFC function is to account for the deficits in both outcome devaluation and reversal learning following disruption of OFC function (Delamater, 2007; Murray et al., 2007; Rudebeck and Murray, 2014; Wikenheiser et al., 2017; Wilson et al., 2014). However, the generality of these deficits has been questioned. For example, OFC lesions do not disrupt outcome devaluation in instrumental action-outcome learning procedures (Balleine et al., 2011; Ostlund and Balleine, 2007a), which is in contrast to robust effects in Pavlovian outcome devaluation (Schoenbaum et al., 1999). Furthermore, reversal deficits are not always reported following OFC lesions, and the nature of the deficit is not always consistent (Boulougouris et al., 2007; Burke et al., 2009; Chudasama and Robbins, 2003; Mariano et al., 2009; McAlonan and Brown, 2003; Rudebeck and Murray, 2011b; Rudebeck et al., 2013; Schoenbaum et al., 2003). Although it is tempting to attribute these differences to different task parameters between studies, we propose that these discrepancies are also likely to be caused by functional heterogeneity within regions classified as OFC.

Indeed, there is mounting evidence that the OFC is a functionally heterogeneous structure (Izquierdo, 2017; Mar et al., 2011; Rudebeck and Murray, 2011a). Notably, Ostlund and Balleine, (2007a) used focal lesions of the lateral OFC (LO), which did not disrupt instrumental outcome devaluation, whereas earlier Pavlovian devaluation studies in rodents involved widespread damage to the ventral (VO), lateral (LO), dorsolateral (DLO), anterior agranular insular (AI), and even medial (MO) OFC subregions (Gallagher et al., 1999; Pickens et al., 2003, 2005). It is therefore unclear which subregions contribute to flexible behavioural control tested by Pavlovian outcome devaluation or reversal learning in rodents.

Here, we investigate whether the role of OFC in flexible behavioural control is specific to cue-guided (Pavlovian) and not action-guided (instrumental) behaviours (Ostlund and Balleine, 2007a) using focal LO lesions. Then we examine whether there is functional heterogeneity within the anterior-posterior plane of the LO region using Pavlovian outcome devaluation and reversal learning procedures.

Results

Instrumental devaluation by taste aversion

First, we tested whether the OFC plays a necessary role in guiding flexible action-outcome behaviour in an instrumental outcome devaluation task. In contrast to Pavlovian devaluation using taste aversion (Gallagher et al., 1999; Pickens et al., 2003, 2005), Ostlund and Balleine, (2007a) have shown that OFC lesions do not disrupt behaviour in an instrumental devaluation task using specific satiety as the method of devaluation. We extend these findings to instrumental devaluation using taste-aversion as the method of devaluation.

Following recovery from sham or excitotoxic OFC lesions (Figure 1A, N = 32; sham devalued n = 8, sham non-devalued n = 8, lesion devalued n = 8, lesion non-devalued n = 8), half the animals in each lesion group were assigned to have an instrumental reinforcer devalued (devalued group) or an alternative reinforcer devalued (non-devalued group). Rats were trained to lever press for either pellet or liquid sucrose rewards on a random interval 30 s schedule (RI30), and were exposed to the alternative reward non-contingently in a separate session on each day of training. OFC lesions did not affect the lever pressing acquisition across the 3 days of RI30 acquisition training (Figure 1B). A mixed Lesion (sham, lesion) x Devaluation (devalued, non-devalued) x Day (3 days) ANOVA revealed only a significant main effect of Day (F(2, 56)=13.99, p<0.001, all remaining F < 1.31, p>0.26). A significant linear trend of Day (F(1, 28)=21.80, p<0.001) suggested that all groups increased lever responding across acquisition days.
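The linear trend of Day reported above can be illustrated with a contrast over the three acquisition days. The following is a minimal sketch under simplifying assumptions: it ignores the Lesion and Devaluation factors (so its error term is not the pooled one behind the reported F(1, 28)), and the response rates, the function name `linear_trend_F`, and the variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Orthogonal polynomial weights for a linear trend over three equally spaced days.
LINEAR_WEIGHTS = (-1, 0, 1)

def linear_trend_F(scores_by_subject):
    """Return (F, df_error) for a linear trend across days, ignoring group factors."""
    # One contrast score per subject: the weighted sum of that subject's daily scores.
    contrasts = [sum(w * x for w, x in zip(LINEAR_WEIGHTS, days))
                 for days in scores_by_subject]
    n = len(contrasts)
    # One-sample t-test of the contrast scores against zero; F equals t squared.
    t = mean(contrasts) / (stdev(contrasts) / sqrt(n))
    return t ** 2, n - 1

# Hypothetical lever-press rates for four rats on days 1 to 3 (not data from the study).
rates = [(1, 2, 3), (2, 3, 5), (1, 3, 4), (2, 2, 4)]
F, df_error = linear_trend_F(rates)
print(f"linear trend F(1, {df_error}) = {F:.2f}")
```

A positive mean contrast score with a large F indicates that responding increased from day 1 to day 3, which is the pattern the text describes.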

Figure 1 (with 1 supplement)
The effects of excitotoxic OFC lesions on instrumental devaluation by taste aversion.

(A) Representative OFC lesion damage in the Non-Devalued (left) and Devalued (right) lesion groups. Semi-transparent grey patches represent lesion damage in a single subject, and darker areas represent overlapping damage across multiple subjects. Coronal sections are identified in mm relative to bregma (Paxinos and Watson, 1997). (B) Rate of lever pressing during 3 days of instrumental acquisition. (C) Mean reward consumption during taste aversion learning: consumption of rewards paired with LiCl-induced nausea decreased across injection pairings (left), whereas consumption of rewards paired with saline injections increased across injection pairings (right). (D) Total lever presses during the 10 min devaluation test in extinction. Within-session responding presented in Figure 1—figure supplement 1. (E) Total lever presses during the 20 min re-acquisition test with rewards delivered instrumentally. Error bars depict + SEM. (*) Symbol denotes statistical significance of simple or main effects following a significant interaction.

https://doi.org/10.7554/eLife.37357.002

Next, animals in the devalued groups acquired a taste aversion to the instrumental reinforcer following pairings with LiCl injections (in contrast to pairings of the alternate reinforcer with control saline injections) whereas animals in the non-devalued groups acquired a taste aversion to the alternative reinforcer and the instrumental outcome was paired with saline injections. Taste aversion was successfully acquired to the food paired with LiCl, as shown by decreased consumption compared to the food paired with saline injections (Figure 1C), and there were no apparent group differences in acquiring this taste aversion. A mixed Lesion x Devaluation (devalued, non-devalued group) x Pairing (3 pairings) x Injection (LiCl, saline) ANOVA supported the acquisition of taste aversion with significant main effects of Pairing (F(2, 56)=10.55, p<0.001), Injection (F(1, 28)=35.96, p<0.001), and a Pairing x Injection interaction (F(2, 56)=176.17, p<0.001, all remaining F < 2.48, p>0.09). Similarly, a significant Pairing x Injection linear trend effect (F(1, 28)=513.68, p<0.001) suggested that consumption of food paired with LiCl significantly decreased across pairings (linear trend across Pairing for LiCl, F(1, 28)=286.92, p<0.001) whereas consumption of the food paired with saline increased (linear trend across Pairing for saline, F(1, 28)=67.57, p<0.001). These findings support previous reports that OFC lesions do not affect initial learning of instrumental lever pressing behaviour, or sensitivity to acquiring taste aversions.

Devaluation of the instrumental response was then assessed by an extinction test of lever pressing. At test the groups with the devalued instrumental reinforcer performed fewer lever presses than the non-devalued groups (i.e. the groups with the alternative reinforcer devalued), but this was not differentially affected by lesion or sham surgery (Figure 1D). A univariate Lesion x Devaluation ANOVA revealed a significant main effect of Devaluation (F(1, 28)=5.14, p=0.03) that did not significantly interact with Lesion (Lesion x Devaluation F(1, 28)=1.19, p=0.29, main effect of Lesion F(1, 28)=0.21, p=0.65). Therefore, a significant devaluation effect was found across both lesion groups (additional analysis of devaluation performance within the session presented in Figure 1—figure supplement 1).
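For readers less familiar with the analysis, the univariate Lesion x Devaluation ANOVA can be sketched for a balanced between-subjects design as follows. This is an illustrative sketch rather than the authors' analysis code: the scores are hypothetical, and `two_way_anova` is a name introduced here. With eight rats per cell the error degrees of freedom would be 4 x 7 = 28, matching the F(1, 28) reported above.

```python
from statistics import mean

def two_way_anova(cells):
    """Balanced 2 x 2 between-subjects ANOVA.

    `cells` maps (lesion, devaluation) labels to equal-length lists of scores.
    Returns F ratios for both main effects and the interaction, plus error df.
    """
    n = len(next(iter(cells.values())))          # subjects per cell
    lesions = sorted({a for a, _ in cells})
    devals = sorted({b for _, b in cells})
    grand = mean([x for xs in cells.values() for x in xs])

    def marginal_ss(levels, index, n_per_level):
        # Sum of squares for one factor, computed from its marginal means.
        ss = 0.0
        for level in levels:
            xs = [x for key, vals in cells.items() if key[index] == level for x in vals]
            ss += n_per_level * (mean(xs) - grand) ** 2
        return ss

    ss_lesion = marginal_ss(lesions, 0, n * len(devals))
    ss_deval = marginal_ss(devals, 1, n * len(lesions))
    ss_cells = sum(n * (mean(xs) - grand) ** 2 for xs in cells.values())
    ss_inter = ss_cells - ss_lesion - ss_deval
    ss_error = sum((x - mean(xs)) ** 2 for xs in cells.values() for x in xs)
    df_error = len(cells) * (n - 1)
    ms_error = ss_error / df_error
    # Every effect in a 2 x 2 design has one numerator degree of freedom.
    return {"lesion": ss_lesion / ms_error,
            "devaluation": ss_deval / ms_error,
            "interaction": ss_inter / ms_error,
            "df_error": df_error}

# Hypothetical lever-press totals, two rats per cell (not data from the study).
cells = {("sham", "devalued"): [2, 4], ("sham", "non-devalued"): [6, 8],
         ("lesion", "devalued"): [3, 5], ("lesion", "non-devalued"): [7, 9]}
result = two_way_anova(cells)
print(result)
```

In this toy data set the devaluation effect is large and the interaction is zero, which mirrors the qualitative pattern in the text: a devaluation effect that does not depend on lesion group.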

Similar to the devaluation test in extinction, subsequent re-acquisition of the instrumental response with the delivery of the outcome was significantly affected by the acquired taste aversion showing a strong devaluation effect (Figure 1E). A univariate Lesion x Devaluation ANOVA revealed a significant main effect of Devaluation (F(1, 28)=181.01, p<0.001) that did not significantly interact with Lesion (Lesion x Devaluation F(1, 28)=0.43, p=0.52, main effect of Lesion F(1, 28)=2.09, p=0.16).

These findings combine with those of Ostlund and Balleine, (2007a) to show that the OFC is not necessary for the flexible control of action-outcome behaviour. Furthermore, we rule out the possibility that this discrepancy between Pavlovian and instrumental devaluation effects following OFC lesions is due to differences in the method of devaluation, that is, taste aversion or specific satiety.

Using similar LO lesions we also replicate another finding reported by Ostlund and Balleine, (2007a): no effect of LO lesions on sensory-specific Pavlovian-to-instrumental transfer (sPIT; Appendix 1—figure 1). In contrast to these relatively anterior LO lesions, more posterior OFC lesions encompassing both LO and VO have been found to disrupt the sPIT effect (Balleine et al., 2011; Scarlet et al., 2012). Chemogenetic inactivation of these posterior LO and VO aspects of the rodent OFC has also recently been found to disrupt instrumental outcome devaluation by specific satiety under some training conditions (Parkes et al., 2018).

Pavlovian devaluation by taste aversion

An alternative account of the absence of OFC lesion effects on instrumental devaluation, in contrast to robust devaluation deficits by taste aversion in rodents (Gallagher et al., 1999; Pickens et al., 2003, 2005), is the extent and specificity of OFC lesion damage. It is notable that OFC lesions in these Pavlovian devaluation studies encompass many orbital subregions (VO, LO, DLO, AI, and even MO). In contrast, the OFC lesions in the present studies, and Ostlund and Balleine, (2007a), are predominantly focussed on the anterior extent of LO. In addition to testing whether these anterior LO lesions are sufficient to replicate the effect of large OFC lesions on outcome devaluation by taste aversion, a second group of lesion animals was created with posterior LO lesions. In rats, LO spans a large anterior-posterior plane (at least 3 mm), so we tested for functional heterogeneity between anterior and posterior LO subregions on Pavlovian outcome devaluation and reversal learning to identify their role in these two canonical OFC dependent tasks.

Rats underwent sham or excitotoxic lesion surgery using a range of co-ordinates, and two distinct lesion groups were established (described in Materials and methods section). The anterior and posterior LO lesion groups (Figure 2A, Figure 2—figure supplement 1) were defined by damage predominantly anterior or posterior to bregma +3.70 respectively (Figure 2B).
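One way to picture the grouping rule is to compare the damage measured at coronal planes on either side of bregma +3.70. The sketch below is only an illustration of that idea: the summing rule, the handling of the boundary plane, the damage values, and the function name `classify_lesion` are all assumptions, and the actual criteria are those described in the Materials and methods section.

```python
BOUNDARY_MM = 3.70  # bregma boundary named in the text

def classify_lesion(damage_by_plane):
    """Label a lesion 'anterior' or 'posterior' from damage at each coronal plane.

    `damage_by_plane` maps a plane's position (mm anterior to bregma) to the
    percentage of bilateral LO damage measured at that plane.
    """
    anterior = sum(d for mm, d in damage_by_plane.items() if mm > BOUNDARY_MM)
    posterior = sum(d for mm, d in damage_by_plane.items() if mm < BOUNDARY_MM)
    # Damage at exactly the boundary plane counts toward neither side (an assumption).
    if anterior == posterior:
        return "unclassified"
    return "anterior" if anterior > posterior else "posterior"

# Hypothetical percent-damage profiles for two rats (not data from the study).
rat_a = {4.70: 60, 4.20: 80, 3.70: 40, 3.20: 10, 2.70: 0}
rat_b = {4.70: 0, 4.20: 10, 3.70: 50, 3.20: 75, 2.70: 30}
labels = [classify_lesion(rat_a), classify_lesion(rat_b)]
print(labels)
```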

Figure 2 (with 2 supplements)
The effects of subregion specific OFC lesions on Pavlovian devaluation by taste aversion.

(A) Representative OFC lesion damage in the anterior (left) and posterior (right) LO lesion groups; additional histology presented in Figure 2—figure supplement 1. Semi-transparent grey patches represent lesion damage in a single subject, and darker areas represent overlapping damage across multiple subjects. Coronal sections are identified in mm relative to bregma (Paxinos and Watson, 1997). (B) Quantification of percent bilateral OFC damage in anterior and posterior lesion groups at each coronal plane, in mm relative to bregma. (C) Rate of acquisition to the Pavlovian CSs in blocks of 3 days. Response rates presented as duration of magazine activity during the CS minus activity in the PreCS period. (D) The acquisition of a specific taste aversion following pairings of one outcome with LiCl injections (Devalued) or saline injections (Non-Devalued). The mean weight of each outcome consumed prior to each injection pairing is plotted. (E) An additional pairing of each outcome with LiCl or saline injections was conducted in the experimental chambers following non-contingent delivery of reward into magazine. Data presented as total duration of magazine activity in the session. This allowed for a measure of the transfer of the taste aversion to the testing context. (F) Magazine responding (CS – PreCS) to the CSs associated with the devalued and non-devalued outcomes, presented in extinction. (G) An outcome specific reinstatement test in which responding to the CSs was assessed after brief exposure to its associated outcome. Error bars depict + SEM. (*) Symbol denotes statistical significance of simple or main effects following a significant interaction. Effect of OFC lesions on locomotor activity presented in Figure 2—figure supplement 2.

https://doi.org/10.7554/eLife.37357.004

First, all animals were trained on two unique Pavlovian cue-outcome relationships. Acquisition of responding to the CSs predicting the to-be devalued and non-devalued USs did not differ within groups but differed between lesion groups (Figure 2C) such that responding was lower in the posterior OFC lesion group. A mixed Group x CS (devalued, non-devalued) x DayBlock (4 Blocks of 3 days) ANOVA supported this observation with a significant main effect of Group (F(2, 41)=3.67, p=0.03) and DayBlock (F(3, 123)=102.14, p<0.001) but all other effects failed to reach significance (Group x US F(2, 41)=2.55, p=0.09, Group x US x DayBlock F(6, 123)=2.01, p=0.07, all remaining F < 1.00, p>0.44). Bonferroni corrected pairwise comparisons of overall responding revealed that the posterior group had lower performance than the sham group (F(1, 41)=7.34, p=0.03), but no significant differences were found between anterior and sham (F(1, 41)=2.29, p=0.42), or posterior and anterior groups (F(1, 41)=1.65, p=0.62).
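The Bonferroni correction applied to the three pairwise group comparisons can be sketched as follows. This is a minimal illustration assuming the standard form of the correction (each uncorrected p-value multiplied by the number of comparisons and capped at 1); the uncorrected p-values and the function name `bonferroni` are hypothetical and are not taken from the study.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical uncorrected p-values for three pairwise group comparisons.
uncorrected = {
    "posterior vs sham": 0.01,
    "anterior vs sham": 0.14,
    "posterior vs anterior": 0.40,
}
corrected = dict(zip(uncorrected, bonferroni(list(uncorrected.values()))))
for comparison, p in corrected.items():
    print(f"{comparison}: corrected p = {p:.2f}")
```

Because the multiplier equals the number of comparisons, a comparison must reach an uncorrected p below 0.05 / 3 to remain significant after correction.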

Taste aversion was successfully acquired by all groups (Figure 2D). Food consumption (g) was analysed using a Group x Pairing (injection 1, 2) x Devaluation (LiCl, saline) ANOVA which revealed significant effects of Devaluation (F(1, 41)=8.23, p=0.01), Pairing (F(1, 41)=141.39, p<0.001) and a Pairing x Devaluation interaction (F(1, 41)=37.83, p<0.001), but no main effect or interactions with Group (all remaining F < 1.00, p>0.68). Follow up simple effects revealed that consumption of the US paired with LiCl did not differ from saline prior to the first injection (pairing 1 F(1, 41)=0.09, p=0.77), but was significantly reduced relative to saline prior to the second injection (pairing 2 F(1, 41)=59.199, p<0.001). The third injection pairings performed in the test chambers showed successful transfer of the taste aversion to this context in all groups (Figure 2E). A Group x Devaluation mixed ANOVA on magazine duration behaviour revealed a significant effect of Devaluation (F(1, 41)=16.16, p<0.001) that did not differ with Group (all remaining F < 1.57, p>0.22). Taken together, consumption and approach towards the US paired with LiCl was successfully reduced compared to the US paired with saline injections, and the magnitude of this unique taste aversion did not differ between groups.

Devaluation testing was conducted under extinction to ensure that behaviour was guided by the expected/recalled value of the outcomes (Figure 2F). The sham group showed a significant reduction in magazine behaviour to the CS that predicted the devalued relative to the non-devalued US, but this devaluation effect was not evident in the anterior and posterior lesion groups. This pattern of results was supported by a Group x Devaluation mixed ANOVA revealing a significant Group x Devaluation interaction (F(2, 41)=3.46, p=0.04); the main effects of Devaluation (F(1, 41)=3.74, p=0.06) and Group (F(2, 41)=0.41, p=0.41) did not reach significance. Simple effects revealed that this interaction was due to a significant devaluation effect in the sham group (F(1, 41)=7.33, p=0.01), but not the anterior (F(1, 41)=2.06, p=0.16) or posterior groups (F(1, 41)=0.81, p=0.37). This suggests that lesions of the anterior or the posterior LO are sufficient to disrupt Pavlovian devaluation by taste aversion, previously established with much larger OFC lesions in rodents (Gallagher et al., 1999; Pickens et al., 2003, 2005).

Next, a US specific reinstatement test was conducted to see if the lesion groups could appropriately reduce behaviour to the devalued cue following a brief reminder of the outcome value. Rats were first exposed to one of the USs, and after a short delay they were presented with the CS that predicted that US (in extinction). All groups remained sensitive to the taste aversion when re-exposed to the USs in the test chamber (uneaten devalued USs observed by experimenter when cleaning the chamber prior to test). A mixed Group x Period (pre, post US delivery) x Devaluation ANOVA on magazine behaviour during US re-exposure (data not shown) revealed a significant effect of Period (F(1, 41)=71.20, p<0.001), Devaluation (F(1, 41)=72.05, p<0.001) and Period x Devaluation interaction (F(1, 41)=79.30, p<0.001, all remaining F < 1.33, p>0.28). Simple main effects revealed that magazine behaviour did not differ before US delivery (F(1, 41)=1.23, p=0.27), but was significantly higher after delivery of the non-devalued than the devalued US (F(1, 41)=93.46, p<0.001).

During the reinstatement test, all groups showed significant evidence of sensitivity to US devaluation in the presence of the CSs (Figure 2G). A mixed Group x Devaluation ANOVA supported this pattern of results with a significant main effect of Devaluation (F(1, 41)=50.73, p<0.001), but no significant effect of Group (F(2, 41)=1.12, p=0.34) or Group x Devaluation interaction (F(2, 41)=1.97, p=0.15). Therefore, re-exposure to the US prior to test elicited a robust devaluation effect in all groups. This suggests that the disruption of the Pavlovian devaluation effect following LO lesions is not caused by a failure to acquire sensory specific cue-outcome associations, a failure to acquire a sensory specific taste aversion, or perseverative responding to any predictive cues. Instead, the deficit is specific to recalling the new value of the devalued outcome and/or integrating it into appropriate behavioural control.

Sign-tracking and reversal

The finding that posterior LO lesions retarded acquisition of initial Pavlovian conditioned approach behaviour is surprising given that these animals can appropriately modulate their cue driven behaviour based on outcome value when given contact with the US in a reinstatement test. It was hypothesised that this might reflect an impairment in the attribution of value/salience to the Pavlovian cue itself. When a lever is used as a Pavlovian cue, rats will come to approach and engage with the lever cue (sign-tracking) instead of the normal conditioned approach to the magazine (goal-tracking behaviour) (Brown and Jenkins, 1968; Jenkins and Moore, 1973; Locurto et al., 1976). Many researchers have argued that sign-tracking behaviour reflects a process by which the lever CS acquires enhanced incentive salience so that the incentive motivational value of the outcome becomes attributed to the cue (Berridge, 2004). Therefore, it was predicted that the posterior LO group would not attribute incentive salience to a lever cue and would show a deficit in sign-tracking. The sham, anterior, and posterior LO lesion groups were retrained on a discriminated sign-tracking procedure using rewarded (CS+) and non-rewarded (CS-) lever cues (left and right lever, counterbalanced).

To ensure that any differences in lever pressing are not confounded by differential levels of competing responses, it is important to establish that there are no group differences in baseline magazine behaviour. Mixed Group x DayBlock (4 blocks of 3 days) ANOVAs revealed that PreCS magazine duration did not differ between groups during acquisition (Group or Group x DayBlock interactions, all F < 1.75, p>0.12) or subsequent reversal (all F < 2.01, p>0.14, data not shown).

During acquisition, lever pressing during the CS+ was greater than CS-, but the lesion groups made fewer responses than the sham group (Figure 3A, left panel). A mixed Group x CS (CS+, CS-) x DayBlock (4 blocks of 3 days) ANOVA partially supported the observed differences with a significant main effect of Group (F(2, 41)=3.75, p=0.03) and a 3-way Group x CS x DayBlock interaction (F(6, 123)=3.42, p<0.01, all remaining effects also reached significance F > 2.20, p<0.05). While there were no group differences on DayBlocks 1 and 2 (non-significant main effects of Group and Group x CS interactions for DayBlock 1 and 2, all F < 2.27, p>0.12), on DayBlocks 3 and 4 there were significant main effects of Group (DayBlock 3 F(2, 41)=4.97, p=0.01, DayBlock 4 F(2, 41)=5.01, p=0.01) and Group x Cue interactions (DayBlock 3 F(2, 41)=3.99, p=0.03, DayBlock 4 F(2, 41)=4.70, p=0.01). Bonferroni corrected simple effects revealed that there were no group differences in responding to the CS- (DayBlock 3 and 4, all F < 4.21, p>0.14), whereas CS+ lever pressing was greater in the sham than the posterior group (DayBlock 3 F(1, 41)=9.51, p=0.01, DayBlock 4 F(1, 41)=8.77, p=0.02) but not different between sham and anterior or anterior and posterior lesions (DayBlock 3 and 4, all F < 3.87, p>0.17). Therefore, lever responding to the CS+ was lower for the posterior lesion than the sham group in the second half of acquisition but no differences between anterior lesions and the sham or posterior lesion groups were revealed.

Figure 3
The effects of subregion specific OFC lesions on sign-tracking behaviour and reversal learning.

Lever (A) and magazine (B) CS responding during 12 days of acquisition (left), and reversal (right) of a rewarded (CS+) and non-rewarded (CS-) lever cue. Response competition during acquisition (C) and reversal (D) of the CS+. Lever response bias calculated as the difference between standardised lever and magazine responding so that positive scores represent greater lever bias and negative scores represent greater magazine response bias. Error bars depict + SEM. (*) Symbol denotes statistical significance of simple or main effects following a significant interaction.

https://doi.org/10.7554/eLife.37357.007

Magazine duration responding in the CS- decreased across acquisition in all groups whereas responding to the CS+ only decreased in the sham and anterior groups but not in the posterior lesion group (Figure 3B, left panel). A mixed Group x CS x DayBlock ANOVA supported these observations. CS+ responding was greater than CS- responding, and while responding decreased across days this decline was more rapid to the CS- than the CS+ (main effect of DayBlock F(3, 123)=30.06, p<0.001, and a CS x DayBlock interaction F(3, 123)=5.82, p=0.001). A 3-way Group x CS x DayBlock interaction (F(6, 123)=2.94, p=0.01, and a significant Group x DayBlock interaction F(3, 123)=2.19, p<0.05) suggested that the differential decline in responding to each CS was not the same in each group (all remaining F < 1.00, p>0.39). Separate follow-up Group x DayBlock ANOVAs were conducted on each CS to explore the 3-way interaction. Responding during the CS- decreased to the same extent in all groups (significant main effect of DayBlock F(3, 123)=32.25, p<0.001, but no main effect of Group F(2, 41)=0.55, p=0.58, or Group x DayBlock interaction F(3, 123)=0.24, p=0.96). In contrast, during the CS+ there were significant group differences in responding (significant main effect of DayBlock F(3, 123)=18.23, p<0.001, no main effect of Group F(2, 41)=1.06, p=0.26, significant Group x DayBlock interaction F(3, 123)=3.65, p<0.01). Simple main effects analysing group differences in CS+ magazine duration found that there were no group differences on DayBlocks 1, 2 and 3 (effect of Group on DayBlock 1 F(2, 41)=0.14, p=0.87, DayBlock 2 F(2, 41)=0.30, p=0.75, DayBlock 3 F(2, 41)=2.38, p=0.11); however, there were significant group differences on DayBlock 4 (effect of Group on DayBlock 4 F(2, 41)=4.81, p=0.01). Follow up Bonferroni corrected comparisons revealed that on DayBlock 4, posterior group magazine responding was greater than the sham group (F(1, 41)=8.12, p=0.02).

A popular measure of sign-tracking behaviour is the Pavlovian conditioned approach (PCA) index (Flagel et al., 2011), which combines a number of measures relating to the probability, latency, and relative bias in lever pressing over magazine approach. However, there is no principled justification for the specific choice or relative weighting of these measures, so a data driven alternative was used to quantify sign-tracking behaviour. A Doubly Multivariate ANOVA (Tabachnick and Fidell, 2013) was employed to directly assess response competition between the lever pressing and magazine duration measures in the sign-tracking procedure. This allowed for the comparison of two fundamentally different measures, lever pressing and magazine duration, which are likely to be correlated due to response competition, that is, high scores on one measure preclude high scores on the other measure.

A Group x DayBlock x CS (rewarded, non-rewarded) MANOVA with lever pressing and magazine duration response measures revealed a significant Group x DayBlock x CS interaction (F(6, 37)=2.95, p=0.02). Follow up Group x DayBlock MANOVAs revealed a significant Group x DayBlock interaction for the rewarded CS (F(6, 37)=2.59, p=0.03) but not the non-rewarded CS (F(6, 37)=0.86, p=0.53). This significant multivariate interaction was investigated using a planned composite of the differences between measures (standardised with respect to within group variances and the grand mean), that is, lever pressing minus magazine duration. This composite reflects the expected competition between responses and was verified (post-hoc) by a discriminant analysis on the final day block of acquisition which revealed standardised coefficients of 0.64 (lever pressing) and −0.56 (magazine duration) associated with the first eigenvalue.
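The composite difference score can be illustrated with a short sketch. It assumes one plausible reading of "standardised with respect to within group variances and the grand mean": each measure is centred on its grand mean and divided by the pooled within-group standard deviation before the magazine score is subtracted from the lever score. The data and the function names `standardise` and `lever_bias` are hypothetical.

```python
from math import sqrt
from statistics import mean

def standardise(groups):
    """Centre scores on the grand mean and scale by the pooled within-group SD."""
    all_scores = [x for xs in groups.values() for x in xs]
    grand = mean(all_scores)
    ss_within = sum((x - mean(xs)) ** 2 for xs in groups.values() for x in xs)
    pooled_sd = sqrt(ss_within / (len(all_scores) - len(groups)))
    return {g: [(x - grand) / pooled_sd for x in xs] for g, xs in groups.items()}

def lever_bias(lever, magazine):
    """Standardised lever score minus standardised magazine score, per subject."""
    z_lever, z_mag = standardise(lever), standardise(magazine)
    return {g: [l - m for l, m in zip(z_lever[g], z_mag[g])] for g in lever}

# Hypothetical CS+ lever presses and magazine durations (not data from the study).
lever = {"sham": [10, 14], "posterior": [4, 8]}
magazine = {"sham": [4, 2], "posterior": [6, 8]}
bias = lever_bias(lever, magazine)
for group, scores in bias.items():
    print(group, [round(s, 2) for s in scores])
```

Positive scores indicate a bias towards the lever and negative scores a bias towards the magazine, so putting both measures on a common scale lets a lever press count and a magazine duration be compared directly.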

The difference scores on the standardised variate revealed that during acquisition of responding directed at the rewarded lever, all groups expressed a bias towards magazine responding at the start of training (Figure 3C). However, by the end of training the sham and anterior groups were responding more to the lever than the magazine whereas the posterior group was responding similarly to both the magazine and the lever. This pattern of observed results was supported statistically. A Group x DayBlock ANOVA on the acquisition of the rewarded lever revealed a significant main effect of Group (F(2, 41)=3.45, p=0.04), DayBlock (F(3, 123)=95.89, p<0.001) and Group x DayBlock interaction (F(6, 123)=3.67, p<0.01). Bonferroni adjusted simple effects revealed that there were no group differences on DayBlock 1 and 2 (all p>0.17) whereas on DayBlock 3 and 4 the posterior group had significantly lower scores than the sham group (p<0.01, all remaining p>0.15). These findings suggest that there is a difference in OFC function within LO along the anterior-posterior gradient.

Reversal learning

A commonly reported deficit following OFC damage is in reversal learning (Boulougouris et al., 2007; Rudebeck and Murray, 2011a; Schoenbaum et al., 2002), so a reversal manipulation was employed to test whether anterior and posterior LO damage result in a reversal deficit. The identity of the CS+ and CS- levers was reversed and acquisition continued for another 12 days.

Reversal learning resulted in more lever presses being directed towards the new CS+ than the CS-, but the lesion groups made fewer responses than the sham group (Figure 3A, right panel). A mixed Group x CS (CS+, CS-) x DayBlock (4 blocks of 3 days) ANOVA partially supported the observed differences with a significant main effect of Group (F(2, 41)=5.09, p=0.01) and a 3-way Group x CS x DayBlock interaction (F(6, 123)=3.94, p=0.001, all remaining effects also reached significance F > 3.56, p<0.04, except the Group x DayBlock interaction F(6, 123)=1.52, p=0.181). The 3-way interaction was decomposed into separate Group x CS ANOVAs conducted for each DayBlock. On DayBlock 1 responding was greater to the CS- than the CS+ (main effect of CS on DayBlock 1 F(1, 41)=5.27, p=0.03) but this did not differ between groups (non-significant main effect of Group and Group x CS interaction, all F < 1.82, p>0.18). On DayBlocks 2, 3 and 4 the main effect of CS remained significant such that CS+ responding was now higher than CS- responding (DayBlock 2 F(1, 41)=109.59, p<0.001, DayBlock 3 F(1, 41)=185.42, p<0.001, DayBlock 4 F(1, 41)=222.47, p<0.001), however there were also significant main effects of Group (DayBlock 2 F(2, 41)=5.05, p=0.01, DayBlock 3 F(2, 41)=5.42, p=0.01, DayBlock 4 F(2, 41)=4.09, p=0.02) and Group x Cue interactions (DayBlock 2 F(2, 41)=3.60, p=0.04, DayBlock 3 F(2, 41)=5.45, p=0.01, DayBlock 4 F(2, 41)=3.97, p=0.03). Bonferroni corrected simple effects revealed that there were no group differences in responding to the CS- (DayBlock 2, 3 and 4, all F < 5.01, p>0.09), whereas CS+ lever pressing was greater in the sham than the posterior group (DayBlock 2 F(1, 41)=8.47, p=0.02, DayBlock 3 F(1, 41)=10.72, p=0.01, DayBlock 4 F(1, 41)=8.10, p=0.02) but not different between sham and anterior, or anterior and posterior lesions (DayBlock 2, 3 and 4, all F < 4.16, p>0.14). 
Therefore, similar to initial acquisition, lever responding to the CS+ was lower for the posterior lesion than the sham group later in acquisition but no differences between anterior lesions and the sham or posterior lesion groups could be concluded.

A reversal deficit was also found in the posterior LO lesion group using the measure of magazine duration. Magazine duration decreased more rapidly to the CS+ than the CS- during reversal for the sham and anterior lesion group but there was no apparent reduction in responding to either CS in the posterior lesion group (Figure 3B, right panel). These observations were supported by a Group x CS x DayBlock ANOVA on magazine duration data which revealed significant differences in responding to each CS (CS x DayBlock interaction F(3, 123)=5.24, p<0.01) and group differences in the rate of response reduction across the session (main effect of Group F(2, 41)=3.44, p=0.04, main effect of DayBlock F(3, 123)=18.48, p<0.001, and Group x DayBlock interaction F(6, 123)=2.43, p=0.03). Overall, simple main effects revealed that responding was higher for CS+ than CS- on DayBlock 5 (F(1, 41)=17.38, p<0.001), but at similar levels on DayBlock 6, 7 and 8 (all F(1, 41)<1.00, p>0.28). Simple main effects examining group differences revealed that responding reduced across reversal in the sham and anterior lesion groups (effect of DayBlock for sham group F(3, 39)=5.30, p<0.01, anterior group F(3, 39)=4.69, p=0.01) but not in the posterior lesion group (F(3, 39)=2.33, p=0.09).

Similar to the analysis of acquisition, a multivariate approach was used to assess competition between lever and magazine behaviour in reversal. A Group x DayBlock x CS (rewarded, non-rewarded) MANOVA with lever pressing and magazine duration response measures revealed a significant Group x DayBlock x CS interaction (F(6, 37)=3.11, p=0.01). Follow up Group x DayBlock MANOVAs revealed a significant Group x DayBlock interaction for the rewarded CS (F(6, 37)=3.34, p=0.01) but not the non-rewarded CS (F(6, 37)=2.11, p=0.08). A discriminant analysis was performed on the final day block of acquisition which revealed standardised coefficients of 0.41 (lever pressing) and −0.69 (magazine duration) associated with the first eigenvalue, which supported the choice of a difference score again.

The difference scores on the standardised variate revealed that during reversal of the rewarded lever, all groups responded more towards the magazine than the lever at the start of training (Figure 3D). However, by the end of training the sham and anterior groups were performing more to the lever than the magazine whereas the posterior group was performing equally to both the magazine and the lever. A Group x DayBlock ANOVA on the acquisition of the rewarded lever revealed a significant main effect of Group (F(2, 41)=5.30, p=0.01), DayBlock (F(3, 123)=47.55, p<0.001) and Group x DayBlock interaction (F(6, 123)=3.20, p=0.01). Bonferroni adjusted simple effects revealed that there were no group differences on DayBlock 1 (all p>0.99) whereas on DayBlocks 3 and 4 the posterior group had significantly lower scores than the sham group (all p<0.01, all remaining p>0.05). Similar to acquisition, the posterior lesion group were significantly impaired in sign-tracking to the CS+ during reversal.

Discussion

Our results demonstrated two important neural and behavioural dissociations within the rodent OFC. First, we directly confirmed the dissociable role of the rodent OFC in Pavlovian but not instrumental behavioural flexibility following outcome devaluation (Gallagher et al., 1999; Ostlund and Balleine, 2007a). Second, we showed a novel dissociation within anterior and posterior subregions of rodent LO in outcome devaluation, sign-tracking, and reversal learning procedures. Together, these findings indicate that many contradictory findings in OFC research may be reconciled as functional heterogeneity within the putative orbital subregions.

Outcome devaluation

Successfully updating behaviour in an outcome devaluation procedure provides strong evidence that the organism has (i) a representation of the specific identity of the predicted outcome, (ii) access to its current motivational value, and (iii) can flexibly update behaviour based on this information. Prominent model-based and sensory-specific outcome-expectancy coding accounts of the OFC argue that deficits in outcome devaluation following OFC lesions are due to an inability to access the representation of the specific identity of expected outcomes (Delamater, 2007; Wikenheiser et al., 2017; Wilson et al., 2014). Alternatively, these deficits can be modelled as an inability to use a cognitive model of the task structure to mentally ‘simulate’ the consequences of their actions on future states (Wilson et al., 2014).

Model-based and sensory-specific outcome-expectancy coding accounts of the OFC (Delamater, 2007; Rudebeck and Murray, 2014; Wilson et al., 2014) predict that OFC lesions should disrupt the devaluation effect in both instrumental and Pavlovian outcome devaluation. However, the absence of an effect of OFC lesions on instrumental devaluation suggests that the representations of the specific properties of instrumental and Pavlovian outcomes are dissociable (Ostlund and Balleine, 2007b). This absence also suggests that organisms represent task states and/or state transitions differently if they are caused by an instrumental action or an external Pavlovian stimulus, a distinction that has recently been incorporated into some model-based reinforcement learning theories (Dayan and Berridge, 2014; Lesaint et al., 2014; Zhang et al., 2009). Consistent with a selective role for the OFC in Pavlovian model-based inferences, there is mounting evidence that the OFC is necessary for making inferences in procedures that require a model-based representation of the relationship between external cues (Jones et al., 2012; Sadacca et al., 2018). Given that outcome devaluation is considered a canonical deficit following OFC lesions, the absence of a lesion deficit in instrumental outcome devaluation must be incorporated into theories of OFC function.

In this paper we explored a number of possible reasons for the reported absence of an OFC lesion deficit in instrumental outcome devaluation (Ostlund and Balleine, 2007a). One possibility is that the authors used task parameters that were not appropriate to detect a subtle OFC lesion deficit. For example, the number of distinct responses/outcomes that are trained concurrently affects the sensitivity of instrumental conditioning to outcome devaluation. Behaviour can become insensitive to devaluation (habitual) with overtraining (Adams, 1982; Dickinson, 1985) if only a single lever-outcome procedure is used. This overtraining effect is abolished if training procedures allow the animal to experience multiple distinct action-outcome contingencies (Colwill and Rescorla, 1985; Kosaki and Dickinson, 2010). Ostlund and Balleine (2007a) employed two unique levers and outcomes during acquisition, and a simultaneous two-lever choice test following devaluation. We tested whether these parameters, which have been shown to promote devaluation sensitive behaviour, masked a subtle OFC lesion deficit in instrumental devaluation. Our task employed a single lever-outcome design, and a random interval schedule of training, both of which have been shown to encourage the formation of habitual/devaluation insensitive behaviour (Adams, 1982; Dickinson, 1985). Despite these parameters, both sham and lesion groups exhibited a weak but robust devaluation effect (Figure 1D).

Another possibility is that the method of devaluation can affect whether OFC lesion deficits are observed in instrumental outcome devaluation. Ostlund and Balleine (2007a) used sensory specific satiety rather than lithium-chloride induced taste aversion as the method of devaluation. These two devaluation methods are often used interchangeably in computational models of learning (e.g. Dranias et al., 2008; Grossberg et al., 2008) and often yield similar results following lesions of neural regions involved in goal-directed and habitual behavioural control (Coutureau and Killcross, 2003; Killcross and Blundell, 2002; Killcross and Coutureau, 2003). Our results confirm that the absence of an OFC lesion deficit on instrumental devaluation is not simply due to a difference between lithium-chloride taste aversion and sensory specific satiety devaluation methods.

In contrast to instrumental devaluation, we found that OFC lesions abolished the appropriate reduction in Pavlovian approach responding when the outcome was devalued by taste aversion (Gallagher et al., 1999; Pickens et al., 2003, 2005). These findings confirm that focal lesions of LO are sufficient to disrupt Pavlovian devaluation in rodents, an effect previously reported with substantially larger OFC lesions targeting VO, LO, DLO, AI, and MO subregions (Gallagher et al., 1999; Pickens et al., 2003, 2005).

Consistent with model-based theories of OFC function, at test LO lesions disrupted the ability to infer the new expected value of the CS that predicted the devalued outcome. Specifically, OFC is argued to be necessary for inferring ‘hidden’ task states, such as the new value of the expected outcome, when this information is not externally available/signalled (Wilson et al., 2014). Consistent with this prediction, providing a brief re-exposure to the devalued reward immediately prior to test (reinstatement test) allowed LO lesioned animals to successfully integrate the devalued outcome into their anticipatory responding. However, this result must be interpreted with caution as it is possible that re-exposure to the devalued US in the magazine resulted in some form of short-term avoidance of the magazine that persisted throughout the subsequent test session when the devalued CS was presented.

Sign-tracking and reversal learning

A consistently reported finding is that OFC lesions leave intact the initial acquisition and behavioural expression of either cue-outcome or action-outcome contingencies (Boulougouris et al., 2007; Chudasama et al., 2007; Chudasama and Robbins, 2003; Gallagher et al., 1999; Rudebeck and Murray, 2011a; Schoenbaum et al., 2002; but see Walton et al., 2011). This finding is critical to ruling out many alternative explanations of the effect of OFC lesions on outcome devaluation such as general learning deficits. In the present experiments, anterior LO lesions did not affect instrumental conditioning, Pavlovian conditioning, or taste aversion learning. Unexpectedly, lesions that were focused on the posterior LO region did suppress behavioural responding during Pavlovian acquisition. This effect is not simply a general suppression of activity, as there was no difference in locomotor activity (Figure 2—figure supplement 2), nor a change in appetite, as there was no difference in consumption levels at the start of taste aversion learning.

One possible account of the reduced Pavlovian conditioned approach behaviour in the posterior LO group is that the CS did not acquire incentive salience. Incentive salience refers to the process by which the incentive-motivational properties of the outcome are transferred to the CS (Berridge, 2004), such that if a lever CS is presented a rat will attempt to ‘consume’ the lever as if it were the pellet that it predicts. This behaviour directed at the lever CS (sign-tracking) comes at the expense of the traditional Pavlovian approach response to the site of reward delivery, the magazine (goal-tracking). Sham control and anterior LO lesions did not affect the propensity to acquire sign-tracking behaviour, whereas sign-tracking was significantly reduced following posterior LO lesions. This finding is consistent with evidence that rats showing stronger sign-tracking tendencies have increased c-fos activity in posterior OFC regions following lever cue presentation (Flagel et al., 2011). This suggests that the posterior but not the anterior LO mediates the attribution of incentive-salience to cues. Alternatively, posterior LO may be involved in resolving response competition when multiple responses are supported by a predictive cue. In the present experiment, extensive Pavlovian training during the outcome devaluation procedure preceded the sign-tracking procedure, which may have resulted in a pre-existing dominant magazine approach response that could not be overcome following posterior LO lesions.

Surprisingly, extensive LO lesions have also been shown to have no effect on sign-tracking behaviour (Chang, 2014), but did retard subsequent reversal learning when rewarded (CS+) and non-rewarded (CS-) lever cues reversed reward contingencies. The present study found a similar impairment in reversal learning following posterior but not anterior LO lesions. This reversal deficit was not simply due to differences in the acquisition of the sign-tracking response as the posterior LO lesioned animals could reverse their lever approach behaviour. Instead, the deficit was specific to the magazine approach response which failed to extinguish in the posterior LO group when the previously rewarded CS+ was reversed to a non-rewarded CS-.

It is important to consider the limitations of a pre-training lesion approach to manipulating OFC function. While pre-training lesions guarantee loss of OFC function throughout training, it is possible that other neural regions might compensate for the loss of this function. Pre-training lesions are an important approach for probing deficits in acquisition, but limit inferences about whether these deficits reflect impaired encoding or retrieval. For example, while it has been shown that OFC lesions disrupt outcome devaluation when performed before and after training (Gallagher et al., 1999; Pickens et al., 2003, 2005), the effect of OFC lesions on reversal learning depends on whether they occur before or after initial training (Boulougouris et al., 2007; Boulougouris and Robbins, 2009). Further research is required to clarify the nature and extent of the anterior and posterior LO lesion deficits and how they relate to the function of the orbitofrontal region as a whole.

Rodent and primate homology

Human and non-human primate OFC can be defined cytoarchitectonically using clear granular, agranular, and dysgranular areas (Price, 2006). In contrast, the rodent OFC only consists of agranular cortical regions, which led Brodmann, 1909 to conclude that rodents do not have a comparable orbital or frontal cortex. However, Rose and Woolsey, 1948 proposed a different approach to identifying rodent homologs of the orbital and frontal cortex based on similar connectivity between the putative OFC of rabbits and cats and the mediodorsal nucleus of the thalamus (MD). This approach, based on MD connectivity, has been repeatedly extended to other regions of the frontal cortex in rodents (Groenewegen, 1988; Uylings et al., 2003). However, Price, 2006 noted that Brodmann’s original problem of defining precise homologs between rodent and primate OFC with comparable cytoarchitecture still remains (an argument that some researchers have maintained, for example Preuss, 1995; Rolls, 2014; Rudebeck and Murray, 2011a; Wise, 2008). The solution to this problem has been to base rodent and primate OFC homology on a combination of similar connectivity and functional evidence (e.g. Roesch and Schoenbaum, 2006; Rudebeck and Murray, 2014).

The hallmark behavioural consequences of OFC lesions in rodents and primates, critical to establishing cross-species homology, have been questioned. For example, deficits in extinction learning have been cited and form the basis of models of OFC function (Butter, 1969; Kolb et al., 1974; Wilson et al., 2014), but have been poorly replicated (Burke et al., 2009; Panayi and Killcross, 2014). The two behavioural disturbances following OFC damage that have dominated the literature (Murray et al., 2007) are impaired reversal learning and outcome devaluation deficits. Recently, the robustness of reversal learning deficits following OFC lesions in primates has been challenged as an artefact of aspiration lesions which can cause unintended damage to neighbouring white matter tracts (Rudebeck et al., 2013). Fibre sparing excitotoxic lesions fail to replicate reversal learning deficits but do significantly disrupt outcome devaluation in primates. This finding has important implications for questions of homology between primate and rodent OFC as it suggests very few functional similarities exist. However, the apparent lack of functional similarities may be a consequence of poor OFC subregion specificity within the rodent and primate literature.

Our findings provide the first evidence of a dissociation of devaluation and reversal learning deficits within anterior and posterior regions of the lateral OFC subregion. Specifically, both anterior and posterior LO are necessary for updating behaviour based on the current value of expected outcomes (i.e. disrupt devaluation performance), but only posterior LO appears to be necessary for rapidly updating behaviour when predictive cue-outcome contingencies change (i.e. reversal learning deficits). Recently Murray et al. (2015) provided similar demonstrations of functional dissociations between anterior (area 11) and posterior (area 13) macaque OFC in Pavlovian outcome devaluation. Together these data suggest the importance of anterior-posterior differences in OFC subregions that complement the growing literature on functional differences between medial and lateral OFC subregions in both rodents (Balleine et al., 2011; Bradfield et al., 2015; Corwin et al., 1994; Izquierdo, 2017; Mar et al., 2011) and primates (Bouret and Richmond, 2010; Noonan et al., 2010; Rudebeck and Murray, 2011a; Walton et al., 2015).

Theoretical accounts of OFC function

The importance of differentiating OFC subregions has implications for theories of OFC function. One class of theories of OFC function proposes that the OFC represents information about the sensory specific properties of expected outcomes (Burke et al., 2008; Delamater, 2007; Schoenbaum and Esber, 2010; Schoenbaum et al., 2009). During Pavlovian conditioning in normal animals, a stimulus may form associations with multiple features of a reward such as its general motivational properties and sensory specific properties; associative activation of these different properties can lead to different classes of responses such as general preparatory or specific consummatory responses (Delamater, 2007, 2012; Dickinson and Dearing, 1979; Hall, 2002; Konorski, 1967; Wagner and Brandon, 1989). Here, the OFC is proposed as the neural substrate of the associatively activated representation of an expected reward’s sensory specific properties. For example, if an animal learns that a tone stimulus predicts lemon flavoured sucrose reward, then in the presence of the tone and in anticipation of reward delivery the OFC might represent information about the lemon flavour, viscous fluid properties, and sweet taste of the upcoming reward. This theory accounts for the effect of OFC lesions in outcome devaluation since an animal needs to know which outcome is predicted (outcome identity) to selectively reduce anticipatory responding for a no-longer valuable outcome.

Model-based theories of OFC function can be considered modern extensions of these sensory-specific encoding accounts. When the sensory-sensory associations formed between a CS and the sensory properties of a reward are relevant to solving a task (as in outcome devaluation), they can be interpreted more generally as forming part of the task structure. We propose that our lesioned animals could represent the specific properties of the expected outcome but could not use this representation/underlying task-structure to access the current motivational value of that outcome. This proposal is in line with sensory specific outcome expectancy theories of OFC function but suggests a limited role for the anterior LO in accessing the current/updated expected value of an outcome based on its sensory properties and using this to modulate behaviour accordingly. It may be that a unified representation of an expected outcome, such as predicted likelihood, taste, location, hedonic value, and motivational value is represented across multiple OFC subregions.

Further evidence for the distribution of these representations across multiple OFC subregions in rodents comes from recent studies showing similar dissociations. For example, we replicated the absence of an effect of anterior LO lesions on sensory specific Pavlovian-to-instrumental transfer (sPIT, Appendix 1—figure 1A; Ostlund and Balleine, 2007a), a procedure that requires inferences between a number of hidden task states. In contrast, larger pre-training lesions encompassing both LO and VO do disrupt the sPIT effect (Balleine et al., 2011; Scarlet et al., 2012), but not instrumental outcome devaluation. Recently, Parkes et al. (2018) showed that chemogenetic disruption of posterior LO and DLO can disrupt instrumental devaluation under certain training conditions. Together, these findings suggest that the role of the OFC as a whole may be to generate a cognitive map of underlying task structure. However, the encoding and subsequent use of these underlying task structures to guide behaviour appears to be distributed amongst the many orbital subregions.

A similar conclusion is reached by Murray et al. (2015) in macaques, who found that temporary inactivation of anterior OFC (area 11) disrupted satiety devaluation when inactivation occurred at test but not when inactivation occurred during the satiety devaluation procedure prior to test. In contrast, posterior OFC (area 13) inactivation only disrupted performance when inactivated during the satiety procedure but not at test. This suggests that posterior OFC in macaques is necessary for updating the value of expected rewards, whereas anterior OFC is critical for translating this knowledge into behaviour. This parallels our suggested role for the anterior LO in rodents in accessing the current value of an expected outcome to guide behaviour. Furthermore, this potential homology predicts that posterior LO in rodents might be important for value updating. Our findings provide prima facie evidence for this prediction, showing that posterior LO lesions suppress overall levels of Pavlovian learning, and extinction of learnt value during reversal learning, consistent with impoverished value updating. However, direct tests of this dissociation are still needed to confirm this homology between rodent and non-human primate OFC.

Materials and methods

Animals

Rats were housed four per cage in ventilated Plexiglass cages in a temperature regulated (22 ± 1°C) and light regulated (12 hr light/dark cycle, lights on at 7:00 AM) colony room. At least one week prior to behavioural testing, feeding was restricted to ensure that weight was approximately 95% of ad libitum feeding weight, and never dropped below 85%. All animal research was carried out in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals (NIH publications No. 80–23, revised 1996) and approved by the University of New South Wales Animal Care and Ethics Committee. Subjects were forty-eight male Long Evans rats (Monash Animal Services, Gippsland, Victoria, Australia) approximately 4 months old (Experiment 1, N = 32, weighing between 301–359 g, M = 326.6 g; Appendix 1 Experiment, N = 16, weighing between 321–399 g, M = 342.1 g), and one hundred and twelve male Wistar rats (BRC Laboratory Animal Service, University of Adelaide, South Australia, Australia) approximately 4 months old (Experiment 2, N = 64, weighing between 343–452 g, M = 403.6 g).

Apparatus

Behavioural testing was conducted in eight identical operant chambers (30.5 × 32.5 × 29.5 cm; Med Associates) individually housed within ventilated sound attenuating cabinets. Each chamber was fitted with a 3 W house light that was centrally located at the top of the left-hand wall. Food pellets could be delivered into a recessed magazine, centrally located at the bottom of the right-hand wall. Delivery of up to two separate liquid rewards via rubber tubing into the magazine was achieved using peristaltic pumps located above the testing chamber. The top of the magazine contained a white LED light that could serve as a visual stimulus. Access to the magazine was measured by infrared detectors at the mouth of the recess. Two retractable levers were located on either side of the magazine on the right-hand wall. A speaker located to the right of the house light could provide auditory stimuli to the chamber. In addition, a 5 Hz train of clicks produced by a heavy-duty relay placed outside the chamber at the back right corner of the cabinet was used as an auditory stimulus. The chambers were wiped down with ethanol (80% v/v) between each session. A computer equipped with Med-PC software (Med Associates Inc., St. Albans, VT, USA) was used to control the experimental procedures and record data.

Devaluation chambers. 

To provide individual access to reinforcers during the devaluation procedure, rats were individually placed into a mouse cage (33 × 18 × 14 cm clear Perspex cage with a wireframe top). Pellet reinforcers were presented in small glass ramekins inside the box and liquid reinforcers were presented in water bottles with a sipper tube. One day prior to the start of the devaluation period, all rats were exposed to the mouse cages and given 30 mins of free access to home cage food and water to reduce the novelty of the context and of consuming from the ramekin and water bottles.

Locomotor activity was assessed in a set of 4 rat open field arenas (Med Associates Inc., St. Albans, VT, USA) individually housed in light and sound attenuating cabinets. A 3 W light attached on the top left corner of the sound attenuating cabinet provided general illumination in the chamber and was always on. A 28 V DC fan on the right hand wall of the sound attenuating cabinet was also left on throughout testing to mask outside noise. The floor of the open field arena was smooth plastic and the four walls were clear Perspex with a clear Perspex roof containing ventilation holes. The internal dimensions of the chamber were 43.2 × 43.2 × 30.5 cm (length x width x height). Two opposing walls contained an array of 16 evenly spaced infrared detectors set 3 cm above the floor to detect animal locomotor activity. A second pair of infrared beam arrays was set 14 cm above the floor on the remaining walls to detect rearing behaviours. Infrared beam breaks were recorded using a computer equipped with Activity Monitor software (Med Associates Inc., St. Albans, VT, USA) which provided a measure of average distance travelled based on beam break information.

Surgery

Excitotoxic lesions targeting the lateral OFC were performed prior to any training. Rats were anesthetized with isoflurane, their heads shaved, and placed in a stereotaxic frame (World Precision Instruments, Inc., Sarasota, FL, USA). The scalp was incised, and the skull exposed and adjusted to flat skull position. Two small holes were drilled into the skull and the dura mater was severed to reveal the underlying cortical parenchyma. A 1 µL Hamilton needle (Hamilton Company, Reno, NV, USA) was lowered through the two holes targeting the lateral OFC (co-ordinates specified below). At each site the needle was first left to rest for 1 min. Then N-methyl-D-aspartic acid (NMDA; Sigma-Aldrich, Switzerland), dissolved in phosphate buffered saline (pH 7.4) to achieve a concentration of 10 μg/μL, was infused for 3 mins at a rate of 0.1 µL/min. Finally, the needle was left in situ for a further 4 mins to allow the solution to diffuse into the tissue. Following the diffusion period the syringe was extracted and the scalp cleaned and sutured. Sham lesions proceeded identically to excitotoxic lesions except that no drugs were infused during the infusion period. After a minimum of 1 week of postoperative recovery, rats were returned to food restriction for 2 days prior to further training.
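
As a rough check on the infusion parameters, the volume and NMDA mass delivered at each site follow from the rate, duration, and concentration. The sketch below assumes the infusion rate is 0.1 µL/min; the function and variable names are illustrative only and are not part of the original protocol.

```python
def infusion_totals(rate_ul_per_min: float, duration_min: float,
                    conc_ug_per_ul: float) -> tuple[float, float]:
    """Return (volume in µL, drug mass in µg) delivered at one site."""
    volume_ul = rate_ul_per_min * duration_min
    mass_ug = volume_ul * conc_ug_per_ul
    return volume_ul, mass_ug


# Values taken from the surgery description (rate unit assumed to be µL/min).
volume_ul, mass_ug = infusion_totals(0.1, 3.0, 10.0)
print(f"{volume_ul:.1f} µL, {mass_ug:.1f} µg NMDA per site")
```

Under these assumptions each site receives about 0.3 µL of solution, or roughly 3 µg of NMDA, which is within the 1 µL capacity of the syringe.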

Animals were randomly assigned to one of two lesion conditions in Experiment 1 and the Supplementary Experiment (Appendix 1—figure 1A), with the following stereotaxic co-ordinates: AP: +3.5 mm, ML: ±2.2 mm, D-V: −5.0 mm from bregma (Experiment 1, sham, n = 16; lesion, n = 16; Supplementary Experiment, sham, n = 8; lesion, n = 8). In Experiment 2, three sets of lesion co-ordinates were used to encourage distinct lesion subgroups. The co-ordinates used were AP: +4.2 mm, ML: ±2.6 mm, D-V: −4.8 mm (n = 16 lesion, n = 6 sham), AP: +3.7 mm, ML: ±3.2 mm, D-V: −5.0 mm (n = 16 lesion, n = 5 sham) and AP: +3.7 mm, ML: ±2.6 mm, D-V: −5.0 mm (n = 16 lesion, n = 5 sham). Final group designation was based on post-experimental lesion characterisation.

Reinforcers

The reinforcers used were a single grain pellet (45 mg dustless precision grain-based pellets; Bio-serv, Frenchtown, NJ, USA), 20% w/v sucrose solution and 20% w/v maltodextrin solution (Myopure, Petersham, NSW, Australia). Liquid reinforcers were flavoured with either 0.4% v/v concentrated lemon juice (Berri, Melbourne, Victoria, Australia) or 0.2% v/v peppermint extract (Queen Fine Foods, Alderley, QLD, Australia) to provide unique sensory properties to each reinforcer. Liquids were delivered over a period of 0.33 s via a peristaltic pump corresponding to a volume of 0.2 mL. The volume and concentration of liquid reinforcers were chosen to match the calorific value of the corresponding grain pellet reward, and have been found to elicit similar rates of Pavlovian and instrumental responding as a pellet reward in other experiments conducted in this lab. In all experiments involving liquids, the magazine was scrubbed with warm water and thoroughly dried between sessions to remove residual traces of the liquid reinforcer. To reduce neophobia to the reinforcers, one day prior to magazine training sessions all animals were pre-exposed to the reinforcers (10 g of pellets per animal and 25 mL of liquid reinforcer per animal) in their home cage.
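
The liquid delivery parameters imply a pump flow rate and a solute mass per reward. The following is a minimal sketch of that arithmetic only; it does not attempt to reproduce the calorific matching described above, and the constant names are illustrative.

```python
PUMP_TIME_S = 0.33       # pump activation per reward (from the text)
VOLUME_ML = 0.2          # volume per reward (from the text)
CONC_W_V_PERCENT = 20    # 20% w/v means 20 g of solute per 100 mL

# Average flow rate while the pump is running.
flow_ml_per_s = VOLUME_ML / PUMP_TIME_S

# Solute mass per reward: (g per mL) * mL, converted to mg.
solute_mg = CONC_W_V_PERCENT / 100 * VOLUME_ML * 1000

print(f"flow ≈ {flow_ml_per_s:.2f} mL/s, solute ≈ {solute_mg:.0f} mg per reward")
```

Each liquid reward therefore carries about 40 mg of sucrose or maltodextrin, which is close to the 45 mg mass of a single grain pellet.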

Magazine training

In all experiments, animals received two sessions of magazine training, one for each reinforcer, with the following parameters: reward delivery was on an RT60s schedule for 16 rewards with the house light and fan kept on throughout the session. Sessions were separated by at least 2 hr.

Experiment 1: instrumental devaluation by LiCl taste aversion

All animals received 2 separate sessions of training each day with the pellet and sucrose rewards: an instrumental lever training session (lever extended) and a magazine training (lever retracted) session with non-contingent reward delivery to provide equivalent exposure to the alternative reward. The order of training sessions and the identity of the instrumental and alternate reward were fully counterbalanced across all groups. All training sessions were separated by a period of at least 2 hr.

First, animals were familiarised with lever training using a fixed ratio 1 schedule (FR1, reward delivered on each lever press), for 60 mins or until a maximum of 25 rewards were earned. The alternative, non-instrumental, reward was delivered on an RT30s (random time 30 s) schedule for 1 hr or until 25 rewards had been delivered.

Instrumental acquisition training occurred on the following 3 days. Instrumental training sessions lasted until 40 rewards were achieved and lever pressing was rewarded on an RI30s schedule (random interval 30 s such that on average every 30 s a reward becomes available to reward the next lever press). The alternate reward session involved an RT30s schedule for 40 rewards. The use of interval and time based schedules of reinforcement was designed to match the instrumental and alternate reward sessions so that all experiences were identical except for the presence (and response requirement) of the lever in the instrumental session.
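
The difference between the random interval (RI) and random time (RT) schedules can be made concrete with a small simulation. This is a sketch under stated assumptions, not the Med-PC program used in the study: it assumes that reward set-up is decided with a fixed probability of 1/30 in each 1 s tick, it runs for a fixed session length rather than until 40 rewards, and the press times are hypothetical.

```python
import random


def simulate_ri(press_times, mean_interval_s=30, session_s=600, seed=0):
    """Random interval: once armed, a reward waits for the next lever press."""
    rng = random.Random(seed)
    presses = set(press_times)
    armed, rewards = False, []
    for t in range(session_s):
        if not armed and rng.random() < 1 / mean_interval_s:
            armed = True          # a reward becomes available
        if armed and t in presses:
            rewards.append(t)     # this press collects the waiting reward
            armed = False
    return rewards


def simulate_rt(mean_interval_s=30, session_s=600, seed=0):
    """Random time: rewards are delivered regardless of behaviour."""
    rng = random.Random(seed)
    return [t for t in range(session_s) if rng.random() < 1 / mean_interval_s]


# Hypothetical animal pressing the lever once every 5 s.
ri_rewards = simulate_ri(press_times=range(0, 600, 5))
rt_rewards = simulate_rt()
print(len(ri_rewards), len(rt_rewards))
```

In this sketch the only difference between the two schedules is the response requirement: with no lever presses, `simulate_ri` delivers nothing, whereas `simulate_rt` delivers rewards at the same average rate whatever the animal does.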

Following devaluation of the reward by taste aversion, all animals were tested with the instrumental lever to assess devaluation. The test was conducted under extinction and the lever was extended for 10 mins. On the following day, all animals were given a 20 min re-acquisition test to assess devaluation in the presence of the instrumental reinforcer (RI30s schedule).

Taste aversion

Following instrumental training, all animals received taste aversion training on one of the reinforcers. Half the animals in each surgery condition (sham and lesion) were allocated to a devalued or a non-devalued group after being matched on their level of instrumental performance. The devalued groups received pairings of the instrumental reinforcer with 0.15 M LiCl injections i.p. (15 mL/Kg) after 30 mins of individual access to that reinforcer in the devaluation chamber. The non-devalued groups received similar LiCl injections following access to the alternate (i.e. non-instrumental) reinforcer. On alternating days (order counterbalanced) all groups received 0.9% w/v saline injections (15 mL/Kg) following 30 mins access to the reinforcer that was not paired with LiCl. This procedure was repeated over 6 days such that all animals received 3 reinforcer-LiCl pairings and 3 reinforcer-saline pairings. Therefore, all animals had one reinforcer devalued with LiCl such that in the devalued groups it was the instrumental reinforcer and in the non-devalued groups it was the alternate reinforcer.
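As a rough illustration of the alternating design just described, the following Python sketch lays out a 6 day schedule for one animal. The function name, argument names and the tuple layout are hypothetical and chosen only for this example; `licl_first` stands in for the counterbalanced order.

```python
def taste_aversion_schedule(licl_reinforcer, saline_reinforcer,
                            licl_first=True, n_days=6):
    """Return a list of (day, reinforcer, injection) tuples.

    LiCl and saline days strictly alternate; licl_first sets which
    injection is given on day 1 (counterbalanced across animals).
    """
    schedule = []
    for day in range(1, n_days + 1):
        odd_day = day % 2 == 1
        licl_day = odd_day == licl_first
        if licl_day:
            schedule.append((day, licl_reinforcer, "LiCl"))
        else:
            schedule.append((day, saline_reinforcer, "saline"))
    return schedule


# Devalued-group example: the instrumental reinforcer (here pellets) is
# paired with LiCl and the alternate reinforcer (sucrose) with saline.
plan = taste_aversion_schedule("pellet", "sucrose", licl_first=True)
```

For a non-devalued animal the two reinforcer arguments would simply be swapped, so that LiCl follows the alternate reinforcer.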

All animals were given an additional day without injections at the end of the taste aversion procedure before any further behavioural testing was conducted. This minimised the possibility of nausea persisting at test after the final LiCl injection, and ensured that all animals were at a comparable level of hunger at test.

Experiment 2: Pavlovian devaluation by LiCl taste aversion

Histology and lesion group allocation

Lesion damage is depicted in Figure 2A. Lesion extent was judged by a trained observer blind to group allocation. Once approximate lesion extent was drawn, a second trained observer (also blind to surgical conditions) independently verified the extent of the drawn lesions and the grounds for exclusion. Animals were excluded if there was only unilateral OFC damage, evidence of damage to the dorsal part of the anterior olfactory nucleus ventral to OFC or if there was extensive damage to the white matter of the forceps minor of the corpus callosum. Seven animals were excluded due to the presence of infection that was evident across the entirety of the frontal cortex, and a further two animals were excluded due to illness throughout behavioural training. Three animals were excluded due to insufficient bilateral damage to OFC structures. Seven animals were excluded based on significant unilateral or bilateral damage to the dorsal part of the anterior olfactory nucleus. One animal was excluded due to almost complete unilateral damage to primary and secondary motor areas M1 and M2, ventral to the OFC. Final group numbers were sham n = 13, lesion n = 31 (N = 44).

The lesion drawings were then analysed to establish the extent of damage to the subregions of the OFC from which two distinct lesion groups could be formed. OFC lesions were predominantly confined to LO and DLO as in previous experiments and were distributed across a large anterior-posterior range. This observation was quantified by estimating the percentage of bilateral damage across all OFC structures at 7 coronal planes (+5.20 to +2.20 mm from bregma in steps of 0.50 mm). At each coronal plane the total area of each orbital structure and the total area of lesion damage were estimated (number of pixels counted using Adobe Photoshop CS; San Jose, CA). Bilateral damage was defined by comparing the hemisphere with the smallest lesion area for each orbital subregion and the total area of the structure in that hemisphere. Total OFC damage at each section was defined by the sum of damaged area relative to the sum of the total area of each orbital structure, that is, % Bilateral OFC damage = 100 × (Total lesion area / Total orbital structure area). The OFC structures included in this analysis were LO, DLO, VO, AI, AId and AIv; however, the damage (Figure 2A) was relatively confined to LO and DLO. Most animals had OFC damage at +3.70 mm from bregma, so anterior and posterior lesions were based on comparing relative lesion volume anterior (+5.20, +4.70, +4.20 mm) and posterior (+3.20, +2.70, +2.20 mm) to this point. Animals with greater lesion damage anterior to +3.70 were assigned to the anterior OFC lesion group, and animals with greater lesion damage posterior to +3.70 were allocated to the posterior OFC group. While these criteria for anterior and posterior OFC were based on the present sample, these criteria also define the anterior-posterior split that defines DLO and AId/AIv, and the presence of the forceps minor of the corpus callosum, which supports the external validity of these criteria.
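The damage estimate and group allocation described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the analysis actually run: the function names and dictionary layout are hypothetical, anterior and posterior damage are compared by summing the per-plane percentages (the text says only that relative lesion volume was compared), and the handling of exact ties is not specified in the source.

```python
ANTERIOR_PLANES = [5.20, 4.70, 4.20]   # mm anterior to bregma
POSTERIOR_PLANES = [3.20, 2.70, 2.20]


def bilateral_damage_pct(lesion_left, lesion_right, total_left, total_right):
    """Percent bilateral OFC damage at one coronal plane.

    Each argument maps a subregion name (e.g. "LO") to an area in pixels.
    For every subregion the hemisphere with the smaller lesion area is used,
    together with the total area of the structure in that same hemisphere.
    """
    lesion_sum = 0.0
    total_sum = 0.0
    for region in total_left:
        if lesion_left[region] <= lesion_right[region]:
            lesion_sum += lesion_left[region]
            total_sum += total_left[region]
        else:
            lesion_sum += lesion_right[region]
            total_sum += total_right[region]
    return 100.0 * lesion_sum / total_sum


def assign_group(damage_by_plane):
    """Allocate an animal to the anterior or posterior lesion group.

    damage_by_plane maps plane (mm from bregma) to % bilateral damage.
    Assumption: damage is compared by summing percentages on each side of
    +3.70 mm; an exact tie is reported as "tie" for manual review.
    """
    anterior = sum(damage_by_plane[p] for p in ANTERIOR_PLANES)
    posterior = sum(damage_by_plane[p] for p in POSTERIOR_PLANES)
    if anterior > posterior:
        return "anterior"
    if posterior > anterior:
        return "posterior"
    return "tie"


# Illustrative pixel counts only, not data from the study.
pct = bilateral_damage_pct(
    lesion_left={"LO": 40, "DLO": 10},
    lesion_right={"LO": 60, "DLO": 5},
    total_left={"LO": 100, "DLO": 50},
    total_right={"LO": 120, "DLO": 50},
)
```

In this made-up example the left hemisphere supplies the LO values and the right hemisphere the DLO values, giving 100 × 45 / 150 = 30% bilateral damage at that plane.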

Final lesion group numbers were anterior n = 16, posterior n = 15. A Group (anterior, posterior) x Plane (+5.20, +4.70, +4.20, +3.70, +3.20, +2.70, +2.20) mixed ANOVA analysing the percentage bilateral lesion volume (Figure 2B) revealed no significant overall effect of Group (F(1, 29)=0.21, p=0.65) but a significant main effect of Plane (F(6, 174)=64.07, p<0.001) and Group x Plane interaction (F(6, 174)=17.70, p<0.001). Follow up planned contrasts comparing groups at each coronal plane revealed greater damage in the anterior group at +4.70 (F(1, 29)=7.87, p=0.01) and +4.20 (F(1, 29)=33.74, p<0.001), and greater damage in the posterior group at +3.70 (F(1, 29)=6.97, p=0.01) and +3.20 (F(1, 29)=12.78, p=0.001) but no significant differences at +5.20 (F(1, 29)=2.89, p=0.10) or +2.20 (F(1, 29)=1.86, p=0.18) (Figure 2B). These differences indicate that the grouping criteria were effective at creating partially overlapping but distinct lesion groups.

Acquisition

Pavlovian acquisition training occurred over 12 days involving one session of training per day. Each session consisted of 32 trials with a 90 s ITI, 15 s CS duration co-terminating with the delivery of a single reward. Two CS (5 Hz click and 78 dB white noise) and US (grain pellet and lemon sucrose) relationships were maintained throughout training such that rats always experienced 16 of each unique CS-US pairings each session (counterbalanced).

Taste aversion

Taste aversion to one of the rewards (counterbalanced) was achieved by pairing reward consumption with nausea induced by Lithium Chloride (LiCl; Sigma-Aldrich, Switzerland). All rats received 3 pairings of one reward with an i.p. injection of 0.15 M LiCl (15 mL/Kg) and 3 pairings of the other reward with saline (0.9% w/v; Sigma-Aldrich, Switzerland). The first 2 food-injection pairings occurred immediately after providing rats with 30 mins free access to the reward in the devaluation chambers. The final food-injection pairings occurred in the test chamber after rats were exposed to a magazine training session with one of the reinforcers (reward delivered randomly on an RT60 s schedule for 16 rewards). The order of food-injection pairings was counterbalanced and alternated across the 6 days of taste aversion training. The final food-injection pairings in the test chamber were conducted to ensure that the taste aversion transferred between the devaluation chambers and the testing chambers. All animals were given an additional day without injections at the end of the taste aversion procedure before any further behavioural testing was conducted. This minimised the possibility of nausea persisting at test after the final LiCl injection, and ensured that all animals were at a comparable level of hunger at test.

Devaluation test

Devaluation testing was identical to Pavlovian acquisition training except that it was performed under extinction, that is, no rewards were delivered throughout the session. The identity of the first and second cue was predetermined at test to allow for counterbalancing. Animals were tested again on the following day with the identity of the first cue changed to fully counterbalance the test procedure.

US-specific reinstatement

After the final devaluation test, all rats received a US-specific reinstatement test to verify whether any failure of devaluation was due to impaired retention of the acquired taste aversion. On each day animals were pre-exposed to a single US type within the test chamber before being tested with 8 presentations in extinction of the CS that predicted the US. Exposure sessions involved a 5 min baseline period in which nothing happened in the chamber, followed by a reward delivery every 5 s for 30 s (6 rewards), and then a post-reward period of 5 mins. After the session, rats were temporarily returned to their home cage to allow for any remaining rewards to be collected for counting later and thorough cleaning of the reward site. Rats were then returned to the testing chamber for a test consisting of 8 CS presentations (90 s ITI) in extinction with the CS that predicted the recently delivered US. The order of outcome testing across both days was fully counterbalanced.

Re-acquisition

After reinstatement testing, all rats received 3 days of re-acquisition training. These sessions were identical to Pavlovian acquisition training except that only the CS paired with the non-devalued US was presented for all 32 trials.

Autoshaping

Following the reacquisition training, all animals were trained for 12 days on a discriminated autoshaping procedure where the non-devalued reward continued to serve as the US. Each session consisted of 32 trials with a 90 s ITI and 15 s CS duration, 16 rewarded CS+ trials and 16 non-rewarded CS- trials. The CS+ and CS- involved the insertion of the lever on the left or right hand side of the magazine (counterbalanced). Responding on the lever had no programmed consequences but was recorded for analysis.

Reversal

Autoshaping was followed by reversal training for 12 days such that the CS+ and CS- contingencies were reversed, that is, the previously rewarded lever cue no longer predicted reward and the previously non-rewarded lever cue now predicted reward.

Locomotor screening

All animals were tested for locomotor activity before surgery, and again at the end of training, to verify the absence of any effects on locomotor activity in a within-subjects design.

Statistical analysis

Baseline responding. Baseline rates of responding across all experiments did not differ between groups. Separate mixed ANOVAs on baseline responding in each experimental stage did not reveal significant main effects or interactions with Group (all F < 1.75, p>0.14).

CS responding was operationalized as the time spent exploring the magazine during the 15 s CS period. PreCS responding was operationalized as the duration of responding during the 15 s immediately preceding the 15 s CS and was used as a measure of baseline responding to the testing context. All data were analysed with mixed ANOVAs, and significant interactions of interest were followed up with ANOVAs on the relevant subset of data. Following significant omnibus ANOVA tests, planned linear and quadratic orthogonal trend contrasts and their interactions between groups were analysed to assess differences in rates of responding.
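The CS and PreCS duration measures defined above could be computed from timestamped magazine visits along the following lines. This is a hedged sketch: the representation of visits as (entry, exit) pairs in seconds and the function names are assumptions made for illustration, not a description of the recording software.

```python
def overlap(start, end, win_start, win_end):
    """Seconds of the visit [start, end] that fall inside the window."""
    return max(0.0, min(end, win_end) - max(start, win_start))


def cs_and_precs_duration(visits, cs_onset, cs_len=15.0):
    """Return (CS duration, PreCS duration) of magazine time for one trial.

    visits   -- iterable of (entry_time, exit_time) pairs in seconds
    cs_onset -- time at which the CS starts
    The PreCS window is the cs_len seconds immediately preceding the CS.
    """
    cs = sum(overlap(s, e, cs_onset, cs_onset + cs_len) for s, e in visits)
    pre = sum(overlap(s, e, cs_onset - cs_len, cs_onset) for s, e in visits)
    return cs, pre


# Illustrative visit times only: one visit straddles CS onset at 90 s.
cs, pre = cs_and_precs_duration(
    [(80.0, 92.0), (95.0, 100.0), (104.0, 110.0)], cs_onset=90.0
)
```

Here the animal spends 8 s in the magazine during the CS and 10 s during the PreCS period; a visit that straddles a window boundary contributes only the portion that lies inside each window.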

Appendix 1

Effect of OFC lesions on sensory-specific Pavlovian to instrumental transfer

Similar to Ostlund and Balleine (2007a), we tested whether our OFC lesions affected the use of sensory-specific Pavlovian information to guide instrumental responding using a Pavlovian to instrumental transfer (PIT) test.

Methods

Acquisition training

On each day all animals received either a single Pavlovian training session, or two instrumental training sessions. The order of Pavlovian and instrumental sessions alternated each day.

Pavlovian training

All animals received a total of 16 days of Pavlovian training. Pavlovian training sessions consisted of 3 CSs, a 2800 Hz, 80 dB tone, 78 dB white noise and a 5 Hz train of clicks. There were 4 presentations of each cue (i.e. a total of 12 cues presented within a session) each lasting 2 mins with a variable ITI of 300s. Reward was delivered throughout the cue period on a RT 30s schedule. Each cue was paired with a unique outcome (grain pellet, lemon sucrose, and peppermint maltodextrin) and the identity of that outcome remained constant. All unique cue-outcome combinations were counterbalanced across animals and within groups.

Instrumental training

Prior to Pavlovian and instrumental acquisition training all animals were given 2 days of lever training on a continuous reinforcement schedule (each lever press was rewarded) using the same parameters as the instrumental training sessions.

All animals received a total of 12 days of instrumental training. Instrumental training involved two sessions per day, separated by at least one hour. During the first session a single lever was extended and lever pressing was rewarded with a unique liquid outcome, either lemon sucrose or peppermint maltodextrin. During the second instrumental session of the day, a different lever was extended and lever pressing was rewarded with the unique liquid outcome that was not paired with the earlier lever. The identity of the lever-outcome pairings was kept constant throughout training and was counterbalanced between subjects and within groups. After initial lever acquisition, animals received three days of random interval (RI) 15 s, three days of RI 30 s and six days of RI 60 s.

PIT test

The PIT test involved a single lever presented at the start of the session for 10 mins with no programmed consequences to extinguish lever pressing behavior to a low baseline rate (this allows for clearer demonstration of the potential rate-enhancing effect of CS presentations). Then the CSs were played for 2 min with a fixed 2 min inter-stimulus interval. Each CS was played three times (a total of 9 CS presentations) and the order of CS presentation was randomized. Throughout the session no rewards were delivered and lever pressing and magazine entry were recorded with no programmed consequences. A second identical test session was conducted on the following day using the lever that had yet to be tested. Order of lever presentation was counterbalanced. This pattern of tests was repeated once after 4 days of retraining on Pavlovian and instrumental sessions.

Results

Histology

Lesion damage is depicted in Appendix 1—figure 1A. One sham animal was excluded due to extensive damage to primary and secondary motor areas M1 and M2. Final N = 15; sham n = 7, lesion n = 8.

Behavioural results

Pavlovian

A mixed Group x Day (16 days) x US (sucrose, maltodextrin, pellet) ANOVA was conducted on CS-PreCS magazine entry rate to quantify Pavlovian acquisition. This analysis revealed that responding was greater for pellets than sucrose reinforcers (F(1, 13)=8.69, p=0.03; main effect of US, F(2, 26)=4.56, p=0.02; no other differences between reinforcers reached significance: sucrose vs maltodextrin, F(1, 13)=0.37, p=0.91; maltodextrin vs pellets, F(1, 13)=4.42, p=0.16). Surprisingly, CS responding was significantly greater in the sham than the lesion group (F(1, 13)=12.49, p=0.004). Importantly, there were no significant interactions between Group, Day, or US (remaining F < 1.35, p>0.11). Both groups showed significant acquisition over days of training (significant main effect of Day, F(15, 195)=4.86, p<0.001, significant positive linear, F(1, 13)=8.69, p<0.001, and negative quadratic trend, F(1, 13)=6.29, p=0.03), and responding (entries per minute, CS-PreCS) on the final day of acquisition was as follows: sham (M = 7.76, SD = 1.36), lesion (M = 6.03, SD = 1.27). It is likely that the increased responding to the pellet reinforcer is a result of the use of magazine frequency as a measure, as we routinely observe the opposite pattern when using magazine duration as a measure (e.g. Experiment 2). Unfortunately, magazine duration data were not recorded during this experiment, so we could not determine whether the difference between sham and lesion groups was only present on the frequency measure. Furthermore, it is important to note that this measure of Pavlovian conditioning is conflated with consummatory responses since the USs were delivered at random times throughout the CS, and as such it is hard to draw clear conclusions about any observed differences in responding.

Instrumental acquisition did not differ between groups, a pattern supported by a mixed Group x Day (12 days) ANOVA finding a significant main effect of Day (F(11, 143)=58.15, p<0.001) but no significant effect of Group (F(1, 13)=1.62, p=0.23) or Group x Day interaction (F(11, 143)=1.10, p=0.36). Response levels (lever presses per minute) on the final day of instrumental training were similar in sham (M = 11.00, SD = 3.66) and lesion (M = 9.12, SD = 3.61) groups.

Extinction of magazine and lever responding in the 10 min prior to testing did not reveal any group differences (Figure S1 B, C). Separate mixed Group x Block (10 blocks of 1 min) ANOVAs on lever pressing and magazine approach revealed significant main effects of Block (lever pressing F(9, 117)=8.95, p<0.001, magazine entries F(9, 117)=2.60, p=0.01) but no effect of Group or Group x Block interactions (remaining F < 1.42, p>0.19).

At test, lever pressing was assessed in the presence of the CSs that either predicted the same outcome as the instrumental response, a different outcome (predicted by the alternative instrumental response) or a general outcome not predicted by either instrumental response. In both groups lever pressing was potentiated most by CS same, moderately by CS different and minimally by CS general (Figure S1 D). A mixed Group x Cue (same, different, general) ANOVA confirmed that responding differed between cues (main effect of Cue F(2, 26)=6.32, p=0.01) but was not differentially affected by lesion group (main effect of Group F(1, 13)=0.04, p=0.85, Group x Cue interaction F(2, 26)=1.26, p=0.30). Bonferroni adjusted simple main effects revealed that responding to CS same was greater than CS general (F(1, 13)=10.25, p=0.02), however CS same did not differ from CS different (F(1, 13)=3.58, p=0.24) and CS different did not differ from CS general (F(1, 13)=3.92, p=0.21).
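For readers unfamiliar with the Bonferroni adjustment applied to the pairwise comparisons above, the following Python sketch shows the standard form of the correction: each unadjusted p-value is multiplied by the number of comparisons and capped at 1. The function name is hypothetical, and the p-values in the example are illustrative numbers, not values from this study.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p-values.

    Each p-value is multiplied by the number of comparisons in the family
    and capped at 1.0 so that it remains a valid probability.
    """
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Three pairwise comparisons (e.g. same vs general, same vs different,
# different vs general); the unadjusted values here are made up.
adjusted = bonferroni([0.01, 0.25, 0.5])
```

With three comparisons an unadjusted p of 0.01 becomes 0.03, so a comparison must reach roughly p < 0.017 before adjustment to remain significant at the 0.05 level.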

Additional comparisons examined whether responding to each cue was significantly different from baseline (i.e. 0). The data were collapsed across groups as there was no significant interaction with group. Lever responding was significantly greater than baseline for CS Different (F(1, 13)=8.29, p=0.01) and CS Same (F(1, 13)=20.20, p=0.001), but not for CS General (F(1, 13)=0.35, p=0.56) which suggests that there was no significant evidence of a general PIT effect for CS General.

Magazine responding during the test session was not differentially affected by either group or cues (Figure S1 E). A mixed Group x Cue (same, different, general) ANOVA supported this observation with all effects failing to reach significance (all F < 1.23, p>0.29).

Appendix 1—figure 1
The effects of excitotoxic OFC lesions on specific Pavlovian-to-instrumental transfer (PIT).

(A) Representative OFC lesion damage in the lesion group. Semi-transparent grey patches represent lesion damage in a single subject, and darker areas represent overlapping damage across multiple subjects. Coronal sections are identified in mm relative to bregma (Paxinos and Watson, 1997). Rate of lever pressing (B) and magazine entry behaviour (C) during extinction of the instrumental response prior to PIT testing. Instrumental lever pressing (D) and magazine entry behaviour (E) during the specific PIT test. Responding is plotted as the mean response rate per minute during each cue minus the preceding baseline no-cue period. Same and different conditions indicate whether the Pavlovian CS predicted the same or a different liquid reinforcer to the instrumental response; the general condition indicates responding during the CS that predicted pellets, which were never an instrumental reinforcer. Error bars depict + SEM. (*) Symbol denotes statistical significance of simple or main effects following a significant interaction.

https://doi.org/10.7554/eLife.37357.009

References

  8. 8
    Vergleichende Lokalisationslehre Der Großhirnrinde in Ihren Prinzipien Dargestellt Auf Grund Des Zellenbaues
    1. K Brodmann
    (1909)
    Leipzig: Barth.
  20. 20
    The role of the orbitofrontal cortex in sensory-specific encoding of associations in Pavlovian and instrumental conditioning
    1. AR Delamater
    (2007)
    In: G Schoenbaum, J. A Gottfried, E. A Murray, S. J Ramus, editors. Linking Affect to Action: Critical Contributions of the Orbitofrontal Cortex, 1121. Oxford: Blackwell Publishing. pp. 152–173.
    https://doi.org/10.1196/annals.1401.030
  22. 22
    Appetitive-aversive interactions and inhibitory processes
    1. A Dickinson
    2. MF Dearing
    (1979)
    In: A Dickinson, R. A Boakes, editors. Mechanisms of Learning and Motivation: A Memorial Volume to Jerzy Konorski. Hillsdale, New Jersey: Lawrence Erlbaum Associates. pp. 203–232.
  23. 23
    Actions and habits: the development of behavioural autonomy
    1. A Dickinson
    (1985)
    Philosophical Transactions of the Royal Society B: Biological Sciences 308:67–78.
    https://doi.org/10.1098/rstb.1985.0010
  30. 30
    Associative structures in Pavlovian and instrumental conditioning
    1. G Hall
    (2002)
    In: C. R Gallistel, editors. Steven’s Handbook of Experimental Psychology, 3. New York: John Wiley & Sons. pp. 1–45.
    https://doi.org/10.1002/0471214426.pas0301
  38. 38
    Associative representations of emotionally significant outcomes
    1. AS Killcross
    2. P Blundell
    (2002)
    In: S. C Moore, M Oaksford, editors. Emotional Cognition: From Brain to Behaviour, 44. Amsterdam: John Benjamins Publishing Company. pp. 35–74.
    https://doi.org/10.1075/aicr.44.03kil
  41. 41
    Integrative Activity of the Brain; an Interdisciplinary Approach
    1. J Konorski
    (1967)
    Chicago: University of Chicago Press.
  56. 56
    The Rat Brain in Stereotaxic Coordinates (3rd Ed)
    1. G Paxinos
    2. C Watson
    (1997)
    San Diego: Academic Press.
  60. 60
    Architectonic structure of the orbital and medial prefrontal cortex
    1. JL Price
    (2006)
    In: D. H Zald, A. L Rauch, editors. The Orbitofrontal Cortex. Oxford University Press. pp. 3–18.
    https://doi.org/10.1093/acprof:oso/9780198565741.003.0001
  63. 63
    Emotion and Decision-Making Explained (First Edit)
    1. ET Rolls
    (2014)
    New York: Oxford University Press.
  64. 64
    The orbitofrontal cortex and its connections with the mediodorsal nucleus in rabbit, sheep and cat
    1. JE Rose
    2. CN Woolsey
    (1948)
    Research Publications - Association for Research in Nervous and Mental Disease 1:210–232.
  76. 76
    Using Multivariate Statistics (6th ed)
    1. BG Tabachnick
    2. LS Fidell
    (2013)
    Boston: Pearson Education.
  78. 78
    Evolution of a Structured Connectionist Model of Pavlovian Conditioning (AESOP)
    1. AR Wagner
    2. SE Brandon
    (1989)
    In: S. B Klein, R. R Mowrer, editors. Contemporary Learning Theories: Pavlovian Conditioning and the Status of Traditional Learning Theories. Hillsdale, NJ: Lawrence Erlbaum. pp. 149–189.

Decision letter

  1. Geoffrey Schoenbaum
    Reviewing Editor; National Institute on Drug Abuse, National Institutes of Health, United States
  2. Sabine Kastner
    Senior Editor; Princeton University, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for sending your article entitled "The rodent lateral orbitofrontal cortex represents expected Pavlovian outcome value but not identity" for peer review at eLife. Your article is being evaluated by three peer reviewers, and overseen by Geoffrey Schoenbaum as the Reviewing Editor and Sabine Kastner as the Senior Editor.

Given the list of essential revisions, including new experiments, the editors and reviewers invite you to respond within the next two weeks with an action plan and timetable for the completion of the additional work. We plan to share your responses with the reviewers and then issue a binding recommendation.

Overall the reviewers felt that the experiments addressed an extremely interesting and important set of questions, yielding intriguing and valuable results. However, all three reviewers had major concerns about the core experiment in Figure 2, claimed to show that the orbitofrontal cortex is not necessary for devaluation by satiety and, by extension, the authors' strong conclusions that the orbitofrontal cortex is therefore not representing information about outcome identity. The reviews express the variety of these concerns, but perhaps the most critical is that the negative effect there comes against a backdrop of a very weak positive effect in controls. So, after discussions, it was felt that even with appropriate caveating and consideration of alternatives, it was not possible to accept this experiment. Two solutions were discussed.

1) Remove this experiment and focus the paper on the important and novel findings in Figure 4, Figure 5 and Figure 6. This would obviously mean extensive changes to the framing of the paper and the claims based on the negative results in Figure 2. However, this is not new experimental work. And it would allow the publication of what all the reviewers agreed were well-done and important and novel results.

2) Repeat experiment 2, ideally using a within subjects design where each animal is tested under both forms of devaluation or in a way that differences in lesion location are not an issue, and with rewards that differ only in sensory and not motivational properties. This would represent substantially more work, but would directly test the question at issue, without confounds of lesion or others noted. If this were done, it would obviously address the concerns and would be a great experiment, assuming the effects (positive or negative) were statistically robust and clear.

The reviewers felt that it was worth hearing a response from the authors before making a final decision.

Reviewer #1:

In this study the authors aim to clarify the role of central/lateral orbitofrontal cortex in a variety of different behaviors – instrumental versus Pavlovian behavior, illness versus satiety induced devaluation, and also reversal learning and sign tracking. In the process, they show that neurotoxic lesions of rat orbitofrontal cortex do not affect instrumental devaluation (between subjects) or Pavlovian devaluation by specific satiety (within subjects). They confirm that orbital lesions do affect Pavlovian devaluation by illness (within subjects), whether lesions were "anterior" or "posterior", and they extend this finding by showing that presentation of the outcomes prior to testing restores normal behavior (somewhat like satiety). In addition to these results, they show that specific transfer is sensitive to satiety effects. And they produce novel data suggesting that their more posterior lesions also cause deficits in allocation of behavior in a sign tracking task, both during acquisition and reversal learning. From these results, the authors draw fairly strong conclusions regarding this collection of areas. Overall, I think the areas of conflicts the authors have tried to address are important, and the data they provide are valuable. They represent a wealth of diverse and very interesting results. However, the designs and how the studies fit together are not ideal, and as a result I am uncomfortable with the strong conclusion and framework the authors put on their data. I think their results with regard to Pavlovian devaluation are open to a variety of other interpretations that are basically not acknowledged. I also think that the paper would be simpler and clearer if it were divided into two. That is, the points regarding devaluation are dissociable from the points regarding reversal learning and sign tracking.

Regarding the devaluation experiments, the main conclusion is that the orbitofrontal cortex is not involved in representation of information about the sensory features of expected outcomes. This is based on two pieces of information: (1) the finding that pre-training orbitofrontal lesions did not affect Pavlovian devaluation by selective satiation, whereas similar lesions do affect Pavlovian devaluation by illness (here and elsewhere) and (2) the idea that satiety, unlike illness, can affect responding either through a change in the desirability of the outcome or via a sort of habituation effect on the sensory representation of the outcome. The authors seem to be proposing that the orbitofrontal cortex (writ large) is doing the former but not the latter. I have several problems with basing this strong conclusion on these data.

First, I think that to the extent there is a second way that satiety (because of its proximity to testing) but not illness (because of its distance) can affect behavior, the results may show that the area lesioned is not required for this sensory habituation mechanism, but they cannot show that the area is not involved in the representation of the information. And it is clear from numerous studies that value-neutral information about predicted events is represented in the areas targeted here, in rats, monkeys and humans. Given this, the strong conclusion, expressed even in the title, seems unwarranted.

Second, I think the authors are ignoring causal studies that suggest the information about upcoming sensory events in orbitofrontal cortex is necessary for behavior, at least in some circumstances. For example, inactivation of this area affects performance and learning that depends on such information in a sensory preconditioning task (Jones et al., 2012). Also, while pre-training lesions do not affect specific transfer, post-training lesions or post-training manipulations of afferents from amygdala do affect specific transfer (Lichtenberg et al., 2018; Ostlund and Balleine, 2007). The authors do not discuss these results, but I think they are quite important. At a minimum they show that if the brain is allowed to learn normally, the orbitofrontal cortex is in fact necessary for using information about identity. Perhaps the intact performance here reflects how the lesioned brain learns the original information, rather than some dichotomy in how information about reward identity is used in devaluation by illness versus satiety.

In any event, in light of these other studies, I think the failure to affect satiety here ends up looking more like a special case or an exception to the rule than a strong demonstration that orbitofrontal cortex either does not represent this information or has it but is not using it.

In addition, I think there are other differences between illness and satiety that are relevant and not considered. One is simply the strength of the effect. Illness is presumably stronger than satiety in changing behavior, both generally and in the two studies that are compared here. Perhaps it is this weakness that supports the negative finding and not a fundamental difference in the area's involvement in devaluation by satiety. Another difference is that satiety is something that can be experienced previously during training, since over the course of training the rats gain experience with the cues in various stages of satiation. This prior experience could allow for the adjustment of behavior to some extent without the sort of processing proposed to depend on orbitofrontal cortex. Illness-induced devaluation is not subject to this potential confound. Third, it seems to me that the two outcomes used here differ not only in their sensory properties but also in their motivational basis. Pellets satisfy hunger, whereas sucrose satisfies hunger and thirst. Possibly the selective satiation affects behavior through these motivational mechanisms and that is what is not dependent on orbitofrontal cortex? Any or some combination of these proposals could explain the negative effects.

Finally, I think there are also differences in the lesions in the satiety and devaluation experiments; at least it looks to me like the lesions in Figure 4 are larger and more lateral, which I think would be more likely to yield effects. Hindsight is easy, but a within-subjects design for this critical comparison would have been better. As it stands, there are key lateral areas that seem to be more involved in the experiment where there are lesion effects.

Despite these issues, I do regard the individual experiments as well done and the overall topic is excellent and worth investigating, so I think the authors have valuable results that can help inform future work. I could imagine a different manuscript that more fairly considers these and perhaps other alternatives.

My preference would also be to separate the devaluation studies from those on reversal and sign-tracking data. I have fewer issues with these data and their interpretation, maybe because their conclusions are less strong and more alternatives are considered. But I think their presence detracts from a full consideration of the other data.

Reviewer #2:

This manuscript presents data from several experiments in rats that focus on the effects of orbitofrontal cortex lesions on devaluation. Individual experiments vary in the type of devaluation (specific satiety, taste aversion), lesion location (anterior, posterior), or learning (Pavlovian, instrumental). The main conclusion is that OFC does not represent sensory features of the outcome. This is based primarily on the finding that OFC lesions only diminish Pavlovian devaluation by taste aversion but not specific satiety (in conjunction with the assumption that habituation, leading to an inability to represent outcome features, is the defining difference between the two devaluation methods). In addition, there is evidence regarding a functional differentiation between anterior and posterior OFC.

This manuscript is well written and presents a large amount of interesting experimental data. My overall concern is that the conclusions are not well justified. They are based on a combination of null results and qualitative comparisons between different experiments which differ in more than one factor. For instance, the two key experiments (Figure 2 and Figure 4) differ not only in devaluation method but also in lesion location (and there is evidence that this may be important). In addition, besides habituation, there are other differences between the two devaluation methods that may explain the current findings and considering these would change the conclusions.

1) The main conclusion that OFC does not represent specific outcome properties is based on an assumption about the key difference between devaluation by specific satiety and taste aversion. Because specific satiety involves repeated exposure to the US, it "involves habituation of the sensory systems required to represent the sensory properties of the outcome." There are two issues with this assumption. (1) Even if specific satiety induces habituation of the sensory systems, it is unclear whether this would also diminish representations of predicted sensory features in OFC and the ability to retrieve the updated value of these outcomes. (2) There are more prominent differences between the two forms of devaluation that may explain the current results. Most importantly, whereas specific satiety slightly reduces the value of the US, taste aversion involves a much more dramatic change in value and can thus be expected to result in a relatively stronger devaluation effect. Indeed, comparing the effects of the two devaluation methods on behavior in the Sham group shows that taste aversion reduces magazine activity by about 40% (Figure 4F), whereas specific satiety changes behavior by only about 20% (Figure 2C). Thus, differences between the two methods could simply result from differences in their effectiveness. If the effect of devaluation by specific satiety is too small in the Sham group, the probability of detecting differences between groups decreases.

2) In line with this, effects of Pavlovian devaluation by satiety were modest (Figure 2C). The authors report a significant main effect of devaluation, but are the effects within each group (Sham and lesion) individually significant? The devaluation effect in the OFC group appears smaller compared to the Sham group. Is it possible that the devaluation effect is driven by the Sham but not the Lesion group, and that the experiment was underpowered to find group differences?

3) The conclusion that OFC lesions affect Pavlovian devaluation by taste aversion, but not specific satiety is based on a positive and a null result from different experiments. However, the lesions for Pavlovian devaluation by satiety were more anterior compared to the lesions in the taste aversion experiment (Figure 2A and Figure 4A). Thus, the two experiments differ not only in the devaluation method but also in lesion location. Given that the results of Pavlovian devaluation by taste aversion may depend on where the lesion is made in OFC, this difference should not be taken lightly. To convincingly demonstrate that the devaluation method matters, both methods should be compared directly within the same experiment and with the same lesion location.

Reviewer #3:

In this report, the effects of lateral OFC lesions on a variety of behavioral tasks are presented. First, replicating previous results, pre-training LO lesions are shown not to affect sensitivity of instrumental responding to devaluation, suggesting intact action-outcome signaling. In contrast to previous reports, LO lesions are shown not to affect sensitivity of a Pavlovian CR to devaluation induced by sensory-specific satiety. The expression of specific PIT is shown to be sensitive to sensory-specific satiety, which the authors interpret as an indication that sensory-specific satiety acts via habituation of the sensory-specific features of the outcome and, thus, that LO is not required for representing such features. The most interesting and clear finding is the dissociability of anterior v. posterior OFC necessity for sensitivity of the Pavlovian response to devaluation, sign-tracking, and reversal learning. This report has many strengths including the thorough statistical reporting, multi-faceted behavioral analysis, combination of behaviors, and anterior v. posterior OFC analysis. The anterior v. posterior OFC lesion effects are especially exciting. Enthusiasm is somewhat diminished by the lack of clear support for many of the main conclusions (described below) and the limitations of the lesion approach used here. The relevance of these data to the broad readership of eLife over a more specialized journal is also not immediately clear, though this could be remedied.

1) My primary concern is that many of the conclusions in this report do not appear to be fully supported.

a) "[…] directly confirm the dissociable role of the rodent OFC in Pavlovian but not instrumental behavioural flexibility following outcome devaluation". While I don't totally disagree, this is not directly shown here because there is no direct comparison between instrumental and Pavlovian CR sensitivity to the same form of devaluation within subjects with the same lesions. Indeed, the extent of the lesion, esp. in the lateral and posterior domains was different between the data in Figure 1 and Figure 2 and Figure 4. I suggest removing or tempering this language.

b) "OFC lesions in rodents only disrupt the Pavlovian outcome devaluation effect when outcome value is manipulated by taste aversion but not specific satiety." This cannot be interpreted here for two reasons. First, this was not directly assessed between subjects with similar lesions. Indeed, the extent of the lesion is quite different for the subjects in Figure 2 v. Figure 4 and that is a major important finding that more posterior but not anterior lesions disrupt sensitivity of the Pavlovian CR to devaluation, leaving open the possibility that more posterior OFC lesions would also disrupt sensitivity to devaluation by sensory-specific satiety. Second, the efficacy of the sensory-specific satiety manipulation was not confirmed here with post-test consumption. Indeed, the sensory-specific satiety devaluation effect is not very convincing here in control subjects.

c) "Using a specific PIT test, we establish that, unlike taste aversion devaluation, specific satiety devaluation can act via a reduction in the efficacy of sensory specific outcome properties". While it is intriguing that PIT is sensitive to sensory-specific satiety devaluation, to support this interpretation the authors would have to show that, with their specific PIT procedures, PIT is not sensitive to taste aversion devaluation. It remains plausible that these findings indicate that during PIT subjects are using the CS to retrieve a representation of the outcome and its current value then determines whether the rat will press or not. Moreover, that these graphs lack any indication of error or variance makes these data difficult to interpret and, on top of this, the PIT effect size at ~1 press/min is much smaller than previous reports, including most of the reports cited as previous evidence for lack of PIT sensitivity to taste aversion, likely owing to the different procedures used.

d) The title: "The rodent lateral orbitofrontal cortex represents expected Pavlovian outcome value but not identity". Loss of function studies cannot tell you what information is represented or especially what information is not represented, this would require recordings. Moreover, that all the lesions were pre-training further confounds ability to interpret their effects on initial learning, updating, v. retrieval of stimulus-outcome memories. I suggest changing the title to better reflect the type of assessment provided by the data.

2) The difference in taste aversion and specific satiety results here is hypothesized to result from the possibility that sensory-specific satiety acts via habituation of the sensory-specific features of the outcome. This becomes crucial for the interpretation of the lesion results. But there are other differences between taste aversion and sensory-specific satiety. Taste aversion is certainly more severe; it is also a permanent, consolidated memory. This possibility and its relevance to interpretation of the OFC lesion results should be considered.

3) There are several previous findings that are seemingly contradictory to these results that are not, but should be, discussed here, including data showing the representation of reward features and identity in human lateral OFC (see recent work from Kahnt and O'Doherty labs), evidence that post-training OFC lesions disrupt expression of specific PIT (Ostlund and Balleine), and evidence that inactivation of amygdala projections to lateral OFC disrupts both specific PIT and sensitivity of the Pavlovian CR to sensory-specific satiety devaluation (recent Wassum lab paper). These recording papers are also of relevance to the interpretation here: McDannald et al., 2014; Stalnaker et al., 2014.

4) The lesions here are purported to be "focal LO" lesions. By my eye they include VLO and this should be clarified.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Functional heterogeneity within the rodent lateral orbitofrontal cortex dissociates outcome devaluation and reversal learning deficits" for further consideration at eLife. Your revised article has been favorably evaluated by Sabine Kastner (Senior Editor), a Reviewing Editor, and three reviewers.

The manuscript has been improved but there are some remaining issues that need to be addressed before acceptance, as outlined below. Most important were issues #1 and 3 raised by reviewer 3.

Reviewer #3:

In this revised and more focused report the authors replicate previous findings that lOFC lesions disrupt sensitivity of Pavlovian conditional responding to devaluation, but do not disrupt instrumental devaluation. They nicely demonstrate the Pavlovian devaluation effect for both anterior and posterior lOFC lesions. They also show that posterior lOFC lesions disrupt acquisition of a sign-tracking conditional response and the reversal of this response, whereas anterior lOFC lesions spare this behavior. The report provides a nice addition to the OFC literature. I do have a few remaining concerns regarding the interpretation and discussion of the data.

1) The data in Figure 2G are interpreted as demonstrating that the anterior and posterior lOFC lesions did not cause "a failure to acquire sensory specific cue-outcome associations" and that the deficits in Pavlovian devaluation are "specific to recalling the new value of the devalued outcome and/or integrating it into appropriate behavioural control". See subsection “Outcome Devaluation”, subsection “Rodent and primate homology” and subsection “Theoretical accounts of OFC function”. I don't think this claim can be made for two reasons. (1) During this test the animals get feedback about their entries that will change the likelihood that they will enter subsequently. Thus, they need not have learned the specific cue-outcome relationships to show reduced performance during the devalued CS in this task. (2) The lesion was not restricted to training or test, thus making it impossible to distinguish learning v. recall effects, both of which are likely at play. Thus, it cannot be concluded that the lOFC is selectively important for recalling the new value or outcome information, or with a "selective role for the OFC in Pavlovian model-based inferences". To be clear, I do not disagree that the lOFC is important for retrieving outcome information, there are clear data that support its function in this regard, but the pre-training lesion data here cannot be used to support a selective role in retrieval or inference and not one in encoding.

2) It remains the case that the finding of deficits in initial acquisition from posterior lOFC lesions makes it difficult to interpret a reversal learning deficit here.

3) There is very little consideration of the limitations of the pre-training lesion approach selected here. Encoding v. retrieval, compensatory mechanisms, etc. These limitations are important to understanding the present results and should be discussed. Also, in my opinion the discussion is too long. It reads more like a review paper than a discussion of the present results. I suggest making it much shorter.

4) Figure 1D. I still think the devaluation effect is weak in the Sham group and worry these data might color the report for readers. Perhaps plotting the data over time might reveal a bigger effect earlier in the test.

https://doi.org/10.7554/eLife.37357.013

Author response

[Editors' note: the authors’ plan for revisions was approved and the authors made a formal revised submission.]

Reviewer #1:

[…] Despite these issues, I do regard the individual experiments as well done and the overall topic is excellent and worth investigating, so I think the authors have valuable results that can help inform future work. I could imagine a different manuscript that more fairly considers these and perhaps other alternatives.

My preference would also be to separate the devaluation studies from those on reversal and sign-tracking data. I have fewer issues with these data and their interpretation, maybe because their conclusions are less strong and more alternatives are considered. But I think their presence detracts from a full consideration of the other data.

These concerns are addressed by the removal of experiments 2, 3, and 5 from the manuscript. We thank the reviewer for this considered review and agree that the more focused manuscript allows for a more careful consideration of the remaining data.

Reviewer #2:

1) The main conclusion that OFC does not represent specific outcome properties is based on an assumption about the key difference between devaluation by specific satiety and taste aversion. Because specific satiety involves repeated exposure to the US, it "involves habituation of the sensory systems required to represent the sensory properties of the outcome." There are two issues with this assumption. (1) Even if specific satiety induces habituation of the sensory systems, it is unclear whether this would also diminish representations of predicted sensory features in OFC and the ability to retrieve the updated value of these outcomes. (2) There are more prominent differences between the two forms of devaluation that may explain the current results. Most importantly, whereas specific satiety slightly reduces the value of the US, taste aversion involves a much more dramatic change in value and can thus be expected to result in a relatively stronger devaluation effect. Indeed, comparing the effects of the two devaluation methods on behavior in the Sham group shows that taste aversion reduces magazine activity by about 40% (Figure 4F), whereas specific satiety changes behavior by only about 20% (Figure 2C). Thus, differences between the two methods could simply result from differences in their effectiveness. If the effect of devaluation by specific satiety is too small in the Sham group, the probability of detecting differences between groups decreases.

These concerns are addressed by the removal of experiments 2, 3, and 5 from the manuscript.

2) In line with this, effects of Pavlovian devaluation by satiety were modest (Figure 2C). The authors report a significant main effect of devaluation, but are the effects within each group (Sham and lesion) individually significant? The devaluation effect in the OFC group appears smaller compared to the Sham group. Is it possible that the devaluation effect is driven by the Sham but not the Lesion group, and that the experiment was underpowered to find group differences?

These concerns are addressed by the removal of experiments 2, 3, and 5 from the manuscript.

3) The conclusion that OFC lesions affect Pavlovian devaluation by taste aversion, but not specific satiety is based on a positive and a null result from different experiments. However, the lesions for Pavlovian devaluation by satiety were more anterior compared to the lesions in the taste aversion experiment (Figure 2A and Figure 4A). Thus, the two experiments differ not only in the devaluation method but also in lesion location. Given that the results of Pavlovian devaluation by taste aversion may depend on where the lesion is made in OFC, this difference should not be taken lightly. To convincingly demonstrate that the devaluation method matters, both methods should be compared directly within the same experiment and with the same lesion location.

These concerns are addressed by the removal of experiments 2, 3, and 5 from the manuscript.

Reviewer #3:

[…] 1) My primary concern is that many of the conclusions in this report do not appear to be fully supported.

a) "[…] directly confirm the dissociable role of the rodent OFC in Pavlovian but not instrumental behavioural flexibility following outcome devaluation". While I don't totally disagree, this is not directly shown here because there is no direct comparison between instrumental and Pavlovian CR sensitivity to the same form of devaluation within subjects with the same lesions. Indeed, the extent of the lesion, esp. in the lateral and posterior domains was different between the data in Figure 1 and Figure 2 and Figure 4. I suggest removing or tempering this language.

These concerns are addressed by the removal of experiments 2, 3, and 5 from the manuscript.

b) "OFC lesions in rodents only disrupt the Pavlovian outcome devaluation effect when outcome value is manipulated by taste aversion but not specific satiety." This cannot be interpreted here for two reasons. First, this was not directly assessed between subjects with similar lesions. Indeed, the extent of the lesion is quite different for the subjects in Figure 2 v. Figure 4 and that is a major important finding that more posterior but not anterior lesions disrupt sensitivity of the Pavlovian CR to devaluation, leaving open the possibility that more posterior OFC lesions would also disrupt sensitivity to devaluation by sensory-specific satiety. Second, the efficacy of the sensory-specific satiety manipulation was not confirmed here with post-test consumption. Indeed, the sensory-specific satiety devaluation effect is not very convincing here in control subjects.

These concerns are addressed by the removal of experiments 2, 3, and 5 from the manuscript.

c) "Using a specific PIT test, we establish that, unlike taste aversion devaluation, specific satiety devaluation can act via a reduction in the efficacy of sensory specific outcome properties". While it is intriguing that PIT is sensitive to sensory-specific satiety devaluation, to support this interpretation the authors would have to show that, with their specific PIT procedures, PIT is not sensitive to taste aversion devaluation. It remains plausible that these findings indicate that during PIT subjects are using the CS to retrieve a representation of the outcome and its current value then determines whether the rat will press or not. Moreover, that these graphs lack any indication of error or variance makes these data difficult to interpret and, on top of this, the PIT effect size at ~1 press/min is much smaller than previous reports, including most of the reports cited as previous evidence for lack of PIT sensitivity to taste aversion, likely owing to the different procedures used.

These concerns are addressed by the removal of experiments 2, 3, and 5 from the manuscript.

d) The title: "The rodent lateral orbitofrontal cortex represents expected Pavlovian outcome value but not identity". Loss of function studies cannot tell you what information is represented or especially what information is not represented, this would require recordings. Moreover, that all the lesions were pre-training further confounds ability to interpret their effects on initial learning, updating, v. retrieval of stimulus-outcome memories. I suggest changing the title to better reflect the type of assessment provided by the data.

These concerns are addressed by the removal of experiments 2, 3, and 5 from the manuscript.

2) The difference in taste aversion and specific satiety results here is hypothesized to result from the possibility that sensory-specific satiety acts via habituation of the sensory-specific features of the outcome. This becomes crucial for the interpretation of the lesion results. But there are other differences between taste aversion and sensory-specific satiety. Taste aversion is certainly more severe; it is also a permanent, consolidated memory. This possibility and its relevance to interpretation of the OFC lesion results should be considered.

These concerns are addressed by the removal of experiments 2, 3, and 5 from the manuscript.

3) There are several previous findings that are seemingly contradictory to these results that are not, but should be, discussed here, including data showing the representation of reward features and identity in human lateral OFC (see recent work from Kahnt and O'Doherty labs), evidence that post-training OFC lesions disrupt expression of specific PIT (Ostlund and Balleine), and evidence that inactivation of amygdala projections to lateral OFC disrupts both specific PIT and sensitivity of the Pavlovian CR to sensory-specific satiety devaluation (recent Wassum lab paper). These recording papers are also of relevance to the interpretation here: McDannald et al., 2014; Stalnaker et al., 2014.

These concerns are addressed by the removal of experiments 2, 3, and 5 from the manuscript. Additional consideration of OFC heterogeneity in relation to the specific PIT effect is provided in the Discussion section.

4) The lesions here are purported to be "focal LO" lesions. By my eye they include VLO and this should be clarified.

This is an important consideration given the claims of functional heterogeneity within LO. We note here that bilateral damage was mostly confined to LO. Bilateral damage to VO was minimal when present. We examined whether there were any specific correlations between our estimates of damage to any of the OFC subregions and devaluation/sign-tracking/reversal effects but found no significant relationships that might indicate that damage to any of these other regions might be important for these effects.

It is also of note that the designation of VLO is not always used. This region was originally demarcated as the orbital surface directly above the ‘orbital notch’ (Krettek and Price, 1977). This places the structure half-way between VO and LO (which we have distinguished as medial and lateral to the orbital notch). Close inspection of a number of lesion and neuroanatomical projection studies indicates that VLO may not be a unique structure, and that most effects attributed to this structure are driven by damage to VO; for example, an attentional circuit appears to exist between VO, dorsocentral striatum, and posterior parietal cortex (Burcham et al., 1997; Cheatwood, Corwin and Reep, 2005; Cheatwood, Reep and Corwin, 2003; Conte et al., 2008; Corwin et al., 1994; Corwin and Reep, 1998; King, Corwin and Reep, 1989; Reep, Cheatwood and Corwin, 2003; Reep, Corwin and King, 1996; Reep et al., 1994; Reep and Corwin, 1999, 2009; Van Vleet et al., 2000; VanVleet et al., 2002; Vargo et al., 1988). However, we have not directly tested or dissociated the adjacent VO/LO regions, so we can only rely on secondary sources to support this conclusion. It is clear that our understanding of neuroanatomical and functional distinctions within the rodent OFC is still quite poor.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Reviewer #3:

[…] 1) The data in Figure 2G are interpreted as demonstrating that the anterior and posterior lOFC lesions did not cause "a failure to acquire sensory specific cue-outcome associations" and that the deficits in Pavlovian devaluation are "specific to recalling the new value of the devalued outcome and/or integrating it into appropriate behavioural control". See subsection “Outcome Devaluation”, subsection “Rodent and primate homology” and subsection “Theoretical accounts of OFC function”. I don't think this claim can be made for two reasons. (1) During this test the animals get feedback about their entries that will change the likelihood that they will enter subsequently. Thus, they need not have learned the specific cue-outcome relationships to show reduced performance during the devalued CS in this task. (2) The lesion was not restricted to training or test, thus making it impossible to distinguish learning v. recall effects, both of which are likely at play. Thus, it cannot be concluded that the lOFC is selectively important for recalling the new value or outcome information, or with a "selective role for the OFC in Pavlovian model-based inferences". To be clear, I do not disagree that the lOFC is important for retrieving outcome information, there are clear data that support its function in this regard, but the pre-training lesion data here cannot be used to support a selective role in retrieval or inference and not one in encoding.

This is a valid point. In this test the animals are briefly re-exposed to the USs and removed from the chamber; the magazine is then cleaned with hot water and dried before the animals are returned to the chamber for the test. It is possible that the re-exposure alone caused some aversion/withdrawal/inhibitory response to the magazine location/magazine approach action. This limitation has been acknowledged when discussing this result:

“However, this result must be interpreted with caution as it is possible that re-exposure to the devalued US in the magazine resulted in some form of short-term avoidance to the magazine that persisted throughout the subsequent test session when the devalued CS was presented.”

2) It remains the case that the finding of deficits in initial acquisition from posterior lOFC lesions makes it difficult to interpret a reversal learning deficit here.

We agree that the acquisition deficit does make the reversal deficit harder to interpret, but in general reversal learning procedures are quite complex and any deficit is hard to interpret without pursuing simpler follow up tasks to probe the nature of the deficit. Nonetheless, we have stressed that the acquisition deficit appears to affect both magazine approach and lever pressing behaviour whereas the reversal deficit selectively disrupts extinction of the magazine approach but not lever pressing. Therefore, we would argue that the two deficits are distinct.

3) There is very little consideration of the limitations of the pre-training lesion approach selected here. Encoding v. retrieval, compensatory mechanisms, etc. These limitations are important to understanding the present results and should be discussed. Also, in my opinion the discussion is too long. It reads more like a review paper than a discussion of the present results. I suggest making it much shorter.

This point is well taken, and we agree that there are benefits and limitations to pre-training lesion approaches that should be highlighted. This has been added to the Discussion section.

4) Figure 1D. I still think the devaluation effect is weak in the Sham group and worry these data might color the report for readers. Perhaps plotting the data over time might reveal a bigger effect earlier in the test.

In presenting the data as total session responding we aimed to keep the presentation of the devaluation effect the same between experiments. To supplement this analysis we have included the data plotted and analysed over time as a figure supplement. This analysis reveals a main effect of Devaluation in the Lesion group, but a Devaluation x Time interaction in the Sham group. Again, the effect is not as strong in the Sham group, but hopefully this additional analysis and significant effect will help clarify the strength of the effect for the readers.

https://doi.org/10.7554/eLife.37357.014

Article and author information

Author details

  1. Marios C Panayi

    1. School of Psychology, The University of New South Wales, Kensington, Australia
    2. Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom
    Contribution
    Conceptualization, Formal analysis, Investigation, Methodology, Writing—original draft, Writing—review and editing
    For correspondence
    marios.panagi@psy.ox.ac.uk
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0003-2635-5638
  2. Simon Killcross

    School of Psychology, The University of New South Wales, Kensington, Australia
    Contribution
    Conceptualization, Resources, Supervision, Funding acquisition, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared

Funding

Australian Research Council (DP0989027)

  • Simon Killcross

Australian Research Council (DP120103564)

  • Simon Killcross

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We gratefully acknowledge Fred Westbrook, Nathan Holmes, David Bannerman, and Mark Walton for their invaluable feedback. This research was supported by grants awarded to Simon Killcross from the Australian Research Council (ARC Discovery Grants DP0989027 and DP120103564).

Ethics

Animal experimentation: All animal research was carried out in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals (NIH publications No. 80-23, revised 1996) and approved by the University of New South Wales Animal Care and Ethics Committee.

Senior Editor

  1. Sabine Kastner, Princeton University, United States

Reviewing Editor

  1. Geoffrey Schoenbaum, National Institute on Drug Abuse, National Institutes of Health, United States

Publication history

  1. Received: April 7, 2018
  2. Accepted: July 24, 2018
  3. Accepted Manuscript published: July 25, 2018 (version 1)
  4. Version of Record published: August 20, 2018 (version 2)

Copyright

© 2018, Panayi et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 1,171 page views
  • 249 downloads
  • 12 citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

