Risk of punishment influences discrete and coordinated encoding of reward-guided actions by prefrontal cortex and VTA neurons

  1. Junchol Park
  2. Bita Moghaddam (corresponding author)
  1. University of Pittsburgh, United States
  2. Oregon Health and Science University, United States

Abstract

Actions motivated by rewards are often associated with risk of punishment. Little is known about the neural representation of punishment risk during reward-seeking behavior. We modeled this circumstance in rats by designing a task where actions were consistently rewarded but probabilistically punished. Spike activity and local field potentials were recorded during task performance simultaneously from VTA and mPFC, two reciprocally connected regions implicated in reward-seeking and aversive behaviors. At the single unit level, we found that ensembles of putative dopamine and non-dopamine VTA neurons and mPFC neurons encode the relationship between action and punishment. At the network level, we found that coherent theta oscillations synchronize VTA and mPFC in a bottom-up direction, effectively phase-modulating the neuronal spike activity in the two regions during punishment-free actions. This synchrony declined as a function of punishment probability, suggesting that during reward-seeking actions, risk of punishment diminishes VTA-driven neural synchrony between the two regions.
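As a rough illustration of what "phase-modulating the neuronal spike activity" can mean in practice, the sketch below computes the mean resultant length of spike phases relative to an idealized oscillation, a common measure of spike-phase locking. It is not the authors' analysis pipeline: the 8 Hz frequency, the synthetic spike trains, and the helper names `spike_phases` and `mean_resultant_length` are all assumptions made for this example.

```python
import cmath
import math


def mean_resultant_length(phases):
    """Length of the mean unit vector of the given phases (radians).

    Values near 1 indicate spikes clustered at one phase (strong locking);
    values near 0 indicate phases spread evenly around the cycle.
    """
    if not phases:
        raise ValueError("need at least one spike phase")
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vector)


def spike_phases(spike_times, freq_hz):
    """Phase (radians) of an idealized sinusoid of freq_hz at each spike time (s)."""
    return [(2 * math.pi * freq_hz * t) % (2 * math.pi) for t in spike_times]


THETA_HZ = 8.0  # assumed illustrative frequency, not taken from the article

# Synthetic spike trains (hypothetical data, for illustration only):
# one spike per cycle at a fixed phase, versus four evenly spaced spikes per cycle.
locked_spikes = [k / THETA_HZ for k in range(20)]
spread_spikes = [k / (4 * THETA_HZ) for k in range(20)]

locked = mean_resultant_length(spike_phases(locked_spikes, THETA_HZ))
spread = mean_resultant_length(spike_phases(spread_spikes, THETA_HZ))

print(f"locked spikes: {locked:.3f}")
print(f"spread spikes: {spread:.3f}")
```

In this toy setting the phase-locked train yields a value close to 1 and the evenly spread train a value close to 0. With recorded data, the phases would instead come from a band-pass filtered local field potential, and a lower value would correspond to weaker phase modulation.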

Article and author information

Author details

  1. Junchol Park

    Department of Neuroscience, University of Pittsburgh, Pittsburgh, United States
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0002-4739-0793
  2. Bita Moghaddam

    Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, United States
    For correspondence
    bita@ohsu.edu
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0002-5205-417X

Funding

National Institute of Mental Health (R56MH084906)

  • Bita Moghaddam

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Ethics

Animal experimentation: All surgical and experimental procedures were in strict accordance with the National Institutes of Health's Guide for the Care and Use of Laboratory Animals, and were approved by the University of Pittsburgh Institutional Animal Care and Use Committee (Protocol #: 15065884).

Copyright

© 2017, Park & Moghaddam

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. Junchol Park
  2. Bita Moghaddam
(2017)
Risk of punishment influences discrete and coordinated encoding of reward-guided actions by prefrontal cortex and VTA neurons
eLife 6:e30056.
https://doi.org/10.7554/eLife.30056
