Risk of punishment influences discrete and coordinated encoding of reward-guided actions by prefrontal cortex and VTA neurons
Abstract
Actions motivated by rewards are often associated with risk of punishment. Little is known about the neural representation of punishment risk during reward-seeking behavior. We modeled this circumstance in rats by designing a task where actions were consistently rewarded but probabilistically punished. Spike activity and local field potentials were recorded during task performance simultaneously from VTA and mPFC, two reciprocally connected regions implicated in reward-seeking and aversive behaviors. At the single-unit level, we found that ensembles of putative dopamine and non-dopamine VTA neurons and mPFC neurons encode the relationship between action and punishment. At the network level, we found that coherent theta oscillations synchronize VTA and mPFC in a bottom-up direction, effectively phase-modulating the neuronal spike activity in the two regions during punishment-free actions. This synchrony declined as a function of punishment probability, suggesting that during reward-seeking actions, risk of punishment diminishes VTA-driven neural synchrony between the two regions.
Article and author information
Funding
National Institute of Mental Health (R56MH084906)
- Bita Moghaddam
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Ethics
Animal experimentation: All surgical and experimental procedures were in strict accordance with the National Institutes of Health's Guide for the Care and Use of Laboratory Animals, and were approved by the University of Pittsburgh Institutional Animal Care and Use Committee (Protocol #: 15065884).
Copyright
© 2017, Park & Moghaddam
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.