Reward-based training of recurrent neural networks for cognitive and value-based tasks
Abstract
Trained neural network models, which exhibit features of neural activity recorded from behaving animals, may provide insights into the circuit mechanisms of cognitive functions through systematic analysis of network activity and connectivity. However, in contrast to the graded error signals commonly used to train networks through supervised learning, animals learn from reward feedback on definite actions through reinforcement learning. Reward maximization is particularly relevant when optimal behavior depends on an animal's internal judgment of confidence or subjective preferences. Here, we implement reward-based training of recurrent neural networks in which a value network guides learning by using the activity of the decision network to predict future reward. We show that such models capture behavioral and electrophysiological findings from well-known experimental paradigms. Our work provides a unified framework for investigating diverse cognitive and value-based computations, and predicts a role for value representation that is essential for learning, but not executing, a task.
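The training scheme summarized above, in which a value estimate read out from the decision network's activity guides reward-based learning, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy two-choice task, a small recurrent network whose recurrent weights stay fixed and random, and a REINFORCE-style update in which only the two linear readouts (policy and value) are trained. All names (`final_activity`, `action_probs`, `train`, `evaluate`) and all parameter values are hypothetical.

```python
import math
import random


def final_activity(stimulus, w_in, w_rec, rng, steps=5, noise=0.1):
    """Run a small recurrent network on a noisy stimulus; return its last state."""
    n = len(w_in)
    h = [0.0] * n
    for _ in range(steps):
        x = stimulus + rng.gauss(0.0, noise)
        h = [math.tanh(w_in[i] * x + sum(w_rec[i][j] * h[j] for j in range(n)))
             for i in range(n)]
    return h


def action_probs(w_pi, h):
    """Softmax over the two action logits read out from the activity h."""
    logits = [sum(w * x for w, x in zip(row, h)) for row in w_pi]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(n_units=16, n_trials=1000, lr_policy=0.1, lr_value=0.05, seed=0):
    """Reward-based training of the policy readout with a learned value baseline."""
    rng = random.Random(seed)
    scale = 0.5 / math.sqrt(n_units)  # assumption: keeps recurrent dynamics stable
    net = {
        "w_in": [rng.gauss(0.0, 1.0) for _ in range(n_units)],
        "w_rec": [[rng.gauss(0.0, scale) for _ in range(n_units)]
                  for _ in range(n_units)],
        "w_pi": [[0.0] * n_units for _ in range(2)],  # policy readout (trained)
        "w_v": [0.0] * n_units,                       # value readout (trained)
        "b_v": 0.0,
    }
    for _ in range(n_trials):
        target = rng.randrange(2)
        h = final_activity(0.5 if target else -0.5, net["w_in"], net["w_rec"], rng)
        p = action_probs(net["w_pi"], h)
        action = 1 if rng.random() < p[1] else 0       # a definite, sampled action
        reward = 1.0 if action == target else 0.0      # scalar reward feedback only
        # The value readout predicts reward from the decision network's activity.
        value = net["b_v"] + sum(w * x for w, x in zip(net["w_v"], h))
        advantage = reward - value
        # Policy-gradient step: gradient of log p(action) scaled by the advantage.
        for a in range(2):
            grad = (1.0 if a == action else 0.0) - p[a]
            for i in range(n_units):
                net["w_pi"][a][i] += lr_policy * advantage * grad * h[i]
        # Value step: move the prediction toward the reward actually received.
        for i in range(n_units):
            net["w_v"][i] += lr_value * advantage * h[i]
        net["b_v"] += lr_value * advantage
    return net


def evaluate(net, n_trials=200, seed=1):
    """Fraction of correct greedy choices; the value readout is never used here."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_trials):
        target = rng.randrange(2)
        h = final_activity(0.5 if target else -0.5, net["w_in"], net["w_rec"], rng)
        p = action_probs(net["w_pi"], h)
        correct += int((1 if p[1] > p[0] else 0) == target)
    return correct / n_trials
```

Calling `evaluate(train())` returns the fraction of correct choices on fresh trials. In this sketch the value readout appears only inside `train`, which mirrors the abstract's point that the value representation is needed for learning a task but not for executing it.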
Article and author information
Funding
Office of Naval Research (N00014-13-1-0297)
- H Francis Song
- Guangyu R Yang
- Xiao-Jing Wang
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Copyright
© 2017, Song et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- 11,196 views
- 1,976 downloads
- 129 citations
Views, downloads and citations are aggregated across all versions of this paper published by eLife.