Reward-based training of recurrent neural networks for cognitive and value-based tasks

  1. H Francis Song
  2. Guangyu R Yang
  3. Xiao-Jing Wang (corresponding author)
  1. New York University, United States

Abstract

Trained neural network models, which exhibit features of neural activity recorded from behaving animals, may provide insights into the circuit mechanisms of cognitive functions through systematic analysis of network activity and connectivity. However, in contrast to the graded error signals commonly used to train networks through supervised learning, animals learn from reward feedback on definite actions through reinforcement learning. Reward maximization is particularly relevant when optimal behavior depends on an animal's internal judgment of confidence or subjective preferences. Here, we implement reward-based training of recurrent neural networks in which a value network guides learning by using the activity of the decision network to predict future reward. We show that such models capture behavioral and electrophysiological findings from well-known experimental paradigms. Our work provides a unified framework for investigating diverse cognitive and value-based computations, and predicts a role for value representation that is essential for learning, but not executing, a task.
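
As a rough illustration of the training scheme described above, the following Python sketch pairs a decision network with a value readout that predicts reward from the decision network's activity, and uses the resulting prediction error to scale a REINFORCE-style policy update. Everything in it is an assumption made for illustration and is not taken from the article: the toy two-choice task, the fixed random recurrent weights with trainable linear readouts, the learning rate, and the names run_trial and train.

```python
import numpy as np


def run_trial(rng, W_in, W_rec, T=10, noise=0.5):
    """Drive a fixed random RNN with a noisy stimulus of sign +1 or -1.

    Returns the final hidden state and the index of the correct choice.
    (Hypothetical toy task, not one of the paradigms in the article.)
    """
    sign = rng.choice([-1.0, 1.0])
    h = np.zeros(W_rec.shape[0])
    for _ in range(T):
        x = sign + noise * rng.standard_normal()
        h = np.tanh(W_rec @ h + W_in * x)
    return h, int(sign > 0)


def train(n_trials=3000, n_hidden=20, lr=0.05, seed=0):
    """Reward-based training with a learned value baseline (illustrative)."""
    rng = np.random.default_rng(seed)
    # Assumption: recurrent weights stay fixed; only the readouts are trained.
    W_in = rng.standard_normal(n_hidden)
    W_rec = 0.5 * rng.standard_normal((n_hidden, n_hidden)) / np.sqrt(n_hidden)
    W_pi = np.zeros((2, n_hidden))  # policy readout of the decision network
    w_v = np.zeros(n_hidden)        # value readout of the same activity
    b_v = 0.0                       # value bias

    rewards = np.zeros(n_trials)
    for t in range(n_trials):
        h, correct = run_trial(rng, W_in, W_rec)

        # Decision network: softmax policy over the two choices.
        logits = W_pi @ h
        p = np.exp(logits - logits.max())
        p /= p.sum()
        a = rng.choice(2, p=p)
        r = 1.0 if a == correct else 0.0

        # Value readout: predicted reward from decision-network activity.
        v = w_v @ h + b_v
        advantage = r - v

        # Policy gradient step: d log pi(a|h) / dW_pi = (onehot(a) - p) h^T,
        # scaled by the reward prediction error.
        grad_logp = -np.outer(p, h)
        grad_logp[a] += h
        W_pi += lr * advantage * grad_logp

        # Value step: gradient descent on 0.5 * (r - v)^2.
        w_v += lr * advantage * h
        b_v += lr * advantage

        rewards[t] = r
    return rewards


if __name__ == "__main__":
    rewards = train()
    print("mean reward, first 100 trials:", rewards[:100].mean())
    print("mean reward, last 500 trials:", rewards[-500:].mean())
```

In this sketch the value readout enters only the weight updates; the choice itself depends on the policy readout alone, which is in line with the prediction that value representation is needed for learning, but not executing, a task.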

Article and author information

Author details

  1. H Francis Song

    Center for Neural Science, New York University, New York, United States
    Competing interests
    The authors declare that no competing interests exist.
  2. Guangyu R Yang

    Center for Neural Science, New York University, New York, United States
    Competing interests
    The authors declare that no competing interests exist.
  3. Xiao-Jing Wang

    Center for Neural Science, New York University, New York, United States
    For correspondence
    xjwang@nyu.edu
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0003-3124-8474

Funding

Office of Naval Research (N00014-13-1-0297)

  • H Francis Song
  • Guangyu R Yang
  • Xiao-Jing Wang

Google

  • H Francis Song
  • Guangyu R Yang
  • Xiao-Jing Wang

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Copyright

© 2017, Song et al.

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

H Francis Song, Guangyu R Yang, Xiao-Jing Wang (2017)
Reward-based training of recurrent neural networks for cognitive and value-based tasks
eLife 6:e21492.
https://doi.org/10.7554/eLife.21492

