The human brain can implement inverse reinforcement learning, in which an observer infers the hidden reward structure of a decision problem solely by observing another individual's actions.
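As a concrete illustration of the idea, here is a minimal sketch of inverse reinforcement learning, assuming a two-option task, a softmax actor with a known inverse temperature, and a small set of hypothesized reward structures; all names and parameters are illustrative, not the study's actual model:

```python
import numpy as np

def infer_reward(observed_actions, candidate_rewards, beta=3.0):
    """Score hypothesized reward vectors by how well a softmax policy
    under each one predicts the observed choices; return the best."""
    log_liks = []
    for r in candidate_rewards:
        p = np.exp(beta * np.asarray(r, dtype=float))  # assumed actor policy
        p = p / p.sum()
        log_liks.append(sum(np.log(p[a]) for a in observed_actions))
    return candidate_rewards[int(np.argmax(log_liks))]

# An actor who mostly chooses option 1; the observer recovers the
# hidden reward structure from the choices alone.
actions = [1, 1, 1, 0, 1, 1, 1, 1]
hypotheses = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
print(infer_reward(actions, hypotheses))  # -> [0.0, 1.0]
```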
In Parkinson's patients performing a reinforcement-learning task, memory tested 24 hours later was impaired off, but not on, dopaminergic medication, whereas, in contrast to previous studies, dopamine did not affect learning from positive versus negative reinforcement.
fMRI evidence of off-task replay predicts subsequent replanning behavior in humans, suggesting that learning from simulated experience during replay helps update previously learned policies in reinforcement learning.
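The computational intuition is Dyna-style: replaying remembered transitions lets value updates propagate without any new experience. A minimal tabular sketch, assuming stored one-step transitions (state names, rewards, and parameters are illustrative):

```python
import random
from collections import defaultdict

alpha, gamma = 0.1, 0.95
actions = [0, 1]
Q = defaultdict(float)  # Q[(state, action)] value estimates
model = {}              # remembered transitions: (state, action) -> (reward, next_state)

def q_update(s, a, r, s2):
    """One temporal-difference update of the action-value table."""
    best_next = max(Q[(s2, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

def replay(n_steps=500):
    """Off-task replay: re-sample remembered transitions so that a
    changed reward propagates through the values, i.e., replanning."""
    for _ in range(n_steps):
        s, a = random.choice(list(model))
        r, s2 = model[(s, a)]
        q_update(s, a, r, s2)

# Values learned while the goal paid 1.0...
Q[("mid", 0)] = 1.0
Q[("start", 0)] = 0.95
# ...but memory now holds the updated outcome (the goal pays 0.0):
model[("start", 0)] = (0.0, "mid")
model[("mid", 0)] = (0.0, "goal")
replay()
print(Q[("start", 0)], Q[("mid", 0)])  # both decay toward 0: replanning
```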
Computational modeling suggests that feedback between striatal cholinergic neurons and spiny neurons dynamically adjusts learning rates to optimize behavior in a variable world.
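A Pearce-Hall-style rule captures the general principle, though the circuit model itself is richer: the effective learning rate tracks recent surprise, rising when prediction errors are large and falling when the world is stable. A minimal sketch with illustrative parameters:

```python
def adaptive_value_update(v, reward, assoc, eta=0.3, kappa=0.5):
    """Value update whose effective learning rate (kappa * assoc)
    grows after large prediction errors and shrinks after small ones."""
    delta = reward - v
    v = v + kappa * assoc * delta
    assoc = (1 - eta) * assoc + eta * abs(delta)  # associability tracks |error|
    return v, assoc

v, assoc = 0.0, 1.0
for r in [1, 1, 1, 1, 0, 1, 1, 1]:  # mostly stable rewards with one surprise
    v, assoc = adaptive_value_update(v, r, assoc)
```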
A mathematical model in which the drive to maintain internal homeostasis shapes animals' learning can explain many real-world behaviors, including some that might otherwise appear irrational.
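The core of such models is that reward equals drive reduction: an outcome is rewarding exactly to the extent that it moves the internal state toward its setpoint. A minimal sketch with illustrative names and numbers:

```python
import numpy as np

def drive(h, setpoint):
    """Drive = distance of the internal state from its setpoint."""
    return np.linalg.norm(setpoint - h)

def homeostatic_reward(h, outcome, setpoint):
    """Reward of an outcome = how much it reduces the drive, so the
    same outcome can be rewarding when deprived and punishing when
    sated, rationalizing seemingly irrational behavior."""
    return drive(h, setpoint) - drive(h + outcome, setpoint)

setpoint = np.array([10.0])  # desired internal state, e.g., energy level
print(homeostatic_reward(np.array([4.0]), np.array([3.0]), setpoint))   # deprived: +3.0
print(homeostatic_reward(np.array([10.0]), np.array([3.0]), setpoint))  # sated:   -3.0
```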
A two-part neural network trained by reward-based learning provides a unified framework for studying diverse computations that can be compared with electrophysiological recordings from behaving animals.
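If the two parts play actor and critic roles, which is a common scheme for reward-based training (an assumption here, not a claim about the specific published network), the skeleton looks like this on a two-armed bandit:

```python
import numpy as np

rng = np.random.default_rng(0)
prefs = np.zeros(2)  # "actor": action preferences
value = 0.0          # "critic": running estimate of expected reward
alpha_actor, alpha_critic = 0.1, 0.1

def act_and_learn(reward_probs):
    """One trial: act via softmax, observe reward, and let the
    critic's prediction error train both parts."""
    global prefs, value
    p = np.exp(prefs) / np.exp(prefs).sum()
    a = rng.choice(2, p=p)
    r = float(rng.random() < reward_probs[a])
    delta = r - value                     # reward prediction error
    value += alpha_critic * delta         # critic update
    prefs = prefs + alpha_actor * delta * ((np.arange(2) == a) - p)  # actor update
    return a, r

for _ in range(500):
    act_and_learn([0.2, 0.8])
print(prefs)  # preference for the richer option dominates
```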
Confidence-dependent reinforcement learning operates even in well-learned perceptual decisions without explicit reward biases, producing trial-to-trial choice updating across species and sensory modalities.
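A minimal sketch of what confidence-dependent updating could look like at the level of a single decision variable (the names and update rule are hypothetical; the studies fit richer models):

```python
def update_choice_bias(bias, choice_sign, outcome, confidence, lr=0.1):
    """Shift the next trial's starting bias, scaled by this trial's
    confidence: confident outcomes move behavior more than unsure ones.

    choice_sign: +1 / -1 for the two alternatives
    outcome    : +1 for correct/win, -1 for error/loss
    confidence : decision confidence in [0, 1]
    """
    return bias + lr * confidence * outcome * choice_sign

bias = 0.0
bias = update_choice_bias(bias, +1, +1, 0.9)  # confident and correct: repeat
bias = update_choice_bias(bias, +1, -1, 0.2)  # unsure error: small adjustment
print(bias)
```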
Teaching signals from "tutor" brain areas should be adapted to the plasticity mechanisms in "student" areas to achieve efficient learning in two-stage systems such as the vocal control circuit of the songbird.
Neural confidence signals can take on the role of reward signals, explaining perceptual learning without external feedback as a form of internal reinforcement learning.
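A hedged sketch of the idea, with confidence substituting for external reward in a simple perceptual readout (an illustrative setup, not the published model):

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.normal(0, 0.5, 2)  # readout weights: [informative, noise] channels
lr = 0.02

for _ in range(2000):
    s = 1.0 if rng.random() < 0.5 else -1.0    # hidden stimulus category
    x = np.array([s + rng.normal(0, 1.0),      # informative channel
                  rng.normal(0, 1.0)])         # pure-noise channel
    y = w @ x
    choice = 1.0 if y >= 0 else -1.0
    confidence = abs(np.tanh(y))               # internal confidence signal
    w += lr * confidence * choice * x          # confidence acts as the reward
    w /= np.linalg.norm(w)                     # keep the readout scale fixed

print(w)  # the readout concentrates on the informative channel
```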