Offline replay supports planning in human reinforcement learning

Abstract

Making decisions in sequentially structured tasks requires integrating distally acquired information. The extensive computational cost of such integration challenges planning methods that integrate online, at decision time. Furthermore, it remains unclear whether 'offline' integration during replay supports planning, and if so which memories should be replayed. Inspired by machine learning, we propose that (a) offline replay of trajectories facilitates integrating representations that guide decisions, and (b) unsigned prediction errors (uncertainty) trigger such integrative replay. We designed a 2-step revaluation task for fMRI, whereby participants needed to integrate changes in rewards with past knowledge to optimally replan decisions. As predicted, we found that (a) multi-voxel pattern evidence for off-task replay predicts subsequent replanning; (b) neural sensitivity to uncertainty predicts subsequent replay and replanning; (c) off-task hippocampus and anterior cingulate activity increase when revaluation is required. These findings elucidate how the brain leverages offline mechanisms in planning and goal-directed behavior under uncertainty.
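The proposal in (a) and (b) can be illustrated with a toy simulation. The sketch below is not the authors' model or analysis: it is a minimal Dyna-style learner on a hypothetical two-step task, in which the states (A, B, C), the rewards, the learning rate, the priority-halving rule, and the function names are all assumptions made for illustration. The agent learns full trajectories, then experiences changed rewards at the second stage only, and finally replays stored transitions offline, choosing each replay by the unsigned prediction error accumulated at the transition's successor state.

```python
# Toy sketch only: a Dyna-style learner on a hypothetical two-step task.
# States, rewards, and parameters are illustrative, not taken from the study.

ALPHA = 0.5  # learning rate (assumed)

# Stage-1 state "A" leads deterministically to stage-2 state "B" or "C".
TRANSITIONS = {("A", "left"): "B", ("A", "right"): "C"}
OLD_REWARDS = {"B": 1.0, "C": 0.0}  # rewards during initial learning
NEW_REWARDS = {"B": 0.0, "C": 1.0}  # rewards after revaluation (swapped)


def td_update(values, key, target):
    """Move values[key] toward target; return the unsigned prediction error."""
    error = target - values[key]
    values[key] += ALPHA * error
    return abs(error)


def preferred_action_after_revaluation(replay_steps):
    q_start = {key: 0.0 for key in TRANSITIONS}  # stage-1 action values
    v_end = {"B": 0.0, "C": 0.0}                 # stage-2 state values
    memory = list(TRANSITIONS.items())           # stored transitions

    # Learning phase: the agent experiences full trajectories.
    for _ in range(20):
        for key, nxt in memory:
            td_update(v_end, nxt, OLD_REWARDS[nxt])
            td_update(q_start, key, v_end[nxt])

    # Revaluation phase: only stage-2 states are visited, so stage-1 values
    # go stale while unsigned prediction errors accumulate as "uncertainty".
    uncertainty = {"B": 0.0, "C": 0.0}
    for _ in range(20):
        for state, reward in NEW_REWARDS.items():
            uncertainty[state] += td_update(v_end, state, reward)

    # Offline replay: replay the stored transition whose successor state
    # carries the most uncertainty, then halve that priority (assumed rule).
    for _ in range(replay_steps):
        key, nxt = max(memory, key=lambda item: uncertainty[item[1]])
        td_update(q_start, key, v_end[nxt])
        uncertainty[nxt] *= 0.5

    # Return the stage-1 action with the highest value.
    return max(q_start, key=q_start.get)[1]


print(preferred_action_after_revaluation(0))   # no replay
print(preferred_action_after_revaluation(10))  # ten replay steps
```

Under these assumptions, the agent without replay keeps its outdated stage-1 preference, whereas the agent with replay propagates the new stage-2 values back to stage 1 and changes its choice, without taking any further steps in the task.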

Data availability

Neural and behavioral data are available online at OpenNeuro (https://openneuro.org/datasets/ds001612/versions/1.0.0).

The following data sets were generated

1. Momennejad I
2. Otto AR
3. Daw N
4. Norman KA
(2018) Neural and behavioral data from Offline Replay Supports Planning in Human Reinforcement Learning
OpenNeuro, doi:10.18112/openneuro.ds001612.v1.0.1.

https://openneuro.org/datasets/ds001612/versions/1.0.0

Article and author information

Author details

Ida Momennejad

Princeton Neuroscience Institute, Princeton University, Princeton, United States

For correspondence
idam@princeton.edu

Competing interests
The authors declare that no competing interests exist.

ORCID iD: 0000-0003-0830-3973

A Ross Otto

Department of Psychology, McGill University, Quebec, Canada

Competing interests
The authors declare that no competing interests exist.

ORCID iD: 0000-0002-9997-1901

Nathaniel D Daw

Princeton Neuroscience Institute, Princeton University, Princeton, United States

Competing interests
The authors declare that no competing interests exist.

ORCID iD: 0000-0001-5029-1430

Kenneth A Norman

Princeton Neuroscience Institute, Princeton University, Princeton, United States

Competing interests
The authors declare that no competing interests exist.

ORCID iD: 0000-0002-5887-9682

Funding

John Templeton Foundation (57876)

Ida Momennejad
Kenneth A Norman

National Institute of Mental Health (R01MH109177)

Nathaniel D Daw

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Ethics

Human subjects: The Princeton University Institutional Review Board approved the study (Protocol #6014). All participants gave informed consent to participate in the fMRI study and signed a screening form confirming that they had normal or corrected-to-normal vision, no metal in their body, and no history of psychiatric or neurological disorders.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.