A unifying account of replay as context-driven memory reactivation
Figures
CMR-replay.
(a) Consider a task of encoding a sequence consisting of four items, each denoted by a shade of blue. (b) We propose a model of replay that builds on the context maintenance and retrieval model (CMR), which we refer to as CMR-replay. The model consists of four components: item ($f$), context ($c$), item-to-context associations ($M^{FC}$), and context-to-item associations ($M^{CF}$). At each timestep during awake encoding, $f$ represents the current item and $c$ is a recency-weighted average of the associated contexts of past and present items. CMR-replay associates $f$ and $c$ at each timestep, updating $M^{FC}$ and $M^{CF}$ according to a Hebbian learning rule. $M^{FC}$ and $M^{CF}$, respectively, support the retrieval of an item’s associated context and a context’s associated items. During replay, $f$ represents the current reactivated item, and $c$ is a drifting context representing a recency-weighted average of the associated contexts of past and present reactivated items. Here, too, the model updates $M^{FC}$ and $M^{CF}$ to associate reactivated $f$ and $c$. The figure illustrates the representations of $f$, $c$, $M^{FC}$, and $M^{CF}$ as the model encodes the third item during learning. Lengths of color bars in $f$ and $c$ represent relative magnitudes of different features. Shades of gray illustrate the weights in $M^{FC}$ and $M^{CF}$. Orange features represent task-irrelevant items, which do not appear as inputs during awake encoding but compete with task-relevant items for reactivation during replay. (c) During both awake encoding and replay, context drifts by incorporating the current item $f$’s associated context and downweighting the associated contexts of previous items. The figure illustrates how context drifts the first time the model encodes the example sequence. (d) The figure illustrates $M^{FC}$ and $M^{CF}$ updates as the model encodes the third item during the first presentation of the sequence. (e) Consider the activation of items at the onset of sleep and awake rest across sessions of learning. At replay onset, an initial probability distribution across items, $a_{t=0}$, varies according to the behavioral state (i.e. awake rest or sleep).
Compared to sleep, $a_{t=0}$ during awake rest is strongly biased toward features associated with external inputs. For awake rest, the figure shows an example of $a_{t=0}$ when the model receives a context cue related to the fourth item. Through repeated exposure to the same task sequence across sessions of learning, activities of the four task-related items (i.e. blue items) become suppressed in $a_{t=0}$ relative to task-irrelevant items (i.e. orange items). (f) Each replay period begins by sampling an item $f_{t=0}$ according to $a_{t=0}$, where $t$ denotes the current timestep. If $f_{t=0}$ is a task-related item, its associated context is reinstated as $c_{t=0}$ to enter a recursive process. During this process, at each timestep $t$, $c_{t-1}$ evokes a probability distribution $a_t$ that excludes previously reactivated items. Given $a_t$, the model samples an item $f_t$ and reinstates $f_t$’s associated context $c^{f}_t$, which is combined with $c_{t-1}$ to form a new context $c_t$ to guide the ensuing reactivation. The dashed arrow indicates that $c_t$ becomes $c_{t-1}$ for the next timestep. At any $t$, the replay period ends with a probability of 0.1 or if a task-irrelevant item is reactivated.
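The encoding and replay dynamics described in panels (b), (c), (d), and (f) can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the paper’s implementation: the item count, the drift rate `beta`, the learning rate `gamma`, the uniform initial distribution, and the normalized convex blend used for context drift are all choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

n_items = 6        # 4 task items + 2 task-irrelevant items (sizes assumed for illustration)
beta = 0.5         # context drift rate (illustrative value)
gamma = 1.0        # Hebbian learning rate (illustrative value)
M_fc = np.eye(n_items)   # item-to-context associations (pre-experimental identity)
M_cf = np.eye(n_items)   # context-to-item associations

def drift(c_prev, c_in):
    """Blend the current item's associated context into the drifting context."""
    c = (1 - beta) * c_prev + beta * c_in
    return c / np.linalg.norm(c)

# Awake encoding of a four-item sequence (panels b/c): drift context, then
# strengthen f<->c associations with Hebbian outer-product updates (panel d).
c = np.zeros(n_items)
c[-1] = 1.0                                   # arbitrary start-of-list context
for i in range(4):
    f = np.zeros(n_items); f[i] = 1.0         # one-hot current item
    c_in = M_fc[i] / np.linalg.norm(M_fc[i])  # the item's associated context
    c = drift(c, c_in)
    M_fc += gamma * np.outer(f, c)            # rows index items, columns index context
    M_cf += gamma * np.outer(c, f)

# One replay period (panel f): sample items recursively, excluding repeats,
# ending with probability 0.1 or upon reactivating a task-irrelevant item.
a0 = np.ones(n_items) / n_items               # initial distribution (uniform here)
replayed = []
f_idx = rng.choice(n_items, p=a0)
while f_idx < 4:                              # task-irrelevant item ends the period
    replayed.append(f_idx)
    c = drift(c, M_fc[f_idx] / np.linalg.norm(M_fc[f_idx]))
    a = np.clip(c @ M_cf, 0.0, None)          # context-evoked item activations
    a[replayed] = 0.0                         # exclude previously reactivated items
    if a.sum() == 0.0 or rng.random() < 0.1:
        break
    f_idx = rng.choice(n_items, p=a / a.sum())
```

Because the Hebbian updates run during replay as well as encoding, repeating this encode-then-replay cycle across simulated sessions is what produces the experience-dependent effects examined in the later figures.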
Context-dependent variations in memory replay.
(a) As observed in rodents (left), replay in CMR-replay (right) is predominantly forward at the start of a run and backward at the end of a run on a linear track. (b) Consistent with rodent data (left), in CMR-replay (right), the proportion of forward replay is higher during sleep than during awake rest. (c) The presence of external cues during sleep biases replay toward their associated memories both in animals (left) and in CMR-replay (right). Error bars represent ±1 standard error of the mean. *p<0.05; **p<0.01; ***p<0.001. Left image in a adapted from Diba and Buzsáki, 2007, Nature Publishing Group; left image in b adapted from Wikenheiser and Redish, 2013, Wiley.
© 2007, Diba and Buzsáki. Panel A was reprinted with permission from Figure 1 of Diba and Buzsáki, 2007, which was published under a CC BY-NC license. Further reproductions must adhere to the terms of this license.
Reward leads to over-representation in sleep and modulates the rate of backward replay.
(a) Sleep over-represents items associated with reward in animals (left) and in CMR-replay (right). Error bars represent ±1 standard error of the mean. (b) Varying the magnitude of reward outcome leads to differences in the frequency of backward but not forward replay in animals (left) and CMR-replay (right). In the animal data (left), error bars show 95% confidence intervals. For simulation results (right), error bars show ±1 standard error of the mean. Left image in a adapted from Ólafsdóttir et al., 2015, eLife; images in the left column adapted from Ambrose et al., 2016, Elsevier.
Replay activates remote experiences and links temporally separated experiences.
(a) The two panels show examples of remote and novel (shortcut) replay sequences observed in animals. The colored items indicate the temporal order of the sequences (light blue, early; purple, late). The red item denotes the resting position. (b) CMR-replay also generates remote and shortcut rest replay, as illustrated in the replay sequences in the two panels. (c) Proportion of replay events that contain remote sequences in animals (left) and in CMR-replay (right). Error bars show ±1 standard error of the mean in the data and model. (d) In Liu et al., 2019, participants encoded scrambled versions of two true sequences (Figure 7g). After learning, human spontaneous neural activity showed stronger evidence of sequential reactivation of the true sequences (left). CMR-replay encoded scrambled sequences as in the experiment. Consistent with the empirical observation, subsequent replay in CMR-replay over-represents the true sequences (right). Error bars show ±1 standard error of the mean in the model. Images in a adapted from Gupta et al., 2010, Elsevier; left image in c adapted from Gupta et al., 2010, Elsevier; left image in d adapted from Liu et al., 2019, Elsevier.
Variations in replay as a function of experience.
(a) In CMR-replay, through repeated exposure to the same task, the frequency of replay events decreases (left), the average length of replay events increases (middle), and the proportion of replay events that are backward remains stable (after a slight initial uptick; right). (b) With repeated experience in the same task, animals exhibit lower rates of replay (left) and longer replay sequences (middle), while the proportion of replay events that are backward stays relatively stable (right). (c) In a T-maze task, where animals display a preference for traversing a particular arm of the maze, replay more frequently reflects the opposite arm (Carey et al., 2019) (left). CMR-replay preferentially replays the right arm after exposure to the left arm and vice versa (right). Error bars show ±1 SEM in all panels. Images in b adapted from Shin et al., 2019, Elsevier; left image in c adapted from Carey et al., 2019, Nature Publishing Group.
Learning from replay.
(a) Sleep increases the likelihood of reactivating the learned sequence in the correct temporal order in CMR-replay, as seen in an increase in the proportion of replay for learned sequences post-sleep. (b) Sleep leads to greater reactivation of rewarded than non-rewarded items, indicating that sleep preferentially strengthens rewarded memories in CMR-replay. (c) In the simulation of Liu et al., 2021, CMR-replay encoded six sequences, each of which transitioned from one of three start items to one of two end items. After receiving a reward outcome for the end item of a sequence, we simulated a period of rest. After rest, but not before, CMR-replay exhibited a preference for non-local sequences that led to the rewarded item. This preference emerged through rest despite the fact that the model never observed reward in conjunction with those non-local sequences, suggesting that rest replay facilitates non-local learning in the model. (d) We trained a ‘teacher’ CMR-replay model on a sequence of items. After encoding the sequence, the teacher generated replay sequences during sleep. We then trained a separate blank-slate ‘student’ CMR-replay model exclusively on the teacher’s sleep replay sequences. To assess knowledge of the original sequence, we collected sleep replay sequences from both models and assessed the probability that each model reactivates the item at position $i + \mathrm{lag}$ of the sequence immediately following the reactivation of the $i$th item, conditioned on the availability of the $i$th item for reactivation. Both models demonstrated a tendency to reactivate the item that immediately follows or precedes the just-reactivated item on the original sequence. This result suggests that the student acquired knowledge of the temporal structure of the original sequence by encoding only the teacher’s replay sequences. Error bars show ±1 SEM.
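The lag-conditional analysis in panel (d) can be illustrated with a short helper: given replay events expressed as sequence positions, it estimates the probability that the item at position $i + \mathrm{lag}$ is reactivated immediately after the item at position $i$, conditioned on that item still being available (not yet reactivated in the same event). The function name, parameters, and toy data below are assumptions made for the example, not the paper’s code.

```python
import numpy as np

def lag_probability(replay_sequences, seq_len, max_lag=3):
    """For each lag, estimate P(next reactivated position == i + lag),
    conditioned on position i + lag being in range and not yet reactivated."""
    lags = [lag for lag in range(-max_lag, max_lag + 1) if lag != 0]
    counts = {lag: 0 for lag in lags}
    opportunities = {lag: 0 for lag in lags}
    for seq in replay_sequences:
        for t in range(len(seq) - 1):
            i, j = seq[t], seq[t + 1]
            seen = set(seq[: t + 1])          # positions already reactivated
            for lag in lags:
                target = i + lag
                if 0 <= target < seq_len and target not in seen:
                    opportunities[lag] += 1   # target item was available
                    if target == j:
                        counts[lag] += 1      # and was the next one reactivated
    return {lag: counts[lag] / opportunities[lag]
            for lag in lags if opportunities[lag] > 0}

# Toy replay events over an eight-item original sequence (positions 0-7):
# two forward-ish events and one backward event.
events = [[0, 1, 2, 3], [5, 4, 3, 2], [2, 3, 4]]
probs = lag_probability(events, seq_len=8)
```

On these toy events, probability mass concentrates at lags of +1 and -1, which is the signature of forward and backward sequential replay that the teacher and student models both show.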
Task simulations.
Each enclosed box corresponds to a unique item. Each arrow represents a valid transition between two items. Each dashed arrow represents a distractor that causes a drift in context between two items. Task sequences initiate at light gray boxes. Dark gray boxes represent salient items in each task. For tasks with multiple valid sequences, the order in which sequences are presented is randomized. (a) Simulation of a linear track. (b) Simulation of the task in Liu et al., 2021. (c) Simulation of a two-choice T-maze. (d) Simulation of a T-maze. (e) Simulation of the task in Bendor and Wilson, 2012. (f) Simulation of a linear track task with distinct directions of travel. (g) Simulation of input sequences in Liu et al., 2019. (h) Simulation of a fixed-item sequence.