Introduction

The capacity of cortical circuits to undergo plasticity in response to experience is a fundamental feature of the brain (Buonomano and Merzenich, 1998; Costandi, 2016; Li, 2016). This plasticity can be induced not only by active, task-dependent training, such as visual perceptual learning (Watanabe and Sasaki, 2015; Chen et al., 2016; Lu and Dosher, 2022), but also by passive, repetitive exposure (Gutnisky et al., 2009; Sasaki et al., 2010; Ekman et al., 2017). For instance, after repeated exposure to a moving dot sequence, flashing a dot (i.e., cue) at the starting point of the sequence elicits successive elevated neural responses along the motion path in visual cortex, akin to those induced by the actual moving dot sequence in both humans (Ekman et al., 2017, 2023; Lu et al., 2020) and animals (Eagleman and Dragoi, 2012; Xu et al., 2012). This cue-triggered reactivation is thought to be driven by expectation and is prediction-related (Ekman et al., 2017, 2023) since it has been shown to facilitate the prediction of upcoming stimuli and affect the perception of sensory information (Gutnisky et al., 2009; Baker et al., 2014; Pojoga et al., 2020).

Unlike the simple white dot sequences used in previous studies (Xu et al., 2012; Ekman et al., 2017, 2023; Lu et al., 2020), the temporal sequences we are exposed to in daily life usually contain rich feature information. However, it remains unknown whether the cue-triggered elevated responses are feature-specific. On the one hand, if they are not feature-specific, these elevated responses may simply reflect general cortical preparedness for any upcoming stimuli. On the other hand, if they are indeed specific to a particular feature in the sequence, the elevated responses would facilitate the processing of only specific future events, according to the view that expectation leads to sharper neural tuning and therefore facilitates the processing of expected stimuli (Kok et al., 2012). For instance, expectation increased the prestimulus baseline of sensory neurons tuned to the expected stimulus (Wyart et al., 2012; Kok et al., 2014) and elicited preactivation of the stimulus template in both visual (Kok et al., 2017) and auditory (Demarchi et al., 2019) cortices. It should be noted that all these prediction-related feature-specific activities arose from static contexts. Whether prediction-related responses in dynamic temporal contexts (e.g., visual sequences) are feature-specific remains unclear.

After exposure to a temporal sequence that carries feature information, another important question is how that feature information is encoded and organized in neural activity following the cue. Previous studies have demonstrated that, after training with a temporal sequence, memory consolidation (Gridchyn et al., 2020; Gillespie et al., 2021) and learning (Igata et al., 2021) are linked to a neural pattern known as replay. Replay refers to the sequential reactivation of neural activity patterns associated with the trained sequence in both sleep (Lee and Wilson, 2002) and awake (Foster and Wilson, 2006) states and is believed to be crucial for memory consolidation and learning (Carr et al., 2011; Liu et al., 2021b, 2022). During replay, the neural representation of the trained sequence is temporally compressed and unfolds in either the forward or backward direction (Carr et al., 2011; Joo and Frank, 2018; Nour et al., 2021; Liu et al., 2022; McFadyen et al., 2023). These replay events frequently coincide with sharp-wave ripples (SWRs), high-frequency (150–220 Hz) oscillations (O’Keefe and Nadel, 1978; Bush et al., 2022) detected in the local field potentials of the hippocampus (Buzsáki, 1986, 2015). To date, replay in humans has been found only when participants performed tasks involving sequences, e.g., nonlocal reinforcement learning (Liu et al., 2021b) and episodic memory retrieval (Wimmer et al., 2020). It remains unclear whether simple visual exposure to temporal sequences could also trigger replay events in humans.

In the current study, we used magnetoencephalography (MEG) to investigate whether the cue-triggered elevated brain responses following visual sequence exposure are feature-specific and, more importantly, to understand how the feature information in the exposed sequence is encoded and organized in the brain after exposure. To address these questions, participants were initially exposed to a predetermined motion sequence. We subsequently decoded the motion direction information during the blank period after presenting only the first or last motion direction in the sequence as a cue. Surprisingly, we found that the cue-triggered elevated responses were not specific to a particular motion direction. Interestingly, the motion direction information was spontaneously reactivated during the blank period and the motion sequence was backward replayed in a time-compressed manner. This backward replay was identified even with brief exposure. However, neither forward nor backward replay was detected when an intermediate motion direction in the sequence was presented as a cue. Lastly, through MEG source reconstruction analysis, we found that medial temporal lobe (MTL) activation preceded visual cortical activation, implying that the replay events observed in visual cortex may be triggered by activities in the MTL.

Results

The visual stimuli used in the study were four random-dot kinematograms (RDKs). All dots in an RDK moved in one direction (i.e., 0°, 90°, 180°, or 270°) with 100% coherence. In Experiment 1, we included three trial conditions: full sequence trials, start-only trials, and end-only trials (Figure 1A). In a full sequence trial, participants were exposed to a predefined sequence of the four RDKs, presented at the center of the screen in either clockwise (i.e., 0° → 90° → 180° → 270°) or counterclockwise order, with a 300 ms inter-stimulus interval (ISI) between every two RDKs. In a start- or end-only trial, the first or last RDK of the sequence was presented at the beginning of the trial, followed by a 2.4 s blank period. Participants completed three successive phases: a functional localizer phase, an exposure phase, and a main phase (Figure 1B). The functional localizer data were used to train models to decode each motion direction in the sequence. During this phase, one of the four RDKs was randomly presented for 1 s in each trial. During the exposure phase, the participants were exposed to full sequence trials only for about 30 min. In the main phase, 50%, 25%, and 25% of trials were full sequence, start-only, and end-only trials, respectively. The full sequence trials served as topping-up exposure to maintain the exposure effect.

Stimuli, experimental procedure and evoked responses in Experiment 1

(A) Participants were presented with RDKs in three conditions. In the full sequence condition, four successive RDKs were presented. In the start- or end-only condition, only the first or last RDK in the sequence was presented at the beginning of the trial. (B) During MEG scanning, the participants first completed two functional localizer runs (i.e., the functional localizer phase) before any sequence information was provided. Next, they were exposed to four full sequence runs (i.e., the exposure phase). Finally, in the main phase, full sequence, start-only, and end-only trials were presented in a pseudo-randomized order. (C) Evoked brain responses as a function of time relative to trial onset in the full sequence, start-only, and end-only conditions, respectively. Bold black lines at the bottom indicate temporal clusters in which responses differed significantly from the pre-stimulus baseline. Inset figures at the top-right corner show the peak amplitudes during the four corresponding RDK intervals after baseline correction.

Cue-triggered elevated responses are not feature-specific

We first measured the event-related field (ERF) activity evoked by RDKs in the three trial conditions using all occipital gradiometer sensors (see Methods). Figure 1C shows the responses evoked by full sequence trials (left panel, cluster-based permutation test with cluster-forming threshold t > 3.1 and 5000 permutations). Remarkably, start- and end-only trials elicited wave-like responses similar to those of the full sequence trials, despite the absence of stimuli following the first RDK (Figure 1C, middle and right panels, cluster-based permutation test with cluster-forming threshold t > 3.1 and 5000 permutations). To further quantify the ERF peak amplitudes in the three conditions while mitigating baseline confounds, we calculated each peak amplitude using the 300 ms blank period just preceding the onset of the corresponding RDK as the baseline (see Methods). We found that the four successive RDKs in the full sequence trials evoked four comparable peaks after stimulus onset (Figure 1C, inset figure in the left panel; two-tailed t-test; all ts(17) > 6.8072, all ps < 10⁻⁵). Interestingly, in start- and end-only trials, we could still observe significant peaks during the periods corresponding to the intervals of the second and third RDKs in the full sequence trials (Figure 1C, inset figures in the middle and right panels; start-only condition, two-tailed t-test, all ts(17) > 4.5790, all ps < 10⁻³; end-only condition, two-tailed t-test, all ts(17) > 6.0140, all ps < 10⁻⁴), although there was no significant peak during the period corresponding to the interval of the last RDK (start-only condition, two-tailed t-test, t(17) = 1.3221, p = 0.2036; end-only condition, two-tailed t-test, t(17) = 0.2473, p = 0.8076). These results demonstrate cue-triggered elevated brain responses following visual exposure to the RDK sequences, resembling previous findings with simple white dot sequences (Ekman et al., 2017; Lu et al., 2020).

Given that we observed the elevated responses even in the absence of stimuli following the cue, we next examined whether these responses were specific to a particular feature (i.e., motion direction). Specifically, we asked whether we could successfully decode motion directions in start- and end-only trials, particularly during the blank periods corresponding to the intervals of the second and third RDKs in full sequence trials. To this end, we applied a time-resolved decoding analysis. For each participant, we trained a one-versus-rest Lasso logistic regression model using the functional localizer data to classify the neural activity pattern elicited by each motion direction in the main phase. MEG signals from 72 occipital sensors were selected as features in the training model. To validate the reliability of our data, we first used a leave-one-out cross-validation scheme on the localizer data to independently estimate the decoding accuracy at each time point. At the group level, decoding accuracies peaked at 411 ms, 464 ms, 444 ms, and 434 ms after stimulus onset for motion directions 0° (55.37% ± 1.21), 90° (58.10% ± 1.14), 180° (54.07% ± 1.47), and 270° (53.64% ± 1.34), respectively. There were no significant differences among the four motion directions in terms of either latency (F(3, 51) = 0.375, p = 0.7716, ηp² = 0.0159) or decoding accuracy (F(3, 51) = 2.757, p = 0.0517, ηp² = 0.091) at the peak time point. Thereafter, the trained classifier was applied to the MEG signals in the three types of trials in the main phase. Finally, the decoding probability (Turner et al., 2023) for each motion direction was calculated at each time point at the group level (Figure 2). Here, we used decoding probability because it provides a time-resolved decoding preference for each motion direction, rather than only a decoded label (e.g., 0°, 90°, 180°, or 270°).

Time-resolved decoding probability for each motion direction

(A) Visualization of the time-resolved decoding probability in the three trial conditions. Each row shows the decoding probabilities for the four motion directions at that time point, and each column indicates one of the four motion directions. (B) Line plots of the time-resolved decoding probability for the three conditions. Each colored line shows the time course of the decoding probability for one motion direction. For the start- and end-only conditions, we calculated a permutation threshold by randomly shuffling the labels and re-decoding; only the decoding probability of the cue surpassed this threshold.

In full sequence trials, the motion direction information could be successfully decoded, as evidenced by the highest decoding probability occurring during the interval in which the respective RDK was presented (Figure 2A and 2B, left panels). In start- and end-only trials, we could also reliably decode the motion direction of the cue after stimulus onset (Figure 2A and 2B, middle and right panels; only the decoding probability of the cue surpassed the peak-level significance threshold obtained from a nonparametric permutation test, FWE corrected across time). Surprisingly, however, subsequent motion direction information was absent at the group level during the post-cue blank period, where the cue-triggered elevated responses were previously observed. Together, these results demonstrate that the cue-triggered elevated responses induced by visual sequence exposure were not consistently specific to a particular feature across participants and trials.

Time-compressed backward replay of exposed motion sequence

How is the motion direction information encoded and organized in the brain during the post-cue blank period? Clearly, the motion direction representation is not time-locked to the onset of the cue in either start- or end-only trials. However, it is possible that individual motion directions are encoded in the MEG signals in a spontaneous manner yet are sequentially organized (Liu et al., 2019).

To test this possibility, we trained four decoding models using the functional localizer data to capture the neural features of the four motion directions. We employed four decoding models because our aim was to obtain four feature-specific classifiers, each sensitive to only one motion direction and therefore useful for quantifying evidence of feature-specific sequences in subsequent steps. The models used a one-versus-rest Lasso logistic regression algorithm, with MEG signals from 72 occipital sensors as features. We then selected the time point with the highest decoding accuracy, estimated during the validation of the functional localizer data mentioned above, as the optimal time point for each participant and motion direction, since these time points are believed to carry the richest feature information (Mo et al., 2019). For subsequent analyses, we used classifiers trained at their respective optimal time points. This yielded four classifiers, each producing significant decoding probabilities only when the stimulus matched the motion direction that the classifier was trained to detect (Figure 3A). We did not find any spatial correlation between any two trained classifiers (highest correlation r < 0.12, Figure S1A). Because the optimal time point differed across participants and motion directions, for visualization at the group level we circularly shifted the decoding probability over time and aligned the optimal time points to an arbitrary time point (here, 200 ms after stimulus onset) for each participant and motion direction (Figure 3A).

Illustration of replay analysis pipeline: Temporally delayed linear modeling (TDLM)

(A) The decoding probabilities were aligned at 200 ms post-stimulus onset according to their corresponding optimal time point per participant and motion direction. Each colored line indicates the decoding results of a classifier applied to a motion direction dataset. The dashed horizontal line indicates the permutation threshold estimated by randomly shuffling the labels and re-decoding. (B) Regression models were trained for each participant and motion direction using MEG signals from the functional localizer data. (C) Classifiers were next applied to MEG signals during the post-cue blank period to derive a decoded [time x state] reactivation matrix. An example of backward sequential reactivation of stimuli is shown on the left. (D) Using TDLM, we quantified the evidence for sequential replay of motion directions during the post-cue blank period in the start- and end-only conditions. We initially performed a time-lagged regression to create a regression coefficient matrix [states x states] for each time lag by regressing each lagged predictor matrix X(Δt) onto the state reactivation matrix, Y (i.e., 1st-level GLM analysis). The degree to which the reactivations systematically follow a transition matrix of interest was measured by “sequenceness” (i.e., 2nd-level GLM analysis). We tested the magnitude of this effect at each time lag independently for all transition lags up to 600 ms. The dashed line represents the corrected nonparametric statistical significance threshold (see Methods for details). The green area indicates the lags at which the evidence of sequenceness in the backward direction exceeded the permutation threshold.

Having established the classifiers for each participant and motion direction, we then applied them to the MEG signals during the post-cue blank period in the start- and end-only conditions to estimate the reactivation probability (i.e., decoding probability) for each motion direction at each time point (Liu et al., 2019; Wimmer et al., 2020; Nour et al., 2021). Example trials in the start- and end-only conditions are shown in Figure S2A. The results revealed that motion direction reactivations occurred sparsely during the post-cue blank period, rather than being clustered within the intervals corresponding to the RDK presentations in the full sequence condition, suggesting that the motion direction information might be reactivated in a spontaneous manner. Next, we applied temporally delayed linear modeling (TDLM) to quantify whether and how these spontaneous reactivations followed the order of the exposed sequence (Figure 3D). This algorithm involves two levels of regression analysis: a first-level regression that quantifies the evidence for each pairwise state-to-state transition, and a second-level regression that evaluates the extent to which the reactivation patterns align with a particular order of interest. Finally, we defined “sequenceness” as a metric of forward (i.e., 1 → 2 → 3 → 4) or backward (i.e., 4 → 3 → 2 → 1) replay (Kurth-Nelson et al., 2016; Liu et al., 2019, 2021a; Nour et al., 2021).

As shown in Figure 4A and 4B, in the start-only condition, we found evidence of backward replay of the exposed motion sequence (i.e., the replayed sequence was 270° → 180° → 90° → 0° when the exposed motion sequence was 0° → 90° → 180° → 270°) during the post-cue blank period, with a peak transition lag at 28–40 ms (maximal effect at 32 ms-lag: β = −0.0202 ± 0.002, p < 1/24 ≈ 0.042, peak-level significance threshold derived from a nonparametric permutation test, FWE corrected across lags, Figure 4B). For visualization, an example of backward replay of the motion sequence is illustrated in Figure 4A. This effect was observed in most participants (Figure S3A).

Backward replay in both start- and end-only conditions

(A and D) Examples of backward sequential reactivation in the start-only (A) and end-only (D) conditions from a representative participant. Each row represents the reactivation probabilities for the four motion directions at that time point and each column indicates one of the four motion directions. (B and E) Backward replay of the exposed motion sequence with peak transition lags at 28–40 ms in the start-only condition (B) and at 28–36 ms in the end-only condition (E). Horizontal dashed lines represent corrected significance levels from a nonparametric permutation test at the 2nd-level GLM analysis of TDLM. (C and F) Backward replay of the exposed motion sequence predominantly appeared at 1.2–1.8 s in the start-only condition (C) and 0.6–1.8 s in the end-only condition (F) after the onset of the blank period.

In the end-only condition, we also found evidence of backward replay during the post-cue blank period, with a peak transition lag at 28–36 ms (maximal effect at 32 ms-lag: β = −0.0145 ± 0.0016, p < 1/24 ≈ 0.042, peak-level significance threshold obtained from a nonparametric permutation test, FWE corrected across lags; Figure 4E). Figure 4D shows an example of backward replay of the motion sequence in this condition; this effect was found in most participants (Figure S3A). Note that the time lag on the horizontal axis in Figure 4B and 4E indicates the interval between the onsets of successive items in the replayed motion sequence. We found that the replayed sequence was approximately 10 times faster than the evoked activity sequence in the full sequence condition.

To further examine the period during which the backward replay occurred most frequently, we divided the blank period (2.4 s) into four stages, each lasting 600 ms. In start-only trials, we found that the backward replay predominantly appeared within the third stage of the blank period (1.2–1.8 s after the onset of the blank period, Wilcoxon signed-rank test, p = 0.0108), whereas there was no significant sequenceness in the other three stages (first stage, p = 0.7112; second stage, p = 0.5566; fourth stage, p = 0.6791; Figure 4C). In end-only trials, the backward replay was more likely to occur during the second and third stages of the blank period, although the results approached, but did not reach, significance (Wilcoxon signed-rank test, second stage, p = 0.0936; third stage, p = 0.0642; Figure 4F). In contrast, the replay was not frequently observed during the first and last stages (Wilcoxon signed-rank test, first stage, p = 0.9479; fourth stage, p = 0.4997). Finally, in a control analysis, we conducted a comprehensive examination of all 24 potential sequences. Only the backward replay was detected in both start- and end-only conditions (Figure S4).

Backward replay is cue-dependent and depends on the amount of exposure

So far, we have demonstrated that the cue-triggered elevated responses induced by the motion sequence exposure were not motion direction specific. However, the motion information was spontaneously reactivated and the motion sequence was backward replayed in a time-compressed manner. It remains unknown whether the observed backward replay of the motion sequence was cue-dependent. In other words, does the replay of the motion sequence occur irrespective of which item of the sequence is presented as a cue? To address this question, we conducted Experiment 2, mirroring the design of Experiment 1 but with a different cue. Instead of flashing the first or last RDK, we presented the second or third RDK as a cue (2nd-only or 3rd-only condition, see Figure S5A and Methods) to examine whether these two non-terminal cues could induce replay during the post-cue blank period. We found no evidence of either forward or backward replay in either condition (Figure 5A; maximal nonsignificant effect at 32 ms-lag: 2nd-only condition, β = −0.0024 ± 0.0014; 3rd-only condition, β = −0.0018 ± 0.0015).

Backward replay is cue-dependent and depends on the amount of exposure

(A) No evidence for replay was found in either the 2nd-only (left) or 3rd-only (right) condition in Experiment 2. Horizontal dashed lines have the same meaning as those in Figure 4B and 4E. (B) In Experiment 3, immediately after the functional localizer phase, participants entered the main phase without the exposure phase. In the end-only condition, backward replay was observed; however, in the start-only condition, no such replay was observed.

Another interesting question is how changing the amount of exposure would affect the replay events during the blank period. To explore this issue, we conducted Experiment 3, which was also similar to Experiment 1 except for the removal of the exposure phase. Accordingly, only the full sequence trials (50% of trials) in the main phase served for exposure (see Figure S5B and Methods). In the start-only condition, we found a numerical trend of backward replay at transition lags of 20–40 ms, although it did not reach statistical significance (Figure 5B, left panel; maximal effect at 28 ms-lag: β = −0.0139 ± 0.0021). In the end-only condition, the evidence for backward replay just reached the significance level at transition lags of 20–40 ms (Figure 5B, right panel; maximal effect at 24 ms-lag: β = −0.0147 ± 0.0019, p < 1/24 ≈ 0.042, using the peak-level significance threshold from a nonparametric permutation test, FWE corrected across lags).

Power increase in replay-associated sharp-wave ripple frequencies

Previous studies in rodents and humans have shown that replay events are associated with increased high-frequency ripple power (Buzsáki, 2015; Liu et al., 2019). To investigate whether such a replay-associated power increase could be observed in our study, we performed a time-frequency analysis using combined data from the start- and end-only conditions in Experiment 1. We first identified putative replay onsets during the post-cue blank period in each trial, using probabilities decoded from MEG signals in the occipital region. Replay onsets were determined by choosing time points with a high (>95th percentile) probability of backward replay with a 32 ms-lag transition (the maximal replay effect at the group level for both start- and end-only conditions, Figure 4B and 4E; see Methods for details). Using the MEG signals recorded from whole-brain sensors, we found a transient ripple power increase at 120–180 Hz at the onset of replay events, compared to the baseline period of 50–100 ms prior to replay onset (Figure 6A; cluster-based permutation test with cluster-forming threshold t > 3.1 and 5000 permutations).

Replay events align with ripple power, with source activation in the MTL preceding activation in visual cortex

(A) Top: In Experiment 1, a time-frequency decomposition of sensor-level data revealed a brief increase in high-frequency oscillatory power at replay onset. Bottom: a cluster-based permutation test (cluster-forming threshold, t > 3.1; number of permutations = 5000) identified a significant cluster around 140 Hz. (B) Source localization of ripple-band power 30 ms before replay onset, showing significant activation in the MTL (peak Montreal Neurological Institute [MNI] coordinates: X = 21, Y = 21, Z = 13; neurological orientation). (C) Source localization of ripple-band power at replay onset, showing significant activation in visual cortex (peak MNI coordinates: X = 20, Y = 8, Z = 17). (D) The activation time course of the MTL at its peak MNI coordinate is shown in red, whereas that of visual cortex at its peak MNI coordinate is displayed in green. The MTL reached its peak activation before visual cortex.

Visual cortical activation lags MTL activation

Since replay events were identified using decoding probabilities from occipital sensors only, we next examined whether the observed replay onsets were related to power increases within visual cortex. Using a linearly constrained minimum variance (LCMV) beamforming algorithm (Van Veen et al., 1997), we first epoched the data around replay onsets combined from both start- and end-only conditions and then beamformed the broadband power into source space (see Methods for details). We found that the power increase at replay onset was localized to sources in bilateral visual cortex (Figure 6C). Moreover, the right MTL, including the hippocampus and amygdala, showed a power increase prior to the onset of replay events (Figure 6B). Specifically, activation (i.e., power increase) in the MTL occurred 30 ms earlier than that in visual cortex (p < 0.05, corrected, whole-brain nonparametric permutation test). For display purposes, we extracted activations from the 10 most activated voxels within the MTL and visual cortex, respectively, and plotted the time courses of their broadband power. Peak activation in visual cortex at replay onset was preceded by peak activation in the MTL (Figure 6D), implying an information flow from the MTL to visual cortex.

Discussion

We found the co-existence of time-locked, non-feature-specific elevated responses and non-time-locked, feature-specific backward replay after visual sequence exposure in the human brain. The feature-specific backward replay occurred in a time-compressed manner and could be triggered only by the first or last stimulus of the sequence. Interestingly, even brief exposure to the sequence could still induce a trend toward backward replay. Finally, we observed that MTL activity preceded the ripple power increase in visual cortex at replay onset.

Our study provides two new findings in the fields of vision and learning. First, in contrast to many previous studies showing that expectation-based responses in static contexts are feature-specific (Wyart et al., 2012; Kok et al., 2014, 2017), we show here that prediction-related elevated responses in a dynamic temporal context are not. These non-feature-specific elevated responses potentially facilitate the general processing of any upcoming stimuli, rather than of stimuli with a specific feature. Second, contrary to the prevailing notion that replay in learning and memory requires lengthy training with a task, our study shows that even brief exposure to a visual sequence is sufficient to induce replay. The replay we unveil here may play a critical role in memorizing the visual sequence.

Regarding the two-fold neural consequences of visual sequence exposure, an interesting question is whether the elevated responses and the backward replay share the same neural origin, for instance, originating from the hippocampus or other brain areas. For the elevated responses, the cue provides temporal information about upcoming stimuli but no detailed feature information, thereby priming the cortices for activation and facilitating the general processing of future events. Previous animal studies (Xu et al., 2012; Gavornik and Bear, 2014) showed that this process can be implemented through a simple local synaptic mechanism in visual cortex (Xu et al., 2012) without top-down guidance. Moreover, Ekman et al. (2023) recently found no functional relationship between activity in V1 and the hippocampus when exposing human participants to a white dot sequence, further suggesting that the elevated responses may indeed originate in visual cortex.

Different from the elevated responses, the replay events are likely initiated by the hippocampus. This speculation is supported by two key findings. First, the replay of the motion direction information manifested in a backward direction and a time-compressed manner. Second, the replay was observed regardless of whether the cue was the first or last RDK in the sequence. Neither of these properties can be explained by a simple local synaptic mechanism (Xu et al., 2012) or a pattern completion-like mechanism (Hindy et al., 2016; Ekman et al., 2017; Kok and Turk-Browne, 2018) within visual cortex alone. Instead, the replay entails reorganizing the motion direction sequence in the brain. Therefore, we propose the involvement of active communication between the hippocampus and visual cortex in the occurrence of replay events, consistent with the view that the hippocampus encodes relationships among stimuli whereas visual cortex primarily acts as a platform for the manifestation of replay events (Whittington et al., 2020). Previous studies on memory consolidation considered the exchange of information between these two regions critical for facilitating replay events through hippocampal-neocortical circuits (Buzsáki, 1996; Ji and Wilson, 2007; Carr et al., 2011; Ólafsdóttir et al., 2016; Buch et al., 2021). Functionally, replay events offer a mechanism for transferring recent experience from the hippocampus to the cortex, enabling the encoding of stimulus relationships in the cortex (Marr and Brindley, 1971; Alvarez and Squire, 1994; Redish and Touretzky, 1998; Dimakopoulos et al., 2022).

A natural behavioral consequence of visual sequence exposure is visual sequence learning (Baker et al., 2014; Finnie et al., 2021; Ekman et al., 2023). How is this learning implemented through hippocampus-dependent replay in visual cortex? Three key processes are considered here. First, the hippocampus is involved in encoding relationships among stimuli (Staresina and Davachi, 2009; Turk-Browne et al., 2009; Hsieh et al., 2014; Garvert et al., 2017); in fact, the ability to encode such relationships drastically decreases when the hippocampus is damaged (Chun and Phelps, 1999; Hannula et al., 2006; Konkel et al., 2008; Schapiro et al., 2014; Finnie et al., 2021). Second, visual cortical areas act as a “cognitive blackboard” on which task-relevant features are highlighted through feedback connections (Roelfsema and de Lange, 2016); such a blackboard can be flexibly written to or edited. Third, there are strong bidirectional connections between the hippocampus and sensory cortices (Eichenbaum et al., 2007; Henke, 2010). As proposed by the hippocampal-cortical backward projection model (Rolls, 2000), sequential reactivations of feature information initially generated in the hippocampus can be quickly and accurately sent back to the sensory cortices, consistent with the findings of Ji and Wilson (2007). A recent study also provided direct evidence that visual sequence plasticity is impaired when the hippocampus is damaged (Finnie et al., 2021), supporting the hypothesis of functional feedback information flow.

Why does the replay manifest in a reverse order? To date, there is no consensus on this issue. In rodents, a seminal study of replay during the awake state showed that when animals stopped at the end of a rewarded pathway, place cells were reactivated in the reverse order of the previously experienced direction (Foster and Wilson, 2006). However, a later study revealed that awake replay could occur in either a forward or backward direction relative to behavioral experience (Diba and Buzsáki, 2007). Similarly, in humans, both directions have been observed in different nonspatial cognitive tasks (Liu et al., 2019, 2021b; Wimmer et al., 2020). Forward replay may be associated with planning (Ólafsdóttir et al., 2015; Gillespie et al., 2021), providing information pertaining to the assessment of future pathways (Diba and Buzsáki, 2007). Backward replay may be more related to experience, as often observed at the end of runs when animals consume a reward (Foster and Wilson, 2006; Ambrose et al., 2016). Nevertheless, the exact function of the replay direction remains mysterious, as both forward and backward replays are modulated by task demands (Ólafsdóttir et al., 2017). Thus, the underlying neural mechanisms of backward replay in visual cortex remain to be investigated.

Finally, we also found that replay occurrence is modulated by the cue. This result highlights the importance of the start and end points of the sequence for the replay. One fascinating proposal is that replay events are sensitive to sequence boundaries, as indicated by the role of the start or end point of a sequence as a salient boundary that anchors place cell firing (Rivard et al., 2004). Accordingly, previous studies have shown that when rats are trained to run along a linear track starting at different points, place fields tend to be anchored to either the start or end of the journey (Gothard et al., 1996; Redish et al., 2000), suggesting that a boundary is essential to sequence integrity and may play a pivotal role in triggering replay onset. An alternative explanation posits that the onset of the second or third stimulus in the sequence reinstates the neural representation of only a partial visual sequence (Ekman et al., 2023). For example, for a given sequence A → B → C → D, flashing the third stimulus (i.e., C) would trigger only a backward reactivation of the partial sequence C → B → A during the blank period.

Taken together, we found that simple visual sequence exposure could concurrently induce two-fold brain plasticity, that is, non-feature-specific elevated responses and feature-specific backward replay in the human visual cortex. We speculate that the non-feature-specific elevated responses may enhance general processing of upcoming visual stimuli, whereas the feature-specific backward replay may subserve visual sequence learning and memory. These findings significantly advance our understanding of the task-independence and the multifaceted nature of brain plasticity in response to visual experience.

Methods

Participants

A total of 59 healthy participants (29 females) were involved in the three experiments (Experiment 1: n = 21, 11 females, 22.1 ± 2.61 years; Experiment 2: n = 18, 10 females, 23.56 ± 3.29 years; Experiment 3: n = 20, 10 females, 20.65 ± 2.62 years). In Experiment 1, data from three participants were excluded before analyses, as two showed large head motion (>20 mm) and the behavioral performance of one was at chance level. In Experiment 3, data from two participants were excluded before analyses because of large head motion (>20 mm). All participants were recruited in exchange for monetary compensation (100 RMB/h). They reported normal or corrected-to-normal vision and had no history of neurological disorders. They were naive to the purposes of the study. The experiments reported here were carried out in accordance with the guidelines expressed in the Declaration of Helsinki. All participants provided written informed consent in accordance with the procedures and protocols approved by the Human Subject Review Committee of Peking University.

Task

Experiment 1

Visual stimuli were RDKs with 100% coherence. All dots in an RDK moved in the same direction (luminance: 2.86 cd/m²; diameter: 0.1°; speed: 8°/s) and were presented against a gray background (luminance: 16.7 cd/m²). At any one moment, 400 dots were visible within a 9° circular aperture and moved in one of four directions: 0° (right), 90° (up), 180° (left), and 270° (down). Each participant completed three phases successively in the MEG scanner, namely, the functional localizer phase, the exposure phase, and the main phase (Figure 1B). In the functional localizer phase, each trial started with the presentation of an RDK for 1 s followed by a 1–1.5 s intertrial interval (ITI). Participants did not need to perform any task in this phase. The motion direction of the RDK was randomly chosen from the four directions. Each participant performed two functional localizer runs, and each run comprised 100 trials, resulting in a total of 50 trials per motion direction. The localizer data were used to train motion direction classification models. This phase took approximately 10 min.

In the exposure phase, we showed participants the four RDKs in either clockwise or counterclockwise order (e.g., 0° → 90° → 180° → 270°, Figure 1A and 1C); each was displayed for 400 ms, followed by a 300 ms blank screen with fixation only. Therefore, the total sequence lasted 2.8 s. The order was counterbalanced across participants but, once decided, was fixed for each participant. Participants were instructed to detect an oddball RDK by pressing a button: in 20% of trials, dots in one of the four RDKs traveled at a faster speed (9°/s). Each participant completed four runs of 50 trials.

In the main phase, participants were presented with trials in three different conditions: the full sequence condition (50% of trials), the start-only condition (25% of trials), and the end-only condition (25% of trials). Trials in the full sequence condition were identical to those in the exposure phase and served as “topping-up” exposure to maintain the exposure effect, similar to “topping-up” adaptation in visual adaptation studies (Fang et al., 2005). In the start- and end-only conditions, however, we presented only the first RDK (start-only condition) or the last RDK (end-only condition) of the full sequence. For a given run, the order of the three conditions was pseudorandomized with the restriction that start- and end-only trials were always preceded or followed by a full sequence trial. Participants performed the same oddball detection task as in the exposure phase. The oddball occurred in 10% of full sequence trials. Finally, each participant completed four runs of 48 trials, yielding a total of 96 full sequence trials, 48 start-only trials, and 48 end-only trials.

Experiments 2 and 3

In Experiment 2, we presented only the second or the third RDK as a cue at the start of the trial, referred to as the 2nd-only and 3rd-only conditions, respectively. The procedure of Experiment 2 was similar to that of Experiment 1 except that the start- and end-only conditions were replaced with the 2nd-only and 3rd-only conditions. Experiment 3 followed the same procedure as Experiment 1 except that the exposure phase was removed.

Quantification and statistical analysis

MEG acquisition and preprocessing

Neuromagnetic signals were recorded continuously at 1000 Hz with a 306-channel (204 planar gradiometers, 102 magnetometers), whole-head MEG system (Elekta Neuromag TRIUX) in a magnetically shielded room. Before scanning, four head position indicator coils attached to the scalp were used to determine the head position with respect to the sensor array. Coil locations were digitized with respect to three anatomical landmarks (nasion and preauricular points) with a 3D digitizer (Polhemus Isotrak system). Participants sat upright inside the scanner, and the stimuli were projected onto a screen suspended in front of them. Participants responded using a MEG-compatible button box held in their right hand.

To reduce noise from the external environment, the temporal extension of the signal-space separation (tSSS) method was applied at the preprocessing stage using the Elekta Neuromag MaxFilter software (Taulu and Simola, 2006). MEG signals were high-pass filtered at 0.5 Hz using a first-order IIR filter to remove slow drifts. Data were then downsampled from 1000 Hz to 250 Hz for the sequenceness analysis or to 500 Hz for the time-frequency analysis. Excessively noisy segments and sensors were automatically removed before independent component analysis (FastICA, http://research.ics.aalto.fi/ica/fastica) was performed to remove artifacts including cardiac signals, eye movements and blinks, and environmental noise. Artifact components were removed after visual inspection of the spatial topography, time course, kurtosis of the time course, and frequency spectrum of all components. Only 72 sensors (including both gradiometers and magnetometers) covering the occipital lobe, labeled as “Occipital” in the MEG data acquisition system, were used for MEG data analyses, except for source localization. This sensor selection was primarily motivated by the main objective of the study, namely, examining replay events in visual cortex.
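For orientation, the following is a minimal sketch of the filtering, downsampling, ICA, and occipital-sensor selection steps using MNE-Python; the file name, number of ICA components, and excluded component indices are hypothetical placeholders, and the authors' actual preprocessing used MaxFilter and FastICA as described above.

```python
import mne

# Minimal MNE-Python sketch of the preprocessing steps described above.
# File name, ICA settings, and excluded components are illustrative assumptions.
raw = mne.io.read_raw_fif("sub-01_tsss.fif", preload=True)  # tSSS-cleaned recording (hypothetical file)
raw.filter(l_freq=0.5, h_freq=None)                         # high-pass at 0.5 Hz to remove slow drifts
raw.resample(250)                                           # 250 Hz for sequenceness analysis (500 Hz for TFR)

ica = mne.preprocessing.ICA(n_components=40, method="fastica", random_state=0)
ica.fit(raw)
ica.exclude = [0, 3]                                        # e.g., cardiac and blink components (hypothetical)
raw_clean = ica.apply(raw.copy())

# Occipital sensor selection (built-in Elekta Vectorview/TRIUX selection labels)
occ = (mne.read_vectorview_selection("Left-occipital")
       + mne.read_vectorview_selection("Right-occipital"))
```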

Event-related fields

To calculate the event-related fields (ERFs), MEG epochs around trial onset were segmented for each trial and baseline-corrected using the mean activity in the time window of [−0.3 s, 0] before trial onset. Only the 48 planar gradiometers among the “Occipital” sensors were used to calculate the ERFs. We then averaged the planar-combined ERF activity for each condition. To avoid any potential baseline influence due to the short interval between successive RDKs, we further calculated the ERF peak amplitude using the mean activity in the time window of [−0.3 s, 0] before the onset of the corresponding RDK as the baseline. Peak amplitudes during the four corresponding RDK intervals were then calculated as the difference between each peak and its corresponding baseline.

Decoding analysis

Decoding analysis was performed to classify the neural activity patterns elicited by the motion directions of the four RDKs in the main phase. A one-versus-rest Lasso-regularized logistic regression model was trained using the MEG signals from the 72 occipital sensors in the functional localizer phase. Specifically, we trained a five-class classifier, with four classes from trials in which the four RDKs were presented and an additional class comprising an equivalent amount of null data extracted from the 1–1.5 s ITI. Null data were included to reduce the spatial correlation among the classes, thereby concurrently lowering the decoding probabilities for all classes (Liu et al., 2021a; Nour et al., 2021). Class weights were balanced by weighting each class inversely proportional to its frequency (i.e., trial number) in the training procedure. Additionally, every 5 trials within the same class were averaged to reduce noise. We then applied the trained classifier to the MEG signals at each time point in the main phase. Finally, decoding probabilities were averaged within each condition and motion direction to yield a time course of decoding probability.
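The following is a minimal sketch of this decoding scheme in Python with scikit-learn, assuming the localizer data have already been epoched, trial-averaged in groups of five, and augmented with a null class as described above; the array names, shapes, and regularization strength C are illustrative assumptions rather than the authors' exact settings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# X_loc: [n_trials, 72, n_times] localizer data (occipital sensors);
# y_loc: labels 0-3 for the four motion directions and 4 for the null (ITI) class.
def train_direction_classifier(X_loc, y_loc, t_idx, C=0.1):
    """One-vs-rest Lasso (L1) logistic regression trained at a single time point."""
    clf = LogisticRegression(penalty="l1", C=C, solver="liblinear",
                             class_weight="balanced", max_iter=1000)
    return clf.fit(X_loc[:, :, t_idx], y_loc)

def decoding_probabilities(clf, X_main):
    """Apply a trained classifier to every time point of the main-phase data.
    Returns [n_trials, n_times, n_classes] class probabilities."""
    n_trials, n_sensors, n_times = X_main.shape
    return np.stack([clf.predict_proba(X_main[:, :, t]) for t in range(n_times)], axis=1)
```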

Optimal time point of motion direction representation

The optimal time point of motion direction representation was considered as the time point with the highest decoding accuracy. To index the optimal time point of each motion direction for each participant, we conducted time-resolved motion direction decoding analysis on the functional localizer data using Lasso-regularized logistic regression models. A leave-one-trial-out cross-validation procedure was used to train the classifier to determine one of the four motion directions, yielding a decoding accuracy at each time point for each participant. Finally, the time point with the highest decoding accuracy was independently extracted for each participant and each motion direction. These time points are referred to as optimal time points and were used for sequenceness analysis.
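A sketch of this per-direction peak-time selection, under the same assumptions about array shapes as above and using per-class accuracy from leave-one-trial-out predictions, might look as follows:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict

def optimal_time_points(X_loc, y_loc, n_directions=4, C=0.1):
    """Return, for each motion direction, the index of the time point with the
    highest leave-one-trial-out decoding accuracy (illustrative sketch)."""
    n_trials, n_sensors, n_times = X_loc.shape
    acc = np.zeros((n_times, n_directions))
    clf = LogisticRegression(penalty="l1", C=C, solver="liblinear", max_iter=1000)
    for t in range(n_times):
        pred = cross_val_predict(clf, X_loc[:, :, t], y_loc, cv=LeaveOneOut())
        for k in range(n_directions):
            acc[t, k] = np.mean(pred[y_loc == k] == k)   # per-direction accuracy at this time point
    return acc.argmax(axis=0), acc                       # one optimal time index per direction
```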

Sequenceness measure

To identify sequenceness during the post-cue blank period in each trial, we trained models to detect transient spontaneous neural reactivation of each motion direction representation. Therefore, one-versus-rest Lasso-regularized logistic regression models were trained separately for each motion direction using the functional localizer data. MEG signals from the 72 occipital sensors obtained throughout all scanning sessions were used to train the decoding models. For each motion direction, we trained a binomial classifier, using positive instances from trials in which that feature (i.e., one motion direction) was presented and negative instances from trials in which the other features (i.e., the other three motion directions) were presented, together with an equivalent amount of null data from the 1–1.5 s ITI. The sensor distributions of beta estimates and the spatial correlations among the classifiers are shown in Figure S1.

The sequenceness analysis pipeline is illustrated in Figure 3. We first applied the classifiers, trained on the functional localizer data at the optimal time points, to the MEG signals at each time point during the blank period to generate a [time x state] reactivation probability matrix for each trial. The TDLM framework was then used to quantify evidence for sequential reactivations consistent with the inferred task transition structure (Liu et al., 2021b, 2021a; Nour et al., 2021).

TDLM is a multiple linear regression approach that quantifies the extent to which a lagged reactivation time course of state i, X(Δt)i (where Δt indicates the time lag), can predict the reactivation time course of state j, Xj. The pipeline comprises two steps. First, we performed separate multiple regressions using each state’s reactivation time course as the dependent variable (j ∈ [1: 4]) and the historical (i.e., time-lagged) reactivation time courses of all states (i ∈ [1: 4]) as predictor variables.
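In the notation of the TDLM framework (Liu et al., 2021a), this first-level regression can be written for each state j and time lag Δt as

$$X_j = \sum_{i=1}^{4} X(\Delta t)_i \, \beta(\Delta t)_{ij} + \varepsilon_j \qquad \text{(Equation 1)}$$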

The regression coefficients from Equation 1 quantify the evidence for each empirical state → state reactivation pattern at a given time lag, Δt. For example, β(Δt)ij is the coefficient capturing the unique variance in Xj explained by X(Δt)i. All such first-level coefficients were placed in a lag-specific [4 x 4] empirical transition matrix Β, representing evidence for all possible state-to-state transitions at a specific lag.

In the second step, we quantified the evidence that the empirical (lag-specific) transition matrix, Β, can be predicted by the sequences of interest, i.e., the exposed motion sequence. All transitions of interest were specified in a model transition matrix (1 for transitions of interest and 0 otherwise), separately for the forward direction (as in the visual experience; TF) and the backward direction (opposite to the visual experience; TB). The strength of each sequence of interest was then measured with a second-level regression.
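Following the second-level GLM of the TDLM framework (Liu et al., 2021a), this regression can be written as

$$\mathrm{vec}(\mathrm{B}) = \sum_{r} Z_r \, \mathrm{vec}(T_r) + \varepsilon \qquad \text{(Equation 2)}$$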

where Β is the [state x state] empirical transition matrix, Tr is the [state x state] predictor transition matrix (for regressor r), predicting the empirical transitions B. We included four predictor matrices: (1) TF: transitions as in the motion sequence in the forward direction (transitions corresponding to [A → B → C → D]), (2) TB: transitions opposite to the experience in the backward direction ([D → C → B → A], TB is the transpose of TF), (3) Tauto: self-transitions to control for autocorrelation ([4 x 4] identity matrix), and (4) Tconst: a constant matrix to model away the average of all transitions, ensuring that any weight on TF and TB was not due to general background neural dynamics.
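For the four-state sequence [A → B → C → D] used here, the forward and backward predictor matrices take the explicit form (rows index the "from" state, columns the "to" state):

$$T_F = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad T_B = T_F^{\top} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}$$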

Zr is the scalar regression coefficient quantifying the evidence for the hypothesized sequences in the empirical transitions, i.e., the sequence strength. Note that this estimate of sequence strength is a relative quantity. An estimate of zero for the transition from state i to state j, for example, does not mean a lack of replay from i → j, but rather that replay of i → j is no stronger than that of the other transitions. Repeating the regression in Equation 2 at each time lag (Δt = 4, 8, 12, …, 600 ms) yields both forward and backward sequence strength as a function of time lag. Shorter lags indicate greater time compression, corresponding to faster replay speed.

In the current study, sequenceness was defined from the contrast between the evidence for sequential replay of task structure in the forward ([A → B → C → D]) versus backward ([D → C → B → A]) direction (i.e., Z1–Z2); thus, removing between-participant variance in the sequential replay per se (which may arise secondary to task engagement and measurement sensitivity (Liu et al., 2021a; Nour et al., 2021)). Positive sequenceness values indicate replay in a predominantly forward direction, whereas negative sequenceness values indicate replay in a predominantly backward direction.

For statistical inference, we used nonparametric permutation tests involving all possible permutations of the stimulus labels at the second-level regression, which is equivalent to permuting the rows and columns of the transition matrices used to calculate sequenceness together. For each permutation, we calculated the peak absolute mean sequence strength over participants and across lags (controlling for multiple comparisons across lags). Sequence strength in the unpermuted data was considered significant if its absolute magnitude exceeded 95% of the within-permutation peaks.
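As a concrete illustration, the following NumPy sketch implements the two-level TDLM regression and the lag-wise permutation threshold described above for a single participant; the array shapes, the plain least-squares implementation, and the exclusion of the identity permutation from the null distribution are simplifying assumptions, and the reference implementation is the TDLM MATLAB code cited in the Data and Code availability section.

```python
import numpy as np
from itertools import permutations

def tdlm_sequenceness(X, T_f, max_lag=150):
    """Two-level TDLM on one [time x states] reactivation matrix X (sketch).
    T_f is the forward transition matrix of the exposed sequence; returns the
    forward-minus-backward sequenceness (Z_F - Z_B) at lags 1..max_lag samples.
    Negative values indicate predominantly backward replay."""
    n_time, n_states = X.shape
    T_b = T_f.T                                    # backward transitions
    T_auto = np.eye(n_states)                      # self-transitions (autocorrelation control)
    T_const = np.ones((n_states, n_states))        # constant regressor
    seq = np.zeros(max_lag)

    for lag in range(1, max_lag + 1):
        # First-level GLM: regress lagged reactivations of all states onto current ones.
        X_lag = np.zeros_like(X)
        X_lag[lag:, :] = X[:-lag, :]               # X(Δt)
        design = np.column_stack([X_lag, np.ones(n_time)])
        beta, *_ = np.linalg.lstsq(design, X, rcond=None)
        B = beta[:n_states, :]                     # empirical [states x states] transition matrix

        # Second-level GLM: regress vec(B) onto the vectorized predictor matrices.
        preds = np.column_stack([T_f.ravel(), T_b.ravel(), T_auto.ravel(), T_const.ravel()])
        Z, *_ = np.linalg.lstsq(preds, B.ravel(), rcond=None)
        seq[lag - 1] = Z[0] - Z[1]                 # forward minus backward evidence
    return seq

def permutation_threshold(X_trials, T_f, max_lag=150):
    """Peak-level threshold from permuting state identities (rows and columns together)."""
    n_states = T_f.shape[0]
    peaks = []
    for perm in permutations(range(n_states)):
        if perm == tuple(range(n_states)):
            continue                               # skip the unpermuted (true) ordering
        P = np.eye(n_states)[list(perm)]
        T_perm = P @ T_f @ P.T                     # permute rows and columns together
        mean_seq = np.mean([tdlm_sequenceness(X, T_perm, max_lag) for X in X_trials], axis=0)
        peaks.append(np.max(np.abs(mean_seq)))
    return np.percentile(peaks, 95)
```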

Identifying replay onsets and analyzing time-frequency

Given that the maximal evidence for replay transitions at the group level was at a 32 ms-lag, we next identified replay onsets. Replay onsets were defined as the times when a strong reactivation of one stimulus (e.g., A) was followed by a strong reactivation of the next stimulus (e.g., B) in the sequence, with a 32 ms-lag (Liu et al., 2019; Nour et al., 2021). Specifically, we shifted the reactivation matrix X by the time lag Δt with maximal evidence for replay (i.e., 32 ms), obtaining the lagged matrix X(Δt).

We then multiplied X by the transition matrix P corresponding to the unscrambled sequence, yielding Orig = X × P. The transition matrix P defines the mapping between the task state corresponding to column i in X and column i in Orig; specifically, column i in Orig is the reactivation time course of the state preceding state i in the sequence.

Next, we multiplied the lagged matrix X(Δt) (denoted Proj) and Orig element-wise and summed over the columns of the resulting matrix, creating a [time x 1] vector, R, in which each element indicates the strength of replay with a 32 ms-lag at a given moment in time.

Finally, we identified putative replay event onsets by thresholding R at its 95th percentile and requiring a preceding 100 ms pre-onset baseline with a low probability of replay.
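A minimal NumPy sketch of this onset-detection step, under the assumption of a 250 Hz sampling rate (so that 32 ms corresponds to 8 samples and 100 ms to 25 samples), might look as follows; the pre-onset baseline check is a simplified stand-in for the procedure described above.

```python
import numpy as np

def replay_onsets(X, T, lag=8, pct=95, baseline=25):
    """Identify putative replay onsets in one trial (sketch).
    X: [time x states] reactivation probabilities during the blank period.
    T: [states x states] transition matrix of the replayed (e.g., backward) sequence.
    lag: transition lag in samples (8 samples = 32 ms at 250 Hz)."""
    orig = X @ T                            # column j: time course of the state preceding state j
    proj = np.zeros_like(X)
    proj[:-lag, :] = X[lag:, :]             # reactivations `lag` samples later
    R = (proj * orig).sum(axis=1)           # moment-by-moment replay strength at this lag

    thresh = np.percentile(R, pct)
    onsets = [t for t in range(baseline, len(R))
              if R[t] > thresh and np.all(R[t - baseline:t] < thresh)]
    return np.array(onsets)
```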

We then epoched the MEG data in the blank period around each onset and computed a frequency decomposition (wavelet transformation) in the time window of −100 to 150 ms using all sensors. We aimed to find evidence for a power increase in the high-frequency (120–160 Hz) band of interest at replay onset, compared with a pre-onset baseline (−100 to −50 ms from onset), as in our previous study (Liu et al., 2019). Finally, we generated two separate [time x frequency] matrices (using the forward and backward task transition matrices separately) by averaging estimates over sensors and events, capturing the typical spectrally resolved power change at replay onset.
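A sketch of this time-frequency step using MNE-Python's Morlet wavelet routine is shown below; the epoch array, the exact frequency grid, and the number of wavelet cycles are illustrative assumptions (chosen here so that the wavelets fit within the short replay-locked epochs).

```python
import numpy as np
from mne.time_frequency import tfr_array_morlet

# epochs_data: [n_events, n_channels, n_times] MEG data epoched around replay onsets
# (-100 to 150 ms), sampled at 500 Hz, so the first 25 samples span the pre-onset baseline.
freqs = np.arange(80.0, 200.0, 5.0)                   # frequencies covering the ripple band
power = tfr_array_morlet(epochs_data, sfreq=500.0, freqs=freqs,
                         n_cycles=7.0, output="power")

mean_power = power.mean(axis=(0, 1))                  # average over events and channels -> [freqs x times]
baseline = mean_power[:, :25].mean(axis=1, keepdims=True)   # -100 to -50 ms pre-onset window
rel_power = mean_power - baseline                     # baseline-corrected power change at replay onset
```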

MEG source reconstruction

We identified the neural sources associated with the increased ripple power at putative replay onsets. Forward models were generated on the basis of a single-shell head model, using the superposition of basis functions that approximately correspond to the plane tangential to the MEG sensor array. Linearly constrained minimum variance beamforming (Van Veen et al., 1997) was used to reconstruct the epoched MEG data on a grid in MNI space (grid step, 5 mm). The sensor covariance matrix for beamforming was estimated using broadband power data across all frequencies. The baseline activity was the mean activity averaged over −100 to −50 ms relative to replay onset. All nonartifactual replay epochs were baseline-corrected at the source level. We obtained whole-brain results for voxels predicted by participant-specific ripple power at replay onset. Nonparametric permutation tests were performed on the volume of interest to compute multiple-comparison-corrected P values for clusters of >10 voxels (whole-brain corrected; cluster-defining threshold, t = 3.1; n = 5000 permutations).
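An approximate MNE-Python analogue of this beamforming step is sketched below; the epochs object, the precomputed single-shell forward model (fwd), and the covariance and regularization settings are assumptions for illustration and do not reproduce the authors' exact pipeline.

```python
import mne
from mne.beamformer import make_lcmv, apply_lcmv_epochs

# epochs: replay-onset-locked MEG epochs (-100 to 150 ms);
# fwd: precomputed single-shell forward model on a 5 mm volumetric grid (assumed available).
data_cov = mne.compute_covariance(epochs, tmin=-0.1, tmax=0.15, method="empirical")
noise_cov = mne.compute_covariance(epochs, tmin=-0.1, tmax=-0.05, method="empirical")

filters = make_lcmv(epochs.info, fwd, data_cov, reg=0.05, noise_cov=noise_cov,
                    pick_ori="max-power", weight_norm="unit-noise-gain")
stcs = apply_lcmv_epochs(epochs, filters)      # one source estimate per replay epoch

# Baseline-correct each source time course with the -100 to -50 ms pre-onset window,
# average across epochs, and pass the resulting maps to group-level permutation tests.
```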

Data and Code availability

Data used to generate the findings of this study will be made available upon request (subject to participant consent) to the Lead Contact. The TDLM MATLAB code is available at https://github.com/YunzheLiu/TDLM. Custom computer code will be made available upon request to the Lead Contact.

Acknowledgements

This study was supported by National Science and Technology Innovation 2030 Major Program (2022ZD0204802), the National Natural Science Foundation of China (31930053), Beijing Natural Science Foundation (5244044), and the Young Scientists Fund of the Humanities and Social Science Foundation of Ministry of Education of China (23YJCZH071).

Additional information

Author contributions

Conceptualization, T.H., Y.L., and F.F.; Investigation, T.H., Q.W., X.G., and X.Z.; Writing – Original Draft, T.H.; Writing – Review & Editing, T.H., Q.W., X.Z., Y.L., and F.F.

Declaration of interests

The authors declare no competing interests.

Supplementary information

Sensor maps and spatial correlation of trained Lasso logistic regression models (related to Figure 3)

(A) Left: Sensor map for each state decoding model in Experiment 1; MEG signals from 72 occipital sensors were selected as features while training the classifiers. Right: Correlation matrix among classifiers. No spatial correlation was found among trained classifiers (highest correlation: r < 0.12). (B and C) Same as (A), except that the sensor maps and correlation matrices correspond to Experiments 2 (Panel B, highest correlation: r < 0.1) and 3 (Panel C, highest correlation: r < 0.1), respectively.

Decoded feature representations in start- and end-only conditions (related to Figure 3)

Time series of the reactivation probabilities output by the four regression models on an example trial. Time zero corresponds to trial onset. Both participants S08 and S12 were exposed to a fixed motion sequence of 0° → 90° → 180° → 270°.

Sequenceness distribution across participants (related to Figures 4 and 5)

(A) In Experiment 1, of the 18 participants, 14 showed backward replay at 32 ms-lag for the start-only condition (left), and 13 showed backward replay at 32 ms-lag for the end-only condition (right). Each dot represents an individual participant. Backward and forward sequenceness is denoted by blue and yellow dots, respectively. The inset histogram shows the distribution of deviations from the unity line. (B) Same as described in (A), but for the results of Experiment 2 in the 2nd-only (left) and 3rd-only (right) conditions, respectively. (C) Same as described in (A) but for the results of Experiment 3 in the start-(left) and end-only (right) conditions, respectively.

Sequenceness for each of the 24 possible orders (related to Figure 4)

(A) In the start-only condition of Experiment 1, only the backward sequence [4 → 3 → 2 → 1], with lags at 28–40 ms, reached significance (gray panels). The horizontal dashed lines represent significance thresholds derived from state-label permutation at the 2nd-level GLM of temporally delayed linear modeling. (B) Same as (A) but for the end-only condition in Experiment 1.

Stimuli and experimental procedures in Experiments 2 and 3 (related to Figure 5)

(A) We presented the second (2nd-only condition) or third RDK (3rd-only condition) as a cue in Experiment 2. (B) The procedure of Experiment 3 was identical to that of Experiment 1 except for removing the exposure phase.