Sequence action representations contextualize during early skill learning

  1. Debadatta Dash
  2. Fumiaki Iwane
  3. William Hayward
  4. Roberto F Salamanca-Giron
  5. Marlene Bönstrup
  6. Ethan R Buch (corresponding author)
  7. Leonardo G Cohen (corresponding author)
  1. Human Cortical Physiology and Neurorehabilitation Section, NINDS, NIH, United States
  2. Department of Neurology, University of Leipzig Medical Center, Germany
6 figures and 1 additional file

Figures

Figure 1 with 1 supplement
Experimental design and behavioral performance.

(A) Skill learning task. Participants engaged in a procedural motor skill learning task, which required them to repeatedly type a keypress sequence, "4-1-3-2-4" (1=little finger, 2=ring finger, 3=middle finger, and 4=index finger) with their non-dominant, left hand. The Day 1 Training session included 36 trials, with each trial consisting of alternating 10 s practice and rest intervals. The rationale for this task design was to minimize reactive inhibition effects during the period of steep performance improvements (early learning; Bönstrup et al., 2020; Pan and Rickard, 2015; see Materials and methods). After a 24-hr break, participants were retested on performance of the same sequence (4-1-3-2-4) for nine trials (Day 2 Retest) to inform on the generalizability of the findings over time and MEG recording sessions, as well as single-trial performance on nine different control sequences (Day 2 Control; 2-1-3-4-2, 4-2-4-3-1, 3-4-2-3-1, 1-4-3-4-2, 3-2-4-3-1, 1-4-2-3-1, 3-2-4-2-1, 3-2-1-4-2, and 4-2-3-1-4) to inform on specificity of the findings to the learned skill. MEG was recorded during both Day 1 and Day 2 sessions with a 275-channel CTF MEG system (CTF Systems, Inc, Canada). (B) Skill Learning. As reported previously (Bönstrup et al., 2019a), participants on average reached 95% of peak performance by trial 11 of the Day 1 Training session (see Figure 1—figure supplement 1A for results over all Day 1 Training and Day 2 Retest trials). Shaded regions in the main plot indicate the 95% confidence interval of the group mean. At the group level, total early learning was exclusively accounted for by micro-offline gains during inter-practice rest intervals (B, inset; F(2,75)=14.79, p=3.86 × 10–6; micro-online vs. micro-offline: p=7.98 × 10–6; micro-online vs. total: p=0.0002; micro-offline vs. total: p=0.669).
These results were not impacted by potential preplanning effects on initial skill performance (Ariani and Diedrichsen, 2019), since alternative measurements of cumulative micro-online and -offline gains remained unchanged after omission of the first three keypresses in each trial from the correct sequence speed computation (paired t-tests; micro-online: t25=–0.0223, p=0.982; micro-offline: t25=–0.879, p=0.388). Center line of box plots shown in the inset indicates the group median, while box limits indicate the 1st and 3rd quartiles. Whisker lengths are set at the extreme value ≤1.5×IQR. (C) Keypress transition time (KTT) variability. Distribution of KTTs normalized to the median correct sequence time for each participant and centered on the mid-point for each full sequence iteration during early learning (see Figure 1—figure supplement 1B for results over all Day 1 Training and Day 2 Retest trials). Note the initial variability of the relative KTT composition of the sequence (i.e. 4–1, 1–3, 3–2, 2–4, 4–4) before it stabilizes in the early learning period.
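The micro-online/micro-offline decomposition described above can be sketched as follows. This is a minimal illustration with hypothetical speed values, not the study's analysis code; the actual computation used instantaneous correct-sequence typing speed (see Materials and methods).

```python
import numpy as np

def micro_gains(speed_start, speed_end):
    """Illustrative micro-online/-offline decomposition (hypothetical data).

    speed_start[i] / speed_end[i]: correct-sequence typing speed
    (keypresses/s) at the beginning and end of practice trial i.
    """
    speed_start = np.asarray(speed_start, dtype=float)
    speed_end = np.asarray(speed_end, dtype=float)
    # Micro-online gain: speed change within each practice period
    micro_online = speed_end - speed_start
    # Micro-offline gain: speed change across each inter-practice rest
    micro_offline = speed_start[1:] - speed_end[:-1]
    # Total early learning: speed change from start of first trial to end of last
    total = speed_end[-1] - speed_start[0]
    return micro_online.sum(), micro_offline.sum(), total

# Toy example in which all improvement accrues between trials (offline),
# mirroring the pattern reported in panel B (inset)
on, off, tot = micro_gains([1.0, 1.5, 1.9], [1.0, 1.5, 1.9])
```

By construction, cumulative micro-online plus micro-offline gains telescope to the total gain, which is why the three quantities can be compared directly in the inset of panel B.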

Figure 1—figure supplement 1
Behavioral performance during skill learning.

(A) Total Skill Learning over Day 1 Training (36 trials) and Day 2 Retest (9 trials). As reported previously (Bönstrup et al., 2019a), participants on average reached 95% of peak performance during Day 1 Training by trial 11. Note that after trial 11, performance stabilizes around a plateau through trial 36. Following a 24-hr break, participants displayed an upward shift in performance during the Day 2 Retest – indicative of an overnight skill consolidation effect. Shaded regions indicate the 95% confidence interval of the group mean. (B) Keypress transition time (KTT) variability. Distribution of KTTs normalized to the median correct sequence time for each participant and centered on the mid-point for each full sequence iteration during early learning. Note that the initial variability of the five component transitions in the sequence (i.e. 4–1, 1–3, 3–2, 2–4, 4–4) stabilizes by trial 6 in the early learning period and remains stable throughout the rest of Day 1 Training (through trial 36) and Day 2 Retest.

Figure 2 with 2 supplements
Spatial and oscillatory contributions to neural decoding of finger identities.

(A) Contribution of whole-brain oscillatory frequencies to decoding. Decoding accuracy (i.e. test-sample performance) was highest when decoders were trained on broadband rather than narrow frequency band activity, for both whole-brain voxel-space (74.51% ± SD 7.34%, t=8.08, p<0.001) and parcel-space (70.11% ± SD 7.11%, t=13.22, p<0.001) MEG features. Thus, decoders trained on whole-brain broadband data consistently outperformed those trained on narrowband activity. Dots depict decoding accuracy for each participant. Center line of box plots indicates the group median, while notches represent the 95% confidence interval of the group median. Box limits indicate the 1st and 3rd quartiles while whisker lengths are set at the extreme value ≤1.5×IQR. Outlier values located outside of the whisker range are marked with “+” symbols. *p<0.05, **p<0.01, ***p<0.001, n.s. - no statistical significance (p>0.05). (B) Whole-brain parcel-space decoding. Color-coded brain surface plot displaying the relative importance of individual brain regions (parcels) to broadband whole-brain parcel-space decoding performance (far-left light gray box plot in A). (C) Whole-brain voxel-space decoding. Color-coded brain surface plot displaying the relative importance of individual voxels to broadband whole-brain voxel-space decoding performance (far-left dark gray box plot in A). (D) Regional voxel-space decoding. Broadband voxel-space decoding performance for top-ranked brain regions across the group is displayed on a standard (FreeSurfer fsaverage) brain surface and color-coded by accuracy. Note that while whole-brain parcel- and voxel-space decoders relied more on information from brain regions contralateral to the engaged hand, regional voxel-space decoders performed similarly for bilateral sensorimotor regions.

Figure 2—figure supplement 1
Oscillatory contributions at individual brain regions.

Decoding performance of regional voxel-space activity patterns within individual brain areas for broadband and each narrowband oscillatory range is displayed in the form of a heatmap for both the left and right hemispheres. Optimal decoding performance for broadband regional voxel-space decoders was obtained from bilateral superior frontal (Left: 68.77% ± SD 7.6%; Right: 67.52% ± SD 6.78%), middle frontal (Left: 63.41% ± SD 7.58%; Right: 62.78% ± SD 76.94%), pre-central (Left: 62.37% ± SD 6.32%; Right: 62.69% ± SD 5.94%), and post-central (Left: 61.71% ± SD 6.62%; Right: 61.09% ± SD 6.2%) brain regions. Superior parietal, central, paracentral, anterior cingulate, and precuneus regions also showed broadband decoding performance exceeding 60%. With respect to decoders constructed from narrowband oscillatory input features, only Delta-band voxel-space activity from bilateral superior frontal regions achieved at least 60% decoding accuracy of keypresses.

Figure 2—figure supplement 2
Distribution of correlation coefficients between parcel-space time-series and their constituent voxels.

Data are shown for all subjects. Parcels represented in the regional voxel-space features of the hybrid-space decoder are marked with red vertical boxes (bilateral superior frontal, middle frontal, pre-central, and post-central regions). The y-axis indicates the absolute correlation coefficient of each voxel time series with the time series of the parcel it is a member of (1=complete redundancy; 0=orthogonality). Note that while signal in some voxels correlates strongly with the parcel-space time series, others are fully orthogonal. That is, the degree to which information obtained at the two different spatial scales is complementary (or redundant) varies substantially over the regional voxel space. This finding is consistent with the documented increase in correlational structure of neural activity across larger spatial scales that does not reflect perfect dependency or orthogonality (Munn et al., 2024). The normalized cumulative distributions of parcel-to-voxel-space correlations depicted on the right show that voxels included in the hybrid-space decoder (red) are overall less correlated with their respective parcel-space time-series (two-sample Kolmogorov-Smirnov test: D=0.2484, p<1 × 10–10) than excluded voxels (gray).
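The parcel-to-voxel correlation analysis above can be sketched with simulated data. The dimensions, the signal mixture, and the split into two voxel subsets are illustrative assumptions, not the study's values.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated parcel: 50 voxel time series with a graded share of a common signal
n_voxels, n_samples = 50, 500
common = rng.standard_normal(n_samples)
weights = np.linspace(0.0, 1.0, n_voxels)  # redundancy gradient across voxels
voxels = (weights[:, None] * common
          + (1 - weights)[:, None] * rng.standard_normal((n_voxels, n_samples)))

# Parcel-space time series: average of all constituent voxel time series
parcel = voxels.mean(axis=0)

# Absolute correlation of each voxel with its parcel (1=redundant, 0=orthogonal)
r = np.abs([np.corrcoef(v, parcel)[0, 1] for v in voxels])

# Two-sample Kolmogorov-Smirnov test comparing the correlation distributions
# of two voxel subsets (stand-ins for included vs. excluded voxels)
low_half, high_half = r[:25], r[25:]
ks_stat, ks_p = stats.ks_2samp(low_half, high_half)
```

With this construction, the two subsets differ clearly in their parcel correlations, so the KS test rejects the null of a shared distribution, mirroring the reported comparison of included versus excluded voxels.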

Figure 3 with 7 supplements
Hybrid spatial approach for neural decoding during skill learning.

(A) Pipeline. Sensor-space MEG data (N=272 channels) were source-localized (voxel-space features; N=15,684 voxels), and then parcellated (parcel-space features; N=148) by averaging the activity of all voxels located within an individual region defined in a standard template space (Desikan-Killiany Atlas). Individual regional voxel-space decoders were then constructed and ranked. The final hybrid-space keypress state (i.e. 4-class) decoder was constructed using all whole-brain parcel-space features and top-ranked regional voxel-space features (see Materials and methods). (B) Decoding performance across parcel, voxel, and hybrid spaces. Note that decoding performance was highest for the hybrid-space approach compared to performance obtained for the whole-brain voxel- and parcel-space approaches. Addition of linear discriminant analysis (LDA)-based dimensionality reduction further improved decoding performance for both parcel- and hybrid-space approaches. Each dot represents accuracy for a single participant and method. Center line of box plots indicates the group median, while notches (and shaded areas) represent the 95% confidence interval of the group median. Box limits indicate the 1st and 3rd quartiles while whisker lengths are set at the extreme value ≤1.5×IQR. Outlier values located outside of the whisker range are marked with “+” symbols. ***p<0.001 and *p<0.05. (C) Confusion matrix of individual finger identity decoding for hybrid-space manifold features. True predictions are located on the main diagonal. Off-diagonal elements in each row depict false-negative predictions for each finger, while off-diagonal elements in each column indicate false-positive predictions. Please note that the index finger keypress had the highest false-negative misclassification rate (11.55%).
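A minimal sketch of the hybrid-space feature assembly described in (A). The dimensions, data, and region ranking here are toy stand-ins; the actual pipeline ranks regions by cross-validated decoding accuracy over 15,684 voxels and 148 Desikan-Killiany parcels.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical dimensions (real study: 15,684 voxels, 148 parcels)
n_samples, n_voxels, n_parcels = 200, 600, 12
voxel_data = rng.standard_normal((n_samples, n_voxels))
# Each voxel's parcel membership (Desikan-Killiany atlas in the real pipeline)
membership = rng.integers(0, n_parcels, size=n_voxels)

# Parcel-space features: average activity of all voxels within each parcel
parcel_data = np.stack(
    [voxel_data[:, membership == p].mean(axis=1) for p in range(n_parcels)],
    axis=1)

# Stand-in regional ranking: random scores here replace the cross-validated
# regional voxel-space decoding accuracies used for ranking in the paper
region_scores = rng.random(n_parcels)
top_regions = np.argsort(region_scores)[::-1][:4]  # keep top-4 regions

# Hybrid-space features: all parcel features + top regions' voxel features
top_voxel_cols = np.isin(membership, top_regions)
hybrid = np.hstack([parcel_data, voxel_data[:, top_voxel_cols]])
```

The concatenated `hybrid` matrix is what a downstream keypress classifier would be trained on in this sketch.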

Figure 3—figure supplement 1
Contribution of whole-brain oscillatory frequencies to decoding.

Accuracy for decoders trained on four different input feature spaces—sensor, whole-brain parcel, whole-brain voxel, and hybrid (combination of whole-brain parcel plus regional voxel)—was highest for broadband MEG activity, followed by Delta-band activity. The hybrid approach resulted in the highest decoding accuracy, regardless of whether input features were broadband or narrowband-limited. Sensor-, parcel-, and voxel-space decoders displayed similar accuracy with respect to one another for broadband MEG activity, and also for all narrowband ranges assessed. Dots depict decoding accuracy for each participant. Center line of box plots indicates the group median, while notches represent the 95% confidence interval of the group median. Box limits indicate the 1st and 3rd quartiles while whisker lengths are set at the extreme value ≤1.5×IQR. Outlier values located outside of the whisker range are marked with “+” symbols. ***p<0.001, n.s. - no statistical significance (p>0.05).

Figure 3—figure supplement 2
Comparison of different dimensionality reduction techniques.

Dimensionality reduction was applied to the input features for each approach (parcel-space: N=148; voxel-space: N=15,684; hybrid-space: N=1295; Maaten and Postma, 2009). The results with principal component analysis (PCA, in green), multi-dimensional scaling (MDS, in blue), the minimum redundancy maximum relevance algorithm (MRMR, in red), and linear discriminant analysis (LDA, in black) are shown in comparison to performance obtained using all input features (in magenta). For parcel-space input features, all of these approaches increased mean decoding accuracy, with PCA and LDA (both of which extract orthogonal features) showing statistically significant improvements (one-way ANOVA: F=13.05, p<0.001; post hoc Tukey tests: p=0.032; PCA: p<0.001; LDA: p>0.05). For voxel-space features, there was no statistically significant improvement with any of the approaches (p>0.05). While MRMR resulted in the largest voxel-space decoding accuracy improvement, it was not statistically significant (post hoc Tukey test: p=0.14), and application of LDA dimensionality reduction actually reduced performance dramatically. Uniquely for hybrid-space features, all dimensionality reduction techniques improved decoding performance significantly (one-way ANOVA: F=21.32; post hoc Tukey tests: p<0.05), with the largest improvement observed following application of LDA. Center line of box plots indicates the group median, while notches represent the 95% confidence interval of the group median. Box limits indicate the 1st and 3rd quartiles while whisker lengths are set at the extreme value ≤1.5×IQR. Outlier values located outside of the whisker range are marked with “+” symbols. ***p<0.001, **p<0.01, *p<0.05, n.s. - no statistical significance (p>0.05).
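The contrast between supervised (LDA) and unsupervised (PCA) dimensionality reduction can be sketched on synthetic data. The dataset, dimensionality, and classifier settings below are illustrative assumptions, not the MEG features or the study's pipeline.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic 4-class stand-in for keypress feature vectors (not the MEG data)
X, y = make_classification(n_samples=400, n_features=100, n_informative=10,
                           n_classes=4, n_clusters_per_class=1, random_state=0)

# Supervised reduction: LDA projects to (n_classes - 1) = 3 dimensions
lda_pipe = make_pipeline(LinearDiscriminantAnalysis(n_components=3),
                         LinearDiscriminantAnalysis())
# Unsupervised reduction: PCA to the same number of components
pca_pipe = make_pipeline(PCA(n_components=3), LinearDiscriminantAnalysis())

# 5-fold cross-validated accuracy for each reduction strategy
lda_acc = cross_val_score(lda_pipe, X, y, cv=5).mean()
pca_acc = cross_val_score(pca_pipe, X, y, cv=5).mean()
```

Because LDA uses class labels when constructing its projection, it tends to preserve discriminative structure that PCA (which only maximizes variance) may discard; which strategy wins on real data depends on the feature space, as the figure shows.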

Figure 3—figure supplement 3
ICA artifacts do not contribute to decoding.

(A) Example of ICA component time-series for components labeled as artifacts from a single subject during MEG data pre-processing. The features of these components are consistent with known motion and physiological artifacts in MEG data. (B) 4-class confusion matrix and (C) decoding performance of keypress action labels from ICA components labeled as artifacts and removed from the MEG data during pre-processing. These components failed to predict keypress labels above empirically determined chance levels (as shown by decoding performance after random label shuffling). Note that in all cases, decoding performance from movement and physiological artifacts was substantially lower than 4-class MEG hybrid-space decoding for all participants. (D) Head position was assessed at the beginning and at the end of each recording and used to measure head movement. The mean measured head movement across the study group was 1.159 mm (±1.077 SD). Center line of the box plot indicates the group median, while notches represent the 95% confidence interval of the group median. Box limits indicate the 1st and 3rd quartiles while whisker lengths are set at the extreme value ≤1.5×IQR. Outlier values located outside of the whisker range are marked with “+” symbols.

Figure 3—figure supplement 4
Confusion matrices for decoding performance on Day 2 Retest (A) and Day 2 Control (B) data.

Note that the hybrid-space decoding strategy generalized to Day 2 data with 87.11% overall accuracy for keypresses embedded within the trained sequence (Day 2 Retest) and 79.44% overall accuracy for keypresses embedded within untrained control sequences (Day 2 Control).

Figure 3—figure supplement 5
Decoding performance across temporal scales.

(A) Average decoding accuracies across participants with varying window parameters. The x-axis indicates the onset of the time window (in ms) used to relate MEG activity time series to individual keypresses (i.e. KeyDown event = 0ms), while the y-axis indicates the window duration (in ms). The heatmap color denotes the decoding accuracy for all window onset/duration pairings. The best decoding accuracy across subjects was obtained using a window duration of 200ms with the leading edge aligned to the KeyDown event (i.e. 0ms; marked by the dashed lines and open circle). (B) Decoder window parameters (onset and duration) used for each subject in reported decoder accuracy comparisons (Figures 2–4). Please note that the group-optimal set of parameters (window onset = 0ms; window duration = 200ms; LDA dimensionality reduction) was utilized for all contextualization analyses (Figure 5) to allow for comparison across participants. Center line of box plots indicates the group median, while notches represent the 95% confidence interval of the group median. Box limits indicate the 1st and 3rd quartiles while whisker lengths are set at the extreme value ≤1.5×IQR. Outlier values located outside of the whisker range are marked with “+” symbols.

Figure 3—figure supplement 6
Comparison of decoding performances with two different hybrid approaches.

HybridOverlap (regional voxel-space features from top-ranked parcels combined with all whole-brain parcel-space features, as shown in Figure 3B and Figure 3—figure supplements 1 and 3–5 of the manuscript) and HybridNon-overlap (regional voxel-space features of top-ranked parcels and spatially non-overlapping whole-brain parcel-space features). Filled circle markers represent decoding accuracy for individual subjects. Dashed lines indicate within-subject performance changes between decoding approaches. Note that the HybridOverlap approach (the one used in our manuscript) significantly outperforms the HybridNon-overlap approach (Wilcoxon signed rank test, z=3.7410, p=1.8326e-04), despite the removed features (n=8) comprising less than 1% of the overall input feature space. These results indicate that the spatially overlapping whole-brain (lower resolution) parcel-space and regional (higher resolution) voxel-space features provide complementary—as opposed to redundant—information to the hybrid-space decoder. Center line of box plots indicates the group median, while notches represent the 95% confidence interval of the group median. Box limits indicate the 1st and 3rd quartiles while whisker lengths are set at the extreme value ≤1.5×IQR.

Figure 3—figure supplement 7
Comparison of different decoder methods.

Performance for all different machine learning decoders assessed is shown for each participant. The results show that the linear discriminant analysis (LDA) classifier outperformed other methods, on average, across the group. Decoding analysis performance comparisons reported in the current study utilized the LDA decoder for all subjects. Center line of box plots indicates the group median, while notches represent the 95% confidence interval of the group median. Box limits indicate the 1st and 3rd quartiles while whisker lengths are set at the extreme value ≤1.5×IQR. Outlier values located outside of the whisker range are marked with “+” symbols.

Figure 4 with 3 supplements
Evolution of keypress neural representations with skill learning.

(A) Keypress neural representations differentiate during early learning. t-SNE distribution of the neural representation of each keypress (top scatter plots) is shown for trial 1 (start of training; top-left), trial 11 (end of early learning; top-center), and trial 36 (end of training; top-right) for a single representative participant. Individual keypress manifold representation clustering in trial 11 (top-center; end of early learning) depicts sub-clustering for the index finger keypress performed at the two different ordinal positions in the sequence (IndexOP1 and IndexOP5), which remains present at trial 36 (top-right). Spatial distribution of regional contributions to decoding (bottom brain surface maps). The surface color heatmap indicates feature importance scores across the brain. Note that decoding contributions shifted from contralateral right pre-central cortex at trial 1 (bottom-left) to contralateral superior and middle frontal cortex at trials 11 (bottom-center) and 36 (bottom-right). (B) Confusion matrix for 5-class decoding of individual sequence items. Decoders were trained to classify contextual representations of the keypresses (i.e. 5-class classification of the sequence elements 4-1-3-2-4). Note that the decoding accuracy increased to 94.15% ± SD 4.84% and the misclassification of keypress 4 was significantly reduced (from 141 to 82). (C) Trial-by-trial classification accuracy for 2-class decoder (IndexOP1 vs. IndexOP5). A decoder (200ms window duration aligned to the KeyDown event) was trained to differentiate between the two index finger keypresses embedded at different positions within the practiced skill sequence (IndexOP1=index finger keypress at ordinal position 1 of the sequence; IndexOP5=index finger keypress at ordinal position 5 of the sequence). Decoder accuracy progressively improved over early learning, stabilizing around 96% by trial 11 (end of early learning).
Similar results were observed for other decoding window sizes (50, 100, 150, 250, and 300ms; see Figure 4—figure supplement 2). Taken together, these findings indicate that the neural feature space evolves over early learning to incorporate sequence location information. Shaded region indicates the 95% confidence interval of the group mean.
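The manifold visualization in (A) can be sketched with t-SNE on synthetic feature clusters. The cluster structure below is an assumption chosen to mimic distinct representations for the five sequence positions, not the participants' data.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(2)

# Five synthetic clusters standing in for the 5 sequence positions (4-1-3-2-4);
# giving IndexOP1 and IndexOP5 distinct centers mimics the contextual
# sub-clustering of the index finger keypress seen at the end of early learning
centers = 4.0 * rng.standard_normal((5, 20))
features = np.vstack([c + rng.standard_normal((30, 20)) for c in centers])
labels = np.repeat(np.arange(5), 30)

# 2-D t-SNE embedding of the keypress feature space (one point per keypress)
embedding = TSNE(n_components=2, perplexity=20.0,
                 random_state=0).fit_transform(features)
```

Each row of `embedding` is a 2-D point that would be color-coded by its keypress label when plotted, as in the top scatter plots of panel A.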

Figure 4—figure supplement 1
Quantification of trial-by-trial parcel-space feature importance scores during skill learning.

Trial-by-trial changes in parcel-space feature importance scores are shown for right superior frontal, middle frontal, pre-central, and post-central cortex (i.e. the contralateral regions showing the highest regional voxel-space decoding accuracy). Note that the feature importance is initially higher for the contralateral pre-central cortex in early trials before shifting towards the contralateral middle and superior frontal cortex during later trials, as can be seen with the divergence of line plots beginning around trial 11.

Figure 4—figure supplement 2
Trial-by-trial classification accuracy for 2-class decoder (IndexOP1 vs. IndexOP5).

Several decoders (with varying window durations aligned to the KeyDown event) were trained to differentiate between the two index finger keypresses embedded at different positions within the practiced skill sequence (IndexOP1 at ordinal position 1 vs. IndexOP5 at ordinal position 5). Decoding accuracy for the 200ms duration windows (i.e. the optimal window size for 5-class decoding of individual keypresses) progressively improves over early learning, stabilizing around 96% by trial 11 (end of early learning). Similar results were observed for all other decoding window sizes (50, 100, 150, 250, and 300ms), with overall accuracy slightly lower compared to 200ms. These findings indicate that the neural representations of the skill action are updated over early learning to incorporate sequence location information. Shaded regions indicate the 95% confidence interval of the group mean.

Figure 4—figure supplement 3
Eye movement features do not contribute to decoding.

(A) Scatter plot of gaze positions at the KeyDown event and 200ms after the KeyDown event (i.e. beginning and ending of window used for decoding keypress labels from MEG input features) from a representative participant. Transparent gray dots indicate all sampled gaze positions during practice trials. The overall mean gaze position during practice trials is indicated by the black filled circle marker. Colored right-pointing triangle markers indicate the gaze position at the KeyDown event for each ordinal position keypress (IndexOP1 – magenta; LittleOP2 – yellow; MiddleOP3 – blue; RingOP4 – green; IndexOP5 – brown), while left-pointing triangle markers indicate the gaze position 200ms after the KeyDown event. The mean gaze position for these two time points is indicated by the larger-sized triangle markers. On average, gaze position is largely fixed for the OP1 and OP3 keypresses, moves from left to right for OP2 and OP4 keypresses, and from right to left for OP5 keypresses (which is when the asterisk moves leftward from the last sequence item back to the first). (B) Confusion matrix showing that three eye movement features fail to predict asterisk position on the task display above chance levels (Fold 1 test accuracy = 0.21718; Fold 2 test accuracy = 0.22023; Fold 3 test accuracy = 0.21859; Fold 4 test accuracy = 0.22113; Fold 5 test accuracy = 0.21373; Overall cross-validated accuracy = 0.2181). Since the ordinal position of the asterisk on the display is highly correlated with the ordinal position of individual keypresses in the sequence, this analysis provides strong evidence that keypress decoding performance from MEG features is not explained by systematic relationships between finger movement behavior and eye movements (i.e. behavioral artifacts). (C) 5-class decoding of ordinal position keypress labels from eye movement recording features approached empirically determined chance levels (as shown by decoding performance after random label shuffling). 
Note that all decoding performances from eye movement data were substantially lower than MEG hybrid-space decoding for all participants. Sample distribution means are indicated by the solid blue horizontal line with the 95% confidence interval of the group mean indicated by the shaded blue rectangular box.

Figure 5 with 7 supplements
Neural representation distance between index finger keypresses performed at two different ordinal positions within a sequence.

(A) Contextualization increases over Early Learning during Day 1 Training. Online (green) and offline (purple) neural representation distances (contextualization) between two index finger keypresses performed at ordinal positions 1 and 5 of the trained sequence (4-1-3-2-4) are shown for each trial during Day 1 Training. Both online and offline contextualization between the two index finger representations increases sharply over Early Learning before stabilizing across later Day 1 Training trials. Shaded regions indicate the 95% confidence interval of the group mean. (B) Contextualization develops predominantly during rest periods (offline) on Day 1. The cumulative neural representation differences during early learning were significantly greater over rest (Offline contextualization; right) than during practice (Online contextualization; left) periods (t=4.84, p<0.001, df = 25, Cohen’s d=1.2). Center line of box plot indicates the group median, while notches represent the 95% confidence interval of the group median. Box limits indicate the 1st and 3rd quartiles while whisker lengths are set at the extreme value ≤1.5×IQR. (C) Contextualization acquired on Day 1 was retained on Day 2 specifically for the trained sequence. The neural representation differences assessed across both rest and practice for the trained sequence (4-1-3-2-4) were retained at Day 2 Retest.
This is in stark contrast with the reduction in contextualization for several untrained sequences controlling for: (1) index finger keypresses located at the same ordinal positions 1 and 5 but within a different intervening sequence pattern (Pattern Specificity Control: 4-2-3-1-4, 51.05% lower contextualization); (2) use of a finger different than the index (little or ring finger) in both ordinal positions 1 and 5 (Finger Specificity Control: 2-1-3-4-2, 1-4-2-3-1 and 2-3-1-4-2; 35.80% lower contextualization); and (3) multiple index finger keypresses occurring at ordinal positions other than 1 and 5 (Position Specificity Control: 4-2-4-3-1 and 1-4-3-4-2; 22.06% lower contextualization). Note that offline contextualization cannot be measured for the Day 2 Control sequences as each sequence was only performed over a single trial. Error bars indicate S.E.M.
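The contextualization measure reduces to a Euclidean distance between two keypress feature vectors. A minimal sketch with made-up 3-dimensional vectors follows; the study computed this distance on LDA-reduced hybrid-space MEG features.

```python
import numpy as np

def contextualization(feat_op1, feat_op5):
    """Euclidean distance between two keypress representations.

    feat_op1 / feat_op5: feature vectors (e.g. hybrid-space activity) for the
    index finger keypress at ordinal positions 1 and 5 of the sequence.
    """
    return float(np.linalg.norm(np.asarray(feat_op1) - np.asarray(feat_op5)))

# Online: distance between IndexOP1 and IndexOP5 within a practice period
online = contextualization([1.0, 2.0, 2.0], [1.0, 2.0, 4.0])
# Offline: last IndexOP5 of one practice period vs. first IndexOP1 of the next
offline = contextualization([1.0, 2.0, 4.0], [4.0, 2.0, 0.0])
```

In this toy case the offline distance exceeds the online one, echoing the pattern in panel B; the numbers themselves are arbitrary.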

Figure 5—figure supplement 1
Relationship between offline neural representational changes and micro-offline learning.

(A) Relationship between offline neural representational changes and micro-offline learning. Offline contextualization—calculated as the Euclidean distance between the neural representations observed for the first IndexOP1 keypress from practice trial n and the last IndexOP5 keypress from practice trial n-1—increased over early learning. A linear regression analysis (shown in the inset) revealed a strong temporal relationship (correlation coefficient r=0.903; variance explained R2=0.816) between contextualization and cumulative micro-offline gains over early learning. Shaded regions indicate the 95% confidence interval of the group mean. (B) Changes in offline contextualization for different decoding window durations as a function of rest breaks. We constructed decoders from different MEG input feature time windows (window durations of 50, 100, 150, 200, 250, and 300ms; all aligned to the KeyDown event) to assess the robustness of the offline contextualization finding with respect to this parameter selection. Offline contextualization showed similar trends for all options tested. (C) Relationship between offline neural representational changes and micro-offline learning across all window durations. The linear regression analysis from (A) was repeated for all contextualization measures from (B) obtained after varying the MEG input feature window size (50–300ms). This strong temporal relationship was observed for all window durations (0.598 ≤ R2 ≤ 0.816), except for 300ms (R2=0.284), where temporal overlap of individual keypress features was most prominent.

Figure 5—figure supplement 2
Trial-by-trial trends for different measurement approaches of offline and online contextualization changes.

(A) Offline contextualization between the last sequence of a preceding trial and the second sequence of the subsequent one (skipping the first sequence of that trial) rendered a result comparable to the measure reported in Figure 5 and Figure 5—figure supplement 1, which used the first sequence—a result inconsistent with a possible confounding effect of pre-planning (Ariani and Diedrichsen, 2019). Shaded regions indicate the 95% confidence interval of the group mean. (B) Two different measurement approaches were used to characterize online contextualization changes. The sequence-based approach calculated the mean distance between IndexOP1 and IndexOP5 for each correct sequence iteration within a trial (green). A second trial-based approach was also implemented, which controlled for the passage of time between observations used in both online and offline distance measures (10 s between IndexOP1 and IndexOP5 observations in both cases). Note that the trial-based approach showed no increase in online contextualization over early learning. Importantly, the overall magnitude of online contextualization by the end of early learning was similar for both measurement approaches, and both showed reduced online relative to offline contextualization. Shaded regions indicate the 95% confidence interval of the group mean.

Figure 5—figure supplement 3
Online contextualization versus micro-online learning.

The relationship between online contextualization and online learning is shown for both sequence- (A, left) and trial-based (B, right) distance measurement approaches. There was no significant relationship between online learning and online contextualization regardless of the measurement approach. Shaded regions indicate the 95% confidence interval of the group mean.

Figure 5—figure supplement 4
Within-subject correlations between online and offline contextualization changes versus learning.

Pirate plots displaying individual subject correlation coefficients for offline (i.e. over rest) and online (i.e. during practice) contextualization changes versus micro-offline and -online performance gains. Zero correlation is marked by the horizontal dashed line. Distribution means are indicated by the solid black horizontal line with the 95% confidence interval of the group mean indicated by the shaded rectangular box. Within-subject correlations were significantly greater for offline contextualization changes versus micro-offline performance gains than for online contextualization changes versus either micro-offline or -online performance gains. The average correlation between offline contextualization and micro-offline gains within individuals was significantly greater than zero (left; t=3.87, p=0.00035, df = 25, Cohen’s d=0.76) and stronger than correlations between online contextualization and either micro-online (middle; t=3.28, p=0.0015, df = 25, Cohen’s d=1.2) or micro-offline gains (right; t=3.7021, p=5.3013e-04, df = 25, Cohen’s d=0.69).

Figure 5—figure supplement 5
Online versus offline changes in keypress transition patterns.

(A) Trial-by-trial Euclidean distance between the relative shares of each keypress transition time within the full sequence duration (i.e. differences in typing rhythm). This distance was calculated between the first and last sequences of each trial (online pattern distance; green) and between the last sequence of a trial and the first sequence of the next (offline pattern distance; purple). Shaded regions indicate the 95% confidence interval of the group mean. (B) Cumulative online (green; left) and offline (purple; right) pattern distances recorded over all 45 trials covering Days 1 and 2. Note that the comparable online and offline typing rhythm changes do not explain the difference between online and offline contextualization, which is fully developed by trial 11 (Figure 5).
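The typing-rhythm distance described above can be sketched as follows. Normalizing each keypress transition time by the full sequence duration makes the measure sensitive to rhythm (relative timing) but not overall speed; the function name and inputs are illustrative assumptions.

```python
import numpy as np

def rhythm_distance(ktt_a, ktt_b):
    """Euclidean distance between the typing-rhythm vectors of two sequence
    iterations.

    ktt_a, ktt_b : the four keypress transition times (s) of one correct
    4-1-3-2-4 sequence iteration; each is divided by the full sequence
    duration so the vectors encode relative rhythm, not typing speed.
    """
    share_a = np.asarray(ktt_a, float) / np.sum(ktt_a)
    share_b = np.asarray(ktt_b, float) / np.sum(ktt_b)
    return np.linalg.norm(share_a - share_b)
```

Two iterations typed at different speeds but with the same relative timing yield a distance of zero, which is the intended property of the measure.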

Figure 5—figure supplement 6
The relationship between adjacent index finger transitions and online contextualization.

Scatter plot of the sum of adjacent index finger keypress transition times (i.e. the 4–4 transition at the conclusion of one sequence iteration plus the 4–1 transition at the beginning of the next sequence iteration) against online contextualization distances measured during practice trials. Both the keypress transition times and online contextualization scores were z-score normalized within individual subjects and then concatenated into a single data superset. A simple linear regression of the online contextualization response variable on the keypress transition time predictor showed a very weak linear relationship between the two (R2=0.00507, F[1,3202]=16.3). This result indicates that contextualization of index finger representations does not reflect the amount of temporal overlap between adjacent keypresses.
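A minimal sketch of this pooled analysis: z-score each variable within subject, concatenate across subjects, and fit a simple linear regression. The function name and the list-of-arrays input format are assumptions for illustration.

```python
import numpy as np

def pooled_regression_r2(x_by_subject, y_by_subject):
    """Within-subject z-scoring followed by a pooled simple linear regression.

    x_by_subject, y_by_subject : lists of per-subject 1-D arrays
    (hypothetical keypress transition times and contextualization scores).
    Returns the pooled R-squared.
    """
    z = lambda v: (v - v.mean()) / v.std()
    x = np.concatenate([z(np.asarray(v, float)) for v in x_by_subject])
    y = np.concatenate([z(np.asarray(v, float)) for v in y_by_subject])
    slope, intercept = np.polyfit(x, y, 1)        # least-squares line
    resid = y - (slope * x + intercept)
    return 1.0 - resid.var() / y.var()            # variance explained
```

Within-subject normalization before pooling removes between-subject offsets so that the pooled R2 reflects trial-level covariation only.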

Figure 5—figure supplement 7
Between-subject differences in typing speed versus online contextualization.

(A) Between-subject relationship between plateau performance speed and online contextualization. The plateau performance typing speed showed no significant relationship with the degree of online contextualization (R2=0.028, p=0.41). Each dot represents the maximum speed attained and the corresponding degree of contextualization of each participant. Thus, the magnitude of online contextualization was not dependent on how fast individuals could perform the task at the end of early learning. (B) Trial-by-trial relationship between typing speed and degree of online contextualization. We also performed a trial-by-trial regression analysis that related the degree of online contextualization for each trial with the median typing speed for that trial. The R2 values obtained for regression analyses performed on individual trials were also low and not statistically significant (mean R2=0.06; p>0.05). Red and black horizontal lines indicate the group median and mean R2 values, respectively.

Author response image 1
Matrix rank computed for whole-brain parcel- and voxel-space time-series in individual subjects across the training run.

The results indicate that whole-brain parcel-space input features are full rank (rank = 148) for all participants (i.e. no parcel time-series is a linear combination of the others). The matrix rank of voxel-space input features (rank = 267 ± 17 SD), on the other hand, approached the number of usable MEG sensor channels (n = 272). Although not full rank, the voxel-space rank exceeded the parcel-space rank for all participants. Thus, some voxel-space features provide additional orthogonal information beyond the representations at the parcel-space scale. An expression of this is shown in the correlation distribution between parcel and constituent voxel time-series in Figure 2—figure supplement 2.
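The rank check can be sketched as below. Rows are regions and columns are time samples; the random mixing matrices and sample counts are hypothetical stand-ins (148 parcels, 272 usable sensor channels), not the study's actual source-localized data.

```python
import numpy as np

rng = np.random.default_rng(0)
sensors = rng.standard_normal((272, 2000))           # sensor-space time-series
parcels = rng.standard_normal((148, 272)) @ sensors  # 148 parcel mixtures
voxels = rng.standard_normal((1000, 272)) @ sensors  # many voxel mixtures

parcel_rank = np.linalg.matrix_rank(parcels)  # full rank: 148
voxel_rank = np.linalg.matrix_rank(voxels)    # capped by the 272 channels
```

Because every source-space time-series is a linear mixture of the sensor channels, voxel-space rank can never exceed the number of usable sensors, which is why the observed voxel ranks plateau near 272.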

Additional files


Cite this article

  1. Debadatta Dash
  2. Fumiaki Iwane
  3. William Hayward
  4. Roberto F Salamanca-Giron
  5. Marlene Bönstrup
  6. Ethan R Buch
  7. Leonardo G Cohen
(2025)
Sequence action representations contextualize during early skill learning
eLife 13:RP102475.
https://doi.org/10.7554/eLife.102475.4