Abstract
Decades of research have shown that early ramping activity is a very reliable antecedent of self-initiated movement. However, the dominant paradigm has been to only analyze data epochs culminating in movement, which discounts how often slow ramping might occur when no movement follows. To address this, we introduced a matched control condition that did not culminate in movement. We recorded electroencephalography data and compared these two conditions using a variety of powerful machine-learning classifiers. Although early ramping activity was present, it did not predict movement. Instead, classification accuracy rose abruptly about 100 ms before movement. By contrast, the traditional approach reproduced the spurious impression of early predictability. Our results resolve a long-standing controversy, showing that the neural commitment to act is a late-stage event consistent with the timing of conscious intention and the ability to inhibit movement. More broadly, our results show that a reliable antecedent is not necessarily a good predictor, underscoring the need for proper control conditions in time series analyses.
Introduction
The neuroscience of self-initiated movements
Living organisms constantly initiate movements to interact with their environment, sometimes in response to an external stimulus, and other times through an internally generated process, as in the case of self-initiated voluntary movements. Despite decades of research, significant issues persist in how best to pursue fundamental questions regarding the neural basis of self-initiated voluntary movements. Here, we address issues concerning the time course of neural markers that lead up to self-initiated actions. Specifically, we expose the drawbacks of time-locked analyses and provide a solution to those drawbacks through the use of a control condition when investigating the neural antecedents of self-initiated movements. Importantly, we show that a reliable neural antecedent (in time-locked analyses) is not necessarily a good predictor of impending movement.
Modern neuroscientific research in this area began in the 1960s with the seminal work of Kornhuber and Deecke (1965), who experimented with allowing participants to perform abrupt voluntary movements with no temporal cue telling them when to move, while recording electroencephalography (EEG). This paradigm led to the discovery of the Bereitschaftspotential, or readiness potential (RP), a slow buildup of neural activity in motor areas that may begin up to one full second or more before the onset of movement. This pre-movement buildup of neuronal activity has long been assumed to reflect planning and preparation for movement (Kornhuber and Deecke, 1990) and has been observed with EEG (Kornhuber and Deecke, 1965; Libet et al., 1983), magnetoencephalography (MEG) (Deecke et al., 1982; Pedersen et al., 1998; Erdler et al., 2000), and single-unit recordings in humans (Fried et al., 2011) as well as in other vertebrate (Romo and Schultz, 1987, 1992; Hyland, 1998; Lee and Assad, 2003; Seki et al., 2005; Maimon and Assad, 2006; Isomura et al., 2009) and even invertebrate (Kagaya and Takahata, 2010) species. Besides the RP, suppression of Mu (7-15 Hz) and Beta band (15-30 Hz) power, called event-related desynchronization (ERD), has also been identified prior to self-initiated movements (Pfurtscheller and Aranibar, 1979; Pfurtscheller and Berghold, 1989).
While these signals often appear to begin more than a second before movement onset, how early a commitment to move occurs remains a central question in our understanding of self-initiated movements. Answers to this question fall under one of two main categories: the early-decision account or the late-decision account (Schurger et al., 2021). According to the early-decision account, movement decisions (i.e. commitment to a course of action) happen early at the neural level, perhaps up to a second or more before they are executed. Proponents of the early-decision account have taken early brain activity, like the RP, to support the thesis that decisions happen early in the brain, taking these early signals to reflect a cascading process that brings about the forthcoming action. Conversely, late-decision accounts propose that decisions occur late at the neural level, perhaps even simultaneously with motor execution (Schurger et al., 2021). According to this account, early activity, like the RP, would be antecedent to the decision rather than a consequence of the decision.
While the data indicates the presence of early signals, it is not clear whether these signals reflect a commitment to move since people can inhibit self-initiated movements up to 200 ms in advance (Schultze-Kraft et al., 2016). Furthermore, the early-decision account presents two main concerns: there are no current models or mechanisms proposed to justify such an early and variable neural commitment and this early commitment does not align with participants’ reported subjective experience (Libet et al., 1983).
As first proposed by Eccles (1985), conditioning one’s analysis on the onset of a movement might reveal fluctuations in the brain activity that favor the occurrence of a movement without necessarily reflecting an early commitment for said movement. Within the past decade, two late-decision accounts have been put forth, both addressing these shortcomings of the early-decision account. The leaky-stochastic-accumulator (LSA) model proposes that early activity, like the RP, reflects a partly stochastic buildup of activity that only triggers a decision upon crossing a threshold (Schurger et al., 2012), which is proposed to occur within 150 ms of movement onset. The slow-cortical potential (SCP) sampling hypothesis proposes that self-initiated movements are more likely to occur during the negative-going phase of slow cortical potentials in motor areas (Schmidt et al., 2016). Both models are arguably more parsimonious than early-decision interpretations, as they articulate a means both for how the decision process unfolds as well as for how the RP comes about, and they align with participants’ reports of the timing of their decisions, which are typically relatively close to movement onset (Libet et al., 1983; Braun et al., 2021).
Here we show that early ramping signals, like the RP, arise specifically because the analysis of the neural activity is either time-locked to movement or the data are compared to an earlier window of time within the same epoch. We argue that these early signals only reflect an uncommitted preparatory state, not the decision itself. Furthermore, we aim to expose these problems while proposing a strategy for addressing them. We show that, when accounting for both the time locking and comparison to an early window of time, the data better aligns with late-decision accounts, indicating that the final neural commitment to initiate movement likely arises very close to the time of movement onset, in the context of self-initiated movements that are not pre-planned.
The problem with time-locked analyses
In a typical experiment that focuses on self-initiated movement, the precise times at which movements are generated are unconstrained, and the analyses are performed on data epochs aligned to the time of movement onset. Hence, research on self-initiated movement has relied heavily on the paradigm of movement-locked averaging.
A major pitfall of that approach is that, when using time-locking to analyze what happens prior to an event, non-specific early signals can be revealed, even if these signals play a limited or non-causal role. For instance, if you looked at weight gain prior to someone deciding to go on a diet, you might see that on average, diets are preceded by a period of weight gain. However, while gaining weight might influence someone’s decision to go on a diet, the time of the weight gain does not necessarily reflect the time of their decision, i.e. their decision to go on a diet was not made the moment they started gaining weight. There could be myriad other causes, such as a doctor’s advice, season of the year, or a friend’s recommendation. Thus, while you would very robustly find this early signal (weight gain) upon time-locking and averaging across multiple instances of people going on a diet, its use in predicting the timing of a person deciding to go on a diet might be very limited.
The problem with comparing to an early time window
As explained above, the standard approach cannot tease apart late- and early-decision accounts as it typically time-locks the analysis to the movement, thereby conditioning the signal on an event occurring in the future. One way to tease them apart would be to train a classifier to predict whether a self-initiated movement is about to occur, looking only at past brain signals. The early-decision account, which postulates that the commitment to move is made well before movement onset, would predict that such classification would be possible well in advance. Late-decision accounts would postulate that classification is possible only close in time to movement onset. Such studies have been carried out with EEG (Bai et al., 2011; Lew et al., 2012; Abou Zeid and Chau, 2015) and intracranial recordings (Fried et al., 2011). However, while their results were varied, all of these studies fell victim to another pitfall of the standard approach: the lack of a proper control condition. The standard approach compares the signal immediately preceding the event (movement onset) to a signal taken earlier in time in relation to the event (baseline period). These two classes are not temporally independent, since the earlier window always precedes the movement, and by a fixed amount of time. We refer to this method as the time-based approach in the rest of this paper. We show here that such a choice of control condition can lead to spuriously high classification performance when applied to autocorrelated signals (Figures 3A and 4). This pitfall is common to all autocorrelated signals, regardless of the recording modality. Slightly different approaches have been proposed for human fMRI (Soon et al., 2008) and mice widefield recording (Mitelut et al., 2022), but they also fail to select an independent control condition (see Discussion, The need for a control condition in self-initiated movement studies).
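The pitfall can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the simulation used in this study: the leaky noise integrator, its parameters (`leak`, `noise`, `threshold`), the 20 ms sample spacing, and the use of a window mean as the only feature are all hypothetical choices made for illustration. The traces contain no early commitment, only autocorrelated noise whose first threshold crossing defines the "movement". Comparing the last 0.5 s before the crossing with a window 3 s to 2.5 s earlier nonetheless gives an AUC well above chance.

```python
import random
from statistics import mean

def epoch_before_crossing(rng, n_pre, threshold, leak, noise):
    """Integrate leaky noise (no decision signal) and return the n_pre samples
    ending at the first threshold crossing, i.e. an event-locked epoch."""
    while True:
        x, trace = 0.0, []
        while x < threshold:
            x += -leak * x + noise * rng.gauss(0.0, 1.0)
            trace.append(x)
        if len(trace) >= n_pre:  # discard runs too short to fill the epoch
            return trace[-n_pre:]

def auc(pos, neg):
    """Area under the ROC curve as the probability that a positive score
    exceeds a negative score (ties count one half)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def time_based_auc(n_epochs=200, seed=0):
    """AUC for late versus early windows taken from the same epochs."""
    rng = random.Random(seed)
    # Hypothetical settings: 150 samples of 20 ms each (3 s), 25-sample windows.
    epochs = [epoch_before_crossing(rng, 150, 0.8, 0.05, 0.1)
              for _ in range(n_epochs)]
    early = [mean(e[:25]) for e in epochs]   # -3.0 s to -2.5 s
    late = [mean(e[-25:]) for e in epochs]   # -0.5 s to 0 s
    return auc(late, early)

print(round(time_based_auc(), 2))
```

Because both windows come from the same event-locked epoch, the late window is biased toward the threshold by construction, which is what makes the two classes separable even though nothing in the generator commits to an event in advance.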
To address this issue, we use a control condition matched for delay and anticipation but not movement. Our participants watch a slideshow of nature images. The slideshow advances to the next image either by the participant’s pressing of a button (the “active” condition), or automatically after a period of time drawn from the participant’s own waiting time distribution in the movement condition (the “passive” condition; see Materials and Methods, Procedure). By having sensory-event-locked epochs that either terminate with a movement (active) or terminate without a movement (passive), but that are well matched in terms of mean viewing time and anticipation for a change of visual stimuli, we provide the control condition that is commonly lacking. We refer to this method as the task-based approach in the rest of this paper.
Note that one could always create a more closely matched control condition (see Discussion, Limitations). However, it suffices to show that there exists a control condition that is matched well enough such that early classification fails, even though there are early signals (similar to the RP), and those signals can be picked up (erroneously) using time-based decoding.
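One way to picture how passive delays can be matched to a participant's own behavior is sketched below. It is an illustration only: the class name `PassiveDelaySampler`, the fallback `default_delay`, and the choice to resample uniformly from the waiting times recorded so far are assumptions, not details of the actual procedure (see Materials and Methods, Procedure).

```python
import random

class PassiveDelaySampler:
    """Draws passive-trial delays from the active-trial waiting times seen so far."""

    def __init__(self, default_delay=5.0, seed=None):
        self.waiting_times = []             # waiting times (s) from active trials
        self.default_delay = default_delay  # hypothetical fallback before any active trial
        self.rng = random.Random(seed)

    def record_active(self, waiting_time):
        """Store the waiting time of a completed active trial."""
        self.waiting_times.append(waiting_time)

    def next_passive_delay(self):
        """Resample one recorded waiting time (empirical distribution)."""
        if not self.waiting_times:
            return self.default_delay
        return self.rng.choice(self.waiting_times)

sampler = PassiveDelaySampler(seed=1)
for wt in (3.2, 4.1, 7.5):
    sampler.record_active(wt)
print(sampler.next_passive_delay())
```

Resampling recorded values keeps the passive delays inside the range of the participant's own waiting times, so the two conditions converge on similar viewing-time distributions as trials accumulate.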
A new approach to self-initiated movement analysis
This study investigates which of the early or late accounts is best supported by EEG activity in a self-initiated action paradigm. Improving on the traditional self-paced movement task, and to enhance ecological validity, our participants advanced through a slideshow of nature images by either making self-initiated button presses (active trials) or simply waiting for the slide to advance automatically after a delay (passive trials) drawn randomly from the participant’s own continuously-updated distribution of viewing times on active trials (see Materials and Methods, Procedure and Figure 1). Thus, by the end of the experiment, the two sets of trials (passive and active) had a similar distribution in terms of mean viewing time and were well matched for anticipation of a slide transition (see Discussion, Limitations), with one key difference being the fact of having initiated a movement to trigger the event, with all the anticipations and expectations that come with it. We used several powerful machine learning techniques on held-out participants and applied the best performing approach to our main set of participants to ask how early in time EEG measurements of brain activity indicate that a movement is about to occur by trying to classify movement epochs from no-movement epochs in a sliding window at the single trial level (see Materials and Methods, Machine Learning). We found that the time course of classification performance (measured as area under the receiver operating characteristic curve, or AUC) was compatible with late-decision accounts of voluntary movement and incompatible with the presence of early signals indicative of a commitment to move - from the vantage point of cortical activity.

Example sequence of stimuli and button presses in the experiment.
Participants viewed images of nature. In manual trials, they watched the image until they pressed a button to advance to the next image. In automatic trials, they viewed the image until it automatically advanced to the next image. Before each image, for one second, a word prompt instructed them whether this was a manual or an automatic trial, followed by a 0.25 s buffer. A color-coded fixation cross was also present afterwards, overlaid on the image. In manual trials, participants were instructed to wait for at least roughly 3 seconds before pressing; however, they were instructed not to count time in their heads.
In this context, mapping the time course of neural activity predictive of impending movement can be framed as a classification problem in machine learning: measuring how classifiable active and passive epochs are from one another at different points in time. We postulate that the ability to discriminate active versus passive epochs should correlate with the presence of a decision- or movement-specific process, as the critical difference between these two types of epochs is the fact of having performed a movement. Note that we readily concede that the two conditions are not matched in every respect (see Discussion, Limitations). However, rendering the two conditions less well-matched can only serve to increase classification AUC, not the other way around.
This approach depends on there being sufficient data and signal strength. Upon analysis of data previously collected (in 2012 in a different lab; hereafter referred to as held-out data) on the same task, we determined that 350 trials per participant were enough to produce a robust effect (appendix figure S9). We therefore opted to collect many trials per participant (350 to 1400; collected over multiple sessions) to assess our research questions. The data were collected in 2020 and 2023 (yielding similar results when grouped per year; appendix figure S13). It is also worth noting that while the previously collected held-out data (2012) was used to refine the task and analysis and is therefore not included in our main statistical claim here, the results were the same in both the held-out and main data, despite differences in equipment, location, research team and language.
To further maximize our ability to detect brain activity predictive of upcoming movement in EEG data, we use an ensemble machine-learning algorithm called adaptive boosting, also known as AdaBoost (Schapire and Freund, 2012), together with custom-designed features. For our features, we applied sets of Haar wavelets to map out a wide array of spatial-, time- and time-frequency-domain patterns that could contain discriminating information between active and passive trials. We refer to our approach as Haar-AdaBoost in the rest of this paper. This approach is described in more detail in the Materials and Methods section. AdaBoost has the advantage that it can handle many features without a significant impact on its performance (see chapter 5 of Schapire and Freund, 2012), allowing us, with our custom-made features, to systematically capture every event in the time and time-frequency domains, at every channel and in all pairwise differences between channels. Haar-AdaBoost also has the notable advantage of being highly interpretable (see Materials and Methods, Machine Learning) and having the mathematically provable guarantee that, given sufficient data and weak classifiers that reliably classify better than a random guess, a very powerful aggregate classifier with arbitrarily high accuracy can be achieved (Schapire and Freund, 2012). To ensure that our algorithm is best equipped to detect any differences, we benchmarked it against algorithms that are traditionally used in the field of neuroscience and found that, at movement onset, our approach outperformed all the algorithms we tested. Further, at classification peaks (after movement onset), our algorithm performed with very high area under the curve (AUC) and tapped into features typically associated with self-initiated movement (contralateral channels over motor cortices, Mu rhythms, and Beta band desynchronization; Pfurtscheller, 1997).
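To give a sense of what a Haar-type feature measures, the sketch below computes simple Haar-like contrasts on a single-channel window: at each dyadic scale, the mean of the second half of a segment minus the mean of its first half. It is an illustrative assumption rather than the feature set used here, which is described in Materials and Methods; the function name `haar_features`, the half-scale stride, and the unnormalized contrast are hypothetical choices.

```python
def haar_features(signal, min_scale=2):
    """Return {(scale, start): contrast} for dyadic Haar-like contrasts.

    Each contrast is mean(second half) - mean(first half) of the segment
    signal[start:start + scale]; segments advance by half a scale.
    """
    feats = {}
    n = len(signal)
    scale = min_scale
    while scale <= n:
        half = scale // 2
        for start in range(0, n - scale + 1, half):
            first = sum(signal[start:start + half]) / half
            second = sum(signal[start + half:start + scale]) / half
            feats[(scale, start)] = second - first
        scale *= 2
    return feats

# A step from 0 to 1 is picked up at both the fine and the coarse scale.
print(haar_features([0, 0, 1, 1]))
```

In a boosting setting, each such contrast could be thresholded to form one weak classifier, which is one reason step-like and ramp-like changes at many time scales can be covered by a single large feature bank.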
Late-decision accounts would predict that using a matched control, early classification is not possible despite the presence of early signals. Early-decision models, on the contrary, would predict that early classification, in virtue of these early signals, would be successful. Despite using a brute force ensemble algorithm and systematically investigating a wide array of frequencies and temporal differences and a large corpus of data across multiple locations (see Results and Supporting Figures), we found little to no early predictive activity. While no method can definitively prove a lack of information based on chance-level classification, our approach casts significant doubt, especially because it is able to classify extremely well just before, at, and just after the time of movement onset, but not earlier.
While very strong at movement onset, the classifier’s performance for the task-based approach dropped abruptly to near chance by 0.1 s before movement onset. In light of the improved control provided by our paradigm, this pattern of results indicates that the neural decision to initiate movement occurs much later than suggested by the movement-locked averaging approach.
We also compared the signal to an earlier time period in the same epoch, using the time-based approach, as is standard in the field (Bai et al., 2011; Fried et al., 2011; Lew et al., 2012; Abou Zeid and Chau, 2015). Upon doing so, we found an exceedingly early onset of supposed high discriminability. Not only was this the case with our EEG data, but it was also corroborated using simulated data that was specifically constructed to contain no signal. This reinforces our claim that the field should move away from considering an early time window as a control for the signal that directly precedes the movement, and should instead strive to include well-matched no-movement data in control conditions interleaved with self-initiated movements. It also reassures us that our algorithm is sensitive to discriminable activity even well in advance of movement onset.
Our results highlight two main caveats in the neuroscience of volition. First, by showing that the task-based approach cannot decode a commitment to move as early as standard early signals obtained by movement-locked averaging, we expose the bias inherent in movement-locked averaging. Second, by showing that the task-based approach cannot decode as early as the time-based approach, we expose the importance of having a proper no-movement control condition. By addressing these caveats, we show that the data align with late-decision accounts (Schurger et al., 2012; Schmidt et al., 2016), with participants’ subjective reports (Libet et al., 1983), and with folk intuitions on self-initiated actions (Kozuch and Nichols, 2011).
Results
We recorded EEG data at Chapman University in Orange, California (OC) in 2020 and 2023 from 15 participants. We used M/EEG data from 3 participants previously recorded in a different lab (Neurospin Research Center near Paris, France in 2012; PF) but using the exact same paradigm, as a model selection cohort, referred to as held-out data. This held-out data was used to select the hyperparameters and evaluate our approach (see Materials and Methods, Parameters and algorithm selection). In this section, we describe the analyses we performed on these data, notably machine-learning analyses applied in a sliding window, to map out the time course of neural activity predictive of impending movement.
Note that because we had large effects at the single participant level in our held-out data, we opted for a small sample size with a large number of data points per participant. Additionally, note that EEG data from PF, the held-out set, was used in an initial investigation to assess optimal parameters for analysis. Data from OC and the MEG data from PF were left out of any parametrization step; hence, all hyperparameters in this study have been selected based on PF’s EEG data. EEG and MEG data from PF align with the results displayed in this section (see appendix for PF results; appendix figures S1 and S2).
Traditional time-locked analyses reveal an early signal
To confirm that our data revealed the standard early signal, a pre-movement buildup that reliably precedes self-initiated movement (Kornhuber and Deecke, 1990), we first examined the event-locked averages of the EEG data from OC at electrode C3, which had the earliest onset in our held-out data, generally referred to as a movement-locked cortical potential or movement-related cortical potential (MRCP; see Materials and Methods, Event-related potentials). While our data reflects slow ramping activity over motor cortices typical of self-initiated actions (see Figure 2A and appendix figure S3 for individual MRCP), it is not a readiness potential per se. Traditionally the RP requires full spontaneity of movement initiation without external stimuli, whereas here our participants acted whenever they wanted to advance to the next slide. A similar paradigm has been used recently to study self-initiated actions (Hussain et al., 2022) and spontaneous perceptions (Baror et al., 2024). However, we obtained the very same results in data from an additional 3 participants who performed a traditional RP spontaneous movement task (appendix figure S4).

Traditional electrophysiological and behavioral features of self-initiated actions.
(A) Grand average (n=15) movement-related cortical potential at C3 for active (blue) and passive (red) trials aligned to slide transition (t=0; dotted gray line). The three gray dashed vertical lines indicate the time of onset of the MRCP signal for active trials according to three onset-identification methods from the RP literature. The shaded area represents the standard error of the mean. A topography (bottom left) shows the grand average activity in the 0.5 s right before slide transition (blue is more negative). (B) Grand average (n=15) lateralized readiness potential taken as the difference wave between electrodes C3 and C4 for active (blue) and passive (red) trials aligned to slide transition (t=0; dotted gray line). Note that all movements were right index finger button presses. The shaded area represents the standard error of the mean. (C) Grand average (n=15) power spectrum at C3 for active trials aligned to slide transition (t=0; dotted gray line) and normalized using data from -3 to -2.5 s; this window was chosen to avoid the inclusion of edge artifacts in the baseline. Color represents power, with blue indicating a decrease in power relative to baseline while yellow indicates an increase. No cluster survived significance testing with cluster correction (cluster-level p<0.05; pixel-level p<0.01; see Materials and Methods, Event-related desynchronization). (D) Waiting time distributions in the active trials (i.e., how long people waited before advancing to the next slide). Each shade of grey represents a different participant.
We also captured the lateralization of the potential (Figure 2B) as well as Mu rhythms and Beta desynchronization (Figure 2C) typically associated with movement initiation. Further, the long tail of the waiting time distribution that we observed (Figure 2D) is also typical of self-initiated movement tasks (Schurger et al., 2012). Finally, we used three standard methods to identify the onset of the RP (for a complete description of the methods, see Verbaarschot et al., 2015). Each yielded onsets earlier than a half second: the RP onset-by-eye method yields -1.13 s, the RP90% -0.77 s, and the t-test method -0.53 s (Figure 2A).
The last 0.5 s of the motor-related cortical potential before movement was significantly more negative than baseline (M = 1.60, SD = 2.52; t(14) = 2.46, p < .05, comparing the mean signal from -4 s to -3 s to the mean signal over the last 0.5 s before slide transition on active trials), in line with previous results for self-initiated actions (Libet et al., 1983; Dominik et al., 2024). However, the lateralized readiness potential (LRP), a standard measure of hemispheric lateralization of brain activity prior to a movement (see Materials and Methods, Event-related potentials), although visually different, was not significantly different (M = -0.27, SD = 4.85; t(14) = -0.21, p = 0.83, comparing LRP amplitude during baseline, -4 s to -3 s, to the amplitude during the 0.5 s window before slide transition on active trials). Post-hoc, cluster-corrected, directional paired-samples tests did not yield any significant cluster of difference between the two conditions’ LRP over the duration of the epoch. Note that we chose to represent the LRP using electrodes C3 and C4 because we used electrode C3 for our movement-related cortical potential.
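As a quick check on how the reported statistic relates to the summary values, and assuming that M and SD denote the mean and standard deviation of the per-participant paired differences (n = 15), the paired-samples t statistic for the movement-related cortical potential can be recovered as follows:

```latex
t(n-1) \;=\; \frac{M}{SD/\sqrt{n}}
\;=\; \frac{1.60}{2.52/\sqrt{15}}
\;\approx\; \frac{1.60}{0.651}
\;\approx\; 2.46,
\qquad n - 1 = 14 .
```

The same relation applied to the other paired comparisons reproduces their reported t values up to rounding of M and SD.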
Not only does the event-related potential (ERP), i.e. the signal locked to the event, begin long before slide transition on active trials (Figure 2A), but the event-related desynchronizations also appear to begin early (Figure 2C), although no clusters survived cluster correction (see Materials and Methods, Event-related desynchronization). The topography is also most negative over central areas, typical of self-initiated or self-timed movement-initiation paradigms (Dominik et al., 2024). Taken together with the signal’s lateralization (Figure 2B), these data display typical features found in self-initiated movement paradigms.
It is also worth noting that there is some sign of an early buildup even in passive trials (Figure 2A): using the RP-onset t-test method applied to the passive trials, we find an onset at -0.22 s relative to slide transition, and using the onset-by-eye method, we find it to be -1.01 s. Although this might seem surprising given that slide transitions came randomly, untriggered by the participants, it is a feature found in anticipation-related paradigms, where slow signals can typically be seen when anticipating an upcoming stimulus (Brunia and Damen, 1988; van Boxtel and Böcker, 2004; Garipelli et al., 2013). However, while there might be a slight anticipatory signal, there is no clear lateralization of the EEG signal (Figure 2B) before passive slide transitions. Our machine learning analysis could pick up on lateralization because the features included not only the signal at each channel, but also every possible pairwise difference between channels.
Highly predictive performance from the EEG starts close to movement onset
Two downsides of event-locked averaging are first, that it does not guarantee that the signal reliably occurs prior to movement, simply that it occurs in enough trials with enough magnitude to show up in the average, and second, that it discounts the possibility of the signal happening even in the absence of movement, and thus does not allow for the estimation of the false-positive rate. This is important given that late-decision accounts can be perfectly compatible with the existence of early brain signals that are related to the decision-making process, as long as they do not systematically and necessarily predict movement onset. Machine-learning-based analyses, on the other hand, offer a better insight into this question; if a signal is not a good predictor of an upcoming movement, then it is no longer a good candidate for a commitment-to-move signal in early-decision accounts.
Using a sliding-window approach (with time aligned to the leading edge of the window), we constructed the time course of the classifier’s performance discriminating trials in which participants initiated a movement to advance to the next slide (active) from trials in which the slide advanced automatically (passive) (see Materials and Methods, Procedure). We found that while a negative trend in the scalp electrical potential over the motor cortices was present very early in time (Figure 2A), we could not decode with above-chance AUC until very close to movement onset (Figure 3A). Even when using the MEG data from the held-out data from PF, despite its higher spatial resolution and high discriminability post-movement, passive versus active epochs could only be classified very close to movement onset (Figure 3A). Furthermore, across 15 participants, the grand average AUC for EEG (OC) rose to 0.87 immediately after movement onset, indicating a strong-performing model, with AUC for 12 of the 15 participants reaching peaks near or above 0.90 (appendix figure S5). Note that the performance of the classifier just after movement onset cannot be tied to the visual event (disappearance of the current slide) because this event is common to both conditions.

Model performance for the task- and time-based methods.
(A) Grand average (n=15) time course of the validation AUC (10-fold) using the task-based (blue) and time-based (green) method. The task-based method with the MEG data from the parametrization set (purple) show near ceiling performance post-slide transition. X-axis marks the leading edge (in seconds) of the sliding window aligned to slide transition (t=0; dotted gray line). Thin gray dashed vertical lines indicate onset times labeled with the corresponding onset-method for the task-based approach; these onsets are determined with respect to a baseline AUC taken from -2.5 s to -2 s. This different baseline was selected as the earliest available 0.5 s AUC period. Shaded areas represent the standard error of the mean. (B) Boxplot of earliest decoding times (EDT) across participants (n=15) for both the time-based (green) and task-based methods (blue). Y-axis shows time aligned to slide transition (t = 0;dashed gray line). Participants’ mean single trial derived EDT (earliest correct classification of each trial) are displayed on the left, AUC derived EDT (earliest time the lower AUC error bound remains above 0.5) on the right (for a description of the two methods see Materials and Methods, Model evaluation). Red crosses indicate outliers. Red dotted horizontal lines indicate the MRCP onsets for each method (see Figure 2A). Note that in panel A, AUC for the task-based method seems to onset much earlier (0.44 s before with the onset-by-eye method) than the median earliest decoding (panel B, -0.085 s for the single trial method and +0.02 s for the AUC method). This is in part due to A showing an average of time courses, with the early rise in the average being driven by outliers, and in part due to the different methodologies used to derive each metric. The same data without outliers is shown in the appendix (see appendix figure S7).
We used the same three standard methods for estimating the onset of above-chance AUC that we used for estimating the onset time of the MRCP (see Materials and Methods, Event-related potentials). We found much later onsets than for MRCPs: -0.44 s for the onset-by-eye method, -0.34 s for the 90% method and -0.10 s for the t-test method. What is more, when looking at participants individually, the onset of above-chance AUC comes even later (appendix figure S5). For instance, the first timepoints at which the lower bound of the standard error of the AUC, at the participant level, exceeds and stays above 0.5 had a median value of 0.02 s after the button press (Figure 3B, right panel), while the median earliest time at which trials started and kept being correctly classified was -0.085 s relative to button press (Figure 3B, left panel). Not only did the model performance start to rise only just before movement onset, but it also coincided with the timing at which the model started focusing on relevant features (appendix figure S6).
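The AUC-derived earliest decoding time described above (the first timepoint at which the lower error bound of the AUC exceeds 0.5 and stays above it) can be sketched as follows. This is a minimal illustration, assuming the AUC time course and its standard error are available as plain lists; the function name and the example values are hypothetical.

```python
def auc_derived_edt(times, auc_mean, auc_sem, chance=0.5):
    """Earliest time from which (AUC - SEM) stays above chance until the end.

    Returns None if the lower bound is not above chance at the final sample.
    """
    edt = None
    for t, mean, sem in zip(times, auc_mean, auc_sem):
        if mean - sem > chance:
            if edt is None:
                edt = t      # candidate onset: first sample of the current run
        else:
            edt = None       # the run was broken, so discard the candidate
    return edt


# Hypothetical time course: a brief early excursion must not count as the onset.
times = [-0.3, -0.2, -0.1, 0.0, 0.1]
auc_mean = [0.50, 0.60, 0.50, 0.70, 0.90]
auc_sem = [0.05, 0.05, 0.05, 0.05, 0.05]
print(auc_derived_edt(times, auc_mean, auc_sem))  # 0.0
```

The same run-based logic applies to the single-trial measure if the input is, per trial, whether the classification at each window position was correct.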
Using a Bayesian paired-samples t-test, we found decisive evidence for a difference between AUC onsets and the MRCP onsets-by-eye (BF=190, posterior median effect size δ=1.18, with a 95% credible interval of [0.50, 1.91]). This difference occurred in the expected direction with MRCP onsets occurring earlier than AUC onsets (M=-0.87 SD=0.66). This strong level of evidence confirms that our sample size of N=15 is more than sufficient to detect our effects of interest, and also supports our assertion that reliable antecedents are not necessarily good predictors.
Traditional comparison against an early time window spuriously supports early-decision accounts
We now focus our attention on the time-based method used in prior studies (e.g. Bai et al., 2011; Lew et al., 2012; Abou Zeid and Chau, 2015; Fried et al., 2011). The time-based method consists of classifying activity within each temporal offset of a time window against activity from a fixed early time window, using machine learning (see Materials and Methods, Machine Learning). For example, we treat the window positioned at -3 s to -2.5 s as the negative exemplar; then, to determine classification performance 0.32 s before movement, we use the window from -0.82 s to -0.32 s as the positive exemplar. We repeat this process for every position of the positive exemplar window in steps of 0.02 s while keeping the negative exemplars’ window fixed, mapping out a time course of the AUC. Our data show that the MRCP right before movement is significantly more extreme (negative in this case) than within the early reference time window (see Results, Traditional time-locked analyses reveal an early signal). We also find that later windows are highly classifiable when compared to the reference window (-3 s to -2.5 s with respect to movement onset; Figure 3A and B). Note that for classification, the reference time window used in the time-based approach came just after the window used for drift correction; this was done to avoid any overlap of the two. The earliest decodable times for the time-based approach were also earlier than those of the task-based approach by about 1 full second using the single-trial-derived EDT (M = 0.97, SD = 0.30; t(14) = 12.53, p < 0.001) and by almost 2 seconds using the AUC-derived EDT (M = 1.98, SD = 0.45; t(14) = 17.13, p < 0.001). This (exceedingly) high and early discriminability also attests to the capability of Haar-AdaBoost to pick up on slow ramp-like changes well in advance of movement onset.
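The window bookkeeping of the time-based method can be sketched as below. Times are in integer milliseconds to avoid floating-point drift. The reference window, window width, and step come from the text; the choice to start the positive windows one step after the reference window and to stop at +0.5 s (the end of the epoch) is an assumption made for illustration.

```python
def time_based_windows(ref_ms=(-3000, -2500), width_ms=500, step_ms=20,
                       last_edge_ms=500):
    """List (leading_edge, (start, end)) for each positive-exemplar window.

    The negative exemplar is always the fixed reference window `ref_ms`;
    only the positive window slides.
    """
    windows = []
    edge = ref_ms[1] + step_ms          # assumed first position (see lead-in)
    while edge <= last_edge_ms:
        windows.append((edge, (edge - width_ms, edge)))
        edge += step_ms
    return windows


windows = dict(time_based_windows())
# Positive window used to score performance 0.32 s before movement:
print(windows[-320])  # (-820, -320)
```

Each positive window would then be classified against the fixed reference window, and the resulting AUC plotted at the window's leading edge.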
It is important to note that this high discriminability is a feature of any autocorrelated time series. Autocorrelated signals can lead to spurious high-performance classification using the time-based approach. If there are autocorrelated fluctuations, then two time points near one another are not independent. Therefore, a difference is bound to be revealed when comparing the activity right before movement to a reference time window that temporally precedes it. Over time and on average, in a signal dominated by low-frequency fluctuations, the more the distance increases between two timepoints, one of which is near a crest, the more their values diverge, even if the signal is stochastic. To show that this is the case, we generated two simulated datasets. In our first dataset, we used the parameters reported by Schurger et al. (2012) to generate autocorrelated noise with a drift aligned to a threshold crossing, as hypothesized in one of the late-decision accounts (Figure 4A). In a second dataset we simply generated pink noise (see Materials and Methods, Simulated data; Figure 4B).
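The divergence argument can be illustrated with a toy simulation. The sketch below is not the simulation used in this study: it assumes a first-order autoregressive process as a stand-in for autocorrelated noise, time-locks each trace to its own crest, and averages the signal at several lags before that crest. All names and parameter values are illustrative assumptions.

```python
import random


def ar1_trace(n, phi, rng):
    """Generate n samples of AR(1) noise: x[t] = phi * x[t-1] + white noise."""
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def crest_locked_means(n_trials=200, n=400, phi=0.98, lags=(0, 10, 200), seed=0):
    """Mean signal value at each lag before the crest, averaged over trials."""
    rng = random.Random(seed)
    sums = {lag: 0.0 for lag in lags}
    for _ in range(n_trials):
        trace = ar1_trace(n, phi, rng)
        # Time-lock to the largest value in the second half of the trace.
        crest = max(range(n // 2, n), key=trace.__getitem__)
        for lag in lags:
            sums[lag] += trace[crest - lag]
    return {lag: total / n_trials for lag, total in sums.items()}


means = crest_locked_means()
for lag, value in means.items():
    print(f"{lag:>3} samples before crest: mean = {value:.2f}")
```

Although the process has no drift and no event-related signal, the crest-locked average decays toward zero as the lag grows, so a window near the crest differs systematically from an early reference window.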

Comparison of the task- and time-based methods using simulated datasets.
(A) Time course of the validation AUC (10-fold) using the task-based method (blue), time-based with fixed baseline (green), and time-based with pre-trial start baseline (blue-green) for leaky stochastic accumulator simulated data. The x-axis represents the time of the leading edge of the sliding window aligned to the threshold crossing (t=0; dotted gray line). The shaded area represents the standard error of the mean of the AUC. (B) Time course of the validation AUC (10-fold) using the task-based method (blue) and time-based method (green) for pink noise simulated data. The shaded area represents the standard error of the mean of the AUC.
When feeding these simulated data into the Haar-AdaBoost analysis, we see an early ramp-up of decoding performance (Figure 4A and B) despite there being no early signal by design. This absence of signal is also confirmed by the time course of the AUC for the task-based approach hovering around chance (0.5). Note that this is entirely expected and by design: in theory no algorithm can classify two signals drawn from the same distribution, but we report it here to illustrate that despite there being no signal to classify when using a matched control, the time-based approach still yields a ramp-up of above-chance classification (Figure 4A and B). In the first dataset we evaluated an additional time-based method that uses the pre-trial activity as the reference window to classify against, as is sometimes done (e.g. Schultze-Kraft et al., 2016). The results were similar to the standard time-based approach (Figure 4A). Note that once the threshold is crossed, the AUC for the time-based approach plateaus just under 1; this is because the trials are locked to the threshold crossing. After time 0, all pseudo-movement trials are around the same level of activity and are therefore easily differentiable from activity during the earlier reference window.
Benchmark analyses of Haar-AdaBoost against standard models
We used a customized class of Haar wavelets to generate our features, applied to all 23 crown channels (see Materials and Methods, EEG acquisition and pre-processing) and all pairwise differences between channels. We show that they offer a decisive advantage when investigating EEG signals. Given their simplicity, our algorithm was able to quickly generate a wide array of temporal, spatial, and frequency features, which enabled us to identify precisely which time points, frequencies, and channels mattered for correct classification (see appendix figure S6A and B). Haar-AdaBoost reached high AUCs, surpassing 0.80 in all but 3 participants and nearly reaching 1 after movement onset when applied using the time-based method (see appendix figure S5 and Figure 3A).
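To make the feature construction concrete, the sketch below builds the set of signals (each channel plus every pairwise channel difference) and computes a basic two-lobe Haar feature on a sub-window. The customized wavelet class used in the study is not specified in this section, so the plain Haar form shown here, and all function names, are assumptions for illustration only.

```python
from itertools import combinations


def signals_with_pairwise_differences(channels):
    """channels: dict name -> list of samples.

    Returns the original channels plus one difference signal per channel pair.
    """
    signals = dict(channels)
    for a, b in combinations(channels, 2):
        signals[f"{a}-{b}"] = [u - v for u, v in zip(channels[a], channels[b])]
    return signals


def haar_feature(samples, start, width):
    """Mean of the second half minus mean of the first half of a sub-window."""
    half = width // 2
    first = samples[start:start + half]
    second = samples[start + half:start + 2 * half]
    return sum(second) / half - sum(first) / half


# 23 placeholder channels of 250 samples (0.5 s at 500 Hz).
channels = {f"ch{i}": [0.0] * 250 for i in range(23)}
print(len(signals_with_pairwise_differences(channels)))  # 276
print(haar_feature([0.0, 0.0, 1.0, 1.0], start=0, width=4))  # 1.0
```

Varying `start` and `width` yields features sensitive to different latencies and time scales, which is what allows a boosted classifier to report which time points and channels mattered.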
Haar-AdaBoost was selected a priori as the best-performing model on the last 0.5 s directly preceding slide transition on both EEG and MEG data of the three participants from PF, our held-out validation set. We compared AdaBoost with support vector machine (SVM) and random forest, all with a wide range of features (see appendix table S1). Haar-AdaBoost had the highest AUC of all tested combinations at time of slide transition and was therefore selected for the rest of the analyses. A posteriori, after we ran our main analyses, to ensure that our approach was on a par with or better than other standard machine learning models, we benchmarked our approach on the held-out PF data against three standard models for EEG classification: common spatial patterns with linear discriminant analysis (CSP-LDA) (Blankertz, Tomioka et al. 2008), support vector machines (SVM) and convolutional deep neural networks (EEGNet) (Lawhern, Solon et al. 2018). While no model outperformed Haar-AdaBoost at time of slide transition (appendix figure S8), EEGNet had earlier onsets before slide transition for participants 1 and 3 at PF. As a result, we ran EEGNet post hoc on our main data. EEGNet yielded similar performance to Haar-AdaBoost (Figure 5).

Benchmark analysis of Haar-AdaBoost against standard models.
Validation AUC for the task-based approach applied to the EEG of the fifteen participants of the OC main dataset using Haar-AdaBoost (blue), EEGNet (purple) and basic slope LDA (pink). Time 0 is the time of slide transition (dotted gray line). The AUC is aligned to the leading edge of the sliding window. Shaded regions indicate the standard error of the mean of AUC across participants.
We ran an additional basic model to assess the importance of slow ramping of the signal. At each channel, we fit a line to the 0.5 s window of data and used LDA to classify using the slopes of these 23 fitted lines. Essentially, such an approach is only informed by the general slope of the data over the course of a 0.5 s period. We tested this approach because it provides a direct test of the predictive power of the slow early ramping signals that appear in the MRCP averages (appendix figure S8 and Figure 5, basic slope LDA). Its performance resembled that of Haar-AdaBoost until shortly after the slide transition, after which it underperformed relative to Haar-AdaBoost. This further supports our main claim that slow ramping in the EEG might not reflect decision-specific early signals.
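The feature extraction for this basic model can be sketched as an ordinary least-squares line fit per channel. The code below only computes the slopes; the LDA step is not shown. The 500 Hz sampling rate follows the pre-processing description, while the function names and the toy data are illustrative assumptions.

```python
def window_slope(samples, sfreq=500.0):
    """Least-squares slope (signal units per second) of one channel's window."""
    n = len(samples)
    times = [i / sfreq for i in range(n)]
    t_mean = sum(times) / n
    y_mean = sum(samples) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, samples))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den


def slope_features(window, sfreq=500.0):
    """window: list of channels (each a list of samples) -> one slope per channel."""
    return [window_slope(channel, sfreq) for channel in window]


# Toy 0.5 s window (250 samples): one channel ramping at -2 units/s, one flat.
ramp = [-2.0 * i / 500.0 for i in range(250)]
flat = [1.0] * 250
print(slope_features([ramp, flat]))
```

With 23 channels, each window is reduced to a 23-dimensional slope vector, so a classifier trained on it can only exploit the overall ramp within the window.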
A more in-depth investigation in the appendix supports that we have enough trials, as our approach converges after roughly 300 trials (appendix figure S9), and demonstrates that Haar-AdaBoost can pick up on differences in spectral power at different frequencies (appendix figure S10). A similar time course of decoding AUC was also found when using a growing rather than a sliding window, indicating that a 0.5 s window was sufficiently wide to capture any differences between the two conditions (appendix figure S11). We also show that Haar-AdaBoost can classify the two conditions with above-chance AUC when focusing on the moment the participants are processing the task instructions (appendix figure S12), by time locking to the start of the trial rather than the slide transition. Overall, Haar-AdaBoost equaled or outperformed all of the tested algorithms, which are standard in machine learning for EEG, indicating a result robust across choice of classifier.
Discussion
Early-decision accounts of self-initiated actions took reliable neural antecedents of self-initiated actions as evidence that a decision is made early at the neural level (see Schurger et al., 2021). Our work shows that early-decision accounts may have arrived at this conclusion mistakenly due to the use of event-locked epochs in the analysis of event-preceding activity and to the lack of an appropriate control. When using a well-matched control and a powerful classifier, support for an early commitment-to-move signal is lacking. Instead, the data better support the more recent late-decision accounts (Schurger et al., 2012; Schmidt et al., 2016).
The problem of movement-locked analysis
Movement-locked data epochs have long been assumed to reveal processes generated by the brain for the specific purpose of initiating movement (Kornhuber and Deecke, 1965); however, this need not be the case for the reasons brought to light here. Analyzing only movement (active) examples introduces a strong bias in the patterns of activity that will be observed. In our view, the slow buildup in neural activity reflects processes not specifically related to movement initiation, such as spontaneous fluctuations in cortical activity or anticipation of the sensory outcome (or both).
Isolating patterns of activity specific to the initiation of movement requires a different strategy, which has been the aim of the present work. Using a well-matched and independent nonmovement control condition and a powerful machine-learning technique, we mapped the time course of pre-movement activity specific to and predictive of movement onset. Our results clearly show that the slow buildup in activity that appears to be present when only movement epochs are analyzed proves not to be predictive under more carefully controlled conditions (i.e. when classified against a control condition using the task-based approach). Instead, our results are consistent with the view that the neural commitment to initiate movement begins much closer in time to the actual movement, casting further doubt on the decades-old argument that a commitment to act was made at the neural level well before the participant has consciously decided to initiate movement (Libet et al., 1983).
To validate our claims, we attempted to provide experimental conditions and an analysis approach that is as favorable as possible. We chose AdaBoost because it offers two main advantages. First, AdaBoost is known to work well in the presence of vast numbers of features. This allowed us to be agnostic as to which signal features would be most predictive; instead, we only had to select generic classes of features. We used simple Haar wavelets and moving average features that can capture a large variety of spectral and temporal patterns, examining the data with minimal prior assumptions. Secondly, AdaBoost is an ensemble method; therefore, it can capture spatial patterns by combining simple predictors, each concentrating on a specific spatial location. AdaBoost, coupled with decision stumps, systematically explores the entire space of spatial/temporal/spectral features, capturing the most predictive patterns while being relatively immune to overfitting (Schapire and Freund, 2012).
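For readers unfamiliar with boosting, the following is a minimal textbook sketch of discrete AdaBoost with decision stumps, each stump thresholding a single feature. It is not the implementation used in this study; the exhaustive threshold search, the clamping of the weighted error, and all names are assumptions chosen to keep the example short.

```python
import math


def stump(row, feature, threshold, polarity):
    """Weak learner: vote `polarity` if the feature exceeds the threshold."""
    return polarity if row[feature] > threshold else -polarity


def train_adaboost(X, y, n_rounds=10):
    """X: list of feature vectors; y: labels in {-1, +1}.

    Returns a list of (alpha, feature, threshold, polarity) tuples.
    """
    n = len(X)
    weights = [1.0 / n] * n
    model = []
    for _ in range(n_rounds):
        best = None
        for j in range(len(X[0])):
            for thr in sorted({row[j] for row in X}):
                for pol in (1, -1):
                    err = sum(w for w, row, label in zip(weights, X, y)
                              if stump(row, j, thr, pol) != label)
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol)
        err, j, thr, pol = best
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0) / division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, thr, pol))
        # Up-weight misclassified examples, then renormalize.
        weights = [w * math.exp(-alpha * label * stump(row, j, thr, pol))
                   for w, row, label in zip(weights, X, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def predict(model, row):
    score = sum(alpha * stump(row, j, thr, pol) for alpha, j, thr, pol in model)
    return 1 if score >= 0 else -1


# Toy data: only the first feature separates the two classes.
X = [[0.0, 5.0], [1.0, 3.0], [2.0, 4.0], [3.0, 1.0]]
y = [-1, -1, 1, 1]
model = train_adaboost(X, y, n_rounds=3)
print([predict(model, row) for row in X])  # [-1, -1, 1, 1]
```

Because each stump uses one feature, the features chosen across rounds indicate which channels, latencies, and scales carried discriminative information.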
Applied to different windows preceding a movement, this approach (task-based) offers a better insight into the mechanisms that specifically carry information about an upcoming movement. While traditional analysis—time-locking to movement onset and comparing against an early reference window (time-based)—revealed an early onset of the signal in our data, our approach revealed that activity specific to the timing of movement onset only appeared late, about 0.1 s before movement (t-test method for onset of the AUC).
It is tempting to conclude that because some participants can be predicted around as early as their MRCP onset (e.g. participant 12 in appendix figure S5) then decisions must be made early. However, that would be a case of affirming the consequent. It is important to understand here that our conclusions are not contradicted by the existence of such cases. We do not claim that early classification from M/EEG is never possible, rather we claim that early signals in the average (Figure 2A, B and C) do not necessarily imply early classification (Figure 3).
The need for a control condition in self-initiated movement studies
We also show that when using machine learning as an alternative to the standard time-locking approach, classification against an early reference window (time-based) can be entirely driven by autocorrelation in the signal (Figures 3 and 4). We therefore showed that the inclusion of a control condition (task-based approach) offers a useful way to distinguish between precursors that are merely correlated with impending movement and those that actually represent a commitment to move. The significant difference in earliest decodable times between the time-based and the task-based method, using both the single-trial and the AUC method (Figure 3B), strongly indicates that using an early reference period (time-based) to classify against can lead to very high classification simply by virtue of signal features not specific to the timing of the decisions (e.g. autocorrelation in the signal, general anticipation, or both). Other approaches have classified time bins leading up to movement onset. Some used a multi-class classifier (e.g. Soon et al., 2008), which presents the same problem, i.e. all the time bins are locked to the time of movement. Others classified movement data against randomly selected data from the same trials (e.g. Mitelut et al., 2022), which also by definition conditions the analysis on the occurrence of a movement.
Late-decision accounts
Early-decision accounts posit that there is an early neural commitment to a decision. The early onset of the RP and other motor-related cortical potentials have been taken to support the claim that the neural commitment occurs early. We show here that the data does not support this early neural commitment.
One might now ask, what is the RP if it is not a neural commitment? Two answers have been put forth in the literature, both compatible with our data. The leaky stochastic accumulator approach explains this early activity to reflect stochastic fluctuations associated with an accumulation-to-bound process (Schurger et al., 2012), while the SCP sampling hypothesis explains it to reflect the higher likelihood for actions to be initiated during certain phases of slow cortical potentials (Schmidt et al., 2016). These two accounts, both of which are consistent with our data, relegate the RP to a bias or inclination rather than a commitment to a course of action, while the real commitment to move occurs just before the movement itself. Late-decision accounts further do away with the counter-intuitive notion that commitments to move are made at RP onset, perhaps 2 seconds prior to movement execution, a notion that has little behavioral efficiency and is hard to reconcile with subjective reports (Libet et al., 1983) or with how late participants can veto their actions (Schultze-Kraft et al., 2016). Late decision accounts would also explain the lack of reports of real-time predictions of the timing of movement onsets. Real time prediction of upcoming movements using EEG typically finds relatively late predictive activity, with earliest decoding times under 0.7 s before movement and with relatively poor performance largely limited by the false positive rate (Bai et al., 2011; Lew et al., 2012; Schultze-Kraft et al., 2016).
Further, a neural commitment to move would imply a difficulty to inhibit the prepared action. A late onset for the classifier’s ability in our paradigm aligns with the ability to veto a voluntary action until very close to movement onset (Schultze-Kraft et al., 2016). Furthermore, a parallel could be drawn with the reaction time literature where participants can typically stop a movement they started in response to a target cue only if presented with the stop signal within a short period of around 200 milliseconds following the target cue (Logan and Cowan, 1984; Verbruggen and Logan, 2008), indicating a relatively short motor commitment period - in accordance with our results. It is possible that the apparent start of the conventional MRCP is not a firm commitment, and that the process remains changeable until just prior to movement. The shortcomings of the canonical method did not make this testable until now - the important message of this work is to draw attention to the difference between the onset of the various processes associated with voluntary action and the commitment to make the voluntary action.
Limitations
One might argue that the poor discriminability of the active and passive trials earlier than about 0.1 s before movement onset does not disprove the possibility of an early commitment to move. However, the near-ceiling performance in discriminating the two conditions at and just after the time of movement onset argues against the possibility that the classifier is not powerful enough to discriminate between the conditions, as does the high early performance observed when we used the (biased) comparison with the baseline time window (time-based approach; Figure 3A). Also, the large amount of data collected from each participant and our strategic choice of feature classes provide circumstantial evidence that the early tail of movement-locked neural activity in the time domain is not specific to the initiation of movement. Our results further held across two different labs, the main set (OC) presented here but also the held-out data (PF), data previously collected upon which this analysis was framed, with different equipment (NeuroMag and BioSemi), two modalities (EEG and MEG), two tasks (non-spontaneous and spontaneous; see appendix figure S4) and multiple algorithms (appendix figure S8 and Figure 5). Future work could explore this question with a different modality or algorithm. It is worth noting that our results cannot rule out the possibility of an early neural commitment that was undetectable in EEG or MEG data; instead, we argue that the early signal in the EEG that was previously taken to indicate an early neural commitment does not exhibit the characteristics that a neural signature of a commitment would be expected to have. Therefore, we claim that slow MRCPs do not reflect early neural commitments detectable with EEG (the standard modality used in self-initiated movement tasks). However, it is possible that such an early neural commitment might be lurking at the level of individual neurons.
Another potential concern might stem from the AUC onset being very close to movement despite the LRP seemingly starting about a half second before slide transition. Two important points are worth making here. Firstly, our LRP did not differ significantly between the active and passive conditions until very late. This could be due to a multitude of factors, for instance our participants always utilized the right hand to advance the slide. This might lead to a decrease in the lateralization of the signal due to the lack of alternation between left and right hands as is typically the case in a standard LRP paradigm. Further our task is novel and might not induce a strong lateralization. Secondly, the AUC onset-by-eye (Figure 3A) is 0.44 s before movement, which is relatively close to the onset of the LRP.
Note that the active and passive trials differed for the obvious reason that in one case the participants perform a movement to initiate the slide transition, while in the other case the participants just wait for the slide transition. Thus, although our two experimental conditions were well-matched for anticipation of a visual event (the slide transition), they were not matched for anticipation of the proprioceptive consequences of the movement. This might seem like a limitation. However, this difference could only render the data from the two classes more rather than less discriminable. This further raises the question of why boosting failed to find a difference if the early pre-movement buildup is predictive of initiating a movement. Note also that we were able to classify the two different kinds of epochs at above-chance levels when time-locked to the start of the trial (appearance of the word “manual” or “automatic” on the display; appendix figure S12). Nevertheless, future work could match the conditions for the expectation of an upcoming movement by having participants respond to slide transitions with a button press or by having the slide transition triggered by a finger movement induced mechanically or via muscle stimulation.
Another potential concern might be that our paradigm did not use one of the traditional spontaneous movement tasks that typically produce an RP. In these classical tasks, the experimenters try to foster movements that are initiated spontaneously. In contrast, our task tries to be more ecologically valid. Decisions were not strictly spontaneous in the sense that they were likely influenced by the participants’ perceptions of the stimuli. We still found canonical EEG signals and waiting time distributions consistent with those obtained from classical self-initiated action tasks (Figure 2A and D). Additionally, we also tried a more spontaneous version of our task, based on Libet (1983), in three new participants and we found more or less identical results (appendix figure S4). What is more, in line with the above point, any divergence from traditional volition tasks should only enhance the discriminability of our two classes.
It is worth noting as well that while combining data across laboratories may add unnecessary variability, it can also be seen as a strength as we show consistently strong results at the individual participant level across locations (PF and OC), recording modality (EEG and MEG), populations (France and USA) and experiments (slideshow task and Libet task; appendix figure S4) which indicates a high likelihood of replicability.
Recent approaches to measuring the onset of intentions found that participants report some level of already intending to act during the slow deflection of the RP. For instance, in one study participants were already thinking about their upcoming movement 1.42 s before performing it (Matsuhashi and Hallett, 2008). Others found similar results (Verbaarschot et al., 2015, 2019). We propose that these early signals that precede movement onset reflect a mounting inclination to move that is available to introspection when cued. However, they do not represent a commitment to initiate movement. This viewpoint is supported by reports that participants find it difficult to judge whether they were already intending at a given point during a trial (Verbaarschot et al., 2019).
Implications and future directions
As Roskies (2010) puts it, the real test of the neuroscience of volition is whether it can “show volition to have or lack characteristics that comport with our intuitive notions of the requirements for freedom of the will”. In the West, folk intuitions of moral responsibility typically require us to be able to initiate our actions consciously in order to be held accountable for them (Maoz and Yaffe, 2015; Shepherd, 2015). However, a sizeable body of neuroscience literature (see Dominik et al., 2024) is taken to support early-decision accounts of decision-making, arguing that simple arbitrary actions are initiated unconsciously well in advance of the conscious facet of the decision (Schurger et al., 2021). Our work suggests that this body of literature may have arrived at this conclusion mistakenly due to a lack of an appropriate control and to the use of event-locked epochs in the analysis of event-preceding activity. When using a well-matched control and a powerful classifier, support for an early commitment to move signal is lacking. Instead, the data better support the more recent late-decision accounts (Schurger et al., 2012; Schmidt et al., 2016).
Our results invite the field to move away from its reliance on response-locked averaged ERPs, and to employ more advanced analyses, such as machine learning, to investigate how early self-initiated actions become fully formed at the neural level by comparing them against a control condition. Importantly, our study does not invalidate studies of early activity; rather, it makes the claim that these early signals do not reflect a commitment to move, raising the question of what these early signals are and how they might bias movements without determining them. Another important future direction would be to reproduce this investigation with higher signal-to-noise ratio data, such as intracranial recordings.
Materials and Methods
Participants
The study was carried out at Chapman University, Orange, California, USA. Seventeen participants with EEG only participated (11 females, 6 males, age M=21.6 SD=3.6 range 18-30, all right-handed) - although only fifteen were used in the analysis after the exclusion of two participants’ data due to excessive low and high frequency noise that resulted in more than half the trials being rejected. Eight participants came for two sessions of 700 trials each (1 excluded), three participants for two sessions of 350 trials, and six participants for a single session of 350 trials (1 excluded). The experiment was approved by the Chapman University Institutional Review Board. All participants gave written informed consent and were compensated at the rate of $15 per hour. The number of trials per session was reduced from 700 to 350 after the first eight participants in order to reduce the duration of the experiment. There were no significant differences in MRCP amplitude between these two groups of participants (appendix figure S13) and classification performance as a function of number of trials did not improve after about 300 trials (appendix figure S9). A sample size of N=15 is supported by a post-hoc Bayesian two-samples t-test on the difference between MRCP onset-by-eye and AUC onset which provided decisive evidence (BF10>100) for an effect with a large effect size (posterior median δ=1.18, 95% credible interval [0.50,1.91]), indicating that AUC onsets occurred later than MRCP onsets.
Stimuli
Stimuli were full-color photographs of nature scenes, flowers, and landscapes culled from the internet and selected to be neutral and pleasant to look at, displayed using E-Prime 2.0 software (Psychology Software Tools, Pittsburgh, PA). Stimuli were displayed using a 60 Hz LCD computer screen (24.5 inch, 1920x1080px) at a distance of approximately 60 cm with participants sitting in a wooden desk chair.
Procedure
Participants were told that they would watch a slideshow of nature photos and that sometimes the slide would advance automatically after a few seconds, and sometimes they would choose when to advance the slide by pressing a button. Each photo was preceded by a cue screen, with either the word manual or automatic for one second followed by a buffer of 0.25 s. The word manual instructed the participants that it was up to them to advance to the next slide whenever they wanted to by pressing a button (active trials). We asked participants to look at each photo for a minimum of about 3 seconds, but without counting time. The word automatic instructed the participants to view the photo passively, with the slide transition happening automatically after a few seconds (passive trials). The viewing time on passive trials was randomly chosen around a mean of 5 seconds for the first few instances and then was drawn from each participant’s continuously updated distribution of viewing times on subsequent active trials.
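The yoking of passive viewing times to each participant's own active viewing times can be sketched as follows. The procedure states only that early passive durations were chosen around a mean of 5 s and later ones drawn from the continuously updated distribution of active viewing times; the jitter range, the number of warm-up trials, and sampling by uniform resampling of past active times are assumptions made for illustration.

```python
import random


class PassiveDurationSampler:
    """Draws passive-trial viewing times matched to observed active trials."""

    def __init__(self, initial_mean=5.0, jitter=1.0, warmup=5, seed=None):
        self.initial_mean = initial_mean   # seconds, from the procedure
        self.jitter = jitter               # assumed spread around the mean
        self.warmup = warmup               # assumed number of "first few" trials
        self.active_times = []
        self.rng = random.Random(seed)

    def record_active(self, viewing_time):
        """Update the participant's distribution after each active trial."""
        self.active_times.append(viewing_time)

    def sample(self):
        """Viewing time (s) for the next passive trial."""
        if len(self.active_times) < self.warmup:
            return self.initial_mean + self.rng.uniform(-self.jitter, self.jitter)
        return self.rng.choice(self.active_times)


sampler = PassiveDurationSampler(seed=1)
print(sampler.sample())                     # early trial: around 5 s
for t in [3.2, 4.1, 6.0, 3.7, 5.5]:
    sampler.record_active(t)
print(sampler.sample())                     # later trial: one of the active times
```

Matching the two distributions in this way keeps the time of the slide transition equally predictable in both conditions.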
For subsequent analyses, data were time-locked to the software event associated with the disappearance of the slide, whether triggered by the participant (active trials) or by the computer (passive trials). In the active trials the delay between the button press and the slide transition was negligible: tests with a photosensor revealed no delay larger than 10 ms over a block of trials. Participants had an equal number of passive and active trials, in random order at each session, with sessions of either 350 or 700 trials. Participants took a short break every 50 trials.
EEG acquisition and pre-processing
We recorded EEG using a 64-channel BioSemi system referenced to the common average and with electrode offsets kept below 10 mV. EEG electrodes were positioned such that Cz was at the vertex with the midline electrodes aligned to the inion-nasion axis and the central electrodes aligned along the tragus-to-tragus axis. All data were sampled at 2048 Hz (with an analog low-pass filter at 409.6 Hz cutoff), then down-sampled to 500 Hz offline before data analyses. No high-pass filtering was applied; data were recorded in DC mode.
For participants who returned for multiple sessions, the same EEG cap was used during each session. We used standard procedures for identifying and interpolating bad channels in the EEG recordings. No more than 5 bad channels were identified and interpolated in any recording session. Data epochs were extracted from 4 s before to 0.5 s after each slide transition. Independent components analysis (ICA) was used to identify and remove eye-blink and eye-movement artifacts. Trials containing muscle or movement artifacts were manually excluded from the EEG data during visual inspection of the 23 channels of the crown subset that were used in the machine learning analyses (F3, F1, Fz, F2, F4, FC3, FC1, FCz, FC2, FC4, C3, C1, Cz, C2, C4, CP3, CP1, CPz, CP2, CP4, P1, Pz and P2). For the topography, rejection of trials was done automatically on all 64 channels. On a trial-by-trial basis, channels at which the signal’s range (max-min over the trial epoch) exceeded 120 μV were interpolated using the average activity of the nearest 5 channels. Trials with more than 5 channels interpolated were rejected from the rest of the topography analysis. A baseline subtraction was applied, taking the average activity at each channel from 4 s to 3 s before movement. This period was chosen due to its temporal distance from the movement onset while remaining within the duration of most trials, following standard practice.
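The automatic rejection rule used for the topography can be sketched as below. The 120 μV range threshold and the limit of 5 interpolated channels come from the text; the data layout (a dictionary of channel name to samples), the neighbour lookup table, and the function names are hypothetical. For brevity the sketch averages whichever neighbours are listed and does not check whether those neighbours are themselves flagged.

```python
def flag_channels(trial, threshold_uv=120.0):
    """Channels whose peak-to-peak range over the epoch exceeds the threshold."""
    return [ch for ch, x in trial.items() if max(x) - min(x) > threshold_uv]


def interpolate_channel(trial, nearest):
    """Sample-wise mean of the listed neighbouring channels."""
    n_samples = len(trial[nearest[0]])
    return [sum(trial[ch][i] for ch in nearest) / len(nearest)
            for i in range(n_samples)]


def clean_trial(trial, neighbours, threshold_uv=120.0, max_bad=5):
    """Interpolate flagged channels, or return None if the trial is rejected."""
    bad = flag_channels(trial, threshold_uv)
    if len(bad) > max_bad:
        return None
    cleaned = dict(trial)
    for ch in bad:
        cleaned[ch] = interpolate_channel(trial, neighbours[ch])
    return cleaned


# Toy trial with three channels and two samples each (values in microvolts).
trial = {"A": [0.0, 200.0], "B": [1.0, 3.0], "C": [3.0, 5.0]}
neighbours = {"A": ["B", "C"]}   # hypothetical nearest-channel table
print(clean_trial(trial, neighbours)["A"])  # [2.0, 4.0]
```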
In the rest of this manuscript, parameters and algorithms for the analysis of the data were selected based on validation on held-out data collected previously at a different location (Paris, France; PF) using a similar set-up (for a complete description of the parameters selected this way see below, “Parameters and algorithm selection”; for a description of the PF data see the appendix, Model selection set).
Event-related potentials
We computed the average time course at electrode C3 positioned over the left motor cortex, which showed the earliest onsets for active trials in the held-out model selection set (PF data independent from data at OC) in epochs extending from 2.5 s before to 0.5 s after the slide transition. The mean from -4.0 s to +0.5 s was removed from the data for plotting and ERP analyses. We also computed the time-locked average of the difference between electrodes C3 and C4, on opposite sides of the midline, as an estimate of the lateralized readiness potential (LRP) (Eimer, 1998). Typically, LRPs are computed as a difference of differences between hemispheres depending on the hand used. Here we only looked at the difference between left and right channels for right-hand button presses. For visualization, we used a 4th order Butterworth low-pass filter with a cutoff at 30 Hz. A topography was also computed, using the FieldTrip (Oostenveld et al., 2011) (version 20240129) toolbox for MATLAB (R2023b), as the grand average across participants of EEG activity at all 64 EEG channels in the 0.5 s before movement. This was done to assess how the topography in our data relates to the topography in other self-initiated action studies.
Event-related potential onset was determined using 3 different methods commonly found in the literature (Verbaarschot et al., 2015). The RP “onset-by-eye” was operationalized as the last time point the signal crossed the zero line (the average over the baseline window). The 90% area method was operationalized as the timepoint corresponding to the 90th percentile of the area under the curve, starting from movement onset and working backward to the RP onset-by-eye. Lastly, for the t-test method, we used the first of three consecutive time points significantly different from baseline using a paired samples bidirectional t-test. All three methods have been used in the field and, taken as a whole, render a more holistic estimate of RP onset (Dominik et al., 2024). A simple paired samples t-test was used for both the motor-related cortical potential and the LRP to compare the activity during the baseline to the activity in the last 0.5 s before movement (as is standard in the field). Post-hoc cluster corrected t-tests over the entire epoch were also performed (Maris and Oostenveld, 2007).
Event-related desynchronization
Mu and Beta band activity over contralateral motor cortices are known to desynchronize before self-initiated movements (Pfurtscheller, 1997). Here, we used complex Morlet wavelets to decompose our data at C3 into 78 logarithmically spaced frequency bands from 2 Hz to 80 Hz with logarithmically decreasing cycle counts (Cohen, 2014). The epochs were shortened to -3.5 s to +0.3 s to remove edge artifacts. Finally, we normalized this analysis to the mean power at an earlier window from 3.5 s to 3 s before the slide transition. This choice of baseline period for normalization, different from the baseline period used elsewhere, was due to the previous step of shortening the epochs to remove edge artifacts. Cluster correction was performed using 1000 permutations and a threshold of p<0.05 for clusters and p<0.01 for pixels (Cohen, 2014).
Simulated data
In our first dataset, we used the parameters as reported in Schurger et al. (2012) to generate autocorrelated noise with a drift aligned to a threshold crossing, as hypothesized in one of the late-decision accounts (Figure 4A). In this case we evaluate two choices of reference windows for the time-based approach: fixed to threshold crossing and pre-trial start. Fixed reference windows were always taken in the 2.5 s to 2 s before threshold crossing. To obtain pseudo pre-trial reference activity, we let the accumulator run without drift for two seconds, and then we used the 0.5 s preceding the onset of the drift as the pre-trial baseline activity. While fixed reference windows have been used in decoding studies (e.g. Fried et al., 2011), pre-trial early windows have also been used in other decoding studies (e.g. Schultze-Kraft et al., 2016) as the control condition.
In our second dataset, “pink noise”, we generated the simulated data by starting with white noise in the frequency domain, imposing a slope on the power spectrum proportional to 1/f, and then performing an inverse Fourier transform. To match the real EEG data used in this study, we generated 1400 trials for each dataset, each containing 2251 samples and representing a 4.5-second-long epoch at 500 Hz. For the task-based approach, we randomly labelled half of these trials as passive and the others as active. By design, the active and passive simulated data are statistically identical and should not be discriminable.
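The pink-noise procedure can be sketched as below. This is an illustrative reading, not the authors' implementation: the function name `pink_noise`, the choice to zero the DC component, and the use of NumPy's real FFT are assumptions. Scaling the amplitude spectrum by f^(-1/2) yields a power spectrum proportional to 1/f.

```python
import numpy as np

def pink_noise(n_trials, n_samples, fs, rng):
    """Generate 1/f noise by shaping white noise in the frequency domain."""
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    # White noise in the frequency domain: complex Gaussian coefficients.
    spectrum = (rng.normal(size=(n_trials, freqs.size))
                + 1j * rng.normal(size=(n_trials, freqs.size)))
    # Amplitude scaled by f^(-1/2), so that power falls off as 1/f.
    scale = np.zeros(freqs.size)          # DC set to zero (assumption)
    scale[1:] = 1.0 / np.sqrt(freqs[1:])
    # Inverse Fourier transform back to the time domain.
    return np.fft.irfft(spectrum * scale, n=n_samples, axis=1)

rng = np.random.default_rng(0)
trials = pink_noise(n_trials=1400, n_samples=2251, fs=500, rng=rng)

# Random labels: half "passive" (0), half "active" (1). The two classes are
# drawn from the same process, so they should not be discriminable.
labels = rng.permutation(np.repeat([0, 1], 700))
```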
Machine Learning
Sliding window
To study the time course of neural information predictive of impending movement, we extracted 0.5 s windows of EEG data every 0.02 s along the time axis of our epochs. Within each of these 0.5 s windows, we trained and tested an aggregate classifier (Haar-AdaBoost, see description below) that labelled each window of EEG data as belonging to an active or a passive trial using a 10-fold cross-validation. As such, we randomly split the data into ten equal size bins, each bin being a set of trials, trained Haar-AdaBoost on the data from 9 of these 10 bins and tested on the remaining bin, yielding one validation AUC per bin. Cross-validation ensures that the model is not tested on data that it was trained on, while still allowing for the entire dataset to be accessible. This process was repeated at each position of the sliding window. We then aligned the resulting 10-fold averaged AUCs to the leading edge of their respective windows. We aligned to the leading edge in order to exclude any information from the epoch that came from the future, relative to the location of the window. As such, for each participant, each single trial at each single time bin was labelled by our classifier as either passive or active. AUC was chosen as a metric to reflect overall data discriminability.
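The window bookkeeping described above can be sketched as follows, assuming epochs of 2251 samples starting 4 s before the slide transition. The helper names, and the convention that the leading edge is the time of the last sample in a window, are assumptions made for illustration; the classifier itself is omitted.

```python
import numpy as np

FS = 500                      # sampling rate in Hz
WIN = round(0.5 * FS)         # 0.5 s window -> 250 samples
STEP = round(0.02 * FS)       # 0.02 s step  -> 10 samples

def sliding_windows(n_samples, epoch_start_s):
    """Start index of each window and the time of its leading edge."""
    starts = np.arange(0, n_samples - WIN + 1, STEP)
    # Leading edge = time of the last sample in the window, so that no
    # sample later than the reported time contributes to that AUC value.
    leading_edge_s = epoch_start_s + (starts + WIN - 1) / FS
    return starts, leading_edge_s

def kfold_indices(n_trials, k, rng):
    """Randomly split trial indices into k (nearly) equal-size bins."""
    return np.array_split(rng.permutation(n_trials), k)

starts, edges = sliding_windows(n_samples=2251, epoch_start_s=-4.0)
folds = kfold_indices(n_trials=700, k=10, rng=np.random.default_rng(0))

# For each window and each fold, a classifier would be trained on the
# other nine folds and tested on the held-out fold (not shown here).
for test_fold in folds:
    train_idx = np.setdiff1d(np.arange(700), test_fold)
```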
Time and task-based approaches
We compared two approaches: the time-based and the task-based approach. In the time-based approach we started with an early time window taken from 3 s to 2.5 s before movement across all trials and labelled these the negative (“no-movement”) exemplars. We chose this window to avoid overlap with the baseline window used for drift correction (also known as baseline correction in EEG preprocessing jargon; 4 s to 3 s before movement here). Then, one position of the window at a time, we trained Haar-AdaBoost to distinguish the data in this window (the positive or “movement” exemplar) from the set of negative exemplars (the early reference window or “no-movement” epoch). This was repeated for all positions of the window in our sliding window analysis, aligning results to the leading edge of the window. For the task-based approach we classified the data within the sliding window as belonging to a passive or an active trial epoch. For instance, windows from 1.7 s to 1.2 s before slide transition in the active condition were compared to windows from 1.7 s to 1.2 s before slide transition in the passive condition.
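The difference between the two approaches lies only in how the two classes are assembled, which the following sketch illustrates. Function names, array shapes and the random data are hypothetical. The reference window indices assume epochs starting 4 s before movement at 500 Hz, so that 3 s to 2.5 s before movement corresponds to samples 500 to 749.

```python
import numpy as np

FS, WIN = 500, 250   # sampling rate (Hz) and window length (samples)

def task_based_set(active, passive, start):
    """Same window position in both conditions; label = condition."""
    X = np.concatenate([active[:, :, start:start + WIN],
                        passive[:, :, start:start + WIN]])
    y = np.concatenate([np.ones(len(active)), np.zeros(len(passive))])
    return X, y

def time_based_set(epochs, start, ref_start=500):
    """Current window versus a fixed early reference window, same trials."""
    X = np.concatenate([epochs[:, :, start:start + WIN],           # "movement"
                        epochs[:, :, ref_start:ref_start + WIN]])  # "no-movement"
    y = np.concatenate([np.ones(len(epochs)), np.zeros(len(epochs))])
    return X, y

rng = np.random.default_rng(0)
active = rng.normal(size=(6, 23, 2251))    # toy stand-ins for real epochs
passive = rng.normal(size=(6, 23, 2251))

# Window from 1.7 s to 1.2 s before the slide transition (epoch starts at -4 s).
start = round((4.0 - 1.7) * FS)
X_task, y_task = task_based_set(active, passive, start)
X_time, y_time = time_based_set(active, start)
```

In the time-based set the negative exemplars come from the same trials as the positive ones, which is why slow drifts alone can separate the two classes.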
Adaptive boosting
Our approach depended on maximizing the classification performance of active and passive trials. For that purpose, we employed a customized class of wavelets (Haar wavelets described below) combined with adaptive boosting (AdaBoost). AdaBoost combines the output of multiple weak classifiers into a strong classifier (Schapire and Freund, 2012). These weak classifiers are also called base classifiers, and in our case we used decision stumps. A decision stump is a simple rule that makes its prediction according to whether some particular feature is above or below some threshold. AdaBoost constructs the strong classifier iteratively by adding one base classifier at a time in a series of rounds. In our case we used 200 iterations. On each round, AdaBoost assigns an importance weight to each training example and trains a base classifier that minimizes the weighted training error. AdaBoost starts with uniform importance weights over the training dataset. Then, on each round, it increases the importance weight on examples that were misclassified by the most recent base classifier and decreases the weight on those that were correctly classified. Thus, AdaBoost concentrates more on the examples that were misclassified on previous rounds.
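A minimal sketch of discrete AdaBoost with decision stumps, written from the textbook description above rather than from the authors' code, is given below. All function names and the toy data are assumptions, and the example uses 30 rounds instead of 200 to keep it fast.

```python
import numpy as np

def stump_predict(X, feature, threshold, sign):
    """Predict +sign when the feature exceeds the threshold, -sign otherwise."""
    return np.where(X[:, feature] > threshold, sign, -sign)

def fit_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted training error."""
    best = (np.inf, 0, 0.0, 1)
    for feature in range(X.shape[1]):
        for threshold in np.unique(X[:, feature]):
            for sign in (1, -1):
                err = w[stump_predict(X, feature, threshold, sign) != y].sum()
                if err < best[0]:
                    best = (err, feature, threshold, sign)
    return best

def adaboost(X, y, n_rounds):
    """y must be coded as -1/+1. Returns a list of weighted stumps."""
    w = np.full(len(y), 1.0 / len(y))          # uniform initial weights
    model = []
    for _ in range(n_rounds):
        err, feature, threshold, sign = fit_stump(X, y, w)
        err = np.clip(err, 1e-12, 1 - 1e-12)   # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)  # weight of this stump
        pred = stump_predict(X, feature, threshold, sign)
        # Increase weights of misclassified examples, decrease the others.
        w = w * np.exp(-alpha * y * pred)
        w = w / w.sum()
        model.append((alpha, feature, threshold, sign))
    return model

def decision_function(model, X):
    return sum(a * stump_predict(X, f, t, s) for a, f, t, s in model)

# Toy problem: a diagonal boundary that no single stump can represent.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
model = adaboost(X, y, n_rounds=30)
train_accuracy = np.mean(np.sign(decision_function(model, X)) == y)
```

The continuous output of `decision_function` is the kind of score from which an AUC can be computed on held-out data.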
Custom Haar wavelets and moving average features
For our features, we drew inspiration from Haar wavelet convolution and adapted it to our problem. Haar wavelets are step-like mathematical functions with sharp, abrupt edges and a prespecified width that can be used to extract time-frequency features of a signal. For our custom wavelet, while fixing the height of each square-like step (to 1 for the positive step and -1 for the negative step), we chose a range of 22 step widths in the time domain to linearly delineate the frequencies between 2 and 80 Hz. At a sampling rate of 500 Hz, we used widths of 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 19, 21, 24, 28, 32, 39, 50, 68, 107 and 250 timepoints for the different steps. The steps were either 1 for the positive or -1 for the negative (i.e. for all timepoints within the width of the steps) and everywhere else the values were 0. Because all other timepoints were set to 0, each wavelet focused specifically on the timepoints close to its center and ignored the rest. Our Haar wavelets consisted of a positive step followed by a negative step. The center edges of these steps were also spaced by 12.5, 25 and 37.5% of the steps’ width. Each dot product of each wavelet with the data resulted in one separate feature. While computationally very simple, this method extracts, through the dot product of a wavelet with a signal, a rough estimate of both frequency (width of the step) and temporal (location of the midpoint between the two steps) features. The midpoints between the two steps were positioned every 0.02 s in our 0.5 s windows. This means that for every tenth timepoint at 500 Hz in the sliding window, we created a set of wavelets that were centered on it. Therefore, a strong 10 Hz signal occurring only for a 0.3 s period would yield a large dot product value only for those wavelets with widths tuned to 10 Hz and centered on that 0.3 s period.
Therefore, we took the dot product at each channel and each pairwise difference between channels of each of our wavelets and used these outputs as features for our classifier. Similarly, we used an approximation of a moving average (identical to our Haar wavelets but with a single positive step rather than both a positive and a negative step) of 22 different widths at 0.02 s intervals and used the outputs as features.
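One possible reading of the feature construction is sketched below. The exact geometry of the kernels is an assumption: the gap is placed symmetrically around the midpoint and rounded to whole samples, steps are clipped at the window edges, and all function names are hypothetical. The step widths are those listed in the text.

```python
import numpy as np

N = 250  # samples in a 0.5 s window at 500 Hz
WIDTHS = [6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
          17, 19, 21, 24, 28, 32, 39, 50, 68, 107, 250]
GAP_FRACS = (0.125, 0.25, 0.375)   # spacing as a fraction of step width
CENTERS = range(0, N, 10)          # one midpoint every 0.02 s

def haar_kernel(center, width, gap_frac, n=N):
    """+1 step, then a gap, then -1 step; zero elsewhere; clipped to window."""
    k = np.zeros(n)
    half_gap = int(round(gap_frac * width / 2))
    pos_end, neg_start = center - half_gap, center + half_gap
    k[max(pos_end - width, 0):max(pos_end, 0)] = 1.0
    k[min(neg_start, n):min(neg_start + width, n)] = -1.0
    return k

def moving_average_kernel(center, width, n=N):
    """Single positive step centered on `center` (unnormalized average)."""
    k = np.zeros(n)
    lo = center - width // 2
    k[max(lo, 0):min(lo + width, n)] = 1.0
    return k

haar_bank = np.array([haar_kernel(c, w, g)
                      for w in WIDTHS for g in GAP_FRACS for c in CENTERS])
ma_bank = np.array([moving_average_kernel(c, w)
                    for w in WIDTHS for c in CENTERS])

def window_features(window, bank):
    """Dot products for every channel and every pairwise channel difference."""
    i, j = np.triu_indices(window.shape[0], k=1)
    signals = np.vstack([window, window[i] - window[j]])
    return signals @ bank.T

window = np.random.default_rng(0).normal(size=(23, N))   # toy 23-channel window
features = window_features(window, haar_bank)
```

Under this reading, an unclipped Haar kernel sums to zero, so it ignores any constant offset and responds only to changes around its midpoint at the time scale set by its width.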
Model evaluation
We further computed the earliest decodable times (EDT) using two techniques: single trial derived EDT and AUC derived EDT. For the single trial EDT, we looked backwards from the slide transition on each single trial and identified the last leading edge of a sliding window to be correctly classified. We then averaged these times across all trials within each participant, yielding one earliest decoding time value per participant. For the AUC derived EDT, we calculated the standard error of the AUC (Hanley and McNeil, 1982). We then took the lower bound of that standard error and identified the last timepoint this lower bound stayed above an AUC of 0.5 for three consecutive samples. The first of such time points was taken as each participant’s AUC earliest decodable time. A paired samples t-test was used to compare the earliest decodable times between the time-based and the task-based approach.
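The standard error of the AUC from Hanley and McNeil (1982) and one possible reading of the AUC derived EDT can be sketched as follows. The EDT function is an interpretation, not the authors' code: it walks backwards from the slide transition, takes the onset of the final run in which AUC minus its standard error stays above 0.5, and requires that run to last at least three samples.

```python
import numpy as np

def hanley_mcneil_se(auc, n_pos, n_neg):
    """Standard error of an AUC (Hanley and McNeil, 1982)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    var = (auc * (1 - auc)
           + (n_pos - 1) * (q1 - auc ** 2)
           + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    return np.sqrt(var)

def auc_edt(times, auc, se, min_run=3):
    """Onset of the final above-chance run, searched backwards from the end.

    Returns None when the final run is shorter than `min_run` samples.
    """
    above = (np.asarray(auc) - np.asarray(se)) > 0.5
    idx = len(above)
    while idx > 0 and above[idx - 1]:
        idx -= 1
    if len(above) - idx < min_run:
        return None
    return times[idx]

se_chance = hanley_mcneil_se(0.5, n_pos=700, n_neg=700)

# Toy AUC time course with an isolated early excursion above chance.
times = np.array([-0.10, -0.08, -0.06, -0.04, -0.02, 0.00])
auc = np.array([0.50, 0.60, 0.50, 0.60, 0.70, 0.80])
edt = auc_edt(times, auc, se=np.full(6, 0.05))
```

Because the search runs backwards from the slide transition, the isolated excursion at -0.08 s in the toy example does not count towards the EDT.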
We additionally conducted a post-hoc Bayesian paired-samples t-test using JASP (version 0.18.3) to quantify the level of evidence supporting the hypothesis that AUC onsets occurred later than MRCP onsets, and to assess whether the size of our sample provided enough evidence.
The choice to use Haar-AdaBoost was based on a model evaluation step performed on the validation data from PF, where we compared model performances in the window directly preceding the slide transition. We compared AdaBoost with SVM and Random Forest, each tested with a variety of features. Haar-derived features with AdaBoost was overall the best performing approach on PF’s data (appendix table S1).
A posteriori we decided to assess Haar-AdaBoost sliding window performance against more standard models as well as a basic model tuned to classify based on linear trends in the data. We opted for support vector machine (SVM), common spatial patterns with linear discriminant analysis (CSP-LDA) with three frequency bands (0-7 Hz, 7-14 Hz and 15-25 Hz) using the top 4 filters per band as the input to the LDA, and EEGNet, a deep neural network classifier (Lawhern et al., 2018). Additionally, we tested a basic classifier by fitting lines to each window of 0.5 s and feeding the coefficients of the first-degree term for each channel into an LDA classifier. This approach relies solely on the slow trend of the data (e.g. an RP or other MRCP).
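The feature extraction for the basic slope classifier can be sketched as follows: a first-degree polynomial is fitted to each channel within a 0.5 s window and only the slope is kept. The function name and toy data are assumptions, and the LDA step is omitted.

```python
import numpy as np

def slope_features(window, fs=500):
    """Least-squares slope of each channel within one window.

    window: array of shape (n_channels, n_samples).
    Returns one slope per channel, in signal units per second.
    """
    t = np.arange(window.shape[1]) / fs
    # np.polyfit accepts a 2-D y with one column per channel; row 0 of the
    # result holds the first-degree coefficients.
    coeffs = np.polyfit(t, window.T, deg=1)
    return coeffs[0]

# Toy window: 23 channels with known slopes and arbitrary offsets.
rng = np.random.default_rng(0)
true_slopes = rng.normal(size=23)
offsets = rng.normal(size=23)
t = np.arange(250) / 500
window = true_slopes[:, None] * t[None, :] + offsets[:, None]
slopes = slope_features(window)
```

The resulting vector of 23 slopes per window would then serve as the input to the LDA classifier.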
For SVM classification we used the MATLAB fitcsvm function from the Statistics and Machine Learning Toolbox (R2023b). For CSP-LDA, temporal filtering was applied using causal Butterworth filtering in MATLAB, CSP filters were derived according to Blankertz et al. (2008), and LDA was performed using the fitcdiscr function from the Statistics and Machine Learning Toolbox (R2023b). For EEGNet, we conducted a grid search on the 0.5 s of data immediately preceding the slide transition in the participants from the held-out PF group, tuning two hyperparameters. While this approach involves a form of double dipping on the PF held-out validation set, it remains fully independent from our main set (OC), preserving the integrity of our final evaluation.
We tuned the number of temporal filters in the first convolutional layer (F1; tested values: 4, 8 and 16) and the depth multiplier for the depth-wise convolution layer (D; tested values: 1, 2 and 4). The highest AUCs were achieved with F1 = 8 and D = 2. We used a dropout rate of 0.5 following recommendations for within-participants classifications (Lawhern et al., 2018). Raw data from the 23 crown channels were used as input to the classifier. Weights were re-initialized to the same initial state at each sliding window location. Training was limited to 50 epochs with a patience parameter of 5 on a 20% validation subset of trials drawn from the train set at each fold.
Parameters and algorithm selection
EEG data from the held-out PF validation set was processed similarly to EEG data from OC (see appendix, Model selection set). A sliding window of 0.5 s in width was chosen under the assumption that it was large enough to capture any slow trends. This was confirmed later in the PF data by comparing the sliding window approach to a growing window approach (appendix figure S11). A step size for the sliding window of 0.02 s was chosen to ensure that we could pick up small temporal differences in the classifier’s performance while balancing computational processing time. For the wavelets and moving averages, the widths were chosen to represent a set of 22 frequencies ranging from 2 Hz (whole window) to 80 Hz. This choice proved sufficient to capture the intended spectral features in the PF EEG data (appendix figure S10). As with the sliding window, the time between each wavelet was also chosen to be 0.02 s to ensure a fine temporal granularity, balanced against computation time. The choice of pulse width step size (size of each step in the width of each pulse in proportion to the current window size) was selected after testing on PF EEG data: 12.5% only; 12.5% and 25%; or 12.5%, 25% and 37.5%. The highest classifier performance was obtained when using all three different pulse widths.
The choice of 23 channels was made to reduce the dimensionality of the feature space and was based on prior knowledge that peripheral EEG channels contain little motor preparation information and tend to be noisier. The choice of using pairwise differences between channels together with single channels was based on prior knowledge of the lateralization of motor related cortical potentials. A benchmark analysis confirmed that models including pairwise differences performed best on both EEG and MEG data of PF (appendix table S1). For the MRCP analysis (Figure 2), the choice of focusing on channel C3 was due to it showing the earliest MRCP onset (by-eye) in the active condition of the PF EEG data.
Our choice of Haar wavelets with moving averages combined with AdaBoost was further supported by benchmarking against a wide range of other models (Figure 5 and appendix table S1). In the supplementary material, further analyses on PF data are described, all indicating that Haar-AdaBoost is optimal to discriminate between active and passive trials. The AUC of EEG vs. MEG is compared to ensure that EEG contains enough information for this discrimination to be possible (appendix figure S1). We also varied the number of trials to estimate how many trials per participant were needed to address our research question (appendix figure S9).
Appendix 1
Supporting Figures

Validation AUC of PF dataset.
Validation AUC of Haar-AdaBoost applied to the MEG and EEG of the three participants of the PF dataset using the task-based approach. Time 0 is the time of slide transition (dotted gray line). The AUC is aligned to the leading edge of the sliding window. Shaded areas are the standard error of the AUC. While EEG provides great temporal resolution, it typically has poor spatial resolution. MEG on the other hand has a much higher spatial resolution for an identical temporal resolution. In order to ensure that our analysis could be performed interchangeably on EEG and MEG, we acquired EEG and MEG data concurrently in the same participants at PF. We found no improvement from using MEG over EEG before the slide transition. We do however see that MEG performs better after movement, indicating that post movement some spatial information is missing in the EEG.

Effect-matched spatial filtering of MEG data from PF.
Event-locked averages of the MEG data from PF projected onto a spatial filter. The top two rows represent the magnetometers and the gradiometers averaged over active trials. The bottom two rows represent the magnetometers and the gradiometers averaged over passive trials. Time 0 is the time of slide transition. Magnetometers measure the total strength of the magnetic field under the sensor while gradiometers measure a difference in magnetic field strength between two sensors - as such they reveal different aspects of the same signal (see Hämäläinen et al., 1993, for a more in-depth treatment). Thin lines represent individual sessions. Topographic plots represent the distribution of the spatial filters over the scalp. Each column is one of the three participants. In order to accurately represent the time course of the motor-related cortical field in the MEG we used a method called effect-matched spatial filtering (EMSF) to reduce the dimensionality of the data (see Schurger et al., 2013). For gradiometers and magnetometers separately, a spatial filter maximizing the difference between the activity 2.9 s before slide transition and that at slide transition was applied to the data to reduce all channels to a single time series. This encapsulates any trend in the signal that occurs over that 2.9 s period. We can see that for active trials, a negative deflection in the MEG is present.

Individual ERP.
Individual movement-related cortical potentials at C3 for the 15 participants at OC. In blue are the averages of the passive trials and in red the averages of the active trials. Time 0 is the time of slide transition (dotted gray line). Shaded areas are the standard errors of the mean.

AUC and ERP of Libet-task participants.
Machine learning and ERP results for three participants that we ran with a spontaneous movement initiation task based on Libet et al. (1983). (A) Individual validation AUC for the 3 participants. In light blue are the AUCs using the task-based approach and in green are the AUCs using the time-based approach. Time 0 is the time of slide transition (dotted gray line). The AUC is aligned to the leading edge of the sliding window. Shaded areas are the standard errors of the AUC. (B) Individual movement-related cortical potentials at C3 for the 3 participants. In blue are the averages of the passive trials and in red the averages of the active trials. Time 0 is the time of slide transition (dotted gray line). Shaded areas are the standard errors of the mean. Our slideshow paradigm, while providing a more ecologically valid task, is agnostic as to the spontaneity of the participants’ movement decisions. To ensure that our task would generalize to more standard paradigms in the field of self-initiated action, we performed our analyses on data from a spontaneous voluntary movement paradigm based on Libet et al. (1983). Here we collected three further participants (1 female, 2 males, age M=29.3, 1 left-handed, 2 right-handed). They performed 350 trials of a task similar to the task performed by participants at OC, except that there were no pictures, simply a fixation cross. Furthermore, the instructions for the manual trials were for participants to wait for a minimum of 3 seconds and then start monitoring inwards for an urge to move. Whenever they detected such an urge they were instructed to press as abruptly and spontaneously as possible, ending the trial. In the automatic trials, participants were instructed to do the same (monitor introspectively for an urge), except that they should not act on the urge if/when they felt it, but rather wait passively for the next urge, and repeat this process until the trial ended automatically.
This paradigm is much closer to the seminal studies on self-initiated actions (Libet et al., 1983; Kornhuber and Deecke, 1965). Here movements are performed spontaneously, and the matched condition (automatic) is identical in most regards apart from it not containing or terminating in a movement. All preprocessing and data analysis applied were the same. We found no qualitative difference with our main result (Figures 2A and 3A), suggesting that our task would generalize to other types of self-initiated actions.

Individual validation AUC.
Individual validation AUC for the 15 participants at OC. In light blue are the AUCs using the task-based approach and in green are the AUCs using the time-based approach. Time 0 is the time of slide transition (dotted gray line). The AUC is aligned to the leading edge of the sliding window. Shaded areas are the standard errors of the AUC.

Feature importance.
Feature importance for the task-based method. (A) Time-frequency plot of the grand average (n=15) feature importance as a normalized measure of the final weights selected by the model associated with features from each frequency at each time point. (B) Heatmap of the grand average (n=15) channel importance as a normalized measure of the final weights selected by the model associated with each channel over time. Values are normalized such that a value of 0.07 means that activity at this channel accounted for 7% of the model’s decision at this timepoint. When performing our sliding window analysis using Haar-AdaBoost (Figure 3) we extracted the importance of the different features used over time by the algorithm. For each position of the sliding window, each selected weak learner (a decision stump classifier) was associated with a single feature. One clear advantage of this approach is that each feature corresponds to a channel or a pair of channels, a unique frequency, and a unique time interval. Using the weights attributed to each of these weak classifiers, one can then extract from the final model, by normalizing the weights, the importance of each channel and frequency over time for the classification. Once the model was trained, we thus extracted the weight attributed to each channel and frequency over time. For each sliding window, we extracted the proportional weight a channel or a frequency carried in the classification. Notably, when we ran our analysis, the frequency feature importance (panel A) focused on the Mu rhythms and Beta desynchronization (Figure 2C) right after movement onset. Channels over left motor cortices also exhibited higher relevance right before, during and after movement (panel B).

Validation AUC with outliers excluded.
Model performance for the task- and time-based methods without earliest decoding time outliers. Grand average time course of the validation AUC (10-fold) using the task-based method (blue; N=12) and time-based method (green; N=13) from OC EEG data. The three gray dashed lines indicate the time of onset of the signal according to their respectively labeled indicators for the control method at OC EEG data. The shaded area represents the standard error of the mean. For the task-based method, 3 outliers were removed (participants 3, 6 and 8). For the time-based method, 2 outliers were removed (participants 8 and 11). Outliers were defined using the Tukey method as participants with earliest decoding times either larger than the third quartile plus 1.5 times the interquartile range (the difference between the third and first quartiles), or smaller than the first quartile minus 1.5 times the interquartile range. AUC onset t-test method here is at 20 ms before slide transition. Matching MRCP onsets with outliers removed are displayed in red. The thick dotted gray line represents the time of slide transition.
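The Tukey criterion used for outlier exclusion can be sketched as follows. The function name and the example values are hypothetical, and NumPy's default (linear interpolation) quartile definition is an assumption, since the quartile method is not specified here.

```python
import numpy as np

def tukey_outliers(values, k=1.5):
    """Boolean mask of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)

# Hypothetical earliest decoding times in seconds, one per participant.
edt = [-0.10, -0.12, -0.08, -0.11, -0.09, -1.50]
mask = tukey_outliers(edt)
```

In this made-up example only the last participant, with an EDT of -1.5 s, falls outside the fences.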

Benchmark comparison of standard classifiers.
Benchmark comparison of standard algorithms on PF data. (A-E) Validation AUC using the task-based approach applied to the EEG of the three participants of the PF validation dataset for Haar-AdaBoost (blue), EEGNet (purple), CSP-LDA (green), SVM (yellow), and basic slope LDA (pink). The color scheme matches the one used in Figure 5.

Number of trials required for classification.
Performance of Haar-Adaboost on EEG data from PF (averaged over the 3 participants) using the task-based approach as function of the number of trials. (A) The time course of the validation AUC of 10 runs of Haar-Adaboost with a different number of trials. Time 0 is the time of slide transition (dotted gray line). (B) The time course of the standard error of the validation AUC of 10 runs of Haar-Adaboost with a different number of trials. Time 0 is the time of slide transition (dotted gray line). We iteratively ran via random subsampling our pipeline on our data from PF. We found that after 150 trials the AUC time course plateaued and did not improve much with more samples (panel A). Similarly, after about 300 trials the standard error did not improve much and converged (panel B).

Decoding performance for frequency based differences.
Performance, in the frequency domain, of Haar-AdaBoost using the task-based approach on EEG data from PF. (A) Time courses of the validation AUC of Haar-AdaBoost for three participants using raw data (purple) or data decomposed into different frequency bands (light green). Time 0 is the time of slide transition (dotted gray line). AUC is aligned to the leading edge of the sliding window. Shaded areas indicate the standard error of the AUC. (B) AUC of Haar-AdaBoost when classifying raw data from passive trials against data from passive trials that has been band-stop filtered at a specific frequency (x-axis), aligned to the slide transition. Error bars represent the standard error of the AUC. To establish whether our Haar wavelet features gave Haar-AdaBoost the ability to pick up on frequency differences in specific bands we performed two separate analyses. Using data from PF, we first decomposed it using a causal Butterworth filter and Hilbert transform into 30 frequencies ranging from 2 to 80 Hz. We then compared this to using only raw data with Haar-AdaBoost and found that the two perform on a par (panel A). Next, we took the passive trials from PF data and made 30 copies, to each of which we selectively applied a band-stop filter using 30 distinct bands from 2 to 80 Hz. We then ran the algorithm 30 times, each time classifying the unfiltered trials against those that had been band-stop filtered. We found that Haar-AdaBoost could classify all of them with high AUCs, even though the two classes differed each time only by a missing frequency range (panel B), indicating that Haar-AdaBoost can effectively pick up on subtle frequency differences.

Comparison of growing and sliding windows.
Average validation AUC of Haar-AdaBoost of the three participants of the PF EEG dataset using a sliding window (light blue) or growing window approach (pink) with the task-based approach. Time 0 is the time of slide transition (dotted gray line). The AUC is aligned to the leading edge of the sliding window. Shaded area is the standard error of the mean. A growing window analysis is similar to the sliding window analysis except that at each iteration, instead of sliding the window, the window width is increased by 0.02 s. This means that for an epoch beginning 3 s before movement, the window of analysis for the leading edge aligned to the slide transition would be 3 s wide from -3 s to 0 s (while for the sliding window method it would be 0.5 s wide from -0.5 s to 0 s). While it enables the capture of more patterns, it is a computationally heavy analysis. To ensure that we were not missing substantial longer-term patterns with the sliding window, we ran a growing window on our PF EEG data. We found no improvement from using a growing window over a sliding window.

Decoding performance at trial onset.
Average model performance using the task-based approach on the EEG data from 15 participants at OC in terms of validation AUC. The analysis is aligned to the start of the trial. The shaded red region represents the one second period during which the instructions (“manual” or “automatic”) were being displayed to the participants. The appearance of the slide occurred at time 1.25 s with respect to trial start (vertical gray line). In an effort to ensure that our algorithm would be able to detect cognitive differences between the two tasks prior to movement onset, we decided to check whether it could classify above chance with the epochs time-locked to trial start, while the instructions (manual or automatic) were displayed on screen. We aligned our epochs to the start of the trial and then ran Haar-AdaBoost to classify manual versus automatic. At the start of each trial, the instructions were shown on screen for 1 second followed by a 0.25 s blank screen before the appearance of the slide. Here, for convenience, trial rejection was not performed manually but instead, any channel within each trial that had a range of more than 120 μV (max-min) was interpolated trial-by-trial. If more than 5 channels needed to be interpolated in a given trial, the trial was excluded from the analysis. While our decoding did not reach AUC values as high as it did right after slide transition, Haar-AdaBoost did successfully classify while the instructions were on screen. Participants’ individual classification AUC peaked at about 0.65 during the 1 s instruction window and at 0.63 over a period of 1.75 s following the onset of the slide (peaks at the individual level occurred on average 0.61 s after the onset of the slide), indicating the presence of a second jittered decoding peak after the appearance of the slide (which is flattened on the average across participants; this post slide onset peak only reaches 0.55 at the group average).
It is worth noting that this early classification time-locked to the trial start is not reflected in the classification time-locked to the slide transition (Figure 3) for multiple reasons. Firstly, it is a relatively low AUC which phases off after a little over a second following trial start. When combined with the jitter induced by the trial variability of response times (Figure 2D), in those trials that overlap, the jitter might be enough to smear the AUC. Secondly, AUC onset, using the MRCP onset methods (Figure 3A) or the AUC and single trial methods (Figure 3B) are computed backwards from slide transition. They could therefore miss early isolated temporal windows of above chance classification. We chose this approach to avoid false positive rate contaminating our earliest decoding time values but also because we were specifically interested in the onset of a signal similar to the early signals found in MRCP, that is, exhibiting an onset that is then sustained until the slide transition. Additionally, early decoding peaks followed by silent decoding periods have been reported in the literature. For instance, when predicting upcoming decision outcome from fMRI data, the pre-SMA first reflects this outcome significantly 6 seconds before movements then the significance goes aways and picks back up 2 seconds after the movement (see Figure 2 in (Soon et al., 2008)). In memory research, activity silent neurons have also been identified (Stokes, 2015; Trübutschek et al., 2019). All in all it is very plausible that the activity encoding the task (active vs. passive) was just not accessible to our classifier (i.e. not in the EEG), but this is entirely compatible with our claim that early MRCP (which are accessible to the classifier) are not predictive of an early commitment to a decision.

Comparison of total number of trials per session.
Summary of the results with participants split according to the number of trials they had per session. (A) Grand average movement-related cortical potential at C3 for non-outlier (see appendix figure S6) participants whose sessions contained 350 trials (blue; N=7) and those whose sessions contained 700 trials (red; N=5), aligned to slide transition (t=0; dotted gray line). The shaded area represents the standard error of the mean. The mean MRCP in the 0.5 s preceding slide transition did not differ significantly between the two groups of participants (M=0.58 SD=1.56; t(10)=0.63, p=0.54). (B) Grand average time course of the validation AUC (10-fold) for non-outlier participants whose sessions contained 350 trials (light blue; N=7) and those whose sessions contained 700 trials (green; N=5). The x-axis represents the time of the leading edge of the sliding window aligned to slide transition (t=0; dotted gray line). The shaded area represents the standard error of the mean.
Supporting Table 1

Validation AUC across participants, data types, algorithms, and feature sets.
Supporting Text 1
Model selection set
We previously collected data (in 2011) from three participants (all male, age M=24.7, all right-handed) at the Neurospin Research Center, near Paris, France (PF). The study was approved by the Comité de Protection des Personnes Ile de France VII, and participants were compensated 40€ per hour. We used this dataset to evaluate different algorithms and to select Haar-AdaBoost and all its parameters. We also used these data to identify channel C3 as the channel of interest for MRCPs in our task: in the participants at PF, channel C3 yielded the earliest MRCP onsets. The task performed by the participants at PF was identical to that described in Materials and Methods, Procedure, with the notable differences that the task instructions and on-screen text (‘manuel’ and ‘automatique’) were administered in French and that participants performed a finger lift rather than a button press.
Both EEG and MEG were acquired using an Elekta NeuroMag system. At PF, stimuli were back-projected onto a translucent viewing screen positioned approximately 60 cm in front of the participant, subtending approximately 10 degrees of visual angle. Ambient lighting was dim. Participants sat comfortably in a slightly reclined position with their heads inside of the MEG helmet and hands placed palms down on a table.
Participants at PF completed four sessions of 350 trials each at the Neurospin Research Center near Paris (1400 trials total per participant). In each session there were 200 active trials and 100 passive trials in random order, plus an additional 50 active trials added at the end of each session.
The magnetoencephalography setup at PF included a whole-head 306-channel sensor array with 102 magnetometers and 102 pairs of orthogonal planar gradiometers. The participants also wore a 60-channel MEG-compatible EEG cap, referenced to the tip of the nose, with electrode impedances kept below 15 kΩ. Data were sampled at 1000 Hz (with an analog low-pass filter at 300 Hz cutoff) then downsampled to 500 Hz. No high-pass filtering was applied; the data were recorded in DC mode.
Each participant’s head position inside the MEG helmet was estimated at the beginning of each 5-minute run. This information was used during the data analysis to correct for small changes in head position during the recording session. For participants who returned for multiple sessions, the same EEG cap was used during each session. The Signal Space Separation algorithm (Taulu et al., 2004) was applied offline to the MEG data, using Neuromag’s MaxFilter software, to correct for magnetic-field artifacts originating outside of the MEG helmet. We used standard procedures for identifying and interpolating bad channels in the MEG or EEG recordings. No more than 5 bad channels of each type (EEG, MEG) were identified and interpolated in any recording session. Data epochs were extracted from 4 s before to 0.5 s after each slide transition. Independent components analysis (ICA) was used to identify and remove eye-blink, eye-movement, and cardiac artifacts separately from the EEG, magnetometer, and gradiometer data. Trials containing muscle or movement artifacts were manually excluded from the EEG data during visual inspection of the 23 channels of the crown that were used in the machine learning (F3, F1, Fz, F2, F4, FC5, FC1, FC2, FC6, C3, C1, Cz, C2, C4, CP3, CP1, CP2, CP4, P1, Pz, P2, P3 and P4). A baseline subtraction was applied, taking the average activity at each channel from 4 s to 3 s before movement.
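The baseline subtraction described above can be sketched for a single channel of a single epoch as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, the epoch is assumed to start at -4 s and to be sampled at 500 Hz (the downsampled rate reported above), and the baseline interval is treated as including -4 s and excluding -3 s.

```python
def baseline_correct(samples, sfreq=500.0, epoch_start=-4.0,
                     base_start=-4.0, base_end=-3.0):
    """Subtract the mean of the baseline interval from every sample.

    `samples` is one channel of one epoch, with the first sample at
    `epoch_start` seconds relative to the slide transition.
    """
    # Convert baseline times (s) to sample indices within the epoch.
    i0 = int(round((base_start - epoch_start) * sfreq))
    i1 = int(round((base_end - epoch_start) * sfreq))
    baseline = sum(samples[i0:i1]) / (i1 - i0)
    return [x - baseline for x in samples]


# Toy epoch from -4 s to 0.5 s (2250 samples at 500 Hz): a constant
# offset of 2 µV during the baseline, 5 µV afterwards.
epoch = [2.0] * 500 + [5.0] * 1750
corrected = baseline_correct(epoch)
print(corrected[0], corrected[-1])
```

After correction the baseline interval averages to zero, so slow drifts recorded in DC mode do not bias the amplitude of the later pre-movement activity.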
Data availability
All behavioral and electrophysiological data and data analysis computer code have been deposited at https://osf.io/wm95q/overview and are publicly available.
Acknowledgements
We thank Dr. Elnaz Lashgari for her help with data collection.
Additional information
Funding
EC | European Research Council (ERC) (640626 ACTINIT)
Aaron Schurger
EC | FP7 | People | FP7 People: Marie-Curie Actions (Specific Programme 'People' Implementing the Seventh Framework Programme of the European Community for Research, Technological Development and Demonstration Activities 2007 to 2013) (252665 CODEC)
Aaron Schurger
References
- Electrode fusion for the prediction of self-initiated fine movements from single-trial readiness potentials. Int J Neural Syst 25:1550014. https://doi.org/10.1142/s0129065715500148
- Prediction of human voluntary movement before it occurs. Clin Neurophysiol 122:364–72. https://doi.org/10.1016/j.clinph.2010.07.010
- Neural Mechanisms Determining the Duration of Task-free, Self-paced Visual Perception. J Cogn Neurosci 36:756–775. https://doi.org/10.1162/jocn_a_02131
- Optimizing Spatial Filters for Robust EEG Single-Trial Analysis. IEEE Signal Processing Magazine 25:41–56. https://doi.org/10.1109/MSP.2008.4408441
- Cortical Measures of Anticipation. Journal of Psychophysiology 18:61–76. https://doi.org/10.1027/0269-8803.18.23.61
- A meta-analysis of Libet-style experiments. Neuroscience & Biobehavioral Reviews 128:182–198. https://doi.org/10.1016/j.neubiorev.2021.06.018
- Distribution of slow brain potentials related to motor preparation and stimulus anticipation in a time estimation task. Electroencephalogr Clin Neurophysiol 69:234–43. https://doi.org/10.1016/0013-4694(88)90132-0
- Analyzing Neural Time Series Data: Theory and Practice. The MIT Press. https://doi.org/10.7551/mitpress/9609.001.0001
- Magnetic fields of the human brain accompanying voluntary movement: Bereitschaftsmagnetfeld. Exp Brain Res 48:144–8. https://doi.org/10.1007/bf00239582
- Libet’s legacy: A primer to the neuroscience of volition. Neurosci Biobehav Rev 157:105503. https://doi.org/10.1016/j.neubiorev.2023.105503
- Mental summation: The timing of voluntary intentions by cortical activity. Behavioral and Brain Sciences 8:542–543. https://doi.org/10.1017/S0140525X00044952
- The lateralized readiness potential as an on-line measure of central response activation processes. Behavior Research Methods, Instruments, & Computers 30:146–156. https://doi.org/10.3758/BF03209424
- Supplementary motor area activation preceding voluntary movement is detectable with a whole-scalp magnetoencephalography system. Neuroimage 11:697–707. https://doi.org/10.1006/nimg.2000.0579
- Internally generated preactivation of single neurons in human medial frontal cortex predicts volition. Neuron 69:548–62. https://doi.org/10.1016/j.neuron.2010.11.045
- Single trial analysis of slow cortical potentials: a study on anticipation related potentials. J Neural Eng 10:036014. https://doi.org/10.1088/1741-2560/10/3/036014
- The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143:29–36. https://doi.org/10.1148/radiology.143.1.7063747
- Voluntary Motor Command Release Coincides with Restricted Sensorimotor Beta Rhythm Phases. J Neurosci 42:5771–5781. https://doi.org/10.1523/jneurosci.1495-21.2022
- Neural activity related to reaching and grasping in rostral and caudal regions of rat motor cortex. Behav Brain Res 94:255–69. https://doi.org/10.1016/s0166-4328(97)00157-5
- Magnetoencephalography—theory, instrumentation, and applications to noninvasive studies of the working human brain. Reviews of Modern Physics 65:413–497. https://doi.org/10.1103/RevModPhys.65.413
- Microcircuitry coordination of cortical motor information in self-initiation of voluntary movements. Nat Neurosci 12:1586–93. https://doi.org/10.1038/nn.2431
- Readiness discharge for spontaneous initiation of walking in crayfish. J Neurosci 30:1348–62. https://doi.org/10.1523/jneurosci.4885-09.2010
- Readiness for movement - The Bereitschaftspotential-Story. Current Contents Life Sciences 33:14
- Hirnpotentialänderungen bei Willkürbewegungen und passiven Bewegungen des Menschen: Bereitschaftspotential und reafferente Potentiale. Pflüger’s Archiv für die gesamte Physiologie des Menschen und der Tiere 284:1–17. https://doi.org/10.1007/BF00412364
- Awareness of unawareness: Folk psychology and introspective transparency. Journal of Consciousness Studies 18:135–160
- EEGNet: a compact convolutional neural network for EEG-based brain-computer interfaces. J Neural Eng 15:056013. https://doi.org/10.1088/1741-2552/aace8c
- Putaminal activity for simple reactions or self-timed movements. J Neurophysiol 89:2528–37. https://doi.org/10.1152/jn.01055.2002
- Self-paced movement intention detection from human brain signals: Invasive and non-invasive EEG. In: 2012 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 3280–3283. https://doi.org/10.1109/EMBC.2012.6346665
- Preparation- or intention-to-act, in relation to pre-event potentials recorded at the vertex. Electroencephalography and Clinical Neurophysiology 56:367–372. https://doi.org/10.1016/0013-4694(83)90262-6
- On the ability to inhibit thought and action: A theory of an act of control. Psychological Review 91:295–327. https://doi.org/10.1037/0033-295X.91.3.295
- Parietal area 5 and the initiation of self-timed movements versus simple reactions. J Neurosci 26:2487–98. https://doi.org/10.1523/jneurosci.3590-05.2006
- What does recent neuroscience tell us about criminal responsibility? Journal of Law and the Biosciences 3:120–139. https://doi.org/10.1093/jlb/lsv051
- Nonparametric statistical testing of EEG- and MEG-data. J Neurosci Methods 164:177–90. https://doi.org/10.1016/j.jneumeth.2007.03.024
- The timing of the conscious intention to move. Eur J Neurosci 28:2344–51. https://doi.org/10.1111/j.1460-9568.2008.06525.x
- Mesoscale cortex-wide neural dynamics predict self-initiated actions in mice several seconds prior to movement. eLife 11:e76506. https://doi.org/10.7554/eLife.76506
- FieldTrip: Open Source Software for Advanced Analysis of MEG, EEG, and Invasive Electrophysiological Data. Computational Intelligence and Neuroscience 2011:156869. https://doi.org/10.1155/2011/156869
- Origin of human motor readiness field linked to left middle frontal gyrus by MEG and PET. Neuroimage 8:214–20. https://doi.org/10.1006/nimg.1998.0362
- Evaluation of event-related desynchronization (ERD) preceding and following voluntary self-paced movement. Electroencephalogr Clin Neurophysiol 46:138–46. https://doi.org/10.1016/0013-4694(79)90063-4
- Patterns of cortical activation during planning of voluntary movement. Electroencephalography and Clinical Neurophysiology 72:250–258. https://doi.org/10.1016/0013-4694(89)90250-2
- EEG event-related desynchronization (ERD) and synchronization (ERS). Electroencephalography and Clinical Neurophysiology 103:26
- Neuronal activity preceding self-initiated or externally timed arm movements in area 6 of monkey cortex. Exp Brain Res 67:656–62. https://doi.org/10.1007/bf00247297
- Role of primate basal ganglia and frontal cortex in the internal generation of movements. III. Neuronal activity in the supplementary motor area. Exp Brain Res 91:396–407. https://doi.org/10.1007/bf00227836
- How does neuroscience affect our conception of volition? Annu Rev Neurosci 33:109–30. https://doi.org/10.1146/annurev-neuro-060909-153151
- Boosting: Foundations and Algorithms. The MIT Press. https://doi.org/10.7551/mitpress/8291.001.0001
- ‘Catching the waves’ - slow cortical potentials as moderator of voluntary action. Neurosci Biobehav Rev 68:639–650. https://doi.org/10.1016/j.neubiorev.2016.06.023
- The point of no return in vetoing self-initiated movements. Proc Natl Acad Sci USA 113:1080–5. https://doi.org/10.1073/pnas.1513569112
- What Is the Readiness Potential? Trends in Cognitive Sciences 25:558–570. https://doi.org/10.1016/j.tics.2021.04.001
- Reducing multi-sensor data to a single time course that reveals experimental effects. BMC Neuroscience 14:122. https://doi.org/10.1186/1471-2202-14-122
- An accumulator model for spontaneous neural activity prior to self-initiated movement. Proceedings of the National Academy of Sciences 109:E2904–E2913. https://doi.org/10.1073/pnas.1210467109
- Readiness potential and movement initiation in the rat. Jpn J Physiol 55:1–9. https://doi.org/10.2170/jjphysiol.R2073
- Consciousness, free will, and moral responsibility: Taking the folk seriously. Philosophical Psychology 28:929–946. https://doi.org/10.1080/09515089.2014.962018
- Unconscious determinants of free decisions in the human brain. Nature Neuroscience 11:543–545. https://doi.org/10.1038/nn.2112
- ‘Activity-silent’ working memory in prefrontal cortex: a dynamic coding framework. Trends Cogn Sci 19:394–405. https://doi.org/10.1016/j.tics.2015.05.004
- Suppression of interference and artifacts by the Signal Space Separation Method. Brain Topogr 16:269–75. https://doi.org/10.1023/b:brat.0000032864.93890.f9
- Probing the limits of activity-silent non-conscious working memory. Proceedings of the National Academy of Sciences 116:14358–14367. https://doi.org/10.1073/pnas.1820730116
- Lost in time…: The search for intentions and Readiness Potentials. Consciousness and Cognition 33:300–315. https://doi.org/10.1016/j.concog.2015.01.011
- Probing for Intentions: Why Clocks Do Not Provide the Only Measurement of Time. Frontiers in Human Neuroscience 13. https://doi.org/10.3389/fnhum.2019.00068
- Response inhibition in the stop-signal paradigm. Trends Cogn Sci 12:418–24. https://doi.org/10.1016/j.tics.2008.07.005
Cite all versions
You can cite all versions using the DOI https://doi.org/10.7554/eLife.109913. This DOI represents all versions, and will always resolve to the latest one.
Copyright
© 2026, Jeay-Bizot et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.