Neural signatures of vigilance decrements predict behavioural errors before they occur
Abstract
There are many monitoring environments, such as railway control, in which lapses of attention can have tragic consequences. Problematically, sustained monitoring for rare targets is difficult, with more misses and longer reaction times over time. What changes in the brain underpin these ‘vigilance decrements’? We designed a multiple-object monitoring (MOM) paradigm to examine how the neural representation of information varied with target frequency and time performing the task. Behavioural performance decreased over time for the rare target (monitoring) condition, but not for a frequent target (active) condition. There was subtle evidence of this also in the neural decoding using magnetoencephalography (MEG): for one time-window (of 80 ms), coding of critical information declined more during monitoring versus active conditions. We developed new analyses that can predict behavioural errors from the neural data more than a second before they occurred. This facilitates pre-empting behavioural errors due to lapses in attention and provides new insight into the neural correlates of vigilance decrements.
Introduction
When people monitor displays for rare targets, they are slower to respond and more likely to miss those targets relative to frequent target conditions (Wolfe et al., 2005; Warm et al., 2008; Rich et al., 2008; Reason, 1990; Reason, 2000). This effect is more pronounced as the time doing the task increases, which is often called a ‘vigilance decrement’. Theoretical accounts of vigilance decrements fall into two main categories. ‘Cognitive depletion’ theories suggest performance drops as cognitive resources are ‘used up’ by the difficulty of sustaining attention under vigilance conditions (Helton and Warm, 2008; Helton and Russell, 2011; Warm et al., 2008). In contrast, ‘mind wandering’ theories suggest that the boredom of the task tends to result in insufficient involvement of cognitive resources, which in turn leads to performance decrements (Manly et al., 1999; Smallwood and Schooler, 2006; Young and Stanton, 2002). Either way, there are many real-life situations where such a decrease in performance over time can lead to tragic consequences, such as the Paddington railway disaster (UK, 1999), in which a slow response time to a stop signal resulted in a train moving another 600 m past the signal into the path of an oncoming train. With the move towards automated and semi-automated systems in many high-risk domains (e.g., power-generation and trains), humans now commonly need to monitor systems for infrequent computer failures or errors. These modern environments challenge our attentional systems and make it urgent to understand the way in which monitoring conditions change the way important information about the task is encoded in the human brain.
To date, most vigilance and rare target studies have used simple displays with static stimuli. Traditional vigilance tasks, inspired by radar operators in WWII, require participants to respond to infrequent visual events on otherwise blank screens, and show more targets are missed as time on task increases (Mackworth, 1948). More recent vigilance tasks have participants detect infrequent target stimuli among non-targets, and typically show an increase in misses as time on task increases. In Temple et al., 2000, for example, with only 20% targets, after 10 min target detection rates declined from 97% to 93% for high contrast (easy) and from 95% to 83% for low (hard) contrast targets. Other approaches have been to test for vigilance effects using frequent responses to non-targets, which have the advantage of more data points for analysis. The Sustained Attention to Response Task (SART), for example, requires participants to respond to each non-target item in a rapid stream of stimuli and occasionally withhold a response to a target item (Beck et al., 1956; Rosenberg et al., 2013). These approaches usually show effects on reaction times (RTs), which increase and become more variable with time on task (Rosenberg et al., 2013; Möckel et al., 2015; Singleton, 1953), although others have found RTs decrease (Rubinstein, 2020). Faster RTs also occur for ‘target absent’ responses in rare target visual search (Wolfe et al., 2005; Rich et al., 2008). Overall, vigilance decrements in terms of poorer performance can be seen in both accuracy and in RTs, depending on the task.
Despite these efforts, modern environments (e.g., rail and air traffic control) have additional challenges not encapsulated by these measures. These include multiple moving objects, potentially appearing at different times, and moving simultaneously in different directions. When an object moves in space, its neural representation has to be continuously updated so we can perceive the object as having the same identity. Tracking moving objects also requires considerable neural computation: in addition to spatial remapping, for example, we need to predict direction, speed, and the distance of the object to a particular destination. These features cannot be studied using static stimuli; they require objects that shift across space over time. In addition, operators have complex displays requiring selection of some items while ignoring others. We therefore need new approaches to study vigilance decrements in situations that more closely resemble the real-life environments in which humans are now operating. Developing these methods will provide a new perspective on fundamental questions of how the brain implements sustained attention in moving displays, and the way in which monitoring changes the encoding of information compared with active task involvement. These new methods may also provide avenues to optimise performance in high-risk monitoring environments.
The brain regions involved in maintaining attention over time have been studied using functional magnetic resonance imaging (fMRI), which measures changes in cerebral blood flow (Adler et al., 2001; Benedict et al., 2002; Coull et al., 1996; Gilbert et al., 2006; Johannsen et al., 1997; Ortuño et al., 2002; Périn et al., 2010; Schnell et al., 2007; Sturm et al., 1999; Tana et al., 2010; Thakral and Slotnick, 2009; Wingen et al., 2008). These studies compared brain activation in task vs. resting baseline or sensorimotor control (which involved no action) conditions and used univariate analyses to identify regions with higher activation under task conditions. This has the limitation that there are many features that differ between the contrasted (subtracted) conditions, not just the matter of sustained attention. Specifically, this comparison cannot distinguish whether the activation during sustained attention is caused by the differences in the task, stimuli, responses, or a combination of these factors. As it is challenging to get sufficient data from monitoring (vigilance) tasks in the scanner, many previous studies used tasks with relatively frequent targets, in which vigilance decrements usually do not occur. However, despite these challenges, Langner and Eickhoff, 2013 reviewed vigilance neuroimaging studies and identified a network of right-lateralised brain regions including dorsomedial, mid- and ventrolateral prefrontal cortex, anterior insula, parietal and a few subcortical areas that they argue form the core network subserving vigilant attention in humans.
The areas identified by Langner and Eickhoff, 2013 show considerable overlap with a network previously identified as being recruited by many cognitively challenging tasks, the ‘multiple demand’ (MD) regions, which include the right inferior frontal gyrus, anterior insula, and intra-parietal sulcus (Duncan and Owen, 2000; Duncan, 2010; Fedorenko et al., 2013; Woolgar et al., 2011; Woolgar et al., 2015a; Woolgar et al., 2015b).
Other fMRI studies of vigilance have focused on the default mode network, composed of discrete areas in the lateral and medial parietal, medial prefrontal, and medial and lateral temporal cortices such as posterior cingulate cortex (PCC) and ventral anterior cingulate cortex (vACC), which is thought to be active during ‘resting state’ and less active during tasks (Greicius et al., 2003; Greicius et al., 2009; Raichle, 2015). Eichele et al., 2008 suggested that lapses in attention can be predicted by decrease of deactivation of this default mode network. In contrast, Weissman et al., 2006 identified deactivation in the anterior cingulate and right prefrontal regions in pre-stimulus time windows when targets were missed. Ekman et al., 2012 also observed decreased connectivity between sensory visual areas and frontal brain areas on the pre-stimulus time span of incorrect trials in colour/motion judgement tasks. More recently, Sadaghiani et al., 2015 showed that the functional connectivity between sensory and ‘vigilance-related’ (cingulo-opercular) brain areas decreased prior to behavioural misses in an auditory task while between the same sensory area and the default-mode network the connectivity increased. These findings suggest that modulation of interactions between sensory and vigilance-related brain areas might be related to behavioural misses in monitoring tasks.
Detecting changes in brain activation that correlate with lapses of attention can be particularly challenging with fMRI, given that it has poor temporal resolution. Electroencephalography (EEG), which records electrical activity at the scalp, has much better temporal resolution, and has been the other major approach for examining changes in brain activity during sustained attention tasks. Frequency band analyses have shown that low-frequency alpha (8–10.9 Hz) oscillations predict task workload and performance during monitoring of simulated air traffic (static) displays with rare targets, while frontal theta band (4–7.9 Hz) activity predicts task workload only in later stages of the experiment (Kamzanova et al., 2014). Other studies find that increases in occipital alpha oscillations can predict upcoming error responses (Mazaheri et al., 2009) and misses (O'Connell et al., 2009) in go/no-go visual tasks with target frequencies of 11% and 9%, respectively. These changes in signal power that correlate with the task workload or behavioural outcome of trials are useful, but provide relatively coarse-level information about what changes in the brain during vigilance decrements.
Understanding the neural basis of decreases in performance over time under vigilance conditions is not just theoretically important, it also has potential real-world applications. In particular, if we could identify a reliable neural signature of attentional lapses, then we could potentially intervene prior to any overt error. For example, with the development of autonomous vehicles, being able to detect when a driver is not engaged, combined with information about a potential threat, could allow emergency braking procedures to be initiated. Previous studies have used physiological measures such as pupil size (Yoss et al., 1970), body temperature (Molina et al., 2019), skin conductance, and blood pressure (Lohani et al., 2019) to indicate the level of human arousal or alertness, but these lack the fine-grained information necessary to distinguish transient dips from problematic levels of inattention in which task-related information is lost. In particular, we lack detail on how information processing changes in the brain during vigilance decrements. This knowledge is crucial to develop a greater understanding of how humans sustain vigilance.
In this study, we developed a new task, multiple-object monitoring (MOM), which includes key features of real-life situations confronting human operators in high-risk environments. These features include moving objects, varying levels of target frequency, and a requirement to detect and avoid collisions. A key feature of our MOM task is that it allows measurement of the specific decrements in performance during vigilance (sustaining attention in a situation where only infrequent responses are needed) separate from more general decreases in performance simply due to doing a task for an extended period. Surprisingly, this is not typically the case in vigilance tasks. We recorded neural data using the highly sensitive method of magnetoencephalography (Baillet, 2017) and used multivariate pattern analyses (MVPA) to determine how behavioural vigilance decrements correlate with changes in the neural representation of information. We used these new approaches to better understand the way in which changes between active and monitoring tasks affect neural representation, including functional connectivity. We then examined the potential for using these neural measures to predict forthcoming behavioural misses based on brain activity.
Results
Participants completed the MOM task during which they monitored several dots moving on visible trajectories towards a centrally presented fixed object (Figure 1A). The trajectories spanned from corners of the screen towards the central object and deflected at 90° before contacting the central object. The participants’ task was to keep fixation on the central object and press the button to deflect the moving dot if it violated its trajectory and continued towards the central object after reaching the deflection point. They were tasked to do so before the dot ‘collided’ with the central object. In each block (~110 s), only dots of one colour (either green or red) were relevant and should be responded to by the participants; dots in the cued colour are called Attended and dots in the other colour Unattended. Either 50% or 6% of the attended dots (cued colour) were targets (i.e., violated their trajectory requiring a response; see Materials and methods) generating Active and Monitoring conditions, respectively.

The multiple-object monitoring (MOM) task and types of information decoded.
(A) At the start of a block, the relevant colour is cued (here, green; distractors in red). Over the on-task period (~30 min per task condition), multiple dots entered from either direction, each moving along a visible individual trajectory towards the middle object. Only attended dots that failed to deflect along the trajectories at the deflection point required a response (Target: bottom right display). Participants did not need to press the button for the unattended dot (Distractor: top right display) or the dots that kept moving on the trajectories (Event: middle right panel). Each dot took ~1226 ms from appearance to deflection. (B) Direction of approach information (left display: left vs. right as indicated by dashed and solid lines, respectively) and distance to object information (right display). Note the blue dashed lines and orange arrows were not present in the actual display. d1, d2, etc. denote the ‘distance units’ used to train the classifier for the key distance to object information. A demo of the task can be found here [https://osf.io/5aw8v/].
Behavioural data: The MOM task evokes a reliable vigilance decrement
In the first block of trials (i.e., the first 110 s, excluding the two practice blocks), participants missed 29% of targets in the Active condition and 40% of targets in the Monitoring condition. However, note that the number of targets in any single block is necessarily very low for the Monitoring condition (for a single block, there are 16 targets for Active but only two targets for Monitoring). The pattern becomes more robust over blocks, and Figure 2A shows the miss rates changed over time in different directions for the Active vs. Monitoring conditions. For Active blocks, miss rates decreased over the first five blocks and then plateaued at ~17%. For Monitoring, however, miss rates increased throughout the experiment: by the final block, these miss rates were up to 76% (but again, the low number of targets in Monitoring means that we should use caution in interpreting the results of any single block alone). There was evidence that miss rates were higher in the Monitoring than Active conditions from the fourth block onwards (BF >3; Figure 2A). Participants’ RTs on correct trials also showed evidence of specific vigilance decrements, increasing over time under Monitoring but decreasing under Active task conditions (Figure 2B). There was evidence that RTs were slower for Monitoring compared with Active from the sixth block onwards (BF >3, except for Block #11). The characteristic pattern of increasing miss rates and slower RTs over time in the Monitoring relative to the Active condition validates the MOM task as effectively evoking vigilance decrements.

Behavioural performance on the MOM task.
The percentage of miss trials (A), and correct reaction times (B), as a function of block. Thick lines show the average across participants (shading 95% confidence intervals) for Active (blue) and Monitoring (red) conditions. Each block lasted for 110 s and had either 16 (Active) or 2 (Monitoring) targets out of 32 cued-colour and 32 non-cued colour dots. Bayes factors (BF) are shown in the bottom section of each graph: Filled circles show moderate/strong evidence for either hypothesis and empty circles indicate insufficient evidence when evaluating the contrast between Active and Monitoring conditions.
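The per-block counts in this caption give the two target frequencies directly, and they also show why single-block miss rates in Monitoring are coarse. The following is a minimal arithmetic sketch in Python; the variable and function names are illustrative and are not taken from the study's analysis code.

```python
# Counts per block as stated in the caption above.
CUED_DOTS_PER_BLOCK = 32
TARGETS_PER_BLOCK = {"Active": 16, "Monitoring": 2}


def target_frequency(condition):
    """Proportion of cued-colour dots that are targets in one block."""
    return TARGETS_PER_BLOCK[condition] / CUED_DOTS_PER_BLOCK


def possible_miss_rates(condition):
    """Every miss rate (in %) a single participant can produce in one block."""
    n_targets = TARGETS_PER_BLOCK[condition]
    return [100 * missed / n_targets for missed in range(n_targets + 1)]


for condition in TARGETS_PER_BLOCK:
    rates = possible_miss_rates(condition)
    print(condition, f"{target_frequency(condition):.2%}", f"{len(rates)} possible miss rates")
```

With only two targets, a participant's miss rate in a single Monitoring block can take just three values (0%, 50%, or 100%), whereas an Active block allows 17 values. This is one way to see why the block-wise Monitoring values are better read as a group-level trend than block by block.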
Neural data: Decoding different aspects of task-related information
We used multivariate pattern analysis (i.e., decoding) to extract two types of information from MEG data about each dot’s movement on the screen: information about the direction of approach (whether the dot was approaching the central object from left or right side of the screen) and distance to object (how far was the dot relative to the central object; Figure 1B; see Materials and methods).
With so much going on in the display at one time, we first needed to verify that we can successfully decode the major aspects of the moving stimuli, relative to chance. The full data figures and details are presented in Supplementary materials: We were able to decode both direction of approach and distance to object from MEG signals (see Figure 3—figure supplement 1). Thus, we can turn to our main question about how these representations were affected by the Target Frequency, Attention, and Time on Task.
The neural correlates of the vigilance decrement
As the behavioural results showed (Figure 2), the difference between Active and Monitoring conditions increased over time, showing the greatest difference during the final blocks of the experiment. To explore the neural correlates of these vigilance decrements, we evaluated information representation in the brain during the first five and last five blocks of each task (called early and late blocks, respectively) and the interactions between the Target Frequency, Attention, and the Time on Task using a three-way Bayes factor ANOVA.
Effects of target frequency on direction of approach information
Direction of approach information is a very clear visual signal (‘from the left’ vs. ‘from the right’) and therefore is unlikely to be strongly modulated by other factors, except perhaps whether the dot was in the cued colour (Attended) or the distractor colour (could be ignored: Unattended). There was moderate or strong evidence for a main effect of Attention (Figure 3A; BF >3, Bayes factor ANOVA, cyan dots) starting from 265 ms and lasting until dots faded. This is consistent with maintenance of information about the attended dots and attenuation of the information about unattended dots (Figure 3—figure supplement 1A). The large difference in coding attributable to attention remained for as long as the dots were visible.

Impact of different conditions and their interactions on information on correct trials (all trials except those in which a target was missed or there was a false alarm).
(A) Decoding of direction of approach information (less task-relevant) and (B) decoding of distance to object information (most task-relevant). Left two columns: Attended dots; Right two columns: Unattended (‘distractor’) dots. Thick lines show the average across participants (shading 95% confidence intervals). Horizontal dashed line refers to theoretical chance-level decoding (50%). Vertical dashed lines indicate critical times in the trial. Bottom panels: Bayesian evidence for main effects and interactions, Bayes factors (BF): Filled circles show moderate/strong evidence for either hypothesis and empty circles indicate insufficient evidence. Main effects and interactions of conditions calculated using BF ANOVA analysis. Cyan, purple, green, and red dots indicate the main effects of Attention, Target frequency, Time on Task, and the interaction between Target frequency and Time on Task, respectively. The results of BF analysis (i.e., the main effects of the three conditions and their interactions) are from the same three-way ANOVA analysis and are therefore identical for attended and unattended panels. Early = data from the first five blocks (~10 min). Late = data from the last five blocks (~10 min). Note the different scales of the BF panels, and the down-sampling, for clearer illustration.
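The filled and empty circles in the Bayes factor panels correspond to evidence categories that can be written as a small lookup. The sketch below assumes the cut-offs that appear in the text (3 and 1/3 for moderate evidence) together with the conventional 10 and 1/10 for strong evidence; the function names and the exact handling of boundary values are assumptions made for illustration.

```python
def classify_bf(bf):
    """Label a Bayes factor expressing evidence for an effect relative to the null."""
    if bf <= 0:
        raise ValueError("Bayes factors must be positive")
    if bf > 10:
        return "strong evidence for effect"
    if bf > 3:
        return "moderate evidence for effect"
    if bf < 1 / 10:
        return "strong evidence for null"
    if bf < 1 / 3:
        return "moderate evidence for null"
    return "insufficient evidence"


def is_filled_circle(bf):
    """Filled circles mark moderate or strong evidence in either direction."""
    return classify_bf(bf) != "insufficient evidence"


for bf in (6.5e3, 6.7, 0.72, 0.11):
    print(bf, classify_bf(bf), is_filled_circle(bf))
```

Under these assumed cut-offs, a value such as BF = 6.7 counts as moderate evidence for an effect, BF = 0.11 as moderate evidence for the null, and BF = 0.72 as insufficient evidence either way.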
In contrast, there was no sustained main effect of Target Frequency on the same direction of approach coding. For the majority of the epoch there was moderate or strong evidence for the null hypothesis (BF <1/3; Bayes factor ANOVA, Figure 3A, purple dots). The sporadic time point with a main effect of Target Frequency, observed before the deflection (BF >3), likely reflects noise in the data as there is no clustering. Recall that we only focus on time points prior to deflection, as after this point there are visual differences between Active and Monitoring, with more dots deflecting in the Monitoring condition.
There was also no sustained main effect of the Time on Task on information about the direction of approach (BF <3; Bayes factor ANOVA, green dots; Figure 3A). There were no sustained two-way or three-way interactions between Attention, Target Frequency, and Time on Task (BF <3; Bayes factor ANOVA). Note that the number of trials used in the training and testing of the classifiers were equalised across the eight conditions and equalled the minimum available number of trials across those conditions shown in Figure 3. Therefore, the observed effects cannot be attributed to a difference in the number of trials across conditions.
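Equalising trial numbers across the eight conditions can be illustrated as subsampling every condition down to the size of the smallest one. The sketch below is only one way to do this: the random subsampling without replacement, the function name `equalise_trials`, and the trial counts are all assumptions for illustration and are not taken from the study.

```python
import random
from itertools import product


def equalise_trials(trials_by_condition, seed=0):
    """Randomly subsample each condition to the smallest condition's trial count."""
    rng = random.Random(seed)
    n_min = min(len(trials) for trials in trials_by_condition.values())
    return {
        condition: rng.sample(trials, n_min)
        for condition, trials in trials_by_condition.items()
    }


# Hypothetical trial indices for the 2 x 2 x 2 design (not the study's real counts).
conditions = list(
    product(("Attended", "Unattended"), ("Active", "Monitoring"), ("Early", "Late"))
)
trials = {cond: list(range(20 + 5 * i)) for i, cond in enumerate(conditions)}

balanced = equalise_trials(trials)
print({cond: len(kept) for cond, kept in balanced.items()})
```

After this step every condition contributes the same number of trials to classifier training and testing, so a difference in decoding accuracy between conditions cannot simply reflect one condition having more data.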
Effects of target frequency on critical distance to object information
The same analysis for the representation of the task-relevant distance to object information showed strong evidence for a main effect of Attention (BF > 10; Bayes factor ANOVA) at all 15 distances, no effect of Time on Task (BF < 0.3; Bayes factor ANOVA) at any of the distances, and an interaction between Time on Task and Target Frequency at one of the distances (BF = 6.7, Figure 3B). The interaction between Target Frequency and Time on Task at distance 13 (time-window: 160 to 240 ms after stimulus onset, BF = 6.7) reflected opposite effects of time on task in the Active and Monitoring conditions. In Active blocks, there was moderate evidence that coding was stronger in late blocks than in early blocks (BF = 3.1), whereas in the Monitoring condition, decoding declined with time and was weaker in late than in early blocks (BF = 4.3). However, as there was only moderate evidence for this interaction at one of the time-windows, we do not overinterpret it. Decoding of attended information tended to be lower in late compared to early Monitoring blocks (Figure 3B lower panel red dotted line) in several time-windows across the trial, which may echo the behavioural pattern of performance (Figure 2). As there was moderate evidence for no interaction between Attention and Target Frequency (BF < 0.3, 2-way Bayes factor ANOVA) except for distance 6 (BF = 3.3; no consistent pattern (insufficient evidence for pairwise comparisons: BFs 2.4-2.8)), no interaction between Attention and Time on Task (BF < 0.3, 2-way Bayes factor ANOVA) or simultaneously between the three factors (BF < 0.3, 3-way Bayes factor ANOVA), we do not show those statistical results in the figure.
Although eye-movements should not drive the classifiers due to our design, it is still important to verify that the results replicate when standard artefact removal is applied. We can also use eye-movement data as an additional measure, examining blinks, saccades and fixations for effects of our attention and vigilance manipulations.
First, to make sure that our neural decoding results replicate after eye-related artefact removal, we repeated our analyses on the data after eye-artefact removal, which provided analogous results to the original analysis (see the decoding results with and without artefact removal in Figure 3—figure supplement 2). Specifically, for our crucial distance to object data, the main effect of Attention remained after eye-artefact removal, replicating our initial pattern of results. Moderate evidence (BF = 4.2) for an interaction between Target Frequency and Time on Task was also found, but now at distance 6 instead of distance 13. This interaction again reflected a larger effect of Time on Task in Monitoring compared to Active blocks (Monitoring: weaker coding in late relative to early blocks (BF = 3.1); Active: insufficient evidence for change in coding from early to late (BF = 2.0)).
Second, we conducted a post hoc analysis to explore whether eye movement data showed the same patterns of vigilance decrements and therefore could explain our decoding results. We extracted the proportion of eye blinks, saccades, and fixations per trial as well as the duration of those fixations from the eye-tracking data for correct trials (−100 to 1400 ms aligned to the stimulus onset time), and statistically compared them across our critical conditions (Figure 3—figure supplement 3). We saw strong evidence (BF = 4.8e8) for a difference in the number of eye blinks between attention conditions: There were more eye blinks for the Unattended (distractor) than Attended (potentially targets) colour dots. We also observed moderate evidence (BF = 3.4) for a difference in the number of fixations, with more fixations in Unattended vs. Attended conditions. These suggest that there are systematic differences in the number of eye blinks and fixations due to our attentional manipulation, consistent with previous observations showing that the frequency of eye blinks can be affected by the level of attentional recruitment (Nakano et al., 2013). However, there was either insufficient evidence (0.3 < BF <3) or moderate or strong evidence for no differences (0.1 < BF <0.3 and BF <0.1, respectively) between the number of eye blinks and saccades across our Active, Monitoring, Early, and Late blocks, where we observed our ‘vigilance decrement’ effects in decoding. Therefore, this suggests that the main vigilance decrement effects in decoding, which were evident as an interaction between Target frequency (Active vs. Monitoring) and Time on the task (Early vs. Late; Figure 3), are not primarily driven by eye movements. However, artefact removal algorithms are not perfect, making it impossible to fully rule out all potentially meaningful eye-related artefacts from the MEG data (e.g. the difference in the number of eye blinks between attended and unattended conditions).
Thus, although the results are similar with and without standard eye-artefact removal, it is impossible to fully rule out all potential eye movement effects.
Together, these results suggest that while vigilance conditions had little or no impact on coding of the direction of approach, they did impact the critically task-relevant information about the distance of the dot from the object, albeit only for one 80 ms time-window. In this time-window, coding declined as time on task increased specifically when the target events happened infrequently, forming a possible neural correlate for our behavioural vigilance decrements.
Is informational brain connectivity modulated by Attention, Target Frequency, and Time on Task?
Using graph-theory-based univariate connectivity analysis, it has been shown that the connectivity between relevant sensory areas and ‘vigilance-related’ cognitive areas changes prior to lapses in attention (behavioural errors; Ekman et al., 2012; Sadaghiani et al., 2015). Therefore, we asked whether vigilance decrements across the time course of our task corresponded to changes in multivariate informational connectivity, which evaluates the similarity of information encoding, between frontal attentional networks and sensory visual areas. In line with attentional effects on sensory perception, we predicted that connectivity between the frontal attentional and sensory networks should be lower when not attending (vs. attending; Goddard et al., 2019). Behavioural errors were also previously predicted by reduced connection between sensory and ‘vigilance-related’ frontal brain areas (Ekman et al., 2012; Sadaghiani et al., 2015). Therefore, we predicted a decline in connectivity when targets were lower in frequency, and with increased time on task, as these led to increased errors in behaviour, specifically under vigilance conditions in our task (i.e., late blocks in Monitoring vs. late blocks in Active; Figure 2). We used a simplified version of our method of RSA-based informational connectivity to evaluate the (Spearman’s rank) correlation between distance information RDMs across the peri-frontal and peri-occipital sensors (Goddard et al., 2016; Figure 4A).

Relationship between informational connectivity and Attention, Target Frequency, Time on Task, and the behavioural outcome of the trial (i.e., correct vs. miss).
(A) Calculation of connectivity using Spearman’s rank correlation between RDMs obtained from the peri-frontal and peri-occipital sensors as indicated by coloured boxes, respectively. RDMs include decoding accuracies obtained from testing the 105 classifiers trained to discriminate different distance to object categories. (B) Connectivity values for the eight different conditions of the task and the results of three-way Bayes factor ANOVA with factors Time on Task (Early, Late), Attention (Attended, Unattended), and Target Frequency (Active, Monitoring), using only correct trials. (C) Connectivity values for the Active and Monitoring, Early and Late blocks of each task for correct and miss trials (attended condition only), and the result of Bayes factor ANOVA with factors Target Frequency (Active, Monitoring), Time on Task (Early, Late), and behavioural outcome (correct, miss) as inputs. Number of trials are equalised across conditions in B and C separately. Bars show the average across participants (error bars 95% confidence intervals). Bold fonts indicate moderate or strong evidence for either the effect or the null hypothesis.
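The connectivity measure in panel A can be sketched as follows: 15 distance categories give 15 × 14 / 2 = 105 pairwise classifiers, each sensor group yields one decoding accuracy per pair (its RDM), and connectivity is the Spearman rank correlation between the two 105-element RDMs. The code below is a simplified, standard-library illustration under those assumptions. The toy accuracies and all function names are hypothetical, and the study's actual pipeline (time-resolved RDMs, sensor selection, trial handling) is not reproduced here.

```python
from itertools import combinations

N_DISTANCES = 15
# One classifier per unordered pair of distance categories: 105 pairs.
PAIRS = list(combinations(range(1, N_DISTANCES + 1), 2))


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def informational_connectivity(rdm_a, rdm_b):
    """Spearman correlation between two RDMs stored as {pair: accuracy}."""
    x = [rdm_a[pair] for pair in PAIRS]
    y = [rdm_b[pair] for pair in PAIRS]
    return pearson(ranks(x), ranks(y))


# Toy RDMs (hypothetical): accuracy grows with the separation between distances.
frontal = {(a, b): 50 + 2.0 * (b - a) for a, b in PAIRS}
occipital = {(a, b): 50 + 3.0 * (b - a) ** 1.5 for a, b in PAIRS}
print(len(PAIRS), round(informational_connectivity(frontal, occipital), 3))
```

Because Spearman correlation depends only on rank order, the two toy RDMs above are perfectly correlated even though their accuracy values differ. The measure therefore captures whether two sensor groups distinguish the same pairs of distances well or poorly, not whether their decoding accuracies are equal in magnitude.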
Results showed strong evidence (Bayes factor ANOVA, BF = 6.5e3) for higher informational connectivity for trials with Attended compared to Unattended dots, and moderate evidence for no effect of Target Frequency (Bayes factor ANOVA, BF = 0.11; Figure 4B). There was insufficient evidence to determine whether there was a main effect of Time on Task (Bayes factor ANOVA, BF = 0.72). There was evidence in the direction of the null for the two-way interactions between the factors (Bayes factor ANOVA, two-way Time on Task-Target Frequency: BF = 0.36; Time on Task-Attention: BF = 0.39; Target Frequency-Attention: BF = 0.15) and insufficient evidence regarding their three-way interaction (BF = 0.95). These results suggest that trials in which the dots are in the distractor (Unattended) colour, in which the attentional load is low, result in less informational connectivity between occipital and frontal brain areas compared to Attended trials. This is consistent with a previous study (Alnæs et al., 2015), which suggested that large-scale functional brain connectivity depends on the attentional load, and might underpin or accompany the decrease in information decoding across the brain in the unattended condition.
We also compared the connectivity for the correct vs. miss trials (Figure 4C). This analysis was performed only for the Attended condition as there are, by definition, no miss trials for the Unattended condition. There was moderate evidence for no difference in connectivity on miss compared to correct trials (Bayes factor ANOVA, BF = 0.11). In addition, there was moderate evidence for no effect of Time on Task and Target Frequency (BF = 0.11 and BF = 0.10, respectively), as well as for two-way and three-way interactions between the three factors (Bayes factor ANOVA, Behaviour-Target Frequency: BF = 0.14; Behaviour-Time on Task: BF = 0.14; Target Frequency-Time on Task: BF = 0.15; their 3-way interaction BF = 0.14). Therefore, in contrast to an auditory monitoring task which showed a decline in univariate graph-theoretic connectivity before behavioural errors (Sadaghiani et al., 2015), we observed no change in informational connectivity on error trials. Note that the number of trials is equalised across the 8 conditions in each of our analyses separately.
Is neural representation different on miss trials?
The results presented in Figure 3, which used only correct trials, showed changes due to target frequency to the representation of task-relevant information when the task was performed successfully. We next move on to our second question, which is whether these neural representations change when overt behaviour is affected, and therefore, whether we can use the neural activity as measured by MEG to predict behavioural errors before they occur. We used our method of error data analysis (Woolgar et al., 2019) to examine whether the patterns of information coding on miss trials differed from correct trials. For these analyses we used only attended dots, as unattended dots do not have behavioural responses, and we matched the total number of trials in our implementation of correct and miss classification.
First, we evaluated the representation of the less relevant information – the direction of approach measure (Figure 5A). The results for correct trials provided information dynamics very similar to the Attended condition in Figure 3A, except for higher overall decoding, which is explained by the inclusion of the data from the whole experiment (15 blocks) rather than just the five early and late blocks (note the number of trials is still matched to miss trials).

Decoding of information on correct vs miss trials.
(A) Decoding of direction of approach information (less task-relevant). (B) Decoding of distance to object information (most task-relevant). The horizontal dashed lines refer to theoretical chance-level decoding. Left panels: Decoding using correct trials; Right panels: Decoding using miss trials. In both right and left panels, the classifiers were trained on correct trials and tested on (left out) correct and all miss trials, respectively. Thick lines show the average across participants (shading 95% confidence intervals). Vertical dashed lines indicate critical events in the trial. Bayes factors (BF) are shown in the bottom section of each graph: Filled circles show moderate/strong evidence for either hypothesis and empty circles indicate insufficient evidence. They show the results of BF analysis when evaluating the difference of the decoding values from chance for Active (blue) and Monitoring (red) conditions separately, the comparison of the two conditions (green), and the comparison of correct and miss trials (black). Note that for the comparison of correct and miss trials, Active and Monitoring conditions were averaged separately. Note the different scales of the BF panels, and down-sampling, for clearer illustration.
For the direction of approach information, there was moderate or strong evidence (i.e., BF >3) in both Active and Monitoring conditions after ~100 ms for above-chance decoding. However, when the classifiers were tested on miss trials, from onset to deflection, the pattern of information dynamics were different, even though we had matched the number of trials. Specifically, while the level of information was comparable to correct trials with spurious instances (but no sustained time windows) of difference (BF >3 as indicated by black dots) before 500 ms, decoding traces were much noisier for miss trials with more variation across trials and between nearby time points (Figure 5A). Note that after the deflection, the visual signal is different for correct and miss trials, so the difference between their decoding curves (BF >3) is not meaningful. These results suggest a noisier representation of direction of approach information for the missed dots compared to correctly deflected dots.
We then repeated the same procedure on the representation of the most task-relevant distance to object information on correct vs. miss trials (Figure 5B). On correct trials, the distance information for both Active and Monitoring conditions was above chance (Figure 5B left panels; BF > 10⁴). For miss trials, the corresponding distance information was still above chance (Figure 5B right panels; BF > 10³) but the direct comparison revealed that distance information dropped on miss trials compared to correct trials (Figure 5B; Black dots; BF >3 across all distances; Active and Monitoring results were averaged for correct and miss trials separately before Bayes analyses).
In principle, the average decoding levels could be composed of ‘all or none’ misses or graded drops in information, and it is possible that on some miss trials there is a good representation but the target is missed for other reasons (e.g., a response-level error). As neural data are noisy and multivariate decoding needs cross-validation across subsamples of the data, and because each trial, at each distance, can only be classified correctly or incorrectly by a two-way classifier, we tend not to compare the decoding accuracies in a trial-by-trial manner, but rather on average (Grootswagers et al., 2017). However, if we look at an individual data set and examine all the miss trials (averaged over the 15 distances and cross-validation runs) in our distance-to-object decoding, we can get some insights into the underlying distributions (Figure 5—figure supplement 1). Results showed that, for all participants, the distribution of classifier accuracies for both correct and miss trials followed approximate normal distributions. However, while the distribution of decoding accuracies for correct trials was centred around 60%, the decoding accuracies for individual miss trials were centred around 56%. We evaluated the difference in the distribution of classification accuracies between the two types of trials using Cohen’s d. Cohen’s d ranged from 0 to 2.5 across participants and conditions. 14 out of 21 subjects showed moderate (d > 0.5) to large (d > 0.8; Cohen, 1969) differences between the distribution of correct and miss trials in either Active or Monitoring condition or both. Therefore, although the miss trials vary somewhat in levels of information, only a minority of (< 24%) miss trials are as informative as the least informative correct trials. These results are consistent with the interpretation that there was less effective representation of the crucial information about the distance from the object preceding a behavioural miss.
Please note that the results presented so far were from correct and miss trials and we excluded early, late, and wrong-colour false alarms to be more specific about the error type. However, the false alarm results (collapsed across all three types of false alarms) were very similar (Figure 5—figure supplement 2) to those of the missed trials (Figure 5): noisy information about the direction of approach and at-chance information about the distance to object. This may suggest that both miss and false alarm trials are caused by impaired processing of information, or at least, are captured similarly by our decoding methods. The average number of miss trials was 58.17 (±21.63 SD) and false alarm trials was 65.94 (±21.13 SD; out of 1920 trials).
Can we predict behavioural errors using neuroimaging?
Finally, we asked whether we could use this information to predict the behavioural outcome of each trial. To do so, we developed a new method that classified trials based on their behavioural outcomes (correct vs. miss) by asking how well a set of classifiers, pre-trained on correct trials, would classify the distance of the dot from the target (Figure 6A). To achieve this, we used a second-level classifier which labelled a trial as correct or miss based on the average accumulated accuracies obtained for that dot at every distance from the first-level decoding classifiers which were trained on correct trials (Figure 6A,B). If the accumulated accuracy for the given dot at the given distance was less than the average accuracy obtained from testing on the validation set minus a specific threshold (based on standard deviation), the testing dot (trial) was labelled as miss, otherwise correct. In this analysis, the goal was to maximise the accuracy of predicting behaviour. For that purpose, we accumulated classification accuracies along the distances. Moreover, as each classifier performs a binary classification for each testing dot at each distance, the accumulation of classification accuracies also prevented spurious classification accuracies from driving the decision, providing smooth ‘accumulated’ accuracies for predicting the behaviour. As Figure 6B shows, there was strong evidence (BF >10) that decoding accuracy of distances was higher for correct than miss trials with the inclusion of more classifier accuracies as the dot approached from the corner of the screen towards the centre. This clear separation of accumulated accuracies for correct vs. miss trials allowed us to predict with above-chance accuracy the behavioural outcome of the ongoing trial (Figure 6D).
To find the optimal threshold for each participant, we used a leave-one-participant-out procedure: we evaluated a range of thresholds for all participants except a single testing participant, and for that testing participant we used the average of the best thresholds (those that led to the highest prediction accuracy) of the other participants. This was ~0.4 standard deviations below the average accuracy on the other participants’ validation (correct trial) sets (Figure 6C).
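The second-level classification described above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' implementation: it assumes that a trial is labelled a miss when its accumulated first-level accuracy falls below the validation mean minus `k` standard deviations (the direction implied by the lower decoding on miss trials), and the function names, validation statistics, and example trials are all hypothetical.

```python
from statistics import mean

def accumulate(outcomes):
    """Running mean of first-level outcomes (1 = distance decoded correctly, 0 = not),
    ordered from the furthest distance towards the deflection point."""
    running_total, accumulated = 0, []
    for n_seen, outcome in enumerate(outcomes, start=1):
        running_total += outcome
        accumulated.append(running_total / n_seen)
    return accumulated

def predict_outcome(outcomes, val_mean, val_sd, k):
    """Label the trial at each distance: 'miss' if the accumulated accuracy is
    below (validation mean - k * validation SD), otherwise 'correct'."""
    return [
        "miss" if acc < m - k * sd else "correct"
        for acc, m, sd in zip(accumulate(outcomes), val_mean, val_sd)
    ]

def left_out_threshold(best_k_per_participant, test_index):
    """Threshold for the left-out participant: mean of the other participants' best k."""
    others = [k for i, k in enumerate(best_k_per_participant) if i != test_index]
    return mean(others)

# Hypothetical validation statistics at five successive distances.
val_mean = [0.60] * 5
val_sd = [0.10] * 5

# Hypothetical best thresholds for four participants; participant 0 is left out.
k = left_out_threshold([0.3, 0.5, 0.4, 0.6], test_index=0)

good_trial = [1, 1, 0, 1, 1]  # mostly decoded correctly
poor_trial = [0, 1, 0, 0, 1]  # mostly decoded incorrectly
good_labels = predict_outcome(good_trial, val_mean, val_sd, k)
poor_labels = predict_outcome(poor_trial, val_mean, val_sd, k)
```

Accumulating over distances means a single misclassified distance shifts the running mean only slightly, which is the smoothing property the text attributes to the accumulated accuracies.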

Prediction of behavioural outcome (correct vs miss) trial-by-trial using decoding of distance to object information.
(A) Sample classifiers’ accuracies (correct or incorrect classification of current distance as indicated by colours) for a miss (left panel; average accuracy ≅ 43% when the dot reached the deflection point) and a correct trial (right panel; average accuracy ≅ 67% at the deflection point). The classifiers were trained on the data from correct trials and tested on the data from correct and miss trials. For the miss trials, around half the classifiers categorised the dot’s distance incorrectly by the time it reached the deflection point. (B) Accumulation of classifiers’ accuracies over decreasing dot distances/time to deflection. This shows stronger information coding of the crucial distance to object information on the correct trials over miss trials. A variable threshold used in (C) is shown as a green dashed line. (C) Prediction of behavioural outcome as a function of threshold and distance using a second-level behavioural outcome classification. Results showed highest prediction accuracies on the participant set at around the threshold of 0.4 SD under the decoding level for correct validation trials, increasing at closer distances. (D) Accuracy of predicting behavioural outcome for the left-out participant using the threshold obtained from all the other participants as a function of distance/time from the deflection point. Results showed successful (~59%) prediction of behavioural outcome of the trial as early as 80 ms after stimulus appearance. Thick lines and shading refer to average and one standard deviation around the mean across participants, respectively. Bayes factors (BF) are shown in the bottom section of each graph: Filled circles show moderate/strong evidence for either hypothesis and empty circles indicate insufficient evidence (black dots under B and D).
The prediction accuracy of behavioural outcome was above chance level (59% vs. 50%; BF > 10) even when the dot had only been on the screen for 80 ms, which corresponds to our furthest distance #15 (1160 ms prior to the deflection point; Figure 6D). The accuracy increased to 65.4% as the dot approached the centre of the screen, with ~64% accuracy with still 800 ms to go before the required response. Importantly, the prediction algorithm showed generalisable results across participants; the threshold for decision obtained from the other participants could predict the accuracy of an independent participant’s behaviour using only their neural data.
The prediction of behavioural outcome (Figure 6) was performed using the data from the whole data set. To test if prediction accuracy depended on the stage of the experiment, we performed the behavioural prediction procedure on data sets obtained from the first 5 (early) and the last 5 (late) stages of the experiment separately (Figure 6—figure supplement 1). There was no evidence for a change in the prediction power in the late vs. early blocks of trials.
Discussion
This study developed new methods to gain insights into how attention, the frequency of target events, and the time doing a task affect the representation of information in the brain. Our new MOM task evoked reliable specific vigilance decrements in both accuracy and RT in a situation that more closely resembles real-life modern tasks than classic vigilance tasks. Using the sensitive analysis method of MVPA, we showed that neural coding was stronger for attended compared to distractor information. There was also one time-window where the interaction between the time on the task and target frequency affected decoding, with a larger decline in coding under monitoring conditions, which may reflect a neural correlate of the behavioural vigilance decrements. We also developed a novel informational brain connectivity analysis, which showed that the correlation between information coding across peri-occipital and peri-frontal areas varied with different levels of attention but did not change with errors. Finally, we utilised our recent error data analysis to predict forthcoming behavioural misses based on the neural data. In the following sections, we explain each of these findings in detail and compare them with relevant literature.
First, the MOM task includes key features of real-world monitoring situations that are not usually part of other vigilance tasks (e.g., Mackworth, 1948; Temple et al., 2000; Beck et al., 1956; Rosenberg et al., 2013), and the results show clear evidence of vigilance decrements. Behavioural performance, measured with both RT and accuracy, deteriorated over time in Monitoring (infrequent targets) relative to Active (frequent targets) conditions. One important additional advantage of the MOM task over conventional vigilance tasks is that it allows us to be specific about the vigilance decrements (by comparing Active and Monitoring conditions) separate from general time on task effects which affect both Active and Monitoring conditions (c.f. Figure 3). These vigilance decrements demonstrate that the MOM task can be used to explore vigilance in situations more closely resembling modern environments, namely those involving moving stimuli and selection of relevant from irrelevant information, giving a useful tool for future research.
Second, the high sensitivity of MVPA to extract information from neural signals allowed us to investigate the temporal dynamics of information processing along the time course of each trial. The manipulation of attention showed a strong overall effect with enhanced representation of both the less important direction of approach and the most task-relevant distance to object information for cued dots, regardless of how frequent the targets were (Figure 3). The improved representation of information under attention extends previous findings from us and others (Woolgar et al., 2015b; Goddard et al., 2019; Nastase et al., 2017) to moving displays, in which the participants monitor multiple objects simultaneously. When targets were infrequent, modelling real-life monitoring situations, there was a strong behavioural drop in performance (i.e., vigilance effects in both accuracy and RT; Figure 2) and a hint in the brain activity data of a change in neural coding (namely one time-window showing evidence of an interaction between Target Frequency and Time on Task). We need more data to fully test this effect; however, our main finding is that the difference in decoding between correct and miss trials can be used to predict behaviour. Although the results replicated after standard eye-artefact removal, artefact-removal algorithms are not perfect, so there is still the possibility that our MEG data could be affected by some residual patterns of eye movements across conditions. In a real-world setting, it may be possible to combine information from the brain and eye-movements to further improve the prediction accuracy.
When people miss targets, they might process or encode the relevant sensory information less effectively than when they correctly respond to targets. This is consistent with our finding that on the majority of miss trials, there was less effective representation about the task-relevant information in the neural signal, in contrast to the consistently more effective representation on correct trials. Note that our vigilance decrement effects are defined as the difference between Active and Monitoring conditions, which allows us to be sure that we are not interpreting general task (e.g., participant fatigue) or hardware-related effects as vigilance decrements.
It is important to note that previous studies have tried other physiological/behavioural measures to determine participants’ vigilance or alertness, such as pupil size (Yoss et al., 1970), response time variability (Rosenberg et al., 2013), blood pressure and thermal energy (Lohani et al., 2019) or even body temperature (Molina et al., 2019). We used highly sensitive analysis of neuroimaging data so that we could address two questions that could not be answered using these more general vigilance measures. We tested for changes in the way information is processed in the brain, particularly testing for differences in the impact of monitoring on the relevance of the information, rather than whether the participants were vigilant and alert in general. Moreover, we could also investigate how the most relevant and less relevant information was affected by the target frequency and time on the task, to find neural correlates for the behavioural vigilance decrements observed in many previous studies (e.g., Dehais et al., 2019; Wolfe et al., 2005; Wolfe et al., 2007; Kamzanova et al., 2014; Ishibashi et al., 2012). The less relevant information about direction of approach was modulated by attention, but its representation was not detectably affected by target frequency and time on task, and was noisier, but not noticeably attenuated, on error trials. The relative stability of these representations might reflect the large visual difference between stimuli approaching from the top left vs bottom right of the screen. In contrast, the task-relevant information of distance to object was affected by attention and was attenuated on errors. 
The difference might reflect the fact that only the distance information is relevant to deciding whether an item is a target, and/or the classifier having to rely on much more subtle differences to distinguish the distance categories, which collapsed over stimuli appearing on the left and right sides of the display, removing the major visual signal.
Third, our information-based brain connectivity method showed moderate evidence for no change in connectivity between correct and error trials. Informational connectivity is unaffected by differences in absolute levels of information encoding (e.g., lower coding on miss vs. correct trials). It could be sensitive to different levels of noise between conditions, but there was no evidence for that in this case. Apart from sensory information coding and sensory-based informational connectivity, which were evaluated here, there may be other correlates we have not addressed. Effects on response-level selection, for example, independently or in conjunction with sensory information coding, could also affect performance under vigilance conditions, and need further research.
Our connectivity method follows the recent major shift in the literature from univariate to multivariate informational connectivity analyses (Goddard et al., 2016; Karimi-Rouzbahani et al., 2017; Anzellotti and Coutanche, 2018; Goddard et al., 2019; Kietzmann et al., 2019; Karimi-Rouzbahani et al., 2019; Basti et al., 2020; Karimi-Rouzbahani et al., 2021). This is in contrast with the majority of neuroimaging studies using univariate connectivity analyses, which can miss existing connectivity across areas when encountering low-amplitude activity on individual sensors (Anzellotti and Coutanche, 2018; Basti et al., 2020). Informational connectivity, on the other hand, is measured either through calculating the correlation between temporally resolved patterns of decoding accuracies across a pair of areas (Coutanche and Thompson-Schill, 2013) or the correlation between representational dissimilarity matrices (RDMs) obtained from a pair of areas (Kietzmann et al., 2019; Goddard et al., 2016; Goddard et al., 2019; Karimi-Rouzbahani et al., 2019; Karimi-Rouzbahani et al., 2021). Either one measures how much similarity in information coding there is between two brain areas across conditions, which is interpreted as reflecting their potential informational connectivity, and is less affected by absolute activity values compared to conventional univariate connectivity measures (Anzellotti and Coutanche, 2018). The method we used here evaluated the correlation between RDMs, which provided high-dimensional information about distance to object, obtained from multiple sensors across the brain areas. This makes our analysis sensitive to different aspects of connectivity compared to conventional univariate analyses.
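As a concrete illustration of RDM-based informational connectivity, the sketch below computes Spearman's rank correlation between the upper triangles of two RDMs. It is a minimal stand-in rather than the authors' code: the two 15 × 15 matrices are synthetic, the function names are hypothetical, and Spearman's coefficient is implemented here as the Pearson correlation of tie-averaged ranks so that the example needs only the standard library.

```python
from itertools import combinations

def upper_triangle(rdm):
    """Pairwise entries above the diagonal (15 categories give 105 pairs)."""
    n = len(rdm)
    return [rdm[i][j] for i, j in combinations(range(n), 2)]

def ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            result[order[pos]] = shared_rank
        start = end + 1
    return result

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def informational_connectivity(rdm_a, rdm_b):
    """Spearman correlation between the off-diagonal entries of two RDMs."""
    return pearson(ranks(upper_triangle(rdm_a)), ranks(upper_triangle(rdm_b)))

# Synthetic RDMs over 15 distance categories (values are made up):
# pairwise decoding accuracy grows with the separation between categories.
n = 15
rdm_frontal = [[0.5 + 0.02 * abs(i - j) for j in range(n)] for i in range(n)]
rdm_occipital = [[0.5 + 0.03 * abs(i - j) ** 2 for j in range(n)] for i in range(n)]
rdm_reversed = [[1.0 - rdm_frontal[i][j] for j in range(n)] for i in range(n)]

rho_same_order = informational_connectivity(rdm_frontal, rdm_occipital)
rho_opposite = informational_connectivity(rdm_frontal, rdm_reversed)
```

Because the measure depends only on rank order, the two synthetic areas above are maximally "connected" even though their absolute decoding levels differ, which mirrors the point that informational connectivity is less affected by absolute values than univariate measures.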
Fourth, building upon our recently developed method of error analysis (Woolgar et al., 2019), we were able to predict forthcoming behavioural misses based on the decoding data, before the response was given. Our method is different from the conventional method of error prediction, in which people directly discriminate correct and miss trials by feeding both types of trials to classifiers in the training phase and testing the classifiers on the left-out correct and miss trials (e.g., Bode and Stahl, 2014). Our method only uses correct trials for training, which makes its implementation plausible for real-world situations since we usually have plenty of correct trials and only a few miss trials (e.g., cases in which the railway controller diverts the trains correctly vs. cases in which the controller misses and a collision happens). Moreover, it allows us to directly test whether the neural representations of correct trials contain information which is (on average) less observable in miss trials. We statistically compared the two types of trials and showed a reliable advantage in the level of information contained at individual-trial-level in correct vs. miss trials.
Our error prediction results showed a reliable decline in the crucial task-relevant (i.e., distance to object) information decoding on miss vs. correct trials but less decline in the less task-relevant information (i.e., direction of approach). A complementary analysis allowed the prediction of behaviourally missed trials as soon as the stimulus appeared on the screen (after ~80 ms), which was ~1160 ms before the time of response. This method was generalisable across participants, with the decision threshold for trial classification based on other participants’ data successful in predicting errors for a left-out participant. A number of previous studies have shown that behavioural performance can be correlated with aspects of brain activity even before the stimulus onset (Bode and Stahl, 2014; Eichele et al., 2008; Eichele et al., 2010; Weissman et al., 2006; Ekman et al., 2012; Sadaghiani et al., 2015). Those studies have explained the behavioural errors by implicit measures such as less deactivation of the default-mode network, reduced stimulus-evoked sensory activity (Weissman et al., 2006; Eichele et al., 2008), and even the connectivity between sensory and vigilance-related/default-mode brain areas (Sadaghiani et al., 2015). It would be informative, however, if they could show how (if at all) the processing of task-relevant information is disrupted in the brain and how this might lead to behavioural errors. To serve an applied purpose, it would be ideal if there was a procedure to use those neural signatures to predict behavioural outcomes. Only three previous studies have approached this goal. Bode and Stahl, 2014, Sadaghiani et al., 2015, and Dehais et al., 2019 reported maximum prediction accuracies of 62%, 63%, and 72% (with adjusted chance levels of 50%, 55%, and 59%, respectively). 
Here, we obtained up to 65% prediction (with a chance level of 50%), suggesting our method accesses relevant neural signatures of attention lapses, and may be sensitive in discriminating these. The successful prediction of an error from neural data more than a second in advance of the impending response provides a promising avenue for detecting lapses of attention before any consequences occur.
The overall goal of this study was to understand how neural representation of dynamic displays was affected by attention and target frequency, and whether reliable changes in behaviour over time could be predicted on the basis of neural patterns. We observed that the neural representation of critically relevant information in the brain was particularly poor on trials where participants missed the target. We used this observation to predict behavioural outcome of individual trials and showed that we could predict behavioural outcome more than a second before action was needed. These results provide new insights about how momentary lapses in attention impact information coding in the brain and propose an avenue for predicting behavioural errors using novel neuroimaging analysis techniques.
Materials and methods
Participants
We tested 21 right-handed participants (10 male, 11 female, mean age = 23.4 years [SD = 4.7 years], all Macquarie University students) with normal or corrected to normal vision. The Human Research Ethics Committee of Macquarie University approved the experimental protocols and the participants gave informed consent before participating in the experiment. We reimbursed each participant AU$40 for their time completing the MEG experiment, which lasted for 2 hr including setup.
Apparatus
We recorded neural activity using a whole-head MEG system (KIT, Kanazawa, Japan) with 160 coaxial first-order gradiometers, at a sampling rate of 1000 Hz. We projected the visual stimuli onto a mirror at a distance of 113 cm above participants’ heads while they were in the MEG. An InFocus IN5108 LCD back projection system (InFocus, Portland, Oregon, USA), located outside the magnetically shielded room, presented the dynamically moving stimuli, controlled by a desktop computer (Windows 10; Core i5 CPU; 16 GB RAM; NVIDIA GeForce GTX 1060 6 GB Graphics Card) using MATLAB with Psychtoolbox 3.0 extension (Brainard, 1997; Kleiner et al., 2007). We set the refresh rate of the projector at 60 Hz and used parallel port triggers and a photodiode to mark the beginning (dot appearing on the screen) and end (dot disappearing off the screen) of each trial. We recorded each participant’s head shape using a pen digitiser (Polhemus Fastrack, Colchester, VT) and placed five marker coils on the head which allowed the location of the head in the MEG helmet to be monitored during the recording – we checked head location at the beginning, half way through and the end of recording. We used a fibre optic response pad (fORP, Current Designs, Philadelphia, PA, USA) to collect responses and an EyeLink 1000 MEG-compatible remote eye-tracking system (SR Research, 1000 Hz monocular sampling rate) to record eye position. We focused the eye-tracker on the right eye of the participant and calibrated the eye-tracker immediately before the start of MEG data recording.
Task and stimuli
Task summary
The task was to avoid collisions of relevant moving dots with the central object by pressing the space bar if the dot passed a deflection point in a visible predicted trajectory without changing direction to avoid the central object (see Figure 1A; a demo can be found here https://osf.io/5aw8v/). A text cue at the start of each block indicated which colour of dot was relevant for that block. The participant only needed to respond to targets in this colour (Attended); dots in the other colour formed distractors (Unattended). Pressing the button deflected the dot in one of two possible directions (counterbalanced) to avoid collision. Participants were asked to fixate on the central object throughout the experiment.
Stimuli
The stimuli were moving dots in one of two colours that followed visible trajectories and covered a visual area of 3.8 × 5° of visual angle (dva; Figure 1A). We presented the stimuli in blocks of 110 s duration, with at least one dot moving on the screen at all times during the 110 s block. The trajectories directed the moving dots from two corners of the screen (top left and bottom right) straight towards a centrally presented static ‘object’ (a white circle of 0.25 dva) and then deflected away (either towards the top right or bottom left of the screen; in pathways orthogonal to their direction of approach) from the static object at a set distance (the deflection point).
Target dots deviated from the visible trajectory at the deflection point and continued moving towards the central object. The participant had to push the space bar to prevent a ‘collision’. If the response was made before the dot reached the centre of the object, the dot deflected, and this was counted as a ‘hit’. If the response came after this point, the dot continued straight, and this was counted as a ‘miss’, even if the participant pressed the button before the dot had completely passed through the central object.
The time from dot onset in the periphery to the point of deflection was 1226 ± 10 (mean ± SD) ms. Target (and distractor event) dots took 410 ± 10 (mean ± SD) ms to cross from the deflection point to the collision point. In total, each dot moved across the display for 2005 ± 12 (mean ± SD) ms before starting to fade away after either deflection or travel through the object. The time delay between the onsets of different dots (ISI) was 1660 ± 890 (mean ± SD) ms. There were 1920 dots presented in the whole experiment (~56 min). Each 110 s block contained 64 dots, 32 (50%) in red, and 32 (50%) in green, while the central static object and trajectories were presented in white on a black background.
Conditions
There were two target frequency conditions. In ‘Monitoring’ blocks, target dots were ~6.2% of cued-colour dots (2 out of 32 dots). In ‘Active’ blocks, target dots were 50% of cued-colour dots (16 out of 32 dots). The same proportion of dots in the non-cued colour failed to deflect; these were distractors (see Figure 1A, top right panel). Participants completed two practice blocks of the Active condition and then completed 30 blocks in the main experiment (15 Active followed by 15 Monitoring or vice versa, counterbalanced across participants).
The time between the appearance of target dots varied unpredictably, with distractors and correctly deflecting dots (events) intervening. In Monitoring blocks, there was an average time between targets of 57.88 (±36.03 SD) s. In Active blocks, there was an average time between targets of 7.20 (±6.36 SD) s.
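The design figures reported above can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text; the "nominal gap" assumes targets evenly spaced within a block, which is a simplification introduced here for illustration (the actual spacing varied unpredictably), and the constant and function names are hypothetical.

```python
BLOCK_DURATION_S = 110
DOTS_PER_BLOCK = 64
CUED_DOTS_PER_BLOCK = 32
N_BLOCKS = 30
TARGETS_PER_BLOCK = {"Monitoring": 2, "Active": 16}

# Totals across the whole experiment (stimulus time only, no breaks).
total_dots = N_BLOCKS * DOTS_PER_BLOCK
total_minutes = N_BLOCKS * BLOCK_DURATION_S / 60

def target_stats(n_targets):
    """Proportion of cued dots that are targets, and the gap between targets
    if they were evenly spaced within a block (a simplifying assumption)."""
    proportion = n_targets / CUED_DOTS_PER_BLOCK
    nominal_gap_s = BLOCK_DURATION_S / n_targets
    return proportion, nominal_gap_s

for condition, n_targets in TARGETS_PER_BLOCK.items():
    proportion, gap = target_stats(n_targets)
    print(f"{condition}: {proportion:.2%} targets, ~{gap:.1f} s nominal gap")
```

Under these assumptions the nominal gaps come out at 55 s for Monitoring and roughly 6.9 s for Active blocks, the same order as the reported averages of 57.88 s and 7.20 s, and the 30 blocks account for 1920 dots over 55 min of stimulus time.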
Feedback: On target trials, if the participant pressed the space bar in time, this ‘hit’ was indicated by a specific tone and deflection of the target dot. There were three types of potential false alarm, all indicated by an error tone and no change in the trajectory of the dot. These were if the participant responded: (1) too early, while the dot was still on the trajectory; (2) when the dot was not a target and had been deflected automatically (‘event’ in Figure 1A, middle right); or (3) when the dot was in the non-cued colour (‘distractor’ in Figure 1A, top right) in any situation. Participants had only one chance to respond per dot; any additional responses resulted in ‘error’ tones. As multiple dots could be on the screen, we always associated the button press to the dot which was closest to the central object.
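The feedback rules above can be summarised as a small decision function. The following Python sketch is illustrative only: the `Dot` fields, the outcome labels, and the function name are hypothetical and are not taken from the experiment code, and the distance units are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Dot:
    is_cued_colour: bool         # dot is in the attended (cued) colour
    is_target: bool              # dot fails to deflect automatically
    past_deflection_point: bool  # dot has left the visible trajectory
    reached_object_centre: bool  # dot has reached the centre of the object
    distance_to_object: float    # arbitrary units, smaller means closer
    already_responded: bool = False


def feedback_for_press(dots_on_screen):
    """Return (dot, outcome) for a single button press."""
    # The press is attributed to the dot closest to the central object.
    dot = min(dots_on_screen, key=lambda d: d.distance_to_object)
    if dot.already_responded:
        outcome = "error"  # only one response is allowed per dot
    elif not dot.is_cued_colour:
        outcome = "false_alarm_distractor"  # non-cued colour, any situation
    elif not dot.past_deflection_point:
        outcome = "false_alarm_too_early"   # dot still on the trajectory
    elif not dot.is_target:
        outcome = "false_alarm_event"       # dot deflected automatically
    elif not dot.reached_object_centre:
        outcome = "hit"
    else:
        outcome = "miss"  # late press: dot already at the centre
    dot.already_responded = True
    return dot, outcome
```

Checking the non-cued colour first reflects the statement that a response to a distractor counts as a false alarm in any situation.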
Pre-processing
MEG data were filtered online using band-pass filters in the range of 0.03–200 Hz and notch-filtered at 50 Hz. We then imported the data into MATLAB and epoched them from −100 to 3000 ms relative to the trial onset time. We performed all the analyses once without and once with standard eye-artefact removal (post hoc, explained below) to see if eye movements and blinks had a significant impact on our results and interpretations. Finally, we down-sampled the data to 200 Hz for the decoding of our two key measures: direction of approach and distance to object (see below).
Eye-related artefact removal
There are two practical reasons that the effects of eye-related artefacts (e.g. eye-blinks, saccades, etc.) should not be dominantly picked up by our classification procedure. First, the decoding analysis is time-resolved and computed in small time windows (5 ms and 80 ms, for direction and distance information decoding, respectively). For eye-related artefacts to be picked up by the classifier, they would need to occur at consistent time points across trials of the same condition, and not in the other condition, which seems implausible. Second, our MEG helmet does not have the very frontal sensors where eye-related artefacts most strongly affect neural activations (Mognon et al., 2011). However, to check that our results were not dominantly driven by eye-movement artefacts, we also did a post hoc analysis in which we removed these using the ‘runica’ Independent Components Analysis (ICA) algorithm as implemented by EEGLAB. We used the ADJUST plugin (Mognon et al., 2011) of EEGLAB to decide which ICA components were related to eye artefacts for removal. This toolbox extracts spatiotemporal features from components to quantitatively measure if a component is related to eye movements or blinks. For all subjects except two, we identified only one component that was attributed to eye artefacts (i.e., located frontally and surpassing ADJUST’s threshold), which we removed. For the two other participants, we identified and removed two components with these characteristics. The body of the paper presents the results of our analyses on the data without eye-artefact removal, but the corrected data can be found in the Supplementary materials.
Multivariate pattern analyses
We measured the information contained in the multivariate (multi-sensor) patterns of MEG data by training a linear discriminant analysis (LDA) classifier using a set of training trials from two categories (e.g., for the direction of approach measure, this was dots approaching from left vs. right, see below). We then tested to see whether the classifier could predict the category of an independent (left-out) set of testing data from the same participant. We used a 10-fold cross-validation approach, splitting the data into training and testing subsets. Specifically, we trained the LDA classifier on 90% of the trials and tested it on the left-out 10% of the trials. This procedure was repeated 10 times, each time leaving out a different 10% subset of the data for testing (i.e., 10-fold cross validation).
We decoded two major task features from the neural data: (1) the direction of approach (left vs. right); and (2) the distance of each moving dot from the centrally fixed object (distance to object), which correspond to visual (retinal) information changing over time. Our interest was in the effect of selective attention (Attended vs. Unattended) and Target Frequency conditions (Active vs. Monitoring) on the neural representation of this information, and how the representation of information changed on trials when participants missed the target.
We decoded left vs. right directions of approach (as indicated by yellow arrows in Figure 1B) every 5 ms starting from 100 ms before the appearance of the dot on the screen to 3000 ms later. Please note that as each moving dot is considered a trial, trial time windows (epochs) overlapped for 62.2% of trials. In Monitoring blocks, 1.2% of target trials overlapped (two targets were on the screen simultaneously but lagged relative to one another). In Active blocks, 17.1% of target trials overlapped.
For the decoding of distance to object, we split the trials into the time windows corresponding to 15 equally spaced distances of the moving dot relative to the central object (as indicated by blue lines in Figure 1B), with distance 1 being closest to the object, and 15 being furthest away (the dot having just appeared on the screen). Each distance covered a time window of ~80 ms (varying slightly as dot trajectories varied in angle), which consisted of 4 or 5 signal samples depending on which of the 15 predetermined distances was temporally closest to each signal sample and therefore could incorporate it. Next, we concatenated the MEG signals from identical distances (splits) across both sides of the screen (left and right), so that every distance included data from dots approaching from both the left and right sides of the screen. This concatenation ensures that distance information decoding is not affected by the direction of approach. Finally, we trained and tested a classifier to distinguish between the MEG signals (a vector comprising data from all MEG sensors, concatenated over all time points in the relevant time window) pertaining to each pair of distances (e.g., 1 vs. 2) using a leave-one-out cross-validation procedure. As within-trial autocorrelation in signals could inflate classification accuracy (signal samples closer in time are more similar than those farther apart), we ensured that in every cross-validation run and each distance, the training and testing sets used samples from distinct sets of trials. To achieve this, trials were first allocated randomly into 10 folds, without separating their constituent signal samples. This way, the 4 or 5 signal samples from within each distance of a given trial remained together across all cross-validation runs and were never split across training and testing sets. We obtained classification accuracy for all possible pairs of distances (105 combinations of 15 distances).
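The trial-wise allocation described above, in which all signal samples from one trial stay on the same side of the train/test split, can be sketched as follows. This is a minimal illustration assuming samples are stored as `(trial_id, distance, features)` tuples; the function names, the tuple layout, and the random seed are hypothetical.

```python
import random


def allocate_trials_to_folds(trial_ids, n_folds=10, seed=0):
    """Randomly assign whole trials to folds; returns {trial_id: fold}."""
    ids = sorted(set(trial_ids))
    random.Random(seed).shuffle(ids)
    return {tid: i % n_folds for i, tid in enumerate(ids)}


def split_samples(samples, fold_of_trial, test_fold):
    """Split (trial_id, distance, features) samples by their trial's fold."""
    train = [s for s in samples if fold_of_trial[s[0]] != test_fold]
    test = [s for s in samples if fold_of_trial[s[0]] == test_fold]
    return train, test
```

Because the fold is looked up per trial rather than per sample, the 4 or 5 samples belonging to one distance of one trial can never appear in both the training and testing sets.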
To obtain a single decoding value per distance, we averaged the 14 classification values that corresponded to that distance against the other 14 distances. For example, the final decoding accuracy for distance 15 was an average of 15 vs. 14, 15 vs. 13, 15 vs. 12, and so on until 15 vs. 1. We repeated this procedure for our main Target Frequency conditions (Active vs. Monitoring), Attention conditions (Attended vs. Unattended), and Time on Task (first and last five blocks of each task condition, which are called early and late blocks here, respectively). This was done separately for correct and miss trials and for each participant separately.
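As a worked illustration of this averaging step, the sketch below enumerates the 105 distance pairs and reduces them to one value per distance. The accuracy values are placeholders invented for the example, not data from the study, and the function name is hypothetical.

```python
from itertools import combinations

N_DISTANCES = 15


def per_distance_accuracy(pair_accuracy):
    """Average, for each distance, its 14 pairwise accuracies.

    pair_accuracy maps a pair (i, j) with i < j to a decoding accuracy.
    """
    result = {}
    for d in range(1, N_DISTANCES + 1):
        values = [acc for (i, j), acc in pair_accuracy.items() if d in (i, j)]
        result[d] = sum(values) / len(values)
    return result


pairs = list(combinations(range(1, N_DISTANCES + 1), 2))  # 105 pairs
# Placeholder accuracies for illustration only (not real data).
toy = {(i, j): 0.5 + 0.01 * (j - i) for i, j in pairs}
summary = per_distance_accuracy(toy)
```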
Note that the ‘direction of approach’ and ‘distance to object’ information cannot be compared directly, as the two types of information are defined differently. The number of classes also differs between the two decoding analyses: only two classes for the direction information (left vs. right), compared to 15 classes for the distance information (15 distances).
Informational connectivity analysis
To evaluate possible modulations of brain connectivity between the attentional networks of the frontal brain and the occipital visual areas, we used a simplified version of our recently developed RSA-based informational connectivity analysis (Goddard et al., 2016; Goddard et al., 2019; Karimi-Rouzbahani, 2018; Karimi-Rouzbahani et al., 2019). Specifically, we evaluated the informational connectivity, which measures the similarity of distance decoding patterns between areas, across our main Target Frequency conditions (Active vs. Monitoring), Attention conditions (Attended vs. Unattended), and Time on Task (first and last five blocks of each task condition, which are called early and late blocks here, respectively). There are a few considerations in the implementation and interpretation of our connectivity analysis. First, it reflects the similarity of the way a pair of brain areas encode ‘distance’ information during the whole trial. This means that we could not use the component of time in the evaluation of our connectivity as we have implemented elsewhere (Karimi-Rouzbahani et al., 2019; Karimi-Rouzbahani et al., 2021). Second, rather than a simple correlation of magnitudes of decoding accuracy between two regions of interest, our connectivity measure reflects a correlation of the patterns of decoding accuracies across conditions (i.e., distances here). Finally, our connectivity analysis evaluates sensory information encoding, rather than other aspects of cognitive or motor information encoding, which might have also been affected by our experimental manipulations.
Connectivity was calculated separately for correct and miss trials, using representational dissimilarity matrices (RDMs; Kriegeskorte et al., 2008). To construct the RDMs, we decoded all possible combinations of distances from each other, yielding a 15 by 15 cross-distance classification matrix for each condition separately. We obtained these matrices from peri-occipital and peri-frontal areas to see how the manipulation of Attention, Target Frequency, and Time on Task modulated the correlation of information (RDMs) between those areas on correct and miss trials. We quantified connectivity using Spearman’s rank correlation of the matrices obtained from those areas, only including the lower triangle of the RDMs (105 decoding values). To avoid bias when comparing the connectivity on correct vs. miss trials, the number of trials was equalised by subsampling the correct trials to the number of miss trials; the subsampling was repeated 100 times and the results were averaged for comparison with miss trials.
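The connectivity measure itself, a Spearman correlation between the lower triangles of two 15 by 15 matrices, can be sketched in plain Python as below. The function names are hypothetical, and the sketch covers only the correlation step, not the decoding that fills the matrices or the 100-fold subsampling of correct trials.

```python
def lower_triangle(matrix):
    """Return entries below the main diagonal, row by row."""
    return [matrix[r][c] for r in range(len(matrix)) for c in range(r)]


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    out = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in order[start:end + 1]:
            out[k] = mean_rank
        start = end + 1
    return out


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def informational_connectivity(rdm_a, rdm_b):
    """Spearman correlation between the lower triangles of two RDMs."""
    return pearson(ranks(lower_triangle(rdm_a)), ranks(lower_triangle(rdm_b)))
```

Using ranks rather than raw accuracies makes the measure sensitive to the pattern of decoding across distance pairs rather than to its overall magnitude, in line with the second consideration noted above.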
Error data analysis
Next, we asked what information was coded in the brain when participants missed targets. To study information coding in the brain on miss trials, where the participants failed to press the button when targets failed to automatically deflect, we used our recently developed method of error data analysis (Woolgar et al., 2019). Essentially, this analysis asks whether the brain represents the information similarly on correct and miss trials. For that purpose, we trained a classifier using the neural data from a proportion of correct trials (i.e., when the target dot was detected and manually deflected punctually) and tested on both the left-out portion of the correct trials (i.e., cross-validation) and on the miss trials. If decoding accuracy is equal between the correct and miss trials, we can conclude that information coding is maintained on miss trials as it is on correct trials. However, if decoding accuracy is lower on miss trials than on correct trials, we can infer that information coding differs on miss trials, consistent with the change in behaviour. Since correct and miss trials were visually different after the deflection point, we only used data from before the deflection point.
For these error data analyses, the number of folds for cross-validation was determined based on the proportion of miss to correct trials (number of folds = number of miss trials/number of correct trials). This allowed us to test the trained classifiers with equal numbers of miss and correct trials to avoid bias in the comparison.
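The train-on-correct, test-on-both logic of this section can be illustrated with a toy example. The sketch below swaps in a nearest-centroid classifier as a simple stand-in for LDA so that it runs with the standard library alone; the function names are hypothetical and the choice of folds is left out.

```python
def centroid(rows):
    """Mean feature vector of a list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def train_nearest_centroid(features, labels):
    """Stand-in for LDA: store one mean pattern per class."""
    classes = sorted(set(labels))
    return {c: centroid([f for f, l in zip(features, labels) if l == c])
            for c in classes}


def predict(model, row):
    """Return the class whose mean pattern is closest to the row."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda c: sq_dist(model[c], row))


def accuracy(model, features, labels):
    """Proportion of rows assigned to their true class."""
    hits = sum(predict(model, f) == l for f, l in zip(features, labels))
    return hits / len(labels)


def error_data_analysis(train_correct, test_correct, test_miss):
    """Each argument is a (features, labels) pair; train on correct trials only."""
    model = train_nearest_centroid(*train_correct)
    return accuracy(model, *test_correct), accuracy(model, *test_miss)
```

A lower second value than first would correspond to the case described above in which information coding differs on miss trials.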
Predicting behavioural performance from neural data
We developed a new method to predict, based on the most task-relevant information in the neural signal, whether or not a participant would press the button for a target dot in time to deflect it on a particular trial. This method includes three steps, with the third step being slightly different for the left-out testing participant vs. the other 20 participants. First, for every participant, we trained 105 classifiers using ~80% of correct trials to discriminate the 15 distances. Second, we tested those classifiers using half of the left-out portion (~10%) of the correct trials, which we called validation trials, by simultaneously accumulating (i.e., including in averaging) the accuracies of the classifiers at each distance and further distances as the validation dot approached the central object. The validation set allowed us to determine a decision threshold for predicting the outcome of each testing trial: whether it was a correct or miss trial. Third, we performed a second-level classification on testing trials which were the other half (~10%) of the left-out portion of the correct trials and the miss trials, using each dot’s accumulated accuracy calculated as in the previous step. Accordingly, if the testing dot’s accumulated accuracy was higher than the decision threshold, it was predicted as correct, otherwise miss. For all participants, except for the left-out testing one, the decision threshold was chosen from a range of multiples (0.1 to 4 in steps of 0.1) of the standard deviation below the accumulated accuracy obtained for the validation set on the second step. For determining the optimal threshold for the testing participant, however, instead of a range of multiples, we used the average of the best performing multiples (i.e., the one which predicted the behavioural outcome of the trial more accurately) obtained from the other 20 participants. This avoided circularity in the analysis.
To give more detail on the second and third steps, when the validation/testing dots were at distance #15, we averaged the accuracies of the 14 classifiers trained to classify dots at distance #15 from all other distances. Accordingly, when the dot reached distance #14, we also included and averaged accuracies from classifiers which were trained to classify distance #14 from all other distances, leading to 27 classifier accuracies. Therefore, by the time the dot reached distance #1, we had 105 classifier accuracies to average and predict the behavioural outcome of the trial. Every classifier’s accuracy was either 1 or 0, corresponding to correct or incorrect classification of the dot’s distance, respectively. Note that accumulation of classifiers’ accuracies, as compared to using classifier accuracy on every distance independently, provides a more robust and smoother classification measure for deciding on the label of the trials. The validation set, which was different from the testing set, allowed us to set the decision threshold based on the validation data within each subject and from the 20 participants and finally test our prediction classifiers on a separate testing set from the 21st individual participant, iteratively. The optimal threshold was 0.4 (± 0.07) times the SD below the decoding accuracy on the validation set across participants.
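The accumulation and thresholding steps can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code: the function names are hypothetical, and the threshold is assumed to be the mean of the validation trials' accumulated accuracies minus a multiple of their standard deviation.

```python
from statistics import mean, stdev

N_DISTANCES = 15


def accumulated_accuracy(outcomes):
    """Running mean of binary classifier outcomes as the dot approaches.

    outcomes[d] is a list of 0/1 values for the classifiers that first
    become available at distance d (14 at distance 15, 13 at 14, ..., 0 at 1).
    Returns {distance: (n_classifiers_so_far, accumulated_accuracy)}.
    """
    seen, result = [], {}
    for d in range(N_DISTANCES, 0, -1):
        seen.extend(outcomes.get(d, []))
        result[d] = (len(seen), mean(seen))
    return result


def decision_threshold(validation_accuracies, sd_multiple):
    """Threshold set a multiple of the SD below the validation mean (assumed)."""
    return mean(validation_accuracies) - sd_multiple * stdev(validation_accuracies)


def predict_outcome(accumulated, threshold):
    """Predict 'correct' when accumulated accuracy exceeds the threshold."""
    return "correct" if accumulated > threshold else "miss"
```

The classifier counts produced by `accumulated_accuracy` follow the sequence given in the text: 14 at distance #15, 27 at distance #14, and 105 by distance #1.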
Statistical analyses
To determine the evidence for the null and the alternative hypotheses, we used Bayes analyses as implemented by Krekelberg (https://klabhub.github.io/bayesFactor/) based on Rouder et al., 2012. We used standard rules for interpreting levels of evidence (Lee and Wagenmakers, 2005; Dienes, 2014): Bayes factors of >10 and <1/10 were interpreted as strong evidence for the alternative and null hypotheses, respectively, and >3 and <1/3 were interpreted as moderate evidence for the alternative and null hypotheses, respectively. We interpreted the Bayes factors which fell between 3 and 1/3 as reflecting insufficient evidence either way.
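These interpretation rules amount to a simple lookup, sketched below in Python. The function name and label strings are hypothetical; the cut-offs are those stated above, with the Bayes factor expressed as alternative over null.

```python
def interpret_bayes_factor(bf):
    """Map a Bayes factor (alternative over null) to an evidence label."""
    if bf > 10:
        return "strong evidence for the alternative"
    if bf > 3:
        return "moderate evidence for the alternative"
    if bf < 1 / 10:
        return "strong evidence for the null"
    if bf < 1 / 3:
        return "moderate evidence for the null"
    return "insufficient evidence"
```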
Specifically, for the behavioural data, we asked whether there was a difference between Active and Monitoring conditions in terms of miss rates and RTs. Accordingly, we calculated the Bayes factor as the probability of the data under alternative (i.e., difference) relative to the null (i.e., no difference) hypothesis in each block separately. In the decoding, we repeated the same procedure to evaluate the evidence for the alternative hypothesis of a difference between decoding accuracies across conditions (e.g., Active vs. Monitoring and Attended vs. Unattended) vs. the null hypothesis of no difference between them, at every time point/distance. To evaluate evidence for the alternative of above-chance decoding accuracy vs. the null hypothesis of no difference from chance, we calculated the Bayes factor between the distribution of actual accuracies obtained and a set of 1000 random accuracies obtained by randomising the class labels across the same pair of conditions (null distribution) at every time point/distance.
To evaluate the evidence for the alternative of main effects of different factors (Attention, Target Frequency, and Time on Task) in decoding, we used Bayes factor ANOVA (Rouder et al., 2012). This analysis evaluates the evidence for the null and alternative hypothesis as the ratio of the Bayes factor for the full model ANOVA (i.e., including all three factors of Target Frequency, Attention, and the Time on Task) relative to the restricted model (i.e., including the two other factors while excluding the factor being evaluated). For example, for evaluating the main effect of Time on Task, the restricted model included Attention and Target Frequency factors but excluded the factor of Time on Task.
The priors for all Bayes factor analyses were determined based on Jeffreys-Zellner-Siow priors (Jeffreys, 1961; Zellner and Siow, 1980), which are from the Cauchy distribution based on the effect size that is initially calculated in the algorithm using a t-test (Rouder et al., 2012). The priors are data-driven and have been shown to be invariant with respect to linear transformations of measurement units (Rouder et al., 2012), which reduces the chance of being biased towards the null or alternative hypotheses.
Appendix 1
Supplementary materials
Source text file for Figure 3—figure supplement 1
Our first analysis was to verify that our analyses could decode the important aspects of the display, relative to chance, given the overlapping moving stimuli. Here, we give the detailed results of this analysis.
We started with the information about the direction of approach (top left or bottom right of screen) which is a strong visual signal but not critical to the task decision. From 95 ms post-stimulus onset onwards, this visual information could be decoded from the MEG signal for all combinations of the factors: Attended and Unattended dots, both Target Frequency conditions (Active, Monitoring), and both our Time on Task durations (Early – first 5 blocks; Late – last 5 blocks; all BF > 3, different from chance).
All conditions were decodable above chance until at least 385 ms post-stimulus onset (BF > 3; Figure 3—figure supplement 1A), which was when the dots came closer to the centre, losing their visual difference. There was a rapid increase in information about the direction of approach between 50 ms and 150 ms post-stimulus onset, consistent with an initial forward sweep of visual information processing (VanRullen, 2007; Karimi-Rouzbahani et al., 2017; Karimi-Rouzbahani et al., 2019). For attended dots only (but regardless of the Target Frequency or Time on Task), the information then increased again before the deflection time, and remained different from chance until 1915 ms post-stimulus onset, which is just before the dot faded (Figure 3—figure supplement 1A). The second rise of decoding, which was more pronounced for the attended dots, could reflect the increasing relevance to the task as the dot approached the crucial deflection point, but it could also be due to higher visual acuity in foveal compared to peripheral areas of the visual field. The decoding peak observed after the deflection point for the attended dots was most probably caused by the large visual difference between the deflection trajectories for the dots approaching from the left vs. right side of the screen (see the deflection trajectories in Figure 1A).
The most task-relevant feature of the motion is the distance between the moving dot and the central object, with the deflection point of the trajectories being the key decision point. We therefore tested for decoding of distance information (distance to object, see Materials and methods). There was a brief increase in decoding of distance to object for attended dots across the other factors (Target Frequency and Time on Task) between the 15th and 10th distances and for the unattended dots across the other factors between 15th and the 12th distances. This corresponds to the first 400 ms for the attended dots and the first 240 ms for the unattended dots after the onset (Figure 3—figure supplement 1B). Distance decoding then dropped somewhat before ascending again as the dot approached the deflection point. The second rise of decoding, which was more pronounced for the attended dots, could reflect the increasing relevance to the task as the dot approached the crucial deflection point, but it could also be due to higher visual acuity in foveal compared to peripheral areas of the visual field. There was moderate or strong evidence that decoding of distance information for all attended conditions was greater than chance (50%, BF > 3) across all 15 distance levels with the exception of distance 8 in the late monitoring condition (Figure 3—figure supplement 1B, left panels). There were also timepoints with greater than chance decoding for the unattended conditions but these were far less consistent (Figure 3—figure supplement 1B, right panels).
Data availability
We have shared the Magnetoencephalography data (i.e. time series) as well as behavioral data in Matlab '.mat' format on the Open Science Framework website at https://osf.io/5aw8v/ with the DOI: 10.17605/OSF.IO/5AW8V. We have also uploaded a video of the "Multiple-Object-Monitoring" paradigm, developed for this study, for easier understanding of the task at the same address. The mentioned address is dedicated to this project and we will regularly update the contents to make them easier to follow for other researchers.
- Open Science Framework, ID 10.17605/OSF.IO/5AW8V. Neural signatures of vigilance decrements predict behavioural errors before they occur.
References
- Beyond functional connectivity: investigating networks of multivariate representations. Trends in Cognitive Sciences 22:258–269. https://doi.org/10.1016/j.tics.2017.12.002
- Magnetoencephalography for brain electrophysiology and imaging. Nature Neuroscience 20:327–339. https://doi.org/10.1038/nn.4504
- A continuous performance test of brain damage. Journal of Consulting Psychology 20:343–350. https://doi.org/10.1037/h0043220
- Covert auditory attention generates activation in the rostral/dorsal anterior cingulate cortex. Journal of Cognitive Neuroscience 14:637–645. https://doi.org/10.1162/08989290260045765
- Book: Statistical Power Analysis for the Behavioral Sciences. New York: Academic Press.
- Using Bayes to get the most out of non-significant results. Frontiers in Psychology 5:781. https://doi.org/10.3389/fpsyg.2014.00781
- The multiple-demand (MD) system of the primate brain: mental programs for intelligent behaviour. Trends in Cognitive Sciences 14:172–179. https://doi.org/10.1016/j.tics.2010.01.004
- Common regions of the human frontal lobe recruited by diverse cognitive demands. Trends in Neurosciences 23:475–483. https://doi.org/10.1016/S0166-2236(00)01633-7
- Mal-adaptation of event-related EEG responses preceding performance errors. Frontiers in Human Neuroscience 4:65. https://doi.org/10.3389/fnhum.2010.00065
- Performance-related activity in medial rostral prefrontal cortex (area 10) during low-demand tasks. Journal of Experimental Psychology: Human Perception and Performance 32:45–58. https://doi.org/10.1037/0096-1523.32.1.45
- Decoding dynamic brain patterns from evoked responses: a tutorial on multivariate pattern analysis applied to time series neuroimaging data. Journal of Cognitive Neuroscience 29:677–697. https://doi.org/10.1162/jocn_a_01068
- Feature absence-presence and two theories of lapses of sustained attention. Psychological Research 75:384–392. https://doi.org/10.1007/s00426-010-0316-1
- Signal salience and the mindlessness theory of vigilance. Acta Psychologica 129:18–25. https://doi.org/10.1016/j.actpsy.2008.04.002
- The effects of local prevalence and explicit expectations on search termination times. Attention, Perception, & Psychophysics 74:115–123. https://doi.org/10.3758/s13414-011-0225-4
- Use of EEG workload indices for diagnostic monitoring of vigilance decrement. Human Factors: The Journal of the Human Factors and Ergonomics Society 56:1136–1149. https://doi.org/10.1177/0018720814526617
- Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience 2:4. https://doi.org/10.3389/neuro.06.004.2008
- Bayesian statistical inference in psychology: comment on trafimow (2003). Psychological Review 112:662–668. https://doi.org/10.1037/0033-295X.112.3.662
- A review of psychophysiological measures to assess cognitive states in Real-World driving. Frontiers in Human Neuroscience 13:57. https://doi.org/10.3389/fnhum.2019.00057
- The breakdown of vigilance during prolonged visual search. Quarterly Journal of Experimental Psychology 1:6–21. https://doi.org/10.1080/17470214808416738
- Prestimulus alpha and mu activity predicts failure to inhibit motor responses. Human Brain Mapping 30:1791–1800. https://doi.org/10.1002/hbm.20763
- Electroencephalographic and peripheral temperature dynamics during a prolonged psychomotor vigilance task. Accident Analysis & Prevention 126:198–208. https://doi.org/10.1016/j.aap.2017.10.014
- The brain's default mode network. Annual Review of Neuroscience 38:433–447. https://doi.org/10.1146/annurev-neuro-071013-014030
- Sustaining visual attention in the face of distraction: a novel gradual-onset continuous performance task. Attention, Perception, & Psychophysics 75:426–439. https://doi.org/10.3758/s13414-012-0413-x
- Default bayes factors for ANOVA designs. Journal of Mathematical Psychology 56:356–374. https://doi.org/10.1016/j.jmp.2012.08.001
- Divergent response-time patterns in vigilance decrement tasks. Journal of Experimental Psychology: Human Perception and Performance 46:1058–1076. https://doi.org/10.1037/xhp0000813
- Software: Deterioration of Performance on a Short-Term Perceptual-Motor Task.
- Exploring cortical attentional system by using fMRI during a continuous performance test. Computational Intelligence and Neuroscience 2010:1–6. https://doi.org/10.1155/2010/329213
- The effects of signal salience and caffeine on performance, workload, and stress in an abbreviated vigilance task. Human Factors: The Journal of the Human Factors and Ergonomics Society 42:183–194. https://doi.org/10.1518/001872000779656480
- The power of the feed-forward sweep. Advances in Cognitive Psychology 3:167–176. https://doi.org/10.2478/v10053-008-0022-3
- Vigilance requires hard mental work and is stressful. Human Factors: The Journal of the Human Factors and Ergonomics Society 50:433–441. https://doi.org/10.1518/001872008X312152
- The neural bases of momentary lapses in attention. Nature Neuroscience 9:971–978. https://doi.org/10.1038/nn1727
- Sustained attention and serotonin: a pharmaco-fMRI study. Human Psychopharmacology: Clinical and Experimental 23:221–230. https://doi.org/10.1002/hup.923
- Low target prevalence is a stubborn source of errors in visual search tasks. Journal of Experimental Psychology: General 136:623–638. https://doi.org/10.1037/0096-3445.136.4.623
- Adaptive coding of task-relevant information in human frontoparietal cortex. Journal of Neuroscience 31:14592–14599. https://doi.org/10.1523/JNEUROSCI.2616-11.2011
- Flexible coding of task rules in frontoparietal cortex: an adaptive system for flexible cognitive control. Journal of Cognitive Neuroscience 27:1895–1911. https://doi.org/10.1162/jocn_a_00827
- Malleable attentional resources theory: a new explanation for the effects of mental underload on performance. Human Factors: The Journal of the Human Factors and Ergonomics Society 44:365–375. https://doi.org/10.1518/0018720024497709
- Book: Posterior odds ratios for selected regression hypotheses. In: Bernardo J. M, DeGroot M. H, Lindley D. V, Smith A. F. M, editors. Bayesian Statistics: Proceedings of the First International Meeting Held in Valencia (Spain). University of Valencia. pp. 585–603.
Decision letter
- Peter Kok, Reviewing Editor; University College London, United Kingdom
- Floris P de Lange, Senior Editor; Radboud University, Netherlands
In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.
Acceptance summary:
In our modern work environment there are many situations where humans have to pay sustained attention in order to catch infrequent computer errors, such as while monitoring railway systems. Combining a novel multiple-object monitoring task with computationally sophisticated analyses of human magnetoencephalography (MEG) data, Karimi-Rouzbahani and colleagues find that increasing the rarity of targets leads to a worse neural representation of a crucial target feature (distance to a potential collision). They were also able to predict whether participants would catch or miss a target based on their neural data, which may prove a first step towards developing methods to pre-empt such potentially disastrous errors.
Decision letter after peer review:
Thank you for submitting your article "Neural signatures of vigilance decrements predict behavioural errors before they occur" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Floris de Lange as the Senior Editor. The reviewers have opted to remain anonymous.
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
As the editors have judged that your manuscript is of interest, but as described below that additional analyses are required before it is published, we would like to draw your attention to changes in our revision policy that we have made in response to COVID-19 (https://elifesciences.org/articles/57162). First, because many researchers have temporarily lost access to the labs, we will give authors as much time as they need to submit revised manuscripts. We are also offering, if you choose, to post the manuscript to bioRxiv (if it is not already there) along with this decision letter and a formal designation that the manuscript is "in revision at eLife". Please let us know if you would like to pursue this option. (If your work is more suitable for medRxiv, you will need to post the preprint yourself, as the mechanisms for us to do so are still in development.)
Summary:
Karimi-Rouzbahani and colleagues investigate vigilance and sustained monitoring, using a multiple-object monitoring task in combination with magnetoencephalography (MEG) recordings in humans to investigate the neural coding and decoding-based connectivity of vigilance decrements. Using computationally sophisticated multivariate analyses of the MEG data, they found that increasing the rarity of targets led to weaker decoding accuracy for the crucial feature (distance to an object), and weaker decoding was also found for misses compared to correct responses.
While the reviewers agreed the study was interesting, they also had concerns about the approach and the interpretation of the results.
Essential revisions:
1. The introduction makes it clear that the authors acknowledge that there may be multiple sources of interference contributing to declining vigilance over time: the encoding of sensory information, appropriate responses to the stimuli, or a combination of both. In the introduction, it would help if the authors review how infrequent targets affect response patterns. In addition, it would help if the theoretical approach and assumptions of the authors were explicitly stated. For instance, the a priori assumptions surrounding the connectivity analysis should be acknowledged and discussed in the interpretation of the pattern of results (e.g., p. 32, line 658). Specifically, the focus on connectivity between frontal and occipital areas seems to assume the effects are related to sensory processing alone, but this does not preclude other influences. For instance, effects could also occur on response patterns. These considerations should be added as caveats to the interpretation.
2. It is not clear what role eye fixations play here. Participants could freely scan the display, so the retinotopic representations would change depending on where the participants fixate, but at the same time the authors claim that eye position did not matter. Materials and methods, Page 11: The authors state that "We did not perform eye-blink artefact removal because it has been shown that blink artefacts are successfully ignored by multivariate classifiers as long as they are not systematically different between decoded conditions (Grootswagers et al., 2017)." This is not a sufficiently convincing argument. Firstly, the cited paper makes a theoretical argument rather than showing this empirically. Secondly, even if this were true, the frequency of eye-related artefacts seems to be of crucial importance for a paradigm that involves moving stimuli (and no fixation). There could indeed be systematic differences between conditions that are then picked up by the classifier (i.e. if more eye-blinks are related to tiredness and in turn decreased vigilance). The authors should show that their results replicate if standard artefact removal is performed on the data.
Relatedly, on page 16 the authors claim that "If the prediction from the MEG decoding was stronger than that of the eye tracking, it would mean that there was information in the neural signal over and above any artefact associated with eye movement." This statement is problematic: Firstly, such a result might only mean that prediction from MEG decoding is stronger than decoding from eye-movements, but not relate to "artefacts" in general, to which blinks would also count. Secondly, given that the signal underlying both analyses is entirely different (and the number of features), it is not valid to directly compare the results between these analyses. More detailed analyses of fixations and fixation duration on targets and distractors might indeed be strongly related to behaviour. What is decodable at a given time might just be driven by what participants are looking at.
3. One key finding was that while classifying the direction of the dots was modulated by attention, it was insensitive to many features that were captured by a classifier trained to decode the distance from the deflection. This is surprising since both are spatial features that seem hard to separate. In addition, the procedures to decode direction vs distance were very different. Do these differences still hold if the procedure used to train the two classifiers is more analogous or matched?
4. The distance classifier was trained using only correct trials. Then in the testing stage, it was generalized to either correct or miss trials. While there is a rationale for using correct trials only, could the decoding of error prediction be an artifact of the training sample, reflecting the fact that misses were not included in the training set?
5. By accumulating classifiers across time, it looks like classifier prediction improves closer to deflection. However, this could also be due to the fact that the total amount of information provided to the classifier increased. Is there a way to control for the total amount of information at different timepoints (e.g., by using a trailing window lag rather than accumulation), or contrast the classifier that derives from accumulating information with the classifier trained moment-by-moment?
6. Predicting miss trials: The implicit assumption here is that there is "less representation" for miss trials compared to correct trials (e.g., of distance to object). But even for miss trials, the representation is significantly above chance. However, maybe the lower accuracy for the miss trials resulted from on average more trials in which the target was not represented at all rather than a weaker representation across all trials. This would call into question the interpretation of a decline in coding. In other words, on a single trial, a representation might only be present (but could result in a miss for other reasons) or not present (which would be the case for many miss trials), and the lower averages for misses would then be the result of more trials in which the information was completely absent.
It could be that the results of the subsequent analysis (predicting misses and correct responses before they occur) are in conflict with this more pessimistic interpretation. If we understand this correctly, here the classifier predicts Distance to Object for each individual trial, and Figure 6B shows that while there is a clear difference between the correct and miss trials, the latter can still be predicted above chance level but never exceed the threshold? If this is true for all single trials, this would indeed speak for a weak but "unused" representation on miss trials. But for this the authors need to show how many of the miss trials per participant had a chance-level accuracy (i.e. might be truly unrepresented), and how many were above chance but did not exceed the threshold (i.e. might have been "less represented").
7. The relationship between the vigilance decrement and error prediction. Is vigilance decrement driving the error prediction? That is, if errors increase later on, and the signal goes down, then maybe the classifier is worse. Alternatively, maybe the classifier predictions do not necessarily monotonically decrease throughout the experiment. Is the classifier equally successful at predicting errors early and late?
8. When decoding distance, active decoding declines from early to late, even though performance does not decline (or even slightly improves from early to late). This discrepancy seems hard to explain. Is this decline in classification driven by differences in the total signal from early to late?
9. Classifier performance was extremely high almost immediately after trial onset. Does the classifier perform at chance before the trial onset, or does this reflect sustained but not stimulus-specific information?
10. The connectivity analysis appears to be just a correlation of decoding results between two regions of interest. This means, if one "region" allows for decoding the distance to the object, the other one does too. However, this alone does not equal connectivity. It could simply mean that patterns across the entire brain allow for decoding the same information. For example, it would not be surprising to find that both ROIs correlate more strongly for correct trials (i.e. the brain has obviously represented the relevant information) than for errors (i.e. the brain has failed to represent the information), without this necessarily being related to connectivity at all. The more parsimonious interpretation here is that information might have been represented across all channels at this time. The authors show no evidence that only these two (arbitrarily selected) "regions" encode the information while others do not. To show evidence for meaningful connectivity, (a) the spread of information should be limited to small sub-regions, and (b) the decoding results in one "region" should predict the results in another region in time (as for DCM).
11. The display of the results is very dense, and it is not always clear whether decoding for a specific variable was above chance or not. The authors often focused on relative differences, making it difficult to fully understand the meaning of the full pattern of results. The Bayes-factor plots in the decoding results figures are so cramped that it is very difficult to actually see the individual dots and to unpack all of this (e.g., Figure 3). Could this complexity be somehow reduced, maybe by dividing the panels into separate figures? The two top panels in Figure 3B should also include the chance level as in A. It looks like the accuracy is very low for unattended trials, which is only true in comparison to attended trials, but (as also shown in Supplementary Figure 1) it was clearly also encoded in unattended trials, which is very important for interpreting the results.
12. While this is methodologically interesting work, there is no convincing case made for what exactly the contribution of this study is for theories of vigilance. It seems that the findings can be reduced to that a lack of decodability of relevant target features from brain activity predicts that participants will miss the target. This alone, however, does not seem to be very novel. Even if the issues above are addressed, the study only demonstrates that with less attention to the target, there is less evidence of representations of the relevant features of targets in the brain. The authors also find the expected decrements for rare targets and when participants do not actively monitor the targets. How do these findings contribute to "theories of vigilance", as claimed by the authors?
https://doi.org/10.7554/eLife.60563.sa1
Author response
Essential revisions:
1. The introduction makes it clear that the authors acknowledge that there may be multiple sources of interference contributing to declining vigilance over time: the encoding of sensory information, appropriate responses to the stimuli, or a combination of both. In the introduction, it would help if the authors review how infrequent targets affect response patterns.
We added the relevant information about response patterns to the Introduction as below:
“To date, most vigilance and rare target studies have used simple displays with static stimuli. […] Overall, vigilance decrements in terms of poorer performance can be seen in both accuracy and in reaction times, depending on the task.”
In addition, it would help if the theoretical approach and assumptions of the authors were explicitly stated. For instance, the a priori assumptions surrounding the connectivity analysis should be acknowledged and discussed in the interpretation of the pattern of results (e.g., p. 32, line 658). Specifically, the focus on connectivity between frontal and occipital areas seems to assume the effects are related to sensory processing alone, but this does not preclude other influences. For instance, effects could also occur on response patterns. These considerations should be added as caveats to the interpretation.
We have now carefully reviewed the manuscript to be sure our assumptions and approach for the connectivity analyses are explicit. We have added the suggested material to the interpretation of the pattern of results, and acknowledge the potential for other influences on the connectivity results as caveats to our interpretation.
We now limit our discussion of the connectivity results as relevant to evaluating sensory aspects of information encoding (in the Materials and methods section) as below:
“There are a few considerations in the implementation and interpretation of our connectivity analysis. First, it reflects the similarity of the way a pair of brain areas encode “distance” information during the whole trial. This means that we could not use the component of time in the evaluation of our connectivity as we have implemented elsewhere (Karimi-Rouzbahani et al., 2019; Karimi-Rouzbahani et al., 2020). Second, rather than a simple correlation of magnitudes of decoding accuracy between two regions of interest, our connectivity measure reflects a correlation of the patterns of decoding accuracies across conditions (i.e., distances here). Finally, our connectivity analysis evaluates sensory information encoding, rather than other aspects of cognitive or motor information encoding, which might have also been affected by our experimental manipulations.”
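The second consideration above can be illustrated with a minimal sketch. It assumes that each region of interest yields one decoding accuracy per distance condition and that the similarity measure is a Pearson correlation. The correlation type, the function name `informational_connectivity` and the toy accuracy values are illustrative assumptions, not the procedure or data reported in the manuscript.

```python
import numpy as np


def informational_connectivity(acc_roi_a, acc_roi_b):
    """Pearson correlation between two regions' decoding-accuracy patterns.

    Each input holds one decoding accuracy per condition (here, per distance).
    Hypothetical helper for illustration only.
    """
    a = np.asarray(acc_roi_a, dtype=float)
    b = np.asarray(acc_roi_b, dtype=float)
    if a.ndim != 1 or a.shape != b.shape:
        raise ValueError("expected two 1-D patterns of equal length")
    # Centre each pattern so that overall accuracy magnitude is removed.
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a @ a) * (b @ b))
    if denom == 0:
        return float("nan")  # a flat pattern has no defined correlation
    return float((a @ b) / denom)


# Toy accuracies for 15 distance conditions (made-up numbers).
distances = np.arange(15)
frontal = 0.50 + 0.02 * distances    # higher overall accuracy
occipital = 0.30 + 0.01 * distances  # lower overall accuracy, same shape

same_shape = informational_connectivity(frontal, occipital)
reversed_shape = informational_connectivity(frontal, frontal[::-1])
print(same_shape, reversed_shape)
```

In this toy case the two regions differ in overall accuracy but share the same profile across distances, so the measure is driven by the shape of the pattern rather than by its magnitude.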
We now provide the rationale and our predictions about the impact of visual and auditory attention on our connectivity metric (in the Results section) based on the literature, as below.
“In line with attentional effects on sensory perception, we predicted that connectivity between the frontal attentional and sensory networks should be lower when not attending (vs. attending; Goddard et al., 2019). Behavioural errors were also previously predicted by reduced connection between sensory and ‘vigilance-related’ frontal brain areas (Ekman et al., 2012; Sadaghiani et al., 2015). Therefore, we predicted a decline in connectivity when targets were lower in frequency, and with increased time on task, as these led to increased errors in behaviour, specifically under vigilance conditions in our task (i.e., late blocks in Monitoring vs. late blocks in Active; Figure 2).”
We have toned down our conclusions (a) and added the possibility that other factors could also contribute to our vigilance decrement effects in the Discussion, as below (b).
a) “One explanation for the decrease in decoding accuracy for task-relevant information could be that when people monitor for rare targets, they process or encode the relevant sensory information less effectively as the time passes, relative to conditions in which they are actively engaged in completing the task.”
b) “Apart from sensory information coding and sensory-based informational connectivity, which were evaluated here and provide plausible neural correlates for the vigilance decrement, there may be other correlates we have not addressed. Effects on response-level selection, for example, independently or in conjunction with sensory information coding, could also affect performance under vigilance conditions, and need further research.”
2. It is not clear what role eye fixations play here. Participants could freely scan the display, so the retinotopic representations would change depending on where the participants fixate, but at the same time the authors claim that eye position did not matter.
We did not mean to claim that eye position doesn’t matter at all, but rather that our design ensures minimal effect of eye-related artefacts on the classifiers. We have carefully revised the manuscript to ensure this is clear (detailed response and additional analyses below).
Materials and methods, Page 11: The authors state that "We did not perform eye-blink artefact removal because it has been shown that blink artefacts are successfully ignored by multivariate classifiers as long as they are not systematically different between decoded conditions (Grootswagers et al., 2017)." This is not a sufficiently convincing argument. Firstly, the cited paper makes a theoretical argument rather than showing this empirically. Secondly, even if this were true, the frequency of eye-related artefacts seems to be of crucial importance for a paradigm that involves moving stimuli (and no fixation). There could indeed be systematic differences between conditions that are then picked up by the classifier (i.e. if more eye-blinks are related to tiredness and in turn decreased vigilance). The authors should show that their results replicate if standard artefact removal is performed on the data.
We appreciate the point here. There are theoretical and practical arguments that eye-related artefacts should not drive our effects, but to be sure we also now present our results with standard artefact removal as well.
Overall increases in eye-related artefacts (such as blinks) over time-on-task would not be an issue, as our design relies on comparisons between Active and Monitoring, and so any general effects should have negligible impact. But for these comparisons, there may indeed be differences in the number of eye blinks – and in fact, these conditions involve different levels of attentional recruitment, which has previously been shown to correlate with the frequency of eye blinks (Nakano et al., 2013). Thus, we certainly do not want to claim that eye-related artefacts do not matter at all, but, importantly, there are two practical reasons that the effects of eye blinks should not be dominantly picked up by our classification procedure. First, the decoding analysis is time-resolved and computed in small time windows (5 ms and 80 ms, for direction and distance information decoding, respectively). For eye blink patterns to be picked up by the classifier, they would need to occur at consistent time points across trials of the same condition, and not in the other condition, which seems implausible. Second, our MEG helmet does not have the very frontal sensors where eye-related artefacts most strongly affect neural activations (Mognon et al., 2011), but we appreciate that this does not rule out their presence altogether.
To check empirically that eye-related artefacts were not driving our effects, we re-ran our analyses with standard artefact removal as requested. We see the same pattern of results as before, for both the key task-relevant feature of distance-to-object and the less relevant feature of direction of approach. We present the full comparative analysis in Figure 3—figure supplement 2. In the paper we now state that the results replicate with artefact removal and present the additional eye-movement-corrected results in the supplementary materials.
“…,we also did a post-hoc analysis in which we removed these using the “runica” Independent Components Analysis (ICA) algorithm as implemented in EEGLAB. We used the ADJUST plugin (Mognon et al., 2011) of EEGLAB to decide which ICA components were related to eye artefacts for removal. This toolbox extracts spatiotemporal features from components to quantitatively measure if a component is related to eye movements or blinks. For all subjects except two, we identified only one component that was attributed to eye artefacts (i.e., located frontally and surpassing ADJUST’s threshold), which we removed. For the two other participants, we identified and removed two components with these characteristics.”
Figure 3—figure supplement 2B shows the decoding results for the key task-relevant feature of distance-to-object without and with eye-related artefact removal, in the left and right panels, respectively. The main effects of attention and time on the task and the key interaction between target frequency and time on the task remain after eye artefact removal, replicating our initial pattern of results.
Figure 3—figure supplement 2A shows the decoding results for the direction of approach information without and with eye artefact removal. The results again replicate those of the original analysis: as before there is a main effect of Attention but no main effect of Time on Task or Target Frequency, and no interaction.
We also checked to see if our trial outcome prediction (Figure 6D) could be driven by eye artefacts by repeating our prediction procedure using the eye-movement corrected MEG data. The results (Author response image 1) show that although removal of eye artefacts seems to reduce the prediction accuracy slightly overall, it only has minimal effect on the statistics, replicating our original findings. We can still predict the outcome of the trial with >80% accuracy at closer distances.

Author response image 1. The accuracy of predicting behavioral outcome of trials without and with eye artefact removal.
The results are for the left-out participant (averaged over all participants) using the threshold obtained from all the other participants, as a function of distance/time from the deflection point. Figure 6D shows the result without eye artefact removal and Author response image 1 with eye artefact removal. Thick lines and shading refer to average and one standard deviation around the mean across participants, respectively. Bayes Factors are shown in the bottom section of each graph: Filled circles show moderate/strong evidence for either hypothesis and empty circles indicate insufficient evidence.
In the Materials and methods section, we removed the sentence “We did not perform eye-blink artefact removal because it has been shown that blink artefacts are successfully ignored by multivariate classifiers as long as they are not systematically different between decoded conditions (Grootswagers et al., 2017).”
We also added the following explanations and the figures to the manuscript in the Results section to cover this point.
“Although eye-movements should not drive the classifiers due to our design (see Materials and methods), it is still important to verify that the results replicate when standard artefact removal is applied. We can also use eye-movement data as an additional measure, examining blinks, saccades and fixations for effects of our attention and vigilance manipulations.
First, to make sure that our neural decoding results replicate after eye-related artefact removal, we repeated our analyses on the data after eye-artefact removal (see Materials and methods), which provided analogous results to the original analysis (see the decoding results without and with artefact removal in Figure 3—figure supplement 2). Specifically, for our crucial distance to object data, the main effects of Attention and Time on Task and the key interaction between Target Frequency and Time on Task remain after eye-artefact removal, replicating our initial pattern of results.
Second, we conducted a post-hoc analysis to explore whether eye movement data showed the same patterns of vigilance decrements and therefore could explain our decoding results. We extracted the proportion of eye blinks, saccades and fixations per trial as well as the duration of those fixations from the eye-tracking data for correct trials (-100 to 1400 ms aligned to the stimulus onset time), and statistically compared them across our critical conditions (Figure 3—figure supplement 3). We saw strong evidence (BF = 4.8e8) for a difference in the number of eye blinks between attention conditions: There were more eye blinks for the Unattended (distractor) than Attended (potential target) colour dots. We also observed moderate evidence (BF = 3.4) for a difference in the number of fixations, with more fixations in Unattended vs. Attended conditions. These suggest that there are systematic differences in the number of eye blinks and fixations due to our attentional manipulation, consistent with previous observations showing that the frequency of eye blinks can be affected by the level of attentional recruitment (Nakano et al., 2013). However, there was either insufficient evidence (0.3 < BF < 3) or moderate or strong evidence for no differences (0.1 < BF < 0.3 and BF < 0.1, respectively) between the number of eye blinks and saccades across our Active, Monitoring, Early and Late blocks, where we observed our ‘vigilance decrement’ effects in decoding. Therefore, this suggests that the main vigilance decrement effects in decoding, which were evident as an interaction between Target frequency (Active vs. Monitoring) and Time on the task (Early vs. Late) (Figure 3), were not driven by eye movements.”
Relatedly, on page 16 the authors claim that "If the prediction from the MEG decoding was stronger than that of the eye tracking, it would mean that there was information in the neural signal over and above any artefact associated with eye movement." This statement is problematic: Firstly, such a result might only mean that prediction from MEG decoding is stronger than decoding from eye-movements, but not relate to "artefacts" in general, to which blinks would also count. Secondly, given that the signal underlying both analyses is entirely different (and the number of features), it is not valid to directly compare the results between these analyses. More detailed analyses of fixations and fixation duration on targets and distractors might indeed be strongly related to behaviour. What is decodable at a given time might just be driven by what participants are looking at.
We take the point on the issues with this comparison, and so have removed the analysis from the manuscript, replacing it instead with more detailed analyses of the eye movement data:
We extracted the proportion of eye blinks, saccades and fixations per trial as well as the duration of those fixations from the eye-tracking data for correct trials (-100 to 1400 ms aligned to the stimulus onset time), and statistically compared them across our critical conditions (Figure 3—figure supplement 3). We saw strong evidence (BF = 4.8e8) for a difference in the number of eye blinks between attention conditions: There were more eye blinks for Unattended (distractor) than Attended (potential target) colour dots. We also observed moderate evidence (BF = 3.4) for a difference in the number of fixations, with more fixations in Unattended vs. Attended conditions. These suggest that there are systematic differences in the number of eye blinks and fixations due to our attentional manipulation, consistent with Nakano et al. (2013). However, we observed either insufficient evidence (0.3 < BF < 3) or moderate to strong evidence for no difference (0.1 < BF < 0.3 and BF < 0.1, respectively) between the number of eye blinks and saccades across our Active, Monitoring, Early and Late blocks, where we observed our ‘vigilance decrement’ effects in decoding. Consistent with the replication of the results with artefact removal presented above, this suggests that the main vigilance decrement effects in decoding, which were evident as an interaction between Target frequency (Active vs. Monitoring) and Time on the task (Early vs. Late) (Figure 3), were not driven by eye movements.
This information has also been added to the supplementary materials (Figure 3—figure supplement 3) and referred to in the manuscript (text quoted under previous bullet point).
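For readers unfamiliar with the Bayes factor ranges quoted above, the following sketch maps a Bayes factor onto a verbal evidence category. The cut-offs 0.1, 0.3 and 3 are taken from the ranges in the text; the upper cut-off of 10 for "strong" evidence, the handling of values that fall exactly on a boundary, and the function name `bf_evidence` are assumptions made for illustration.

```python
def bf_evidence(bf, insufficient=(0.3, 3.0), strong=(0.1, 10.0)):
    """Map a Bayes factor to a verbal evidence category.

    `insufficient` gives the range treated as inconclusive; `strong` gives the
    cut-offs beyond which evidence is called strong. The value 10.0 is an
    assumed conventional cut-off, not one stated in the text above.
    """
    if bf <= 0:
        raise ValueError("a Bayes factor must be positive")
    low_ins, high_ins = insufficient
    low_strong, high_strong = strong
    if bf > high_strong:
        return "strong evidence for a difference"
    if bf > high_ins:
        return "moderate evidence for a difference"
    if bf >= low_ins:
        return "insufficient evidence"
    if bf >= low_strong:
        return "moderate evidence for no difference"
    return "strong evidence for no difference"


# Values reported in the response, plus three made-up values for the null side.
for value in (4.8e8, 3.4, 1.0, 0.2, 0.05):
    print(value, "->", bf_evidence(value))
```

Under these assumed cut-offs, the two Bayes factors reported for the attention manipulation (4.8e8 and 3.4) fall into the strong and moderate categories, matching how they are described in the text.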
3. One key finding was that while classifying the direction of the dots was modulated by attention, it was insensitive to many features that were captured by a classifier trained to decode the distance from the deflection. This is surprising since both are spatial features that seem hard to separate.
Yes, we see vigilance decrement effects for the distance information but not the direction of approach. Although they both rely on similar features of the visual display, the direction information classifier is likely to be driven primarily by the large visual difference between the categories (approach from the left vs approach from the right). In the key distance measure, we collapse across left and right approaching dots, which means the classifier has to use much more subtle differences (and is therefore more likely to be sensitive to other modulations). Moreover, the two types of information also differ in their importance to the task: Only the distance information is relevant to deciding whether an item is a target.
We have added to the Discussion noting this point.
“The less relevant information about direction of approach was modulated by attention, but its representation was not detectably affected by target frequency and time on task, and was noisier, but not noticeably attenuated, on error trials. The relative stability of these representations might reflect the large visual difference between stimuli approaching from the top left vs bottom right of the screen. In contrast, the task-relevant information of distance to object was affected by attention, target frequency and time on task and was dramatically attenuated on errors. The difference might reflect the fact that only the distance information is relevant to deciding whether an item is a target, and/or the classifier having to rely on much more subtle differences to distinguish the distance categories, which collapsed over stimuli appearing on the left and right sides of the display, removing the major visual signal.”
In addition, the procedures to decode direction vs distance were very different. Do these differences still hold if the procedure used to train the two classifiers is more analogous or matched?
In terms of technical differences in the decoding procedure between distance and direction information, we cannot directly compare the two types of information on an analogous platform because they have to be defined differently. There is a different number of classes in decoding for the two types of information: only two classes for the direction information (left vs. right), compared to the 15 classes for the distance information (15 distances). Therefore, if anything, the decoding of distance should result in less information compared to the direction of approach, as the higher number of classes in decoding could potentially result in more noise in the data by decreasing the signal-to-noise ratio per class.
We have added the following paragraph to the Materials and methods section to clarify the point:
“Note that the ‘direction of approach’ and ‘distance to object’ information cannot be directly compared on an analogous platform as the two types of information are defined differently. There is also a different number of classes in decoding for the two types of information: only two classes for the direction information (left vs. right), compared to the 15 classes for the distance information (15 distances).”
4. The distance classifier was trained using only correct trials. Then in the testing stage, it was generalized to either correct or miss trials. While there is a rationale for using correct trials only, could the decoding of error prediction be an artifact of the training sample, reflecting the fact that misses were not included in the training set?
No, we do not think there is any way it could be an artefact. Our hypothesis is that correct trials contain information which is missing from miss trials. In other words, miss trials are in some way different from correct trials. Thus, it is crucial to use only correct trials in the training set. Please note that our approach is different from most conventional studies in which people directly discriminate correct and miss trials by feeding both types of trials to classifiers in the training phase and test the classifiers on the left-out correct and miss trials (i.e., without any feature extraction; as in Bode and Stahl, 2014). While this standard approach might lead to a higher classification performance, we developed our new approach for two main reasons. First, in the real world and many vigilance studies, there is usually not enough miss data to train classifiers. Second, we wanted to directly test whether the neural representations of correct trials contain some information which is (on average) less observable in miss trials. The result of conventional methods can reflect general differences between correct and miss trials (i.e., general level of attention, not time-locked to stimulus presentation), but cannot inform us about whether the difference reflects changes in information coding in the correct vs. miss trials; our approach allows this more specific inference.
In our approach, we trained our classifiers on correct trials and tested them on both correct and miss trials. Crucially, we tested the trained classifiers only on unseen data for both correct and miss trials. Specifically, when testing the classifiers, we used only the correct trials which were not used in the training phase. Therefore, there is no artefactual reason that the testing trials should be more similar to the training-phase trials for the correct compared to miss trials; the decoding prediction works because the correct testing trials have more similar neural representations to the correct training trials than the miss testing trials do.
We have added an explanation of the difference between approaches to the manuscript to ensure this point is clearer to the reader.
“Our method is different from the conventional method of error prediction, in which people directly discriminate correct and miss trials by feeding both types of trials to classifiers in the training phase and testing the classifiers on the left-out correct and miss trials (e.g., Bode and Stahl, 2014). Our method only uses correct trials for training, which makes its implementation plausible for real-world situations since we usually have plenty of correct trials and only few miss trials (i.e., cases when the railway controller diverts the trains correctly vs. misses and a collision happens). Moreover, it allows us to directly test whether the neural representations of correct trials contain information which is (on average) less observable in miss trials. We statistically compared the two types of trials and showed a large advantage in the level of information contained at individual-trial-level in correct vs. miss trials.”
5. By accumulating classifiers across time, it looks like classifier prediction improves closer to deflection. However, this could also be due to the fact that the total amount of information provided to the classifier increased. Is there a way to control for the total amount of information at different timepoints (e.g., by using a trailing window lag rather than accumulation), or contrast the classifier that derives from accumulating information with the classifier trained moment-by-moment?
Although some of the increase in information likely reflects increased attention as the dot approaches the object, we agree that the improved prediction power closer to the central object is primarily due to the accumulation of information (Figure 6D), and that it would decline if we used a subsample of the accumulated information. We took this approach because the main purpose of our prediction analysis was to predict the outcome of the trial with maximal accuracy. We added the following sentence to the manuscript to clarify the point.
“In this analysis, the goal was to maximise the accuracy of predicting behaviour. For that purpose, we accumulated classification accuracies along the distances. Moreover, as each classifier performs a binary classification for each testing dot at each distance, the accumulation of classification accuracies also prevented spurious classification accuracies from driving the decision, providing smooth “accumulated” accuracies for predicting the behaviour.”
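The contrast the reviewer raises, accumulation versus a trailing window, can be sketched on a single trial's binary outcomes across distances. This is an illustrative sketch under assumptions: the running-mean rule, the window width, and the example outcomes are hypothetical, and the trailing-window variant is the reviewer's suggested control rather than an analysis reported in the response.

```python
def accumulated_accuracy(outcomes):
    """Running mean of per-distance outcomes (1 = classified correctly, 0 = not)."""
    acc, total = [], 0
    for n, outcome in enumerate(outcomes, start=1):
        total += outcome
        acc.append(total / n)
    return acc

def trailing_window_accuracy(outcomes, width):
    """Mean over only the most recent `width` distances (fixed amount of information)."""
    acc = []
    for i in range(len(outcomes)):
        window = outcomes[max(0, i - width + 1): i + 1]
        acc.append(sum(window) / len(window))
    return acc

# Hypothetical outcomes for one trial, ordered from far to near the object.
outcomes = [1, 0, 1, 1, 0, 1, 1, 1]
acc = accumulated_accuracy(outcomes)
trail = trailing_window_accuracy(outcomes, width=3)
```

The accumulated series changes less from step to step as more distances are pooled, which is the smoothing the quoted paragraph refers to, whereas the trailing window keeps the amount of pooled information constant at the cost of noisier estimates.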
6. Predicting miss trials: The implicit assumption here is that there is "less representation" for miss trials compared to correct trials (e.g., of distance to object). But even for miss trials, the representation is significantly above chance. However, maybe the lower accuracy for the miss trials resulted from on average more trials in which the target was not represented at all rather than a weaker representation across all trials. This would call into question the interpretation of a decline in coding. In other words, on a single trial, a representation might only be present (but could result in a miss for other reasons) or not present (which would be the case for many miss trials), and the lower averages for misses would then be the result of more trials in which the information was completely absent.
It could be that the results of the subsequent analysis (predicting misses and correct responses before they occur) are in conflict with this more pessimistic interpretation. If we understand this correctly, here the classifier predicts Distance to Object for each individual trial, and Figure 6B shows that while there is a clear difference between the correct and miss trials, the latter can still be predicted above chance level but never exceed the threshold? If this is true for all single trials, this would indeed speak for a weak but "unused" representation on miss trials. But for this the authors need to show how many of the miss trials per participant had a chance-level accuracy (i.e. might be truly unrepresented), and how many were above chance but did not exceed the threshold (i.e. might have been "less represented").
This is a really good point. Yes, in principle, the average decoding levels could be composed of ‘all or none’ misses or graded drops in information, and it is possible that on some miss trials there is a good representation but the target is missed for other reasons (e.g., a response-level error). As neural data are noisy and multivariate decoding needs cross-validation across subsamples of the data, and because each trial, at each distance, can only be classified correctly or incorrectly by a two-way classifier, we tend not to compare the decoding accuracies in a trial-by-trial manner, but rather on average (Grootswagers et al., 2017). However, if we look at an individual dataset and examine all the miss trials (averaged over the 15 distances and cross-validation runs) in our distance-to-object decoding, we can get some insights into the underlying distributions.
We show the distribution of individual trial decoding accuracies for all participants on correct (Figure 5—figure supplement 1A) and miss (Figure 5—figure supplement 1B) trials. The vertical axis shows the number of trials in each accuracy bin of the histogram and the horizontal axis shows the decoding accuracy for each trial obtained by averaging its decoding accuracies over cross-validation folds (i.e., done by subsampling the correct trials into train and test sets and repeating the procedure until all correct trials are used once as training data and once as testing data) and distances. We calculated the percentage of miss trials for which there was strong evidence (BF>10) for above-chance decoding accuracies. To do this, we generated a null distribution with 100*N trials, where we produced 1000 decoding accuracies for each trial by randomizing the labels of distances for that trial. We used the same procedure for Bayes analyses as detailed in the manuscript.
The histograms of individual miss trials suggest a single distribution centred around chance decoding or slightly above (Figure 5—figure supplement 1B). This means that on an individual miss trial, there may be higher or lower decoding, but it is nowhere near the consistent high decoding levels we see for correct trials (Figure 5—figure supplement 1A). This seems consistent with an interpretation that on (most) miss trials, information is less present than on correct trials. Presumably it is this difference that allows our second level classifier to successfully predict the behavioural outcome on >80% of trials.
In contrast, for the correct trials, all trials (100%) for all subjects showed above-chance (>50%) decoding accuracy, with average accuracies around 80%. This suggests that, as opposed to miss trials, in which some trials showed some distance information and some did not, all correct trials reflect the task-related information.
To quantify the overlap between correct and miss trials at the individual-trial level (as opposed to the group-level Bayes factor analysis in the manuscript (Figure 5)), we calculated Cohen’s d (Cohen, 1969) between the two distributions. As the results show (Figure 5—figure supplement 2C), there is a large difference (d > 2) between the two distributions for every participant and condition. The d values were mostly higher than 3, which corresponds to less than 7% overlap between the decoding accuracies obtained for the correct and miss trials.
Overall, this additional analysis demonstrates that although the miss trials vary somewhat in levels of information (as measured by decoding), with some trials representing the distance information while others do not represent the distance information at all, very few miss trials are as informative as the least informative correct trials (the distributions overlap by less than ~7%). The miss trials with high decoding are presumably those on which our second level classifier makes the wrong prediction. We have revised the description in the manuscript to make this clearer and added the following paragraph and analyses to the manuscript.
“In principle, the average decoding levels could be composed of ‘all or none’ misses or graded drops in information, and it is possible that on some miss trials there is a good representation but the target is missed for other reasons (e.g., a response-level error). As neural data are noisy and multivariate decoding needs cross-validation across subsamples of the data, and because each trial, at each distance, can only be classified correctly or incorrectly by a two-way classifier, we tend not to compare the decoding accuracies in a trial-by-trial manner, but rather on average (Grootswagers et al., 2017). However, if we look at an individual dataset and examine all the miss trials (averaged over the 15 distances and cross-validation runs) in our distance-to-object decoding, we can get some insights into the underlying distributions (Figure 5—figure supplement 1). Results showed that, for all participants, the distribution of classifier accuracies for both correct and miss trials followed approximate normal distributions. However, while the distribution of decoding accuracies for correct trials was centred around 80%, the decoding accuracies for individual miss trials were centred around chance level. We evaluated the difference in the distribution of classification accuracies between the two types of trials using Cohen’s d. Cohen’s d was approximately 3 or higher for all participants and conditions, indicating a large (d > 2; Cohen, 1969) difference between the distribution of correct and miss trials. Therefore, although the miss trials vary somewhat in levels of information, very few (< 7%) miss trials are as informative as the least informative correct trials. These results are consistent with the interpretation that there was less effective representation of the crucial information about the distance from the object preceding a behavioural miss.”
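The Cohen's d comparison in this response can be sketched as follows. The response does not state which overlap measure underlies the ~7% figure; Cohen's U1 (percent non-overlap for two equal-variance normal distributions) is one reading that gives roughly 7% overlap at d = 3, and it is used here as an assumption. The per-trial accuracies are made-up illustrative values, not study data, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)

def u1_overlap(d):
    """Overlap as 1 - U1, assuming two normal distributions with equal variance."""
    phi = NormalDist().cdf(abs(d) / 2)
    u1 = (2 * phi - 1) / phi  # Cohen's U1: proportion of non-overlap
    return 1 - u1

# Illustrative per-trial decoding accuracies (assumption, not study data).
correct = [0.78, 0.82, 0.80, 0.84, 0.76]
miss = [0.50, 0.54, 0.48, 0.52, 0.56]
d = cohens_d(correct, miss)
overlap_at_3 = u1_overlap(3.0)  # roughly 0.07 under the U1 reading
```

Under this reading, any d above 3 implies an overlap below about 7%, which matches the direction of the claim in the text.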
7. The relationship between the vigilance decrement and error prediction. Is vigilance decrement driving the error prediction? That is, if errors increase later on, and the signal goes down, then maybe the classifier is worse. Alternatively, maybe the classifier predictions do not necessarily monotonically decrease throughout the experiment. Is the classifier equally successful at predicting errors early and late?
Thanks for the nice question. Our error prediction results were initially obtained from the whole dataset, including all blocks of trials. To answer the reviewer’s question, we split the blocks into the first 5 (early) and the last 5 (late) blocks and repeated the error prediction procedure on the early and late blocks separately. To remove the potential confound of the number of trials, we equalised the number of trials across the early and late time windows. As decoding of distances decreased over the course of the experiment on correct trials (Figure 3), we would predict less difference in decoding between correct and miss trials in the later vs. earlier blocks. The new analysis bears this out: prediction accuracy for the trial outcome (correct vs. miss) declined in later stages of the experiment (moderate to strong evidence (BF>3) for higher predictability of the trial outcome in early vs. late blocks of the experiment). Importantly, even with the decline in prediction accuracy, it is still possible to predict the behavioural outcome in the late blocks with well above-chance accuracy.
We have added these results to the supplementary material of the paper (Figure 6—figure supplement 1).
“The prediction of behavioural outcome (Figure 6) was performed using the data from the whole dataset. However, it is possible that the prediction would not be as accurate in later stages of the experiment (compared to the earlier stages) as the decoding performance of the distance information declined in general in later stages (Figure 3B). To test this, we performed the behavioural prediction procedure on datasets obtained from the first 5 (early) and the last 5 (late) stages of the experiment (Figure 6—figure supplement 1). There was strong evidence for a decline in the prediction power in the late vs. early blocks of trials. However, even with the decline in prediction accuracy, it is still possible to predict the behavioural outcome in the late blocks with well above-chance accuracy (up to 75%).”
8. When decoding distance, active decoding declines from early to late, even though performance does not decline (or even slightly improves from early to late). This discrepancy seems hard to explain. Is this decline in classification driven by differences in the total signal from early to late?
Thanks for the question. We explicitly define the vigilance effects as the difference between Active and Monitoring conditions to ensure that we are not interpreting general task effects like this one as vigilance decrements. This is important because otherwise effects that are not specific to maintaining vigilance (i.e., sustaining attention in the situation where only infrequent responses are necessary) could be misinterpreted. In this case, the decline in Active decoding could be driven by a number of general factors that are not specific to vigilance, such as fatigue, as well as equipment effects such as baseline fluctuations in the MEG recording system (e.g., due to warming up). Our crucial comparisons for both behaviour and neural correlates are the greater increase in ‘miss rate’ and ‘reaction time’ for Monitoring vs. Active from early to late blocks, and the greater decline in distance decoding (from early to late blocks) for Monitoring than for Active (Figure 3B; interaction between Target Frequency and Time on the task). We have now added the following sentence to the Discussion and amended the manuscript to ensure this is clear.
“Note that our vigilance decrement effects are defined as the difference between Active and Monitoring conditions, which allows us to be sure that we are not interpreting general task (e.g., participant fatigue) or hardware-related effects as vigilance decrements. For example, the drop in decoding over time for both Active and Monitoring that is seen in Figure 3 might reflect some of the general changes in the characteristics of the recording hardware over the course of the experiment (e.g., the MEG system warming up), but our design allows us to dissociate these from the key vigilance effects we are interested in.”
9. Classifier performance was extremely high almost immediately after trial onset. Does the classifier perform at chance before the trial onset, or does this reflect sustained but not stimulus-specific information?
Thanks for pointing out that we were missing this information – yes, the classifier performs at chance in the pre-stimulus onset time. We have now added this to the modified figures in the revised manuscript.
10. The connectivity analysis appears to be just a correlation of decoding results between two regions of interest. This means, if one "region" allows for decoding the distance to the object, the other one does too. However, this alone does not equal connectivity. It could simply mean that patterns across the entire brain allow for decoding the same information. For example, it would not be surprising to find that both ROIs correlate more strongly for correct trials (i.e. the brain has obviously represented the relevant information) than for errors (i.e. the brain has failed to represent the information), without this necessarily being related to connectivity at all. The more parsimonious interpretation here is that information might have been represented across all channels at this time. The authors show no evidence that only these two (arbitrarily selected) "regions" encode the information while others do not. To show evidence for meaningful connectivity, (a) the spread of information should be limited to small sub-regions, and (b) the decoding results in one "region" should predict the results in another region in time (as for DCM).
Thanks for the important point. Actually, our connectivity analysis is not simply a correlation of magnitudes of decoding accuracy between two regions of interest, but rather a correlation of the patterns of decoding accuracies across conditions (i.e., across distances). Our approach follows the concept of informational connectivity (explained in more detail below) which measures how much similarity in information coding there is between two brain areas across conditions, which is interpreted as reflecting their potential connectivity. Therefore, rather than the average magnitude of decoding accuracy (high vs. low), the connectivity is driven by the correlation between the patterns of decoding accuracies either across time (Coutanche and Thompson-Schill, 2013) or across conditions (Kietzmann et al., 2018). We used the latter (i.e., RDMs) here to study connectivity. This is a critical difference because high classification values in two regions will not necessarily correspond to high connectivity in our analysis.
Accordingly, the difference in classification levels between ‘correct’ and ‘miss’ trials should not determine the connectivity; what matters is the consistency of the pattern (see the example below). Our connectivity relies on (Spearman’s) correlation, which normalizes absolute amplitude, and as such it is unaffected by the absolute decoding values in the pairs of input vectors: connectivity will be high only if the two areas encode the information similarly across conditions, rather than if they code the information very efficiently across all conditions (i.e., maximum decoding values). For example, assume that we have four brain areas A, B, C and D with (simplified and vectorized) distance RDMs (as in our work) with decoding values of [95 91 97 92], [96 98 99 94], [57 51 55 54] and [58 52 59 55], respectively. The inter-area correlation/connectivity matrix would be as in Author response table 1. As you can see, a pair of brain areas with higher absolute decoding values (A and B) but less similar patterns in their RDMs can lead to a small correlation/connectivity (0.4), while a pair of brain areas with low decoding values but more similar patterns of decoding in their RDMs (C and D) yields a much higher correlation/connectivity (0.8). Therefore, rather than the absolute decoding values (i.e., whether the pair of areas encode the information or not), the patterns in their RDMs determine whether and how similarly they code information, and hence whether they are potentially connected.
Author response table 1. Connectivity (correlation) matrix obtained from four sample areas.
AREA | A | B | C | D |
---|---|---|---|---|
A | 1 | 0.4 | 0.8 | 1 |
B | 0.4 | 1 | 0 | 0.4 |
C | 0.8 | 0 | 1 | 0.8 |
D | 1 | 0.4 | 0.8 | 1 |
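The values in this table can be reproduced with a few lines of code. The sketch below computes Spearman's rank correlation between each pair of the four example vectors given in the response; the helper names `ranks` and `spearman` are illustrative, and the rank function assumes no tied values, which holds for these vectors.

```python
def ranks(values):
    """Ranks 1..n in ascending order; assumes no ties (true for the example vectors)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman(x, y):
    """Spearman's rho via the no-ties formula 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Simplified, vectorized distance RDMs from the worked example.
areas = {
    "A": [95, 91, 97, 92],
    "B": [96, 98, 99, 94],
    "C": [57, 51, 55, 54],
    "D": [58, 52, 59, 55],
}
matrix = {a: {b: spearman(areas[a], areas[b]) for b in areas} for a in areas}
```

Areas A and D have identical rank orders despite very different absolute decoding values, so their correlation is 1, whereas the two high-decoding areas A and B correlate at only 0.4.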
Although mathematically our connectivity should be unaffected by absolute decoding values, we acknowledge that potentially noisier patterns of distance information in the brain on miss vs. correct trials could result in apparently lower connectivity for misses. We therefore added the following paragraph to the manuscript acknowledging this possibility:
“While our connectivity is unaffected by the absolute levels of information encoding in the brain on miss vs. correct trials, potentially noisier patterns of information encoding in miss (vs. correct) trials could result in the lower level of connectivity observed on miss (vs. correct) trials. Therefore, the lower level of connectivity for miss vs. correct trials observed here could result from the pair of regions representing two distinct sets of information (i.e., becoming in some sense less connected) or representing similar information but distorted by a higher level of noise.”
The more parsimonious interpretation here is that information might have been represented across all channels at this time. The authors show no evidence that only these two (arbitrarily selected) "regions" encode the information while others do not. To show evidence for meaningful connectivity, (a) the spread of information should be limited to small sub-regions, and (b) the decoding results in one "region" should predict the results in another region in time (as for DCM).
a. Yes, it is possible that the whole brain may process the information with the same pattern of decoding but changing the ROIs to smaller ones would not rule out this potential scenario (which applies to all connectivity analyses, even the conventional ones). We avoid making claims about the spatial specificity of our connectivity effect, as we are using MEG (as reflected in the names we chose for the regions: peri-occipital and peri-frontal). Please note though that these sub-regions were not arbitrary, but rather based on areas known to be involved in vision and attention, and based on previous attention work which showed a flow of information across the two areas (Goddard et al., 2016; Goddard et al., 2019).
It is very important for the interpretation of our result that, rather than making any claims about the absolute existence or magnitude of potential connectivity in the brain, we compared our connectivity indices across conditions. In other words, we do not seek to test whether connectivity exists between our ROIs, but rather whether any such connectivity varies with our manipulations of vigilance. Therefore, even if the entire brain were responding similarly, the modulation of the connectivity metric is only explainable by the manipulations across our conditions.
b. We could not check the time course of our connectivity as in our previous work (Goddard et al., 2016; Karimi-Rouzbahani et al., 2020), because our distance information involves the whole trial and the direction information does not have enough conditions to make RDMs (please see the informational connectivity text below). Therefore, we clarified in the manuscript that:
“Informational connectivity, on the other hand, is measured either through calculating the correlation between temporally resolved patterns of decoding accuracies across a pair of areas (Coutanche and Thompson-Schill, 2013) or the correlation between representational dissimilarity matrices (RDMs) obtained from a pair of areas (Kietzmann et al., 2018; Goddard et al., 2016; Goddard et al., 2019; Karimi-Rouzbahani et al., 2019; Karimi-Rouzbahani et al., 2020). Either one measures how much similarity in information coding there is between two brain areas across conditions, which is interpreted as reflecting their potential informational connectivity, and is less affected by absolute activity values compared to conventional univariate connectivity measures (Anzellotti & Coutanche, 2018).”
And added the following considerations to the methods:
“First, it reflects the similarity of the way a pair of brain areas encode “distance” information during the whole trial. This means that we could not use the component of time in the evaluation of our connectivity as we have implemented elsewhere (Karimi-Rouzbahani et al., 2019; Karimi-Rouzbahani et al., 2020). Second, rather than a simple correlation of magnitudes of decoding accuracy between two regions of interest, our connectivity measure reflects a correlation of the patterns of decoding accuracies across conditions (i.e., distances here). Finally, our connectivity analysis evaluates sensory information encoding, rather than other aspects of cognitive or motor information encoding, which might have also been affected by our experimental manipulations.”
11. The display of the results is very dense, and it is not always clear whether decoding for a specific variable was above chance or not. The authors often focused on relative differences, making it difficult to fully understand the meaning of the full pattern of results. The Bayes-factor plots in the decoding results figures are so cramped that it is very difficult to actually see the individual dots and to unpack all of this (e.g., Figure 3). Could this complexity be somehow reduced, maybe by dividing the panels into separate figures? The two top panels in Figure 3B should also include the chance level as in A. It looks like the accuracy is very low for unattended trials, which is only true in comparison to attended trials, but (as also shown in Supplementary Figure 1) it was clearly also encoded in unattended trials, which is very important for interpreting the results.
We have extensively revised our figures, and expanded the Bayes plots; we hope they are now clear. We have split the panels in figures into Active and Monitoring panels, added the chance level line, and the pre-stimulus decoding values. We also reduced the density of Bayes Factor dots by down-sampling, and improved their appearance using a log scale and colour coding.
Regarding the relative differences, our design focuses on these because this allows us to be more specific about the effects that reflect actual vigilance decrements. This differs from many vigilance studies, and provides the opportunity for more specific inference. We have ensured this is clearer in the revision.
We hope the revised text and figures enhance the interpretability of the relative differences.
12. While this is methodologically interesting work, there is no convincing case made for what exactly the contribution of this study is for theories of vigilance. It seems that the findings can be reduced to the observation that a lack of decodability of relevant target features from brain activity predicts that participants will miss the target. This alone, however, does not seem to be very novel. Even if the issues above are addressed, the study only demonstrates that with less attention to the target, there is less evidence of representations of the relevant features of targets in the brain. The authors also find the expected decrements for rare targets and when participants do not actively monitor the targets. How do these findings contribute to "theories of vigilance", as claimed by the authors?
This work makes three clear contributions to vigilance research. First, we present a novel multiple-object-monitoring paradigm that clearly evokes specific vigilance decrements in a context that mimics real-world monitoring scenarios. Our design controls for general experiment-level effects that are not specific to vigilance conditions, which as mentioned above, is surprisingly rare in the vigilance literature (which we now make clearer in the revision). This is an important contribution to the field as it provides a tool for further studies and allows us to address our hypotheses in a new and more realistic context.
Second, we showed that behavioural vigilance decrements are reflected in the neural representation of information. Previous studies have only provided coarse-grained correlates for vigilance decrements such as α-band increase in power spectrum (Kamzanova et al., 2014; Mazaheri et al., 2009; O’Connell et al., 2009). Here, we show that the neural representation of task-related information (i.e., distance) is affected by target frequency. While we agree that this is clearly a plausible prediction, it is a major step forward for a field that has had limited success in exploring specific neural correlates.
Third, we showed that change in neural representation of information between miss trials and correct trials can be used to predict the behavioural outcome on a given trial. This involves new methods that will be widely applicable, contributes to the global endeavour to link brain and behaviour, and provides a foundation for further research into potential applications for industries where detecting lapses of attention (as measured by a drop in specific task-relevant information) could prevent tragic accidents, such as rail and air traffic control.
Although we mentioned the major theories of vigilance in the paper, the theories themselves are underspecified, making it difficult to directly test them. We therefore deliberately avoided making strong claims about whether our results falsified the theories: they simply do not contain enough specificity to do this. Nonetheless, to avoid the implication that we provide a direct test of these theories, we removed the relevant paragraph in the discussion and carefully revised the paper to be explicit that the goal is not to adjudicate between the descriptive cognitive theories but rather (a) to provide a specific tool for studying vigilance in situations that mimic real-world challenges; (b) to understand what changes in the information encoded in the brain when vigilant attention lapses; and (c) to develop a method that can use neural data to predict behavioural outcomes.
https://doi.org/10.7554/eLife.60563.sa2
Article and author information
Funding
Australian Research Council (DP170101780)
- Anina N Rich
Australian Research Council (FT170100105)
- Alexandra Woolgar
The Royal Society (NIF\R1\192608)
- Hamid Karimi-Rouzbahani
Medical Research Council (SUAG/052/G101400)
- Alexandra Woolgar
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
This work was funded by an Australian Research Council (ARC) Discovery Project grant to ANR and AW (DP170101780). AW was supported by an ARC Future Fellowship (FT170100105) and MRC intramural funding (SUAG/052/G101400). HK-R was supported by a Newton International Fellowship from the Royal Society (NIF\R1\192608). We thank Denise Moerel, Mark Wiggins, Jeremy Wolfe, and William Helton for contributions to an earlier design of the MOM task.
Ethics
Human subjects: The Human Research Ethics Committee of Macquarie University approved the experimental protocols and the participants gave informed consent before participating in the experiment. The approval identifier is 52020297914411.
Senior Editor
- Floris P de Lange, Radboud University, Netherlands
Reviewing Editor
- Peter Kok, University College London, United Kingdom
Version history
- Received: July 3, 2020
- Accepted: April 2, 2021
- Accepted Manuscript published: April 8, 2021 (version 1)
- Version of Record published: April 21, 2021 (version 2)
- Version of Record updated: August 8, 2023 (version 3)
Copyright
© 2021, Karimi-Rouzbahani et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.