Voluntary and involuntary contributions to perceptually guided saccadic choices resolved with millisecond precision

Research Article
Cite this article as: eLife 2019;8:e46359 doi: 10.7554/eLife.46359

Abstract

In the antisaccade task, which is considered a sensitive assay of cognitive function, a salient visual cue appears and the participant must look away from it. This requires sensory, motor-planning, and cognitive neural mechanisms, but what are their unique contributions to performance, and when exactly are they engaged? Here, by manipulating task urgency, we generate a psychophysical curve that tracks the evolution of the saccadic choice process with millisecond precision, and resolve the distinct contributions of reflexive (exogenous) and voluntary (endogenous) perceptual mechanisms to antisaccade performance over time. Both progress extremely rapidly, the former driving the eyes toward the cue early on (∼100 ms after cue onset) and the latter directing them away from the cue ∼40 ms later. The behavioral and modeling results provide a detailed, dynamical characterization of attentional and oculomotor capture that is not only qualitatively consistent across participants, but also indicative of their individual perceptual capacities.

https://doi.org/10.7554/eLife.46359.001

eLife digest

How do you decide what to do next? Your behavior at any given moment is usually the result of a competition between internal and external factors. Internal factors include your existing plans, goals and knowledge. External factors include events happening in the world around you. When out driving, for example, you check zebra crossings because you know that pedestrians could be present. But you look at stoplights because your eyes are drawn automatically to their changing colors.

Scientists can study this competition between internal and external factors using a simple laboratory task. A single spot of light appears in the dark, and your job is to look away from it. The instruction is simple and yet carrying it out requires willful effort. This is because your automatic response is to look at any stimulus that suddenly appears. Overcoming this automatic response requires similar thought processes to those that help someone resist eating that second piece of chocolate.

However, the competition between automatic and voluntary visual processes is over in a fraction of a second, which makes it difficult to analyze. Salinas et al. therefore modified the “look-away” task by asking participants to respond under time pressure. This tweak makes it possible to track – with millisecond precision – voluntary and automatic influences on performance. The results revealed that the eyes are automatically drawn to the cue about 100 milliseconds after it appears. The separate voluntary process that directs the eyes away from the cue arises about 40 milliseconds later.

Salinas et al. observed these voluntary and involuntary components in every healthy volunteer tested. But there were also differences between individuals in how effectively they could look away from the cue. This is important because the automatic draw of salient stimuli determines what you pay attention to, as well as what you look at. Future studies could use the modified version of the look-away task to examine whether this automatic pull of attention, and the ability to resist it, differs in individuals with disorders like ADHD.

https://doi.org/10.7554/eLife.46359.002

Introduction

Neuroscience aims to explain macroscopic behavior based on the microscopic operation of distinct neural circuits, and this requires carefully designed tasks that expose the relationship between the two. In the case of the antisaccade task (Coe and Munoz, 2017; Munoz and Everling, 2004), in which participants are instructed to withhold responding to a visual cue in favor of programming a saccade to a diametrically opposed location, performance relies heavily on frontal cortical mechanisms associated with cognitive control (Guitton et al., 1985; Everling and Fischer, 1998; Munoz and Everling, 2004; Condy et al., 2007; Luna et al., 2008; Hakvoort Schwerdtfeger et al., 2012), and the paradigm is considered to be a sensitive assay of impulsivity and executive function in general. Indeed, the mean reaction time (RT) and overall error rate in the antisaccade task are frequently used as biomarkers for cognitive development (Klein and Foerster, 2001; Luna et al., 2008; Coe and Munoz, 2017) and, in clinical settings, for mental dysfunction (Everling and Fischer, 1998; Munoz et al., 2003; Hutton and Ettinger, 2006; Antoniades et al., 2015; Wiecki et al., 2016).

The antisaccade task pits against each other two fundamental processes, one involuntary and the other voluntary. On one hand, the sudden appearance of a salient visual stimulus automatically attracts spatial attention (Theeuwes, 1991; Theeuwes et al., 1998; Ruz and Lupiáñez, 2002; Busse et al., 2008; Theeuwes, 2010; Carrasco, 2011; Aagten-Murphy and Bays, 2017) to produce either a covert shift (‘attentional capture’) or an overt saccade (‘oculomotor capture’). In either case, the effect is described as bottom-up or exogenous, and is thought to be fast and transient. On the other hand, programming a saccade away from a cue is a top-down or endogenous process that likely summons several mechanisms, such as working memory (Roberts et al., 1994; Lavie and De Fockert, 2005) and endogenous attention (Godijn and Theeuwes, 2002; Theeuwes, 2010; Carrasco, 2011), which are thought to be slower and to require a sustained cognitive effort. Thus, the rationale for the task is sound — the timing and intensity of the conflict between bottom-up and top-down mechanisms should correlate with behavior, and with the dynamics of the underlying attentional and oculomotor neural circuits.

There is a problem, however: such conflict must unfold very quickly. First, exogenous attention is thought to be mediated by visually-driven responses in oculomotor areas such as the frontal eye field (FEF) and superior colliculus (SC), which have latencies of at least 50 ms (Gottlieb and Goldberg, 1999; Bisley et al., 2004; Thompson et al., 2005; Ipata et al., 2006; Joiner et al., 2017; White et al., 2017; Chen et al., 2018). And second, spatial attention can be endogenously shifted roughly 150 ms after a relevant cue is provided (Kim and Cave, 1999; Ogawa and Komatsu, 2004; Busse et al., 2008; Theeuwes, 2010; Markowitz et al., 2011). This suggests that the competition between exogenous and endogenous responses evolves in less than 100 ms. The usual behavioral metrics of mean RT and overall accuracy are thus unlikely to yield a clean characterization of this competition, because they can be traded against each other and reflect the end results of numerous operations (perceptual, motor, cognitive) that contribute to a much longer choice process (indeed, below we show that such metrics are severely confounded). How can this problem be overcome?

The solution is to make the task urgent. The compelled antisaccade task requires subjects, humans in this case, to begin programming a saccade before knowing the direction of the correct response, and to use later arriving information about cue location to appropriately modify the ongoing motor plans. Urgency allows us to generate a special psychometric function, the ‘tachometric curve,’ which tracks success rate as a function of the perceptually relevant time interval, the raw processing time (rPT, measured between cue presentation and saccade onset). We find that, for the compelled antisaccade task, the tachometric curve takes on a unique shape: within a narrow rPT range, the curve yields a pronounced dip to below-chance performance in which the exogenous capture by the cue is so strong that the success rate approaches 0%; thereafter, however, endogenous control takes over, and the fraction of correct saccades to the ‘anti’ location increases rapidly. The experimental data were comprehensively replicated by a neurophysiologically based model of the saccadic choice process, with the combined results providing a remarkably detailed account of how reflexive and voluntary mechanisms compete over time to determine task performance.

Results

Urgent antisaccade behavior is characterized by strong but temporally constrained oculomotor capture

Akin to an athlete anticipating the trajectory of a ball that must be caught or struck, the participant in the compelled antisaccade task must begin programming a movement in advance of the relevant sensory information, and must quickly interpret the later arriving visual cue to modify the developing motor plan(s) on the fly. In the sequence of task events (Figure 1a), the key step is the early offset of the fixation spot, which means ‘respond now!’ This go signal is given first, before the cue, which is revealed after an unpredictable gap period. The cue appears randomly to the left or right of fixation, and the participant is instructed to make an eye movement away from it, to the diametrically opposite location — but for this response to be correct, the saccade must be initiated within 450 ms of the go signal. Thus, due to urgency, that is, time pressure, the participant must begin planning (and may even execute) a motor response before knowing what the correct choice is. This design, in which motor and perceptual processes are meant to run concurrently, stands in contrast to the easy, non-urgent version of the task (Figure 1b), in which delivery of the cue before the go signal allows more time for the perceptual process to be completed before saccade onset.

Figure 1
Urgent and non-urgent variants of the antisaccade task.

(a) The compelled antisaccade task. After a fixation period (150, 250, or 350 ms), the central fixation point disappears (Go), instructing the participant to make an eye movement to the left or to the right (±10°) within 450 ms. The cue is revealed (Cue) after a time gap that varies unpredictably across trials (Gap, 0–350 ms). The correct response is an eye movement (Saccade, white arrow) away from the cue, to the diametrically opposite, or anti, location. (b) The delayed antisaccade task. In this case, the cue is shown before the go signal, during fixation. The interval between cue onset and fixation offset varies across trials (Delay, 100 or 200 ms). In all trials, the reaction time (RT) is measured between the onset of the go signal and the onset of the saccade, whereas the raw processing time (rPT) is measured between cue onset and saccade onset.

https://doi.org/10.7554/eLife.46359.003

As in other perceptually based urgent tasks (Becker and Jürgens, 1979; Stanford et al., 2010; Salinas and Stanford, 2013; Scerra et al., 2019), the cue viewing time or rPT (computed as RT - gap, or RT + delay; Figure 1) is the crucial variable here, because it specifies how much time is available for detecting and analyzing the cue in each trial. With little or no time to see the cue, the success rate cannot rise above chance, but as the viewing time increases, performance is expected to improve. Using multiple gap values (0–350 ms) ensures full coverage of the relevant rPT range. When the probability of making a correct choice is plotted as a function of rPT — a behavioral metric that we refer to as the tachometric curve — the result is a millisecond-by-millisecond readout of the evolving perceptual decision (Becker and Jürgens, 1979; Stanford et al., 2010; Shankar et al., 2011; Salinas and Stanford, 2013; Seideman et al., 2018).
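
The two definitions above (rPT as RT minus gap, or RT plus delay, and the tachometric curve as fraction correct versus rPT) can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names, the toy trial list, and the bin centers are assumptions made for illustration. The 15 ms bin width follows the Figure 2 caption, and the first three toy trials reuse the RT and gap values quoted for the Figure 6 examples.

```python
def raw_processing_time(rt_ms, gap_ms=None, delay_ms=None):
    """rPT for one trial: RT - gap (compelled task) or RT + delay (delayed task)."""
    if (gap_ms is None) == (delay_ms is None):
        raise ValueError("give exactly one of gap_ms or delay_ms")
    return rt_ms - gap_ms if gap_ms is not None else rt_ms + delay_ms


def tachometric_curve(rpts_ms, correct, centers_ms, bin_width_ms=15.0):
    """Fraction correct in (possibly overlapping) rPT bins.

    Returns a list of (center, fraction_correct, n_trials); the fraction is
    None for bins that contain no trials.
    """
    half = bin_width_ms / 2.0
    curve = []
    for c in centers_ms:
        hits = [ok for t, ok in zip(rpts_ms, correct) if c - half <= t < c + half]
        frac = sum(hits) / len(hits) if hits else None
        curve.append((c, frac, len(hits)))
    return curve


# Toy trials as (RT in ms, gap in ms, correct?). The last two are hypothetical.
trials = [(283, 150, False), (369, 150, True), (206, 150, True),
          (290, 160, False), (400, 100, True)]
rpts = [raw_processing_time(rt, gap_ms=gap) for rt, gap, _ in trials]
outcomes = [ok for _, _, ok in trials]
curve = tachometric_curve(rpts, outcomes, centers_ms=[56, 130, 220, 300])
```

With real data the bin centers would be spaced more finely than the bin width, so that neighboring bins overlap.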

For the compelled antisaccade task, the tachometric curve exhibits a unique, non-monotonic shape that reflects the interaction between early involuntary and later voluntary processes (Figure 2). For rPTs shorter than 90 ms, participants perform at chance, as expected. Shortly thereafter, the initial influence of the cue manifests as a pronounced drop in performance, as participants erroneously direct a large proportion of their saccades toward the cue. This dip, which we refer to as the ‘vortex,’ is short-lived (visible for rPTs of 100–140 ms approximately; Figure 2, gray shades), but it is so abrupt and occurs so reliably over a consistent range of rPTs, that it reaches nearly 0% correct even in the data pooled from all six participants (Figure 2, main panel). In trials in which the rPT falls inside this narrow interval, it is almost impossible to avoid looking at the cue.

Figure 2 (with 1 supplement)
Perceptual performance in the compelled antisaccade task demonstrates a vortex.

Each panel shows a tachometric curve, that is, a plot of the probability of making a correct response as a function of rPT, or cue-viewing time. Colored points are experimental results in overlapping time bins (bin width = 15 ms); light shades indicate ± 1 SE from binomial statistics; black lines are continuous, analytical functions fitted to the data. The vortex is the part of the curve for which saccades are highly likely to be captured and performance drops below chance. It is demarcated by gray shades for individual participants (small panels, P1 – P6) and for the aggregate data set (large panel, All participants). Results are from trials (between 1366 and 1534 per participant) in which the high-luminance cue (icon) was shown.

https://doi.org/10.7554/eLife.46359.004

As rPT increases beyond 140 ms, the success rate rises and gradually approaches an asymptote, as participants direct a progressively larger proportion of their saccades to the correct, anti location. This rise in performance is remarkable in that it is extremely fast: for the pooled data (Figure 2, main panel) the tachometric curve goes from 0.25 to 0.75 in 18 ms, and from 0.10 to 0.90 in only 37 ms. For some of the participants, the process is faster (Figure 2, P1, P2, P4). The asymptotic fraction of correct responses is close to 1 (the lowest across participants was 0.978), which indicates that the participants understood the instructions and could perform the task almost perfectly — given enough time. Consistent with this, in easy, non-urgent antisaccade trials (Figure 1b) the fraction correct was also close to 1 (median = 0.992). However, because the processing times it generates are so long, the easy version of the task only provides a glimpse of the capture phenomenon, if anything (Figure 2—figure supplement 1).

Antisaccade performance varies with cue luminance

The characteristic shape of the tachometric curve likely results from the interplay between a reflexive and a voluntary mechanism, both of which depend on the cue. If the vortex indeed reflects the strength of low-level sensory representations that are driven by the cue’s salience, then, consistent with previous demonstrations of attentional/oculomotor capture (Theeuwes, 1991; Theeuwes, 1992; Theeuwes, 1994), it should become weaker when the salience of the cue is reduced. To investigate this, our participants performed the compelled antisaccade task with cues of three luminance levels, high (data shown in Figure 2), medium, and low (Materials and methods). The three cues were the same for all participants and were randomly interleaved during the experiment. Because the faintest cue was chosen to be slightly above the detection threshold, we expected it to yield a much shallower vortex.

The expectation for the later rise in the tachometric curve was less clear. If the rise is a direct reflection of the cognitive process that remaps the spatial location of the cue and programs an antisaccade, then its steepness must depend, at least in part, on the speed and the variability of this process. So, if weaker sensory signals are generally processed more slowly or with higher variance, then the tachometric curve should rise more gradually as luminance decreases.

The experimental results showed that both the timing and depth of the vortex depend strongly on cue luminance. For the data pooled from all participants (Figure 3a), as luminance decreases from high (bright green points) to low (dark green points), the vortex shifts to the right by about 50 ms (the minimum point shifts from rPT = 111 ± 1.3 ms [SE from bootstrap] to rPT = 162 ± 6.0 ms), suggesting that the time needed to detect the cue increases accordingly. The vortex also becomes much less deep (the minimum fraction correct goes from 0.03 ± 0.006 to 0.32 ± 0.026). These findings are consistent with the expected weakening of (involuntary) attentional capture.

Figure 3
Perceptual performance varies as a function of cue luminance.

(a) Tachometric curves for trials in which the cue had high, medium, and low luminance (indicated by bright, grayish, and dark green points, respectively). Results are for the pooled data from all participants. The vortex shifts to the right and becomes less deep as luminance decreases. (b) Tachometric curves from three individual participants at each cue luminance level, high, medium and low, as indicated by the icons. Gray shades demarcate the vortex of each curve. In each panel, colored points are experimental results and black lines are continuous fits to the data.

https://doi.org/10.7554/eLife.46359.006

We also found that, as luminance decreases, the rise of the tachometric curve becomes significantly less steep (p < 0.0001 for all differences in maximum slope between luminance conditions, from bootstrap; see Figure 4—figure supplement 1), consistent with the notion that the voluntary remapping of the cue location proceeds more slowly or less reliably as the cue becomes more dim. Thus, in addition to strongly determining the initial, bottom-up response to the cue, luminance probably also impacts the top-down process at work in the task.

Qualitatively similar dependencies on cue luminance were observed in each participant’s data set (Figure 3b), but reliable differences across individuals became evident when the effects were evaluated quantitatively. For any given tachometric curve, quantification was achieved by fitting the empirical data (Figure 3, colored data points) with a continuous analytical function (Figure 3, black traces; Equation 2) and measuring several features from the fitted curve (Materials and methods). We present results for three such features that were particularly reliable given the size of our samples (for additional features, see Figure 4—figure supplement 1). The first one is the average value of the tachometric curve for rPTs between 0 and 250 ms, which we refer to as the mean perceptual accuracy (Figure 4a). The second feature is the rPT at which the tachometric curve reaches its minimum, which we designate as the vortex time (Figure 4b). And the third feature is the rPT at which the rising part of the tachometric curve is halfway between its minimum and maximum values, which we call the endogenous response centerpoint, or just the centerpoint of the curve, for brevity (Figures 2 and 3b, right border of gray shades; Figure 4c). These quantities are partially related; the centerpoint, which measures how soon the participant can escape the vortex, is independent of the vortex time (partial Spearman correlation ρ = 0.34, p = 0.2; Materials and methods), but is strongly anti-correlated with perceptual accuracy (ρ = −0.85, p = 10⁻⁵). Notably, the separation between the ‘best’ and the ‘worst’ participant within a given luminance condition is statistically large, particularly for the mean perceptual accuracy and the centerpoint of the curve (Figure 4a,c; note little overlap between 95% confidence intervals for bars of same color).

The observed effects of cue luminance are highly consistent across participants (Figure 3b), but the quantitative details reveal idiosyncratic variations that distinguish one individual from another (Figure 4a,c; see below). This is significant because cognitive tasks generally produce robust differences either between experimental conditions/treatments or between individuals, but not both (Borsboom et al., 2009; Hedge et al., 2018).
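
As a rough illustration of how the three features defined above could be read off a curve, the sketch below works on a curve sampled at evenly spaced rPTs. It is an assumption-laden stand-in, not the authors' procedure: the paper measures these quantities on a fitted analytical function (Equation 2, not reproduced here), whereas `curve_features`, its arguments, and the linear interpolation used for the centerpoint are hypothetical choices.

```python
def curve_features(rpt_ms, p_correct, lo_ms=0.0, hi_ms=250.0):
    """Mean accuracy, vortex time and centerpoint of a sampled tachometric curve.

    rpt_ms must be increasing and evenly spaced; p_correct holds the curve values.
    """
    # Mean perceptual accuracy: average curve value for rPTs in [lo_ms, hi_ms].
    window = [p for t, p in zip(rpt_ms, p_correct) if lo_ms <= t <= hi_ms]
    mean_accuracy = sum(window) / len(window)

    # Vortex time: rPT at which the curve is lowest.
    i_min = min(range(len(p_correct)), key=p_correct.__getitem__)
    vortex_time = rpt_ms[i_min]

    # Centerpoint: first rPT after the minimum at which the rising curve is
    # halfway between its minimum and its later maximum (linear interpolation).
    p_min, p_max = p_correct[i_min], max(p_correct[i_min:])
    half = (p_min + p_max) / 2.0
    centerpoint = None
    for i in range(i_min, len(p_correct) - 1):
        p0, p1 = p_correct[i], p_correct[i + 1]
        if p0 <= half <= p1 and p1 > p0:
            frac = (half - p0) / (p1 - p0)
            centerpoint = rpt_ms[i] + frac * (rpt_ms[i + 1] - rpt_ms[i])
            break
    return mean_accuracy, vortex_time, centerpoint


# Coarse toy curve (hypothetical values): chance, dip, then a fast rise.
toy_rpt = [0, 50, 100, 150, 200, 250]
toy_p = [0.5, 0.5, 0.0, 0.25, 1.0, 1.0]
mean_acc, vortex_t, center = curve_features(toy_rpt, toy_p)
```

A later centerpoint means the participant needs more cue-viewing time to escape the vortex, which is why it tends to accompany a lower mean perceptual accuracy.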

Figure 4 (with 2 supplements)
Perceptual performance quantified across participants and luminance conditions.

Each panel shows one particular quantity derived from the fitted tachometric curves, with results sorted by participant (x axes) and luminance level, high (bright green), medium (grayish green), and low (dark green). Error bars indicate 95% confidence intervals obtained by bootstrapping. (a) Mean perceptual accuracy, calculated as the average value of the fitted tachometric curve for rPTs between 0 and 250 ms. (b) Vortex time, calculated as the rPT at which the minimum of the fitted tachometric curve is found. (c) Endogenous response centerpoint, equal to the rPT at which the rise of the fitted tachometric curve is halfway between its minimum and maximum values.

https://doi.org/10.7554/eLife.46359.007

Individual differences in perceptual and overall performance

Antisaccade performance is often quantified using the mean accuracy. This assumes that the overall success rate in the task directly reflects the degree to which voluntary control can override the involuntary urge to look at a salient stimulus. But is this assumption correct? Answering this critical question is generally difficult because doing so requires access to an independent assessment of perceptual performance, but that is precisely what the tachometric curve affords. To investigate the relationship between traditional antisaccade performance measures and perceptual capacity, we examined their natural variations across individual participants.

First, we computed the correlation between two variables (Materials and methods), the average perceptual accuracy (mean value of the tachometric curve), and the average observed accuracy (mean fraction of correct choices). We found that, even though both quantities tend to increase with higher luminance, suggesting a positive correlation, they are, in fact, uncorrelated (Figure 5a). The rank of a given participant based on one measure is not predictive of his or her rank based on the other.
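
One common way to remove the shared dependence on luminance from a rank correlation is a first-order partial Spearman correlation, sketched below. The exact procedure used in the paper is described in its Materials and methods, which are not part of this section, so the formula, the function names, and the toy numbers here are assumptions rather than the authors' method.

```python
def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def partial_spearman(x, y, z):
    """Spearman correlation of x and y with the rank of z partialled out."""
    rx, ry, rz = ranks(x), ranks(y), ranks(z)
    r_xy, r_xz, r_yz = pearson(rx, ry), pearson(rx, rz), pearson(ry, rz)
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5


# Hypothetical values: two participants at three luminance levels each.
luminance = [1, 2, 3, 1, 2, 3]
perceptual = [1, 2, 3, 4, 5, 6]
observed = [6, 5, 4, 3, 2, 1]
rho = partial_spearman(perceptual, observed, luminance)
```

Here `z` plays the role of the luminance level, so any association between `x` and `y` that arises only because both change with luminance is discounted.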

Figure 5 (with 4 supplements)
Dissociation between perceptual capacity and overall task performance.

In each panel, the data from each participant (joined by lines) are shown for trials of high, medium, and low luminance cues (bright, grayish, and dark green points, respectively). Crosses indicate the typical (median) uncertainty (2 SEs) associated with the measurement in each direction. Partial Spearman correlations between values on the x and y axes are indicated, along with significance (Materials and methods). The partial correlation eliminates the association due exclusively to luminance. (a) Mean observed accuracy versus mean perceptual accuracy. (b) Mean observed accuracy versus mean RT. Average RT data include both correct and incorrect trials. (c) Mean perceptual accuracy versus mean RT.

https://doi.org/10.7554/eLife.46359.010

This may seem surprising. Logic dictates that better perception should translate into better performance — but critically, this is contingent on everything else being equal. The paradox arises because the mean RT also varies across participants, and the two accuracy measures relate to it in opposite ways. The average observed accuracy demonstrates a strong speed-accuracy tradeoff, that is, slower participants are correct more often (Figure 5b). In contrast, the mean perceptual accuracy demonstrates a weaker opposite trend, that is, those participants that exhibit high perceptual ability also tend to respond more quickly (Figure 5c). The results are nearly identical when perceptual performance is quantified with the endogenous response centerpoint (Figure 5—figure supplement 1).

This stark divergence did not arise because our urgent task produced more or different errors, but rather because, unlike the mean accuracy, the tachometric curve is a true metric of perceptual processing. This curve is highly sensitive to the properties of the visual stimuli that must be judged (such as luminance, in this case), and at the same time, when such properties are fixed, it is largely impervious to manipulations that substantially alter the RT (Stanford et al., 2010; Shankar et al., 2011; Salinas et al., 2014; Scerra et al., 2019; for evidence that is specific to the compelled antisaccade task, see Figure 5—figure supplement 2). In contrast, the mean observed accuracy depends not only on the shape of the tachometric curve, but also on two factors, unrelated to perception, that determine which parts of the curve are sampled during an experiment, the gap values used and the subject’s urgency. For instance, when only a zero gap is used, most rPTs are beyond the vortex range (Figure 2—figure supplement 1g,h), where differences between participants reflect mainly their asymptotic performance levels, that is, their lapse rates. But even when a wide range of gaps is used (as in Figure 5), participants that tend to respond quickly (short RTs) will generally produce short rPTs and sample more densely the left side of their curves, whereas participants that tend to respond slowly (long RTs) will generally produce long rPTs and sample more densely the right side of their curves. This is the source of the speed-accuracy tradeoff found here (Figure 5b). As a result, the mean accuracy provides scarcely any information about the ability of an individual to prevent a captured saccade relative to that of others.
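
The sampling argument above can be made concrete with a toy calculation. In the sketch below, two hypothetical participants share exactly the same tachometric curve and differ only in how quickly they respond. The step-like curve, the fixed RTs, and the gap list are illustrative assumptions loosely based on values quoted in the text (chance below about 90 ms, a vortex up to about 140 ms, an asymptote near 0.98), not fits to the data.

```python
def toy_curve(rpt_ms):
    """Hypothetical step-like tachometric curve (not a fit to the data)."""
    if rpt_ms < 90:
        return 0.5    # guesses: chance level
    if rpt_ms < 140:
        return 0.05   # vortex: saccades mostly captured by the cue
    return 0.98       # informed choices near the asymptote


def expected_observed_accuracy(rpts_ms, curve=toy_curve):
    """Mean of the curve over the rPTs that a participant actually produced."""
    return sum(curve(t) for t in rpts_ms) / len(rpts_ms)


gaps = [0, 50, 100, 150, 200, 250, 300, 350]
fast = [250 - g for g in gaps]  # hypothetical participant with RT = 250 ms
slow = [400 - g for g in gaps]  # hypothetical participant with RT = 400 ms
acc_fast = expected_observed_accuracy(fast)
acc_slow = expected_observed_accuracy(slow)
```

Although perceptual ability is identical by construction, the slower participant samples more of the right side of the curve and therefore scores a higher mean accuracy, which is the speed-accuracy tradeoff described above.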

A comprehensive account of antisaccade behavior based on motor competition

We developed a physiologically feasible model (Materials and methods) to explore two mechanistic hypotheses about the neural origin of the vortex. This model is a variant of one that replicates both behavioral performance and choice-related neuronal activity (in the FEF) in an urgent, two-alternative, color discrimination task (Stanford et al., 2010; Shankar et al., 2011; Costello et al., 2013; Seideman et al., 2018). As in that case, the current model considers two variables, rL and rR, that represent oculomotor responses favoring saccades toward left and right locations (Figure 6, black and red traces). These motor plans compete with each other such that the first one to reach a fixed threshold level (Figure 6, dashed lines) determines the choice: a left saccade if rL reaches threshold first, or a right saccade if rR reaches threshold first. In each trial, after the go signal, rL and rR start increasing with randomly drawn build-up rates. The build-up process is likely to end in a random choice (i.e. a guess; Figure 6c) when one of the initial rates is high and/or the gap is long, but otherwise, time permitting, the cue signal modifies the ongoing motor plans (Figure 6a,b). Specifically, once the target has been identified, the plan toward it (correct) is accelerated and the other one, toward the opposite, incorrect location, is decelerated (Figure 6a, note acceleration of black trace and deceleration of red trace after shaded interval). This corresponds to the cue content, interpreted according to task rules, informing the correct choice.

Figure 6
Three representative single trials of the race-to-threshold model.

Traces show motor plans rL toward the left (red) and rR toward the right (black) as functions of time. Because in these examples the cue is assumed to be on the left, these variables also correspond to motor plans toward the cue (red) and the anti location (black), respectively. After the exogenous response interval (ERI, pink shade), the former (incorrect) plan decelerates and the latter (correct) plan accelerates. A saccade is triggered a short efferent delay after activity reaches threshold (dashed lines). (a) A correct, long-rPT trial; that is, an informed choice (RT = 369 ms, rPT = 219 ms). (b) An incorrect trial with rPT within the vortex; that is, a captured saccade (RT = 283 ms, rPT = 133 ms). (c) A correct, short-rPT trial; that is, a correct guess (RT = 206 ms, rPT = 56 ms). In all examples, the gap is 150 ms. The influence of the cue depends on its timing relative to the ongoing motor activity.

https://doi.org/10.7554/eLife.46359.015

To adapt this ‘accelerated race-to-threshold’ model to the compelled antisaccade task, we introduced one crucial, task-specific assumption: that the competition is biased in favor of the cue location during a period of time that we refer to as the exogenous response interval, or ERI (Figure 6, pink shades). During the ERI, the cue has already been detected by the circuit but not yet interpreted as ‘opposite to the target’ (so it cannot yet drive the endogenous acceleration and deceleration described above). We consider two possible mechanisms by which, during the ERI, the detection of the cue may lead to exogenous attentional/oculomotor capture: (1) it could halt or suppress the ongoing plan toward the anti location (Figure 6a,b, black traces during pink interval) or (2) it could transiently accelerate the ongoing plan toward the cue location (Figure 6a,b, red traces during pink interval). These alternatives are not mutually exclusive. The former is consistent with evidence that salient, abrupt-onset stimuli reflexively interrupt ongoing saccade plans (Dorris et al., 2007; Bompas and Sumner, 2011; Hafed and Ignashchenkova, 2013; Buonocore et al., 2017; Salinas and Stanford, 2018), whereas the latter is consistent with the short-latency, stimulus-driven activation of visually responsive neurons in oculomotor areas (Gottlieb and Goldberg, 1999; Bisley et al., 2004; Thompson et al., 2005; Ipata et al., 2006; Marino et al., 2015; Joiner et al., 2017; White et al., 2017; Chen et al., 2018).
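
The race described in the last two paragraphs can be sketched for a single trial as follows. This is a heavily simplified illustration, not the published model: build-up rates are passed in rather than drawn at random, time advances in 1 ms steps, and every default value (threshold, delays, ERI length, acceleration magnitudes) is a placeholder chosen for the example, as is the function name `simulate_trial`.

```python
def simulate_trial(rate_cue, rate_anti, gap_ms, *, threshold=1000.0,
                   go_delay_ms=50, cue_delay_ms=90, eri_ms=40,
                   efferent_ms=20, exo_accel=0.2, endo_accel=0.2,
                   max_ms=1000):
    """One trial of a simplified accelerated race to threshold (1 ms steps).

    All default parameter values are illustrative placeholders.
    Returns (choice, rt_ms, rpt_ms), where choice is "cue" or "anti".
    """
    r_cue = r_anti = 0.0
    eri_start = gap_ms + cue_delay_ms  # cue has been detected by the circuit
    eri_end = eri_start + eri_ms       # cue has been interpreted
    for t in range(go_delay_ms, max_ms):  # t is time since the go signal
        if eri_start <= t < eri_end:
            rate_cue += exo_accel        # plan toward the cue accelerates
            step_anti = 0.0              # plan toward the anti location halts
        else:
            if t >= eri_end:
                rate_anti += endo_accel  # correct plan accelerates
                rate_cue -= endo_accel   # incorrect plan decelerates
            step_anti = rate_anti
        r_cue = max(0.0, r_cue + rate_cue)
        r_anti = max(0.0, r_anti + step_anti)
        if r_cue >= threshold or r_anti >= threshold:
            choice = "cue" if r_cue >= r_anti else "anti"
            rt = t + 1 + efferent_ms     # saccade follows an efferent delay
            return choice, rt, rt - gap_ms
    return None, None, None


guess = simulate_trial(2, 10, 150)      # fast anti plan wins before the cue acts
captured = simulate_trial(4, 3, 100)    # boost during the ERI pushes the cue plan over
informed = simulate_trial(2, 2.5, 100)  # slow plans leave time for endogenous control
```

Setting `exo_accel=0` leaves only the halting mechanism, which gives one simple way to explore the two hypotheses separately within this toy version.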

We found that, to reproduce the psychophysical data accurately, both mechanisms were necessary. To see why, first note that the tachometric curve, which refers to the proportion of correct choices in each rPT bin, can be expressed as a ratio,

$$C(\mathrm{rPT}) = \frac{f_C(\mathrm{rPT})}{f_C(\mathrm{rPT}) + f_I(\mathrm{rPT})} \tag{1}$$

where fC(rPT) and fI(rPT) describe the frequencies of correct and incorrect choices at each rPT, that is, they are the rPT distributions for correct and incorrect trials (normalized by the same factor; Materials and methods). Each of these distributions demonstrates a distinct feature: fC has a dip (green shades in Figure 7, third row), whereas fI has a peak (red shades in Figure 7, bottom row). Both features contribute to the vortex, as dictated by the above expression. The critical mechanistic observation is that acceleration of the motor plan toward the cue during the ERI accounts for the peak in fI (Figure 7a), whereas interruption of the competing motor plan away from the cue produces the dip in fC (Figure 7b). Thus, when the model was implemented with either one of the mechanisms alone, it failed to replicate the experimental feature associated with the other (Figure 7, bottom two rows, compare black traces in a vs. b). However, with the two mechanisms acting simultaneously, in coordination, the model reproduced the full data set in quantitative detail (Figures 7c and 8).

Figure 7
Contributions of two distinct neural mechanisms to attentional/oculomotor capture.

Top row: representative single, long-rPT trials from the model. Second row: tachometric curves, simulated (black traces) and experimental (green dots). Third row: rPT distributions for correct trials (fC), simulated (black traces) and experimental (green shades). Bottom row: rPT distributions for incorrect trials (fI), simulated (black traces) and experimental (red shades). (a) Results from a restricted version of the model in which, during the ERI, the motor plan toward the cue accelerates but the plan toward the anti location keeps advancing, unperturbed. (b) Results from another restricted version of the model in which, during the ERI, the motor plan toward the anti location halts but that toward the cue keeps advancing, unperturbed. (c) Results from the full-blown model, in which, during the ERI, the cue plan accelerates and the anti plan halts. For each model variant, results were obtained with the parameter values that minimized the error between the model and the pooled experimental data in the high luminance condition (Materials and methods).

https://doi.org/10.7554/eLife.46359.016
Figure 8 (with 3 supplements)
The race-to-threshold model accounts for antisaccade performance.

(a) Tachometric curves for high (left), medium (middle), and low (right) luminance cues. Continuous lines are model results. (b) Processing time distributions for correct (shades) and incorrect trials (red traces) at each luminance level. Overlaid traces (black and dark red) are corresponding model results. (c) RT distributions for correct (shades) and incorrect trials (red traces) at individual gap values (125 and 175 ms) for each luminance level. Overlaid traces (black and dark red) are model results. Dotted vertical lines mark the rPT at which the vortex reaches its minimum point (vortex time). All empirical data are pooled across participants.

https://doi.org/10.7554/eLife.46359.017

First, for the data pooled across participants, the model fitted the tachometric curve (Figure 8a) and the rPT distributions for correct and incorrect responses (Figure 8b). Second, for individual gap conditions, the model matched the variations in mean success rate and mean RT (Figure 8—figure supplement 1), but more importantly, it reproduced the shapes of the RT distributions for correct and incorrect choices, which were typically bimodal (Figure 8c). Third, the model accurately captured all the dependencies on luminance (Figure 8, compare results across columns). Importantly, in doing so, the values of the parameters that correspond to pure motor performance (the distribution of initial build-up rates for rL and rR, and the distribution of afferent delays associated with the go signal) were the same across luminance conditions (Table 1), in correspondence with the fact that all trials proceeded identically up to cue presentation, and that trials with different gap and cue luminance were interleaved during the experiment. And fourth, the model also fitted the (noisier) data from individual participants, even though they showed large, idiosyncratic variations in motor performance, as well as in the dips and peaks of their rPT distributions (Figure 8—figure supplements 2 and 3). For all comparisons across participants, the empirical (Figures 4 and 5) and simulated results (Figure 4—figure supplement 2; Figure 5—figure supplement 3) were nearly indistinguishable.

Table 1
Parameters of the race-to-threshold model for the pooled data.

Build-up rates are in AU ms⁻¹, times are in ms, and acceleration and deceleration are in AU ms⁻².

https://doi.org/10.7554/eLife.46359.021
Lum | μb | σb | ρb, μGOaff, σGOaff | μCUEaff | σCUEaff | μERI | σERI | gERI | ΔERI | aEX | dEND | aEND | λ
High | 1.4 | 3.74 | −0.955136 | 76 | 5 | 24 | 4 | 0 | 10 | 0.96 | −0.7 | 0.17 | 0.02
Medium | 1.4 | 3.74 | −0.955136 | 104 | 13 | 24 | 3 | 0 | 14 | 1.15 | −0.54 | 0.17 | 0.02
Low | 1.4 | 3.74 | −0.955136 | 126 | 19 | 24 | 10 | 0 | 14 | 0.58 | −0.29 | 0.14 | 0.1

Mechanistically, the best-fitting parameter values of the model (Table 1) provide further insight about the crucial element that gives rise to the vortex — the exogenous bias during the ERI (Figure 6, pink shades). Consider the following values based on the pooled data. According to the model, the onset of the ERI, which corresponds to the time at which the cue is detected, is highly sensitive to luminance. For the high, medium, and low conditions, the oculomotor circuitry detects the cue 76 ± 5 ms (mean ± SD for simulated trials), 104 ± 13 ms, and 126 ± 19 ms after its presentation. This variation from high to low luminance (50 ms) corresponds closely with the rightward shift of the vortex observed experimentally (51 ms; Figure 3a) and is consistent with the ubiquitous dependence of visual response latency on luminance and contrast (Purushothaman et al., 1998; Bisley et al., 2004; van Rossum et al., 2008; White et al., 2008; Oram, 2010; Marino et al., 2015). Remarkably, the ERI lasts only 24 ms (on average) in all three conditions, and the exogenous acceleration of the plan toward the cue occurs only during the last 14 ms (high luminance), or only during the last 10 ms (medium and low luminance); before that, the plan toward the cue halts just like its counterpart toward the anti location (Figure 6a,b, note that red trace is initially flat during shaded interval). The model suggests that the exogenous acceleration favoring the cue location is very brief but very powerful, which explains why the left edge of the vortex can be so steep.
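The timing arithmetic in this paragraph can be written out explicitly. The sketch below assumes that the acceleration window is the mean ERI duration minus the initial halting period of the cue plan (ΔERI, as defined in Materials and methods), using the pooled-data values quoted in the text.

```latex
\begin{align*}
\text{latency shift (high} \to \text{low)} &= 126 - 76 = 50\ \text{ms} \\
\text{acceleration window} &= \mu_{\mathrm{ERI}} - \Delta_{\mathrm{ERI}} =
\begin{cases}
24 - 10 = 14\ \text{ms} & \text{(high luminance)} \\
24 - 14 = 10\ \text{ms} & \text{(medium and low luminance)}
\end{cases}
\end{align*}
```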

Finally, the parameter values (Supplementary file 1) also point to specific neural mechanisms that likely underlie the individual differences in perceptual capacity. In general, identifying those mechanisms is complicated because their variations across (random) participants and across (controllable) experimental conditions are not necessarily correlated (Borsboom et al., 2009; Hedge et al., 2018). The cue latency discussed above is a perfect example: it demonstrates (via parameter μCUEaff) a strong, consistent dependence on luminance in each participant’s data set, and yet, for a given luminance level, it is not predictive of individual perceptual accuracy (Figure 5—figure supplement 4a). By contrast, we hypothesize that the magnitudes of the exogenous and endogenous acceleration (via parameters aEX and aEND) are major sources of individual variation, because although they have weaker dependencies on luminance, they are reliable predictors of perceptual accuracy (Figure 5—figure supplement 4b–d).

Discussion

By making the antisaccade task urgent, focusing on processing time (instead of RT), and developing a mechanistic model that is firmly grounded on the neurophysiology of saccadic choices, we were able to resolve the opposing influences of endogenous and exogenous mechanisms on the oculomotor response with unprecedented sharpness. Our findings suggest that, whether overt or covert, the capture of attention by a salient stimulus corresponds to specific changes in the firing of saccade-related neurons: activity that is spatially congruent with the stimulus is accelerated, whereas activity that is spatially incongruent is halted or suppressed. This fast, reflexive bias is the capture.

The results make four contributions. First, they identify concrete ways in which endogenous and exogenous mechanisms act on the oculomotor circuitry — namely, via acceleration, deceleration, and halting of rising firing rates — together with their unique behavioral signatures (dips and peaks in rPT and RT distributions). Second, they characterize the time scales of those mechanisms (a few tens of ms) as well as their dependencies on luminance. Third, they clearly parse the motor (RT) and perceptual (tachometric curve) contributions to antisaccade performance, thus removing the pervasive confounds caused by the speed-accuracy tradeoff. And fourth, they relate variations in motor, perceptual, and cognitive mechanisms to individual differences in performance. In particular, all participants exhibited a vortex, but different vortices resulted from different combinations of exogenous and endogenous mechanisms (compare rPT distributions for P1 vs. P4 in Figure 8—figure supplements 2 and 3) — a degeneracy in neural function that is to be expected based on that found in more reduced circuits (Marder et al., 2015).

Antisaccade performance in relation to cognitive conflict

By design, the antisaccade task creates a conflict between exogenous and endogenous mechanisms, the former driven by the saliency of the cue and the latter by task instructions followed willfully. Other tasks (e.g. Kim and Cave, 1999), most notably the singleton-distracter task employed by Theeuwes and colleagues (Theeuwes, 1991; Theeuwes, 1992; Theeuwes, 1994; Theeuwes et al., 1998; Theeuwes et al., 1999; Nissens et al., 2017), have also revealed such conflict in the form of attentional or oculomotor capture, but its manifestation in those cases consists primarily of small variations (∼ tens of ms) in RT around mean values that are much longer (>> 250 ms) than typical intersaccadic intervals, and the results can only provide a crude estimate of the underlying temporal dynamics (Mulckhuyse et al., 2008; see also Markowitz et al., 2011). Those tasks also require more complex visual displays with multiple items, and a secondary discrimination to serve as a probe of the effect. In principle, a minimalistic task typifying such an essential phenomenon would be extremely useful; it could serve to determine the neural correlates of volitional versus reflexive action, or pinpoint the consequences of disease on specific cognitive abilities, for example.

Numerous studies based on the traditional antisaccade task have, in fact, reported large differences in overall performance between distinct populations of participants (Guitton et al., 1985; Klein and Foerster, 2001; Munoz et al., 2003; Condy et al., 2007; Hakvoort Schwerdtfeger et al., 2012; Antoniades et al., 2015). But do those differences relate to the conflict at the heart of the task? This is unclear. While a distinction between fast and slow errors has been drawn based on theoretical considerations (Lo and Wang, 2016), the tachometric curve reveals three types of error: fast, incorrect guesses (rPT ≲ 100 ms), saccades captured by the cue (vortex), and lapses (rPT ≳ 200 ms), which probably depend on distinct cognitive processes or states that vary over long time scales (Harris and Thiele, 2011; Lo and Wang, 2016; Nir et al., 2017). When antisaccade performance is evaluated via the mean accuracy, the most common metric, the three error types are combined in proportions that are unpredictable, because they depend on each participant’s urgency (how quickly they tend to respond; Figure 5b) and on the gap values used in the experiment. In particular, when the cue is presented before or simultaneously with the go signal, most rPTs are sampled in the asymptotic performance range, where most errors are lapses (Figure 2—figure supplement 1). The captured saccades are the essential manifestation of the conflict, but they cannot be reliably quantified unless performance is tracked with high temporal resolution.

Oculomotor capture as a perceptuo-motor phenomenon

Insight about the visuo-motor interactions that determine the shape of the tachometric curve can be gleaned by realizing that, for each trial, the rPT conveys information not only about how much time was available for perceptual deliberation, but also about the state of the motor build-up at the time when the cue was detected by the oculomotor circuitry. Consider what must happen for a saccade to be triggered at rPT = 111 ms, when the capture is nearly certain: at the moment that the cue is detected (left edge of pink interval in Figure 6b), the motor activity toward the cue location must be just below threshold, so that the exogenous drive can reliably propel it past threshold. This can happen in many ways, such as when the motor plans build up slowly and the gap is long, or when the build-up is fast and the gap is short; but whatever the build-up history, an rPT of 111 ms corresponds to the requisite level of subthreshold activity. The same is true for other parts of the tachometric curve. Short rPTs (guesses) correspond to insufficient perceptual deliberation and to activity that exceeded threshold before the cue was detected (Figure 6c), whereas long rPTs (informed choices) correspond to successful deliberation guiding activity that was still far from threshold at the time of cue detection (Figure 6a). Captured saccades are reliably found within a narrow range of rPTs because their expression requires certain combinations of sensory (cue exposure) and motor (degree of build-up) conditions to be met.

This explanation simply recounts what the model does, so it is worth discussing how our model is different from previous ones, and why we think it is largely credible. Previous models of antisaccade performance (e.g. Wiecki and Frank, 2013; Lo and Wang, 2016; Aponte et al., 2017) applied to non-urgent conditions, so they provide limited insight about the shape of the tachometric curve. Furthermore, such models were concerned with the neural basis of inhibitory control more generally, so they involve, explicitly or implicitly, multiple brain areas; for example, one for producing motor responses and another for inhibiting the reflexive movement toward the cue. In contrast, our race-to-threshold model is agnostic as to where the relevant perceptual or control signals come from, or how they are computed; it simply deals with their dynamical impact (i.e. acceleration, deceleration, halting) on the developing motor activity that must ultimately communicate the urgent choice. As such, the fast variations of the model firing rates are meant to be directly comparable to those of saccade-related oculomotor neurons (in FEF, and perhaps SC).

Indeed, previous single-neuron studies in monkeys support key elements of our modeling framework. First, the initial level of motor activity contributes as proposed: during urgent saccadic choices, the build-up of activity in FEF starts after the go signal, regardless of when the cue information arrives (Stanford et al., 2010; Costello et al., 2013), and during antisaccade performance, the ongoing activity in FEF and SC is higher before erroneous saccades toward the cue than before correct antisaccades (Everling et al., 1998; Everling and Munoz, 2000). Second, in an urgent color discrimination task, perceptual information does produce acceleration and deceleration of motor activity in an rPT-dependent way (Stanford et al., 2010; Costello et al., 2013). Third, the timing of the vortex and its dependence on luminance parallel those of visual bursts in the oculomotor system (Gottlieb and Goldberg, 1999; Bisley et al., 2004; Thompson et al., 2005; Ipata et al., 2006; Joiner et al., 2017; White et al., 2017; Chen et al., 2018). In particular, the captured saccades in our experiment resemble so-called ‘express saccades’ in many ways: both result in movements triggered after 100 ms or less of stimulus viewing time; both are more likely with higher luminance; and both are facilitated by the early removal of the fixation requirement, such that a visually evoked response is superimposed on advancing motor activity (Paré and Munoz, 1996; Dorris et al., 1997; Marino et al., 2015). 
And fourth, the sudden presentation of a salient distracter stimulus has a robust impact on a developing saccade plan, with the effect depending strongly on their spatial congruence: when the saccade target is diametrically opposite to the stimulus, there is ample evidence (reviewed by Salinas and Stanford, 2018) indicating that the developing plan is transiently halted or suppressed, whereas when the saccade target is near the abrupt-onset stimulus, the developing plan is boosted (Dorris et al., 2007; Edelman and Xu, 2009; White et al., 2013; Marino et al., 2015). These observations are consistent with the exogenous and endogenous mechanisms implemented by the model.

Coupling between spatial attention and saccade planning

Our results are pertinent to a mechanistic question that is central to the ‘premotor theory’ of attention: to what degree is the neural substrate of the deployment of spatial attention the same as that of saccade planning? There is strong evidence that the rise in oculomotor activity associated with planning a saccade inevitably implies that attentional resources are at least partially allocated to the intended saccade endpoint (Kowler et al., 1995; Deubel and Schneider, 1996; Moore and Fallah, 2001; Godijn and Theeuwes, 2003; Moore and Armstrong, 2003; Cavanaugh and Wurtz, 2004; Steinmetz and Moore, 2014; Klapetek et al., 2016). The converse relationship — that is, whether the covert deployment of spatial attention must be accompanied by saccade planning — has been more contentious (Juan et al., 2004; Thompson et al., 2005). It appears, however, that the hypothesized motor plan associated with attentional allocation is just very difficult to observe when fixation must be actively maintained (Belopolsky and Theeuwes, 2012). During fixation, such a plan may manifest only as a subtle increase in baseline activity, rather than via the more typical steady rise in firing rate (Hauser et al., 2018), but it can be uncovered through experimental manipulations (Theeuwes et al., 1998; Theeuwes et al., 1999; Katnani and Gandhi, 2013; Nissens et al., 2017), and is evident in microsaccades (Chen et al., 2015; Lowet et al., 2018).

Our results are consistent with the idea that attentional and oculomotor capture are different behavioral manifestations of the same underlying neuronal dynamics (in, say, the FEF or SC). When a salient cue is detected, a bias favoring a motor plan toward its location is always generated (with the bias consisting of acceleration of the plan toward the cue and halting of any plans away from it). However, the impact of the exogenous biasing signal depends on the current state of the oculomotor circuitry. When the motor activity congruent with the cue location is far from threshold and is competing with other developing motor plans, the bias corresponds to the covert (and transient) deployment of attention to the cue. In contrast, when that activity has already developed to a substantial degree, the bias can quickly propel it past threshold. Then, the exogenous attraction of the cue becomes observable as an overt, captured saccade.

The detailed mechanistic framework presented here should provide ample opportunity to test these ideas in future experiments.

Materials and methods

Subjects and setup

Experimental subjects were six healthy human volunteers, two male and four female, ages 21–30. They were recruited from the Wake Forest School of Medicine and Wake Forest University communities. All had normal or corrected-to-normal vision. All participants provided informed written consent before the experiment. All experimental procedures were conducted with the approval of the Institutional Review Board (IRB) of Wake Forest School of Medicine.

The experiments took place in a semi-dark room. The participants sat on an adjustable chair, with their chin and forehead supported, facing a VIEWPixx LED monitor (VPixx Technologies Inc, Saint Bruno, Quebec, Canada; 1920 × 1200 screen resolution, 120 Hz refresh rate, 12 bit color) at a distance of 57 cm. Viewing was binocular. Eye position was recorded using an EyeLink 1000 infrared camera and tracking system (SR Research, Ottawa, Canada) with a sampling rate of 1000 Hz. Stimulus presentation and data collection were controlled using the system’s integrated software package (Experiment Builder).

Behavioral tasks

The sequence of events in the antisaccade task is described in Figure 1. The inter-trial interval was 1 s. The gap values used were −200, −100, 0, 75, 100, 125, 150, 175, 200, 250, and 350 ms, where negative numbers correspond to delays in the easy antisaccade task (Figure 1b). Thus, compelled and easy, non-urgent trials were interleaved. In each trial, the gap value, cue location (−10° or 10°), and luminance level (see below) were randomly sampled. Auditory feedback was provided at the end of each trial: a beep to indicate that the saccadic response was made within the allowed RT window (450 ms), or no sound if the limit was exceeded. This was independent of the choice. Feedback about the choice itself was unnecessary, as participants easily understood the rules of the task. The task was run in blocks of 150 trials. After 50–150 trials of practice, each participant completed 30 blocks over six experimental sessions (days). Within each session, 2–3 min of rest were allowed between blocks.

The cue was a green circle (0.5° diameter) appearing on a black background. Each participant performed the task with cues of three luminance levels, high (17.6 cd m⁻²), medium (0.35 cd m⁻²), and low (0.22 cd m⁻²). Luminance was measured with a spectrophotometer (i1 Pro 2 from X-Rite, Inc, Grand Rapids, MI). The cues were generated in Adobe Illustrator using the 8-bit RGB vectors [15 168 40], [3 28 7], and [1 12 3]. The lowest luminance was chosen to be close to the detection threshold based on detection curves generated previously for two participants.

Data analysis

All data analyses were carried out using customized scripts written in Matlab (The MathWorks, Natick, MA). Except where explicitly noted, results are based on the analysis of urgent trials (gap ≥ 0) only; that is, easy trials (delay trials with gap < 0) were excluded.

In each trial, the rPT was computed by subtracting the gap value from the RT value recorded in that trial. We refer to this processing time as ‘raw’ because it includes any afferent or efferent delays in the circuitry (Stanford et al., 2010). To compute the tachometric curve and rPT distributions, trials were grouped into rPT bins of 15 ms, with bins shifting every 1 ms. Normalized rPT distributions, fC(rPT) and fI(rPT), were obtained by counting the numbers of correct and incorrect trials, respectively, in each rPT bin, and dividing both functions by the same factor. The tachometric curve, which gives the proportion of correct trials in each bin, was then computed using Equation 1. For display purposes, the normalization factor used was the maximum value of fC or fI, whichever was largest, but the factor has no effect on the tachometric curve.
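The binning procedure described here can be sketched as follows. This is a minimal Python illustration, not the authors' Matlab code; the function name, the bin range (`t_min`, `t_max`), and the five example trials are hypothetical. The fraction correct in each bin is taken as the number of correct trials divided by the total number of trials in that bin.

```python
def tachometric_curve(rts, gaps, correct, bin_width=15, step=1, t_min=-50, t_max=300):
    """Sliding-bin rPT distributions and fraction correct.

    rts, gaps: per-trial reaction times and gap values (ms).
    correct:   per-trial booleans (True = saccade away from the cue).
    t_min, t_max: range covered by the bins (ms); hypothetical defaults.
    """
    # Raw processing time: RT minus gap, trial by trial.
    rpts = [rt - gap for rt, gap in zip(rts, gaps)]

    centers, n_correct, n_incorrect = [], [], []
    lo = t_min
    while lo + bin_width <= t_max:
        hi = lo + bin_width
        in_bin = [ok for r, ok in zip(rpts, correct) if lo <= r < hi]
        centers.append(lo + bin_width / 2)
        n_correct.append(sum(in_bin))
        n_incorrect.append(len(in_bin) - sum(in_bin))
        lo += step  # bins shift every `step` ms, so they overlap

    # Both distributions are divided by the same factor (display only).
    norm = max(max(n_correct), max(n_incorrect)) or 1
    f_c = [c / norm for c in n_correct]
    f_i = [i / norm for i in n_incorrect]

    # Fraction correct per bin; None marks bins with no trials.
    curve = [c / (c + i) if c + i > 0 else None
             for c, i in zip(n_correct, n_incorrect)]
    return centers, f_c, f_i, curve


# Hypothetical trials: rPTs are 50, 110, 150, 210, and 220 ms.
rts = [250, 260, 300, 310, 320]
gaps = [200, 150, 150, 100, 100]
correct = [True, False, False, True, True]
centers, f_c, f_i, curve = tachometric_curve(rts, gaps, correct)
```

Because the normalization factor is shared by both distributions, it cancels in the ratio and leaves the curve unchanged.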

In order to quantify perceptual performance, each tachometric curve was fitted with a continuous analytical function, v(x), which was defined as

(2) $$v(x) = \max\bigl(s_L(x),\ s_R(x),\ 0\bigr)$$

where the maximum function max(a,b,c) returns a, b, or c, whichever is largest, and sL and sR are two sigmoidal curves. These are given by

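(3) $$s_L(x) = B + \frac{A_L - B}{1 + \exp\left(\frac{x - C_L}{D_L}\right)}$$
(4) $$s_R(x) = B + \frac{A_R - B}{1 + \exp\left(-\frac{x - C_R}{D_R}\right)}$$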

where sL tracks the left (decreasing) side of the tachometric curve and sR tracks the right (increasing) side. The asymptotic value on the left side was fixed at AL= 0.5, to enforce the constraint that, for very short processing times, performance must be at chance. For any given empirical tachometric curve, the six remaining parameters defining v, the coefficients B, AR, CL, CR, DL, and DR, were adjusted to minimize the mean absolute error between the experimental and fitting functions. The minimization was done using the Matlab function fminsearch.
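As a rough illustration, the fitting function and its error measure might be coded as below. The sketch assumes a sign convention in which D_L and D_R are positive, so that s_L falls from A_L toward B and s_R rises from B toward A_R; the helper names and the example parameter values are hypothetical, and the minimization itself (fminsearch in the original analysis) is not reproduced.

```python
import math


def _inv_one_plus_exp(z):
    """Numerically stable 1 / (1 + exp(z))."""
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def s_left(x, B, A_L, C_L, D_L):
    # Decreasing sigmoid: tends to A_L for small x and to B for large x (D_L > 0).
    return B + (A_L - B) * _inv_one_plus_exp((x - C_L) / D_L)


def s_right(x, B, A_R, C_R, D_R):
    # Increasing sigmoid: tends to B for small x and to A_R for large x (D_R > 0).
    return B + (A_R - B) * _inv_one_plus_exp(-(x - C_R) / D_R)


def v(x, B, A_R, C_L, C_R, D_L, D_R, A_L=0.5):
    """Fitting function: the largest of the two sigmoids and zero."""
    return max(s_left(x, B, A_L, C_L, D_L), s_right(x, B, A_R, C_R, D_R), 0.0)


def mean_abs_error(params, rpts, observed):
    """Mean absolute difference between v and an empirical curve."""
    return sum(abs(v(x, *params) - y) for x, y in zip(rpts, observed)) / len(rpts)


# Hypothetical parameter values (B, A_R, C_L, C_R, D_L, D_R), for illustration only.
example = (-0.2, 0.95, 95.0, 130.0, 5.0, 8.0)
v_short, v_mid, v_long = v(0, *example), v(112, *example), v(300, *example)
```

With these illustrative values the curve starts at chance for short rPTs, drops to zero near 112 ms, and approaches the asymptote A_R for long rPTs.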

Once the best-fitting v(rPT) function for a given tachometric curve was found, we numerically calculated eight quantities, or features, from it: the asymptotic value (equal to AR), the minimum value (vortex depth), the rPT at which the minimum was found (vortex time), the most negative slope, the most positive slope, the rPT for which v is exactly between 0.5 (chance) and the minimum (left edge of the curve), the rPT for which v is exactly between the minimum and the asymptote (the curve’s centerpoint), and the average of the curve for rPTs between 0 and 250 ms (mean perceptual accuracy). In Figures 2 and 3, the gray shades demarcating the vortex correspond to the interval between the left edge and the centerpoint of each tachometric curve. Confidence intervals for all of these quantities were obtained by bootstrapping (Davison and Hinkley, 2006; Hesterberg, 2014); that is, by resampling the data with replacement and recalculating all the quantities many times to generate distributions for them. This was done in five steps: (1) resample the original trials with replacement, keeping the original number of contributing trials, (2) recompute the empirical tachometric curve from the resampled trials, (3) fit the new tachometric curve with a continuous v function, (4) recompute the eight characteristic features from the new v function, and (5) repeat steps 1–4 10,000 times to generate distributions for all the features. Reported confidence intervals correspond to the 2.5 and 97.5 percentiles obtained from the bootstrapped distributions.
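The five bootstrap steps can be sketched generically as below. This is a Python illustration under stated assumptions: the `feature` callable stands in for steps 2 to 4 (recomputing the curve, refitting v, and extracting a feature), and the example uses the plain fraction of correct trials as a hypothetical stand-in feature on made-up trials.

```python
import random


def percentile(sorted_vals, q):
    """Percentile (q in [0, 100]) with linear interpolation between ranks."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def bootstrap_ci(trials, feature, n_boot=10000, seed=0):
    """Return the 2.5 and 97.5 percentiles of `feature` over resampled data."""
    rng = random.Random(seed)
    n = len(trials)
    stats = []
    for _ in range(n_boot):
        # Step 1: resample with replacement, keeping the original trial count.
        resampled = [trials[rng.randrange(n)] for _ in range(n)]
        # Steps 2-4 are wrapped inside `feature` in this sketch.
        stats.append(feature(resampled))
    stats.sort()
    return percentile(stats, 2.5), percentile(stats, 97.5)


def fraction_correct(trials):
    # Hypothetical stand-in for a feature extracted from the fitted curve.
    return sum(ok for _, ok in trials) / len(trials)


# Made-up data: 200 trials as (rPT, correct), 150 of them correct.
trials = [(i, i % 4 != 0) for i in range(200)]
ci_low, ci_high = bootstrap_ci(trials, fraction_correct, n_boot=2000)
```

A smaller `n_boot` is used in the example only to keep it fast; the text specifies 10,000 repetitions.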

To quantify the association between average quantities computed for individual participants, such as the mean RT or mean observed accuracy (Figure 5), we considered three data points per participant, one for each luminance condition. The strength of association and its significance were calculated with three methods. First, we computed the partial Pearson correlation coefficient, which is the standard linear correlation between two variables but controlling for the effect of a third one, luminance in this case. This was implemented via the Matlab function partialcorr. We also computed the partial Spearman correlation coefficient, which involves a similar calculation but based on the ranks of the data points. This was also computed using partialcorr. Finally, using the Matlab function fitlm, we fitted the data to a linear regression model that also included luminance as a variable. The three methods typically produced similar results. We report those obtained with the partial Spearman correlation, which is denoted as ρ, because they were generally the most conservative.
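For readers unfamiliar with partial correlations, the following Python sketch shows one standard way to compute a partial Spearman coefficient with a single control variable: rank all three variables, then apply the first-order partial correlation formula. It is an illustration only, not a reproduction of the Matlab implementation, and the data values are made up.

```python
import math


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def partial_spearman(x, y, z):
    return partial_corr(ranks(x), ranks(y), ranks(z))


# Made-up example: x and y both increase with z (the control variable),
# but are inversely related within each level of z.
z = [1, 1, 2, 2, 3, 3]
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]
raw_rho = pearson(ranks(x), ranks(y))
partial_rho = partial_spearman(x, y, z)
```

In this toy data set the raw rank correlation is strongly positive, whereas the partial coefficient is negative, which illustrates why controlling for a shared factor such as luminance can matter.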

The accelerated race-to-threshold model

The model for the compelled antisaccade task is a straightforward extension of one developed previously for a two-alternative, urgent, color discrimination task (Stanford et al., 2010; Shankar et al., 2011; Costello et al., 2013; Seideman et al., 2018). In that earlier model, both motor plans halt briefly when the relevant cue is detected (i.e. during the ERI); the dynamics are otherwise identical. As explained in the main text, the idea is that two motor plans (in the FEF), represented by firing rate variables rL and rR, compete with each other to trigger an eye movement with a saccade vector pointing either to the left or to the right. Because the acceleration, deceleration, and halting of these plans depend on the cue location, it is useful to relabel the two variables as rC and rA, where the subscripts now refer to the cue and anti locations (keeping in mind that the C and A labels are randomly assigned to left and right directions in each trial). Note, however, that the following description applies identically if the C and A labels are replaced everywhere by L and R, respectively, and we assume that the cue appears on the left side.

Over time, the two motor plans advance toward a fixed threshold (equal to 1000 arbitrary units, or AU). If rC exceeds threshold first, the saccade is incorrect, toward the cue, whereas if rA exceeds threshold first, the saccade is correct, away from the cue. The saccade is considered to be triggered a short efferent delay (equal to 20 ms) after threshold crossing. The fixed threshold is a reasonable approximation of the triggering mechanism for saccades (Hauser et al., 2018).

The two rate variables evolve as follows

(5) $$\begin{aligned} r_C(t+\Delta t) &= r_C(t) + b_C\,\Delta t \\ r_A(t+\Delta t) &= r_A(t) + b_A\,\Delta t \end{aligned}$$

where bC and bA are their respective build-up rates and the time step Δt is equal to 1 ms. When the build-up rates are constant, the firing rates rC and rA increase linearly over time. Periods during which the activity accelerates or decelerates are those during which the build-up rates themselves change steadily, as described below. Any negative rC and rA values are reset to zero. Each simulated trial can be subdivided into three epochs with different model dynamics.

Epoch 1: before the ERI. Each trial starts with the two activity variables, rC and rA, equal to zero. The go signal occurs at t=0, but the two motor plans start building up later, after an afferent delay. This afferent delay is drawn from a Gaussian distribution with mean μGOaff and SD σGOaff, where values below 20 ms are excluded. The initial build-up rates, bC0 and bA0, are drawn from a two-dimensional Gaussian distribution with mean μb, SD σb, and correlation coefficient ρb. During this epoch, after the initial afferent delay has elapsed, rC and rA evolve according to Equations 5, with bC=bC0 and bA=bA0. If during this period one of the motor plans exceeds the threshold, a saccade is produced and the trial ends. Otherwise, the trial continues.

Epoch 2: during the ERI. The start of the ERI corresponds to the time point at which the cue is detected by the model circuit (we stress that this is a local event, and make no claims about the participant’s perceptual experience). Cue detection occurs after an afferent delay relative to the time of cue presentation, which is at t= gap. This delay is drawn from a Gaussian distribution with mean μCUEaff and SD σCUEaff, where values below 20 ms are excluded. The duration of the ERI also varies normally across trials. It is drawn from a Gaussian distribution with mean μERI and SD σERI, with negative values reset to zero. The two motor plans behave differently during the ERI. For the plan toward the anti location, rA, the build-up rate is bA=gERIbA0, where the constant gain factor gERI is either zero (i.e. the plan halts) or negative (i.e. the plan is suppressed). This factor was set to zero for the pooled data, but negative values were allowed when fitting the data from individual participants. Whether zero or negative, the build-up rate of the anti plan is the same throughout the whole ERI. In contrast, for the motor plan toward the cue, rC, the build-up rate is bC=gERIbC0 but only during the first ΔERI ms of the ERI; thereafter this build-up rate instantly recovers its initial value (so bC=bC0) and then increases steadily, such that

(6) $$b_C(t+\Delta t) = b_C(t) + a_{EX}\,\Delta t$$

until the end of the ERI, where the term aEX is the exogenous acceleration of the cue plan. In this way, the plan toward the cue, rC, first halts for ΔERI ms and then accelerates. If rC exceeds threshold during the ERI, a saccade toward the cue is triggered. Otherwise, the trial continues.

Epoch 3: after the ERI. During this last period, the plan toward the anti location first recovers its initial value (instantly, so bA=bA0) and then accelerates, whereas the plan toward the cue decelerates. That is,

(7) $$\begin{aligned} b_C(t+\Delta t) &= b_C(t) + d_{END}\,\Delta t \\ b_A(t+\Delta t) &= b_A(t) + a_{END}\,\Delta t \end{aligned}$$

where the endogenous deceleration dEND is negative and the endogenous acceleration aEND is positive. The process continues until one of the plans reaches the threshold.

Finally, the model also considers lapses, trials in which errors are made for reasons other than insufficient cue viewing time. Lapses occur with a probability λ, and are implemented as trials in which the endogenous acceleration and deceleration are equal to zero. In other words, a lapse corresponds to a trial in which the information about the correct target never reaches the circuit. During lapses, after the ERI (epoch 3), the motor plan toward the anti location continues building up at its initial rate, bA0, whereas the plan toward the cue continues advancing at whatever build-up rate it achieved at the end of the ERI.
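The three epochs and the lapse rule can be put together in a short simulation. The Python sketch below is a simplified reading of the description above, not the authors' code. Parameter values are the high-luminance entries of Table 1, except `rho_b`, `mu_go`, and `sd_go`, which are illustrative placeholders rather than fitted values. The sketch also assumes that nothing is updated before the go afferent delay has elapsed, and it adds a safety cutoff (`MAX_T`) that is not part of the model.

```python
import random

THRESHOLD = 1000.0   # AU
EFFERENT_DELAY = 20  # ms between threshold crossing and saccade onset
DT = 1               # ms
MAX_T = 2000         # ms; safety cutoff for this sketch only

# rho_b, mu_go and sd_go are placeholders (assumptions), not fitted values.
PARAMS = dict(mu_b=1.4, sd_b=3.74, rho_b=-0.9, mu_go=50.0, sd_go=30.0,
              mu_cue=76.0, sd_cue=5.0, mu_eri=24.0, sd_eri=4.0, g_eri=0.0,
              delta_eri=10.0, a_ex=0.96, d_end=-0.7, a_end=0.17, lam=0.02)


def truncated_gauss(rng, mu, sd, floor):
    """Gaussian draw that excludes (redraws) values below `floor`."""
    while True:
        x = rng.gauss(mu, sd)
        if x >= floor:
            return x


def simulate_trial(gap, p, rng):
    """Simulate one urgent trial; return (RT in ms, correct) or None on timeout."""
    go_delay = truncated_gauss(rng, p["mu_go"], p["sd_go"], 20.0)
    eri_start = gap + truncated_gauss(rng, p["mu_cue"], p["sd_cue"], 20.0)
    eri_end = eri_start + max(0.0, rng.gauss(p["mu_eri"], p["sd_eri"]))
    # Correlated initial build-up rates for the cue (C) and anti (A) plans.
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    b_c0 = p["mu_b"] + p["sd_b"] * z1
    b_a0 = p["mu_b"] + p["sd_b"] * (p["rho_b"] * z1 + (1 - p["rho_b"] ** 2) ** 0.5 * z2)
    lapse = rng.random() < p["lam"]

    r_c = r_a = 0.0
    b_c, b_a = b_c0, b_a0
    c_recovered = a_recovered = False
    for t in range(MAX_T):
        if t < go_delay:
            continue  # build-up has not started yet
        if eri_start <= t < eri_end:           # epoch 2: during the ERI
            b_a = p["g_eri"] * b_a0            # anti plan halts (or is suppressed)
            if t < eri_start + p["delta_eri"]:
                b_c = p["g_eri"] * b_c0        # cue plan halts first
            else:
                if not c_recovered:
                    b_c, c_recovered = b_c0, True
                b_c += p["a_ex"] * DT          # then accelerates (Equation 6)
        elif t >= eri_end:                     # epoch 3: after the ERI
            if not a_recovered:
                b_a, a_recovered = b_a0, True
            if not lapse:                      # lapses: no endogenous signal
                b_c += p["d_end"] * DT
                b_a += p["a_end"] * DT
        # Epoch 1 (before the ERI): rates keep their initial values.
        r_c = max(0.0, r_c + b_c * DT)         # negative activity resets to zero
        r_a = max(0.0, r_a + b_a * DT)
        if r_c >= THRESHOLD or r_a >= THRESHOLD:
            return t + DT + EFFERENT_DELAY, r_a >= r_c
    return None


rng = random.Random(1)
urgent_gaps = [0, 75, 100, 125, 150, 175, 200, 250, 350]
results = []
for _ in range(3000):
    gap = rng.choice(urgent_gaps)
    out = simulate_trial(gap, PARAMS, rng)
    if out is not None:
        rt, ok = out
        results.append((rt - gap, ok))  # (rPT, correct)

long_rpt = [ok for rpt, ok in results if rpt >= 200]
accuracy_long = sum(long_rpt) / len(long_rpt)
```

In this sketch, saccades with long rPTs are almost always correct, because by then the cue plan has decelerated and only the anti plan can still reach threshold.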

In all, the model has 15 parameters that were adjusted to fit the pooled data set or the data from individual participants. Best-fitting values are listed in Table 1 and Supplementary file 1. These were obtained by searching over a multidimensional parameter space, gradually reducing its volume, seeking to minimize the mean absolute error between the simulated and the experimental data. For each parameter vector tested, the error consisted of a sum of terms, each representing one target function to be fitted. These functions were the RT distributions for correct choices at individual gaps, the RT distributions for incorrect choices, also at individual gaps, and the tachometric curve. The search/minimization procedure was repeated multiple times with different initial conditions to ensure that solutions were found near the global optimum.
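The text specifies the error measure (a sum of mean absolute errors, one per target function) and the general strategy (a search whose volume is gradually reduced), but not the algorithm itself. The Python sketch below is therefore only one hypothetical way to realize such a search, a random search in a shrinking box, demonstrated on a toy two-parameter model rather than on the race-to-threshold model. All names and settings are assumptions.

```python
import random


def mean_abs_error(simulated, data):
    return sum(abs(s - d) for s, d in zip(simulated, data)) / len(data)


def total_error(params, model, targets):
    """Sum of per-target mean absolute errors (one term per target function)."""
    return sum(mean_abs_error(model(params, name), data)
               for name, data in targets.items())


def shrinking_search(error_fn, lower, upper, n_rounds=30, n_samples=200,
                     shrink=0.8, seed=0):
    """Random search around the best point, in a box that shrinks every round."""
    rng = random.Random(seed)
    best = [(lo + hi) / 2 for lo, hi in zip(lower, upper)]
    half = [(hi - lo) / 2 for lo, hi in zip(lower, upper)]
    best_err = error_fn(best)
    for _ in range(n_rounds):
        for _ in range(n_samples):
            # Sample near the current best point, clipped to the bounds.
            cand = [min(hi, max(lo, b + rng.uniform(-h, h)))
                    for b, h, lo, hi in zip(best, half, lower, upper)]
            err = error_fn(cand)
            if err < best_err:
                best, best_err = cand, err
        half = [h * shrink for h in half]  # reduce the search volume
    return best, best_err


def toy_model(params, name):
    # Toy stand-in with two "target functions" sharing parameters a and b.
    a, b = params
    if name == "linear":
        return [a * x + b for x in range(5)]
    return [a * x * x + b for x in range(5)]


targets = {"linear": toy_model((2.0, 1.0), "linear"),
           "quadratic": toy_model((2.0, 1.0), "quadratic")}


def error_fn(params):
    return total_error(params, toy_model, targets)


start_err = error_fn([2.5, 2.5])
best, best_err = shrinking_search(error_fn, lower=[0.0, 0.0], upper=[5.0, 5.0])
```

Repeating the call with different `seed` values would mimic the use of multiple initial conditions mentioned in the text.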

References

  8. Borsboom D, Kievit RA, Cervone D, Hood SB (2009) The two disciplines of scientific psychology, or: the disunity of psychology as a working hypothesis. In: J Valsiner, editors. Dynamic Process Methodology in the Social and Developmental Science. New York: Springer-Verlag. pp. 67–97.
  15. Coe BC, Munoz DP (2017) Mechanisms of saccade suppression revealed in the anti-saccade task. Philosophical Transactions of the Royal Society B: Biological Sciences 372:20160192. https://doi.org/10.1098/rstb.2016.0192
  18. Davison AC, Hinkley D (2006) Bootstrap Methods and Their Applications. Cambridge: Cambridge University Press.
  64. Ruz M, Lupiáñez J (2002) A review of attentional capture: on its automaticity and sensitivity to endogenous control. Psicológica 23:283–309.

Decision letter

  1. Daeyeol Lee
    Reviewing Editor; Yale School of Medicine, United States
  2. Timothy E Behrens
    Senior Editor; University of Oxford, United Kingdom
  3. Daeyeol Lee
    Reviewer; Yale School of Medicine, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Timescales of exogenous and endogenous attention revealed during urgent antisaccade performance" for consideration by eLife. Your article has been reviewed by three peer reviewers, including Daeyeol Lee as the Reviewing Editor and Reviewer #1, and the evaluation has been overseen by Timothy Behrens as the Senior Editor.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

The authors have developed an ingenious task that combines two well-studied paradigms, i.e., the anti-saccade task and the compelled saccade task. A tachometric curve shows that saccades generated prematurely tended to be directed incorrectly towards the cue, rapidly switching to the correct antisaccades. An interesting finding in this paper is that this switch occurs very rapidly, and the so-called attentional vortex lasts only about 40 ms. The authors have extended their previous race model, which accounts for the behavioral data very well, and also identify that the individual variability lies mostly in how much the rate variables corresponding to the exogenous and endogenous plans accelerate immediately and some time after the cue onset, respectively.

Essential revisions:

1) The title implies that this is a study of attention and, indeed, the authors are using overt attention as their behavioral metric. However, some reviewers opined that the work primarily gives insight into possible underlying neural mechanisms driven by exogenous and endogenous processes. Whether these are synonymous with attention (whatever attention is) is debatable, as the section in the Discussion implies. Therefore, it is suggested that the study be framed in terms of processes, leaving the word attention for a brief mention in the Discussion.

The paper would be clearer (and more accurate) if the writing were more careful in its use of terminology, especially in the Results section. Although it is true that reaction time measurements have often been used as ways to measure "attention", this might not be appropriate anymore given the careful distinctions that have been drawn in the literature between effects on sensory processing versus motor preparation versus decision criteria, etc. These distinctions are especially relevant given the framing of this work, which builds on the idea that sensory processing and motor preparation proceed in parallel. To refer to effects on reaction time as due to aspects of "attention" or "perception" therefore seems inappropriate, since some of these effects might be attributable (and are, in fact, from the perspective taken in the authors' model) to motor preparation, saccade trigger thresholds, and possibly other non-sensory processes.

The phrase "attention vortex" is certainly memorable, but these effects may not all be due to changes in "attention" but also involve changes in saccade and fixation control. For example, it is relevant that the timing of the fixation offset is similar to the values that are also associated with the gap effect and express saccades – these are understood to be due to both the loss of visual drive for maintaining fixation as well as the addition of visual drive for the saccade. It seems likely that these types of motor preparatory factors are also involved here, and they deserve to be mentioned.

2) In various places in the manuscript, the authors describe behavior as resulting from a strong involuntary capture (e.g. subsection “Antisaccade performance varies with cue luminance”, first paragraph) or rates of activity buildup (e.g. subsection “Attentional capture as a perceptuo-motor phenomenon”, first paragraph). This might be questioned. The behavior seen is most likely due to the urgency in the task. It is likely that rPTs that are less than 90 ms or that are in the vortex occur primarily because the gap is really long, not because one or another rate was high in the pre-ERI period. It is important that the authors not over-interpret their results. As they know, when the task is not compelled, these results tend not to be seen.

3) The modeling needs a stronger justification. The subdivision of time in the model into epochs with different properties seems ad hoc. The valuable thing about race models is that they provide an explanation for temporal aspects of behavior by defining the rates of hypothetical underlying processes. But this model includes explicit time-related variables (for example, defining the Exogenous Response Interval) to explain the temporal features of your data. It seems the interesting features of the data set are baked in. It's also disappointing that different sets of parameters are needed to explain the effects of the cue luminance manipulation.

Is it possible to define one rate process that would generate pro-saccades (perhaps shorter latency and lower rate), and a second rate process (perhaps longer latency and higher rate) that would generate the anti-saccades, and then allow them to interact so as to explain performance over time? The ERI would then emerge as a byproduct of the difference in latencies and rates between the two processes. Moreover, the rates could be defined as a function of the cue luminance, so that the scaling of the curves across cue luminance conditions might also emerge with one set of parameters. A model with this structure would be more parsimonious.

Along these same lines, it would be helpful to use the reaction time data from an independent set of pro-saccade trials to help constrain the rate parameters in the model. The parameters that capture the RT distribution for prosaccades might be expected to also explain the timing of the "dip" in the tachometric curve in the urgent anti-saccade task. It would be nice corroboration to see that this works on a per-subject basis, but not if you switch pro-saccade rates between subjects.

4) One of the three summary measurements for the tachometric curves is the "mean accuracy", and then several paragraphs are used to explain the seemingly counterintuitive relationship between mean accuracy and other aspects of performance. This aspect of the Results seemed poorly motivated, because I felt that "mean accuracy" was itself a bad measurement: the underlying distribution tends to be bimodal (with modes at 0.5 and 1.0), so reporting the mean is a bad idea to begin with.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Voluntary and involuntary contributions to perceptually guided saccadic choices resolved with millisecond precision" for further consideration at eLife. Your revised article has been favorably evaluated by Timothy Behrens as the Senior Editor, a Reviewing Editor, and two reviewers.

The manuscript has been improved but one of the reviewers still has concerns that you might want to address before acceptance, as outlined below. Please note that there was some disagreement among the reviewers regarding how to best address these issues, but I hope that you will take them into account when finalizing the manuscript.

1) Reviewers had additional discussion after they completed their individual reviews, regarding whether the use of the phrase "attentional vortex" was entirely appropriate. On the one hand, we all agree that the term "attention" might not be ideal, but we are also sympathetic to the reasons why you might prefer this term to an alternative, such as oculomotor capture, which might be technically more appropriate. The following is a comment from one of the reviewers on this issue:

The authors have been more careful in their use of loaded terms like attention and perception, although they still like to use the term "attention vortex". There are a few reasons why I still argue against using this term. First, as the authors say in their Reply: "the existing term 'oculomotor capture' describes exactly what happens in their task". What they have very nicely mapped out is the time course of oculomotor capture during the urgent antisaccade task. There is no need to introduce a new term. Second, the measurements here are oculomotor measurements, and the link to attention is only inferred. We do not know if there are also effects on visual perceptual judgments in addition to eye movements (for example, effects on visual discrimination). Third, the term attentional vortex implies that the capture is driven by attention, when really it is due to the interaction between saccade planning and changes in the state of attention (you could have called it "saccade vortex" with equal mixed justification). Fourth, isn't the attention field confusing already? You have the opportunity to put forward a clarifying description of how attention-related and saccade-related processes interact. Why ruin it? In short, I strongly recommend replacing "attention vortex" and "vortex" with "oculomotor capture", for the sake of properly and accurately acknowledging how these results fit into the existing literature.

2) Another issue on which the reviewers could not reach a complete consensus is whether and how the results related to the "mean accuracy" should be improved. While we all agree that you can probably perform additional analyses to clarify this part of the manuscript further, we did not agree on whether this would be entirely necessary. The following is a comment from one of the reviewers:

The documentation of the tachometric curves (subsection “Antisaccade performance varies with cue luminance”, last paragraph) still seems to miss the mark for me, especially the section that goes into the problems with "mean accuracy". From the authors' reply, I appreciate the point they are trying to make, but I think there may be other ways to achieve this goal.

There are actually two different points to be made. The first point is that the tachometric curves are not the same between subjects, and this presumably is due to differences in the temporal dynamics of the underlying sensory and saccade-related processes. This is an interesting point and is backed up by the second and third measurements drawn from the tachometric curves.

The second point is that mean accuracy can be misleading, or is at least incomplete, because it fails to register the separate effects that contribute to the dip in the tachometric curve. This is an important point, but I think you could make it much more directly. Here is my specific suggestion/request. When you refer to other studies using mean accuracy (subsection “Individual differences in perceptual and overall performance”, first paragraph), I presume you are referring to studies that used particular "gap" values, most often 0 milliseconds. With your own data, you could show how the distributions of correct and incorrect responses land on that subject's tachometric curve – for that one particular gap value – and what the "mean accuracy" would be for that subject if that were the only condition you had run. This would show more directly how the value of "mean accuracy" depends on the reaction time of the subject (as well as the gap value chosen), and how the tachometric curve captures these dynamics in ways that get obscured in the "mean accuracy" measurement. If this type of demonstration could explain the variation in previous studies using the anti-saccade task, that would be an especially useful addition.

https://doi.org/10.7554/eLife.46359.027

Author response

Essential revisions:

1) The title implies that this is a study of attention and, indeed, the authors are using overt attention as their behavioral metric. However, some reviewers opined that the work primarily gives insight into possible underlying neural mechanisms driven by exogenous and endogenous processes. Whether these are synonymous with attention (whatever attention is) is debatable, as the section in the Discussion implies. Therefore, it is suggested that the study be framed in terms of processes, leaving the word attention for a brief mention in the Discussion.

The paper would be clearer (and more accurate) if the writing were more careful in its use of terminology, especially in the Results section. Although it is true that reaction time measurements have often been used as ways to measure "attention", this might not be appropriate anymore given the careful distinctions that have been drawn in the literature between effects on sensory processing versus motor preparation versus decision criteria, etc. These distinctions are especially relevant given the framing of this work, which builds on the idea that sensory processing and motor preparation proceed in parallel. To refer to effects on reaction time as due to aspects of "attention" or "perception" therefore seems inappropriate, since some of these effects might be attributable (and are, in fact, from the perspective taken in the authors' model) to motor preparation, saccade trigger thresholds, and possibly other non-sensory processes.

The phrase "attention vortex" is certainly memorable, but these effects may not all be due to changes in "attention" but also involve changes in saccade and fixation control. For example, it is relevant that the timing of the fixation offset is similar to the values that are also associated with the gap effect and express saccades – these are understood to be due to both the loss of visual drive for maintaining fixation as well as the addition of visual drive for the saccade. It seems likely that these types of motor preparatory factors are also involved here, and they deserve to be mentioned.

We generally agree with this comment, and acknowledge that while our results are pertinent to various aspects of spatial attention, the word attention is heavily loaded and may refer to a variety of phenomena. Thus, we have changed the title and eliminated the reference to attention in it; as suggested, it now refers to voluntary and involuntary “processes”, which is more general. Similar, broader terminology (e.g., “endogenous control”, “endogenous responses”, etc.) was adopted at other points in the text as well. In addition, we revised the text to make the language more accurate, as suggested. In particular, the link between endogenous attention and the behavior we describe was certainly unclear. We have eliminated all references to endogenous attention, save to say that it may contribute in part to the endogenous process that directs the eyes away from the cue in our task.

That said, we think that there is generally much less ambiguity in the literature regarding the exogenous effects produced by salient, abrupt-onset stimuli. In fact, an earlier version of the manuscript was criticized precisely because we did not refer to the dip in the tachometric curve simply as attentional/oculomotor capture. Operationally, the existing term “oculomotor capture” describes exactly what happens in our task. By extension, the term “attentional capture” is also pertinent, for two reasons. First, because there is much evidence to indicate that oculomotor and attentional capture are indeed different expressions of the same underlying dynamics (not only psychophysical work by Theeuwes and others, but also neurophysiological studies; for instance, Dorris et al., 2007; Busse et al., 2008). And second, because our modeling results are entirely consistent with this idea: the exogenous effect is always the same (transient, biased acceleration/halting during the ERI), only sometimes it triggers a (captured) saccade immediately, and sometimes it does not — but in all likelihood, in the latter case the subthreshold increase in oculomotor activity corresponds to an increase in spatial attention. For these reasons, we believe that describing our findings in reference to attentional/oculomotor capture is not inappropriate. As for the closely related term “attentional vortex,” it is now unambiguously defined as the range of processing times in which saccades are highly likely to be captured. While captured saccades are not a new finding, their tightly constrained temporal window is.

Aside from the terminology, we entirely agree that captured saccades result from a combination of visually-evoked and motor preparatory activity that is facilitated by the early release of fixation. We now point this out along with the similarity between our captured saccades and express saccades, which demonstrate many mechanistic parallels (subsection “Attentional capture as a perceptuo-motor phenomenon”, first and last paragraphs).

2) In various places in the manuscript, the authors describe behavior as resulting from a strong involuntary capture (e.g. subsection “Antisaccade performance varies with cue luminance”, first paragraph) or rates of activity buildup (e.g. subsection “Attentional capture as a perceptuo-motor phenomenon”, first paragraph). This might be questioned. The behavior seen is most likely due to the urgency in the task. It is likely that rPTs that are less than 90 ms or that are in the vortex occur primarily because the gap is really long, not because one or another rate was high in the pre-ERI period. It is important that the authors not over-interpret their results. As they know, when the task is not compelled, these results tend not to be seen.

What we meant was that both our findings and the results of previous experiments reporting oculomotor capture are likely caused by similar, low-level visual representations (acting on ongoing motor activity). We did not mean to imply that the two phenomena are identical (that, we don’t know). This is now stated more accurately (subsection “Antisaccade performance varies with cue luminance”, first paragraph).

As for the first paragraph of the subsection “Attentional capture as a perceptuo-motor phenomenon”, both the gap length and the intensity of the build-up process (i.e., urgency) determine the outcome of each trial. For instance, a captured saccade may result when the initial build-up rate is moderate and the gap is long, or when the initial build-up rate is high and the gap is short – but in either case the rPT will be around 111 ms. This is what the paragraph intends to point out, that the rPT marks the convergence of sensory (viewing duration) and motor (initial build-up) conditions that promote the capture. That said, the reviewer’s comment is definitely well taken, and appreciated. In that paragraph in particular the importance of the interplay between the two factors was not apparent. The paragraph has been rewritten to avoid oversimplification; it now reflects the more nuanced situation, in which both the gap and the build-up rate are important.

3) The modeling needs a stronger justification. The subdivision of time in the model into epochs with different properties seems ad hoc. The valuable thing about race models is that they provide an explanation for temporal aspects of behavior by defining the rates of hypothetical underlying processes. But this model includes explicit time-related variables (for example, defining the Exogenous Response Interval) to explain the temporal features of your data. It seems the interesting features of the data set are baked in. It's also disappointing that different sets of parameters are needed to explain the effects of the cue luminance manipulation.

Is it possible to define one rate process that would generate pro-saccades (perhaps shorter latency and lower rate), and a second rate process (perhaps longer latency and higher rate) that would generate the anti-saccades, and then allow them to interact so as to explain performance over time? The ERI would then emerge as a byproduct of the difference in latencies and rates between the two processes. Moreover, the rates could be defined as a function of the cue luminance, so that the scaling of the curves across cue luminance conditions might also emerge with one set of parameters. A model with this structure would be more parsimonious.

The development of our model has been data driven, and as far as we can tell, its neurophysiological implications are accurate.

First, we emphasize that this model is 95% the same as that developed previously for an urgent color discrimination task (the current model contains the earlier one as a special case). The broad structure has been thoroughly validated against neurophysiological data recorded in FEF (Stanford et al., 2010; Salinas et al., 2010; Costello et al., 2013; Scerra et al., 2019). That work indicates that, initially, after the go signal is detected, the two alternative motor plans indeed advance with fixed, randomly sampled build-up rates, and that this independent build-up ends when perceptual information arrives at the motor circuit to accelerate the motor plan associated with the target and decelerate the plan associated with the distracter (Stanford et al., 2010; Salinas et al., 2010; Costello et al., 2013). Notably, in that earlier model the ERI existed already, except that it consisted of a brief, symmetric halting of both motor plans (this is now mentioned in the first paragraph of the subsection “The accelerated race-to-threshold model”). At the time, this feature was introduced into the model to account for a small but reliable dip in the distributions of processing times. More recently, we realized that a biased halting of the motor activity (more prolonged for the side opposite to the cue) during this pre-existing period, and as observed in so-called “saccadic inhibition” experiments, could partially explain the attentional vortex. The point is that the ERI is not an ad hoc assumption of the antisaccade model, but a manifestation of a normally subtle yet extraordinarily reliable effect of visual onsets (discussed at length by Salinas and Stanford, 2018). 

The temporal features of our data may seem odd because they are absent in other, non-urgent decision-making tasks that unfold over several hundreds of milliseconds, but such features are not “baked in”; they are simply the ones most consistent with our current and previous results, and with a large literature on the interaction between visual onsets and saccade planning (reviewed by Salinas and Stanford, 2018).

Second, although we did not fully understand the alternative modeling scenario outlined by the reviewer, we stress that the luminance results required no additional assumptions or special tweaks to the model, and that they are entirely consistent with the neurophysiological effects of luminance. The model certainly cannot account for the effect of luminance without some variation in parameter values. In the scenario outlined by the reviewer, “the rates could be defined as a function of the cue luminance.” But the build-up rates cannot depend on luminance before the cue information arrives at the model circuit. The rates do change with cue luminance (gradually, via acceleration and deceleration), but for them to do so, the afferent delay inevitably associated with the cue onset must first elapse; that latency defines the beginning of the ERI. Again, this afferent delay is a real constraint, not an arbitrary assumption.

The model was developed to fit the high-luminance data, and doing so required exogenous halting and acceleration during the ERI, as described (Figure 7). Afterward, once those mechanisms were in place, we found that the model also produced excellent fits to the medium- and low-luminance data. Moreover, the best-fit parameter values it found make perfect sense: according to the model, the main consequence of lowering the luminance of the cue is to increase the afferent delay associated with the detection of the cue by ∼50 ms. This is highly consistent with psychophysical data (Purushothaman et al., 1998; White et al., 2008) and neurophysiological reports (Bisley et al., 2004; van Rossum et al., 2008; Oram, 2010; Marino et al., 2015) showing that visual latencies everywhere, from retina to high-order visual areas to oculomotor structures, increase as luminance and/or contrast diminishes, with differences as large as 100 ms or more. The results for medium and low luminance confirm that the two mechanisms proposed initially are consistent with the temporal properties of visually sensitive neurons in oculomotor areas, which are thought to mediate exogenous attention. This is now mentioned in the fifth paragraph of the subsection “A comprehensive account of antisaccade behavior based on motor competition”.

And third, experiments are underway in our laboratory to directly probe the reality of the ERI, i.e., the discrete transition from uninformed build-up, to exogenous bias, to endogenous guidance of the saccadic choice. We are in the very early stages of this effort, but so far the results are encouraging. For example, preliminary data show that the exogenous modulations of activity during the ERI associated with stimulus detection can be manipulated independently of the later endogenous signal that informs the choice — but such manipulation requires high temporal precision, as expected from the short duration of the ERI.

Along these same lines, it would be helpful to use the reaction time data from an independent set of pro-saccade trials to help constrain the rate parameters in the model. The parameters that capture the RT distribution for prosaccades might be expected to also explain the timing of the "dip" in the tachometric curve in the urgent anti-saccade task. It would be nice corroboration to see that this works on a per-subject basis, but not if you switch pro-saccade rates between subjects.

The contrast between pro- and antisaccades would indeed be very informative. In fact, for an experiment with interleaved pro- and anti- trials, once the model is fitted to the antisaccade trials, all the prosaccade results become fully predictable. That is, having set all the parameter values using the anti data, the pro data turn into a parameter-free test of the model. This test is, in fact, particularly rich in our urgent task, because the predictions include not only a full tachometric curve (Author response image 1A, red curve) but also specific shapes for the rPT distributions; for instance, the rPT histograms for correct pro (Author response image 1B, red curve) and incorrect anti trials (Author response image 1B, gray histogram) should overlap substantially up to the point where the pro curve reaches asymptotic performance. This, however, is a new experiment that goes beyond the scope of the present study.

Author response image 1
Predictions for an experiment in which urgent prosaccades and antisaccades are interleaved.

All results are from model simulations.

4) One of the three summary measurements for the tachometric curves is the "mean accuracy", and then several paragraphs are used to explain the seemingly counterintuitive relationship between mean accuracy and other aspects of performance. This aspect of the Results seemed poorly motivated, because I felt that "mean accuracy" was itself a bad measurement: the underlying distribution tends to be bimodal (with modes at 0.5 and 1.0), so reporting the mean is a bad idea to begin with.

Thanks for noting the lack of clarity on this issue. In our data set, it may seem somewhat obvious that using the mean accuracy is not the best thing to do, but our point is precisely that this is not a good idea in general, regardless of whether one uses an urgent or a non-urgent version of the task. From our results, one can readily identify why this is the case, and it boils down to two main reasons: first, the mean accuracy can be easily traded for speed (mean RT), and second, under non-urgent conditions the mean accuracy will largely reflect the rate of lapses, not the rate of captured saccades (which are the type of error that, implicitly or explicitly, is supposed to be characteristic of the antisaccade task). We have made various changes to the corresponding section (subsection “Individual differences in perceptual and overall performance”) to motivate the results better and state these points more clearly.
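
The first of these reasons can be illustrated with a small numerical sketch in Python. It assumes that the raw processing time is computed as rPT = RT - gap, and it uses made-up trials for two hypothetical participants who are equally accurate at each shared rPT but respond at different speeds. The function names, bin settings, and data are illustrative only and are not taken from the study.

```python
def tachometric_curve(trials, bin_width=50, t_min=0, t_max=300):
    """Fraction of correct choices per rPT bin.

    trials: iterable of (rt_ms, gap_ms, correct) tuples.
    Returns a list of (bin_center_ms, fraction_correct_or_None, n_trials).
    """
    n_bins = (t_max - t_min) // bin_width
    hits = [0] * n_bins
    counts = [0] * n_bins
    for rt, gap, correct in trials:
        rpt = rt - gap  # assumed definition of raw processing time
        if t_min <= rpt < t_max:
            i = int((rpt - t_min) // bin_width)
            hits[i] += int(correct)
            counts[i] += 1
    return [(t_min + i * bin_width + bin_width / 2,
             hits[i] / counts[i] if counts[i] else None,
             counts[i])
            for i in range(n_bins)]


def mean_accuracy(trials):
    """Overall fraction correct, ignoring processing time."""
    return sum(int(correct) for _, _, correct in trials) / len(trials)


# Hypothetical trials at a single gap of 0 ms: (RT, gap, correct).
fast = [(100, 0, False), (110, 0, False), (190, 0, True), (200, 0, True)]
slow = [(190, 0, True), (200, 0, True), (210, 0, True), (220, 0, True)]

curve_fast = tachometric_curve(fast)
curve_slow = tachometric_curve(slow)
print(mean_accuracy(fast), mean_accuracy(slow))
print(curve_fast)
print(curve_slow)
```

In this toy data set the two participants are equally accurate in every rPT bin they share, yet their mean accuracies differ simply because one of them responds earlier.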

[Editors' note: further revisions were requested prior to acceptance, as described below.]

The manuscript has been improved but one of the reviewers still has concerns that you might want to address before acceptance, as outlined below. Please note that there was some disagreement among the reviewers regarding how to best address these issues, but I hope that you will take them into account when finalizing the manuscript.

1) Reviewers had additional discussion after they completed their individual reviews, regarding whether the use of the phrase "attentional vortex" was entirely appropriate. On the one hand, we all agree that the term "attention" might not be ideal, but we are also sympathetic to the reasons why you might prefer this term to an alternative, such as oculomotor capture, which might be technically more appropriate. The following is a comment from one of the reviewers on this issue:

The authors have been more careful in their use of loaded terms like attention and perception, although they still like to use the term "attention vortex". There are a few reasons why I still argue against using this term. First, as the authors say in their Reply: "the existing term 'oculomotor capture' describes exactly what happens in their task". What they have very nicely mapped out is the time course of oculomotor capture during the urgent antisaccade task. There is no need to introduce a new term. Second, the measurements here are oculomotor measurements, and the link to attention is only inferred. We do not know if there are also effects on visual perceptual judgments in addition to eye movements (for example, effects on visual discrimination). Third, the term attentional vortex implies that the capture is driven by attention, when really it is due to the interaction between saccade planning and changes in the state of attention (you could have called it "saccade vortex" with equal mixed justification). Fourth, isn't the attention field confusing already? You have the opportunity to put forward a clarifying description of how attention-related and saccade-related processes interact. Why ruin it? In short, I strongly recommend replacing "attention vortex" and "vortex" with "oculomotor capture", for the sake of properly and accurately acknowledging how these results fit into the existing literature.

We appreciate the reviewer’s point of view and the overall sentiment — that conservative interpretations are generally preferred. We think that our results do speak to the close relationship between attention-related and saccade-related processes, but we agree that using the term “attentional vortex” is perhaps not the best way to convey the specific implications of our results, particularly in the Results section. Since the most problematic part of the term is the qualifier, which brings in the baggage associated with attention, we have decided to omit the reference to attention and simply call it the vortex. At many points in the text we refer specifically to the part of the tachometric curve where captured saccades prevail, so giving it a name is a necessity. While “vortex” recalls the strong, involuntary nature of the capture, it is entirely neutral in regard to the attention versus oculomotor distinction.

In addition, we again reviewed all mentions of attention and attentional capture. A couple of them were either omitted or replaced with “oculomotor capture,” as suggested. Finally, we think that the exogenous increase in activity driven by the detection of the cue does correspond to the reflexive, covert allocation of attention (when that activity stays below threshold and does not trigger a saccade to the cue), but this interpretation is now more clearly articulated as such in the last part of the Discussion.

2) Another issue on which the reviewers couldn't reach a complete consensus is whether and how the results related to the "mean accuracy" should be improved. While we all agree that you can probably perform additional analyses to clarify this part of the manuscript further, we did not agree on whether this would be entirely necessary. The following is a comment from one of the reviewers:

The documentation of the tachometric curves (subsection “Antisaccade performance varies with cue luminance”, last paragraph) still seems to miss the mark for me, especially the section that goes into the problems with "mean accuracy". From the authors' reply, I appreciate the point they are trying to make, but I think there may be other ways to achieve this goal.

There are actually two different points to be made. The first point is that the tachometric curves are not the same between subjects, and this presumably is due to differences in the temporal dynamics of the underlying sensory and saccade-related processes. This is an interesting point and is backed up by the second and third measurements drawn from the tachometric curves.

The second point is that mean accuracy can be misleading, or is at least incomplete, because it fails to register the separate effects that contribute to the dip in the tachometric curve. This is an important point, but I think you could make it much more directly. Here is my specific suggestion/request. When you refer to other studies using mean accuracy (subsection “Individual differences in perceptual and overall performance”, first paragraph), I presume you are referring to studies that used particular "gap" values, most often 0 milliseconds. With your own data, you could show how the distributions of correct and incorrect responses land on that subject's tachometric curve – for that one particular gap value – and what the "mean accuracy" would be for that subject if that was the only condition you had run. This would show more directly how the value of "mean accuracy" depends on the reaction time of the subject (as well as the gap value chosen), and how the tachometric curve captures these dynamics in ways that get obscured in the "mean accuracy" measurement. If this type of demonstration could explain the variation in previous studies using the anti-saccade task, that would be an especially useful addition.

The reviewer raises an interesting point about why exactly the mean accuracy fails to reflect the conflict between endogenous and exogenous influences that is evidenced by the tachometric curve. What we are asking in the subsection “Individual differences in perceptual and overall performance” is, indeed, whether using the mean accuracy to measure that conflict is a good idea. The reviewer notes an important nuance about our analysis: the results could depend on which gap values are used in the calculation. This was certainly worth looking into. However, it turns out that, regardless of which gaps we include in the analysis, the mean accuracy never correlates with the measures of perceptual performance derived from the tachometric curve.

In Figure 5, the mean accuracy and mean RT were computed using all the positive gap values, i.e., the same trials used to compute the tachometric curve. Calculated in this way, the mean accuracy is unrelated to perceptual accuracy (Figure 5A), and clearly demonstrates a speed-accuracy trade-off across individual participants (Figure 5B).

The same analysis was repeated using only the zero-gap data, as suggested (Author response image 2). Again, there is no obvious relationship between observed accuracy and perceptual accuracy (Author response image 2A), and because the former is generally very high at zero gap (mostly above 90% correct), the speed-accuracy trade-off is no longer visible (Author response image 2B). The relationship between perceptual accuracy and mean RT is preserved (Author response image 2C), but that just means that the RTs of zero-gap trials are highly consistent with those of nonzero-gap trials (i.e., the participants who respond quickly do so for all gaps).

The zero-gap data are not particularly informative because, as can be seen in Figure 2—figure supplement 1G, H, they miss the vortex and predominantly reflect asymptotic performance (delay trials are even worse, because they exclusively cover the asymptotic range; Figure 2—figure supplement 1C, D). However, our initial explanation based on the speed-accuracy trade-off still stands: even when multiple gaps are used and the data cover the full rPT range (as in Figure 5), the mean accuracy is still not a reliable way to compare the perceptual abilities of two participants because each of them will generally sample their respective tachometric curves differently, according to their own urgency. Both factors are important, the gaps used and the unpredictable sampling bias due to urgency.
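The sampling bias described above can be illustrated with a minimal Python sketch. It is not the model or the data from the paper: the curve shape, every constant, and the names `tachometric_curve`, `mean_accuracy`, `fast_rpts`, and `slow_rpts` are assumptions made purely for illustration. Two hypothetical participants share exactly the same tachometric curve but sample it over different rPT ranges, and their mean accuracies differ anyway.

```python
import math


def tachometric_curve(rpt_ms):
    """Toy P(correct) as a function of raw processing time (rPT, in ms).

    Hypothetical shape, not fitted to any data: chance level (0.5) at short
    rPT, a dip below chance centered at 120 ms (captured saccades), and a
    rise toward an asymptote of 0.95 at long rPT.
    """
    capture_dip = 0.45 * math.exp(-((rpt_ms - 120.0) / 20.0) ** 2)
    informed_rise = 0.45 / (1.0 + math.exp(-(rpt_ms - 150.0) / 8.0))
    return 0.5 - capture_dip + informed_rise


def mean_accuracy(rpt_samples_ms):
    """Average P(correct) over the rPT values a participant actually produced."""
    values = [tachometric_curve(t) for t in rpt_samples_ms]
    return sum(values) / len(values)


# Invented rPT samples: an urgent responder whose trials fall mostly inside
# the dip, and a slower responder whose trials fall mostly on the asymptote.
fast_rpts = list(range(60, 181, 5))
slow_rpts = list(range(140, 301, 5))

print("fast responder mean accuracy:", round(mean_accuracy(fast_rpts), 3))
print("slow responder mean accuracy:", round(mean_accuracy(slow_rpts), 3))
```

Because the curve is identical for both hypothetical participants, any difference in the printed values comes only from where each one samples the curve, which is the reason a single mean accuracy cannot stand in for the perceptual measures read off the curve itself.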

We thank the reviewer for raising this issue; it helped us articulate these results better. The last paragraph of the subsection “Individual differences in perceptual and overall performance” was rewritten to explain the distinct roles of the two factors, gaps and urgency. The last paragraph of the subsection “Antisaccade performance in relation to cognitive conflict” in the Discussion elaborates on this point and was also revised with the reviewer’s comment in mind.

Author response image 2
Same format as in Figure 5 of the main text, except that the mean observed accuracy and the mean RT were computed using zero-gap trials only.

Perceptual accuracy values are the same as in Figure 5.

https://doi.org/10.7554/eLife.46359.028

Article and author information

Author details

  1. Emilio Salinas

    Department of Neurobiology and Anatomy, Wake Forest School of Medicine, Winston-Salem, United States
    Contribution
    Conceptualization, Resources, Data curation, Software, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing—original draft, Project administration, Writing—review and editing
    For correspondence
    esalinas@wakehealth.edu
    Competing interests
    Reviewing editor, eLife
    ORCID: 0000-0001-7411-5693
  2. Benjamin R Steinberg

    Department of Neurobiology and Anatomy, Wake Forest School of Medicine, Winston-Salem, United States
    Contribution
    Software, Formal analysis, Investigation, Visualization
    Competing interests
    No competing interests declared
  3. Lauren A Sussman

    Department of Neurobiology and Anatomy, Wake Forest School of Medicine, Winston-Salem, United States
    Contribution
    Formal analysis, Investigation
    Competing interests
    No competing interests declared
  4. Sophia M Fry

    Department of Neurobiology and Anatomy, Wake Forest School of Medicine, Winston-Salem, United States
    Contribution
    Formal analysis, Investigation
    Competing interests
    No competing interests declared
  5. Christopher K Hauser

    Department of Neurobiology and Anatomy, Wake Forest School of Medicine, Winston-Salem, United States
    Contribution
    Software, Formal analysis, Investigation, Methodology
    Competing interests
    No competing interests declared
  6. Denise D Anderson

    Department of Neurobiology and Anatomy, Wake Forest School of Medicine, Winston-Salem, United States
    Contribution
    Supervision, Investigation, Project administration
    Competing interests
    No competing interests declared
  7. Terrence R Stanford

    Department of Neurobiology and Anatomy, Wake Forest School of Medicine, Winston-Salem, United States
    Contribution
    Conceptualization, Resources, Data curation, Supervision, Funding acquisition, Validation, Methodology, Writing—original draft, Project administration, Writing—review and editing
    Competing interests
    No competing interests declared

Funding

National Eye Institute (R01EY025172)

  • Emilio Salinas
  • Terrence R Stanford

National Institute of Neurological Disorders and Stroke (T32NS073553-01)

  • Christopher K Hauser

National Science Foundation (Graduate research fellowship)

  • Christopher K Hauser

Tab Williams Family Endowment

  • Emilio Salinas
  • Terrence R Stanford

National Eye Institute (R01EY021228)

  • Emilio Salinas
  • Terrence R Stanford

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Ziad Hafed for comments and suggestions.

Ethics

Human subjects: All participants provided informed written consent before the experiment. All experimental procedures were conducted with the approval of the Institutional Review Board (IRB) of Wake Forest School of Medicine.

Senior Editor

  1. Timothy E Behrens, University of Oxford, United Kingdom

Reviewing Editor

  1. Daeyeol Lee, Yale School of Medicine, United States

Reviewer

  1. Daeyeol Lee, Yale School of Medicine, United States

Publication history

  1. Received: February 24, 2019
  2. Accepted: June 20, 2019
  3. Accepted Manuscript published: June 21, 2019 (version 1)
  4. Version of Record published: July 22, 2019 (version 2)

Copyright

© 2019, Salinas et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 746
    Page views
  • 127
    Downloads
  • 0
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
