Introduction

Physical properties of stimuli strongly influence perception such that low-intensity stimuli are detected infrequently. As intensity increases, detection probability remains low until some perceptual threshold is crossed, after which stimuli are perceived robustly. A psychometric function (1-3) mathematically describes this property of perception. Only within a narrow range around the perceptual threshold do stimuli lead to significant trial-to-trial perceptual variance. While many studies present stimuli at threshold (4-7), few have probed the laminar cortical microcircuit mechanisms that underlie successful or unsuccessful perception under these conditions (8, 9).

Prior studies have characterized how perceived stimuli trigger stronger information propagation from earlier visual areas to higher-order visual and frontal regions (4, 9). This information propagation and sensory processing are strongly influenced by brain states such as arousal and attention (8, 10). Arousal has long been recognized for its role in modulating cortical activity (11-13) and affecting performance in various sensory tasks (14-16). In visual area V4, a key intermediate region in the ventral visual processing stream (17-19), attention strongly modulates neural activity (20-23). Attention increases the firing rates of V4 neurons, enhances the reliability of individual neuron firing, and reduces correlated fluctuations among pairs of neurons (21, 24-26). Brain state dynamics impact both cortical and subcortical structures, contributing to behavior (27, 28). Fluctuations in attention are reflected in the on- and off-state dynamics of V4 ensembles, which have been shown to correlate with behavioral performance (29, 30).

The visual cortex has a columnar architecture in which multiple cell classes (31-35) across the cortical layers (18, 36) form distinct sub-populations. These sub-populations have unique and stereotyped patterns of connectivity, thus forming a canonical microcircuit that orchestrates the encoding and flow of information (37, 38). Moreover, these sub-populations contribute uniquely to sensory processing and are differentially modulated by brain states (25, 39-41). While it has been shown that attentional modulation varies across cortical layers (39, 41-47), the role of these sub-populations in attentive perception at threshold remains poorly understood. Moreover, the influence of physiological states, which may be responsible for different outcomes at threshold, on these sub-populations has not been studied in detail.

Here we examine the neural mechanisms that regulate perception at threshold. We specifically focus on the columnar microcircuit mechanisms within area V4. We hypothesized that minor fluctuations in behavioral state, such as arousal and visual sensitivity, and in the activity of neural sub-populations across the layers of the visual cortex, result in different perceptual outcomes at threshold. Specifically, we hypothesized that output layer (II-III, V-VI) sub-populations, ones projecting to higher cortical areas and subcortical structures, would show evidence of improved capacity for stimulus representation during successful perception. We also hypothesized that such successful events would be accompanied by improved information propagation throughout a cortical column. We find that differences in behavioral states and lamina-specific neural states characterize correct and incorrect trials at threshold and explain perceptual variability.

Results

To study the neural dynamics responsible for determining whether a stimulus presented at perceptual threshold is perceived, we analyzed behavioral and cortical layer-specific neural data from area V4, collected while monkeys performed a cued attention task (39).

Monkeys were trained to detect an orientation change in one of two Gabor stimuli that were presented concurrently at two spatial locations, and to report having seen the change by making an eye movement to the changed stimulus. Prior to a block of trials, monkeys were cued as to which of the two spatial locations was likely to undergo the orientation change (95% valid cue; presented at the start of each block). During a trial, “non-target” stimuli at a fixed reference orientation were repeatedly presented. Non-targets were turned on for 200ms at the two spatial locations, and then turned off for a variable interval (200-400ms). At a random time (1-5s, mean 3s) a “target” stimulus, differing in orientation from the non-targets, was presented at one of the locations. If the monkey reported having detected the orientation change by making an eye movement to the location of the target stimulus, it received a juice reward (Figure 1A, “hit” trial). If the monkey failed to detect the orientation change and instead continued to maintain fixation on the center of the monitor, it was not rewarded (Figure 1A, “miss” trial). In this study, we focused exclusively on trials in which the target stimulus was presented at the cued location (95% of trials). All figures relate exclusively to trials in which the change occurred at the cued location.

Orientation change detection task at perceptual threshold

(A) Schematic of task structure. The monkey initiated a trial by fixating on the center of the screen. Two Gabor stimuli (represented by oriented lines) were presented for 200 ms and then turned off for 200-400ms. This was repeated until, at an unpredictable time, one of the stimuli changed orientation. The monkey could report having seen the change by making an eye movement to the location of the target stimulus to receive a reward (hit trials). If the monkey failed to report the orientation change and maintained fixation on the center of the screen it was not rewarded (miss trials). Before a block of trials, the monkey was cued as to which stimulus was likely to undergo the change (95% valid cue). In 5% of trials the orientation change occurred at the other location (foil trials). Circles indicating the cued location and receptive field are drawn for figure reference only and were not presented during the task. (B) Example behavioral psychometric function from one recording session and attention condition. Behavioral performance (hit rate, circles) is presented as a function of orientation change. Data were fitted with a logistic function. The threshold condition, trials with performance halfway between the upper and lower asymptotes of the logistic function, is indicated by the orange box. Error bars represent standard deviation calculated with a jackknife procedure (20 jackknives). The square symbol indicates foil trial performance.

On each trial, the magnitude of the orientation change was drawn from a distribution that spanned multiple levels of difficulty. We fit the behavioral data with a logistic function (1) and defined the threshold condition as the orientation change that was closest to the 50% threshold of the fitted psychometric function for that session (Figure 1B, Methods; see Figure 1-figure supplement 2 for additional examples and Figure 1-figure supplement 3A-G for logistic fit parameters). We selected this subset of trials for further analysis, since the constant target stimuli in these trials were equally likely to be perceived or not perceived. Target presentation times were not different between hit and miss trials (Figure 1-figure supplement 3H; p=0.15, Wilcoxon rank sum test). There was a slight difference in threshold trial performance based on time in the session (Figure 1-figure supplement 3I, p<0.01, permutation test). Performance in the middle of the recording session (second and third quartiles) was higher than in the beginning and end of the session (first and fourth quartiles). Monkeys initiated a median of 905 trials per session (range: 651 to 1086).
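
The threshold-selection step described above can be sketched as follows. This is a minimal illustration rather than the fitting code used in the study: the four-parameter logistic form, the parameter names (`lower`, `upper`, `midpoint`, `slope`), and all numerical values are assumptions chosen for the example.

```python
import math

def logistic(x, lower, upper, midpoint, slope):
    """Four-parameter logistic psychometric function (assumed form)."""
    return lower + (upper - lower) / (1.0 + math.exp(-slope * (x - midpoint)))

def threshold_condition(tested_changes, midpoint):
    """Return the tested orientation change closest to the fitted midpoint,
    i.e. the point halfway between the lower and upper asymptotes."""
    return min(tested_changes, key=lambda change: abs(change - midpoint))

# Hypothetical fitted parameters and tested orientation changes (degrees).
params = dict(lower=0.05, upper=0.95, midpoint=5.0, slope=0.8)
tested = [1.0, 2.0, 4.0, 8.0, 16.0]

chosen = threshold_condition(tested, params["midpoint"])
halfway = logistic(params["midpoint"], **params)
```

In this sketch the hit rate predicted at the midpoint lies halfway between the two asymptotes, and the tested orientation change nearest to that midpoint is taken as the threshold condition for the session.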

While monkeys performed this task, single- and multi-unit activity and local field potentials (LFPs) were recorded in area V4 using 16-channel linear array electrodes (Plexon inc., Figure 1-figure supplement 1A-E). The array was inserted perpendicular to the cortical surface and spanned the cortical layers. We used current source density (CSD) analysis (48) to estimate the boundaries between the superficial (I-III), input (IV), and deep (V-VI) cortical layers (Figure 1-figure supplement 1E-F), and assign individual neurons their layer identity (39). Single units were classified as either broad-spiking (putative excitatory neurons) or narrow-spiking (putative inhibitory neurons) on the basis of their waveform width using previously published techniques (peak to trough duration; Figure 1-figure supplement 1D; see Methods; 31, 39, 40, 49, 50). Eye position and pupil diameter were also recorded (ISCAN ETL-200). When analyzing pupil diameter and eye position data, we considered all trials in the threshold condition in which the change occurred at the cued location, regardless of whether the cued location was within the receptive field of the recorded neurons. For all electrophysiological analyses, we only considered trials in which the cued stimulus was within the receptive field of the recorded neurons, and the stimulus change occurred at the cued location.

To assess the behavioral impact of variations in arousal and retinal image stability across trials at the threshold condition, we compared pupil diameter and microsaccade incidence across trial outcomes. Larger pupil diameter is thought to be a proxy for elevated alertness and arousal (8, 14, 15, 51-54). We found that hit trials were associated with larger pupil diameters compared to miss trials, both before and during non-target and target stimulus presentations (Figure 2A). We quantified this difference in the estimation statistics framework (55, 56) by comparing effect sizes and using bootstrapping to estimate uncertainty in the differences. We found that the mean of the distribution of pupil diameters associated with hit trials is greater than that associated with misses (Figure 2B; complementary null hypothesis testing results in Table 1). Prior work has shown that the optimal state for sensory performance occurs at intermediate levels of arousal, with states of low and hyper arousal associated with decreased performance (14-16, 57-60). In both hit and miss trials, the mean pupil diameter was close to the optimal arousal state for perceptual performance (Figure 2C; 15). The average differences in pupil diameter across hit and miss trials reflect differences within the optimal state of intermediate arousal. All results held for individual animals (Figure 2-figure supplement 2A-C). Our results thus suggest that hits are more likely to occur during periods of greater arousal.
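
The bootstrapped estimate of a mean difference described above can be sketched as follows. This is a minimal illustration, assuming a simple percentile bootstrap with independent resampling of each group; the function name, the number of resamples, and the synthetic pupil values are assumptions and do not come from the study.

```python
import random
import statistics

def bootstrap_mean_difference(group_a, group_b, n_boot=2000, seed=0):
    """Percentile-bootstrap estimate of mean(group_a) - mean(group_b).

    Returns (observed_difference, ci_low, ci_high) for a 95% interval.
    """
    rng = random.Random(seed)
    observed = statistics.fmean(group_a) - statistics.fmean(group_b)
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping its original size.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(statistics.fmean(resample_a) - statistics.fmean(resample_b))
    diffs.sort()
    ci_low = diffs[int(0.025 * n_boot)]
    ci_high = diffs[int(0.975 * n_boot) - 1]
    return observed, ci_low, ci_high

# Synthetic, made-up normalized pupil diameters (not data from the study).
data_rng = random.Random(1)
hit_pupil = [data_rng.gauss(0.2, 1.0) for _ in range(300)]
miss_pupil = [data_rng.gauss(0.0, 1.0) for _ in range(300)]
observed, ci_low, ci_high = bootstrap_mean_difference(hit_pupil, miss_pupil)
```

Reporting the observed difference together with its interval, rather than a p-value alone, is the emphasis of the estimation framework.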

Hit trials have larger pupil diameter whereas microsaccades more often precede misses

(A) Normalized pupil diameter for hit and miss trials in the threshold condition. 0 ms corresponds to non-target and target stimulus onset. Mean +/-s.e.m. (B) Distribution of pupil diameter values associated with hit and miss trials. Pupil diameter was averaged from 100ms before to 100ms after non-target and target stimulus onset. Violin plots were generated using kernel smoothing (see Methods). Error bars represent 95% confidence intervals for the mean of each distribution, and the mean difference (blue, right axes). Inset: zoomed in view of the mean difference between hit and miss trials. Black bar represents a 95% confidence interval of the mean difference. Shaded region reflects the distribution of the bootstrapped estimation of the mean difference. (C) Histogram of mean pupil diameter around the time of non-target and target stimulus onset (calculated as in B). Orange and gray lines represent the mean pupil diameter for hit and miss trials respectively. (D) Left: Hit rate for trials with (left, 387 trials) and without (right, 1336 trials) a microsaccade detected in the time window 0-400ms before target onset. Right: Bootstrapped estimation of the mean difference in hit proportion in trials with vs without a pre-target microsaccade. Same conventions as 2B.

Corresponding Null-Hypothesis Testing Results

Microsaccades, small fixational eye movements of <1° in amplitude that occur during normal fixation, are associated with periods of decreased visual sensitivity due to unstable retinal images (61, 62). Microsaccades have been linked to suppressed neural responses in visual areas during perceptual tasks, impairing fine visual discrimination and behavioral performance (63, 64). We grouped trials in the threshold condition based on whether a microsaccade occurred in a 400ms window preceding the onset of the target stimulus. Most trials with a pre-target microsaccade were misses, whereas the majority of trials without a microsaccade in this window were hits (Figure 2D; see Figure 2-figure supplement 2D for individual animal plots). There is a strong link between microsaccade direction and attention deployment (65-71). Consistent with previous reports, we also find that microsaccades toward the attended stimulus were overrepresented in correct trials (Figure 2-figure supplement 1A, upper left). Conversely, microsaccades towards the attended stimulus were underrepresented in incorrect trials (Figure 2-figure supplement 1A, lower left). There was a very low but statistically significant negative correlation between pupil diameter and microsaccade rate (Figure 2-figure supplement 1B, r² = 0.006, p < 0.001). Microsaccade rates and inter-microsaccade times are reported in Figure 2-figure supplement 1C-D. Overall, these results suggest that successful trials at threshold are significantly more likely to occur during a state of greater arousal and improved visual sensitivity.

Having established that hit trials are more likely to occur in states of elevated arousal and visual stability, we investigated whether hits are characterized by differential information processing in V4. We first examined the ability to discriminate target stimuli from non-target stimuli using the firing rates of single and multi-unit V4 neurons in each of the three identified cortical layers (superficial, input, and deep). A linear decoder could better discriminate targets from non-targets in hits compared to misses (Figure 3A; see Figure 3-figure supplement 1 for individual animal plots), suggesting differences in firing rates across these trial types. This improved stimulus discriminability was consistent across all three layers (Figure 3A).

Target stimuli evoke higher firing rates in hit trials

Rows correspond to different layers (top=superficial, middle=input, bottom=deep). (A) Performance for decoding targets from non-targets from single-units and multi-units in each layer. Points in the left section of each plot show the decoding performance for each of the 20 different cross-validations. The right section for each layer shows the bootstrapped estimation of the difference between decoding performance between hits and misses. Half-violin plots show the bootstrapped distribution of the difference, and black dots and bars represent the mean and 95% confidence intervals of the difference in decoding performance. Chance levels, determined by shuffling target and non-target identity, were subtracted from the raw decoding performance values. (B) Non-target population (single and multi-unit) PSTH of visually responsive neurons for the hit (orange) and miss (dark-gray) trials in the threshold condition (mean +/-s.e.m). The horizontal black bar indicates the time and duration of stimulus presentation. (C) As in B but for target stimuli. The star indicates the time at which firing rates in the input layer first differ significantly between hit and miss trials. Vertical lines represent the mean time at which firing rates for each neuron rise above the 95% confidence interval of their baseline activity (see also Figure S3C). (D) Bootstrapped estimation of the paired mean difference in target stimulus-evoked firing rate between hit and miss trials in the time window 60-260ms (red dotted box in C) after target stimulus onset. Shaded regions represent the bootstrapped estimation of the paired mean difference in firing rate (hit - miss), and black lines are 95% confidence intervals. Plots include data from both single and multi-units, separated by layer (top=superficial, middle=input, bottom=deep). (E) As in D, bootstrapped estimation of the paired mean difference in firing rate for hit trials compared to miss trials in the target stimulus-evoked period, but only for single-units broken up by cell class (gold=broad, teal=narrow).

Elevated stimulus-evoked firing rates would indicate a stronger representation of the stimulus that could cause this improved discriminability in hits. We compared the firing rates of all neurons (single and multi-units) recorded in each cortical layer across hit and miss trials. For non-target stimuli, firing rates were equivalent for hits and misses in both the pre-stimulus (0-200ms before stimulus onset) and stimulus-evoked (60-260ms following stimulus onset) periods (Figure 3B; see Figure 3-figure supplement 2 for individual animal plots). For the target stimulus, firing rates were once again equivalent in the pre-stimulus period, but hit trials were characterized by elevated firing across cortical layers in the stimulus-evoked period (Figure 3C-D). Broad and narrow-spiking neurons in both the input and deep layers respond more to target stimuli in hit trials, and trend towards elevated firing rates in the superficial layers during hits (Figure 3E). The average firing rate in response to target stimuli for each neuron is shown in Figure 3-figure supplement 3A for both hit and miss trials. It is important to note that the stimuli presented to the animals were identical for both hits and misses. Moreover, the responses to the target stimuli occur early, and elevated firing in hits emerges at the time of expected V4 response latencies (70-100 ms; Figure 3-figure supplement 3C), and thus cannot be attributed to elevated levels of firing due to subsequent saccade planning in these trials (Figure 3-figure supplement 3B; expected >200ms; 72).

Variability in response reflects how reliably information is encoded by a neural population. Lower baseline variability can enhance the ability of neurons to encode stimulus differences. We calculated the Fano factor, a mean-normalized measure of trial-to-trial variability in firing, for single-units in our population (Figure 4A; see Figure 4-figure supplement 1 for individual animal plots). We find that broad-spiking units in the superficial layer exhibited lower Fano factor during the pre-stimulus period in hit trials (0-60ms before non-target stimulus onset, Figure 4B), indicating this population of neurons fired more reliably when the animal correctly detected the orientation change. This was not the case for broad-spiking neurons in other layers (Figure 4B) or narrow-spiking neurons (Figure 4-figure supplement 2).
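
The Fano factor described above, the variance of spike counts across trials divided by their mean, can be sketched as follows. This is a minimal illustration: the use of the sample variance, the form of the modulation index, (hit - miss) / (hit + miss), and the spike counts are assumptions, since the text here does not specify them.

```python
import statistics

def fano_factor(spike_counts):
    """Variance of spike counts across trials divided by their mean."""
    mean_count = statistics.fmean(spike_counts)
    if mean_count == 0:
        raise ValueError("Fano factor is undefined when the mean count is zero")
    return statistics.variance(spike_counts) / mean_count

def modulation_index(hit_value, miss_value):
    """Assumed index form: (hit - miss) / (hit + miss)."""
    return (hit_value - miss_value) / (hit_value + miss_value)

# Made-up spike counts in a pre-stimulus window, one entry per trial.
hit_counts = [3, 4, 3, 4, 3, 4]
miss_counts = [1, 6, 2, 5, 1, 6]
ff_hit = fano_factor(hit_counts)
ff_miss = fano_factor(miss_counts)
index = modulation_index(ff_hit, ff_miss)
```

Both toy groups have the same mean count, so the lower Fano factor in the hypothetical hit trials reflects only lower trial-to-trial variability, and the index is negative.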

Broad-spiking neurons in the superficial layer have decreased variability in hit trials

(A) Rows correspond to different layers (top=superficial, middle=input, bottom=deep). The Fano factor of broad-spiking putative excitatory neurons for the hit and miss trials in the threshold condition (mean +/-s.e.m). There is a significant decrease in variability for the hit trials prior to stimulus onset only in the superficial layer. 0 ms corresponds to non-target stimulus onset. The average Fano factor within a 60ms time-window (red dashed box) prior to non-target stimulus onset is plotted in B. (B) Top: Fano factor modulation index for each broad-spiking neuron recorded in each layer, averaged in the 60ms preceding non-target stimulus onset. Bottom: Bootstrapped estimation of the mean difference of the Fano factor modulation index from zero in each of the three layers. Colored curves represent the estimated bootstrapped distribution. Black dots and lines reflect the mean and 95% confidence intervals of the distributions.

We next wanted to test how the relationship between spiking activity and LFPs may differ across hits and misses. Spike-LFP synchrony can reflect cortical processing and both within- and inter-areal coordination (73-75). We calculated the pairwise phase consistency (PPC; 76), a frequency-resolved measure of spike-LFP phase-locking, for single and multi-units relative to their local LFP signal during the pre-stimulus period (0-200ms before non-target stimulus onset, Figure 5A; see Figure 5-figure supplement 2 for individual animal plots). We averaged PPC values at low (3-12Hz), medium (15-25Hz), and high (30-80Hz) frequency bands (superficial & input: Figure 5-figure supplement 1A-B; deep: Figure 5B & Figure 5-figure supplement 1C).
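
For a single frequency, the PPC can be sketched as the mean cosine of the phase difference over all distinct pairs of spikes. The example below is a minimal illustration, assuming the LFP phase at each spike time has already been extracted; the function name and the phase values are hypothetical, and the frequency decomposition itself is not shown.

```python
import cmath
import math

def pairwise_phase_consistency(phases):
    """Mean cosine of the phase difference over all distinct spike pairs.

    `phases` holds the LFP phase (radians) at each spike time.
    Uses the closed form (|sum of unit vectors|^2 - N) / (N * (N - 1)).
    """
    n = len(phases)
    if n < 2:
        raise ValueError("PPC needs at least two spikes")
    resultant = sum(cmath.exp(1j * phase) for phase in phases)
    return (abs(resultant) ** 2 - n) / (n * (n - 1))

# Identical phases: perfectly consistent spike-LFP relationship.
locked = pairwise_phase_consistency([0.5, 0.5, 0.5, 0.5])
# Phases spread evenly around the cycle: no consistent relationship.
spread = pairwise_phase_consistency([0.0, 2 * math.pi / 3, 4 * math.pi / 3])
```

Because the measure averages over spike pairs rather than over spikes, it can fall below zero for small samples, which is why a PPC of 0 is the natural reference level.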

Deep layer neurons are phase-locked to low-frequency rhythms in miss trials

(A) Pairwise phase consistency (PPC) of single and multi-units in each layer to the local field potential (LFP) signal recorded from the same channel in hit and miss trials at threshold. PPC was calculated in the pre-stimulus period (0-200 ms before stimulus onset). Dashed red line indicates a PPC of 0, below which there is no consistent relationship between spikes and LFP phase. (B) Bootstrapped estimation plot for the paired mean difference in PPC for deep layer neurons over three frequency bands: 3-12Hz, 15-25Hz, 30-80Hz. Curves represent the bootstrapped distribution for the paired difference, and black dots and vertical lines represent the mean and 95% confidence intervals for the paired mean difference.

Our results at the individual neuron or neural-sub-population levels suggest enhanced processing of perceived stimuli. However, it is the concerted activity among neural sub-populations that ultimately determines information flow through the laminar cortical circuit. We turned to canonical correlation analysis (CCA) to investigate the strength of feed-forward communication across layers (77). CCA has previously been used to describe interactions among multiple cortical areas (78). We performed CCA on each pair of layers: input to superficial, input to deep, and superficial to deep, where the two elements in each pair correspond to the upstream and downstream layers respectively (Figure 6A). We refer to the results of CCA as population correlations. Interlaminar feed-forward population correlations were higher in hits than in misses in both the pre-stimulus and stimulus-evoked periods (Figure 6B-C). This suggests that feed-forward information flow through the column is more effective in hits than in misses. To further investigate interlaminar communication, we analyzed interlaminar synchrony as a signature of differential information flow between hit and miss trials. Spike-spike coherence (SSC) is a frequency-resolved measure of the degree to which two spike trains fluctuate together (26, 79). We measured interlaminar SSC for spike trains from pairs of cortical layers, each spike train comprising all recorded action potentials in a given layer (see Methods). We computed interlaminar SSC separately for hit and miss trials in both the pre-stimulus (0-200ms before non-target stimulus onset, Figure 7A) and non-target stimulus-evoked (60-260ms after non-target stimulus onset, Figure 7C) periods, matching the firing rates across hit and miss trials separately for the pre-stimulus and non-target stimulus-evoked conditions (see Figure 7-figure supplement 1 for individual animal plots). We averaged SSC for each pair of layers across three frequency bands, 3-12Hz, 15-25Hz, and 30-80Hz (Figure 7B and D).

Hit trials are characterized by stronger feed-forward interlaminar population correlations

(A) CCA-based population correlation as a function of time and inter-laminar delay during the pre-stimulus and stimulus-evoked periods in hit and miss trials in an example session. (B) Mean feed-forward population correlation in each session. Color indicates the monkey (blue=monkey A, yellow=monkey C). (C) Bootstrapped estimation plot for the paired mean difference in population correlation for each pair of layers and time window (pre-stimulus or stimulus-evoked). Curves represent the bootstrapped distribution for the paired difference, and black dots and vertical lines represent the mean and 95% confidence intervals for the paired mean difference.

Greater interlaminar coherence in hit trials in the pre-stimulus and non-target stimulus-evoked periods

Rows correspond to different pairs of layers (top=superficial-input, middle=superficial-deep, bottom=input-deep). (A) Multi-unit interlaminar spike-spike coherence (SSC) calculated in the 200ms before non-target stimulus onset in hit and miss trials (solid lines, mean +/-s.e.m). Firing rates were matched across hit and miss trials. Dashed lines represent coherence calculated with shuffled trial identities (mean +/-s.e.m). (B) Bootstrapped estimation plot for the paired mean difference in SSC for each pair of layers averaged over three frequency bands: 3-12Hz, 15-25Hz, 30-80Hz. Curves represent the bootstrapped distribution for the paired difference, and black dots and vertical lines represent the mean and 95% confidence intervals for the paired mean difference. (C) Interlaminar spike-spike coherence in the non-target stimulus-evoked period (60-260ms after stimulus onset). Same conventions as in A. (D) Bootstrapped estimation plot for the paired mean difference in SSC for each pair of layers averaged over three frequency bands. Same conventions as B.

Overall, hit trials have greater interlaminar SSC compared to misses at almost all frequencies (Figure 7B and D). In the pre-stimulus period, the strongest SSC difference between hits and misses was observed between the superficial and deep layers across all frequencies (Figure 7B, middle panel). This implies greater synchrony of the output layers of the cortex during hit trials. During the non-target stimulus-evoked period, the difference was in the same direction but was stronger in the other layer pairs, with greater SSC differences found in pairs that involve the input layer (Figure 7D, top and bottom). This may reflect a higher degree of stimulus-driven feed-forward information propagation during hit trials. When comparing across time (pre-stimulus vs non-target stimulus-evoked), layers, and frequency bands, there was a significant interaction effect of layer pair and time window (three-way ANOVA, p = 0.0075).

Finally, we sought to compare the predictive power of our results on the monkey’s perceptual performance. We created a generalized linear model (GLM) to regress behavioral outcome from the pupil diameter, number of microsaccades in the pre-target window, and average target-evoked multi-unit firing rate in each of the three layers (see Methods; 80). Other reported measures (Fano factor, PPC, interlaminar population correlations, and SSC) that we could not estimate reliably on a single trial basis were not considered in the GLM analysis. Pre-target microsaccades were by far the strongest predictor of performance (weight = -1.3116; p = 6.0757e-08). Input layer firing rate also significantly predicted perception (weight = 0.3276; p = 0.020068). Superficial firing rate, deep firing rate, and pupil diameter were not significant predictors (Table 2, all p > 0.5). This indicates that, among the variables that we could estimate reliably on a single trial basis, stable retinal images in the pre-target window are critical for behavioral performance, and elevated firing in the input layer is the most reliable physiological signature of a perceived stimulus. GLM fit parameters can be found in Table 3.
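
A regression of a binary hit or miss outcome on trial-wise predictors can be sketched as follows. This is a minimal illustration, assuming a logistic link fitted by plain gradient descent; the link function, the fitting procedure, the function name, and the toy trials are all assumptions, and the model actually used is specified in the Methods.

```python
import math

def fit_logistic_glm(features, outcomes, learning_rate=0.5, n_steps=5000):
    """Fit a logistic regression by batch gradient descent.

    features: list of per-trial predictor lists; outcomes: 1 = hit, 0 = miss.
    Returns (intercept, weights).
    """
    n_trials, n_features = len(features), len(features[0])
    intercept, weights = 0.0, [0.0] * n_features
    for _ in range(n_steps):
        grad_intercept, grad_weights = 0.0, [0.0] * n_features
        for row, outcome in zip(features, outcomes):
            linear = intercept + sum(w * x for w, x in zip(weights, row))
            # Prediction error: predicted hit probability minus outcome.
            error = 1.0 / (1.0 + math.exp(-linear)) - outcome
            grad_intercept += error
            for j, x in enumerate(row):
                grad_weights[j] += error * x
        intercept -= learning_rate * grad_intercept / n_trials
        weights = [w - learning_rate * g / n_trials
                   for w, g in zip(weights, grad_weights)]
    return intercept, weights

# Toy trials with one predictor: pre-target microsaccade (1) or none (0).
features = [[0]] * 20 + [[1]] * 20
outcomes = [1] * 15 + [0] * 5 + [1] * 5 + [0] * 15
intercept, weights = fit_logistic_glm(features, outcomes)
```

In this toy dataset hits are less frequent after a microsaccade, so the fitted weight on the microsaccade predictor is negative, the same sign as the weight reported above.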

GLM Coefficient Values

GLM Summary

Discussion

We investigated the physiological processes responsible for variable behavioral outcomes at perceptual threshold. Controlling for both the attentive instruction (thus minimizing large-scale attentional effects) and the stimulus condition that elicited performance at a threshold level allowed us to examine the physiological and neural correlates that underlie correct versus incorrect behavioral outcomes. While this study cannot disentangle the independent roles of behavioral state fluctuations and neural fluctuations in determining behavioral outcomes, evidence suggests that differences in both are associated with hits. We found multiple lines of evidence which suggest that a state of higher arousal and eye-position stability and the accompanying enhanced processing of visual stimuli contribute to accurate perception in hit trials (Figure 8).

Conceptual model for stimulus processing at perceptual threshold

(A) Hit trials have a larger pupil diameter and fewer pre-target microsaccades, reflecting a state of increased arousal and greater eye position stability. Conversely, miss trials show decreased arousal and eye position stability. (B) In the spontaneous pre-stimulus period, hits are characterized by decreased variability in superficial layer broad-spiking neurons, which we hypothesize is reflective of lower membrane potential (Vm) variability (inset). Hit trials are also characterized by greater synchrony between the superficial and deep layers (indicated by thicker arrows), which could be reflecting a stronger top-down influence on the cortical column. (C) In the stimulus-evoked period there is greater interlaminar synchrony between pairs that include the input layer (represented by thicker arrows), which we propose reflects improved feed-forward propagation of information. We propose these state differences in hits contribute to elevated firing rates in response to target stimuli, particularly in the superficial layers (inset), resulting in a higher-fidelity output to downstream areas. E = excitatory; I = inhibitory; s = superficial; i = input; d = deep.

Pupil diameter is elevated in hit trials (Figure 2A-C; Figure 8A), and prior studies have shown that pupil diameter is strongly linked to arousal and alertness (51, 52, 54). This provides evidence that a state of higher arousal may contribute to improved sensory processing. The much lower hit rate in trials with a microsaccade preceding the target (Figure 2D; Figure 8A) and our GLM analysis show that stability of retinal images is critical for accurate discrimination at threshold. It is unlikely that these two measures are reflecting the same phenomenon, as there is a very weak correlation between them over the course of a trial (Figure 2-figure supplement 1B).

There is a strong link between oculomotor control and attentional deployment (81-83). In this study, hits and misses differ in their behavioral responses, with hit trials being characterized by a saccade to the target stimulus. Almost all of our neural results reflect differences around the time of non-target stimulus presentations during which the monkeys maintained fixation at the center of the screen and, therefore, were hundreds of milliseconds prior to saccade planning and execution in the case of hit trials. Trials in which saccades were made to non-target stimuli were excluded from analysis, as were trials in which the monkey made a saccade to the target too soon after its presentation to have been a behavioral response to stimulus perception (see Methods). The analysis of microsaccade occurrence focused on the window just before target stimulus presentation and before monkeys could begin oculomotor planning. Only the analysis of neural responses to target stimuli appears in conjunction with divergent oculomotor behavior between the hit (saccade) and miss (no saccade) trials. However, here too firing rates diverge much earlier, particularly in the input layer, than would be consistent with the effects of saccade planning (Figure 3-figure supplement 1B; 72).

Non-target stimulus contrasts were slightly different between hits and misses (mean: 33.1% in hits, 34.0% in misses, permutation test, p = 0.02), but the contrast of the target was higher in hits compared to misses (mean: 38.7% in hits, 27.7% in misses, permutation test, p = 1.6e-31). Firing rates were normalized by contrast in Figure 3. In all other figures, we considered only non-target stimuli, which had very minor differences in contrast (<1%) across hits and misses. While we cannot completely rule out any other effects of stimulus contrast, the normalization in Figure 3 and minor differences for non-target stimuli should minimize them.

A body of evidence (see 84 for review) suggests that microsaccades directed toward a target stimulus reflect attention-related processing and performance (65–71). In our dataset, during the pre-target period, microsaccades towards the attended stimulus were overrepresented in correct trials (Figure 2-figure supplement 1A, upper left). Conversely, microsaccades towards the attended stimulus were underrepresented in incorrect trials (Figure 2-figure supplement 1A, lower left). Microsaccades directed towards the location of the eventual target may reflect elevated attentional deployment that can compensate for reduced sensitivity due to a higher incidence of microsaccades.

Our electrophysiological findings and their laminar patterns associated with hit trials within a cued attention state mirror several previous findings that are associated with the deployment of covert spatial attention. Attention has long been known to increase firing rates in V4 (21, 25, 85), and there is evidence that this increase occurs in all cortical layers in V4 (39). We find improved target vs non-target discriminability in hits (Figure 3A) across all cortical layers. Additionally, elevated target-evoked firing rates in hits occur across all layers in conjunction with elevated arousal (Figure 3B-D; Figure 8C). Attention reduces the variability in the firing of V4 neurons, and this reduction is thought to contribute to the improved information coding capacity of a population of neurons (24–26, 39, 86). The reduction in Fano factor among broad-spiking superficial-layer neurons in hit trials mirrors the effects of attention (Figure 4). Multiple lines of evidence suggest that broad- and narrow-spiking units correspond to putative excitatory and inhibitory neurons, respectively. Narrow-spiking neurons exhibit higher firing rates, which corresponds well with the properties of inhibitory interneurons (31, 39, 40, 87–89). Repolarization times in broad-spiking neurons are also longer, as they are in excitatory pyramidal neurons (40, 50, 90). Since broad-spiking superficial-layer neurons are putative projection neurons to downstream cortical areas, this reduction in Fano factor may indicate increased reliability in stimulus encoding that could contribute to hits. Our finding is also in agreement with previous reports of higher variability in representations of unperceived stimuli in humans (91). Synchronous neural activity appears to modulate perceptual and cognitive ability in a variety of contexts (92–95). We found that deep-layer neurons exhibit less low-frequency phase-locking in hit trials (Figure 5). This is consistent with prior studies that find an attention-mediated reduction in the power spectrum of the spike-triggered-averaged LFP (93).

In examining interlaminar population synchrony, we found that hit trials were characterized by stronger feed-forward interactions across the cortical column (Figure 6). This state of improved inter-laminar information flow could be a result of neuromodulatory or top-down processes that maintain the cortex in a state of sustained depolarization corresponding to a state of higher arousal during hits (8, 15). Our examination of inter-laminar synchrony revealed two interesting and complementary patterns: hits were associated with greater coherence between the superficial and deep layers during spontaneous activity in the pre-stimulus period (Figure 7A-B; Figure 8B); in contrast, we found enhanced coherence between the input layer and both the output layers (superficial and deep) in the stimulus-evoked period during hits (Figure 7C-D; Figure 8C). Increased superficial-deep coherence in the pre-stimulus period could be the result of the same neuromodulatory or top-down processes. Increased synchrony between the input layer and the output layers during the stimulus-evoked period provides further evidence of stronger information propagation through the cortical circuit, and hence is consistent with improved stimulus detection (96). In contrast to broad global synchrony or local correlated fluctuations, which may signal a default state of minimal processing or decreased information coding capacity (26, 97–99), these patterns of interlaminar coherence that we found suggest that successful perception at threshold is mediated by pathway-specific modulation of information flow through the laminar cortical circuit.

Prior studies showing decreased correlations under attention typically do not contain laminar information (24, 26) or only consider decreased correlations within a layer (39). In contrast, the correlation and synchrony analyses presented here are interlaminar, which we expect could reflect improved information processing in a column, similar to principles of communication across areas (78).

Taken together, our results provide insight about how information about a threshold stimulus may successfully propagate through a cortical column and influence sensory perception. Lower baseline variability among broad-spiking superficial layer neurons and decreased low-frequency synchronous activity in the deep layers could be indicative of improved capacity to encode sensory information. Higher target-evoked firing rates and elevated interlaminar synchrony could enhance the propagation of this encoded signal. These results associate pre-stimulus baseline state differences with enhanced cortical processing in the stimulus-evoked period.

Several studies have examined how information flow differs for perceived and unperceived stimuli at a more macroscopic scale (4, 9). van Vugt et al. (9) recorded from three brain regions, V1, V4, and dorsolateral prefrontal cortex (dlPFC), while a monkey performed a stimulus detection task at threshold. Their work supports the model that feedforward propagation of sensory information from the visual cortex to the PFC causes a non-linear “ignition” of association areas resulting in conscious perception (100). Herman et al. (4) found that conscious human perception triggers a wave of activity propagation from occipital to frontal cortex while switching off default mode and other networks. Our study provides insight into the functions of the cortical microcircuit at the columnar level that could reflect these large-scale sweeping activity changes in perception.

Overall, we identified substantial layer-specific differences in cortical activity between hits and misses at perceptual threshold, leading to the following conceptual model (Figure 8). During spontaneous activity, the state of elevated arousal and eye position stability during hits (Figure 8A) is manifested by increased interlaminar synchrony between the superficial and deep layers (Figure 8B, thicker orange arrows), which we propose is due to top-down influences. We predict that decreased firing variability in broad-spiking neurons in the superficial layer is caused by a lower variability in membrane potential closer to the action potential threshold among these neurons (Figure 8B, inset). Elevated feed-forward propagation in the stimulus-evoked period (Figure 8C) and a membrane potential closer to action potential threshold could both contribute to higher firing rates in the output layers of the cortex (Figure 8C, inset), and are indicative of greater fidelity of stimulus processing in hits. These physiological differences in the laminar microcircuit likely contribute to successful perceptual discrimination at threshold.

Materials & methods

Surgical Procedures

Surgical procedures have been described in detail previously (39, 101, 102). In brief, an MRI-compatible low-profile titanium chamber was placed over the pre-lunate gyrus, on the basis of preoperative MRI in two rhesus macaques (right hemisphere in Monkey A, left hemisphere in Monkey C). The native dura mater was then removed, and a silicone-based optically clear artificial dura (AD) was inserted, resulting in an optical window over dorsal V4 (Figure 1-figure supplement 1A, B). Antibiotic-soaked (amikacin or gentamicin) gauze was placed in the chamber between recording sessions to prevent bacterial growth. All procedures were approved by the Institutional Animal Care and Use Committee and conformed to NIH guidelines.

Electrophysiology

At the beginning of each recording session a plastic insert, with an opening for targeting electrodes, was lowered into the chamber and secured. This served to stabilize the recording site against cardiac pulsations. Neurons were recorded from cortical columns in dorsal V4 using 16-channel linear array electrodes (‘laminar probes’, Plexon Inc., Plexon V-probe). The laminar probes were mounted on adjustable X-Y stages attached to the recording chamber and positioned over the center of the pre-lunate gyrus under visual guidance through a microscope (Zeiss Inc., Figure 1-figure supplement 1C). This ensured that the probes were maximally perpendicular to the surface of the cortex and thus had the best possible trajectory to make a perpendicular penetration down a cortical column. Across recording sessions, the probes were positioned over different sites along the center of the gyrus in the parafoveal region of V4 with receptive field (RF) eccentricities between 2 and 7 degrees of visual angle. Care was taken to target cortical sites with no surface micro-vasculature, with surface micro-vasculature used as reference so that the same cortical site was not targeted across recording sessions. The probes were advanced using a hydraulic microdrive (Narishige Inc.) to first penetrate the AD and then through the cortex under microscopic visual guidance. Probes were advanced until the point that the top-most electrode (toward the pial surface) registered local field potential (LFP) signals. At this point, the probe was retracted by about 100-200 μm to ease the dimpling of the cortex due to the penetration. This procedure greatly increased the stability of the recordings and increased the neuronal yield in the superficial electrodes.

The distance from the tip of the probes to the first electrode contact was either 300 μm or 700 μm. The inter-electrode distance was 150 μm, thus minimizing the possibility of recording the same neural spikes in adjacent recording channels. Electrical signals were recorded extracellularly from each channel. These were then amplified, digitized and filtered either between 0.5 Hz and 2.2 kHz (LFPs) or between 250 Hz and 8 kHz (spikes) and stored using the Multichannel Acquisition Processor system (MAP system, Plexon Inc.). Spikes and LFPs were sampled at 40 and 10 kHz respectively. LFP signals were further low-pass filtered with a 6th order Butterworth filter with 300Hz cut-off and down-sampled to 1 kHz for further analysis. Spikes were classified as either multi-unit clusters or isolated single units using the Plexon Offline Sorter software program. Single units were identified based on two criteria: (a) if they formed an identifiable cluster, separate from noise and other units, when projected into the principal components of waveforms recorded on that electrode and (b) if the inter-spike interval (ISI) distribution had a well-defined refractory period. Single-units were classified as either narrow-spiking (putative interneurons) or broad-spiking (putative pyramidal cells) based on methods described in detail previously (25, 39). Specifically, only units with waveforms having a clearly defined peak preceded by a trough were potential candidates. The distribution of trough-to-peak duration was clearly bimodal (Hartigan’s Dip Test, p = 0.012) (103). Units with trough-to-peak duration less than 225 μs were classified as narrow-spiking units; units with trough-to-peak duration greater than 225 μs were classified as broad-spiking units (Figure 1-figure supplement 1D; teal=narrow, gold=broad).
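The waveform-based classification described above can be illustrated with a minimal sketch. It assumes a mean waveform sampled at 40 kHz with a trough followed by a peak; the function names are hypothetical, and assigning a duration of exactly 225 μs to the broad-spiking class is an assumption, since the text only specifies "less than" and "greater than".

```python
def trough_to_peak_us(waveform, fs_hz=40_000):
    """Trough-to-peak duration (microseconds) of one mean spike waveform."""
    # Trough: index of the global minimum of the waveform.
    trough_idx = min(range(len(waveform)), key=lambda i: waveform[i])
    later = range(trough_idx + 1, len(waveform))
    if len(later) == 0:
        return None  # no samples after the trough, so no peak can be defined
    # Peak: largest value that follows the trough.
    peak_idx = max(later, key=lambda i: waveform[i])
    return (peak_idx - trough_idx) / fs_hz * 1e6


def classify_unit(waveform, fs_hz=40_000, boundary_us=225.0):
    """Label a unit 'narrow' or 'broad' using the 225 us boundary."""
    duration = trough_to_peak_us(waveform, fs_hz)
    if duration is None:
        return "unclassified"
    # Assumption: a duration exactly at the boundary is treated as broad.
    return "narrow" if duration < boundary_us else "broad"
```

At 40 kHz each sample spans 25 μs, so in practice waveforms would likely be interpolated before measuring the duration; that step is omitted here.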

Data were collected over 32 sessions (23 sessions in Monkey A, 9 in Monkey C), yielding a total of 413 single units (146 narrow-spiking, 267 broad-spiking) and 296 multi-unit clusters. Per-session unit yield was considerably higher in Monkey C compared to Monkey A, resulting in a roughly equal contribution of both monkeys toward the population data.

Task, Stimuli and Inclusion Criteria

Stimuli were presented on a computer monitor placed 57 cm from the eye. Eye position was continuously monitored with an infrared eye tracking system (ISCAN ETL-200). Trials were aborted if eye position deviated more than 1° (degree of visual angle, ‘dva’) from fixation. Experimental control was handled by NIMH Cortex software (http://www.cortex.salk.edu/). Eye-position (all sessions) and pupil diameter (18/32 sessions) data were concurrently recorded and stored using the MAP system.

Receptive Field Mapping

At the beginning of each recording session, neuronal RFs were mapped using subspace reverse correlation in which Gabor (eight orientations, 80% luminance contrast, spatial frequency 1.2 cycles/degree, Gaussian half-width 2°) or ring stimuli (80% luminance contrast) appeared at 60 Hz while the monkeys maintained fixation. Each stimulus appeared at a random location selected from an 11x11 grid with 1° spacing in the appropriate visual quadrant. Spatial receptive field maps were obtained by applying reverse correlation to the evoked local field potential (LFP) signal at each recording site. For each spatial location in the 11x11 grid, we calculated the time-averaged power in the stimulus-evoked LFP (0-200ms after each stimulus flash) at each recording site. The resulting spatial map of LFP power was taken as the spatial RF at the recording site. For the purpose of visualization, the spatial RF maps were smoothed using spline interpolation and displayed as stacked contour plots of the smoothed maps (Figure 1-figure supplement 1G). All RFs were in the lower visual quadrant (lower-left in Monkey A, lower-right in Monkey C) and with eccentricities between 2 and 7 dva.

Current Source Density Mapping

In order to estimate the laminar identity of each recording channel, we used a current source-density (CSD) mapping procedure (48). Monkeys maintained fixation while 100% luminance contrast ring stimuli were flashed (30ms) centered at the estimated RF overlap region across all channels. The size of the ring was scaled to about three-quarters of the estimated diameter of the RF. CSD was calculated as the second spatial derivative of the flash-triggered LFPs (Figure 1-figure supplement 1E). The resulting time-varying traces of current across the cortical layers can be visualized as CSD maps (Figure 1-figure supplement 1F; maps have been spatially smoothed with a Gaussian kernel for aid in visualization). Red regions depict current sinks in the corresponding region of the cortical laminae; blue regions depict current sources. The input layer (Layer 4) was identified as the first current sink followed by a reversal to current source. The superficial (Layers 1-3) and deep (Layers 5-6) layers had opposite sink-source patterns. LFPs and spikes from the corresponding recording channels were then assigned to one of three layers: superficial, input or deep.
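As an illustration of the second spatial derivative, the following sketch applies a three-point finite difference across adjacent channels of the flash-triggered LFP. It assumes the 150 μm contact spacing of the probes; the leading minus sign (sinks negative) and the omission of tissue conductivity are conventions assumed here, not details taken from the text, and the function name is hypothetical.

```python
def csd_second_difference(lfp, spacing_um=150.0):
    """Estimate CSD from trial-averaged LFPs.

    lfp: list of channels ordered by depth, each a list of samples over time.
    Returns one CSD trace per interior channel (len(lfp) - 2 traces),
    in arbitrary units (conductivity is not included).
    """
    h_mm = spacing_um * 1e-3  # inter-contact distance in mm
    n_channels = len(lfp)
    n_samples = len(lfp[0])
    csd = []
    for ch in range(1, n_channels - 1):
        trace = [
            # Assumed sign convention: CSD = -d2(phi)/dz2
            -(lfp[ch - 1][t] - 2.0 * lfp[ch][t] + lfp[ch + 1][t]) / (h_mm * h_mm)
            for t in range(n_samples)
        ]
        csd.append(trace)
    return csd
```

The two outermost channels have no estimate because the three-point difference needs a neighbor on each side.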

Attention task

In the main experiment, monkeys had to perform an attention-demanding orientation change-detection task (Figure 1A). While the monkey maintained fixation, two achromatic Gabor stimuli (orientation optimized per recording session, spatial frequency 1.2 cycles/degree, Gaussian half-width 2 degrees, 6 contrasts randomly chosen from a uniform distribution of luminance contrasts, Monkey A: contrast = [10, 18, 26, 34, 42, 50%], Monkey C: contrast (8 sessions) = [20, 28, 36, 44, 52, 60%] or contrast (1 session) = [30, 40, 50, 60, 70, 80%]) were flashed on for 200 ms and off for a variable period chosen from a uniform distribution between 200-400 ms. One of the Gabors was flashed at the receptive field overlap region, the other at a location of equal eccentricity across the vertical meridian. The range of stimulus contrasts was the same at both locations. At the beginning of a block of trials, the monkey was spatially cued (‘instruction trials’) to covertly attend to one of these two spatial locations. During these instruction trials, the stimuli were only flashed at the spatially cued location. No further spatial cue was presented during the rest of the trials in a block. At an unpredictable time drawn from a truncated exponential distribution (minimum 1 s, maximum 5 s, mean 3 s), one of the two stimuli changed in orientation. The monkey was rewarded for making a saccade to the location of orientation change, but only if the saccade onset time was within a window of 100-400 ms after the onset of the orientation change. The orientation change occurred at the cued location with 95% probability and at the uncued location with 5% probability (‘foil trials’). We controlled task difficulty by varying the degree of orientation change (Δori), which was randomly chosen from one of the following: 1, 2, 3, 4, 6, 8, 10 and 12°. The orientation change in the foil trials was fixed at 4°. These foil trials allowed us to assess the extent to which the monkey was using the spatial cue, with the expectation that there would be an impairment in performance and slower reaction times compared to the case in which the change occurred at the cued location. If no change occurred before 5s, the monkey was rewarded for maintaining fixation (‘catch trials’, 13% of trials). We refer to all stimuli at the baseline orientation as ‘non-targets’ and the stimulus flash with the orientation change as the ‘target’. Monkeys initiated a median of 905 trials (range of 651 to 1086).

Inclusion criteria

Of the 413 single units, we included only a subset of neurons that were visually responsive for further analysis. For each neuron we calculated its baseline firing-rate for each attention condition (attend into RF [‘attend-in’ or ‘IN’], attend away from RF [‘attend-away’ or ‘AWAY’]) from a 200ms window before a stimulus flash. We also calculated the neuron’s contrast response function for both attention conditions (Figure 1-figure supplement 1H). This was calculated as the firing rate over a window between 60-200 ms after stimulus onset and averaged across all stimulus flashes (restricted to non-targets) of a particular contrast separately for each attention condition. A neuron was considered visually responsive if any part of the contrast response curves exceeded the baseline rate by 4 standard deviations for both attention conditions. This left us with 274 single units (84 narrow-spiking, 190 broad-spiking) and 217 multi-unit clusters for further analysis.

Data analysis

Behavioral Analysis

For each orientation change condition Δori, we calculated the hit rate as the ratio of the number of trials in which the monkey correctly identified the target by making a saccadic eye-movement to the location of the target over the number of trials in which the target was presented. The hit rate as a function of Δori yields a behavioral psychometric function (Figure 1B). We performed this analysis independently for each recording day for each monkey, yielding a similar but distinct psychometric function for every session. Psychometric functions were fitted with a smooth logistic function (1). Error bars were obtained by a jackknife procedure (20 jackknives, 5% of trials left out for each jackknife). Performance for the foil trials was calculated similarly as the hit rate for trials in which the orientation change occurred at the un-cued location (Figure 1B, square symbol). For each fitted psychometric function in both the attend-in and attend-away conditions, we calculated the threshold of the fitted logistic function (i.e. the Δori at which performance was mid-way between the lower and upper asymptotes). Because the threshold of the fitted function always lies somewhere on the axis of Δori, but not exactly at an orientation change presented to the subject, we then defined the threshold condition as the subset of trials in which the orientation change of the target stimulus was closest to the threshold of the fitted function (Figure 1B). We restricted further analysis to this threshold condition. For this threshold condition we identified the trials in which the monkey correctly identified the target as ‘hit’ trials and those in which the monkey failed to identify the target as ‘miss’ trials. Analysis of behavior, pupil diameter, and microsaccades was conducted on both the attend-in and attend-away conditions; all electrophysiological analysis was applied only to the attend-in condition.
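The selection of the threshold condition can be sketched as follows. The four-parameter logistic shown here is one common parameterization in which the threshold is the midpoint between the asymptotes; it is an assumption, as are all function and parameter names, and the fitting step itself is omitted.

```python
import math
from collections import defaultdict


def logistic(delta_ori, lower, upper, threshold, slope):
    """Assumed logistic form; equals (lower + upper) / 2 at delta_ori == threshold."""
    return lower + (upper - lower) / (1.0 + math.exp(-slope * (delta_ori - threshold)))


def hit_rates(trials):
    """trials: iterable of (delta_ori_deg, was_hit) for target-present trials."""
    counts = defaultdict(lambda: [0, 0])  # delta_ori -> [hits, presented]
    for delta_ori, was_hit in trials:
        counts[delta_ori][0] += int(was_hit)
        counts[delta_ori][1] += 1
    return {d: hits / total for d, (hits, total) in sorted(counts.items())}


def threshold_condition(tested_delta_oris, fitted_threshold_deg):
    """Tested orientation change closest to the fitted threshold."""
    return min(tested_delta_oris, key=lambda d: abs(d - fitted_threshold_deg))
```

For example, with the tested values 1, 2, 3, 4, 6, 8, 10 and 12° and a hypothetical fitted threshold of 5.2°, the threshold condition would be the 6° trials.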

To compare the effect of target timing across hits and misses, we determined the time between trial initiation and target presentation in all trials in the threshold condition. For comparison purposes, the contrast values of all target and non-target stimuli presented in the threshold condition were compared in hit and miss trials using a permutation test.

Pupil Diameter

The raw pupil diameter measurements from the infrared eye-tracking system could differ across days due to external factors such as display monitor illumination. To control for this, we normalized the raw data by a Z-score procedure separately for each session (using the mean and standard deviation of all measurements during the session). We analyzed normalized pupil diameter traces for hit and miss trials in the threshold condition, over a time window from 100 ms before to 100 ms after all stimulus presentations (non-target and target), excluding the first stimulus presentation in a trial. The first stimulus was excluded to avoid pupil diameter changes due to the pupillary near response caused by acquiring fixation (104). The pupil diameter was averaged over this time period and compared across conditions using bootstrap estimation and t-test. Distribution violin plots were generated using kernel density estimation (105) (bandwidth(hit) = 0.0801, bandwidth(miss) = 0.0648).

Microsaccade Analysis

Saccadic eye-movements were detected using ClusterFix (106). We identified microsaccades by filtering for eye movements with amplitudes between 0.1 and 1 degree of visual angle. We then split all trials in the threshold condition into two groups: those in which a microsaccade was detected in the 400ms preceding the target stimulus presentation, and those without a detected microsaccade. We calculated the hit rate for trials within those two groups. For all trials in which a target stimulus was presented at the attended location, we determined the direction of all microsaccades in the 400ms period preceding target presentation, relative to both the attended and unattended stimuli. The relative microsaccade direction was defined as the angle between two vectors: the one defined by the eye positions at the beginning and end of the microsaccade, and the vector from the initial eye position to the center of the stimulus (calculated separately for attended and unattended stimuli). Relative microsaccade directions were grouped into 12 bins from 0-360°. The distributions of relative microsaccade directions were calculated separately for correct and incorrect trials, relative to both the attended and unattended stimuli (Figure 2-figure supplement 1A).
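A minimal sketch of the relative-direction computation is given below. Positions are (x, y) pairs in degrees of visual angle; the counterclockwise sign convention and the placement of bin edges at multiples of 30° starting from 0° are assumptions, and the function names are hypothetical.

```python
import math


def relative_direction_deg(start, end, stimulus_center):
    """Angle (0-360 deg) of the microsaccade vector relative to the
    vector from the initial eye position to the stimulus center."""
    ms_angle = math.atan2(end[1] - start[1], end[0] - start[0])
    stim_angle = math.atan2(stimulus_center[1] - start[1],
                            stimulus_center[0] - start[0])
    return math.degrees(ms_angle - stim_angle) % 360.0


def direction_bin(angle_deg, n_bins=12):
    """Index of the 30-degree bin containing angle_deg (assumed edges at 0, 30, ...)."""
    # The trailing modulo guards against a floating-point result of exactly 360.0.
    return int(angle_deg // (360.0 / n_bins)) % n_bins
```

A microsaccade pointing almost straight at the stimulus falls in bin 0 or bin 11 under this binning, depending on which side of the stimulus direction it lands.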

We next created a null distribution of relative microsaccade direction. This was done by pooling together microsaccades from correct and incorrect trials and then sampling with replacement from this pooled data (bootstrap procedure (107); 1000 samples). The number of microsaccades chosen for each sample was the same as the number in correct or incorrect trials respectively. These bootstrapped samples were used to create 99.5% confidence intervals for the count of microsaccades expected in each of the 12 bins. A bin was considered significantly different from chance if its true count fell outside this confidence interval.
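The bootstrap null described above might be sketched as follows. The input is a list of bin indices (0 to 11) for the pooled microsaccades; the percentile method for the confidence interval, the fixed seed, and all names are assumptions made for illustration.

```python
import math
import random


def null_bin_intervals(pooled_bins, n_draw, n_bins=12, n_boot=1000, ci=0.995, seed=0):
    """Bootstrap confidence interval for the count expected in each direction bin.

    pooled_bins: bin index of every microsaccade, correct and incorrect trials pooled.
    n_draw: number of microsaccades in the trial group being tested.
    """
    rng = random.Random(seed)
    boot_counts = [[] for _ in range(n_bins)]
    for _ in range(n_boot):
        counts = [0] * n_bins
        for b in rng.choices(pooled_bins, k=n_draw):  # sampling with replacement
            counts[b] += 1
        for b in range(n_bins):
            boot_counts[b].append(counts[b])
    alpha = (1.0 - ci) / 2.0
    lo_idx = math.floor(alpha * (n_boot - 1))
    hi_idx = math.ceil((1.0 - alpha) * (n_boot - 1))
    intervals = []
    for b in range(n_bins):
        ordered = sorted(boot_counts[b])
        intervals.append((ordered[lo_idx], ordered[hi_idx]))
    return intervals


def significant_bins(observed_counts, intervals):
    """Bins whose observed count falls outside the null interval."""
    return [b for b, (count, (lo, hi)) in enumerate(zip(observed_counts, intervals))
            if count < lo or count > hi]
```

Calling `null_bin_intervals` once with the number of microsaccades in correct trials and once with the number in incorrect trials gives the two sets of intervals.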

We calculated microsaccade rate for an entire trial by dividing the total number of detected microsaccades in the whole trial by the trial length (4592 total trials). The Pearson correlation between microsaccade rate and mean normalized pupil diameter (see above) for the trial was calculated for all trials with pupil diameter data, regardless of trial type or outcome (Figure 2-figure supplement 1B). Not pictured in Figure 2-figure supplement 1B but included in correlation analysis were trials with a mean normalized pupil diameter greater than 2 or less than -2 (∼4% of trials). Only 4 of these trials were longer than 1 s, out of which 2 trials contained detected microsaccades. The mean pupil diameter in these trials is shown in Figure 2-figure supplement 1B inset. Inter-microsaccade time was calculated as the time between microsaccade onset of microsaccades detected in the same trial. 4% of microsaccades separated by >538 ms are excluded from Figure 2-figure supplement 1D as they were more than 1.5x the interquartile range above the third quartile.

Decoding Analysis

For each single- or multi-unit, we extracted spike counts from 60-260ms following all non-target or target stimulus onsets in the threshold condition. Using these spike counts, we fit a Poisson distribution to estimate the mean firing rate for each neuron in each of four stimulus conditions (non-targets and targets in hit and miss trials). We then created a pseudo-population of neurons in each layer. We generated spike counts drawn from the fitted Poisson distributions to create synthetic spike counts for target and non-target stimuli in hits and misses separately (1000 repeats). We used linear discriminant analysis (LDA) to decode target from non-target stimuli. Decoders were trained separately for hit and miss trials. The procedure was repeated for a 20-fold cross-validation. We calculated the chance performance by training the decoder on data generated from all trials in the threshold condition (both hits and misses) and with shuffled labels (target or non-target).

Firing Rate

Firing rates were normalized per neuron to that neuron’s maximum stimulus-evoked response to each contrast before being combined across contrasts and trial types. We averaged stimulus-evoked firing rates from 60-260 ms following non-target or target stimulus presentations. We used bootstrapped estimation to compare firing rates in hit and miss trials in a paired comparison. This was done for all single and multi-unit clusters, as well as broad- and narrow-spiking single-units in each layer. Firing rates were also compared across hit and miss trials by paired t-test for each group. PSTHs of firing rates were calculated in 30ms bins shifted in 5ms increments. To calculate the time of firing rate divergence, we compared the difference in each single neuron or multi-unit firing rate in the two conditions over time. At each time point, we performed a Wilcoxon rank-sum test comparing the firing rates across hits and misses, and defined the divergence point as the first time the firing rates were significantly different. Divergence was calculated separately for each layer. To determine the time at which firing rates rose above baseline levels, we used bootstrapping to estimate 95% confidence intervals for each neuron’s pre-stimulus firing rate (0-100 ms before target or non-target stimulus onset). We then calculated the target PSTH for each neuron in 30ms bins shifted in 5ms increments. We defined the response latency as the first time bin in which the neuron’s firing rate in the PSTH exceeded the upper limit of the 95% confidence interval of baseline firing. We calculated the response latency independently for hit and miss trials for each neuron.
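The response-latency definition can be sketched as below. Spike times are in ms relative to stimulus onset, one list per trial. Reporting the latency as the center of the first qualifying bin, resampling trials for the baseline bootstrap, and the 260 ms search limit are all assumptions; the function names are hypothetical.

```python
import random


def sliding_psth(trials, t_start_ms, t_stop_ms, width_ms=30, step_ms=5):
    """(bin_center_ms, rate_hz) pairs for 30 ms bins shifted in 5 ms steps."""
    psth = []
    start = t_start_ms
    while start + width_ms <= t_stop_ms:
        n_spikes = sum(1 for spikes in trials for s in spikes
                       if start <= s < start + width_ms)
        rate_hz = n_spikes / len(trials) / (width_ms / 1000.0)
        psth.append((start + width_ms / 2.0, rate_hz))
        start += step_ms
    return psth


def baseline_upper_bound(trials, n_boot=1000, seed=0):
    """Upper limit of a bootstrapped 95% CI of the rate 0-100 ms before onset."""
    rng = random.Random(seed)
    rates = [sum(1 for s in spikes if -100 <= s < 0) / 0.1 for spikes in trials]
    boot_means = sorted(sum(rng.choices(rates, k=len(rates))) / len(rates)
                        for _ in range(n_boot))
    return boot_means[int(0.975 * (n_boot - 1))]


def response_latency_ms(trials, t_stop_ms=260):
    """Center of the first bin whose rate exceeds the baseline upper bound."""
    upper = baseline_upper_bound(trials)
    for center, rate in sliding_psth(trials, 0, t_stop_ms):
        if rate > upper:
            return center
    return None  # the unit never exceeded baseline in the search window
```

Running the function separately on hit and miss trials for each neuron gives the two latencies that are compared.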

Fano factor

Trial-to-trial variability was estimated by the Fano factor, which is the ratio of the variance of the spike counts across trials over the mean of the spike counts for each broad and narrow-spiking single unit. The Fano factor was calculated over non-overlapping 20ms time bins in a window from 200ms prior to each non-target flash onset to 200ms after each non-target flash onset for hit and miss trials in the threshold condition. To compare across conditions, we calculated the Fano factor modulation index (MI), defined as

$$\mathrm{MI} = \frac{FF_{hit} - FF_{miss}}{FF_{hit} + FF_{miss}}$$

where $FF_{hit}$ and $FF_{miss}$ represent the Fano factor for a given unit in hit and miss trials respectively at each point in time with respect to non-target stimulus onset. The Fano factor MI was averaged from 0-60ms prior to non-target stimulus onset and compared across trial types in the threshold condition for each sub-population.
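As a worked illustration of these quantities, the sketch below computes a Fano factor from spike counts in one time bin and a modulation index between hit and miss trials. It assumes the modulation index takes the usual normalized-difference form and uses the sample variance (n - 1 denominator); both choices and the function names are assumptions.

```python
from statistics import mean, variance


def fano_factor(spike_counts):
    """Variance-to-mean ratio of spike counts across trials for one time bin."""
    m = mean(spike_counts)
    if m == 0:
        return float("nan")  # undefined when the unit never fired in this bin
    return variance(spike_counts) / m  # sample variance (assumption)


def modulation_index(hit_value, miss_value):
    """Normalized difference between hit and miss values (assumed form)."""
    return (hit_value - miss_value) / (hit_value + miss_value)
```

With counts of [2, 2, 3, 1, 2] in hit trials and [0, 4, 2, 1, 3] in miss trials, both means are 2 but the Fano factors are 0.25 and 1.25, giving a negative modulation index, that is, lower variability in hits.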

Pairwise Phase Consistency (PPC)

We calculated PPC (76) for single and multi-units in the non-target pre-stimulus period (0-200ms preceding onset) in trials in the threshold condition. Although PPC is unbiased by spike count, we set a threshold of 50 spikes for analysis so that only units with enough spikes for a reliable estimate of PPC were included (superficial: n = 26, input: n = 41, deep: n = 64). LFP phase was calculated using Morlet wavelets. PPC for each unit was calculated for the phase of the LFP recorded on the same channel and averaged in three frequency bands (3-12 Hz, 15-25 Hz, and 30-80 Hz). PPC was calculated separately for hit and miss trials and compared across trial outcomes by t-test, corrected for multiple comparisons.
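For reference, PPC is the average cosine of the phase difference over all pairs of spikes, which can be computed in closed form from the resultant vector. The sketch below takes the LFP phases (in radians) at which a unit's spikes occurred in one frequency band; the wavelet step is omitted, and the wrapper applying the 50-spike minimum is a hypothetical helper.

```python
import cmath


def pairwise_phase_consistency(phases):
    """PPC of spike phases: mean of cos(theta_j - theta_k) over all pairs j < k."""
    n = len(phases)
    if n < 2:
        raise ValueError("PPC needs at least two spikes")
    resultant = sum(cmath.exp(1j * p) for p in phases)
    # |sum|^2 contains the N self-pairs (each equal to 1) plus every ordered pair.
    return (abs(resultant) ** 2 - n) / (n * (n - 1))


def ppc_if_enough_spikes(phases, min_spikes=50):
    """Return PPC only for units meeting the spike-count threshold, else None."""
    if len(phases) < min_spikes:
        return None
    return pairwise_phase_consistency(phases)
```

Perfectly locked spikes give a PPC of 1, while phases spread evenly around the cycle give a value slightly below zero for small spike counts.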

Canonical Correlation Analysis (CCA)

We used CCA (78, 108) to capture the correlation between layers at different time periods and with different amounts of temporal delays. We considered all possible combinations of pairs of layers. We took two windows of activity, one in each layer, in either the pre-stimulus period (0-200ms before non-target stimulus onset) or stimulus-evoked period (60-260ms after non-target stimulus onset). Window length was 50ms and the window was advanced in 10ms steps. The activity within each window was then binned using 10ms bins. We reported correlation associated with the first two canonical pairs, and calculated it separately for hit and miss trials. For each pair of layers, we limited CCA to sessions in which there were at least two neurons recorded in both layers. Feed-forward (FF) layer pairs were defined as follows: input to superficial, input to deep, and superficial to deep. The correlations along the FF signaling pathways (CFF) were calculated as the mean correlation at positive delays:

$$C_{FF}(t) = \frac{1}{N_{dt>0}} \sum_{dt>0} C(t, dt)$$

where t is the time following response onset, dt is the inter-laminar delay between windows from the two layers, C(t, dt) is the corresponding correlation, and $N_{dt>0}$ is the number of positive delays investigated. The values in Figure 6B correspond to $C_{FF}$ in each session.

Spike-spike coherence (SSC)

For each recording session, all spikes recorded from visually responsive single and multi-units in each layer were combined into a single spike train for that layer (layer multi-unit). Separately for both the pre-stimulus and non-target stimulus-evoked periods, we randomly deleted spikes from the layer multi-unit with a higher firing rate so that the firing rates were matched across hit and miss trials. SSC was calculated for each of the three possible pairs of layer multi-units in each session for both the pre-stimulus (0-200 ms preceding stimulus onset) and non-target stimulus-evoked period (60-260 ms following non-target stimulus onset) separately for hit and miss trials using Chronux (NW = 1; K = 1; http://chronux.org) (77, 79). To control for differences in firing rates across hit and miss trials we used a rate matching procedure (26). For estimation statistics, interlaminar SSC values were calculated for each frequency and subsequently averaged across three frequency bands: 3-12 Hz, 5-15 Hz, and 30-80 Hz and compared across hit and miss trials for each pair of layers in each recording session. For null-hypothesis testing, we calculated the SSC modulation index, defined as

$$\mathrm{MI} = \frac{SSC_{hit} - SSC_{miss}}{SSC_{hit} + SSC_{miss}}$$

The SSC MI was calculated for each frequency and subsequently averaged across three frequency bands: 3-12 Hz, 5-15 Hz, and 30-80 Hz. MI values for each frequency band were compared to zero by t-test, Bonferroni corrected for multiple comparisons. We tested for interaction effects with a three-factor ANOVA, with frequency, pair of layers, and time window (pre or post stimulus) as factors. We calculated a shuffled distribution of SSC by shuffling the trial identities of the spikes in one of the layers in the pair. We then calculated SSC with the shuffled trial identities. This procedure was repeated 10 times to create the shuffled distribution.
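The rate-matching and trial-shuffling steps used around the SSC calculation can be sketched as follows; the coherence estimate itself (computed with Chronux in the study) is not reproduced. Spikes are represented as sortable items such as (trial, time) pairs, durations are total analysis time in seconds per condition, and all names and the fixed seeds are assumptions.

```python
import random


def rate_match(spikes_hit, dur_hit_s, spikes_miss, dur_miss_s, seed=0):
    """Randomly delete spikes from the higher-rate condition so that the
    firing rates of one layer multi-unit match across hit and miss trials."""
    rng = random.Random(seed)
    target_hz = min(len(spikes_hit) / dur_hit_s, len(spikes_miss) / dur_miss_s)

    def thin(spikes, dur_s):
        n_keep = min(len(spikes), round(target_hz * dur_s))
        return sorted(rng.sample(spikes, n_keep))  # random subset, order restored

    return thin(spikes_hit, dur_hit_s), thin(spikes_miss, dur_miss_s)


def shuffle_trial_identities(trains_by_trial, seed=0):
    """Permute which trial each spike train of one layer is assigned to."""
    rng = random.Random(seed)
    shuffled = list(trains_by_trial)
    rng.shuffle(shuffled)
    return shuffled
```

Shuffling is applied to only one layer of a pair, so that within-trial timing relationships between the two layers are broken while each layer's own spike statistics are preserved.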

GLM quantification

To quantify how well our results predict behavioral performance, we fit a GLM to the monkeys' responses in threshold-condition trials (80). We included five regressors in our analysis: (1) average pupil diameter during the trial, (2) number of microsaccades in the pre-target window (0-400 ms before target stimulus onset), and average target-evoked multi-unit firing rate in the (3) superficial, (4) input, and (5) deep layers. We calculated the average target-evoked firing rate by averaging the firing rate of all single- and multi-units in a given layer 60-260 ms after target stimulus onset in each trial. To allow weights to be compared across regressors, each regressor was transformed into a z-score before being included in the model. We fit the GLM using a logit link function, using the predictors to regress the binary trial outcome (hit or miss). A total of 309 trials were included in the GLM.
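As an illustration of the model structure described above, the following Python sketch z-scores each regressor and fits a logit-link GLM to a binary outcome. It is a sketch under stated assumptions rather than the fitting procedure used in this study: the function names are hypothetical, and plain gradient descent on the Bernoulli log-likelihood stands in for whatever solver the original analysis software used.

```python
import math


def zscore(column):
    """Transform one regressor to zero mean and unit (sample) standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (n - 1))
    return [(x - mean) / sd for x in column]


def sigmoid(z):
    """Inverse of the logit link, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_glm(X, y, lr=1.0, n_iter=300):
    """Fit P(hit) = sigmoid(w0 + sum_j w_j * x_j) by gradient descent.

    X is a list of trials, each a list of z-scored regressors;
    y is a list of 0/1 outcomes (assumed coding: 1 = hit, 0 = miss).
    Returns [w0, w1, ..., wp], where w0 is the intercept.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for row, outcome in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], row))
            err = sigmoid(z) - outcome
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        # Step along the negative mean gradient of the negative log-likelihood.
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w
```

Because every regressor is z-scored before fitting, the magnitudes of the returned weights are on a common scale and can be compared directly across regressors.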

Figure legends

Laminar recordings in V4 (Modified from Nandy et al. (2017)).

(A) An artificial dura (AD) chamber is shown over dorsal V4 in the right hemisphere of Monkey A. The native dura mater was resected and replaced with a silicone-based artificial dura, thereby providing an optically clear window into the cortex. Scale bar = 5mm. (B) An enlarged view of the boxed region in A clearly shows the sulci and the microvasculature. sts = superior temporal sulcus, lu = lunate sulcus, io = inferior occipital sulcus. Area V4 lies on the pre-lunate gyrus between the superior temporal and lunate sulci. Scale bar = 2mm. (C) Electrophysiology setup: a plastic stabilizer with a circular aperture is secured in place inside the chamber such that the aperture is centered over the pre-lunate gyrus. A 16-channel linear array electrode (electrode spacing 150 μm) is positioned over the center of the gyrus and lowered into the cortex under microscopic guidance. The microvasculature pattern was used as a reference to target different cortical sites across recording sessions. (D) Example recording session in Monkey C depicting 12 single unit waveforms (mean +/-s.e.m.) isolated along the cortical column. Teal waveforms correspond to narrow-spiking putative interneurons and gold waveforms correspond to broad-spiking putative excitatory units. (E) Stimulus-triggered local field potentials (LFPs) obtained by flashing 30ms high contrast ring stimuli in the receptive field of a V4 cortical column. LFP traces averaged across all stimulus repeats are shown color-coded as being part of either the superficial (green), input (gray) or deep (pink) layers. Layer assignment was done after current source-density analysis. (F) Current source-density (CSD) calculated as the second spatial derivative of the stimulus-triggered LFPs and displayed as a colored map. The x-axis represents time from stimulus onset; the y-axis represents cortical depth oriented such that the pial surface is at the top and the white matter is at the bottom. 
Red hues represent current sink, blue hues represent current source. The input layer is identified as the first current sink followed by a reversal to current source. The superficial and deep layers have the opposite sink-source pattern. The CSD map has been spatially smoothed for visualization. (G) Stacked contour plots show spatial receptive fields (RFs) mapped along each contact point in the laminar probe. The spatial receptive fields were obtained by applying reverse correlation to the LFP power evoked by sparse pseudo-random sequences of Gabor stimuli. The RFs are well aligned, indicating perpendicular penetration down a cortical column. Zero depth represents the center of the input layer as estimated from the CSD. (H) Contrast response functions (spike rate as a function of stimulus contrast) are shown for 2 example units identified in a single recording session in Monkey A. Red and blue traces correspond to the attend-in to RF and attend-away from RF conditions respectively. The dotted lines represent the corresponding background firing rates. The dashed lines are 4 standard deviations above baseline. A unit was considered visually responsive if the contrast response functions exceeded this threshold in both attention conditions. Mean +/-s.e.m. Panels are reproduced from Nandy et al. (2017).

Additional psychometric function examples.

Example behavioral psychometric functions from 9 recording sessions in the attend in (red) and attend away (blue) conditions. Behavioral performance (circles) is presented as a function of orientation change. Data was fitted with a logistic function. The threshold of each fitted logistic function is indicated by the vertical dashed line. The square symbol indicates foil trial performance, and the star symbol indicates catch trial performance.

Psychometric function parameters.

(A) Distribution of the orientation threshold of the fitted psychometric functions in the attend-in (red solid) and attend-away (blue hatch) conditions for each recording session. The threshold of the sigmoid may not correspond exactly to an orientation change tested in the experiment. (B) Distribution of threshold orientation changes used as the threshold condition in the rest of this study (red=attend-in, blue=attend-away, circle=monkey A, plus=monkey C). (C) Distribution of slopes for the fitted psychometric functions (same conventions as A). (D) Distribution of guess rate for the fitted psychometric functions (same conventions as A). (E) Distribution of lapse rate for the fitted psychometric functions (same conventions as A). (F) Distribution of catch trial performance in each session (same conventions as A). (G) Performance on orientation-change matched regular trials (target presented at cued location) and foil trials (target presented at uncued location) for each session (same conventions as B). (H) Distribution of target presentation time for hit (orange) and miss (gray) trials. (I) Performance in threshold trials in each recording session split by quartiles (see Methods). Error bars represent standard deviation.

Microsaccades are preferentially directed towards the target in correct trials and have a slight correlation with pupil diameter

Data is presented for all trials, regardless of orientation change (not just the threshold condition). (A) The histograms represent the direction of microsaccades relative to the attended stimulus (left column) or unattended stimulus (right column) in correct (top row) and incorrect (bottom row) trials. Black lines represent the mean (solid) and 99.5% confidence interval (dashed) of the bootstrapped null distribution estimated by pooling correct and incorrect microsaccades. ∗ p < 0.005. Inset: Schematic for calculation of relative microsaccade direction. The microsaccade is represented by the gray arrow. (B) Scatterplot of microsaccade rate versus mean normalized pupil diameter shows a small but statistically significant relationship between the two quantities (r² = 0.006, p < 0.001). Each dot is color-coded by trial length. Inset: Histogram of mean pupil diameter for all trials, including those less than -2 or greater than 2 (red lines). Only four of the trials with pupil diameters less than -2 or greater than 2 had detected microsaccades. (C) Boxplot of microsaccade rates for all trials. (D) Boxplot of the time between microsaccades recorded in the same trial. 4% of microsaccades separated by more than 538ms were excluded as outliers (See Methods).

Single monkey pupil diameter and microsaccade data.

(A) Normalized pupil diameter for hit and miss trials in the threshold condition separated by monkey. 0 ms corresponds to non-target and target stimulus onset. Mean +/-s.e.m. (B) Distribution of pupil diameter values associated with hit and miss trials separated by monkey. Pupil diameter was averaged from 100ms before to 100ms after non-target and target stimulus onset. Violin plots were generated using kernel smoothing (See Methods). Error bars represent 95% confidence intervals for the mean of each distribution, and the mean difference (blue, right axes). (C) Histogram of mean pupil diameter around the time of non-target and target stimulus onset for each monkey (calculated as in B). Orange and gray lines represent the mean pupil diameter for hit and miss trials respectively. (D) Hit rate for trials with and without a microsaccade detected in the time window 0-400ms before target onset for each monkey.

Single monkey decoding performance

(A) Performance for decoding targets from non-targets from single-units and multi-units in the superficial layer for individual monkeys. Points in the left section of each plot show the decoding performance for each of the 20 different cross-validations. The right section for each layer shows the bootstrapped estimation of the difference between decoding performance in hits and misses. Half-violin plots show the bootstrapped distribution of the difference, and black dots and bars represent the mean and 95% confidence intervals of the difference in decoding performance. Chance levels, determined by shuffling target and non-target identity, were subtracted from the raw decoding performance values. (B) Same as A but for the input layer. (C) Same as A but for the deep layers.

Single monkey firing rate data

(A) Single monkey non-target population (single and multi-unit) PSTH of visually responsive neurons for the hit (orange) and miss (dark-gray) trials in the threshold condition (mean +/-s.e.m). (B) As in A but for target stimuli. Red box indicates the time window used for analysis in 3C.

Firing rates for individual neurons and reaction time in threshold condition

(A) Target stimulus-evoked normalized firing rates in hit and miss trials for each recorded single and multi-unit cluster. Clusters are divided by layer: left=superficial, middle=input, right=deep. Related to Figure 3B. Each line represents the mean firing rate in response to target stimuli in hit and miss trials for a given unit. Data is color coded by unit type (gold=broad, teal=narrow, gray=multi-unit). See Methods for normalization method. (B) Mean reaction time (time from target stimulus onset to saccade onset) for hit trials in the threshold condition in both attend-in (red) and attend-away (blue) trials for each recording session. (C) Mean time at which single- and multi-unit firing rates rise significantly above pre-stimulus baseline in response to target stimuli in hit and miss trials. Mean +/-s.e.m.

Single monkey Fano factor data

Rows correspond to different layers (top=superficial, middle=input, bottom=deep) and columns correspond to monkeys. The Fano factor of broad-spiking putative excitatory neurons for the hit and miss trials in the threshold condition in each monkey (mean +/-s.e.m).

Narrow-spiking neurons do not have decreased variability in hit trials

Top: Fano factor modulation index for each narrow-spiking neuron recorded in each layer, averaged in the 60ms preceding non-target stimulus onset. Bottom: Bootstrapped estimation of the mean difference of the Fano factor modulation index from zero in each of the three layers. Colored curves represent the estimated bootstrapped distribution. Black dots and lines reflect the mean and 95% confidence intervals of the distributions.

Additional PPC data

(A-B) Top: Raw PPC values calculated for clusters recorded in the superficial (A) and input (B) layers in hit and miss trials, averaged into three frequency bands: 3-12 Hz, 15-25 Hz, and 30-80 Hz. PPC was calculated using the LFP recorded on the same channel as the spikes. Bottom: Bootstrapped estimation of the paired mean difference in PPC across hit and miss trials for each frequency band. Note that although there appears to be a difference in high-frequency PPC in the superficial layer, this population does not have significantly positive PPC in either condition, indicating that there is no phase-locking in either hits or misses. (C) Raw PPC values for neurons recorded in the deep layer in hit and miss trials, averaged into the same 3 frequency bands. Related to Figure 5B.

Single monkey PPC data

Rows correspond to different layers (top=superficial, middle=input, bottom=deep) and columns correspond to monkeys. Pairwise phase consistency (PPC) of single and multi-units in each layer to the local field potential (LFP) signal recorded from the same channel in hit (orange) and miss (gray) trials at threshold. PPC was calculated in the pre-stimulus period (0-200 ms before stimulus onset). Dashed red line indicates a PPC of 0, below which there is no consistent relationship between spikes and LFP phase.

Single animal interlaminar coherence

Rows correspond to different pairs of layers (top=superficial-input, middle=superficial-deep, bottom=input-deep). (A) Single monkey multi-unit interlaminar spike-spike coherence (SSC) calculated in the 200ms before non-target stimulus onset in hit and miss trials (solid lines, mean +/-s.e.m). Firing rates were matched across hit and miss trials. Dashed lines represent coherence calculated with shuffled trial identities (mean +/-s.e.m). (B) Interlaminar spike-spike coherence in the non-target stimulus evoked period in each monkey (60-260ms after stimulus onset). Same conventions as in A.

Author contributions

MPJ and ASN conceptualized the project. ASN collected the data and supervised the project. MPM analyzed the data, with assistance from SD and IB. MPM, SD, ASN, MPJ, and JHR wrote the manuscript.

Acknowledgements

This research was supported by NIH R01 EY021827 to JHR and ASN, NIH R01 EY032555, NARSAD Young Investigator Grant, Ziegler Foundation Grant, Yale Orthwein Scholar Funds & Lawrence Family Young Investigator Funds to ASN, NIH R01 EY034605 & NIH R00 EY025026 to MPJ, and by NEI core grants for vision research P30 EY019005 to the Salk Institute and P30 EY026878 to Yale University. MM was supported by training grants T32-NS007224 and T32-NS041228 to Yale University. We would like to thank Catherine Williams and Mat LeBlanc for their excellent animal care.