Abstract
Continuous flash suppression (CFS), where a dynamic masker presented to one eye suppresses the conscious perception of a stimulus shown to the other eye, has been extensively used to study visual consciousness. Various studies reported high-level visual and cognitive functions under CFS, which, however, have more recently been questioned and at least partially attributed to low-level stimulus features. A key but unsettled issue is the extent to which the responses of V1 neurons, where inputs from the two eyes first merge, are affected, as severely suppressed V1 responses would not sustain high-level processing. Here, we used two-photon calcium imaging to record the responses of V1 neurons to a grating stimulus under CFS in awake, fixating macaques. The results revealed that CFS substantially suppressed V1 orientation responses. Ocularity-wise, it nearly completely eliminated the orientation responses of V1 neurons preferring the masker eye or both eyes, while also significantly suppressing the responses of those preferring the grating eye. Modeling analyses suggest that, under CFS, the brain retains the ability to classify coarse orientations, but may become less capable of reconstructing the grating stimulus. Consequently, while CFS-suppressed orientation information still supports low-level orientation discrimination, it may not suffice for high-level visual and cognitive processing.
Introduction
When a target stimulus is presented to one eye and a flickering Mondrian-like masker to the other eye, the target can be rendered invisible for an extended period (Tsuchiya & Koch, 2005). This paradigm, known as continuous flash suppression (CFS), has been widely used to investigate subconscious visual processing (Moors, Hesselmann, Wagemans, & van Ee, 2017; Pournaghdali & Schwartz, 2020; Yang, Brascamp, Kang, & Blake, 2014). Among the most intriguing findings are the subconscious high-level visual and cognitive functions under the influence of CFS (e.g., Adams, Gray, Garner, & Graf, 2010; Almeida, Mahon, Nakayama, & Caramazza, 2008; Fang & He, 2005; Mudrik, Breska, Lamy, & Deouell, 2011; Sklar et al., 2012; Tettamanti, Conca, Falini, & Perani, 2017; Zabelina et al., 2013). For example, priming effects are reportedly evident when the target and the invisible primer are categorically (Almeida et al., 2008) or semantically (Zabelina et al., 2013) consistent. However, many of these observations have been questioned by more recent studies, with at least some of the high-level effects being attributed to low-level feature processing (Gray, Adams, Hedger, Newton, & Garner, 2013; Hesselmann & Malach, 2011; Moors, Boelens, van Overwalle, & Wagemans, 2016; Moors & Hesselmann, 2018; Moors et al., 2017; Pournaghdali & Schwartz, 2020; Sakuraba, Sakai, Yamanaka, Yokosawa, & Hirayama, 2012; Stuit, Paffen, & Van der Stigchel, 2023).
A critical but unresolved issue in this debate is the impact of CFS on V1 neuronal activity. CFS has been hypothesized to arise from mechanisms similar to those in binocular rivalry (Moors et al., 2017; Tsuchiya & Koch, 2005; Yang et al., 2014), which likely suppress V1 responses through interocular inhibition. Only the surviving stimulus information would then be relayed to downstream areas for potential subconscious higher-level visual and cognitive processing (Adams et al., 2010; Almeida, Mahon, & Caramazza, 2010; Jiang, Costello, & He, 2007). Importantly, if V1 activity is suppressed to a sufficient degree, the low-level stimulus information carried by the remaining V1 responses may not suffice to sustain high-level processing of more complex stimuli defined by those low-level features.
Nevertheless, two fMRI studies reported that V1 activity is either unaffected or only weakly affected by CFS (Watanabe et al., 2011; Yuval-Greenberg & Heeger, 2013), implying that most stimulus inputs remain largely intact in V1 under CFS, thereby allowing for high-level unconscious processing. In these studies, the strength of neural responses under CFS, in which the stimulus and the flashing masker were presented dichoptically, was compared to that in a monocular masking condition, where the flashing masker was presented to the same eye as the target. CFS masking was found to be no stronger, or only slightly stronger, than monocular masking. However, monocular masking also suppresses pre-cortical neural responses (Macknik & Martinez-Conde, 2004). As a result, dichoptic CFS masking, which is cortical, could be substantially stronger than monocular masking once the pre-cortical effects of monocular masking are accounted for.
Neurons in V1 exhibit various degrees of ocular dominance (Hubel & Wiesel, 1962), which influences each neuron’s binocular combination of monocular visual inputs from the two eyes (Kato, Bishop, & Orban, 1981; Mitchell, Carlson, Westerberg, Cox, & Maier, 2023; Zhang, Zhao, Jiang, Tang, & Yu, 2024). In the present study, we examined the extent to which V1 neuronal responses are affected by CFS and how neurons preferring the target eye, masker eye, or both eyes are differently impacted. Using a customized two-photon imaging setup for awake macaques (Li, Liu, Jiang, Lee, & Tang, 2017), we sampled large neuronal populations at cellular resolution and measured ocular dominance for each individual neuron. This approach enabled us to investigate the potentially differential impacts of CFS on the responses of V1 neurons with varying ocular preferences, as well as apply machine learning tools to understand the sensory and perceptual consequences of these CFS impacts.
Results
We employed two-photon calcium imaging to record the responses of superficial V1 neurons in two awake, fixating macaques, each with two recording fields of view (FOVs, 850 × 850 μm²) (Fig. 1A). During the initial recording, the stimulus was a binocular 0.45-contrast square-wave grating varying across twelve orientations and two spatial frequencies (3 & 6 cpd) (Fig. 1B). A total of 3,564 neurons were identified through image processing, including 3,004 (84.29%) orientation-tuned neurons that were included in subsequent data analyses.

Two-photon imaging and ocular dominance mapping.
A. Optical windows for imaging in the two macaques. Green crosses indicate the regions for viral vector injections, and yellow boxes indicate the FOVs chosen for imaging. B. Stimuli used for OD mapping. A circular-windowed square-wave grating was presented monocularly to each eye in turn to probe each neuron’s ODI. C. Ocular dominance functional maps of each FOV at single-neuron resolution showing OD clusters. D. Frequency distributions of individual neurons’ ODIs in each FOV.
The same grating stimulus was then presented monocularly (Fig. 1B) to each eye to characterize individual neurons’ eye preferences. Each neuron’s ocular dominance index (ODI) was calculated as ODI = (Ri – Rc)/(Ri + Rc), where Ri and Rc were the neuron’s peak responses to ipsilateral and contralateral stimulations, respectively. Neurons with an ODI at –1 or +1 would exclusively prefer the contralateral or ipsilateral eye, while neurons with an ODI at 0 would prefer both eyes equally. Consistent with previous findings (Horton & Hocking, 1996; Hubel & Wiesel, 1962; Livingstone, 1996; Zhang et al., 2024), neurons with similar eye preferences clustered together (Fig. 1C), indicating ocular dominance columns. The ODI followed unimodal distributions (Fig. 1D), in which the majority of neurons were binocular, showing comparable preferences for either eye. Only a small portion of neurons were monocular, being more responsive to the ipsilateral or contralateral eye.
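The ODI computation above can be sketched directly from the two monocular peak responses; the response values below are hypothetical, for illustration only:

```python
import numpy as np

def ocular_dominance_index(r_ipsi, r_contra):
    """ODI = (Ri - Rc) / (Ri + Rc), using each eye's peak response.

    +1 -> exclusively ipsilateral, -1 -> exclusively contralateral,
    0 -> equal preference for both eyes.
    """
    return (r_ipsi - r_contra) / (r_ipsi + r_contra)

# Hypothetical peak responses of three neurons to monocular gratings
r_ipsi = np.array([1.0, 0.5, 0.1])
r_contra = np.array([0.0, 0.5, 0.9])
odi = ocular_dominance_index(r_ipsi, r_contra)   # [1.0, 0.0, -0.8]

# Grouping criterion used later in the paper (grating to the ipsilateral eye):
grating_eye = odi > 0.2            # neurons preferring the grating eye
binocular = np.abs(odi) <= 0.2     # binocular neurons
```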
In the third and final step, the grating stimulus and the flashing masker were presented dichoptically to evaluate the impact of CFS on neurons’ orientation responses (Fig. 2A). The results are summarized as population orientation tuning functions under the baseline no-CFS condition and the CFS condition, following the procedure of Busse, Wade, and Carandini (2009). Specifically, neurons with similar orientation preferences were binned (bin width = 15°) relative to the target orientation for a total of 12 bins, and the resultant population orientation tuning functions based on the mean responses of these bins (Fig. 2C) were fitted with a Gaussian function. Compared to the baseline population orientation tuning functions, those under the influence of CFS displayed profound reductions in orientation responses. Based on the Gaussian fits, the amplitude decreased by 84.18% in Monkey A and 60.78% in Monkey B, while the slope decreased by 91.31% in Monkey A and 71.50% in Monkey B (Fig. 2B).
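The binning-and-fitting step can be sketched as follows, using synthetic binned responses and SciPy’s least-squares fitter; the amplitude, bandwidth, and noise level here are illustrative stand-ins, not the recorded values:

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian(x, amp, mu, sigma, base):
    """Gaussian tuning function fitted to the population responses."""
    return base + amp * np.exp(-(x - mu) ** 2 / (2 * sigma ** 2))

# Centers of the 12 relative-orientation bins (15-deg bin width)
rel_ori = np.arange(-90, 90, 15.0)

# Hypothetical mean binned responses: a tuned curve plus small noise
rng = np.random.default_rng(0)
resp = gaussian(rel_ori, amp=1.0, mu=0.0, sigma=20.0, base=0.1)
resp = resp + rng.normal(0, 0.01, resp.size)

popt, _ = curve_fit(gaussian, rel_ori, resp, p0=[1.0, 0.0, 20.0, 0.0])
amp_fit = popt[0]
# A CFS-induced amplitude drop would be quantified as
# 1 - amp_cfs / amp_baseline from two such fits.
```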

The impacts of CFS on population orientation tuning in two macaques.
A. Stimuli used in the CFS experiment for one macaque. The grating target was presented to one eye, which was dichoptically masked by a circular flashing masker presented to the other eye. The white dot was the fixation point. B. Exemplar baseline and CFS orientation tuning functions for neurons with different eye preferences. C. Population orientation tuning functions of all neurons without CFS as the baseline and with CFS. Data from two FOVs of each monkey were pooled due to highly consistent results. Solid curves are Gaussian fittings. D. Population orientation tuning functions of sub-groups of neurons with different eye preferences without and with CFS. Solid curves are Gaussian fittings. Error bars represent ±1 SE. E. The impacts of CFS on Fisher information. Fisher information is plotted as a function of relative orientation (to the neuron’s preferred orientation) without and with CFS. Shaded areas denote ±1 SE. F. The ratio of baseline/CFS Fisher information within 15° of neurons’ preferred orientations. Data from two FOVs of each monkey were pooled due to highly consistent results.
Furthermore, neurons were divided into three groups according to their ODIs, and the impacts of CFS on their respective orientation responses were examined: neurons preferring the grating eye (ODI > 0.2 or < -0.2, depending on whether the grating was presented ipsilaterally or contralaterally), binocular neurons (-0.2 ≤ ODI ≤ 0.2), and neurons preferring the masker eye (ODI beyond ±0.2 in the direction opposite to the grating eye). Compared to the no-CFS baseline condition, the orientation tuning of neurons preferring the masker eye was completely wiped out by CFS (Fig. 2B, D left), leading to flattened tuning curves with unmeasurable amplitudes or bandwidths. The orientation tuning of binocular neurons was either nearly completely wiped out (Monkey A) or substantially reduced (Monkey B) (Fig. 2B, D middle): there were 85.68% and 68.32% decreases in amplitude, and 92.64% and 77.07% decreases in slope, for Monkeys A and B, respectively. The orientation tuning of neurons preferring the grating eye was the least affected, though still substantially so (Fig. 2B, D right), with respective 77.78% and 41.75% decreases in amplitude and 85.23% and 57.56% decreases in slope for the two monkeys.
To quantify the information loss in V1 population orientation coding due to CFS, we compared the Fisher information (Averbeck & Lee, 2006) under the baseline and CFS conditions. Here, Fisher information serves as a statistical measure of how much information the responses of neurons provide about the grating orientation. Specifically, it indicates the sensitivity of neural responses to small changes in orientation, with higher values signifying greater precision in encoding that information. As illustrated in Fig. 2E, Fisher information was reduced by CFS primarily for orientations deviating by less than 15° from the neurons’ preferred orientations. The average Fisher information for stimuli within this 15° range decreased to 29.1% and 43.4% of the baseline values in the two macaques, respectively (Fig. 2F), indicating the detrimental impact of CFS on the ability of V1 populations to accurately encode and represent orientation information, especially for orientations closely aligned with neuronal preferences.
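Under the textbook assumption of independent Gaussian noise (ignoring noise correlations, which the study’s actual estimator may treat differently), Fisher information is the squared tuning-curve slope divided by the response variance. The sketch below, with synthetic tuning curves, illustrates why amplitude suppression degrades it so strongly: scaling the tuning amplitude by 0.2 scales the Fisher information by 0.04 everywhere.

```python
import numpy as np

def fisher_information(tuning, variance, dtheta):
    """FI(theta) = f'(theta)^2 / var(theta) for independent Gaussian noise,
    with the slope f' approximated by finite differences."""
    slope = np.gradient(tuning, dtheta)
    return slope ** 2 / variance

theta = np.arange(0, 180, 15.0)                                # deg
tuning = 1.0 * np.exp(-(theta - 90) ** 2 / (2 * 20.0 ** 2))    # hypothetical baseline
tuning_cfs = 0.2 * tuning                                      # amplitude-suppressed

fi_base = fisher_information(tuning, variance=0.01, dtheta=15.0)
fi_cfs = fisher_information(tuning_cfs, variance=0.01, dtheta=15.0)
```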
What are the sensory and perceptual consequences of CFS-induced suppression of V1 orientation responses? To answer this question, which is crucial for understanding subconscious processing under CFS, we trained linear decoders to classify neighboring stimulus orientations (15°) in our experiments, as well as transformer models to reconstruct the grating images. Here, orientation classification reflected coarse orientation discrimination, and image reconstruction reflected orientation perception, both suggesting the upper bounds of performance assuming an ideal observer.
For orientation classification, we trained a support vector machine (SVM) to classify neighboring orientations based on the population neural activity in each trial. Decoders for different FOVs, ipsilateral/contralateral target presentations, different pairs of neighboring orientations, and baseline vs. CFS conditions were trained separately. Under the baseline condition, the decoders achieved mean classification accuracies of 95.8 ± 2.3% and 97.8 ± 1.8% across ipsilateral and contralateral eye conditions and 12 neighboring orientation pairs in Monkeys A and B, respectively. Under CFS, the respective accuracies were largely unchanged (93.4 ± 3.3% and 98.1 ± 1.5%, Fig. 3A). These results suggest that under CFS, there is likely still sufficient information for orientation discrimination, even for Monkey A, whose V1 neuronal responses were more substantially suppressed. Furthermore, with CFS suppression as severe as that in Monkey A, real, non-ideal observers with low efficiency in reading out stimulus information may still have a good chance of completing a similar coarse orientation discrimination task.
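A decoder of this kind can be sketched with scikit-learn; the population patterns below are simulated, and the trial counts, neuron counts, and separation between the two "orientations" are arbitrary stand-ins for the recorded data:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(1)
n_trials, n_neurons = 24, 200   # e.g., 12 trials per orientation in a pair

# Hypothetical single-trial population responses to two neighboring
# orientations, simulated as two noisy mean patterns
mean_a = rng.normal(0, 1, n_neurons)
mean_b = mean_a + rng.normal(0, 0.5, n_neurons)  # slightly shifted pattern
X = np.vstack([mean_a + rng.normal(0, 1, (n_trials // 2, n_neurons)),
               mean_b + rng.normal(0, 1, (n_trials // 2, n_neurons))])
y = np.repeat([0, 1], n_trials // 2)

# Linear SVM with 5-fold cross-validation; in the study, one such decoder
# was trained per FOV, eye, condition, and neighboring-orientation pair
acc = cross_val_score(SVC(kernel="linear"), X, y, cv=5).mean()
```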

Sensory/perceptual consequences of CFS revealed by machine learning.
A. Orientation classification accuracies under CFS vs. baseline conditions obtained using SVM decoders. Each datum represents results from a contralateral or ipsilateral grating condition with a specific FOV averaged across 5-fold cross-validations. Error bars denote 95% confidence intervals. B. A diagram of the transformer model for stimulus image reconstruction. C. Exemplar learning curves of transformer models under baseline and CFS conditions from two FOVs. The vertical dashed line indicates the epoch at which the baseline model reaches 75% of its total loss decrease between the two learning plateaus estimated using a sigmoid fit. D. Illustrations of corresponding reconstructed stimulus images on the basis of learning curves in C. E. Box plots of SSIM scores between the original and reconstructed images with baseline and CFS transformers. Within a FOV, results from contralateral eye and ipsilateral eye conditions are combined. F. Distributions of absolute orientation errors between the true orientation and the orientation extracted from the reconstructed image using a gradient-based procedure.
Next, we trained transformer models to reconstruct the grating stimulus images on the basis of the corresponding neuronal responses under baseline and CFS conditions. The motivation for this part of the modeling work was the assumption that high-level tasks would be difficult to carry out if the basic stimulus features forming more complex patterns were not intact. Our transformer model integrated embedding, self-attention, and unembedding modules, together with a fully connected feedforward layer (Fig. 3B). The model inputs were the responses of all neurons within a FOV to the grating stimulus (ipsilateral and contralateral presentations of the same stimulus were modeled separately), and the model output was the reconstructed grating image. During training, the model typically reached two successive learning plateaus, where the validation loss temporarily stagnated (Fig. 3C). Moreover, the validation loss decreased more rapidly when training on the baseline neural response data than on the CFS data. To compare the two conditions at a matched training stage, we identified the epoch at which the validation loss of the baseline model reached 75% of its total decrease between the two plateaus using a sigmoid fit, and then retrained both the baseline and CFS models up to this epoch.
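The forward pass of such an architecture can be sketched in plain NumPy; all dimensions, weights, and the mean-pooling readout below are hypothetical stand-ins to show the data flow (responses → embedding → self-attention → unembedding/feedforward → image), not the trained model itself:

```python
import numpy as np

rng = np.random.default_rng(3)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

n_neurons, d_model, n_pix = 300, 64, 32 * 32   # hypothetical sizes

# Embedding: project each neuron's scalar response into a d_model token
W_embed = rng.normal(0, 0.1, (1, d_model))
resp = rng.normal(0, 1, (n_neurons, 1))        # one trial's responses
tokens = resp @ W_embed                        # (n_neurons, d_model)

# Single-head self-attention over neuron tokens
Wq, Wk, Wv = (rng.normal(0, 0.1, (d_model, d_model)) for _ in range(3))
q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
attn = softmax(q @ k.T / np.sqrt(d_model), axis=-1)
mixed = attn @ v                               # (n_neurons, d_model)

# Unembedding plus fully connected readout into pixel space
pooled = mixed.mean(axis=0)                    # (d_model,)
W_fc = rng.normal(0, 0.1, (d_model, n_pix))
recon = (pooled @ W_fc).reshape(32, 32)        # reconstructed image
```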
The retrained baseline models reconstructed the grating stimuli significantly better than the CFS models in Monkey A, but this discrepancy was less pronounced in Monkey B (Fig. 3D), consistent with the neuronal data showing that Monkey A exhibited substantially stronger CFS suppression than Monkey B in terms of population orientation tuning and Fisher information (Fig. 2). We used a structural similarity index (SSIM) (Brunet, Vrscay, & Wang, 2012) and a gradient-based orientation extraction procedure to quantify reconstruction performance. Across the grating-presenting ipsilateral and contralateral eyes, the baseline models reconstructed the grating with median SSIMs of 0.52 and 0.61 for the two FOVs of Monkey A, and 0.57 and 0.63 for the two FOVs of Monkey B, respectively, while the corresponding SSIMs for the CFS models were 0.16 and 0.19 for Monkey A, and 0.55 and 0.53 for Monkey B (Fig. 3E).
Furthermore, the grating orientations extracted from the reconstructed images deviated 4.46° and 3.10° (median values) from the actual stimulus orientation in the baseline models for the two FOVs of Monkey A, and 2.86° and 2.20° for the two FOVs of Monkey B, respectively. However, in the CFS models, this orientation error increased to 48.45° and 24.03° in the two FOVs of Monkey A, implying that the stimulus orientation could not be reconstructed or unconsciously “perceived” (Fig. 3F). In contrast, the orientation error increased only slightly, to 3.06° and 3.42°, in Monkey B, implying only moderately impaired reconstruction and unconscious “perception” of the stimulus orientation.
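A gradient-based orientation estimate of this general kind can be sketched with the double-angle (structure-tensor) average, in which doubled gradient angles are summed so that opposite gradients reinforce rather than cancel; the exact procedure used in the study may differ, and the test image below is a synthetic sinusoidal grating:

```python
import numpy as np

def extract_orientation(img):
    """Estimate the dominant bar orientation (deg, 0-180) of an image
    from its gradients via the double-angle average."""
    gy, gx = np.gradient(img.astype(float))
    ang2 = np.arctan2(2 * (gx * gy).sum(), (gx ** 2 - gy ** 2).sum())
    grad_deg = np.degrees(ang2) / 2          # dominant gradient direction
    return (grad_deg + 90) % 180             # bars run orthogonal to it

# Hypothetical "reconstructed" grating with bars oriented at 30 deg
bar_ori = 30.0
normal = np.deg2rad(bar_ori + 90)            # luminance-modulation direction
yy, xx = np.mgrid[0:64, 0:64].astype(float)
img = np.sin(2 * np.pi * (xx * np.cos(normal) + yy * np.sin(normal)) / 16)

est = extract_orientation(img)
err = min(abs(est - bar_ori), 180 - abs(est - bar_ori))   # circular error
```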
To estimate the impact of CFS-induced V1 suppression on downstream processing, we also recorded neuronal responses from two V2 FOVs in Monkey A (FOVs 3 & 4). As anticipated, V2 neurons were binocular, with over 90% of them showing ODIs within the range of -0.2 to 0.2 (Fig. 4A). Similar to the V1 results for the same monkey, CFS on average reduced the amplitudes of the population orientation tuning functions by 80.05% and the slopes by 89.44% (Fig. 4B). It also reduced the Fisher information to 33.1% of the baseline value (Fig. 4C). Furthermore, we applied the same orientation classification and image reconstruction procedures to the V2 data. For orientation classification, the SVM decoders achieved near-perfect performance in distinguishing neighboring orientations under both baseline and CFS conditions, with classification accuracies exceeding 98% across all cases (Fig. 4D). In the image reconstruction task, the baseline models outperformed the CFS models. Specifically, the baseline transformer models reconstructed the stimulus images with median SSIM values of 0.61 and 0.53 for the two V2 FOVs, respectively, which dropped to 0.42 and 0.18 in the CFS models (Fig. 4E). The resulting errors of extracted orientations increased from 2.53° and 3.05° with the baseline models to 7.00° and 42.45° with the CFS models (Fig. 4F), implying poorer or failed reconstruction and unconscious “perception” of stimulus orientations.

Effects of CFS on V2 orientation responses.
A. OD maps of the two V2 FOVs of Monkey A (MA3 & MA4). B. Population orientation tuning functions for all orientation-tuned neurons under baseline and CFS conditions. Solid lines represent the results of Gaussian fittings. Error bars represent ±1 SE. C. Fisher information as a function of relative orientation (to the neuron’s preferred orientation) under baseline and CFS conditions. Shaded areas denote ±1 SE. Fisher information was lower in MA4 due to higher variability in the data. D. Orientation classification accuracies under CFS vs. baseline conditions using SVM decoders. Each datum represents results from a contralateral or ipsilateral grating condition with one FOV, averaged across 5-fold cross-validations. Error bars denote 95% confidence intervals. E. Box plots of SSIM scores between the original and reconstructed images with baseline and CFS transformers. Within a FOV, results from contralateral eye and ipsilateral eye conditions are combined. F. Distributions of orientation errors between the original orientation and the extracted orientation.
Discussion
Our study demonstrates that CFS severely compromises orientation information in V1 neurons in an ocular dominance-dependent manner. Orientation information carried by neurons preferring the masker eye or both eyes is completely or nearly completely wiped out, while information carried by those preferring the grating eye is partially retained.
Downstream, orientation information in V2 neurons is also substantially weakened. Linear decoding and transformer models suggest that CFS-compromised orientation information may still allow coarse orientation discrimination, but will most likely impair orientation perception when the suppression is sufficiently strong, as in Monkey A. Similarly strong suppression would also be expected in Monkey B if the grating contrast used here (0.45) were lowered to 0.1-0.3, as in some CFS experiments (Alais, Coorey, Blake, & Davidson, 2024; Lunghi & Pooresmaeili, 2023; Watanabe et al., 2011; Yuval-Greenberg & Heeger, 2013).
CFS-compromised V1 orientation information is still transmitted for downstream visual processing, which may explain the unconscious orientation processing observed in human CFS studies. The “invisible” orientation information can be processed, as demonstrated by adaptation (Bahrami, Carmel, Walsh, Rees, & Lavie, 2008; Kanai, Tsuchiya, & Verstraten, 2006) and priming (Koivisto & Grassini, 2018) studies. The adaptation aftereffect is reduced compared to the visible condition but not entirely abolished (Bahrami et al., 2008; Kanai et al., 2006), likely a result of the degraded orientation information surviving CFS. For the same reason, the priming effect also decreases during trials in which the stimulus is rendered invisible by CFS, compared to those in which the stimulus is visible or partially visible (Koivisto & Grassini, 2018), as the degraded stimulus information provides insufficient evidence for decision-making, resulting in a diminished priming effect (Dehaene, 2011; Gomez, Perea, & Ratcliff, 2013).
Furthermore, our linear decoding and transformer results can help elucidate the debate on whether visual processing still functions at the categorization level under the influence of CFS. Previous studies have provided evidence for the preserved category information of the target, as demonstrated by tool-specific priming effects (Almeida et al., 2010; Almeida et al., 2008) and differential BOLD response patterns between tools and other object categories under CFS (Hesselmann, Hebart, & Malach, 2011; Tettamanti et al., 2017). However, an intriguing question is: Do these results rather reflect low-level feature differences between tools and other object categories? It has been reported that elongated objects, irrespective of their categorical affiliation, elicit similar priming effects (Sakuraba et al., 2012). Consistent with this, when tools are categorized by their shape (elongated vs. non-elongated), only the neural response patterns elicited by elongated tools can be discriminated from other object categories under CFS (Fogelson, Kohler, Miller, Granger, & Tse, 2014; Ludwig, Kathmann, Sterzer, & Hesselmann, 2015). Moreover, a recent study measuring the contrast thresholds required to both break from and suppress CFS found that stimuli exhibited similar suppression strengths across various categories (Alais et al., 2024). According to our results, when suppression is too strong to allow for stimulus reconstruction, as in the case of Monkey A (Fig. 3C), the orientation information under CFS may not accumulate to a level sufficient for resolving semantic category boundaries. The latter might require somewhat intact stimulus orientation, even if subconsciously. However, it could potentially assist in category discrimination when categorical differences lie in certain low-level shape dimensions like orientation, as coarse orientation discrimination appears unaffected by CFS suppression (Fig. 3A).
In the framework of global neuronal workspace theory (Mashour, Roelfsema, Changeux, & Dehaene, 2020; Seth & Bayne, 2022), a stimulus reaches consciousness when it triggers an ‘ignition’, defined as the recurrent processing and amplification of the sensory signal. This ignition enables the stimulus to broadcast and become available to a widespread global neuronal workspace, thereby becoming part of the conscious experience. To achieve this, the feedforward signal needs to be sufficiently strong to reach the ‘consciousness hub’, where high-density connectivity facilitates efficient broadcasting, presumably in the prefrontal cortex (Mashour et al., 2020). In the present study, the target information under CFS is severely compromised in V1 and is therefore unable to reach the ‘consciousness hub’ and trigger an ignition, thus remaining subliminal.
Materials and methods
Monkey preparation
Monkey preparation was identical to procedures reported in previous studies (Guan, Ju, Tao, Tang, & Yu, 2021; Ju, Guan, Tao, Tang, & Yu, 2020; Zhang et al., 2024). Two male rhesus monkeys (Macaca mulatta, aged 5 and 6, respectively) underwent two sequential surgeries under general anesthesia and strictly sterile conditions. During the first surgery, a 20-mm diameter craniotomy was performed on the skull over V1. The dura was opened, and multiple tracks of 100-150 nl of AAV1.hSynap.GCaMP5G.WPRE.SV40 (AV-1-PV2478, titer 2.37e13 GC/ml, Penn Vector Core) were pressure-injected at a depth of ∼350 µm at multiple locations. The dura was then sutured, the skull cap was re-attached with three titanium lugs and six screws, and the scalp was sutured. After the surgery, the animal was returned to its cage and treated with injectable antibiotics (ceftriaxone sodium, Youcare Pharmaceutical Group, China) for one week. Postoperative analgesia was also administered. The second surgery was performed 45 days later. A T-shaped steel frame was installed for head stabilization, and an optical window was inserted onto the cortical surface. Data collection could start as early as one week later. More details about the preparation and surgical procedures can be found in Li et al. (2017). The procedures were approved by the Institutional Animal Care and Use Committee, Peking University.
Behavioral task
After a ten-day recovery period following the second surgery, monkeys were placed in a primate chair with head restraint. They were trained to hold fixation on a small white spot (0.2°), with eye positions monitored by an Eyelink-1000 eye tracker (SR Research) at a 1000-Hz sampling rate. During the experiment, trials in which the eye position deviated 1.5° or more from fixation before stimulus offset were discarded as saccade trials and repeated.
Visual stimuli and experimental procedures
Visual stimuli were generated with MATLAB-based Psychtoolbox-3 (Pelli & Zhang, 1991) and presented on a ROG Swift PG278QR monitor (refresh rate = 120 Hz, resolution = 2560 × 1440 pixels, pixel size = 0.23 mm × 0.23 mm). The screen luminance was linearized with an 8-bit look-up table, and the mean luminance was 47 cd/m². The viewing distance was 60 cm.
A drifting square-wave grating (spatial frequency = 4 cpd, contrast = full, speed = 3 cycles/sec, starting phase = 0°, size = 0.4° in diameter) was first used to determine the population receptive field (pRF) location, shape, and approximate size associated with a specific FOV. The same stimulus was also monocularly presented to confirm the V1 location, as ocular dominance columns would appear. This fast process used a 4× objective lens mounted on the two-photon microscope and did not provide cell-specific information. The recorded V1 pRFs were centered at ∼0.90° eccentricity in Monkey A and ∼1.93° in Monkey B. V2 pRFs were centered at ∼0.67° in Monkey A. All pRFs were approximately circular with a diameter of 0.9°.
The target stimulus used in the experiments was a 0.45-contrast circular-windowed square-wave grating. It drifted at 4 cycles per second in opposite directions perpendicular to the orientation with a starting phase of 0°, and varied at 12 orientations (0° to 165° in 15° increments) and two spatial frequencies (3 & 6 cpd) trial by trial. The circular envelope had a diameter of 1°, which approximated the size of pRFs for recorded FOVs, with the edge blurred by a linear ramp starting at a radius of 0.38°. The flashing masker was a circular white noise pattern with a diameter of 1.89°, a contrast of 0.5, and a flickering rate of 10 Hz. The white noise consisted of randomly generated black and white blocks (0.07° × 0.07° each). The target grating and the flashing masker were presented through a pair of NVIDIA 3D Vision 2 active shutter glasses. To mitigate the ghost image, a low contrast (RMS contrast = 0.08) white noise was added to the grating. The width of the noise element was half of the bar width of the square grating, and the white noise was regenerated every frame.
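The target grating can be sketched in NumPy from the parameters above; the pixels-per-degree value is an assumption for illustration, and the added anti-ghosting noise is omitted for brevity:

```python
import numpy as np

ppd = 100                                   # pixels per degree (assumed)
r = np.linspace(-0.5, 0.5, ppd)             # 1-deg wide pixel grid, in degrees
xx, yy = np.meshgrid(r, r)

theta = np.deg2rad(45)                      # one of the 12 orientations
sf = 3.0                                    # cycles/deg (3 or 6 in the study)
phase = xx * np.cos(theta) + yy * np.sin(theta)
grating = 0.45 * np.sign(np.sin(2 * np.pi * sf * phase))   # square wave, 0.45 contrast

# Circular envelope: full contrast inside a 0.38-deg radius, then a linear
# ramp down to zero at the 0.5-deg edge of the 1-deg window
rad = np.sqrt(xx ** 2 + yy ** 2)
envelope = np.clip((0.5 - rad) / (0.5 - 0.38), 0, 1)
stim = 0.5 * (1 + grating * envelope)       # luminance around mean gray
```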
Each block of trials consisted of four groups of stimuli: binocular, monocular, CFS, and flashing masker-only. In the binocular group, the grating was presented to both eyes simultaneously; the relevant data were only used to help identify ROIs and orientation-tuned neurons along with data from other stimulus conditions. In the monocular group, the grating was presented monocularly to the contralateral or ipsilateral eye, serving as the baseline conditions without the influence of CFS. In the CFS group, the grating and flashing masker were presented dichoptically. In the flashing masker-only group, the flashing masker was presented monocularly to either eye. Each stimulus condition was repeated for 10-12 trials. For conditions involving the grating, the trials were split between the two opposite drifting directions. Each block contained 242 trials, two trials for each stimulus condition, with the order of stimulus conditions arranged in a pseudorandom manner. There were 5 to 6 blocks of trials with each FOV.
Each stimulus was presented for 1000 ms, followed by an inter-stimulus interval of 1500 ms, allowing sufficient time for the calcium signals to return to the baseline level (Guan, Zhang, Zhang, Tang, & Yu, 2020). For each FOV, the recording was completed in a single session with 5-6 experiment blocks and lasted for 2-3 hours.
Two-photon imaging
Two-photon imaging was performed using a FENTOSmart two-photon microscope (Femtonics) with a Ti:sapphire laser (Mai Tai eHP, Spectra Physics). GCaMP5 was chosen as the calcium indicator because its fluorescence signals are linearly proportional to neuronal spiking over a wide range of firing rates (10-150 Hz) (Li et al., 2017). During imaging, a 16× objective lens (0.8 N.A., Nikon) with a resolution of 1.6 µm/pixel was used, along with a 1000-nm femtosecond laser. A fast resonant scanning mode (32 fps) was chosen to obtain continuous images of neuronal activity (8 frames per second after averaging every 4 frames). The strength of the fluorescence signal (mean luminance of a small area) was monitored and, when necessary, adjusted to compensate for signal drift. Two recording fields of view (FOVs) measuring 850 × 850 µm² were selected in V1 in both macaques, and two FOVs of the same size were selected in V2 in Macaque A.
Imaging data analysis: Initial screening of ROIs
Data were analyzed with customized MATLAB code. A normalized cross-correlation based translation algorithm was used to reduce motion artifacts (Li et al., 2017). Fluorescence changes were then associated with the corresponding visual stimuli through timing information recorded by a Neural Signal Processor (Cerebus system, Blackrock Microsystems). By subtracting the mean of the 4 frames before stimulus onset (F0) from the average of the 6th-9th frames after stimulus onset (F) across 5 or 6 repeated trials of the same stimulus condition (same orientation, spatial frequency, size, and drifting direction), the differential image (ΔF = F - F0) was obtained.
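The ΔF computation can be sketched as follows on a simulated imaging stack; the array sizes, onset frame index, and evoked-response amplitude are illustrative stand-ins:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical stack for one stimulus condition: (trials, frames, h, w),
# at 8 frames/s after averaging, with stimulus onset at frame index `onset`
onset = 4
frames = rng.normal(1.0, 0.05, (6, 16, 8, 8))
frames[:, onset + 5:onset + 9] += 0.5   # simulated evoked response

trial_avg = frames.mean(axis=0)                     # average repeated trials
f0 = trial_avg[onset - 4:onset].mean(axis=0)        # 4 pre-onset frames (F0)
f = trial_avg[onset + 5:onset + 9].mean(axis=0)     # 6th-9th post-onset frames (F)
dF = f - f0                                         # differential image
```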
For a specific FOV, the regions of interest (ROIs), or possible cell bodies, were identified through sequential analysis of 242 differential images, in the order of CFS, monocular, binocular, and flashing masker-only conditions. The CFS conditions comprised 96 differential images (2 eyes × 2 spatial frequencies × 12 orientations × 2 motion directions), with the grating presented to either eye. The monocular conditions were identical to the CFS conditions except that the flashing masker was absent. In the binocular conditions, gratings at two spatial frequencies, twelve orientations, and two motion directions were presented binocularly, yielding 48 differential images. The flashing masker-only conditions consisted of the flashing masker presented to either eye, yielding 2 differential images.
The first differential image was filtered with a band-pass Gaussian filter (size = 2–10 pixels), and connected subsets of pixels (>25 pixels, which would exclude smaller vertical neuropils) with an average pixel value >3 standard deviations above the mean brightness were selected as ROIs. The areas of these ROIs were then set to the mean brightness in the next differential image before the band-pass filtering and thresholding were performed. This measure gradually reduced the standard deviations of the differential images and facilitated the detection of neurons with relatively low fluorescence responses. If a new ROI overlapped with an existing ROI from a previous differential image, the new ROI was retained as a separate ROI if the overlapping area OA < 1/4 ROInew, discarded if 1/4 ROInew < OA < 3/4 ROInew, and merged with the existing ROI if OA > 3/4 ROInew. The merges helped smooth the contours of the final ROIs. This process iterated through all differential images twice to select ROIs. Finally, the roundness of each ROI was calculated as:

roundness = 4πA/P²,

where A was the ROI’s area and P was its perimeter. Only ROIs with a roundness larger than 0.9, which would exclude horizontal neuropils, were selected for further analysis.
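The roundness criterion can be sketched as follows. This is an illustrative Python re-implementation (the original analysis used MATLAB) of the standard circularity formula 4πA/P², which equals 1 for a perfect circle and approaches 0 for elongated shapes:

```python
import numpy as np

def roundness(area, perimeter):
    """Circularity of an ROI: 1.0 for a perfect circle,
    smaller for elongated shapes such as horizontal neuropils."""
    return 4 * np.pi * area / perimeter ** 2
```

For a circle of radius r (area πr², perimeter 2πr), the value is exactly 1; the paper's threshold of 0.9 retains only near-circular ROIs.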
Imaging data analysis: Orientation tuning and ocular dominance
The ratio of fluorescence change (ΔF/F0) was calculated as a neuron’s response to a specific stimulus condition. For a specific neuron’s response to a specific stimulus condition, F0n of the n-th trial was the average of the 4 frames before stimulus onset (−500 to 0 ms), and Fn was the average of the 5th–8th, 6th–9th, or 7th–10th frames after stimulus onset, whichever was greatest. F0n was then averaged across the 10 or 12 repeated trials to obtain the baseline F0 for all trials (to reduce noise in the calculation of responses), and ΔFn/F0 = (Fn − F0)/F0 was taken as the neuron’s response to this stimulus on the n-th trial.
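As a concrete illustration of this per-trial response calculation, here is a Python sketch of the MATLAB procedure described above (frame indices follow the 8 frames-per-second timeline; the function name and array layout are our own):

```python
import numpy as np

def df_over_f(trials, pre_frames=4):
    """Per-trial dF/F0: F0 is the pre-stimulus mean pooled across trials;
    Fn is the best of the three 4-frame post-stimulus windows
    (5th-8th, 6th-9th, or 7th-10th frames after onset)."""
    trials = np.asarray(trials, dtype=float)      # shape: (n_trials, n_frames)
    f0 = trials[:, :pre_frames].mean()            # baseline averaged across trials
    onset = pre_frames
    responses = []
    for tr in trials:
        # candidate windows start 4, 5, or 6 frames after onset, length 4
        windows = [tr[onset + s: onset + s + 4].mean() for s in (4, 5, 6)]
        responses.append((max(windows) - f0) / f0)
    return np.array(responses)
```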
Several steps were taken to determine whether a neuron was orientation-selective. For each monocular or binocular condition, the orientation and SF eliciting the maximal response were designated as the neuron’s preferred orientation and SF. We then compared responses across all 12 orientations at the preferred SF using a non-parametric Friedman test to determine whether the neuron’s responses at various orientations differed significantly from each other. To reduce Type I errors, the significance level was set at α = 0.01. Neurons that passed the Friedman test under at least one viewing condition were selected as orientation-tuned neurons.
The ocular dominance index (ODI) was calculated to characterize each neuron’s eye preference: ODI = (Ri − Rc)/(Ri + Rc), where Ri and Rc were the neuron’s peak responses at the best orientation and SF under the ipsilateral and contralateral monocular grating conditions, respectively. An ODI of −1 or 1 indicates complete contralateral or ipsilateral eye dominance, respectively, and an ODI of 0 indicates equal input from both eyes.
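The ODI formula is a simple normalized contrast between the two monocular peak responses; a minimal Python sketch:

```python
def odi(r_ipsi, r_contra):
    """Ocular dominance index: +1 fully ipsilateral-dominant,
    -1 fully contralateral-dominant, 0 balanced between eyes."""
    return (r_ipsi - r_contra) / (r_ipsi + r_contra)
```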
Population orientation tuning
For each neuron, neural responses at the preferred SF were selected for tuning analysis. To derive population orientation tuning curves under a specific condition, we categorized neurons into twelve orientation preference bins according to their preferred orientations (bin width = 15°). For each orientation presented, the responses of all orientation preference bins were reorganized according to the relative orientation preference. Subsequently, neuronal responses of the same relative orientation preference were averaged to generate the final population orientation tuning function. For CFS conditions, the selected SF and binning procedures were the same as their corresponding monocular conditions.
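The realignment-and-averaging step above can be sketched as a circular shift of each neuron's 12-point tuning curve so that its preferred-orientation bin lands at a common center. This is an illustrative Python version of the binning procedure (the bin indexing convention is our own assumption):

```python
import numpy as np

def population_tuning(responses, pref_bins, center=6):
    """Align each neuron's 12-bin tuning curve so its preferred-orientation
    bin sits at `center` (relative orientation preference = 0), then
    average across neurons to form the population tuning function."""
    responses = np.asarray(responses, dtype=float)   # (n_neurons, 12)
    aligned = np.stack([np.roll(r, center - b)
                        for r, b in zip(responses, pref_bins)])
    return aligned.mean(axis=0)
```

After alignment, every neuron peaks at the same relative-orientation bin, so averaging yields a single population tuning curve rather than washing out individual preferences.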
The population orientation tuning function was fitted with a Gaussian model using MATLAB’s nonlinear least-squares function ‘lsqnonlin’:

R(θ) = a·exp(−(θ − θ₀)²/(2σ²)) + b,

where R(θ) was the response at orientation θ, and the free parameters a, θ₀, σ, and b were the amplitude, peak orientation, standard deviation of the Gaussian function (a measure of tuning width), and minimal response, respectively.
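The Gaussian tuning model itself can be written compactly in Python; this is only the model function (the paper fitted it with MATLAB's lsqnonlin, for which scipy.optimize.least_squares would be the usual Python analogue):

```python
import numpy as np

def gauss_tuning(theta, a, theta0, sigma, b):
    """Gaussian orientation tuning: R(theta) = a*exp(-(theta-theta0)^2
    / (2*sigma^2)) + b, with amplitude a, peak theta0, width sigma,
    and baseline (minimal response) b."""
    theta = np.asarray(theta, dtype=float)
    return a * np.exp(-(theta - theta0) ** 2 / (2 * sigma ** 2)) + b
```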
The population orientation tuning curves for different eye preference groups were derived using the same procedure, with additional binning of neurons according to their ODI. To obtain the tuning curve of the neurons preferring the eye seeing the grating, responses of neurons with an ODI < -0.2 (preferentially responding to the contralateral eye) under contralateral eye grating presentation and those with an ODI > 0.2 (preferentially responding to the ipsilateral eye) under ipsilateral eye grating presentation were combined. Similarly, for neurons preferring the eye seeing the masker, responses of neurons with an ODI < -0.2 under ipsilateral eye grating presentation and those with an ODI > 0.2 under contralateral eye grating presentation were combined. For binocular neurons (-0.2 < ODI < 0.2), responses under both grating presentation conditions were combined.
Fisher information
The Fisher information quantifies the amount of stimulus information that an optimal decoder can extract from a neuronal population (Pouget, Deneve, Ducom, & Latham, 1999). Assuming independent Gaussian noise distributions, the Fisher information for a population of N neurons was given as

I_F(θ) = Σᵢ₌₁ᴺ fᵢ′(θ)² / σᵢ²(θ),

where fᵢ(θ) was the mean activity of neuron i in response to the presentation angle θ, fᵢ′(θ) was its derivative with respect to θ, and σᵢ²(θ) was the response variance. We fitted each neuron’s response tuning fᵢ(θ) and variance tuning σᵢ(θ) with Gaussian functions and calculated the averaged Fisher information across neurons at each orientation.
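The population sum above is a one-liner once the tuning derivatives and noise standard deviations are in hand; a minimal Python sketch:

```python
import numpy as np

def fisher_information(f_prime, sigma):
    """Fisher information for independent Gaussian noise:
    I_F(theta) = sum_i f_i'(theta)^2 / sigma_i(theta)^2,
    evaluated at one orientation theta."""
    f_prime = np.asarray(f_prime, dtype=float)   # tuning derivatives, one per neuron
    sigma = np.asarray(sigma, dtype=float)       # noise SDs, one per neuron
    return np.sum(f_prime ** 2 / sigma ** 2)
```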
SVM-based orientation classification
In the orientation classification task (Fig. 3A), we trained a support vector machine (SVM) to classify neighboring orientations from standardized population neural activity. The SVM decoder was implemented using MATLAB’s ‘fitcsvm’ function with a linear kernel. To prevent overfitting and evaluate the generalization ability of the model, we employed a 5-fold cross-validation procedure, and the model performance on the validation dataset was reported.
Decoders were trained independently for each experimental condition and each pair of adjacent orientations, resulting in 48 models per FOV (contralateral/ipsilateral × baseline/CFS × 12 neighboring orientation pairs). Neural response data from the two spatial frequencies and the two orientations in a neighboring orientation pair were used as input, with each neuron treated as a feature. In this way, each model was trained on 48 or 40 samples (2 SFs × 2 orientations × 12/10 repeats).
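The 5-fold cross-validation bookkeeping for the 48 (or 40) samples per model can be sketched as follows; the actual decoder used MATLAB's fitcsvm, so this Python snippet only illustrates how each sample set is partitioned into training and validation folds:

```python
import numpy as np

def five_fold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train, validation) index arrays for k-fold cross-validation:
    shuffle the sample indices once, split into k folds, and hold out
    one fold per iteration."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    folds = np.array_split(idx, n_folds)
    for k in range(n_folds):
        val = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        yield train, val
```

Each of the five iterations trains on four folds and reports performance on the held-out fold, which is then averaged, matching the reported validation performance.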
The transformer model
Model input, output, and training procedure
We implemented a transformer-based model to reconstruct grating stimuli from population neuronal responses recorded under different experimental conditions. The model input was a vector of neuronal responses, each element corresponding to an individual neuron, and the output was the reconstructed grating image of size 70 × 70 pixels.
The transformer was trained independently for each experimental condition, resulting in four models per FOV (contralateral/ipsilateral × baseline/CFS). Pilot experiments revealed that our original dataset was insufficient for the model to converge. To address this, we augmented the dataset to four times its original size before training. Augmentation was performed by sampling from a normal distribution centered at each neuron’s response mean, with a standard deviation equal to its original standard deviation. Within the augmented dataset, 6% was reserved for validation. Responses were normalized to [0, 1] before being fed into the model.
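The augmentation step can be sketched as follows, assuming the per-condition response means and standard deviations are taken across repeats (a Python illustration; the exact grouping of conditions in the original pipeline is not specified beyond this):

```python
import numpy as np

def augment(responses, factor=4, seed=0):
    """Expand the dataset to `factor` times its original size by drawing
    synthetic samples from a normal distribution with each neuron's
    response mean and standard deviation."""
    rng = np.random.default_rng(seed)
    responses = np.asarray(responses, dtype=float)   # (n_samples, n_neurons)
    mu = responses.mean(axis=0)
    sd = responses.std(axis=0)
    n_extra = factor * len(responses) - len(responses)
    extra = rng.normal(mu, sd, size=(n_extra, responses.shape[1]))
    return np.vstack([responses, extra])
```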
We implemented a two-phase training procedure to assess the reconstruction ability of models trained on different neural data. During training, the model typically reached two learning plateaus, where the validation loss temporarily stagnated (Fig. 3C). In the first training session, we analyzed the learning curve to determine the epoch at which the baseline model’s validation loss had completed 75% of its total decrease between the two plateaus. This was estimated using a modified sigmoid fit:

L(t) = A + C/(1 + e^(k(t − b))),

where A and C defined the function range, b was the symmetry point, k was the steepness parameter, and t represented the epoch number, counted from epoch 500 (initial epochs were discarded due to a drastic drop in validation loss across all training runs, see Fig. 3C). The 75% decrease point was computed as:

t₇₅ = b + ln(3)/k.
In the second training session, we retrained both models up to the identified epoch and evaluated their performances.
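The 75%-decrease epoch follows directly from the sigmoid fit; under the decreasing-sigmoid parameterization assumed above (the exact form was not given in the text), solving L(t) = A + 0.25·C gives t = b + ln(3)/k:

```python
import numpy as np

def loss_sigmoid(t, A, C, b, k):
    """Assumed decreasing sigmoid fit to the validation loss between
    two plateaus: L(t) = A + C / (1 + exp(k*(t - b)))."""
    return A + C / (1 + np.exp(k * (t - b)))

def epoch_at_75pct(b, k):
    """Epoch where 75% of the plateau-to-plateau loss decrease is done:
    L(t) = A + 0.25*C  =>  exp(k*(t - b)) = 3  =>  t = b + ln(3)/k."""
    return b + np.log(3) / k
```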
The model was trained to minimize the mean squared error (MSE) between the reconstructed and actual stimuli. Optimization was performed using RMSprop with a learning rate of 0.00005 and a smoothing factor ρ = 0.85.
Model structure
Each neuron’s response was embedded into a higher-dimensional space using a learned weight vector as follows:

R₁ = R₀ ⊙ W_emb,

where R₀ (n × 1) represented the original response vector from n neurons, and W_emb (n × d_model) was the embedding weight matrix, with each row corresponding to a neuron-specific weight vector. Here we used d_model = 2. The symbol ⊙ denoted row-wise multiplication, such that the i-th row of R₁ was the i-th scalar response multiplied by the i-th weight vector of W_emb.
The enriched embedding matrix was then passed through a self-attention module. In this module, R₁ was first projected into queries (Q), keys (K), and values (V) through independent learnable weight matrices. Then the attention map was computed as:

A = softmax(QKᵀ/√d_k),

where d_k represented the dimensionality of the key vectors, which scaled the dot product to control the variance of the attention scores.
The output of self-attention was calculated as:

Output = A·V.
The output of the self-attention module was unembedded by projecting each neuron’s high-dimensional representation back to one dimension. A feedforward layer then transformed the unembedded vector into a stimulus vector, which was reshaped into the final 70 × 70 image.
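The embedding and single-head self-attention steps can be condensed into a short numpy sketch (illustrative only; the original model also learns the projection matrices and downstream layers):

```python
import numpy as np

def self_attention(R1, Wq, Wk, Wv):
    """Single-head self-attention over the embedded responses:
    Output = softmax(Q K^T / sqrt(d_k)) V."""
    Q, K, V = R1 @ Wq, R1 @ Wk, R1 @ Wv
    dk = K.shape[-1]
    scores = Q @ K.T / np.sqrt(dk)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)       # rows of A sum to 1
    return attn @ V
```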
Model evaluation
Model evaluation was performed on the original, non-augmented data; note that these data had been seen during training, as they formed the basis of both the augmented training and validation sets. We used the structural similarity index (SSIM) (Brunet et al., 2012) and a gradient-based orientation extraction procedure to quantify reconstruction performance.
The SSIM between two images x and y (both 70 × 70) is defined as:

SSIM(x, y) = ((2μₓμᵧ + c₁)(2σₓᵧ + c₂)) / ((μₓ² + μᵧ² + c₁)(σₓ² + σᵧ² + c₂)),

where μₓ and μᵧ are the mean intensities, σₓ² and σᵧ² are the variances, σₓᵧ is the covariance between x and y, and c₁ and c₂ are small constants that stabilize the division.
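A global (single-window) version of this index is straightforward to implement; the constants below are illustrative placeholders, not the values used in the paper:

```python
import numpy as np

def ssim(x, y, c1=1e-4, c2=9e-4):
    """Global structural similarity between two images:
    ((2*mx*my + c1)(2*cov + c2)) / ((mx^2 + my^2 + c1)(vx + vy + c2)).
    Equals 1 for identical images."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Standard SSIM implementations compute this over local windows and average; the global form above suffices to illustrate the formula.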
To extract the stimulus orientation from each Gabor image, we first computed the horizontal and vertical gradients and assembled them into a matrix of pixel-wise gradient vectors. Principal component analysis (PCA) was then applied to this matrix to identify the primary direction of variation in the gradient field, which reflected the dominant orientation of image features. The final orientation was converted to degrees and constrained to the range [0, 180).
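The gradient-plus-PCA orientation extraction can be sketched as follows (an illustrative Python version; edge orientation is taken perpendicular to the dominant gradient direction, which we assume matches the original procedure):

```python
import numpy as np

def dominant_orientation(img):
    """Estimate the dominant orientation of an image via PCA on the
    pixel-wise gradient vectors. Returns degrees in [0, 180)."""
    gy, gx = np.gradient(np.asarray(img, dtype=float))
    g = np.stack([gx.ravel(), gy.ravel()], axis=1)   # (n_pixels, 2)
    cov = g.T @ g                                    # gradient covariance
    vals, vecs = np.linalg.eigh(cov)
    vx, vy = vecs[:, np.argmax(vals)]                # principal gradient direction
    angle = np.degrees(np.arctan2(vy, vx)) % 180
    return (angle + 90) % 180                        # edges run perpendicular to gradients
```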
Code and data availability
The code is available on GitHub: https://github.com/caviaryusi/CFS/blob/main/README.md
The data are available on HuggingFace (Hugging Face, RRID:SCR_020958): https://huggingface.co/datasets/chencaixia/CFS_2p
Acknowledgements
This study was supported by a STI2030-Major Projects grant (2022ZD0204600), Natural Science Foundation of China grants 31230030 and 31730109, and funds from Peking-Tsinghua Center for Life Sciences, Peking University.
Additional information
Funding
Ministry of Science and Technology of the People's Republic of China (2022ZD0204600)
National Natural Science Foundation of China (31230030)
National Natural Science Foundation of China (31730109)
References
- High-Level Face Adaptation Without Awareness. Psychological Science 21:205–210 https://doi.org/10.1177/0956797609359508
- A new ‘CFS tracking’ paradigm reveals uniform suppression depth regardless of target complexity or salienceeLife 12:RP91019https://doi.org/10.7554/eLife.91019Google Scholar
- The Role of the Dorsal Visual Processing Stream in Tool IdentificationPsychological Science 21:772–778https://doi.org/10.1177/0956797610371343Google Scholar
- Unconscious processing dissociates along categorical linesProceedings of the National Academy of Sciences of the United States of America 105:15214–15218https://doi.org/10.1073/pnas.0805867105Google Scholar
- Effects of Noise Correlations on Information Encoding and DecodingJournal of Neurophysiology 95:3633–3644https://doi.org/10.1152/jn.00919.2005Google Scholar
- Unconscious orientation processing depends on perceptual loadJournal of Vision 8:12–12https://doi.org/10.1167/8.3.12Google Scholar
- On the Mathematical Properties of the Structural Similarity IndexIEEE Transactions on Image Processing 21:1488–1499https://doi.org/10.1109/TIP.2011.2173206Google Scholar
- Representation of concurrent stimuli by population activity in visual cortexNeuron 64:931–942https://doi.org/10.1016/j.neuron.2009.11.004Google Scholar
- Conscious and Nonconscious Processes: Distinct Forms of Evidence Accumulation? In: Rivasseau V. (Ed.)
- Cortical responses to invisible objects in the human dorsal and ventral pathwaysNature Neuroscience 8:1380–1385https://doi.org/10.1038/nn1537Google Scholar
- Unconscious neural processing differs with method used to render stimuli invisibleFrontiers in Psychology 5https://doi.org/10.3389/fpsyg.2014.00601Google Scholar
- A diffusion model account of masked versus unmasked priming: Are they qualitatively different?Journal of Experimental Psychology: Human Perception and Performance 39:1731–1740https://doi.org/10.1037/a0032333Google Scholar
- Faces and awareness: low-level, not emotional factors determine perceptual dominanceEmotion 13:537–544https://doi.org/10.1037/a0031403Google Scholar
- Functional organization of spatial frequency tuning in macaque V1 revealed with two-photon calcium imagingProgress in Neurobiology 205:102–120https://doi.org/10.1016/j.pneurobio.2021.102120Google Scholar
- Plaid detectors in macaque V1 revealed by two-photon calcium imagingCurrent Biology 30:934–940https://doi.org/10.1016/j.cub.2020.01.005Google Scholar
- Differential BOLD activity associated with subjective and objective reports during “blindsight” in normal observersJournal of Neuroscience 31:12936–12944https://doi.org/10.1523/JNEUROSCI.1556-11.2011Google Scholar
- The link between fMRI-BOLD activation and perceptual awareness is stream-invariant in the human visual systemCerebral Cortex 21:2829–2837https://doi.org/10.1093/cercor/bhr085Google Scholar
- Intrinsic Variability of Ocular Dominance Column Periodicity in Normal Macaque MonkeysThe Journal of Neuroscience 16:7228https://doi.org/10.1523/JNEUROSCI.16-22-07228.1996Google Scholar
- Receptive fields, binocular interaction and functional architecture in the cat’s visual cortexThe Journal of Physiology 160:106–154https://doi.org/10.1113/jphysiol.1962.sp006837Google Scholar
- Processing of Invisible Stimuli: Advantage of Upright Faces and Recognizable Words in Overcoming Interocular SuppressionPsychological Science 18:349–355https://doi.org/10.1111/j.1467-9280.2007.01902.xGoogle Scholar
- Orientation Tuning and End-stopping in Macaque V1 Studied with Two-photon Calcium ImagingCerebral Cortex 31:2085–2097https://doi.org/10.1093/cercor/bhaa346Google Scholar
- The Scope and Limits of Top-Down Attention in Unconscious Visual ProcessingCurrent Biology 16:2332–2336https://doi.org/10.1016/j.cub.2006.10.001Google Scholar
- Binocular interaction on monocularly discharged lateral geniculate and striate neurons in the catJ Neurophysiol 46:932–951https://doi.org/10.1152/jn.1981.46.5.932Google Scholar
- Unconscious response priming during continuous flash suppressionPLOS One 13:e0192201https://doi.org/10.1371/journal.pone.0192201Google Scholar
- Long-term two-photon imaging in awake macaque monkeyNeuron 93:1049–1057https://doi.org/10.1016/j.neuron.2017.01.027Google Scholar
- Ocular dominance columns in New World monkeysThe Journal of Neuroscience 16:2086https://doi.org/10.1523/JNEUROSCI.16-06-02086.1996Google Scholar
- Investigating category- and shape-selective neural processing in ventral and dorsal visual stream under interocular suppressionHuman Brain Mapping 36:137–149https://doi.org/10.1002/hbm.22618Google Scholar
- Learned value modulates the access to visual awareness during continuous flash suppressionScientific Reports 13:756https://doi.org/10.1038/s41598-023-28004-5Google Scholar
- Dichoptic Visual Masking Reveals that Early Binocular Neurons Exhibit Weak Interocular Suppression: Implications for Binocular Vision and Visual AwarenessJournal of Cognitive Neuroscience 16:1049–1059https://doi.org/10.1162/0898929041502788Google Scholar
- Conscious Processing and the Global Neuronal Workspace HypothesisNeuron 105:776–798https://doi.org/10.1016/j.neuron.2020.01.026Google Scholar
- A role for ocular dominance in binocular integrationCurrent Biology 33:3884–3895https://doi.org/10.1016/j.cub.2023.08.019Google Scholar
- Scene Integration Without Awareness: No Conclusive Evidence for Processing Scene Congruency During Continuous Flash SuppressionPsychol Sci 27:945–956https://doi.org/10.1177/0956797616642525Google Scholar
- A critical reexamination of doing arithmetic nonconsciouslyPsychon Bull Rev 25:472–481https://doi.org/10.3758/s13423-017-1292-xGoogle Scholar
- Continuous Flash Suppression: Stimulus Fractionation rather than IntegrationTrends Cogn Sci 21:719–721https://doi.org/10.1016/j.tics.2017.06.005Google Scholar
- Integration without awareness: expanding the limits of unconscious processingPsychol Sci 22:764–770https://doi.org/10.1177/0956797611408736Google Scholar
- Accurate control of contrast on microcomputer displaysVision Research 31:1337–1350https://doi.org/10.1016/0042-6989(91)90055-AGoogle Scholar
- Narrow Versus Wide Tuning Curves: What’s Best for a Population Code?Neural Computation 11:85–90https://doi.org/10.1162/089976699300016818Google Scholar
- Continuous flash suppression: Known and unknownsPsychonomic Bulletin & Review 27:1071–1103https://doi.org/10.3758/s13423-020-01771-2Google Scholar
- Does the human dorsal stream really process a category for tools?Journal of Neuroscience 32:3949–3953https://doi.org/10.1523/JNEUROSCI.3973-11.2012Google Scholar
- Theories of consciousnessNature Reviews Neuroscience 23:439–452https://doi.org/10.1038/s41583-022-00587-4Google Scholar
- Reading and doing arithmetic nonconsciouslyProc Natl Acad Sci U S A 109:19614–19619https://doi.org/10.1073/pnas.1211645109Google Scholar
- Prioritization of emotional faces is not driven by emotional contentSci Rep 13:549https://doi.org/10.1038/s41598-022-25575-7Google Scholar
- Unaware processing of tools in the neural system for object-directed action representationJournal of Neuroscience 37:10712–10724https://doi.org/10.1523/JNEUROSCI.1061-17.2017Google Scholar
- Continuous flash suppression reduces negative afterimagesNature Neuroscience 8:1096–1101https://doi.org/10.1038/nn1500Google Scholar
- Attention But Not Awareness Modulates the BOLD Signal in the Human V1 During Binocular SuppressionScience 334:829–831https://doi.org/10.1126/science.1203161Google Scholar
- On the use of continuous flash suppression for the study of visual processing outside of awarenessFrontiers in Psychology 5https://doi.org/10.3389/fpsyg.2014.00724Google Scholar
- Continuous Flash Suppression Modulates Cortical Activity in Early Visual CortexThe Journal of Neuroscience 33:9635https://doi.org/10.1523/JNEUROSCI.4612-12.2013Google Scholar
- Suppressed semantic information accelerates analytic problem solvingPsychon Bull Rev 20:581–585https://doi.org/10.3758/s13423-012-0364-1Google Scholar
- Ocular dominance-dependent binocular combination of monocular neuronal responses in macaque V1eLife 13:RP92839https://doi.org/10.7554/eLife.92839Google Scholar
Article and author information
Cite all versions
You can cite all versions using the DOI https://doi.org/10.7554/eLife.107518. This DOI represents all versions, and will always resolve to the latest one.
Copyright
© 2025, Chen et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.