Abstract
Continuous flash suppression (CFS), in which a dynamic masker presented to one eye suppresses the conscious perception of a stimulus shown to the other eye, has been extensively used to study visual consciousness. Various studies have reported high-level visual and cognitive functions under CFS; however, these findings have more recently been questioned and at least partially attributed to low-level stimulus properties. A key but unsettled issue is the extent to which the responses of V1 neurons, where inputs from the two eyes first merge, are affected, as severely suppressed V1 responses would not sustain high-level processing. Here, we used two-photon calcium imaging to record the responses of large samples of V1 neurons to a grating stimulus under CFS in awake, fixating macaques. The results revealed that CFS substantially suppressed V1 orientation responses. Ocularity-wise, it nearly completely eliminated the orientation responses of V1 neurons preferring the masker eye or both eyes, while also significantly suppressing the responses of those preferring the grating eye. Modeling analyses suggest that, under CFS, the brain retains the ability to classify coarse orientations, but may become less capable of reconstructing the grating stimulus. Consequently, while CFS-suppressed orientation information still supports low-level orientation discrimination, it may not suffice for high-level visual and cognitive processing.
Introduction
When a target stimulus is presented to one eye and a flickering Mondrian-like masker to the other eye, the target can be rendered invisible for an extended period (Tsuchiya & Koch, 2005). This paradigm, known as continuous flash suppression (CFS), has been widely used to investigate subconscious visual processing (Yang, Brascamp, Kang, & Blake, 2014; Moors, Hesselmann, Wagemans, & van Ee, 2017; Pournaghdali & Schwartz, 2020). Among the most intriguing findings are the subconscious high-level visual and cognitive functions under the influence of CFS (e.g., Fang & He, 2005; Almeida, Mahon, Nakayama, & Caramazza, 2008; Adams, Gray, Garner, & Graf, 2010; Mudrik, Breska, Lamy, & Deouell, 2011; Sklar et al., 2012; Zabelina et al., 2013; Tettamanti, Conca, Falini, & Perani, 2017). For example, as reported, priming effects are evident when the target and the invisible primer are categorically (Almeida et al., 2008) or semantically (Zabelina et al., 2013) consistent. However, many of these observations have been questioned by more recent studies, with at least some of the high-level effects being attributed to low-level feature processing (Hesselmann & Malach, 2011; Sakuraba, Sakai, Yamanaka, Yokosawa, & Hirayama, 2012; Gray, Adams, Hedger, Newton, & Garner, 2013; Moors, Boelens, van Overwalle, & Wagemans, 2016; Moors et al., 2017; Moors & Hesselmann, 2018; Pournaghdali & Schwartz, 2020; Stuit, Paffen, & Van der Stigchel, 2023).
A critical issue in this debate is the impact of CFS on V1 neuronal activity. CFS has been hypothesized to arise from mechanisms similar to those in binocular rivalry (Tsuchiya & Koch, 2005; Yang et al., 2014; Moors et al., 2017), which likely suppress V1 responses through interocular inhibition. Only the surviving stimulus information would then be relayed to downstream areas for potential subconscious higher-level visual and cognitive processing (Jiang, Costello, & He, 2007; Adams et al., 2010; Almeida, Mahon, & Caramazza, 2010). Importantly, if V1 activity is suppressed to a sufficient degree, the low-level stimulus information carried by the remaining V1 responses may not suffice to sustain high-level processing of more complex stimuli defined by those low-level features.
Two prominent fMRI studies have examined the impact of CFS on V1 activity (Watanabe et al., 2011; Yuval-Greenberg & Heeger, 2013). Watanabe et al. (2011) compared monocular CFS masking (stimulus visible) and dichoptic CFS masking (stimulus invisible), and reported that V1 BOLD responses were largely insensitive to stimulus visibility when attention was carefully controlled. However, using a similar experimental design, Yuval-Greenberg and Heeger (2013) observed reduced BOLD responses in V1 under dichoptic masking, suggesting that V1 activity changed with stimulus visibility. They attributed the difference in results between the two studies mainly to differences in the number of trials and thus the statistical power (∼250 trials per condition vs. ∼90 trials per condition). Nevertheless, these studies were not designed to quantify the pure effect of CFS on stimulus-evoked V1 responses, as they contrasted monocular and dichoptic masking conditions to equate stimulus input while manipulating perceptual visibility. In contrast, the original psychophysical studies (Tsuchiya & Koch, 2005; Tsuchiya, Koch, Gilroy, & Blake, 2006) demonstrated CFS masking by contrasting the visibility of the target stimulus with and without the dichoptic masker. To isolate the pure impact of CFS, the above fMRI studies would have needed to measure the difference in BOLD signals between the binocular masking and stimulus-alone conditions. In other words, the impact of CFS on V1 activity is likely larger than what has been reported by Yuval-Greenberg and Heeger (2013).
Neurons in V1 exhibit various degrees of ocular dominance (Hubel & Wiesel, 1962), which influences each neuron’s binocular combination of monocular visual inputs from the two eyes (Kato, Bishop, & Orban, 1981; Mitchell, Carlson, Westerberg, Cox, & Maier, 2023; Zhang, Zhao, Jiang, Tang, & Yu, 2024). In the present study, we used a with-or-without-dichoptic-masker design similar to those used in the original psychophysical studies, and examined the extent to which V1 neuronal responses were affected by CFS and how neurons preferring the target eye, masker eye, or both eyes were differently impacted. Using a customized two-photon imaging setup for awake macaques (Li, Liu, Jiang, Lee, & Tang, 2017), we sampled large neural populations at cellular resolution and measured ocular dominance for each individual neuron. This approach enabled us to investigate the potentially differential impacts of CFS on the responses of V1 neurons with varying ocular preferences, as well as to apply machine learning tools to understand the impacts of CFS on V1 orientation coding at the population level.
Results
We used two-photon calcium imaging to record responses of V1 superficial neurons from two awake, fixating macaques, each with two response fields of view (FOVs, 850 × 850 µm²) (Fig. 1A). During the initial recording, the stimulus was a binocular 0.45-contrast square-wave grating varying at twelve orientations and two spatial frequencies (3 & 6 cpd) (Fig. 1B). A total of 3,564 neurons were identified through image processing, including 3,004 (84.29%) orientation-tuned neurons that were included in the following data analyses.

Two-photon imaging and ocular dominance mapping.
A. Optical windows for imaging of two macaques. Green crosses indicate the regions for viral vector injections, and yellow boxes indicate the FOVs chosen for imaging. B. Stimuli used for OD mapping. A circular-windowed square-wave grating was presented monocularly to each eye, respectively, to probe each neuron’s ODI. C. Ocular dominance functional maps of each FOV at single-neuron resolution showing OD clusters. D. Frequency distributions of individual neurons’ ODIs in each FOV.
The same grating stimulus was then presented monocularly (Fig. 1B) to each eye to characterize individual neurons’ eye preferences. Each neuron’s ocular dominance index (ODI) was calculated as ODI = (Ri – Rc)/(Ri + Rc), where Ri and Rc were the neuron’s peak responses to ipsilateral and contralateral stimulations, respectively. Neurons with an ODI at –1 or +1 would exclusively prefer the contralateral or ipsilateral eye, while neurons with an ODI at 0 would prefer both eyes equally. Consistent with previous findings (Hubel & Wiesel, 1962; Horton & Hocking, 1996; Livingstone, 1996; Zhang et al., 2024), neurons with similar eye preferences clustered together (Fig. 1C), indicating ocular dominance columns. The ODI followed unimodal distributions (Fig. 1D), in which the majority of neurons were binocular, showing comparable preferences for either eye. Only a small portion of neurons were monocular, being more responsive to the ipsilateral or contralateral eye.
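The ODI definition above can be illustrated with a minimal Python sketch. The function name and the example peak responses are hypothetical and chosen only for illustration, and the sketch assumes non-negative peak responses; it is not the analysis code used in the study.

```python
def ocular_dominance_index(r_ipsi: float, r_contra: float) -> float:
    """ODI = (Ri - Rc) / (Ri + Rc), from peak responses to each eye.

    Assumes non-negative peak responses (an assumption of this sketch).
    """
    total = r_ipsi + r_contra
    if total == 0:
        raise ValueError("ODI is undefined when both peak responses are zero")
    return (r_ipsi - r_contra) / total


# Made-up peak responses, for illustration only.
print(ocular_dominance_index(0.0, 1.2))  # -1.0: responds to the contralateral eye only
print(ocular_dominance_index(0.8, 0.8))  # 0.0: equal preference for both eyes
print(ocular_dominance_index(1.5, 0.0))  # 1.0: responds to the ipsilateral eye only
```

Because the index is normalized by the summed response, it depends only on the ratio of the two monocular responses and not on their absolute magnitudes.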
In a third and last step, the grating stimulus and the flashing noise masker were presented dichoptically to evaluate the impact of CFS on neurons’ orientation responses (Fig. 2A). The results are summarized as population orientation tuning functions under the baseline no-CFS condition and the CFS condition following the procedure in Busse, Wade, and Carandini (2009). Specifically, neurons with similar orientation preferences were binned (bin width = 15°) relative to the target orientation for a total of 12 bins, and the resultant population orientation tuning functions based on the mean responses of these bins (Fig. 2C) were fitted with a Gaussian function. Compared to the baseline population orientation tuning functions, those under the influence of CFS displayed profound reductions in orientation response. The amplitude decreased by 84.18% in Monkey A and 60.78% in Monkey B on the basis of Gaussian fitting, while the slope decreased by 91.31% in Monkey A and 71.50% in Monkey B (Fig. 2B).
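The binning step behind the population tuning functions can be sketched as follows. This is a simplified illustration rather than the study's code: it assumes that bins are centered on multiples of 15° (from −90° to 75°) and that each neuron contributes one mean response per target orientation; the function names are hypothetical, and the subsequent Gaussian fit is omitted.

```python
import math
from collections import defaultdict


def relative_orientation(preferred_deg, target_deg):
    """Wrap (preferred - target) into [-90, 90); orientation has a 180-deg period."""
    return (preferred_deg - target_deg + 90.0) % 180.0 - 90.0


def bin_center(rel_deg, width=15.0):
    """Center of the bin containing rel_deg (bin placement is an assumption here)."""
    center = math.floor((rel_deg + width / 2.0) / width) * width
    return (center + 90.0) % 180.0 - 90.0


def population_tuning(preferred_degs, responses, target_deg):
    """Mean response of neurons grouped by preferred orientation relative to the target."""
    sums, counts = defaultdict(float), defaultdict(int)
    for pref, resp in zip(preferred_degs, responses):
        c = bin_center(relative_orientation(pref, target_deg))
        sums[c] += resp
        counts[c] += 1
    return {c: sums[c] / counts[c] for c in sorted(sums)}


# Four made-up neurons responding to a 0-deg target.
print(population_tuning([0, 5, 170, 90], [1.0, 3.0, 2.0, 4.0], target_deg=0))
```

With 15° bins over a 180° range this yields at most 12 bins, matching the number of bins described above.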

The impacts of CFS on population orientation tuning in two macaques.
A. Stimuli used in the CFS experiment for one macaque. The grating target was presented to one eye, which was dichoptically masked by a circular flashing masker presented to the other eye. The white dot was the fixation point. B. Exemplar baseline and CFS orientation tuning functions for neurons with different eye preferences. C. Population orientation tuning functions of all neurons without CFS as the baseline and with CFS. Data from two FOVs of each monkey were pooled due to highly consistent results. Solid curves are Gaussian fittings. D. Population orientation tuning functions of sub-groups of neurons with different eye preferences without and with CFS. Solid curves are fitting results using an ocular dominance-dependent gain control model elaborated in the supplementary material (Fig. S1). Error bars represent ±1 SE. E. The impacts of CFS on Fisher information. Fisher information is plotted as a function of relative orientation (to the neuron’s preferred orientation) without and with CFS. Shaded areas denote ±1 SE. F. The ratio of baseline/CFS Fisher information within 15° of neurons’ preferred orientations. Data from two FOVs of each monkey were pooled due to highly consistent results.
Furthermore, neurons were divided into three groups according to their ODIs, and the impacts of CFS on their respective orientation responses were examined: neurons preferring the grating eye (ODI > 0.2 or < -0.2, depending on whether the grating stimulation was ipsilateral or contralateral), binocular neurons (-0.2 <= ODI <= 0.2), and neurons preferring the masker eye (ODI < -0.2 or > 0.2 relative to the grating eye). Compared to the baseline condition, the orientation tuning of neurons preferring the masker eye was completely wiped out by CFS (Fig. 2B, D left), leading to flattened tuning curves with unmeasurable amplitudes or bandwidths. The orientation tuning of binocular neurons was either nearly completely wiped out (Monkey A) or substantially abolished (Monkey B) (Fig. 2B, D middle). There were 85.68% and 68.32% decreases in amplitude, and 92.64% and 77.07% decreases in slope, for Monkeys A and B, respectively. The orientation tuning of neurons preferring the grating eye was the least but still substantially affected (Fig. 2B, D right), with respective 77.78% and 41.75% decreases in amplitude and 85.23% and 57.56% decreases in slope for two monkeys.
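The three-way grouping described above can be written compactly by flipping the sign of the ODI so that positive values always indicate a preference for the grating eye. The sketch below is illustrative only; the function name and group labels are hypothetical, while the ±0.2 criterion is taken from the text.

```python
def eye_preference_group(odi: float, grating_eye: str, threshold: float = 0.2) -> str:
    """Classify a neuron relative to the eye that received the grating.

    ODI > 0 means ipsilateral preference, so the sign is flipped when the
    grating was shown to the contralateral eye.
    """
    if grating_eye not in ("ipsi", "contra"):
        raise ValueError("grating_eye must be 'ipsi' or 'contra'")
    signed = odi if grating_eye == "ipsi" else -odi
    if signed > threshold:
        return "grating_eye"
    if signed < -threshold:
        return "masker_eye"
    return "binocular"  # -0.2 <= ODI <= 0.2


print(eye_preference_group(0.5, "ipsi"))    # grating_eye
print(eye_preference_group(0.5, "contra"))  # masker_eye
print(eye_preference_group(0.1, "contra"))  # binocular
```

Under this scheme the same neuron can fall into the grating-eye group in one presentation condition and the masker-eye group in the other.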
To quantify the loss of V1 population orientation encoding due to continuous flash suppression (CFS), we compared the Fisher information (Averbeck & Lee, 2006) under both baseline and CFS conditions. Here, Fisher information serves as a statistical measure that reflects how much information the responses of neurons can provide about the grating orientation. Specifically, it indicates the sensitivity of neural responses to small changes in orientation, in that higher values signify greater precision in encoding orientation information. As illustrated in Fig. 2E, Fisher information was reduced by CFS primarily for orientations deviated by less than 15° from the neurons’ preferred orientations. The average Fisher information for stimuli within this 15° range decreased to 29.1% and 43.4% of the baseline values in two macaques, respectively (Fig. 2F), indicating the detrimental impact of CFS on the ability of V1 populations to accurately encode and represent orientation information, especially for orientations closely aligned with neuronal preferences.
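To make the link between tuning slope and Fisher information concrete, the sketch below uses the textbook expression for independent neurons with fixed Gaussian noise, I(θ) = Σ f′(θ)² / σ². This is a deliberate simplification and not the estimator used in the study: it ignores noise correlations and orientation wrap-around, and all tuning parameters are made up.

```python
import math


def tuning_slope(theta, amp, pref, width):
    """Derivative of a Gaussian tuning curve amp * exp(-(theta - pref)^2 / (2 * width^2))."""
    d = theta - pref
    return -amp * d / width ** 2 * math.exp(-d * d / (2.0 * width ** 2))


def fisher_information(theta, neurons, noise_var):
    """Sum of slope^2 / variance over neurons (independent, equal-variance assumption)."""
    return sum(tuning_slope(theta, a, p, w) ** 2 / noise_var for a, p, w in neurons)


# Hypothetical population: (amplitude, preferred orientation, tuning width) per neuron.
baseline = [(1.0, pref, 20.0) for pref in range(-90, 90, 15)]
gain = 0.4  # hypothetical response gain under suppression
suppressed = [(gain * a, p, w) for a, p, w in baseline]

ratio = fisher_information(10.0, suppressed, 0.04) / fisher_information(10.0, baseline, 0.04)
print(round(ratio, 3))  # 0.16, i.e. gain squared when the noise variance is unchanged
```

In this idealized case a pure gain reduction scales Fisher information by the square of the gain. Measured ratios need not follow this rule, because response variability can also change under suppression.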
What is the impact of CFS-induced suppression on V1 orientation decoding? To answer this question, which is crucial for understanding subconscious processing under CFS, we trained linear decoders to classify neighboring stimulus orientations (15°) in our experiments, as well as transformer models to reconstruct the stimulus images. Here, orientation classification was parallel to coarse orientation discrimination, and image reconstruction was parallel to orientation recognition, both suggesting the upper bounds of performance assuming an ideal observer.
For orientation classification, we trained an all-pair multiclass support vector machine (SVM) classifier to discriminate 12 orientations based on trial-by-trial population neural responses from all trials (Allwein, Schapire, & Singer, 2000). Decoders for different FOVs, ipsilateral/contralateral target presentations, and baseline vs. CFS conditions were trained separately. Under the baseline condition, the decoders achieved mean classification accuracies of 89.5 ± 2.0% and 91.5 ± 2.1% across ipsilateral and contralateral eye conditions in Monkeys A and B, respectively, in contrast to a chance level of 8.3% (1 out of 12). Under CFS, decoding accuracy slightly decreased in Monkey A (81.7 ± 1.9%) but remained stable in Monkey B (90.4 ± 2.1%, Fig. 3A). These results suggest that under CFS, there is still sufficient information for coarse orientation discrimination, even for Monkey A whose V1 neuronal responses were substantially suppressed.
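The all-pair (one-versus-one) scheme can be sketched in a few lines: one binary decision is made for every pair of orientations, and the orientation collecting the most votes is the prediction. In the sketch below a nearest-centroid rule stands in for each trained binary SVM; this is a substitution made only to keep the example self-contained, and the centroids and test response are toy values.

```python
from collections import Counter
from itertools import combinations

ORIENTATIONS = list(range(0, 180, 15))  # 12 classes: 0, 15, ..., 165 deg


def nearest_centroid_rule(centroids):
    """Stand-in for a trained binary SVM: pick the closer of two class centroids."""
    def decide(a, b, x):
        da = sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[a]))
        db = sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[b]))
        return a if da <= db else b
    return decide


def all_pairs_predict(x, classes, decide):
    """Run every pairwise decision and return the class with the most votes."""
    votes = Counter(decide(a, b, x) for a, b in combinations(classes, 2))
    return max(classes, key=lambda c: votes[c])


# Toy population code: one "neuron" per orientation (one-hot centroids).
centroids = {o: [1.0 if i == k else 0.0 for i in range(12)]
             for k, o in enumerate(ORIENTATIONS)}
response = [0.1, 0.0, 0.2, 0.9, 0.3, 0.0, 0.0, 0.1, 0.0, 0.0, 0.0, 0.1]

n_pairs = len(list(combinations(ORIENTATIONS, 2)))
predicted = all_pairs_predict(response, ORIENTATIONS, nearest_centroid_rule(centroids))
print(n_pairs, predicted)  # 66 pairwise classifiers; predicted orientation 45
```

Twelve classes require 12 × 11 / 2 = 66 pairwise classifiers, and guessing among twelve equally likely orientations gives the 8.3% chance level quoted above.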

Decoding consequences of CFS revealed by machine learning.
A. Multiway orientation classification accuracies under CFS vs. baseline conditions obtained using SVM decoders. Each datum represents results from a contralateral or ipsilateral grating condition with a specific FOV averaged across 10-fold cross-validations. Error bars denote 95% confidence intervals. B. A diagram of the transformer model for stimulus image reconstruction. C. Exemplar learning curves of transformer models under baseline and CFS conditions from two FOVs. The vertical dashed line indicates the epoch at which the baseline model reaches 75% of its total loss decrease between the two learning plateaus estimated using a sigmoid fit. D. Illustrations of corresponding reconstructed stimulus images on the basis of learning curves in C. E. Box plots of SSIM scores between the original and reconstructed images with baseline and CFS transformers. Within a FOV, results from contralateral eye and ipsilateral eye conditions are combined.
Next, we trained transformer models to reconstruct the grating images on the basis of corresponding neuronal responses under baseline and CFS conditions. The motivation for this part of the modeling work was the assumption that high-level tasks would be difficult to carry out if the basic stimulus features forming more complex patterns were not intact. Our transformer model contained an architecture that integrated embedding, self-attention, and unembedding modules, as well as a fully connected feedforward layer (Fig. 3B). The model inputs were the responses of all neurons within a FOV to the grating stimulus (ipsilateral and contralateral presentations of the same stimulus were modeled separately), and the model output was the reconstructed grating image. During the training process, the model typically reached two successive learning plateaus, where the validation loss temporarily stagnated (Fig. 3C). Moreover, the validation loss decreased more rapidly when training on the baseline neural response data compared to the CFS data. To compare the differences, we identified the epoch at which the validation loss of the baseline model reached 75% of its total decrease between the two plateaus using a sigmoid fit, and then we retrained both the baseline and CFS models up to this epoch.
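The stopping criterion can be made explicit under one assumed parametrization of the sigmoid fit. If the validation loss between the two plateaus is modeled as a logistic curve with midpoint t0 and rate k, the epoch at which a fraction p of the total decrease has been completed is t0 + ln(p / (1 − p)) / k. The parametrization, function names, and numerical values below are assumptions made for illustration, not the fit reported in the study.

```python
import math


def sigmoid_loss(epoch, loss_start, loss_end, midpoint, rate):
    """Validation loss falling from loss_start to loss_end along a logistic curve."""
    progress = 1.0 / (1.0 + math.exp(-rate * (epoch - midpoint)))
    return loss_start - (loss_start - loss_end) * progress


def epoch_at_fraction(midpoint, rate, fraction=0.75):
    """Epoch at which the fitted curve has completed `fraction` of its total decrease."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return midpoint + math.log(fraction / (1.0 - fraction)) / rate


# Hypothetical fitted parameters for a baseline model.
t75 = epoch_at_fraction(midpoint=120.0, rate=0.05)
print(round(t75, 1))                                      # 142.0
print(round(sigmoid_loss(t75, 1.0, 0.2, 120.0, 0.05), 3))  # 0.4 = 1.0 - 0.75 * 0.8
```

Fixing the stopping epoch from the baseline model alone means that both models receive the same training budget, so any difference in reconstruction quality reflects the input data rather than the amount of training.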
The retrained baseline models reconstructed the grating stimuli significantly better than the CFS models in Monkey A, but this discrepancy was less pronounced in Monkey B (Fig. 3D), consistent with the neuronal data that Monkey A exhibited substantially more CFS suppression than Monkey B in terms of population orientation tuning and Fisher information (Fig. 2). We used a structural similarity index (SSIM) (Brunet, Vrscay, & Wang, 2012) to quantify the reconstruction performances. Across the grating-presenting ipsilateral and contralateral eyes, the baseline models reconstructed the grating with median SSIMs of 0.52 and 0.61 for the two FOVs of Monkey A, and 0.57 and 0.63 for the two FOVs of Monkey B, respectively, while the corresponding SSIMs for the CFS models were 0.16 and 0.19 for Monkey A, and 0.55 and 0.53 for Monkey B (Fig. 3E).
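For readers unfamiliar with SSIM, the sketch below computes the index over a single global window using the standard formula with the conventional constants C1 = (0.01 L)² and C2 = (0.03 L)², where L is the dynamic range. Practical implementations usually average the index over local windows, so this global variant is a simplification and is not assumed to match the computation used in the study; the tiny test images are made up.

```python
def global_ssim(x, y, dynamic_range=1.0):
    """Single-window SSIM between two equally sized images given as flat lists."""
    if len(x) != len(y) or not x:
        raise ValueError("images must be non-empty and of equal size")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n               # variance of x
    vy = sum((b - my) ** 2 for b in y) / n               # variance of y
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))


grating = [0.0, 1.0, 0.0, 1.0]   # toy one-dimensional "grating"
flat = [0.5, 0.5, 0.5, 0.5]      # uniform gray with the same mean luminance

print(global_ssim(grating, grating))      # identical images give an SSIM of 1
print(global_ssim(grating, flat) < 0.01)  # True: the structure is lost entirely
```

A reconstruction that preserves mean luminance but loses the grating structure therefore scores near zero, which is why low SSIM values indicate poor or failed reconstruction.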
To estimate the impact of CFS-induced V1 suppression on downstream processing, we also recorded neuronal responses from two V2 FOVs in Monkey A (FOVs 3 & 4). As anticipated, V2 neurons were binocular, with over 90% of them showing ODIs within the range of -0.2 to 0.2 (Fig. 4A). Similar to V1 results from the same monkey, CFS on average reduced the amplitudes of the population orientation tuning functions by 80.05% and the slopes by 89.44% (Fig. 4B). It also reduced the Fisher information to 33.1% of the baseline value (Fig. 4C). Furthermore, we applied the same orientation classification and image reconstruction procedures to the V2 data. For orientation classification, the SVM decoders achieved near-perfect performance in classifying 12 orientations under both baseline and CFS conditions, with classification accuracies exceeding 94% across all cases (Fig. 4D). In the image reconstruction task, the baseline model outperformed the CFS model. Specifically, the baseline transformer models reconstructed the stimulus images with the median SSIM values of 0.61 and 0.53 for the two V2 FOVs, respectively, which dropped to 0.42 and 0.18 in the CFS models (Fig. 4E), implying poorer or failed reconstruction of stimulus images.

Effects of CFS on V2 orientation responses.
A. OD maps of the two V2 FOVs of Monkey A (MA3 & MA4). B. Population orientation tuning functions for all orientation-tuned neurons with baseline and CFS conditions. Solid lines represent the results of Gaussian fittings. Error bars represent ±1 SE. C. Fisher information as a function of the relative orientation (to the neuron’s preferred orientation) with baseline and CFS conditions. Shaded areas denote ±1 SE. Fisher information was lower in MA4 due to higher variations in the data. D. Multiway orientation classification accuracies under CFS vs. baseline conditions using SVM decoders. Each datum represents results from a contralateral or ipsilateral grating condition with one FOV, averaged across 5-fold cross-validations. Error bars denote 95% confidence intervals. E. Box plots of SSIM scores between the original and reconstructed images with baseline and CFS transformers. Within a FOV, results from contralateral eye and ipsilateral eye conditions are combined.
Discussion
Our study demonstrates that CFS severely compromises orientation information in V1 neurons in an ocular dominance-dependent manner. Orientation information carried by neurons preferring the masker eye or both eyes is completely or nearly completely wiped out, while information carried by those preferring the grating eye is partially retained. Downstream, orientation information in V2 neurons is also substantially weakened. Linear decoding and transformer models suggest that CFS-compromised orientation information may still allow coarse orientation discrimination, but will most likely impair orientation recognition when the suppression is sufficiently strong, as in Monkey A. Similarly strong suppression would also be possible in Monkey B if the grating contrast (0.45) were lowered to 0.1-0.3, as in many CFS experiments (Watanabe et al., 2011; Yuval-Greenberg & Heeger, 2013; Lunghi & Pooresmaeili, 2023; Alais, Coorey, Blake, & Davidson, 2024).
CFS-compromised V1 orientation information is still transmitted for downstream visual processing, which may explain the unconscious orientation processing observed in human CFS studies. The “invisible” orientation information can be processed, as demonstrated by adaptation (Kanai, Tsuchiya, & Verstraten, 2006; Bahrami, Carmel, Walsh, Rees, & Lavie, 2008) and priming (Koivisto & Grassini, 2018) studies. The adaptation aftereffect is reduced compared to the visible condition but not entirely abolished (Kanai et al., 2006; Bahrami et al., 2008), likely a result of the degraded orientation information surviving CFS. For the same reason, the priming effect also decreases during trials in which the stimulus is rendered invisible by CFS, compared to those in which the stimulus is visible or partially visible (Koivisto & Grassini, 2018), as the degraded stimulus information provides insufficient evidence for decision-making, resulting in a diminished priming effect (Dehaene, 2011; Gomez, Perea, & Ratcliff, 2013).
Furthermore, our linear decoding and transformer results can help elucidate the debate on whether visual processing still functions at the categorization level under the influence of CFS. Previous studies have provided evidence for the preserved category information of the target, as demonstrated by tool-specific priming effects (Almeida et al., 2008; Almeida et al., 2010) and differential BOLD response patterns between tools and other object categories under CFS (Hesselmann, Hebart, & Malach, 2011; Tettamanti et al., 2017). However, an intriguing question is: Do these results rather reflect low-level feature differences between tools and other object categories? It has been reported that elongated objects, irrespective of their categorical affiliation, elicit similar priming effects (Sakuraba et al., 2012). Consistent with this, when tools are categorized by their shape (elongated vs. non-elongated), only the neural response patterns elicited by elongated tools can be discriminated from other object categories under CFS (Fogelson, Kohler, Miller, Granger, & Tse, 2014; Ludwig, Kathmann, Sterzer, & Hesselmann, 2015). In line with this interpretation, Hesselmann, Darcy, Rothkirch, and Sterzer (2018) reported that tool-specific priming under CFS does not reliably emerge under conditions designed to produce strong interocular suppression, suggesting that previously observed category effects may reflect access to low-level shape features rather than preserved category representations. Moreover, a recent study measuring the contrast thresholds required to both break from and suppress CFS found that stimuli exhibited similar suppression strengths across various categories (Alais et al., 2024). According to our results, when suppression is too strong to allow for stimulus reconstruction, as in the case of Monkey A (Fig. 3C), the orientation information under CFS may not accumulate to a level sufficient for resolving semantic category boundaries. 
Resolving such boundaries might require somewhat intact stimulus information, even if subconsciously. However, the degraded orientation information could potentially assist in category discrimination when categorical differences lie in certain low-level shape dimensions like orientation, as coarse orientation discrimination appears largely unaffected by CFS suppression (Fig. 3A).
A related issue is the dorsal-ventral CFS hypothesis, which proposes that CFS suppression may disproportionately affect ventral visual processing while relatively preserving dorsal pathways involved in visuomotor functions, potentially allowing category- or action-related information to remain accessible under suppression (Fang & He, 2005). However, subsequent fMRI studies have failed to provide consistent support for this dissociation, reporting stream-invariant awareness effects (Hesselmann & Malach, 2011; Ludwig et al., 2015; Tettamanti et al., 2017), residual signal in ventral rather than dorsal regions (Hesselmann et al., 2011; Fogelson et al., 2014), or residual low-level feature information/partial visibility rather than preserved dorsal processing (Ludwig et al., 2015). Although our study does not directly test dorsal-ventral dissociations, our V1 results provide a constraint on what information downstream visual pathways could access under suppression. When CFS-induced interocular suppression is strong enough that stimulus reconstruction is markedly reduced, as in the case of Monkey A, the information required for category-level or action-related processing may not be sufficient for high-level cortical representation.
Interocular suppression under CFS is known to vary substantially across individuals (Yamashiro et al., 2013; Gayet & Stein, 2017; Blake, Goodman, Tomarken, & Kim, 2019). This inter-individual variability may contribute to the heterogeneity observed in the CFS literature. We also found that the strength of V1 response suppression during CFS differed between two monkeys, as reflected by population orientation tuning functions (Fig. 2C), Fisher information (Fig. 2F), and reconstruction performance by the transformer (Fig. 3E). Several experimental factors may have contributed to the relatively weaker suppression observed in Monkey B. Because monkeys viewed the stimuli passively, we could not determine the dominant eye for each monkey (instead we switched the eyes and averaged the results), and the target was presented at relatively high contrast. Both factors are known to reduce the effectiveness of CFS suppression (Yang, Blake, & McDonald, 2010; Yuval-Greenberg & Heeger, 2013). In addition, the random-noise masker we used might not be as effective as Mondrian patterns (Hesselmann, Darcy, Ludwig, & Sterzer, 2016). If reduced stimulus contrast and a Mondrian masker were used, we predict that CFS suppression in Monkey B would strengthen, potentially approaching the level observed in Monkey A. Nevertheless, it is worth emphasizing that our main conclusions are primarily based on data from Monkey A, who exhibited much stronger CFS suppression.
Materials and Methods
Monkey preparation
Monkey preparation was identical to procedures reported in previous studies (Ju, Guan, Tao, Tang, & Yu, 2020; Guan, Ju, Tao, Tang, & Yu, 2021; Zhang et al., 2024). Two rhesus monkeys (Macaca mulatta, aged 5 and 6, respectively) underwent two sequential surgeries under general anesthesia and strictly sterile conditions. During the first surgery, a 20-mm diameter craniotomy was performed on the skull over V1. The dura was opened and multiple tracks of 100-150 nl AAV1.hSynap.GCaMP5G.WPRE.SV40 (AV-1-PV2478, titer 2.37e13 (GC/ml), Penn Vector Core) were pressure-injected at a depth of ∼350 µm at multiple locations. The dura was then sutured, the skull cap was re-attached with three titanium lugs and six screws, and the scalp was sutured. After the surgery, the animal was returned to the cage and treated with injectable antibiotics (Ceftriaxone sodium, Youcare Pharmaceutical Group, China) for one week. Postoperative analgesia was also administered. The second surgery was performed 45 days later. A T-shaped steel frame was installed for head stabilization, and an optical window was inserted onto the cortical surface. Data collection could start as early as one week later. More details about the preparation and surgical procedures can be found in Li et al. (2017). The procedures were approved by the Institutional Animal Care and Use Committee, Peking University.
Behavioral task
After a ten-day recovery period following the second surgery, monkeys were placed in a primate chair with head restraint. They were trained to hold fixation on a small white spot (0.2°) with eye positions monitored by an Eyelink-1000 eye tracker (SR Research) at a 1000-Hz sampling rate. During the experiment, trials in which the eye position deviated by 1.5° or more from the fixation point before stimulus offset were treated as containing saccades, discarded, and repeated.
Visual stimuli and experimental procedures
Visual stimuli were generated with MATLAB-based Psychtoolbox-3 software (Pelli & Zhang, 1991) and presented on a ROG Swift PG278QR monitor (refresh rate = 120 Hz, resolution = 2560 × 1440 pixels, pixel size = 0.23 mm × 0.23 mm). The screen luminance was linearized by an 8-bit look-up table, and the mean luminance was 47 cd/m². The viewing distance was 60 cm.
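The display geometry above determines how stimulus sizes in degrees map onto screen pixels. The following sketch performs the standard visual-angle conversion using the stated pixel size (0.23 mm) and viewing distance (60 cm); the function names are hypothetical and the calculation is offered only as a convenience for readers reproducing the stimuli.

```python
import math

VIEWING_DISTANCE_MM = 600.0  # 60 cm viewing distance, from the text
PIXEL_SIZE_MM = 0.23         # pixel pitch, from the text


def visual_angle_deg(size_mm, distance_mm=VIEWING_DISTANCE_MM):
    """Visual angle subtended by an object of a given size centered on the line of sight."""
    return math.degrees(2.0 * math.atan(size_mm / (2.0 * distance_mm)))


def degrees_to_pixels(angle_deg, distance_mm=VIEWING_DISTANCE_MM, pixel_mm=PIXEL_SIZE_MM):
    """Number of pixels spanned by a stimulus of a given angular size."""
    size_mm = 2.0 * distance_mm * math.tan(math.radians(angle_deg) / 2.0)
    return size_mm / pixel_mm


print(round(visual_angle_deg(PIXEL_SIZE_MM), 4))  # degrees per pixel, about 0.022
print(round(degrees_to_pixels(1.0), 1))           # pixels per degree, about 45.5
```

At this geometry one degree of visual angle spans roughly 45.5 pixels, so sub-degree stimulus elements are only a few pixels wide.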
A drifting square-wave grating (spatial frequency = 4 cpd, contrast = full, speed = 3 cycles/sec, starting phase = 0°, size = 0.4° in diameter) was first used to determine the population receptive field (pRF) location, shape, and approximate size associated with a specific FOV. The same stimulus was also monocularly presented to confirm the V1 location as ocular dominance columns would appear. This fast process used a 4 × objective lens mounted on the two-photon microscope and did not provide cell-specific information. The recorded V1 pRFs were centered at ∼0.90° eccentricity in Monkey A and ∼1.93° in Monkey B. V2 pRFs were centered at ∼0.67° in Monkey A. All pRFs were approximately circular with a diameter of 0.9°.
The target stimulus used in the experiments was a 0.45-contrast circular-windowed square-wave grating. It drifted at 4 cycles per second in opposite directions perpendicular to the orientation with a starting phase of 0°, and varied at 12 orientations (0° to 165° in 15° increments) and two spatial frequencies (3 & 6 cpd) trial by trial. The circular envelope had a diameter of 1°, which approximated the size of pRFs for recorded FOVs, with the edge blurred by a linear ramp starting at a radius of 0.38°. The flashing masker was a circular white noise pattern with a diameter of 1.89°, a contrast of 0.5, and a flickering rate of 10 Hz. The white noise consisted of randomly generated black and white blocks (0.07° × 0.07° each). The target grating and the flashing masker were presented through a pair of NVIDIA 3D Vision 2 active shutter glasses. To mitigate the ghost image, a low contrast (RMS contrast = 0.08) white noise was added to the grating. The width of the noise element was half of the bar width of the square grating, and the white noise was regenerated every frame.
Each block of trials consisted of four groups of stimuli: binocular, monocular, CFS, and flashing masker-only. In the binocular group, the grating was presented to both eyes simultaneously. The relevant data were only used to help identify ROIs and orientation-tuned neurons along with data from other stimulus conditions. In the monocular group, the grating was monocularly presented to the contralateral or ipsilateral eye, which served as the baseline conditions without the influences of CFS. In the CFS group, the grating and flashing masker were presented dichoptically. In the flashing masker-only group, the flashing masker was presented monocularly to either eye. Each stimulus condition was repeated for 10-12 trials. For conditions involving the grating, the trials were split between the two opposite drifting directions. A block of trials contained 242 trials, two trials for each stimulus condition, with the order of stimulus conditions arranged in a pseudorandom manner. There were 5 to 6 blocks of trials for each FOV.
Each stimulus was presented for 1000 ms, followed by an inter-stimulus interval of 1500 ms, allowing sufficient time for the calcium signals to return to the baseline level (Guan, Zhang, Zhang, Tang, & Yu, 2020). For each FOV, the recording was completed in a single session with 5-6 experiment blocks and lasted for 2-3 hours.
Two-photon imaging
Two-photon imaging was performed using a FEMTOSmart two-photon microscope (Femtonics), along with a Ti:sapphire laser (Mai Tai eHP, Spectra Physics). GCaMP5 was chosen as the calcium indicator because its fluorescence signals are linearly proportional to neuronal spike activity over a wide range of firing rates from 10-150 Hz (Li et al., 2017). During imaging, a 16× objective lens (0.8 N.A., Nikon) with a resolution of 1.6 µm/pixel was used, along with a 1000 nm femtosecond laser. A fast resonant scanning mode (32 fps) was chosen to obtain continuous images of neuronal activity (8 frames per second after averaging every 4 frames). The strength of fluorescent signals (mean luminance of a small area) was monitored and adjusted if necessary to compensate for the drift of fluorescent signals. Two response fields of view (FOVs) measuring 850 × 850 µm² in V1 were selected in both macaques, and two FOVs of the same size in V2 were selected in Macaque A.
Imaging data analysis: Initial screening of ROIs
Data were analyzed with custom MATLAB code. A normalized cross-correlation-based translation algorithm was used to reduce motion artifacts (Li et al., 2017). The fluorescence changes were then associated with the corresponding visual stimuli through the time sequence information recorded by a Neural Signal Processor (Cerebus system, Blackrock Microsystems). The differential image (ΔF = F − F0) was obtained by subtracting the mean of the 4 frames before stimulus onset (F0) from the average of the 6th-9th frames after stimulus onset (F), across 5 or 6 repeated trials for the same stimulus condition (same orientation, spatial frequency, size, and drifting direction).
For a specific FOV, the regions of interest (ROIs) or possible cell bodies were decided through sequential analysis of 242 differential images in the order of CFS, monocular, binocular, and flashing masker-only conditions. CFS conditions consisted of 96 (2×2×12×2 = 96) differential images, with the grating presented to either eye (2), at two spatial frequencies (2), twelve orientations (12), and two motion directions (2). Monocular conditions were identical to the CFS conditions except that the flashing masker was absent. In the binocular conditions, gratings at two spatial frequencies (2), twelve orientations (12), and two motion directions (2) were binocularly presented, resulting in 48 differential images. The flashing masker-only conditions consisted of the flashing masker presented to either eye, resulting in 2 differential images.
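The count of 242 differential images follows directly from the factors listed above. The short check below is only an illustration of that arithmetic; the variable names are ours, not the authors'.

```python
# Count the differential images per FOV from the design factors in the text.
eyes, sfs, orientations, directions = 2, 2, 12, 2

cfs = eyes * sfs * orientations * directions   # grating eye x SF x orientation x direction
monocular = cfs                                # same design, flashing masker absent
binocular = sfs * orientations * directions    # no eye factor
masker_only = eyes                             # masker shown to either eye

total = cfs + monocular + binocular + masker_only
print(cfs, monocular, binocular, masker_only, total)  # 96 96 48 2 242
```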
The first differential image was filtered with a band-pass Gaussian filter (size = 2–10 pixels), and connected subsets of pixels (>25 pixels, which would exclude smaller vertical neuropils) with an average pixel value >3 standard deviations above the mean brightness were selected as ROIs. The areas of these ROIs were then set to the mean brightness in the next differential image before band-pass filtering and thresholding were performed. This measure gradually reduced the standard deviations of the differential images and facilitated the detection of neurons with relatively low fluorescence responses. If a new ROI overlapped with an existing ROI from a previous differential image, the new ROI was kept as a separate ROI if the overlapping area OA < 1/4 ROInew, discarded if 1/4 ROInew < OA < 3/4 ROInew, and merged with the existing ROI if OA > 3/4 ROInew. The merges helped smooth the contours of the final ROIs. This process was run through all differential images twice to select ROIs. Finally, the roundness of each ROI was calculated as:

$$\text{roundness} = \frac{\sqrt{4\pi A}}{P}$$

where A was the ROI’s area, and P was the perimeter. Only ROIs with roundness larger than 0.9, which would exclude horizontal neuropils, were selected for further analysis.
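The two ROI screening rules can be sketched as follows. This is a minimal illustration, not the authors' MATLAB code: it assumes roundness is defined as sqrt(4πA)/P (so that a perfect circle scores 1), and it assumes that overlap fractions exactly equal to 1/4 or 3/4, which the text leaves undefined, are treated as "discard". The function names are hypothetical.

```python
import math

def roundness(area: float, perimeter: float) -> float:
    """Roundness as sqrt(4*pi*A)/P (assumed form); equals 1 for a perfect circle."""
    return math.sqrt(4.0 * math.pi * area) / perimeter

def overlap_decision(overlap_area: float, new_roi_area: float) -> str:
    """Apply the 1/4 and 3/4 overlap thresholds to a newly detected ROI."""
    fraction = overlap_area / new_roi_area
    if fraction < 0.25:
        return "keep_separate"
    if fraction > 0.75:
        return "merge"
    return "discard"  # assumption: exact boundary values also fall here

r = 5.0
circle = roundness(math.pi * r ** 2, 2 * math.pi * r)  # circle of radius r
square = roundness(4.0, 8.0)                           # 2 x 2 square
print(round(circle, 3), round(square, 3))
print(overlap_decision(10, 100), overlap_decision(50, 100), overlap_decision(90, 100))
```

Under this definition a square has a roundness of about 0.886, so the 0.9 criterion rejects shapes only moderately less compact than a circle.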
Imaging data analysis: Orientation tuning and ocular dominance
The ratio of fluorescence change (ΔF/F0) was taken as a neuron’s response to a specific stimulus condition. For a given neuron and stimulus condition, the F0n of the n-th trial was the average of the 4 frames before stimulus onset (−500 to 0 ms), and Fn was the average of the 5th-8th, 6th-9th, or 7th-10th frames after stimulus onset, whichever was the greatest. F0n was then averaged across 10 or 12 repeated trials to obtain the baseline F0 for all trials (to reduce noise in the calculation of responses), and ΔFn/F0 = (Fn − F0)/F0 was taken as the neuron’s response to this stimulus at the n-th trial.
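The response calculation can be illustrated with a short sketch. It assumes each trial trace holds 4 pre-onset frames followed by at least 10 post-onset frames, and that the best of the three candidate windows is chosen separately for each trial; the text does not say whether the choice was made per trial or per condition. The names and toy traces are ours.

```python
from statistics import mean

PRE_FRAMES = 4                       # frames before stimulus onset (500 ms at 8 fps)
WINDOWS = [(5, 8), (6, 9), (7, 10)]  # candidate post-onset windows, 1-based inclusive

def trial_responses(trials):
    """Return dF_n/F0 for each trial of one neuron and one stimulus condition."""
    # Baseline F0: per-trial pre-onset means, averaged across all repeats.
    f0 = mean([mean(t[:PRE_FRAMES]) for t in trials])
    responses = []
    for t in trials:
        post = t[PRE_FRAMES:]
        # F_n: the largest of the three window averages (assumed per trial).
        fn = max(mean(post[a - 1:b]) for a, b in WINDOWS)
        responses.append((fn - f0) / f0)
    return responses

# Toy traces: a flat baseline of 100, then an early and a late response.
trace1 = [100] * 4 + [100, 100, 100, 100, 150, 150, 150, 150, 120, 100]
trace2 = [100] * 4 + [100, 100, 100, 100, 100, 120, 120, 120, 120, 100]
resp = trial_responses([trace1, trace2])
print(resp)
```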
Several steps were taken to determine whether a neuron was orientation-selective. For each monocular or binocular condition, the orientation and SF eliciting the maximal response were designated as the neuron’s preferred orientation and SF. We then compared responses across all 12 orientations at the preferred SF using a non-parametric Friedman test to determine whether the neuron’s responses differed significantly across orientations. To reduce Type I errors, the significance level was set at α = 0.01. Neurons that passed the Friedman test under at least one viewing condition were selected as orientation-tuned neurons.
The ocular dominance index (ODI) was calculated to characterize each neuron’s eye preference: ODI = (Ri − Rc)/(Ri + Rc), where Ri and Rc were the neuron’s peak responses at the best orientation and SF to ipsilateral and contralateral monocular grating conditions, respectively. Neurons with an ODI of −1 or 1 would be completely dominated by the contralateral or ipsilateral eye, respectively, and those with an ODI of 0 would be driven equally by both eyes.
Population orientation tuning
For each neuron, neural responses at the preferred SF were selected for tuning analysis. To derive population orientation tuning curves under a specific condition, we sorted neurons into twelve orientation preference bins according to their preferred orientations (bin width = 15°). For each orientation presented, the responses of all orientation preference bins were reorganized according to their orientation preference relative to the presented orientation. Subsequently, neuronal responses with the same relative orientation preference were averaged to generate the final population orientation tuning function. For CFS conditions, the selected SF and binning were the same as for the corresponding monocular conditions.
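The realignment step can be sketched as below. This is a simplified illustration under our own assumptions: each neuron's responses are shifted so that its preferred orientation sits at a relative orientation of 0°, and neurons are then averaged directly rather than first being pooled within preference bins. The function name and the toy data are hypothetical.

```python
N_ORI = 12  # orientations 0..165 deg in 15-deg steps

def population_tuning(responses, preferred):
    """responses[i][k]: neuron i's mean response to orientation index k.
    preferred[i]: index of neuron i's preferred orientation.
    Returns mean responses at relative orientations -90 deg (index 0)
    to +75 deg (index 11), with 0 deg at index 6."""
    sums = [0.0] * N_ORI
    for resp, pref in zip(responses, preferred):
        for k in range(N_ORI):
            # Orientation is 180-deg periodic, so wrap the shifted index.
            rel = (k - pref + N_ORI // 2) % N_ORI
            sums[rel] += resp[k]
    return [s / len(responses) for s in sums]

# Two toy neurons, each responding only at its own preferred orientation.
n1 = [0.0] * N_ORI; n1[2] = 1.0
n2 = [0.0] * N_ORI; n2[9] = 1.0
tuning = population_tuning([n1, n2], [2, 9])
print(tuning)
```

After alignment both toy neurons peak at index 6 (relative orientation 0°), so the population curve is 1.0 there and 0 elsewhere.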
The population orientation tuning function was fitted with a Gaussian model using MATLAB’s nonlinear least-squares function ‘lsqnonlin’:

$$R(\theta) = a\,e^{-\frac{(\theta - \theta_0)^2}{2\sigma^2}} + b$$

where R(θ) was the response at orientation θ, and the free parameters a, θ0, σ, and b were the amplitude, peak orientation, standard deviation of the Gaussian function (equal to half width at half height), and minimal response, respectively.
The population orientation tuning curves for different eye preference groups were derived using the same procedure, with additional binning of neurons according to their ODI. To obtain the tuning curve of the neurons preferring the eye seeing the grating, responses of neurons with an ODI < -0.2 (preferentially responding to the contralateral eye) under contralateral eye grating presentation and those with an ODI > 0.2 (preferentially responding to the ipsilateral eye) under ipsilateral eye grating presentation were combined. Similarly, for neurons preferring the eye seeing the masker, responses of neurons with an ODI < -0.2 under ipsilateral eye grating presentation and those with an ODI > 0.2 under contralateral eye grating presentation were combined. For binocular neurons (-0.2 < ODI < 0.2), responses under both grating presentation conditions were combined.
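The ODI definition and the eye-preference grouping can be expressed compactly. The sketch below is illustrative only: the function names and string labels are ours, responses are assumed to be positive, and an ODI exactly equal to ±0.2 (a case the text does not specify) is treated as monocular.

```python
def odi(r_ipsi: float, r_contra: float) -> float:
    """Ocular dominance index: -1 = contralateral only, +1 = ipsilateral only."""
    return (r_ipsi - r_contra) / (r_ipsi + r_contra)

def eye_group(odi_value: float, grating_eye: str) -> str:
    """Group a neuron for one condition; grating_eye is 'contra' or 'ipsi'."""
    if -0.2 < odi_value < 0.2:
        return "binocular"
    # Assumption: ODI exactly at +/-0.2 is handled as monocular here.
    preferred = "contra" if odi_value < 0 else "ipsi"
    return "grating_eye" if preferred == grating_eye else "masker_eye"

example = odi(r_ipsi=0.2, r_contra=0.8)   # contralateral-dominated neuron
print(round(example, 2), eye_group(example, "contra"), eye_group(example, "ipsi"))
```

The same neuron therefore contributes to the "grating eye" curve in one CFS condition and to the "masker eye" curve in the other.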
Fisher information
The Fisher information quantifies the amount of information carried by a neuronal population that an optimal decoder could extract (Pouget, Deneve, Ducom, & Latham, 1999). Assuming independent Gaussian noise, the Fisher information for a population of N neurons was given as

$$I_F(\theta) = \sum_{i=1}^{N} \frac{f_i'(\theta)^2}{\sigma_i^2(\theta)}$$

where 𝑓i(𝜃) was the mean activity of neuron i in response to the presentation angle θ, 𝑓i′(𝜃) was its derivative with respect to θ, and 𝜎i²(𝜃) was the variance of its response. We fitted each neuron’s response tuning 𝑓i(𝜃) and variance tuning 𝜎i²(𝜃) with Gaussian functions and calculated the averaged Fisher information across neurons at each orientation.
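As an illustration, the sketch below evaluates the Fisher information for neurons whose mean and variance tuning are both Gaussian in orientation. It assumes the independent-Gaussian-noise form in which each neuron contributes its squared tuning slope divided by its response variance, and it assumes a Gaussian tuning function with amplitude, peak, width, and offset parameters. All parameter values are made up.

```python
import math

def wrapped_diff(theta, theta0):
    """Orientation difference wrapped to [-90, 90) degrees."""
    return (theta - theta0 + 90.0) % 180.0 - 90.0

def gauss(theta, a, theta0, sigma, b):
    """Gaussian tuning curve on a 180-deg periodic orientation axis."""
    d = wrapped_diff(theta, theta0)
    return a * math.exp(-d * d / (2.0 * sigma ** 2)) + b

def gauss_slope(theta, a, theta0, sigma):
    """Analytic derivative of the Gaussian tuning curve with respect to theta."""
    d = wrapped_diff(theta, theta0)
    return -a * d / sigma ** 2 * math.exp(-d * d / (2.0 * sigma ** 2))

def fisher_information(theta, mean_params, var_params):
    """Sum over neurons of f_i'(theta)^2 / sigma_i^2(theta)."""
    total = 0.0
    for (a, t0, s, b), vp in zip(mean_params, var_params):
        total += gauss_slope(theta, a, t0, s) ** 2 / gauss(theta, *vp)
    return total

mean_params = [(1.0, 90.0, 20.0, 0.1)]     # illustrative mean tuning (a, theta0, sigma, b)
var_params = [(0.05, 90.0, 20.0, 0.05)]    # illustrative variance tuning
fi_peak = fisher_information(90.0, mean_params, var_params)
fi_flank = fisher_information(70.0, mean_params, var_params)
print(fi_peak, fi_flank)
```

A neuron carries no Fisher information at its own preferred orientation, where the tuning slope is zero, and the most on the flanks of its tuning curve.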
SVM-based orientation classification
In the orientation classification task (Fig. 3A), we trained a support vector machine (SVM) with a one-vs.-one coding scheme to classify orientations from standardized population neural activity. The SVM decoder was implemented using MATLAB’s ‘fitcsvm’ function with a linear kernel. To prevent overfitting and evaluate the generalization ability of the model, we employed a 5-fold cross-validation procedure, and the model performance on the validation dataset was reported.
Decoders were trained independently for each experimental condition, resulting in 4 models per FOV (contralateral/ipsilateral × baseline/CFS). Neural response data from the two spatial frequencies were used as input, with each neuron treated as a feature. In this way, each model was trained and tested on 288 or 240 samples (2 SFs × 12 orientations × 12/10 repeats).
The transformer model
Model input, output, and training procedure
We implemented a transformer-based model to reconstruct grating stimuli from population neuronal responses recorded under different experiment conditions. The model input was a vector of neuronal responses, each corresponding to an individual neuron, and the output was the reconstructed grating image of size 70 × 70 pixels.
The transformer was trained independently for each experimental condition, resulting in four models per FOV (contralateral/ipsilateral × baseline/CFS). Pilot experiments revealed that our original dataset was insufficient for the model to converge. To address this, we augmented the dataset to four times its original size before training. Augmentation was performed by sampling from a normal distribution centered at each neuron’s response mean, with a standard deviation equal to its original standard deviation. Within the augmented dataset, 6% was reserved for validation. Responses were normalized to [0, 1] before being fed into the model.
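The augmentation and normalization steps might look like the following sketch. Several details are our assumptions because the text does not specify them: means and standard deviations are computed per neuron within a single stimulus condition, the augmented set consists entirely of newly sampled vectors, and normalization uses the global minimum and maximum. Function names are hypothetical.

```python
import random
from statistics import mean, stdev

def augment(trials, factor=4, seed=0):
    """trials: response vectors (one per repeat) for ONE stimulus condition.
    Returns factor * len(trials) synthetic vectors, each neuron drawn from a
    normal distribution with that neuron's own mean and standard deviation."""
    rng = random.Random(seed)
    n_neurons = len(trials[0])
    mu = [mean([t[i] for t in trials]) for i in range(n_neurons)]
    sd = [stdev([t[i] for t in trials]) for i in range(n_neurons)]
    return [[rng.gauss(mu[i], sd[i]) for i in range(n_neurons)]
            for _ in range(factor * len(trials))]

def minmax_normalize(vectors):
    """Rescale all responses to [0, 1] using the global min and max (assumed)."""
    lo = min(min(v) for v in vectors)
    hi = max(max(v) for v in vectors)
    return [[(x - lo) / (hi - lo) for x in v] for v in vectors]

trials = [[0.1, 0.5], [0.2, 0.7], [0.3, 0.6]]   # 3 repeats x 2 neurons, toy values
augmented = augment(trials)
normalized = minmax_normalize(augmented)
print(len(augmented), min(map(min, normalized)), max(map(max, normalized)))
```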
We implemented a two-phase training procedure to assess the reconstruction ability of models trained on different neural data. During the training process, the model typically reached two learning plateaus, where the validation loss temporarily stagnated (Fig. 3C). In the first training session, we analyzed the learning curve to determine the epoch at which the baseline model’s validation loss had completed 75% of its total decrease between the two plateaus. This was estimated using a modified sigmoid fit:

where A and C defined the function range, b was the symmetry point, k was the steepness parameter, and t represented the epoch number, counted from epoch 500 (initial epochs were discarded due to a drastic drop in validation loss across all training runs, see Fig. 3C). The 75% decrease point was computed as:

In the second training session, we retrained both models up to the identified epoch and evaluated their performances.
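To make the 75% criterion concrete, the sketch below uses one plausible sigmoid, L(t) = C + (A − C)/(1 + exp(k(t − b))), which falls from an upper plateau A to a lower plateau C with its midpoint at t = b. This specific form is our assumption and may differ from the authors' modified sigmoid. Under it, a fraction p of the decrease is complete at t = b + ln(p/(1 − p))/k, that is, b + ln(3)/k for p = 0.75. Parameter values are illustrative, and t is counted from epoch 500 as in the text.

```python
import math

def sigmoid_loss(t, A, C, b, k):
    """Assumed form: validation loss falls from plateau A to plateau C, midpoint at b."""
    return C + (A - C) / (1.0 + math.exp(k * (t - b)))

def epoch_at_fraction(b, k, fraction=0.75):
    """Epoch at which the given fraction of the A -> C decrease is complete."""
    return b + math.log(fraction / (1.0 - fraction)) / k

A, C, b, k = 0.08, 0.02, 1200.0, 0.01   # illustrative values only
t75 = epoch_at_fraction(b, k)
done = (A - sigmoid_loss(t75, A, C, b, k)) / (A - C)
print(round(t75, 1), round(done, 3))
```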
The model was trained to minimize the mean squared error (MSE) between the reconstructed and actual stimuli. Optimization was performed using RMSprop with a learning rate of 0.00005 and a smoothing factor ρ = 0.85.
Model structure
Each neuron’s response was embedded into a higher-dimensional space using a learned weight vector as follows:

$$R_e = R \odot W_e$$

where R (n × 1) represented the original response vector from n neurons, and W_e (n × dmodel) was the embedding weight matrix, with each row corresponding to a neuron-specific weight vector. Here we used dmodel = 2. The symbol ⊙ denoted row-wise multiplication, such that the i-th response r_i was multiplied by both elements of its embedding weight vector w_i. The resulting embedding matrix R_e (n × dmodel) contained the high-dimensional representations of the neuronal responses.
The enriched embedding matrix was then passed through a self-attention module. In this module, R_e was first projected into queries (Q), keys (K), and values (V) through independent learnable weight matrices. The attention map A was then computed as:

$$A = \mathrm{softmax}\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)$$

where d_k represented the dimensionality of the key vectors, which scaled the dot-product to control the variance of the attention scores.
The output of self-attention was calculated as:

$$\mathrm{Output} = AV$$

The output from self-attention was unembedded by projecting each neuron’s high-dimensional representation back to a single dimension. A feedforward layer transformed the unembedded vector into a stimulus vector, which was then reshaped into the final 70 × 70 image.
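The embedding and self-attention steps can be traced with a small dependency-free sketch. It assumes a single attention head, standard scaled dot-product attention with a row-wise softmax, no bias terms, and query/key/value projections of size dmodel × dmodel; none of these details are confirmed by the text. The weight values are arbitrary placeholders rather than learned parameters.

```python
import math

def matmul(X, Y):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def softmax_rows(M):
    """Row-wise softmax, subtracting each row's max for numerical stability."""
    out = []
    for row in M:
        m = max(row)
        e = [math.exp(v - m) for v in row]
        s = sum(e)
        out.append([v / s for v in e])
    return out

def embed(responses, W_e):
    """Row-wise product: response r_i scales its own weight vector w_i."""
    return [[r * w for w in w_row] for r, w_row in zip(responses, W_e)]

def self_attention(R_e, W_q, W_k, W_v):
    Q, K, V = matmul(R_e, W_q), matmul(R_e, W_k), matmul(R_e, W_v)
    d_k = len(K[0])
    K_T = [list(col) for col in zip(*K)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(Q, K_T)]
    A = softmax_rows(scores)        # n x n attention map
    return A, matmul(A, V)          # output is n x d_model

responses = [0.2, 0.9, 0.5]                        # 3 toy neurons
W_e = [[1.0, 0.5], [0.3, -0.2], [0.8, 0.1]]        # placeholder embedding weights
identity = [[1.0, 0.0], [0.0, 1.0]]                # placeholder projections
attn, out = self_attention(embed(responses, W_e), identity, identity, identity)
print(len(attn), len(attn[0]), len(out), len(out[0]))  # 3 3 3 2
```

Each row of the attention map sums to 1, so every neuron's output is a weighted average of the value vectors of all neurons.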
Model evaluation
The original, non-augmented data, which the model had already seen during training (in both the training and validation sets), were used for evaluation. We used the structural similarity index (SSIM) (Brunet et al., 2012) to quantify reconstruction performance.
The SSIM (Brunet et al., 2012) between two images x and y (both 70 × 70) is defined as:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}$$

where 𝜇x and 𝜇y are the mean intensities, 𝜎x² and 𝜎y² are the variances, 𝜎xy is the covariance, and 𝑐1 and 𝑐2 are constants for numerical stability.
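A single-window version of the SSIM can be computed as in the sketch below. It assumes the statistics are taken over the whole image rather than over local sliding windows, and it uses the conventional constants c1 = (0.01 L)² and c2 = (0.03 L)² for an intensity range L; the text states neither choice, so both are assumptions.

```python
def ssim_global(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Global (single-window) SSIM between two equally sized images,
    each given as a flat list of pixel intensities."""
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1, c2 = (k1 * data_range) ** 2, (k2 * data_range) ** 2   # assumed constants
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

# Toy 70 x 70 square-wave image and its contrast-reversed copy.
image = [float((col // 5) % 2) for _ in range(70) for col in range(70)]
inverted = [1.0 - v for v in image]
same = ssim_global(image, image)
opposite = ssim_global(image, inverted)
print(round(same, 3), round(opposite, 3))
```

Identical images give an SSIM of 1, whereas a contrast-reversed grating gives a strongly negative value because the covariance term changes sign.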
Data availability
The code is available on GitHub: https://github.com/caviaryusi/CFS/blob/main/README.md. The data are available on Hugging Face (RRID:SCR_020958): https://huggingface.co/datasets/chencaixia/CFS_2p.
Acknowledgements
This study was supported by a Natural Science Foundation of China STI2030-Major Projects grant (2022ZD0204600) to SMT and CY.
Additional information
Funding
MOST | National Natural Science Foundation of China (NSFC) (2022ZD0204600)
Cong Yu
References
- 1. High-Level Face Adaptation Without Awareness. Psychological Science 21:205–210
- 2. A new ‘CFS tracking’ paradigm reveals uniform suppression depth regardless of target complexity or salience. eLife 12:RP91019. https://doi.org/10.7554/eLife.91019
- 3. Reducing multiclass to binary: A unifying approach for margin classifiers. Journal of machine learning research 1:113–141
- 4. The Role of the Dorsal Visual Processing Stream in Tool Identification. Psychological Science 21:772–778
- 5. Unconscious processing dissociates along categorical lines. Proceedings of the National Academy of Sciences of the United States of America 105:15214–15218
- 6. Effects of Noise Correlations on Information Encoding and Decoding. Journal of Neurophysiology 95:3633–3644
- 7. Unconscious orientation processing depends on perceptual load. Journal of Vision 8:12–12
- 8. Individual differences in continuous flash suppression: Potency and linkages to binocular rivalry dynamics. Vision Research 160:10–23
- 9. On the Mathematical Properties of the Structural Similarity Index. IEEE Transactions on Image Processing 21:1488–1499
- 10. Representation of concurrent stimuli by population activity in visual cortex. Neuron 64:931–942
- 11. Conscious and Nonconscious Processes: Distinct Forms of Evidence Accumulation? In: Rivasseau V.
- 12. Cortical responses to invisible objects in the human dorsal and ventral pathways. Nature Neuroscience 8:1380–1385
- 13. Unconscious neural processing differs with method used to render stimuli invisible. Front Psychol 5
- 14. Between-Subject Variability in the Breaking Continuous Flash Suppression Paradigm: Potential Causes, Consequences, and Solutions. Frontiers in Psychology 8:2017
- 15. A diffusion model account of masked versus unmasked priming: Are they qualitatively different? Journal of Experimental Psychology: Human Perception and Performance 39:1731–1740
- 16. Faces and awareness: low-level, not emotional factors determine perceptual dominance. Emotion 13:537–544
- 17. Functional organization of spatial frequency tuning in macaque V1 revealed with two-photon calcium imaging. Progress in Neurobiology 205:102–120
- 18. Plaid detectors in macaque V1 revealed by two-photon calcium imaging. Current Biology 30:934–940
- 19. Priming in a shape task but not in a category task under continuous flash suppression. J Vis 16:17
- 20. Investigating masked priming along the “vision-for-perception” and “vision-for-action” dimensions of unconscious processing. Journal of Experimental Psychology: General 147:1641
- 21. Differential BOLD activity associated with subjective and objective reports during “blindsight” in normal observers. Journal of Neuroscience 31:12936–12944
- 22. The link between fMRI-BOLD activation and perceptual awareness is stream-invariant in the human visual system. Cerebral Cortex 21:2829–2837
- 23. Intrinsic Variability of Ocular Dominance Column Periodicity in Normal Macaque Monkeys. The Journal of Neuroscience 16:7228
- 24. Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. The Journal of Physiology 160:106–154
- 25. Processing of Invisible Stimuli: Advantage of Upright Faces and Recognizable Words in Overcoming Interocular Suppression. Psychological Science 18:349–355
- 26. Orientation Tuning and End-stopping in Macaque V1 Studied with Two-photon Calcium Imaging. Cerebral Cortex 31:2085–2097
- 27. The Scope and Limits of Top-Down Attention in Unconscious Visual Processing. Current Biology 16:2332–2336
- 28. Binocular interaction on monocularly discharged lateral geniculate and striate neurons in the cat. J Neurophysiol 46:932–951
- 29. Unconscious response priming during continuous flash suppression. PLOS One 13:e0192201
- 30. Long-term two-photon imaging in awake macaque monkey. Neuron 93:1049–1057
- 31. Ocular dominance columns in New World monkeys. The Journal of Neuroscience 16:2086
- 32. Investigating category- and shape-selective neural processing in ventral and dorsal visual stream under interocular suppression. Human Brain Mapping 36:137–149
- 33. Learned value modulates the access to visual awareness during continuous flash suppression. Scientific Reports 13:756
- 34. A role for ocular dominance in binocular integration. Current Biology 33:3884–3895
- 35. Scene Integration Without Awareness: No Conclusive Evidence for Processing Scene Congruency During Continuous Flash Suppression. Psychol Sci 27:945–956
- 36. A critical reexamination of doing arithmetic nonconsciously. Psychon Bull Rev 25:472–481
- 37. Continuous Flash Suppression: Stimulus Fractionation rather than Integration. Trends Cogn Sci 21:719–721
- 38. Integration without awareness: expanding the limits of unconscious processing. Psychol Sci 22:764–770
- 39. Accurate control of contrast on microcomputer displays. Vision Research 31:1337–1350
- 40. Narrow Versus Wide Tuning Curves: What’s Best for a Population Code? Neural Computation 11:85–90
- 41. Continuous flash suppression: Known and unknowns. Psychonomic Bulletin & Review 27:1071–1103
- 42. Does the human dorsal stream really process a category for tools? Journal of Neuroscience 32:3949–3953
- 43. Reading and doing arithmetic nonconsciously. Proc Natl Acad Sci U S A 109:19614–19619
- 44. Prioritization of emotional faces is not driven by emotional content. Sci Rep 13:549
- 45. Unaware processing of tools in the neural system for object-directed action representation. Journal of Neuroscience 37:10712–10724
- 46. Continuous flash suppression reduces negative afterimages. Nature Neuroscience 8:1096–1101
- 47. Depth of interocular suppression associated with continuous flash suppression, flash suppression, and binocular rivalry. Journal of Vision 6:6–6
- 48. Attention But Not Awareness Modulates the BOLD Signal in the Human V1 During Binocular Suppression. Science 334:829–831
- 49. Activity in early visual areas predicts interindividual differences in binocular rivalry dynamics. Journal of Neurophysiology 111:1190–1202
- 50. A New Interocular Suppression Technique for Measuring Sensory Eye Dominance. Investigative Ophthalmology & Visual Science 51:588–593
- 51. On the use of continuous flash suppression for the study of visual processing outside of awareness. Frontiers in Psychology 5
- 52. Continuous Flash Suppression Modulates Cortical Activity in Early Visual Cortex. The Journal of Neuroscience 33:9635
- 53. Suppressed semantic information accelerates analytic problem solving. Psychon Bull Rev 20:581–585
- 54. Ocular dominance-dependent binocular combination of monocular neuronal responses in macaque V1. eLife 13:RP92839. https://doi.org/10.7554/eLife.92839
Cite all versions
You can cite all versions using the DOI https://doi.org/10.7554/eLife.107518. This DOI represents all versions, and will always resolve to the latest one.
Copyright
© 2025, Chen et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.