Different rules for binocular combination of luminance flicker in cortical and subcortical pathways

  1. Federico G Segala (corresponding author)
  2. Aurelio Bruno
  3. Joel T Martin
  4. Myat T Aung
  5. Alex R Wade
  6. Daniel H Baker
  1. Department of Psychology, University of York, United Kingdom
  2. School of Psychology and Vision Sciences, University of Leicester, United Kingdom
  3. York Biomedical Research Institute, University of York, United Kingdom

Abstract

How does the human brain combine information across the eyes? It has been known for many years that cortical normalization mechanisms implement ‘ocularity invariance’: equalizing neural responses to spatial patterns presented either monocularly or binocularly. Here, we used a novel combination of electrophysiology, psychophysics, pupillometry, and computational modeling to ask whether this invariance also holds for flickering luminance stimuli with no spatial contrast. We find dramatic violations of ocularity invariance for these stimuli, both in the cortex and also in the subcortical pathways that govern pupil diameter. Specifically, we find substantial binocular facilitation in both pathways with the effect being strongest in the cortex. Near-linear binocular additivity (instead of ocularity invariance) was also found using a perceptual luminance matching task. Ocularity invariance is, therefore, not a ubiquitous feature of visual processing, and the brain appears to repurpose a generic normalization algorithm for different visual functions by adjusting the amount of interocular suppression.

eLife assessment

This study provides potentially important, new insights about the combination of information from the two eyes in humans. The data includes frequency tagging of each eye's inputs and measures reflecting both cortical (EEG) and sub-cortical processes (pupillometry). The strength of supporting evidence is solid, suggesting that temporal modulations are combined differently than spatial modulations, with additional differences between subcortical and cortical pathways. However, questions remain as to exactly how information is combined, how the findings relate to the extant literature and more broadly, to the interests of vision scientists at large.

https://doi.org/10.7554/eLife.87048.3.sa0

Introduction

The brain must combine information across multiple sensory inputs to derive a coherent percept of the external world. This involves a process of signal combination both within (Baker and Wade, 2017) and between (Ernst and Banks, 2002) the senses. Binocular vision is a useful test case for signal combination, as the inputs to the two eyes overlap substantially (in species with forward-facing eyes), and the neural locus is well-established (Hubel and Wiesel, 1962). Much of our knowledge about binocular combination derives from studies on the contrast response of the ‘canonical’ visual pathway, in which signals pass from the eyes to the primary visual cortex (V1), via the lateral geniculate nucleus (LGN) (Purves et al., 2008). However, signals are also combined across the eyes in the network of subcortical nuclei that govern pupil diameter in response to absolute light levels (McDougal and Gamlin, 2008), and much less is known about the computations that operate in these subcortical pathways. Our primary purpose here is to investigate the computations governing signal combinations in these two anatomically distinct pathways in response to luminance changes.

For pattern vision, binocular presentation confers greater sensitivity to low-contrast targets than monocular presentation. This is known as binocular summation, with summation ratios (the relative improvement under binocular presentation) at detection threshold lying between √2 and 2 (Baker et al., 2018; Campbell and Green, 1965). This advantage is lost at high stimulus contrasts, where both psychophysical performance (contrast discrimination thresholds) (Legge, 1984; Meese et al., 2006) and neural activity (Baker and Wade, 2017; Moradi and Heeger, 2009) are approximately equal for monocular and binocular presentation. Contemporary models of binocular vision (Ding and Sperling, 2006; Meese et al., 2006) advocate a process of interocular suppression that normalizes the two eyes’ inputs at high contrasts and negates the binocular advantage. This is consistent with our everyday experience of ‘ocularity invariance’ (Baker et al., 2007): perceived contrast does not change when one eye is opened and closed.

The pupillary light reflex is an automatic constriction of the iris sphincter muscles in response to increases in light levels, which causes the pupil to shrink (McDougal and Gamlin, 2008). There is a clear binocular component to this reflex, as stimulation of one eye still causes constriction of the other eye’s pupil (termed the consensual response; Wyatt and Musselman, 1981). Importantly, the neuroanatomical pathway involved completely bypasses the canonical cortical pathway (retina to V1), instead involving a network of subcortical nuclei, including the Pretectal Olivary nucleus, Superior Cervical ganglion, and Edinger-Westphal nucleus (Angée et al., 2021; Mathôt, 2018; McDougal and Gamlin, 2008; Wang and Munoz, 2015). To account for the consensual response, these brain regions must combine information from the left and right eyes (ten Doesschate and Alpern, 1967), yet the computation that achieves this is unclear. The pupil response can be modulated by periodic changes in luminance, and is temporally low-pass (Barrionuevo et al., 2014; Spitschan et al., 2014), most likely due to the mechanical limitations of the iris sphincter and dilator muscles (Privitera and Stark, 2006).

To investigate the binocular combination of light, we designed an experiment that allowed us to simultaneously record electrophysiological and pupillometric responses to monocular and binocular stimuli. We chose a primary flicker frequency of 2 Hz as a compromise between the low-pass pupil response (see Barrionuevo et al., 2014; Spitschan et al., 2014), and the relatively higher-pass EEG response (Regan, 1966). This novel paradigm allowed us to probe both cortical (using EEG) and subcortical (using a binocular eye tracker) pathways simultaneously in response to flickering light, and make quantitative comparisons between them. Periodic flicker entrains both neural (Norcia et al., 2015) and pupil (Spitschan et al., 2014) responses at the flicker frequency, enabling precise estimation of response amplitudes in the Fourier domain. Relative to the response to a monocular signal, adding a signal in the other eye can either increase the response (facilitation) or reduce it (suppression). We followed up our main experiment with an additional exploration of the effect of stimulus frequency, and a psychophysical matching experiment measuring perceived flicker intensity (i.e. temporal contrast). The results are interpreted using a hierarchical Bayesian computational model of binocular vision, and reveal that subcortical pathways implement stronger interocular suppression than the canonical cortical pathway.

Results

Experiment 1

The pupillometry results are summarized in Figure 1. The group average waveform for binocular presentation is shown in Figure 1a. There is a substantial pupil constriction at stimulus onset, followed by visible oscillations at the flicker frequency (2 Hz, see waveform at foot). The average Fourier spectrum is displayed in Figure 1b, and shows a clear spike at 2 Hz, but no evidence of a second harmonic response at 4 Hz (though see Appendix 1). These results demonstrate that our paradigm can evoke measurable steady-state pupil responses at 2 Hz.

Summary of pupillometry results for N=30 participants.

Panel (a) shows a group average waveform for binocular presentation (low pass filtered at 5 Hz), with the driving signal plotted at the foot. Negative values indicate constriction relative to baseline, and positive values indicate dilation. Panel (b) shows the average Fourier spectrum (absolute amplitude values). Panels (c, d) show contrast response functions for pupil diameter at 2 Hz for different conditions (illustrated in Figure 8). Panel (e) shows contrast response functions at 1.6 Hz for three conditions. Shaded regions and error bars indicate bootstrapped standard errors.
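As a rough illustration of how a steady-state amplitude can be read off in the Fourier domain, the sketch below computes a single-frequency Fourier coefficient for a synthetic trace. The function name `fourier_amplitude`, the 4 mm baseline, and the 0.05 mm oscillation are hypothetical values chosen for illustration; only the 2 Hz flicker, 12 s stimulus duration, and 120 Hz sampling rate come from the text. This is not the authors' analysis pipeline.

```python
import cmath
import math


def fourier_amplitude(signal, sample_rate_hz, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT)."""
    n = len(signal)
    coeff = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
        for k, x in enumerate(signal)
    )
    # Factor of 2 converts the one-sided coefficient to a sinusoid amplitude.
    return 2 * abs(coeff) / n


# Hypothetical pupil trace: 12 s sampled at 120 Hz, oscillating by 0.05 mm
# at 2 Hz around an assumed 4 mm baseline diameter.
sample_rate = 120
duration_s = 12
trace = [
    4.0 + 0.05 * math.sin(2 * math.pi * 2.0 * k / sample_rate)
    for k in range(sample_rate * duration_s)
]

amp_2hz = fourier_amplitude(trace, sample_rate, 2.0)  # first harmonic
amp_4hz = fourier_amplitude(trace, sample_rate, 4.0)  # second harmonic
print(amp_2hz, amp_4hz)
```

Because the 12 s window contains a whole number of 2 Hz cycles, the component at the flicker frequency is recovered without spectral leakage, and the synthetic trace (which has no 4 Hz component) yields essentially zero at the second harmonic.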

Figure 1c shows contrast response functions driven by stimuli flickering only at 2 Hz. Response amplitudes increased monotonically with target contrast, confirming that our paradigm is suitable for measuring contrast-dependent differences in the pupil response (to our knowledge this is the first time this has been demonstrated). The amplitude of the binocular condition (blue squares) is consistently greater than that of the monocular condition (red circles) across all target contrasts. A 2×5 repeated measures ANOVAcirc2 (Baker, 2021) comparing these conditions revealed a significant main effect of target contrast (F(8, 580) = 16.79, p<0.001), a significant effect of condition (F(2, 580) = 11.04, p<0.001), and a significant interaction (F(8, 580) = 56.25, p<0.001). The dichoptic condition begins at a much higher amplitude, owing to the binocular combination of the target and high (48%) contrast mask, and then increases slightly with increasing target contrast (main effect of target contrast: F(8, 232) = 3.03, p<0.003).

In Figure 1d, we plot responses to monocular target stimuli flickering at 2 Hz, when the other eye viewed stimuli flickering at 1.6 Hz (the red monocular-only data are replotted from Figure 1c for comparison). When the 1.6 Hz component had the same contrast as the target (the binocular cross condition, shown in purple) responses were facilitated slightly at low contrasts, and suppressed at the highest target contrasts (interaction between contrast and condition: F(8, 580) = 52.94, p<0.001). When the 1.6 Hz component had a fixed contrast of 48% (the dichoptic cross condition, shown in yellow), responses were suppressed slightly across the contrast range (interaction between contrast and condition: F(8, 580) = 62.05, p<0.001).

Figure 1e shows responses at 1.6 Hz, for the same conditions, as well as for a condition in which a monocular stimulus flickered at 1.6 Hz (gray circles). Here, we find strong suppression in both the binocular cross (purple triangles) and dichoptic cross (yellow triangles) conditions. In the binocular cross condition, the amplitudes are reduced relative to the monocular condition (gray circles) (interaction effect: F(8, 580) = 41.23, p<0.001). In the dichoptic cross condition, increasing the 2 Hz target contrast suppresses the response to the 1.6 Hz mask, and the function decreases (see e.g. Busse et al., 2009) (main effect of target contrast F(8, 232) = 17, p<0.001).

Figure 2 shows equivalent results, measured contemporaneously using EEG. Figure 2a shows the group average waveform for binocular presentation, and Figure 2b shows the Fourier spectrum for binocular presentation, both averaged across four posterior electrodes (Oz, POz, O1, and O2, marked on the inset scalp plots). Unlike for the pupillometry data, there are clear responses at both the first harmonic frequency (2 Hz), and also the second harmonic frequency (4 Hz). We therefore calculated contrast response functions at both first and second harmonic frequencies.

Summary of EEG results for N=30 participants.

Panel (a) shows a group average waveform for binocular presentation (low pass filtered at 5 Hz), with the driving signal plotted at the foot. Panel (b) shows the average Fourier spectrum, and inset scalp distributions. Black dots on the scalp plots indicate electrodes Oz, POz, O1, and O2. Panels (c, d) show contrast response functions at 2 Hz for different conditions. Panel (e) shows contrast response functions at 1.6 Hz for three conditions. Panels (f–h) are in the same format but for the second harmonic response. Shaded regions and error bars indicate bootstrapped standard errors.

When stimuli in both eyes flicker at 2 Hz, the binocular responses at the first (Figure 2c) and second (Figure 2f) harmonics are substantially greater than the monocular responses, particularly at high contrasts. Analysis of variance on the complex values (ANOVAcirc2) revealed a main effect of contrast (F(8, 580) = 4.38, p<0.001) and an interaction effect (F(8, 580) = 61.58, p<0.001), but no effect of condition (p=0.13) at the first harmonic, with a similar pattern of results obtained at the second harmonic. For the cross-frequency conditions (Figure 2d and g), there was no appreciable effect of adding a 1.6 Hz component on the response at 2 Hz or 4 Hz (no effect of condition, and no interaction). Similarly, there were no clear interocular interactions between frequencies in the responses at 1.6 Hz (Figure 2e) and 3.2 Hz (Figure 2h). This pattern of results suggests that the processing of temporal luminance modulations happens in a more linear way in the visual cortex (indexed by EEG), compared with subcortical pathways (indexed by pupillometry), and shows no evidence of interocular suppression.

Finally, we calculated the ratio of binocular to monocular responses across the three data types from Experiment 1. Figure 3 shows that these ratios are approximately √2 across the low-to-intermediate contrast range for all three data types. At higher contrasts, we see ratios of 2 or higher for the EEG data, but much weaker ratios near 1 for the pupillometry data. Note that the ratios here are calculated on a per-participant basis and then averaged, rather than being the ratios of the average values shown in Figures 1 and 2. A 3×5 repeated measures ANOVA on the logarithmic (dB) ratios found a main effect of contrast (F(3.08, 89.28)=4.53, p<0.002), no effect of data modality (F(2, 58) = 0.75, p=0.48), but a highly significant interaction (F(5.54, 160.67)=3.84, p<0.001). All of the key results from Experiment 1 were subsequently replicated for peripheral stimulation (see Appendix 1).

Ratio of binocular to monocular response for three data types.

These were calculated by dividing the binocular response by the monocular response at each contrast level, using the data underlying Figure 1c and Figure 2c, f. Each value is the average ratio across N=30 participants, and error bars indicate bootstrapped standard errors.
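The distinction between averaging per-participant ratios and taking the ratio of group averages can be sketched as follows. The amplitudes are made-up numbers for three hypothetical participants, and the conversion to dB assumes the common amplitude convention of 20·log10(ratio), which the text does not spell out. The bootstrap helper is a generic resampling sketch, not the authors' exact procedure.

```python
import math
import random
import statistics


def mean_of_ratios(binocular, monocular):
    """Average of per-participant binocular/monocular ratios."""
    return statistics.mean(b / m for b, m in zip(binocular, monocular))


def ratio_of_means(binocular, monocular):
    """Ratio of the group-average binocular and monocular amplitudes."""
    return statistics.mean(binocular) / statistics.mean(monocular)


def to_db(ratio):
    """Amplitude ratio in decibels (assumed convention: 20 * log10)."""
    return 20 * math.log10(ratio)


def bootstrap_se(values, n_resamples=1000, seed=0):
    """Standard error of the mean, estimated by resampling with replacement."""
    rng = random.Random(seed)
    means = [
        statistics.mean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    ]
    return statistics.stdev(means)


# Hypothetical response amplitudes for three participants (arbitrary units).
monocular = [1.0, 2.0, 4.0]
binocular = [2.0, 3.0, 4.0]

per_participant = [b / m for b, m in zip(binocular, monocular)]
print(mean_of_ratios(binocular, monocular))  # average of 2.0, 1.5, 1.0
print(ratio_of_means(binocular, monocular))  # 9 / 7
print(to_db(math.sqrt(2)), to_db(2))         # the two reference ratios in dB
print(bootstrap_se(per_participant))
```

The two summaries differ whenever participants have different overall amplitudes, which is why the note about per-participant calculation matters when comparing Figure 3 with Figures 1 and 2.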

Experiment 2

The strong binocular facilitation and weak interocular suppression in the EEG data from Experiment 1 were very different from previous findings on binocular combination using steady-state EEG with grating stimuli (Baker and Wade, 2017). One possible explanation is that the lower temporal frequency used here (2 Hz, vs 5 or 7 Hz in previous work) might be responsible for this difference. We, therefore, ran a second experiment to compare monocular and binocular responses at a range of temporal frequencies. Only EEG data were collected for this experiment, as the pupil response is substantially weaker above around 2 Hz (Barrionuevo et al., 2014; Spitschan et al., 2014); note that we originally chose 2 Hz because it produces measurable signals for both EEG and pupillometry, yet is unfortunately optimal for neither.

Results from the temporal frequency experiment are shown in Figure 4. Figure 4a shows the Fourier spectra for responses to binocular flicker at 5 different frequencies (2, 4, 8, 16, and 30 Hz). From 2–16 Hz, clear signals are observed at each fundamental frequency, and typically also their higher harmonics (integer multiples of the fundamental). However, at 30 Hz (upper row), the responses recorded were not demonstrably above the noise baseline. Figure 4b compares the monocular and binocular responses at each stimulation frequency. Here, we replicate the substantial summation effect across frequencies up to and including 16 Hz (Figure 4c), demonstrating that strong binocular facilitation in the EEG data of Experiment 1 cannot be attributed to our use of 2 Hz flicker.

Binocular facilitation at different temporal frequencies, measured using EEG.

Panel (a) shows Fourier spectra for responses to binocular flicker at five different frequencies (offset vertically for clarity). Panel (b) shows the response at each stimulation frequency for monocular (red circles) and binocular (blue squares) presentation. Panel (c) shows the ratio of binocular to monocular responses. Error bars and shaded regions indicate bootstrapped standard errors across N=12 participants.

Experiment 3

In Experiment 1, we found evidence of stronger binocular facilitation for cortical responses to luminance flicker (measured using EEG), compared with subcortical responses (measured using pupillometry; see Figure 3). Since perception is dependent on cortical responses, these results provide a clear prediction for perceived contrast judgments indexed by psychophysical contrast matching paradigms (e.g. Anstis and Ho, 1998; Legge and Rubin, 1981; Levelt, 1965; Quaia et al., 2018). We therefore conducted such an experiment, in which participants judged which of two stimuli had the greater perceived amplitude of flicker. On each trial, one stimulus was a matching stimulus, that had a fixed binocular flicker amplitude of either 24% or 48% (temporal) contrast. The other stimulus was a target stimulus, the contrast of which was controlled by a staircase algorithm. We tested 9 ratios of target contrast between the left and right eyes.

The results from the matching experiment are shown in Figure 5. Each data point indicates the contrast levels required in each eye that were perceptually equivalent to the binocular 24% (red circles) and 48% (blue circles) matching contrasts. At both matching contrasts, we see a very substantial increase in the physical contrast required for a monocular target (data points along the x- and y-axes), compared to a binocular target (points along the diagonal of x=y). For example with a 48% match, the monocular targets required contrasts close to 100%, whereas binocular targets required a contrast of around 50%. The data points between these extremes also fall close to the predictions of a linear summation model (diagonal dotted lines), and are inconsistent with a winner-takes-all (or MAX) model (dashed lines). Overall, these matching results are consistent with the approximately linear summation effects observed in the EEG data of Experiment 1 (Figure 2c and f).

Contrast matching functions.

Dotted and dashed lines are predictions of canonical summation models involving linear combination (dotted) or a winner-take-all rule (dashed). Error bars indicate the standard error across participants (N=10), and are constrained along radial lines converging at the origin. Note that, for the 48% match, the data point on the x-axis falls higher than 100% contrast. This is because the psychometric function fits for some individuals were interpolated such that the PSE fell above 100%, shifting the mean slightly above that value.
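The two reference predictions can be sketched numerically. The code below assumes that the binocular match is equivalent to a summed signal of twice the match contrast under linear summation, and to the match contrast itself under a MAX rule. The function names and the left:right weights are hypothetical; the nine ratios actually tested are not listed in this section.

```python
def linear_sum_match(match_contrast, left_weight, right_weight):
    """Eye contrasts (L, R) in proportion left:right such that L + R equals
    the summed binocular match signal (2 * match_contrast)."""
    scale = 2 * match_contrast / (left_weight + right_weight)
    return left_weight * scale, right_weight * scale


def max_rule_match(match_contrast, left_weight, right_weight):
    """Eye contrasts (L, R) in proportion left:right such that the larger of
    the two equals the match contrast (winner-take-all)."""
    scale = match_contrast / max(left_weight, right_weight)
    return left_weight * scale, right_weight * scale


# Illustrative ratios only: monocular, unequal, and balanced binocular targets.
for weights in [(1, 0), (1, 0.5), (1, 1)]:
    print(weights,
          linear_sum_match(48, *weights),
          max_rule_match(48, *weights))
```

Under linear summation, a monocular target needs 96% contrast to match a 48% binocular stimulus, whereas the MAX rule predicts 48% in every configuration. The data in Figure 5 sit close to the former.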

Computational modeling

We fitted a computational model to the data from Experiments 1 & 3 using a hierarchical Bayesian approach. The model behavior is displayed in Figure 6a–d, with empirical data superimposed for comparison. In general, the model captures the key characteristics of the empirical data, with group-level parameter estimates provided in Table 1. We were particularly interested in comparing the weight of interocular suppression across datasets. We therefore plot the posterior distributions for this parameter for all four datasets (see Figure 6e). The key finding is that the pupillometry results (green distribution) display a much greater weight of interocular suppression compared with the other datasets (gray, purple, and yellow distributions). There is no overlap between the pupillometry distribution and any of the other three. All four distributions are also meaningfully below a weight of 1 – the value that previous work using grating stimuli would predict (Baker and Wade, 2017; Meese et al., 2006), and the peak location of our prior distribution (black curve). These results offer an explanation of the empirical data: the strong interocular suppression for the pupillometry data is consistent with the weak binocular facilitation, and measurable dichoptic masking observed using that method. The weaker suppression for the other experiments is consistent with the near-linear binocular facilitation effects, and absent dichoptic masking.

Summary of computational modeling.

Panels (a–d) show empirical data from key conditions, replotted from earlier figures for the pupillometry (a), first harmonic EEG responses (b), second harmonic EEG responses (c) and contrast matching (d) experiments, with curves showing model behavior generated using the median group-level parameter values. Panel (e) shows the posterior probability distributions of the interocular suppression parameter for each of the four model fits. The pupillometry distribution (green) is centered about a substantially higher suppressive weight than for the other data types (note the logarithmic x-axis). The black curve shows the (scaled) prior distribution for the weight parameter.

Table 1
Summary of median parameter values.
Dataset        Z      n      w      Rmax
Pupillometry   3.44   0.01   0.61   0.00023
EEG 1 F        2.62   0.15   0.02   0.00336
EEG 2 F        3.71   0.07   0.02   0.0031
Matching       0.30   5.10   0.09   -
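To give a feel for how the suppression weight w shapes binocular facilitation, the sketch below uses a single-stage gain control form in the style of the models cited above (Meese et al., 2006). The exact equations are not stated in this section, so the functional form here (squared contrast divided by a saturation constant plus self- and cross-eye suppression) is an assumption made purely for illustration. The n and Rmax parameters from Table 1 are left out because their roles are not defined here. Function names are hypothetical.

```python
def eye_response(own, other, z, w):
    """Assumed monocular channel: excitation divided by a saturation constant
    plus self-suppression and weighted suppression from the other eye."""
    return own ** 2 / (z + own + w * other)


def summed_response(left, right, z, w):
    """Sum of the two monocular channel outputs (assumed combination rule)."""
    return eye_response(left, right, z, w) + eye_response(right, left, z, w)


def binocular_ratio(contrast, z, w):
    """Binocular response divided by monocular response at one contrast."""
    monocular = summed_response(contrast, 0.0, z, w)
    binocular = summed_response(contrast, contrast, z, w)
    return binocular / monocular


# Z is fixed at the pupillometry value from Table 1 for all rows, so that
# only the suppression weight varies between lines of output.
for label, w in [("no suppression", 0.0),
                 ("EEG-like weight", 0.02),
                 ("pupil-like weight", 0.61),
                 ("grating-like weight", 1.0)]:
    print(label, round(binocular_ratio(48.0, 3.44, w), 3))
```

Under this assumed form, a weight of 0 gives exact doubling of the response (linear additivity), and a weight of 1 gives a ratio close to 1 at high contrast (ocularity invariance). Intermediate weights fall between the two, which is the qualitative pattern the posterior distributions in Figure 6e describe.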

Discussion

Using a novel paradigm that combines EEG and pupillometry, we found surprising results for the binocular integration of flickering light. In the visual cortex response (indexed by EEG), the binocular combination of spatially uniform temporal luminance modulations seems to happen approximately linearly, with no evidence of interocular suppression. Evidence for this comes from the substantial binocular facilitation effect when comparing monocular and binocular responses, and the lack of a dichoptic suppression effect when the two eyes were stimulated at different frequencies. In the subcortical pathway (indexed by pupillometry), the binocular combination is more non-linear, with evidence of interocular suppression. This was evidenced by a weaker binocular facilitation, and stronger dichoptic suppression, relative to the EEG data. This pattern of results was confirmed by computational modeling, which showed a much greater suppressive weight for the pupillometry data compared to the EEG data. Additionally, we found that the perception of flickering light is consistent with a near-linear binocular summation process, consistent with the cortical (EEG) responses.

The results of our main experiment were unexpected for both the pupillometry and the EEG measures. Previous studies investigating binocular combination of spatial patterns (i.e. sine wave grating stimuli) are generally consistent with strong interocular suppression and weak binocular facilitation at high contrasts (Baker and Wade, 2017; Meese et al., 2006; Moradi and Heeger, 2009) (however, we note that facilitation as substantial as ours has been reported in previous EEG work by Apkarian et al., 1981). Our second experiment ruled out the possibility that these differences were due to the lower temporal frequency (2 Hz) used here. However, there is evidence of more extensive binocular facilitation for a range of other stimuli. Using scleral search coils, Quaia et al. (2018) observed a strong binocular facilitation (or ‘supersummation’) in the reflexive eye movement response to rapidly moving stimuli (also known as the ocular following response). Spitschan and Cajochen (2019) report a similar result in archival data on melatonin suppression due to light exposure (melatonin is a hormone released by the pineal gland that regulates sleep; its production is suppressed by light exposure and can be measured from saliva assays). Work on the accommodative response indicates that binocular combination is approximately linear (Flitcroft et al., 1992), and can even cancel when signals are in antiphase (we did not try this configuration here). In the auditory system, interaural suppression of amplitude modulation also appears to be weak when measured using a similar steady-state paradigm (Baker et al., 2020). Finally, psychophysical matching experiments using static stimuli also show near-linear behavior for luminance increments (Anstis and Ho, 1998; Baker et al., 2012; Levelt, 1965), though not for luminance decrements (Anstis and Ho, 1998). Overall, this suggests that strong interocular normalization may be specific to spatial pattern vision, and not a general feature of binocular signal combination (or combination across multiple inputs in other senses).

Given the above, where does this leave our understanding of the overarching purpose of signal combination? Baker and Wade (2017) point out that strong suppression between channels that are subsequently summed is equivalent to a Kalman filter, which is the optimal method for combining two noisy inputs (see also Ernst and Banks, 2002). Functionally, interocular suppression may, therefore, act to dynamically suppress noise, rendering binocular vision more stable. This account has intuitive appeal and is consistent with other models that propose binocular combination as a means of redundancy reduction (Li and Atick, 1994; May and Zhaoping, 2022). One possibility is that optimal combination is useful for visual perception (a critical system for interacting with the local environment) and is, therefore, worth devoting the additional resource of inhibitory wiring between ocular channels. However, the other examples of binocular combination discussed above are primarily physiological responses (pupil size, eye movements, hormone release) that may benefit more from an increased signal-to-noise ratio, or otherwise be phylogenetically older than binocular pattern vision. Conceptualized another way, the brain can repurpose a generic architecture for different situational demands by adjusting parameter values (here the weight of interocular suppression) to achieve different outcomes. Our future work in this area intends to compare binocular combinations for specific photoreceptor pathways, including different cone classes, and intrinsically photoreceptive retinal ganglion cells.

Pupil size affects the total amount of light falling on the retina. It is, therefore, the case that fluctuations in pupil diameter will have a downstream effect on the signals reaching cortex. We did not incorporate such interactions into our computational model, though in principle this might be worthwhile. However, we anticipate that any such effects would be small since pupil modulations at 2 Hz are in the order of 2% of overall diameter (e.g. Spitschan et al., 2014). It is also the case that cortical activity can modulate pupil diameter, usually through arousal and attention mechanisms (e.g. Bradley et al., 2008). We think it unlikely that these temporally coarse processes would have a differential effect on e.g., monocular and binocular stimulation conditions in our experiment, and any fluctuations during an experimental session (perhaps owing to fatigue) will be equivalent for our comparisons of interest. Therefore, we make the simplifying assumption that the pupil and perceptual pathways are effectively distinct, but hope to investigate this more directly in future neuroimaging work. Using fMRI to simultaneously image cortical and subcortical brain regions will also allow us to check that the differences we report here are not a consequence of the different measurement techniques we used (pupillometry and EEG).

Classic studies investigating the neurophysiological architecture of V1 reported that cells in cytochrome-oxidase ‘blobs’ (Horton and Hubel, 1981; Livingstone and Hubel, 1984) are biased towards low spatial frequencies (Edwards et al., 1995; Tootell et al., 1988), and relatively insensitive to stimulus orientation (Horton and Hubel, 1981; Livingstone and Hubel, 1984; though see Economides et al., 2011). As the blob regions are embedded within ocular dominance columns (Horton and Hubel, 1981), they are also largely monocular (Livingstone and Hubel, 1984; Tychsen et al., 2004). More recent work has reported psychophysical evidence for unoriented chromatic (Gheiratmand et al., 2013) and achromatic (Meese and Baker, 2011) mechanisms, that also appear to be monocular. Our use of luminance flicker might preferentially stimulate these mechanisms, perhaps explaining why our EEG data show little evidence of binocular interactions. Indeed, our EEG results could potentially be explained by a model involving entirely non-interacting monocular channels, with the binocular facilitation effects we find (e.g. Figures 3 and 4) owing to additivity of the electrophysiological response across independent monocular cells, rather than within binocular neurons. We, therefore, performed an additional analysis to investigate this possibility.

In the steady-state literature, one hallmark of a nonlinear system that pools inputs is the presence of intermodulation responses at the sums and differences of the fundamental flicker frequencies (Baitch and Levi, 1988; Tsai et al., 2012). In Figure 7 we plot the amplitude spectra of conditions from Experiment 1 in which the two eyes were stimulated at different frequencies (2 Hz and 1.6 Hz) but at the same contrast (48%; these correspond to the binocular cross and dichoptic cross conditions in Figures 1d, e and 2d, e). Figure 7a reveals a strong intermodulation difference response at 0.4 Hz (red dashed line), and Figure 7b reveals an intermodulation sum response at 3.6 Hz (red dashed line). It seems likely that the absence of a sum response for pupillometry data, and of a difference response for the EEG data, is a consequence of the temporal constraints of these methods. The presence of intermodulation terms is predicted by nonlinear gain control models of the type considered here (Baker and Wade, 2017; Tsai et al., 2012), and indicates that the processing of monocular flicker signals is not fully linear prior to the point at which they are combined across the eyes. Indeed, our model architecture (Meese et al., 2006) makes specific predictions about the location of interocular suppression - it impacts before binocular combination, consistent with results from primate physiology (Dougherty et al., 2019).

Summary of intermodulation responses in pupillometry (a) and EEG (b) data.

The data are pooled across the binocular cross and dichoptic cross conditions of Experiment 1, with a target contrast of 48%. Vertical dashed lines indicate the fundamental flicker frequencies of 2 Hz (F1; black) and 1.6 Hz (F2; green), and the intermodulation difference (F1-F2=0.4 Hz) and sum (F1+F2=3.6 Hz) frequencies (red). Data are averaged across N=30 participants, and shaded regions indicate ±1 standard error.
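Why a nonlinearity after pooling produces intermodulation terms can be shown with a toy example. The sketch below sums two unit-amplitude sinusoids at 2 Hz and 1.6 Hz and passes the sum through a squaring nonlinearity. Squaring is a stand-in chosen for simplicity, not the gain control model used in the paper, and the 10 s duration and 100 Hz sampling rate are arbitrary choices that place every frequency of interest on a whole number of cycles.

```python
import cmath
import math


def fourier_amplitude(signal, sample_rate_hz, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT)."""
    n = len(signal)
    coeff = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
        for k, x in enumerate(signal)
    )
    return 2 * abs(coeff) / n


sample_rate = 100                      # Hz (arbitrary for this sketch)
n_samples = 10 * sample_rate           # 10 s of signal
f1, f2 = 2.0, 1.6                      # flicker frequencies from the text

times = [k / sample_rate for k in range(n_samples)]
left = [math.sin(2 * math.pi * f1 * t) for t in times]
right = [math.sin(2 * math.pi * f2 * t) for t in times]

linear = [l + r for l, r in zip(left, right)]   # pooling only
nonlinear = [s ** 2 for s in linear]            # pooling, then squaring

for freq in (f1 - f2, f1 + f2):
    print(round(freq, 1),
          round(fourier_amplitude(linear, sample_rate, freq), 6),
          round(fourier_amplitude(nonlinear, sample_rate, freq), 6))
```

The purely linear sum contains energy only at 2 Hz and 1.6 Hz. After squaring, the cross term 2·sin(a)·sin(b) = cos(a−b) − cos(a+b) introduces components at exactly the difference (0.4 Hz) and sum (3.6 Hz) frequencies.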

Conclusions

We have demonstrated that the binocular combination of flickering light differs between cortical and subcortical pathways. Flicker was also associated with substantially weaker interocular suppression, and stronger binocular facilitation, compared to the combination of spatial luminance modulations in the visual cortex. Our computational framework for understanding signal combination permits direct comparisons between disparate experimental paradigms and data types. We anticipate that this will help elucidate the constraints the brain faces when combining different types of signals to govern perception, action, and biological function.

Methods

Participants

Thirty (20 females), twelve (seven females), and ten (three females) adult participants, whose ages ranged from 18 to 45, were recruited for Experiments 1, 2, and 3, respectively. All participants had normal or corrected-to-normal binocular vision, and gave written informed consent. Our procedures were approved by the Ethics Committee of the Department of Psychology at the University of York (identification number 792).

Apparatus & stimuli

The stimuli were two discs of achromatic flickering light with a diameter of 3.74 degrees, presented on a black background. The same stimuli were used for all three experiments. Four dark red lines were added around both discs to help with their perceptual fusion, giving the appearance of a single binocular disc (see upper left inset in Figure 8 for an example of the fused stimulus). The discs were viewed through a four-mirror stereoscope, which used front silvered mirrors to avoid internal reflections, and meant that participants saw a single fused disc. The use of a stereoscope allowed us to modulate the stimuli in three different ocular configurations: monocular, binocular, and dichoptic. Note that during the monocular presentation of flicker, the unstimulated eye still saw the static (non-flickering) disc of mean luminance.

Schematic diagram illustrating the ocular arrangements, and temporal waveforms of the luminance modulations used in Experiment 1.

Shaded waveforms indicate a target stimulus, that was presented at one of five contrasts on each trial (denoted by the shading levels). Unshaded waveforms indicate mask stimuli, that were presented at a fixed contrast level of 48% regardless of the target contrast. Each waveform corresponds to a 1 s period of a 12 s trial, and coloured symbols are for consistency with Figures 1 and 2. The icon in the upper left corner illustrates the stimulus appearance (a luminous disc against a black background). The left and right eye assignments were counterbalanced across trials in the experiment (i.e. the monocular stimulus could be shown to either eye with equal probability).

All stimuli had a mean luminance of 42 cd/m² and were displayed on an Iiyama VisionMaster Pro 510 display (800 × 600 pixels, 60 Hz refresh rate), which was gamma-corrected using a Minolta LS-110 photometer (Minolta Camera Co. Ltd., Japan). For Experiments 1 and 2, the stimuli were presented using PsychoPy (v3.0.7). For Experiment 3, the stimuli were presented using PsychoPy (v2022.1.1).

EEG data were collected for Experiments 1 and 2 using a 64-electrode ANT WaveGuard cap and the signals were recorded at 1 kHz using the ASA software (ANT Neuro, Netherlands). Pupillometry data were collected for Experiment 1 using a binocular Pupil Core eye-tracker (Pupil Labs GmbH, Berlin, Germany; Kassner et al., 2014) running at 120 Hz, and the signals were recorded with the Pupil Capture software.

Procedure

Before each experiment, participants adjusted the angle of the stereoscope mirrors to achieve binocular fusion. This was done so that they would perceive the two discs as one fused disc when looking at the screen through the stereoscope.

Experiment 1: simultaneous EEG and pupillometry

The experiment was conducted in a windowless room, in which the only light source was the monitor. The participants sat at 99 cm from the monitor and the total optical viewing distance (through the stereoscope) was 107 cm. The experiment was carried out in a single session lasting 45 min in total, divided into three blocks of 15 min each. In each block, there were 60 trials lasting 15 s each (12 s of stimulus presentation, with an interstimulus interval of 3 s). The participants were given no task other than to look at the fixation cross in the middle of the disc while trying to minimize their blinking during the presentation period.

We included six distinct ocular conditions, each at five temporal contrast levels (combined factorially) relative to the mean luminance: 6, 12, 24, 48, and 96%. Contrast was defined as temporal Michelson contrast; the difference between maximum and minimum luminances, scaled by the mean and expressed as a percentage. In the first three conditions, the discs flickered at 2 Hz, in either a monocular, binocular, or dichoptic arrangement (see upper rows of Figure 8). In the dichoptic condition, the non-target eye saw a fixed temporal contrast of 48%. The rationale for including the monocular and binocular conditions is that they permit us to measure empirically any binocular facilitation, by comparing the response amplitudes across these two conditions. The rationale for including the dichoptic condition is that it provides additional constraints to computational models, and further explores the binocular contrast-response space (see Baker et al., 2007).

In the remaining three conditions (termed the cross-frequency conditions) an additional flicker frequency of 1.6 Hz was introduced. We chose this frequency because it is sufficiently well-isolated from the target frequency (2 Hz) in the Fourier spectrum for 10 s trials. We repeated the monocular condition with this stimulus (one eye sees 1.6 Hz flicker, the other sees mean luminance), as well as testing in a binocular cross configuration (one eye sees each frequency at the target contrast). The rationale for the binocular cross condition is that it allows us to see the effects of suppression between the eyes without the additional complication of signal summation (which occurs when both eyes receive the same frequency), because the response of each eye can be resolved independently by frequency. Finally, in the dichoptic cross condition, one eye saw the target stimulus flickering at 2 Hz, and the other eye saw flicker at 1.6 Hz with a contrast of 48%. Again, this reveals the presence of suppression (by comparison with the 2 Hz monocular condition). A schematic overview of the cross-frequency conditions is shown in the lower rows of Figure 8. In all conditions, we counterbalanced the presentation of the target stimulus across the left and right eyes.
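The isolation of the two tagging frequencies can be checked with a short calculation. The sketch below assumes only the 10 s analysis window and the frequencies stated in the text; the function name is hypothetical. A 10 s window gives a frequency resolution of 0.1 Hz, so a component falls on a single Fourier bin whenever it completes a whole number of cycles in the window.

```python
def fourier_bin(frequency_hz, window_s):
    """Return (nearest_bin, is_exact) for a frequency in a window of the given duration.

    The bin index equals the number of cycles completed within the window;
    a non-integer cycle count would leak energy into neighbouring bins.
    """
    cycles = frequency_hz * window_s
    nearest = round(cycles)
    return nearest, abs(cycles - nearest) < 1e-9


WINDOW_S = 10.0      # analysis window stated in the text
F1, F2 = 2.0, 1.6    # flicker frequencies stated in the text

for label, f in [("F1", F1), ("F2", F2), ("F1-F2", F1 - F2), ("F1+F2", F1 + F2)]:
    k, exact = fourier_bin(f, WINDOW_S)
    print(label, round(f, 3), "Hz -> bin", k, "(exact)" if exact else "(leaks)")
```

Under these assumptions the two fundamentals and both intermodulation terms occupy separate bins, which is what allows the response of each eye to be read out independently.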

Experiment 2: EEG responses across temporal frequency

This experiment used the same equipment setup as Experiment 1, except that the eye tracker was not used. Unlike the first experiment, only one contrast level was used (96%) and the discs were set to flicker at five different frequencies (2, 4, 8, 16, and 30 Hz). Only two ocular configurations, monocular and binocular, were included, with the latter having both discs flickering at the same frequency. The experiment was carried out in one session lasting 25 min in total, divided into five blocks of 5 min each. In each block, there were 20 trials in total with the same timing as for Experiment 1.

Experiment 3: temporal contrast matching

The experiment was conducted in a darkened room with a blacked-out window. The display equipment (monitor and stereoscope) was the same as for the two previous experiments, but no EEG or pupillometry data were collected. A two-interval contrast matching procedure was used to collect data. In one interval, participants were presented with a standard fused disc that flickered at a set contrast level (either 24 or 48%), which was selected by the experimenter at the beginning of each block. In the other interval, a target disc was displayed, flickering at different contrast levels on each trial, but with a fixed interocular contrast ratio across the block. The contrast level of the target was controlled by a 1-up, 1-down staircase moving in logarithmic (dB) steps of contrast. The ratio of flicker amplitudes in the left and right eyes was varied across blocks and was set to be 0, 0.25, 0.5, 0.75, or 1 (nine distinct conditions). The standard and target discs were displayed for 1 s each, with an interstimulus interval of 0.5 s. After both discs had appeared on the screen, the participants had to indicate which interval they perceived as having the more intense flicker. The intervals were randomly ordered, and all discs flickered at a frequency of 2 Hz (two cycles in sine phase).
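The staircase rule described above can be sketched as follows. This is an illustration only: the step size is not given in the text, so `step_db` is a hypothetical value, and the conversion assumes the common convention that contrast in dB is 20·log10 of contrast in percent.

```python
def next_level_db(level_db, target_judged_stronger, step_db=3.0):
    """1-up, 1-down rule in dB steps.

    The target contrast is lowered after a 'target stronger' response and
    raised otherwise, so the staircase converges on the 50% point.
    step_db is a hypothetical value, not one reported in the text.
    """
    if target_judged_stronger:
        return level_db - step_db
    return level_db + step_db


def db_to_percent(level_db):
    """Convert dB to percent contrast, assuming dB = 20 * log10(percent)."""
    return 10 ** (level_db / 20.0)


level = 30.0
for response in [True, True, False, True]:
    level = next_level_db(level, response)
print(level, db_to_percent(level))
```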

Due to its long duration (approximately 3 hr in total), the participants completed the experiment across multiple sessions initiated at their own convenience. The experiment was divided into 54 blocks (3 repetitions ×2 standard contrasts ×9 target ratios), which lasted on average 3 min each, depending on the response speed of the participant. In each block, there was a total of 50 trials. No auditory feedback was given for this subjective task.

Data analysis

EEG data were converted from the ANT-EEProbe format to a compressed csv text file using a custom Matlab script (available at: https://github.com/bakerdh/PupillometryEEG/ copy archived at Baker, 2023; Segala et al., 2023) and components of the EEGlab toolbox (Delorme and Makeig, 2004). The data for each participant were then loaded into R for analysis, where a 10 s waveform for each trial at each electrode was extracted (omitting the first 2 s). The Fourier transform of each waveform was calculated, and the complex spectrum was stored in a matrix. All repetitions of each condition were then averaged for each electrode. They were then averaged across four occipital electrodes (POz, Oz, O1, O2), to obtain individual results. Finally, these were averaged across participants to obtain the group results. All averaging was performed in the complex domain and, therefore, retained the phase information (i.e. coherent averaging), and at each stage, we excluded data points with a Mahalanobis distance exceeding D = 3 from the complex-valued mean (see Baker, 2021). For statistical comparisons of complex-valued data, we use the ANOVAcirc2 statistic described by Baker, 2021. This is a multivariate extension of ANOVA that assumes equal variance of the real and imaginary Fourier components, or equivalently, an extension of the Tcirc2 statistic of Victor and Mast, 1991 that can compare more than two conditions.
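The difference between coherent and incoherent averaging can be illustrated with a minimal sketch. The values below are made up for illustration, and the function names are hypothetical; this is not the analysis code from the repository.

```python
import cmath


def coherent_amplitude(components):
    """Average the complex Fourier components first, then take the amplitude."""
    mean = sum(components) / len(components)
    return abs(mean)


def incoherent_amplitude(components):
    """Take the amplitude of each component first, then average (phase is discarded)."""
    return sum(abs(c) for c in components) / len(components)


# Two hypothetical trials of equal amplitude: phase-aligned vs. opposite phase
aligned = [cmath.rect(1.0, 0.3), cmath.rect(1.0, 0.3)]
opposed = [cmath.rect(1.0, 0.0), cmath.rect(1.0, cmath.pi)]

print(coherent_amplitude(aligned), incoherent_amplitude(aligned))
print(coherent_amplitude(opposed), incoherent_amplitude(opposed))
```

Components with a consistent phase survive coherent averaging, whereas components with inconsistent phase (such as noise) tend to cancel, which is why the phase information is retained at every averaging stage.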

A similar analysis pipeline was adopted for the pupillometry data. The data were converted from mp4 videos to a csv text file using the Pupil Player software (Kassner et al., 2014), which estimated pupil diameter for each eye on each frame using a 3D model of the eyeball. The individual data were then loaded into R for analysis, where again a 10 s waveform for each trial in each eye was extracted (excluding the first 2 s after stimulus onset). We interpolated across any dropped or missing frames to ensure regular and continuous sampling over time. The Fourier transform was calculated for each waveform, and all repetitions of each condition were pooled across the eyes and then averaged. We confirmed in additional analyses that the monocular consensual pupil response was complete, justifying our pooling of data across the eyes. Finally, data were averaged across all participants to obtain the group results. Again, we used coherent averaging, and excluded outlying data points in the same way as for the EEG data. Note that previous pupillometry studies using luminance flicker have tended to fit a single sine-wave at the fundamental frequency, rather than using Fourier analysis (e.g. Spitschan et al., 2014). The Fourier approach is more robust to noise at other frequencies (which can make the phase and amplitude of a fitted sine wave unstable) and has been used in some previous studies (see Barrionuevo et al., 2014; Barrionuevo and Cao, 2016). Additionally, it makes the pupillometry analysis consistent with standard practice in steady-state EEG analysis (e.g. Figueira et al., 2022).

To analyze the matching data, we pooled the trial responses across all repetitions of a given condition for each participant. We then fitted a cumulative normal psychometric function to estimate the point of subjective equality at the 50% level. Thresholds were averaged across participants in logarithmic (dB) units.
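A cumulative normal fit of this kind can be sketched as below. The fitting routine used in the study is not described in the text, so the grid-search maximum-likelihood fit here is a stand-in assumption, and the response counts are invented for illustration.

```python
import math


def cumulative_normal(x, mu, sigma):
    """Probability of judging the target as stronger at level x (in dB)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def neg_log_likelihood(levels, n_stronger, n_total, mu, sigma):
    """Binomial negative log-likelihood of the pooled responses."""
    nll = 0.0
    for x, k, n in zip(levels, n_stronger, n_total):
        p = min(max(cumulative_normal(x, mu, sigma), 1e-9), 1.0 - 1e-9)
        nll -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return nll


def fit_pse(levels, n_stronger, n_total, step=0.1):
    """Grid-search fit; returns (pse, sigma), where the PSE is the 50% point (mu)."""
    lo, hi = min(levels), max(levels)
    n_steps = int(round((hi - lo) / step))
    mus = [lo + i * step for i in range(n_steps + 1)]
    sigmas = [step * i for i in range(1, n_steps + 1)]
    best = min((neg_log_likelihood(levels, n_stronger, n_total, m, s), m, s)
               for m in mus for s in sigmas)
    return best[1], best[2]


# Hypothetical pooled data: target levels in dB and 'target stronger' counts
levels_db = [24.0, 27.0, 30.0, 33.0, 36.0]
n_stronger = [1, 8, 25, 42, 49]
n_total = [50, 50, 50, 50, 50]

pse, sigma = fit_pse(levels_db, n_stronger, n_total)
print(pse, sigma)
```

In practice a gradient-based optimizer would be a more usual choice than a grid; the grid is used here only to keep the sketch free of external dependencies.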

For all experiments, we used a bootstrapping procedure with 1000 iterations to estimate standard errors across participants. All analysis and figure construction was conducted using a single R-script, available online, making this study fully computationally reproducible.
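A bootstrap estimate of the standard error across participants can be sketched as follows. The sketch assumes scalar per-participant values and resampling of participants with replacement; the data, the seed, and the function name are hypothetical, and the published R script may differ in detail.

```python
import random
import statistics


def bootstrap_se(values, n_iterations=1000, seed=0):
    """Standard error of the mean, estimated by resampling participants with replacement."""
    rng = random.Random(seed)
    n = len(values)
    means = []
    for _ in range(n_iterations):
        resample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(sum(resample) / n)
    # The spread of the resampled means approximates the standard error
    return statistics.stdev(means)


# Hypothetical per-participant amplitudes
amplitudes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
se = bootstrap_se(amplitudes)
print(se)
```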

Computational model and parameter estimation

To describe our data, we chose a model of binocular contrast gain control with the same general form as the first stage of the model proposed by Meese et al., 2006. The second gain control stage was omitted (consistent with Baker and Wade, 2017) to simplify the model and reduce the number of free parameters. The response of the left eye’s channel is given by:

(1) $\mathrm{Resp}_L = \frac{L^2}{Z + L + wR},$

with an equivalent expression for the right eye:

(2) $\mathrm{Resp}_R = \frac{R^2}{Z + R + wL}.$

In both equations, L and R are the contrast signals from the left and right eyes, Z is a saturation constant that shifts the contrast-response function laterally, and w is the weight of suppression from the other eye.

The responses from the two eyes are then summed binocularly:

(3) $\mathrm{Resp}_B = R_{max}(\mathrm{Resp}_L + \mathrm{Resp}_R) + n,$

where n is a noise parameter, and Rmax scales the overall response amplitude. The Rmax parameter was omitted when modeling the contrast-matching data, as it has no effect in this paradigm.
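The three equations can be written out for scalar contrasts as a minimal sketch. The parameter values below are illustrative rather than fitted (Z = 5 is chosen arbitrarily for the example, with Rmax = 1 and n = 0), and the function names are hypothetical.

```python
def eye_response(own, other, Z, w):
    """Equations 1 and 2: squared own-eye signal divided by the suppressive pool."""
    return own ** 2 / (Z + own + w * other)


def binocular_response(L, R, Z, w, Rmax=1.0, n=0.0):
    """Equation 3: scaled sum of the two monocular channels, plus noise."""
    return Rmax * (eye_response(L, R, Z, w) + eye_response(R, L, Z, w)) + n


def binocular_ratio(C, Z, w):
    """Ratio of the binocular response to the monocular response at contrast C."""
    monocular = binocular_response(C, 0.0, Z, w)  # other eye sees zero contrast
    binocular = binocular_response(C, C, Z, w)
    return binocular / monocular


for w in (0.0, 0.5, 1.0):
    print(w, binocular_ratio(48.0, 5.0, w))
```

With no interocular suppression (w = 0) the binocular response is twice the monocular response when n = 0, whereas with w = 1 the ratio falls close to 1 at high contrast.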

Despite being derived from the model proposed by Meese et al., 2006, the simplifications applied to this architecture make it very similar to other models (e.g. Ding and Sperling, 2006; ten Doesschate and Alpern, 1967; Legge, 1984; Schrödinger, 1926). In particular, we fixed the numerator exponent at 2 in our model, because otherwise, this value tends to trade off with the weight of interocular suppression (see Baker et al., 2012; Kingdom and Libenson, 2015). Our key parameter of interest is the weight of interocular suppression. Large values around w = 1 result in a very small or nonexistent binocular advantage at suprathreshold contrasts, consistent with previous work using grating stimuli (Baker and Wade, 2017). Low values around w = 0 produce substantial, near-linear binocular facilitation (Baker et al., 2020). Models from this family can handle both scalar contrast values and continuous waveforms (Tsai et al., 2012) or images (Meese and Summers, 2007) as inputs. For time-varying inputs, the calculations are performed at each time point, and the output waveform can then be analyzed using Fourier analysis in the same way as for empirical data. This means that the model can make predictions for the entire Fourier spectrum, including harmonic and intermodulation responses that arise as a consequence of nonlinearities in the model (Baker and Wade, 2017). However, for computational tractability, we performed fitting here using scalar contrast values.
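The use of time-varying inputs can be illustrated with a minimal sketch that applies equations 1–3 at each time point and reads out the amplitude at chosen frequencies. Several choices here are assumptions made for illustration: each input is a raised sinusoid between 0 and the stated contrast (so that the denominators stay positive), Rmax = 1 and n = 0, and Z and w take arbitrary values rather than the fitted ones.

```python
import math


def model_waveform(w, contrast=48.0, Z=5.0, f1=2.0, f2=1.6, duration_s=10.0, fs=100):
    """Binocular model output over time for dichoptic flicker at f1 (left) and f2 (right)."""
    n_samples = int(duration_s * fs)
    output = []
    for i in range(n_samples):
        t = i / fs
        # Assumed input representation: raised sinusoids, always non-negative
        L = 0.5 * contrast * (1.0 + math.sin(2 * math.pi * f1 * t))
        R = 0.5 * contrast * (1.0 + math.sin(2 * math.pi * f2 * t))
        output.append(L ** 2 / (Z + L + w * R) + R ** 2 / (Z + R + w * L))
    return output


def amplitude_at(signal, freq_hz, fs=100):
    """Amplitude of a single Fourier component, computed by direct projection."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq_hz * i / fs) for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * i / fs) for i, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n


for w in (0.0, 1.0):
    wave = model_waveform(w)
    print(w, amplitude_at(wave, 2.0), amplitude_at(wave, 0.4), amplitude_at(wave, 3.6))
```

In this sketch, setting w = 0 makes the output a sum of two independent monocular terms, so the intermodulation components at 0.4 and 3.6 Hz vanish, while any w > 0 couples the two eyes in the denominator and produces them. The sketch says nothing about the size of these components for the fitted parameters.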

We implemented the model within a Bayesian framework using the Stan software (Carpenter et al., 2017). This allowed us to estimate group-level posterior parameter distributions for the weight of interocular suppression, w, and the other free model parameters Rmax, Z, and n. The prior distributions for all parameters were Gaussian, with means and standard deviations of 1 and 0.5 for w and Rmax, and 5 and 2 for Z and n, with these values chosen based on previous literature (Baker et al., 2012; Meese et al., 2006). We sampled from a Student’s t-distribution for the amplitudes in the pupillometry and EEG experiments, and from a Bernoulli distribution for the single trial matching data. The models were fit using the individual data across all participants, independently for each dataset. We used coherent averaging to combine the data across participants, but this was not implemented in the model, so to compensate we corrected the group-level model by scaling the estimated noise parameter (n) by the square root of the number of participants ($n_{group} = n/\sqrt{30}$). We took posterior samples at over a million steps for each dataset, using a computer cluster, and retained 10% of samples for plotting.

Preregistration, data, and code availability

We initially preregistered our main hypotheses and analysis intentions for the first experiment. We then conducted a pilot study with N=12 participants, before making some minor changes to the stimulus (we added dim red lines to aid binocular fusion). We then ran the main experiment, followed by two additional experiments that were not preregistered. The preregistration document, raw data files, and experimental and analysis code are available on the project repository: https://doi.org/10.17605/OSF.IO/TBEMA.

Appendix 1

We conducted a conceptual replication of Experiment 1 using an alternative system of hardware and software (Martin et al., 2023; Martin et al., 2022). A pair of Spectra Tune Lab multiprimary devices (LEDmotive Technologies LLC, Barcelona, Spain) were coupled to a binocular headset using liquid light guides. The light was imaged onto a circular diffuser for each eye (field size 30 deg) with the central 8 degrees masked off using a black occluder. Therefore, the replication experiment involved peripheral stimulation, unlike the main experiment which stimulated the central ~4 degrees of the visual field. All conditions were otherwise as described for the main experiment, and we tested 12 participants in total.

The pupillometry results are shown in Appendix 1—figure 1 and correspond closely to those from the main experiment (Figure 1). The ratio of binocular to monocular responses in Appendix 1—figure 1c is similar, and suppression is evident in Appendix 1—figure 1d, e. Of particular interest is the existence of a slight response at the second harmonic (4 Hz, Appendix 1—figure 1b), which was not present in our original data. This may be because the driving signal is stronger when stimulating the periphery (note the clear waveform in Appendix 1—figure 1a), or more robust to eye movements, or it might indicate additional nonlinearities not present at the fovea.

Appendix 1—figure 1
Summary of pupillometry results for N=12 participants, for peripheral stimulation.

See Figure 1 for a description of each panel.

The EEG results are shown in Appendix 1—figure 2. These still show a strong binocular facilitation effect at the highest contrast levels (Appendix 1—figure 2c, f), but the contrast response function is less clear at both the first and second harmonics. We suspect that this is because the cortical representation of the peripheral visual field is primarily along the calcarine sulcus, which results in some cancellation of the steady-state signal. This results in weaker signals than we obtained for foveal stimulation (represented at the occipital pole) in the main experiment (see Figure 2).

Appendix 1—figure 2
Summary of steady-state EEG results for N=12 participants, for peripheral stimulation.

See Figure 2 for a description of each panel.

Data availability

Raw data files, and experimental and analysis code, are available on the project repository on the Open Science Framework: https://doi.org/10.17605/OSF.IO/TBEMA.

References

  1. Conference
    1. Kassner M
    2. Patera W
    3. Bulling A
    (2014)
    Pupil: An open source platform for pervasive eye tracking and mobile gaze-based interaction
    Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct Publication. pp. 1151–1160.
  2. Book
    1. McDougal DH
    2. Gamlin PD
    (2008)
    Pupillary Control Pathways. The Senses: A Comprehensive Reference
    Elsevier.
  3. Book
    1. Purves D
    2. Brannon EM
    3. Cabeza R
    4. LaBar KS
    5. Huettel SA
    6. Platt ML
    7. Woldorff MG
    (2008)
    Principles of Cognitive Neuroscience
    Oxford University Press.
  4. Book
    1. Schrödinger E
    (1926)
    Die Gesichtsempfindungen
    Mueller-Pouillets Lehrbuch Der Physik. Braunschweig: Vieweg. pp. 456–560.

Peer review

Reviewer #1 (Public Review):

In this paper, the interocular/binocular combination of temporal luminance modulations is studied. Binocular combination is of broad interest because it provides a remarkable case study of how the brain combines information from different sources. In addition, the mechanisms of binocular combination are of interest to vision scientists because they provide insight into when/where/how information from two eyes is combined.

This study focuses on how luminance flicker is combined across two eyes, extending previous work that focused mainly on spatial modulations. The results appear to show that temporal modulations are combined in different ways, with additional differences between subcortical and cortical pathways.

The manuscript has been revised to address prior reviewers' comments. It now provides more justification for the empirical choices made by the authors, and a better illustration of the methods. That said, the paper would still benefit from an expanded rationale for significance beyond this specific area. There were no substantive changes made to the abstract or introduction, and only little to the discussion.

https://doi.org/10.7554/eLife.87048.3.sa1

Reviewer #2 (Public Review):

Previous studies have extensively explored the rules by which patterned inputs from the two eyes are combined in visual cortex. Here the authors explore these rules for un-patterned inputs (luminance flicker) at both the level of cortex, using Steady-State Visual Evoked Potentials (SSVEPs) and at the sub-cortical level using pupillary responses. They find that the pattern of binocular combination differs between cortical and sub-cortical levels with the cortex showing less dichoptic masking and somewhat more binocular facilitation.

Importantly, the present results with flicker differ markedly from those with gratings (Hou et al., 2020, J Neurosci; Baker and Wade, 2017, Cerebral Cortex; Norcia et al., 2000, Neuroreport; Brown et al., 1999, IOVS). When SSVEP responses are measured under dichoptic conditions where each eye is driven with a unique temporal frequency, in the case of grating stimuli, the magnitude of the response in the fixed contrast eye decreases as a function of contrast in the variable contrast eye. Here the response increases by varying (small) magnitudes. The authors favor a view that cortex and perception pool binocular flicker inputs approximately linearly using cells that are largely monocular. The lack of a decrease below the monocular level when modulation strength increases is taken to indicate that the previously observed normalization mechanism in pattern vision does not play a substantial role in the processing of flicker. The authors present a computational model of binocular combination that captures features of the data when fit separately to each data set. Because the model has no frequency dependence and is based on scalar quantities, it cannot make joint predictions for the multiple experimental conditions, which is one of its limitations.

A strength of the current work is the use of frequency-tagging of both pupil and EEG responses to measure responses for flicker stimuli at two anatomical levels of processing. Flicker responses are interesting but have been relatively neglected. The tagging approach allows one to access responses driven by each eye, even when the other eye is stimulated which is a great strength. The tagging approach can be applied at both levels of processing at the same time when stimulus frequencies are low, which is an advantage as they can be directly compared. The authors demonstrate the versatility of frequency tagging in a novel experimental design which may inspire other uses, both within the present context and others. A disadvantage of the tagging approach for studying sub-cortical dynamics via pupil responses is that it is restricted to low temporal frequencies given the temporal bandwidth of the pupil. The inclusion of a behavioral measure and a model is also a strength, but there are some limitations in the modeling (see below).

The authors suggest in the discussion that luminance flicker may preferentially drive cortical mechanisms that are largely monocular and in the results that they are approximately linear in the dichoptic cross condition (no effect of the fixed contrast stimulus in the other eye). By contrast, prior research using dichoptic dual frequency flickering stimuli has found robust intermodulation (IM) components in the VEP response spectrum (Baitch and Levi, 1988, Vision Res; Stevens et al., 1994 J Ped Ophthal Strab; France and Ver Hoeve, 1994, J Ped Ophthal Strab; Suter et al., 1996 Vis Neurosci). The presence of IM is a direct signature of binocular interaction and suggests that at least under some measurement conditions, binocular luminance combination is "essentially" non-linear, where essential implies a point-like non-linearity such as squaring of excitatory inputs. The two views are in striking contrast.

In this revised manuscript, the addition of Figure 8, which shows more complete response spectra, partially addresses this issue. However, it also raises new questions. Critically, intermodulation (IM) has to be generated at or after a point of binocular combination, as it is a mixture of the two monocular frequencies and the monocular frequencies can only mix after a point of binocular combination.

In equations 1 and 2 and in the late summation and two-stage models of Meese et al (2006), there are divisive binocular cross-links prior to a summation block. This division is a form of binocular interaction. Do equations 1 and 2 generate IM on their own with parameters used for the overall modeling? Multiplication of two inputs clearly does, as the authors indicate in their toy model. If not, then a different binocular summation rule than the one expressed in equation 3 needs to be considered to produce IM.

The discussion considers flicker processing as manifest in the EEG to be largely monocular, given the relative lack of binocular facilitation and suppression effects. And yet there is robust IM. These are difficult to reconcile as it stands. The authors suggest that their generic modeling framework can predict IM, but can it predict IM with the parameters used to fit the data, e.g. with very low values of the weight of interocular suppression and no other binocular non-linearity?

Determining whether IM can be generated by the existing non-linear elements in the model is important because previous work on dichoptic flicker IM has considered a variety of simple models of dichoptic flicker summation and has favored models involving either a non-linear combination of linear monocular inputs (Baitch and Levi, Vis Research, 1988) or a non-linear combination of rectified (non-linear) monocular inputs (Regan and Regan, Canadian J Neurol Sci, 1989). In either case, the last stage of binocular combination is non-linear, rather than linear. The authors' model is different - it has a stage of divisive binocular interaction and this "quasi-monocular" stage feeds a linear binocular combination stage.

There is a second opportunity to test the proposed model that the authors could take advantage of. In the initial review, two of the reviewers were curious about what is predicted for counter-phase inputs to the two eyes. The authors indicate that the class of models they are using could be extended to cover this case. As it turns out, this experiment has been done for dichoptic full-field flicker (Sherrington, BrJPsychiatr, 1904; van der Tweel and Estevez, Ophthalmologica, 1974; Odom and Chao, IntJNeurosci, 1995; Cavonius, QJExpPsych, 1979; Levi et al., BJO, 1982). More importantly, the predictions of several binocular combination models for anti-phase inter-ocular flicker stimulation have been tested for both the VEP and psychophysics (Odom and Chao, Int J Neurosci). Varying the relative phase of the two eyes' inputs from in phase to antiphase, Odom and Chao observed that the 2nd harmonic response went to a minimum at 90 deg of interocular phase. This will happen because a 2nd order nonlinearity in the monocular path will double the phase shift of the second harmonic, putting the two eyes' 2nd harmonic response out of phase when the interocular phase is 90 deg. Summing these inputs thus leads to cancellation at 90 deg, rather than 180 deg of interocular phase. Does the authors' model predict this behavior with typical parameters used in the modeling? In the end, to account for details of both VEP and psychophysical data, Odom and Chao favored a two-path model with one path comprising non-linear monocular inputs being combined linearly and a second path combining linear monocular inputs at a non-linear binocular stage. A similar set of results and models has been developed for inter-ocular presentation of gratings (Zemon et al., PNAS, 1995).

The Odom/Chao/Zemon VEP and psychophysical data are directly relevant to the authors' work and need to be taken into account in sufficient detail so that we can judge the consistency of the proposed framework with their data and the similarities and differences in the model predictions for dichoptic flicker combination. These models are also relevant to the generation of IM, a concern raised above.

https://doi.org/10.7554/eLife.87048.3.sa2

Author response

The following is the authors’ response to the original reviews.

eLife assessment

This study provides potentially important, new information about the combination of information from the two eyes in humans. The data included frequency tagging of each eye's inputs and measures reflecting both cortical (EEG) and sub-cortical processes (pupillometry). Binocular combination is of potentially general interest because it provides, in essence, a case study of how the brain combines information from different sources and through different circuits. The strength of supporting evidence appears to be solid, showing that temporal modulations are combined differently than spatial modulations, with additional differences between subcortical and cortical pathways. However, the manuscript's clarity could be improved, including by adding more convincing motivations for the approaches used.

We thank the editor and reviewers for their detailed comments and suggestions regarding our paper. We have implemented most of the suggested changes. In doing so we noticed a minor error in our analysis code that affected the functions shown in Figure 2e (previously Figure 1e), and have fixed this and rerun the modelling. Our main results and conclusions are unaffected by this change. We have also added a replication data set to the Appendix, as this bears on one of the points raised by a reviewer, and included a co-author who helped run this experiment.

Reviewer #1 (Public Review):

In this paper, the interocular/binocular combination of temporal luminance modulations is studied. Binocular combination is of broad interest because it provides a remarkable case study of how the brain combines information from different sources. In addition, the mechanisms of binocular combination are of interest to vision scientists because they provide insight into when/where/how information from two eyes is combined.

This study focuses on how luminance flicker is combined across two eyes, extending previous work that focused mainly on spatial modulations. The results appear to show that temporal modulations are combined in different ways, with additional differences between subcortical and cortical pathways.

1. Main concern: subcortical and cortical pathways are assessed in quite different ways. On the one hand, this is a strength of the study (as it relies on unique ways of interrogating each pathway). However, this is also a problem when the results from two approaches are combined, leading to a sort of attribution problem: are the differences due to actual differences between the cortical and subcortical binocular combinations, or are they perhaps differences due to different methods? For example, the results suggest that the subcortical binocular combination is nonlinear, but it is not clear where this nonlinearity occurs. If this occurs in the final phase that controls pupillary responses, it has quite different implications.

At the very least, this work should clearly discuss the limitations of using different methods to assess subcortical and cortical pathways.

The modelling asserts that the nonlinearity is primarily interocular suppression, and that this is stronger in the subcortical pathway. Moreover, the suppression acts before binocular combination, so this is quite a specific location. We now say more about this in the Discussion, and also suggest that fMRI might avoid the limits on the conclusions we can draw from different methods.

2. Adding to the previous point, the paper needs to do a better job of justifying not only the specific methods but also other details of the study (e.g., why certain parameters were chosen). To illustrate, a semi-positive example: Only page 7 explains why 2Hz modulation was used, while the methods for 2Hz modulation are described in detail on page 3. No justifications are provided for most of the other experimental choices. The paper should be expanded to better explain this area of research to non-experts. A notable strength of this paper is that it should be of interest to those not working in this particular field, but this goal is not achieved if the paper is written for a specialist audience. In particular, the introduction should be expanded to better explain this area of research, the methods should include justifications for important empirical decisions, and the discussion should make the work more accessible again (in addition to addressing the issues raised in point 1 above). The results also need more context. For example, why do EEG data have overtones but pupillometry does not?

We now explain the choice of frequency in the final paragraph of the introduction as follows:

‘We chose a primary flicker frequency of 2Hz as a compromise between the low-pass pupil response (see Barrionuevo et al., 2014; Spitschan et al., 2014), and the relatively higher-pass EEG response (Regan, 1966).’

We also mention why the pupil response is low-pass:

‘The pupil response can be modulated by periodic changes in luminance, and is temporally low-pass (Barrionuevo et al., 2014; Spitschan et al. 2014), most likely due to the mechanical limitations of the iris sphincter and dilator muscles’.

Reviewer #2 (Public Review):

Previous studies have extensively explored the rules by which patterned inputs from the two eyes are combined in the visual cortex. Here the authors explore these rules for un-patterned inputs (luminance flicker) at both the level of the cortex, using Steady-State Visual Evoked Potentials (SSVEPs) and at the sub-cortical level using pupillary responses. They find that the pattern of binocular combination differs between cortical and sub-cortical levels with the cortex showing less dichoptic masking and somewhat more binocular facilitation.

Importantly, the present results with flicker differ markedly from those with gratings (Hou et al., 2020, J Neurosci; Baker and Wade, 2017, Cerebral Cortex; Norcia et al., 2000, Neuroreport; Brown et al., 1999, IOVS). When SSVEP responses are measured under dichoptic conditions where each eye is driven with a unique temporal frequency, in the case of grating stimuli, the magnitude of the response in the fixed contrast eye decreases as a function of contrast in the variable contrast eye. Here the response increases by varying (small) magnitudes. The authors favor a view that cortex and perception pool binocular flicker inputs approximately linearly using cells that are largely monocular. The lack of a decrease below the monocular level when modulation strength increases is taken to indicate that the normalization mechanism previously observed in pattern vision does not play a substantial role in the processing of flicker. The authors present a computational model of binocular combination that captures features of the data when fit separately to each data set. Because the model has no frequency dependence and is based on scalar quantities, it cannot make joint predictions for the multiple experimental conditions, which is one of its limitations.

A strength of the current work is the use of frequency-tagging of both pupil and EEG responses to measure responses for flicker stimuli at two anatomical levels of processing. Flicker responses are interesting but have been relatively neglected. The tagging approach allows one to access responses driven by each eye, even when the other eye is stimulated which is a great strength. The tagging approach can be applied at both levels of processing at the same time when stimulus frequencies are low, which is an advantage as they can be directly compared. The authors demonstrate the versatility of frequency tagging in a novel experimental design which may inspire other uses, both within the present context and others. A disadvantage of the tagging approach for studying sub-cortical dynamics via pupil responses is that it is restricted to low temporal frequencies given the temporal bandwidth of the pupil. The inclusion of a behavioral measure and a model is also a strength, but there are some limitations in the modeling (see below).

The authors suggest in the discussion that luminance flicker may preferentially drive cortical mechanisms that are largely monocular and in the results that they are approximately linear in the dichoptic cross condition (no effect of the fixed contrast stimulus in the other eye). By contrast, prior research using dichoptic dual frequency flickering stimuli has found robust intermodulation (IM) components in the VEP response spectrum (Baitch and Levi, 1988, Vision Res; Stevens et al., 1994 J Ped Ophthal Strab; France and Ver Hoeve, 1994, J Ped Ophthal Strab; Suter et al., 1996 Vis Neurosci). The presence of IM is a direct signature of binocular interaction and suggests that at least under some measurement conditions, binocular luminance combination is "essentially" non-linear, where essential implies a point-like non-linearity such as squaring of excitatory inputs. The two views are in striking contrast. It would thus be useful if the authors could show spectra for the dichoptic, two-frequency conditions to see if non-linear binocular IM components are present.

This is an excellent point, and one that we had not previously appreciated the importance of. We have generated a figure (Fig 8) showing the IM response in the cross frequency conditions. There is a clear response at 0.4Hz in the pupillometry data (2-1.6Hz), and at 3.6Hz in the EEG data (2+1.6Hz). We therefore agree that this shows the system is essentially nonlinear, despite the binocular combination appearing approximately linear. We now say in the Discussion:

‘In the steady-state literature, one hallmark of a nonlinear system is the presence of intermodulation responses at the sums and differences of fundamental flicker frequencies (Baitch & Levi, 1988; Tsai et al., 2012). In Figure 8 we plot the amplitude spectra of conditions from Experiment 1 in which the two eyes were stimulated at different frequencies (2Hz and 1.6Hz) but at the same contrast (48%; these correspond to the binocular cross and dichoptic cross conditions in Figures 2d,e and 3d,e). Consistent with the temporal properties of pupil responses and EEG, Figure 8a reveals a strong intermodulation difference response at 0.4Hz (red dashed line), and Figure 8b reveals an intermodulation sum response at 3.6Hz (red dashed line). The presence of these intermodulation terms is predicted by nonlinear gain control models of the type considered here (Baker and Wade, 2017; Tsai et al., 2012), and indicates that the processing of monocular flicker signals is not fully linear prior to the point at which they are combined across the eyes.’

If the IM components are indeed absent, then there is a question of the generality of the conclusions, given that several previous studies have found them with dichoptic flicker. The previous studies differ from the authors' in terms of larger stimuli and in their use of higher temporal frequencies (e.g. 18/20 Hz, 17/21 Hz, 6/8 Hz). Either retinal area stimulated (periphery vs central field) or stimulus frequency (high vs low) could affect the results and thus the conclusions about the nature of dichoptic flicker processing in cortex. It would be interesting to sort this out as it may point the research in new directions.

This is a great suggestion about retinal area. As chance would have it, we had already collected a replication data set where we stimulated the periphery, and we now include a summary of this data set as an Appendix. In general the results are similar, though we obtain a measurable (though still small) second harmonic response in the pupillometry data with this configuration, which is a further indication of nonlinear processing.

Whether these components are present or absent is of interest in terms of the authors' computational model of binocular combination. It appears that the present model is based on scalar magnitudes, rather than vectors as in Baker and Wade (2017), so it would be silent on this point. The final summation of the separate eye inputs is linear in the model. In the first stage of the model, each eye's input is divided by a weighted input from the other eye. If we take this input as inhibitory, then IM would not emerge from this stage either.

We have performed the modelling using scalar values here for simplicity and transparency, and to make the fitting process computationally feasible (it took several days even done this way). This type of model is quite capable of processing sine waves as inputs, and producing a complex output waveform which is Fourier transformed and then analysed in the same way as the experimental data (see e.g. Tsai, Wade & Norcia, 2012, J Neurosci; Baker & Wade, 2017, Cereb Cortex). However our primary aim here was to fit the model, and make inferences about the parameter values, rather than to use a specific set of parameter values to make predictions. We now say more about this family of models and how they can be applied in the methods section:

“Models from this family can handle both scalar contrast values and continuous waveforms (Tsai et al., 2012) or images (Meese and Summers, 2007) as inputs. For time-varying inputs, the calculations are performed at each time point, and the output waveform can then be analysed using Fourier analysis in the same way as for empirical data. This means that the model can make predictions for the entire Fourier spectrum, including harmonic and intermodulation responses that arise as a consequence of nonlinearities in the model (Baker and Wade, 2017). However for computational tractability, we performed fitting here using scalar contrast values.”
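The waveform-based use of such a model can be sketched as follows. This is a minimal illustration and not the fitted implementation: the simplified gain control equation (no exponents, output scaling or noise term), the input scaling, the sampling rate, and the names `flicker`, `gain_control` and `amplitude_at` are all assumptions made for the example. Only the 2Hz and 1.6Hz frequencies and the 10-second duration come from the experiments.

```python
import cmath
import math

SAMPLE_RATE_HZ = 100   # assumed sampling rate
DURATION_S = 10        # 10-second trial, giving 0.1 Hz frequency resolution


def flicker(freq_hz, contrast):
    """Sinusoidal flicker scaled to the range 0..contrast (assumed input format)."""
    n = SAMPLE_RATE_HZ * DURATION_S
    return [contrast * 0.5 * (1 + math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE_HZ))
            for i in range(n)]


def gain_control(left, right, w, z=0.1):
    """Apply interocular suppression (weight w) at each time point, then sum the eyes."""
    return [l / (z + l + w * r) + r / (z + r + w * l) for l, r in zip(left, right)]


def amplitude_at(signal, freq_hz):
    """Amplitude of a single Fourier component, computed with a direct DFT sum."""
    n = len(signal)
    total = sum(x * cmath.exp(-2j * math.pi * freq_hz * i / SAMPLE_RATE_HZ)
                for i, x in enumerate(signal))
    return 2 * abs(total) / n


left = flicker(2.0, 1.0)    # one eye flickers at 2 Hz
right = flicker(1.6, 1.0)   # the other eye flickers at 1.6 Hz

with_suppression = gain_control(left, right, w=1.0)
without_suppression = gain_control(left, right, w=0.0)

for label, response in (("w = 1", with_suppression), ("w = 0", without_suppression)):
    print(label,
          "difference (0.4 Hz):", amplitude_at(response, 0.4),
          "sum (3.6 Hz):", amplitude_at(response, 3.6))
```

In this sketch the intermodulation terms appear only when the suppressive weight is non-zero, because that is the only place where the two eyes' signals interact nonlinearly; with w = 0 each eye contributes only harmonics of its own frequency.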

As a side point, there are quite a lot of ways to produce intermodulation terms, meaning they are not as diagnostic as one might suppose. We demonstrate this in Author response image 1, which shows the Fourier spectra produced by a toy model that multiplies its two inputs together (for an interactive python notebook that allows various nonlinearities to be explored, see here). Intermodulation terms also arise when two inputs of different frequencies are summed, followed by exponentiation. So it would be possible to have an entirely linear binocular summation process, followed by squaring, and have this generate IM terms (not that we think this is necessarily what is happening in our experiments).

Author response image 1
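A toy demonstration of this point can be sketched in a few lines. It is not the notebook referred to above; the sampling rate, duration and the helper name `amplitude_at` are assumptions, and the inputs are plain unit-amplitude sine waves at the two flicker frequencies.

```python
import cmath
import math

SAMPLE_RATE_HZ = 100   # assumed sampling rate
DURATION_S = 10        # assumed duration, giving 0.1 Hz frequency resolution
N = SAMPLE_RATE_HZ * DURATION_S
F1, F2 = 2.0, 1.6      # the two flicker frequencies used in Experiment 1

a = [math.sin(2 * math.pi * F1 * i / SAMPLE_RATE_HZ) for i in range(N)]
b = [math.sin(2 * math.pi * F2 * i / SAMPLE_RATE_HZ) for i in range(N)]

linear_sum = [x + y for x, y in zip(a, b)]          # purely linear combination
product = [x * y for x, y in zip(a, b)]             # toy multiplicative model
squared_sum = [(x + y) ** 2 for x, y in zip(a, b)]  # linear sum, then squaring


def amplitude_at(signal, freq_hz):
    """Amplitude of a single Fourier component, computed with a direct DFT sum."""
    total = sum(x * cmath.exp(-2j * math.pi * freq_hz * i / SAMPLE_RATE_HZ)
                for i, x in enumerate(signal))
    return 2 * abs(total) / len(signal)


# Report amplitudes at the difference, the two fundamentals, and the sum.
for name, signal in (("linear sum", linear_sum),
                     ("product", product),
                     ("squared sum", squared_sum)):
    amps = [round(amplitude_at(signal, f), 3) for f in (F1 - F2, F2, F1, F1 + F2)]
    print(name, amps)
```

Since sin(a)sin(b) = ½[cos(a−b) − cos(a+b)], both the product and the squared sum contain components at 0.4Hz and 3.6Hz, whereas the linear sum contains only the two fundamentals.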

Related to the model: One of the more striking results is the substantial difference between the dichoptic and dichoptic-cross conditions. They differ in that the latter has two different frequencies in the two eyes while the former has the same frequency in each eye. As it stands, if fit jointly on the two conditions, the model would make the same prediction for the dichoptic and dichoptic-cross conditions. It would also make the same prediction whether the two eyes were in-phase temporally or in anti-phase temporally. There is no frequency/phase-dependence in the model to explain differences in these cases or to potentially explain different patterns at the different VEP response harmonics. The model also fits independently to each data set which weakens its generality. An interpretation outside of the model framework would thus be helpful for the specific case of differences between the dichoptic and dichoptic-cross conditions.

As mentioned above, the limitations the reviewer highlights are features of the specific implementation, rather than the model architecture in general. Furthermore, although this particular implementation of the model does not have separate channels for different phases, these can be added (see e.g. Georgeson et al., 2016, Vis Res, for an example in the spatial domain). In future work we intend to explore the phase relationship of flicker, but do not have space to do this here.

Prior work has defined several regimes of binocular summation in the VEP (Apkarian et al.,1981 EEG Journal). It would be useful for the authors to relate the use of their terms "facilitation" and "suppression" to these regimes and to justify/clarify differences in usage, when present. Experiment 1, Fig. 3 shows cases where the binocular response is more than twice the monocular response. Here the interpretation is clear: the responses are super-additive and would be classed as involving facilitation in the Apkarian et al framework. In the Apkarian et al framework, a ratio of 2 indicates independence/linearity. Ratios between 1 and 2 indicate sub-additivity and are diagnostic of the presence of binocular interaction but are noted by them to be difficult to interpret mechanistically. This should be discussed. A ratio of <1 indicates frank suppression which is not observed here with flicker.
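The regimes described in this comment can be written down as a simple classification of the binocular/monocular ratio. This is only a sketch of the scheme as summarised above: the function name, the labels, and the tolerance used to decide that a ratio is "equal to" 2 are assumptions, not part of the Apkarian et al. framework or of our analysis.

```python
def classify_summation(ratio, tolerance=0.05):
    """Label a binocular/monocular response ratio using the regimes summarised above.

    The tolerance around 2 is an arbitrary illustrative choice.
    """
    if ratio < 1.0:
        return "suppression"          # binocular response below the monocular response
    if abs(ratio - 2.0) <= tolerance:
        return "linear summation"     # independence / linearity
    if ratio < 2.0:
        return "partial summation"    # sub-additive (includes a ratio of exactly 1)
    return "facilitation"             # super-additive


for example_ratio in (0.8, 1.5, 2.0, 2.6):
    print(example_ratio, classify_summation(example_ratio))
```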

Operationally, we use facilitation to mean an increase in response relative to a monocular baseline, and suppression to mean a decrease in response. We now state this explicitly in the Introduction. Facilitation greater than a factor of 2 indicates some form of super-additive summation. In the context of the model, we also use the term suppression to indicate divisive suppression between channels, however this feature does not always result in empirical suppression (it depends on the condition, and the inhibitory weight). We think that interpretation of results such as these is greatly aided by the use of a computational modelling framework, which is why we take this approach here. The broad applicability of the model we use in the domain of spatial contrast lends it credibility for our stimuli here.

Can the model explore the full range of binocular/monocular ratios in the Apkarian et al framework? I believe much of the data lies in the "partial summation" regime of Apkarian et al and that the model is mainly exploring this regime and is a way of quantifying varying degrees of partial summation.

Yes, in principle the model can produce the full range of behaviours. When the weight of suppression is 1, binocular and monocular responses are equal. When the weight is zero, the model produces linear summation. When the weight is greater than 1, suppression occurs. It is also possible to produce super-additive summation effects, most straightforwardly by changing the model exponents. However this was not required for our data here, and so we kept these parameters fixed. We agree that the model is a good way to unify the results across disparate experimental paradigms, and that is our main intention with Figure 7i.
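The effect of the suppressive weight can be illustrated with a stripped-down sketch of the first model stage. The equation below is a simplified form assumed for illustration (exponents, output scaling and the noise term are omitted), and the function names, the small saturation constant `z`, and the example contrast are hypothetical choices rather than fitted values.

```python
def model_response(c_left, c_right, w, z=0.01):
    """Each eye's signal is divided by its own contrast plus w times the other eye's."""
    left = c_left / (z + c_left + w * c_right)
    right = c_right / (z + c_right + w * c_left)
    return left + right   # binocular combination by summation


def binocular_to_monocular_ratio(contrast, w, z=0.01):
    """Ratio of the response to two eyes over the response to one eye."""
    binocular = model_response(contrast, contrast, w, z)
    monocular = model_response(contrast, 0.0, w, z)
    return binocular / monocular


for weight in (0.0, 0.5, 1.0, 2.0):
    print("w =", weight, "ratio =", round(binocular_to_monocular_ratio(48.0, weight), 3))
```

With a small `z`, a weight of 0 gives a ratio of 2 (linear summation), a weight of 1 gives a ratio close to 1 (binocular and monocular responses equal), and weights above 1 push the ratio below 1 (suppression).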

Reviewer #3 (Public Review):

This manuscript describes interesting experiments on how information from the two eyes is combined in cortical areas, sub-cortical areas, and perception. The experimental techniques are strong and the results are potentially quite interesting. But the manuscript is poorly written and tries to do too much in too little space. I had a lot of difficulty understanding the various experimental conditions, the complicated results, and the interpretations of those results. I think this is an interesting and useful project so I hope the authors will put in the time to revise the manuscript so that regular readers like myself can better understand what it all means.

Now for my concerns and suggestions:

The experimental conditions are novel and complicated, so readers will not readily grasp what the various conditions are and why they were chosen. For example, in one condition different flicker frequencies were presented to the two eyes (2Hz to one and 1.6Hz to the other) with the flicker amplitude fixed in the eye presented to the lower frequency and the flicker amplitude varied in the eye presented to the higher frequency. This is just one of several conditions that the reader has to understand in order to follow the experimental design. I have a few suggestions to make it easier to follow. First, create a figure showing graphically the various conditions. Second, come up with better names for the various conditions and use those names in clear labels in the data figures and in the appropriate captions. Third, combine the specific methods and results sections for each experiment so that one will have just gone through the relevant methods before moving forward into the results. The authors can keep a general methods section separate, but only for the methods that are general to the whole set of experiments.

We have created a new figure (now Fig 1) that illustrates the conditions from Experiment 1, and is referenced throughout the paper. We have kept the names constant, as they are rooted in a substantial existing literature, and it will be confusing to readers familiar with that work if we diverge from these conventions. We did consider separating out the methods section, but feel it helps the flow of the results section to keep it as a single section.

I wondered why the authors chose the temporal frequencies they did. Barrionuevo et al (2014) showed that the human pupil response is greatest at 1Hz and is nearly a log unit lower at 2Hz (i.e., the change in diameter is nearly a log unit lower; the change in area is nearly 2 log units lower). So why did the authors choose 2Hz for their primary frequency? And why did the authors choose 1.6Hz which is quite close to 2Hz for their off frequency? The rationale behind these important decisions should be made explicit.

We now explain this in the Introduction as follows:

‘We chose a primary flicker frequency of 2Hz as a compromise between the low-pass pupil response (see Barrionuevo et al., 2014; Spitschan et al., 2014), and the relatively higher-pass EEG response (Regan, 1966).’

It is a compromise frequency that is not optimal for either modality, but generates a measurable signal for both. The choice of 1.6 Hz was for similar reasons - for a 10-second trial it is four frequency bins away from the primary frequency, so can be unambiguously isolated in the spectrum.
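The bin arithmetic behind this statement is short enough to spell out. The helper name `bin_index` is an assumption for the example; the trial duration and frequencies are those given above.

```python
DURATION_S = 10                       # trial duration in seconds
RESOLUTION_HZ = 1 / DURATION_S        # spacing of Fourier bins: 0.1 Hz


def bin_index(freq_hz, duration_s=DURATION_S):
    """Index of the Fourier bin containing freq_hz for a trial of the given duration."""
    return round(freq_hz * duration_s)


primary = bin_index(2.0)      # bin of the primary flicker frequency
secondary = bin_index(1.6)    # bin of the off frequency
print("resolution (Hz):", RESOLUTION_HZ)
print("bins apart:", primary - secondary)
```

Because both frequencies complete a whole number of cycles in 10 seconds (20 and 16), each falls exactly on a Fourier bin, as do the 0.4Hz difference and 3.6Hz sum frequencies.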

By the way, I wondered if we know what happens when you present the same flicker frequencies to the two eyes but in counter-phase. The average luminance seen binocularly would always be the same, so if the pupil system is linear, there should be no pupil response to this stimulus. An experiment like this has been done by Flitcroft et al (1992) on accommodation where the two eyes are presented stimuli moving oppositely in optical distance and indeed there was no accommodative response, which strongly suggests linearity.

We have not tried this yet, but it’s on our to-do list for future work. The accommodation work is very interesting, and we now cite it in the manuscript as follows:

‘Work on the accommodative response indicates that binocular combination there is approximately linear (Flitcroft et al. 1992), and can even cancel when signals are in antiphase (we did not try this configuration here).’

Figures 1 and 2 are important figures because they show the pupil and EEG results, respectively. But it's really hard to get your head around what's being shown in the lower row of each figure. The labeling for the conditions is one problem. You have to remember how "binocular" in panel c differs from "binocular cross" in panel d. And how "monocular" in panel d is different than "monocular 1.6Hz" in panel e. Additionally, the colors of the data symbols are not very distinct so it makes it hard to determine which one is which condition. These results are interesting. But they are difficult to digest.

We hope that the new Figure 1 outlining the conditions has helped with interpretation here.

The authors make a strong claim that they have found substantial differences in binocular interaction between cortical and sub-cortical circuits. But when I look at Figures 1 and 2, which are meant to convey this conclusion, I'm struck by how similar the results are. If the authors want to continue to make their claim, they need to spend more time making the case.

Indeed, it is hard to make direct comparisons across figures - this is why Figure 4 plots the ratio of binocular to monocular conditions, and shows a clear divergence between the EEG and pupillometry results at high contrasts.

Figure 5 is thankfully easy to understand and shows a very clear result. These perceptual results deviate dramatically from the essentially winner-take-all results for spatial sinewaves shown by Legge & Rubin (1981); whom they should cite by the way. Thus, very interestingly the binocular combination of temporal variation is quite different than the binocular combination of spatial variation. Can the pupil and EEG results also be plotted in the fashion of Figure 5? You'd pick a criterion pupil (or EEG) change and use it to make such plots.

We now cite Legge & Rubin. We see what you mean about plotting the EEG and pupillometry results in the same coordinates as the matching data, but we don’t think this is especially informative as we would end up only with data points along the axes and diagonal of the plot, without the points at other angles. This is a consequence of how the experiments were conducted.

My main suggestion is that the authors need to devote more space to explaining what they've done, what they've found, and how they interpret the data. I suggest therefore that they drop the computational model altogether so that they can concentrate on the experiments. The model could be presented in a future paper.

We feel that the model is central to the understanding and interpretation of our results, and have retained it in the revised version of the paper.

Reviewer #2 (Recommendations For The Authors):

I found the terms for the stimulus conditions confusing. I think a simple schematic diagram of the conditions would help the reader.

Now added (the new Fig 1).

In reporting the binocular to monocular ratio, please clarify whether the monocular data was from one eye alone (and how that eye was chosen) or from both eyes and then averaged, or something else. It would be useful to plot the results from the dichoptic condition in this form, as well.

These were averaged across both eyes. We now say in the Methods section:

‘We confirmed in additional analyses that the monocular consensual pupil response was complete, justifying our pooling of data across the eyes.’

Also, clarify whether the term facilitation is used as above throughout (facilitation being > 2 times monocular response under binocular condition) or if a different criterion is being used. If we take facilitation to mean a ratio > 2, then facilitation depends on temporal frequency in Figure 4.

We now explain our use of these terms in the final paragraph of the Introduction:

‘Relative to the response to a monocular signal, adding a signal in the other eye can either increase the response (facilitation) or reduce it (suppression).’

The magnitude of explicit facilitation attained is interesting, but not without precedent. Ratios of binocular to mean monocular > 2, have been reported previously and values of summation depend strongly on the stimulus used (see for example Apkarian et al., EEG Journal, 1981, Nicol et al., Doc Ophthal, 2011).

We now mention this in the Discussion as follows:

‘(however we note that facilitation as substantial as ours has been reported in previous EEG work by Apkarian et al. (1981))’

In Experiment 3, the authors say that the psychophysical matching results are consistent with the approximately linear summation effects observed in the EEG data of Experiment 1. In describing Fig. 3, the claim is that the EEG is non-linear, e.g. super-additive - at least at high contrasts. Please reconcile these statements.

We think that the ‘superadditive’ effects are close enough to linear that we don’t want to make too much of a big deal about them - this could be measurement error, for example. So we use terms such as near-linear, or approximately linear, when referring to them throughout.

Reviewer #3 (Recommendations For The Authors):

Let me make some more specific comments using a page/paragraph/line format to indicate where in the text they're relevant.

1/2 (middle)/3 from end. "In addition" seems out of place here.

Removed.

1/3/4. By "intensities" do you mean "contrasts"?

Fixed.

1/3/last. "... eyes'...".

Fixed.

2/5/3. By "one binocular disc", you mean into "one perceptually fused disc".

Rewritten as: ‘to help with their perceptual fusion, giving the appearance of a single binocular disc’

3/1/1. "calibrated" seems like the wrong word here. I think you're just changing the vergence angle to enable fusion, right?

Now rewritten as: ‘Before each experiment, participants adjusted the angle of the stereoscope mirrors to achieve binocular fusion’

3/1/1. "adjusting the angles...". And didn't changing the mirror angles affect the shapes of the discs in the retinal images?

Perhaps very slightly, but this is well within the tolerance of the visual system to compensate for in the fused image, especially for such high contrast edges.

3/3/5. "fixed contrast" is confusing here because it's still a flickering stimulus if I follow the text here. Reword.

Now ‘fixed temporal contrast’

3/4/1. It would be clearer to say "pupil tracker" rather than "eye tracker" because you're not really doing eye tracking.

True, but the device is a commercial eye tracker, so this is the appropriate term regardless of what we are using it for.

3/5/6. I'm getting lost here. "varying contrast levels" applies to the dichoptic stimulus, right?

Yes, now reworded as ‘In the other interval, a target disc was displayed, flickering at different contrast levels on each trial, but with a fixed interocular contrast ratio across the block.’

3/5/7. Understanding the "ratio of flicker amplitudes" is key to understanding what's going on here. More explanation would be helpful.

Addressed in the above point.

4/3/near end. Provide some explanation about why the Fourier approach is more robust to noise.

Added ‘(which can make the phase and amplitude of a fitted sine wave unstable)’

Figure 1. In panel a, explain what the numbers on the ordinate mean. What's zero, for example? Which direction is dilation? Same question for panel b. It's interesting in panel c that the response in one eye to 2Hz increases when the other eye sees 1.6Hz. Would be good to point that out in the text.

Good idea about panel (a) - we have changed the y-axis to ‘Relative amplitude’ for clarity, and now note in the figure caption that ‘Negative values indicate constriction relative to baseline, and positive values indicate dilation.’ Panel (b) is absolute amplitude, so is unsigned. Panel (c) only contains 2Hz conditions, but there is some dichoptic suppression across the two frequencies in panels (d,e) - we now cover this in the text and include statistics.

6/2/1. Make clear in the text that Figure 1c shows contrast response functions for the pupil.

Now noted in the caption.

Figure 3. I'm lost here. I feel like I should be able to construct this figure from Figures 1 and 2, but don't know how. More explanation is needed at least in the caption.

Done. The caption now reads:

‘Ratio of binocular to monocular response for three data types. These were calculated by dividing the binocular response by the monocular response at each contrast level, using the data underlying Figures 2c, 3c and 3f. Each value is the average ratio across N=30 participants, and error bars indicate bootstrapped standard errors.’
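The calculation described in this caption can be sketched as follows. The amplitudes are synthetic values generated purely for illustration, and the function name, number of resamples and random seed are assumptions; the sketch is not the analysis code used for the figure.

```python
import random
import statistics


def bootstrap_se(values, n_resamples=1000, seed=0):
    """Standard error of the mean, estimated by resampling participants with replacement."""
    rng = random.Random(seed)
    n = len(values)
    resampled_means = []
    for _ in range(n_resamples):
        resample = [values[rng.randrange(n)] for _ in range(n)]
        resampled_means.append(statistics.mean(resample))
    return statistics.stdev(resampled_means)


# Synthetic per-participant amplitudes at one contrast level (N = 30), illustration only.
rng = random.Random(1)
monocular = [rng.uniform(0.8, 1.2) for _ in range(30)]
binocular = [m * rng.uniform(1.5, 2.5) for m in monocular]

# Divide the binocular response by the monocular response for each participant,
# then average the ratios and bootstrap the standard error of that average.
ratios = [b / m for b, m in zip(binocular, monocular)]
mean_ratio = statistics.mean(ratios)
standard_error = bootstrap_se(ratios)
print("mean ratio:", mean_ratio, "bootstrapped SE:", standard_error)
```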

9/1/1-2. I didn't find the evidence supporting this statement compelling.

We now point the reader to Figure 4 as a reminder of the evidence for this difference.

9/1/6-9. You said this. But this kind of problem can be fixed by moving the methods sections as I suggested above.

As mentioned, we feel that the results section flows better with the current structure.

Figure 4. Make clear that this is EEG data.

Now added to caption.

Figure 5 caption. Infinite exponent in what equation?

Now clarified as: ‘models involving linear combination (dotted) or a winner-take-all rule (dashed)’

Figure 6. I hope this gets dropped. No one will understand how the model predictions were derived. And those who look at the data and model predictions will surely note (as the authors do) that they are rather different from one another.

As noted above, we feel that the model is central to the paper and have retained this figure. We have also worked out how to correct the noise parameter in the model for the number of participants included in the coherent averaging, which fixes the discrepancy at low contrasts. The correspondence between the data and model is now very good, and we have plotted the data points and curves in the same panels, which makes the figure less busy.

12/1. Make clear in this paragraph that "visual cortex" is referring to EEG and perception results and that "subcortical" is referring to pupil. Explain clearly what "linear" would be and what the evidence for "non-linear" is.

Good suggestion, we have added qualifiers linking to both methods. Also tidied up the language to make it clearer that we are talking about binocular combination specifically in terms of linearity, and spelled out the evidence for each point.

12/2/6-9. Explain the Quaia et al results enough for the reader to know what reflexive eye movements were studied and how.

We now specify that these eye movements are also known as the ‘ocular following response’ and were measured using scleral search coils.

12/2/9-10. Same for Spitschan and Cajochen: more explanation.

Added:

“(melatonin is a hormone released by the pineal gland that regulates sleep; its production is suppressed by light exposure and can be measured from saliva assays)”

12/3/2-3. Intriguing statements about optimally combining noisy signals, but explain this more. It won't be obvious to most readers.

We have added some more explanation to this section.

13/1. This is an interesting paragraph where the authors have a chance to discuss what would be most advantageous to the organism. They make the standard argument for perception, but basically punt on having an argument for the pupil.

Indeed, we agree that this point is necessarily speculative, however we think it is interesting for the reader to consider.

13/2/1. "Pupil size affects the ..." is more accurate.

Fixed.

13/2/2 from end. Which "two pathways"? Be clear.

Changed to ‘the pupil and perceptual pathways’

https://doi.org/10.7554/eLife.87048.3.sa3

Article and author information

Author details

  1. Federico G Segala

    Department of Psychology, University of York, York, United Kingdom
    Contribution
    Data curation, Software, Formal analysis, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing
    For correspondence
    fgs502@york.ac.uk
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-4982-8023
  2. Aurelio Bruno

    School of Psychology and Vision Sciences, University of Leicester, Leicester, United Kingdom
    Contribution
    Conceptualization, Supervision, Funding acquisition, Project administration, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-4899-1769
  3. Joel T Martin

    Department of Psychology, University of York, York, United Kingdom
    Contribution
    Resources, Software, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-4475-3835
  4. Myat T Aung

    Department of Psychology, University of York, York, United Kingdom
    Contribution
    Resources, Software, Writing – review and editing
    Competing interests
    No competing interests declared
  5. Alex R Wade

    1. Department of Psychology, University of York, York, United Kingdom
    2. York Biomedical Research Institute, University of York, York, United Kingdom
    Contribution
    Conceptualization, Resources, Supervision, Funding acquisition, Methodology, Project administration, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-4871-2747
  6. Daniel H Baker

    1. Department of Psychology, University of York, York, United Kingdom
    2. York Biomedical Research Institute, University of York, York, United Kingdom
    Contribution
    Conceptualization, Resources, Data curation, Software, Formal analysis, Supervision, Funding acquisition, Investigation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-0161-443X

Funding

Biotechnology and Biological Sciences Research Council (BB/V007580/1)

  • Daniel H Baker
  • Alex R Wade

Wellcome Trust (10.35802/213616)

  • Aurelio Bruno

For the purpose of Open Access, the authors have applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission. The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

Supported by Biotechnology and Biological Sciences Research Council grant BB/V007580/1 awarded to DHB and ARW, and Wellcome Trust grant 213616/Z/18/Z to AB.

Ethics

All participants gave written informed consent. Our procedures were approved by the Ethics Committee of the Department of Psychology at the University of York (identification number 792).

Senior Editor

  1. Timothy E Behrens, University of Oxford, United Kingdom

Reviewing Editor

  1. Krystel R Huxlin, University of Rochester, United States

Version history

  1. Preprint posted: February 15, 2023
  2. Sent for peer review: March 7, 2023
  3. Preprint posted: May 9, 2023
  4. Preprint posted: September 4, 2023
  5. Version of Record published: September 26, 2023 (version 1)

Cite all versions

You can cite all versions using the DOI https://doi.org/10.7554/eLife.87048. This DOI represents all versions, and will always resolve to the latest one.

Copyright

© 2023, Segala et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.


Cite this article

Federico G Segala, Aurelio Bruno, Joel T Martin, Myat T Aung, Alex R Wade, Daniel H Baker (2023)
Different rules for binocular combination of luminance flicker in cortical and subcortical pathways
eLife 12:RP87048.
https://doi.org/10.7554/eLife.87048.3
