Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Summary:
In this study, Nandy and colleagues examine neural and behavioral correlates of perceptual variability in monkeys performing a visual change detection task. They used a laminar probe to record from area V4 while two macaque monkeys detected a small change in stimulus orientation that occurred at a random time in one of two locations, focusing their analysis on stimulus conditions where the animal was equally likely to detect (hit) or not-detect (miss) a briefly presented orientation change (target). They discovered two behavioral measures that are significantly different between hit and miss trials - pupil size tends to be slightly larger on hits vs. misses, and monkeys are more likely to miss the target on trials in which they made a microsaccade shortly before target onset. They also examined multiple measures of neural activity across the cortical layers and found some measures that are significantly different between hits and misses.
Strengths:
Overall the study is well executed and the analyses are appropriate (though multiple issues do need to be addressed).
We thank the reviewer for their enthusiasm and their constructive comments, which we address below.
Weaknesses:
My main concern with this study is that with the exception of the pre-target microsaccades, the physiological and behavioral correlates of perceptual variability (differences between hits and misses) appear to be very weak and disconnected. Some of these measures rely on complex analyses that are not hypothesis-driven and where statistical significance is difficult to assess. The more intuitive analysis of the predictive power of trial outcomes based on the behavioral and neural measures is only discussed at the end of the paper. This analysis shows that some of the significant measures have no predictive power, while others cannot be examined using the predictive power analysis because these measures cannot be estimated in single trials. Given these weak and disconnected effects, my overall sense is that the current results do not significantly advance our understanding of the neural basis of perceptual variability.
Reviewer #1 (Recommendations For The Authors):
(1) Most of the effects are very small. For example, the difference in pupil size between hits and misses is ~0.08 z-score units. The differences in firing rates between hits and misses are in the order of 1-2% of normalized firing rates. While these effects may be significant, their contribution to perceptual variability could be negligible, as suggested by the analysis of predictive power at the end of the result section. On a related note, it would be useful to mention the analysis of predictive power earlier in the paper. The finding that some of the measures do not have significant predictive power w/r to behavioral outcome raises questions regarding their importance. Finally, it would strengthen the paper if the authors could come up with methods to assess the predictive power of the PPC and interlaminar SSC. Without such analyses, it is difficult to assess the importance of these measures.
We expect that relatively small differences in early to intermediate sensory areas could cumulatively result in large differences in higher areas and contribute to the binary distinction between hits and misses. We certainly do not claim that these results completely explain state-dependent differences that determine the outcome of these trials. Instead, we have focused on neural signatures at the level of the V4 columnar microcircuit that might ultimately contribute to the variability in perception.
We would like to emphasize that, based on the reviewer’s recommendation, we have now analyzed our results separately for each animal (see below). The consistency and significance of our findings across both animals give us confidence that what we have reported here are important neural signatures underlying perceptual variability at threshold.
We would also like to note that SSC and PPC are now part of the standard toolkit of systems neuroscience and, to our knowledge, have been employed in numerous studies. While all measures come with their own caveats and limitations, these two measures provide a frequency-resolved metric of the relationship between two temporal processes (point or continuous), which we believe provides insight into the interlaminar flow of information that we report here.
Unfortunately, limitations of the GLM method and the reliability of these analyses with limited data make it impossible to include these two measures in the GLM. The GLM requires all variables to be defined for each trial in the input. SSC and PPC can be undefined at low firing rates and require a substantial amount of data to be reliably calculated. While we did consider imputing data or estimating SSC and PPC using multiple trials, we ultimately did not pursue this idea, as the purpose of the GLM was to use simultaneous measurements from single trials.
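To illustrate why PPC cannot be defined on trials with very few spikes, the following is a minimal sketch. It assumes the standard pairwise definition of PPC as the mean cosine of the phase difference over all pairs of spikes. The function name and the example phases are hypothetical and are not taken from our analysis code.

```python
import cmath
import math


def pairwise_phase_consistency(phases):
    """PPC over spike phases (radians); returns None with fewer than two spikes."""
    n = len(phases)
    if n < 2:
        return None  # undefined: there are no spike pairs to compare
    resultant = sum(cmath.exp(1j * theta) for theta in phases)
    # Mean of cos(theta_j - theta_k) over all ordered pairs j != k, using
    # |sum_j exp(i*theta_j)|^2 = n + sum_{j != k} cos(theta_j - theta_k).
    return (abs(resultant) ** 2 - n) / (n * (n - 1))


# Hypothetical examples: perfectly locked spikes, opposite phases, a single spike.
locked = pairwise_phase_consistency([0.3] * 5)
opposed = pairwise_phase_consistency([0.0, math.pi])
single = pairwise_phase_consistency([1.2])
```

With only one spike in the analysis window the function returns `None`, which is the situation that prevents a single-trial PPC value from being supplied to the GLM.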
(2) What is the actual predictive power of the GLM model (i.e., what is the accuracy of predicting whether a given held-out trial will lead to a hit or a miss)? How much of this predictive power is accounted for by the effect of microsaccades?
As the GLM is not a decoder, it does not classify whether a given left-out trial will be a hit or a miss. However, the GLM was highly predictive compared to a constant model. This information has been added to Table 3. The deviance of the GLM with and without microsaccades as a variable was not significantly different (p > 0.9).
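As a rough illustration of what comparing a model to a constant model by deviance involves, the sketch below computes the binomial deviance of a set of predicted hit probabilities and of a constant model that predicts the overall hit rate on every trial. The outcomes and probabilities are hypothetical, and the function names are assumptions for illustration; this is not the fitting procedure used in the manuscript.

```python
import math


def binomial_deviance(outcomes, probs):
    """-2 * log-likelihood of binary outcomes (1 = hit, 0 = miss)."""
    eps = 1e-12  # keep probabilities away from 0 and 1 so log() is finite
    total = 0.0
    for y, p in zip(outcomes, probs):
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -2.0 * total


def deviance_vs_constant(outcomes, probs):
    """Return (constant-model deviance, model deviance, improvement)."""
    base_rate = sum(outcomes) / len(outcomes)
    null_dev = binomial_deviance(outcomes, [base_rate] * len(outcomes))
    model_dev = binomial_deviance(outcomes, probs)
    return null_dev, model_dev, null_dev - model_dev


# Hypothetical trial outcomes and model-predicted hit probabilities.
outcomes = [1, 0, 1, 0]
probs = [0.8, 0.3, 0.7, 0.2]
null_dev, model_dev, improvement = deviance_vs_constant(outcomes, probs)
```

A larger improvement means the model explains the outcomes better than the constant model. The same comparison between two nested models, for example with and without the microsaccade term, gives the deviance difference that is then tested for significance.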
(3) The role of stimulus contrast is not explained clearly. Are all the analyses and figures restricted to a single contrast level? Was the contrast the same on both sides? If multiple contrasts are used, could contrast account for some of the observed neural-behavioral covariations?
All of the analyses include stimuli of all tested contrast levels. Stimulus contrasts were the same at both locations (attended and unattended). We have added a more detailed description of the contrast in hit and miss trials (Lines 289-296) and reproduce it here:
“Non-target stimulus contrasts were slightly different between hits and misses (mean: 33.1% in hits, 34.0% in misses, permutation test, p = 0.02), but the contrast of the target was higher in hits compared to misses (mean: 38.7% in hits, 27.7% in misses, permutation test, p = 1.6e-31). Firing rates were normalized by contrast in Figure 3. In all other figures, we considered only non-target stimuli, which had very minor differences in contrast (<1%) across hits and misses. While we cannot completely rule out any other effects of stimulus contrast, the normalization in Figure 3 and minor differences for non-target stimuli should minimize them.”
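For readers unfamiliar with the permutation tests quoted above, a minimal sketch of a two-sided permutation test on the difference of group means is given here. The test statistic, the number of permutations, the seed, and the example values are all assumptions for illustration; the manuscript does not depend on these particular choices.

```python
import random


def permutation_test_mean_diff(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the difference in group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign the hit/miss labels
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (extreme + 1) / (n_perm + 1)


# Hypothetical values: two clearly separated groups and two identical groups.
p_separated = permutation_test_mean_diff(
    [float(v) for v in range(11, 21)], [float(v) for v in range(1, 11)]
)
p_identical = permutation_test_mean_diff([1.0, 1.0, 1.0], [1.0, 1.0, 1.0])
```

The smallest p-value this sketch can return is 1 / (n_perm + 1), so a far larger number of permutations, or a different approximation, would be needed to resolve very small values.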
(4) Do the animals make false alarms (i.e., report seeing a target in non-target epochs)? If not, then it is not clear that the animals are performing near their perceptual threshold. If the false-alarm rate is non-zero, it should be reported and analyzed for neural/behavioral correlates. Does the logistic regression fit allow for a false alarm rate? More generally, it would be useful to see a summary of behavioral performance, such as distribution of thresholds, lower and upper asymptotes, and detection rates on foil trials vs. matched target trials.
The logistic regression does allow for a false alarm rate. We have reported additional behavioral parameters in Figure 1-figure supplement 3A-G.
(5) As far as I can tell, all the analyses in the paper are done on data combined across the two animals. Given that these effects are weak and that the analyses are complex, it is important to demonstrate for each analysis/figure that the results hold for each animal separately before combining the data across animals. This can be done in supplementary figures.
We have updated the paper to include all main results plotted separately for each animal as supplementary figures.
- Figure 2-figure supplement 2
- Figure 3-figure supplement 1
- Figure 3-figure supplement 2
- Figure 4-figure supplement 1
- Figure 5-figure supplement 2
- Figure 7-figure supplement 1
All the results except for the canonical correlation analysis were present, consistent, and significant when we analyzed them in each monkey independently.
(6) The selection of the temporal interval used for the various analyses appears somewhat post hoc and is not explained clearly. Some analyses are restricted to the period immediately before or during target onset (e.g., 400 ms before target onset for analysis of the effect of microsaccade, 60 ms before stimulus onset for the analysis of the effect of neural variability). Other analyses are done on non-target rather than target stimuli. What is the justification for selecting these particular periods for these analyses? The differences in firing rates between hits and misses are restricted to the target epoch and are not present in the non-target epochs. Given these results, it seems important to compare the effects in target and non-target epochs in other analyses as well.
Restricting the analysis of the Fano Factor to 60 ms before non-target onset seems odd. Given that the duration of the interval between stimulus presentations is random, how could this pre-stimulus effect be time-locked to target onset?
We selected a 200 ms time window during the pre-stimulus or stimulus-evoked period for almost all our analyses. The results relating to microsaccade occurrence were robust to narrower time windows more consistent with the other pre-stimulus windows we used, but we chose to use the 400 ms window to capture a larger fraction of trials with microsaccades.
Only the Fano factor time window was selected post hoc based on the traces in Figure 4A, and the result is robust across animals (new Figure 4-figure supplement 1). The inter-stimulus intervals are random, and we do not believe the neural variability is time-locked to upcoming stimuli, but rather that lower variability in this pre-stimulus window is characteristic of hits.
We believe that the consistency of our results across both animals provides further evidence that our time window selection was appropriate.
We are interested in the extent to which these effects would remain consistent when applied only to target stimuli. However, restricting our analyses to only target stimuli substantially reduces the amount of neural data available for analysis. We plan to explore target stimulus representation more thoroughly in future studies.
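For reference, the Fano factor discussed above is the variance of the spike count across trials divided by its mean, computed within a fixed window. The sketch below counts spikes in the 60 ms before non-target onset and computes this ratio. The spike times are hypothetical, the use of the sample variance is an assumption, and the function names are for illustration only.

```python
import statistics


def count_spikes(spike_times, start, stop):
    """Number of spikes with start <= t < stop (times in seconds)."""
    return sum(1 for t in spike_times if start <= t < stop)


def fano_factor(spike_counts):
    """Sample variance of spike counts divided by their mean; None if mean is 0."""
    mean_count = statistics.mean(spike_counts)
    if mean_count == 0:
        return None  # undefined when no spikes occur in any trial
    return statistics.variance(spike_counts) / mean_count


# Hypothetical spike times per trial, relative to non-target onset at t = 0.
trials = [
    [-0.05, -0.01, 0.02],
    [-0.03],
    [-0.055, -0.04, -0.02],
    [0.01],
]
counts = [count_spikes(trial, -0.060, 0.0) for trial in trials]
fano = fano_factor(counts)
```

Computing this quantity separately for hit and miss trials in the same window gives the comparison described in the response.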
(7) Can the measured neural response be used to discriminate between target and non-target stimuli? If so, is the discriminability between target and non-target higher in hits vs. misses?
Thank you for raising this interesting point. We performed this analysis and found that target stimuli are more discriminable from non-targets in hits compared to misses. This has been added as a new Figure 3A.
(8) How many trials were performed per session? Did miss probability tend to increase over time over the session? If so, could this slow change in hit probability account for some of the observed neural and behavioral correlations with perceptual decisions?
Monkeys initiated a median of 905 trials (range of 651 to 1086). This has been added to the manuscript (Line 106). Approximately 1/8 of those trials were at perceptual threshold. Hit probability at threshold does not change substantially over the course of the session. We now report this in new Figure 1-figure supplement 3I (error bars show standard deviation).
(9) Did miss probability depend on the time of the change within the trial? If so, do any of the behavioral/neural metrics share a similar within-trial time course?
Change times were not significantly different across hit and miss trials (p=0.15, Wilcoxon rank sum test). We now report this in new Figure 1-figure supplement 3H.
(10) "Deep layer neurons exhibit reduced low-frequency phase-locking in hit trials than in misses (Figure 5B), suggesting an improvement in pooled signal-to-noise among this neural population." - why does this metric suggest improved SNR? Is there any evidence for improved SNR in the data? Why just in deep layers?
Thank you for raising this question. We agree this statement is not fully supported by the data and have removed it.
(11) I may have missed this but what were the sizes of the Gabor stimuli?
This has been added to the methods section (Line 454). The Gaussian half-width was 2 degrees.
Reviewer #2 (Public Review):
In this manuscript, the authors conducted a study in which they measured eye movements, pupil diameter, and neural activity in V4 in monkeys engaged in a visual attention task. The task required the monkeys to report changes in the orientation of Gabors' visual stimuli. The authors manipulated the difficulty of the trials by varying the degree of orientation change and focused their analysis on trials of intermediate difficulty where the monkeys' hit rate was approximately 50%. Their key findings include the following: 1) Hit trials were preceded by larger pupil diameter, reflecting higher arousal, and by more stable eye positions; 2) V4 neurons exhibit larger visual responses in hit trials; 3) Superficial and deep layers exhibited greater coherence in hit trials during both the pre-target stimulus period and the non-target stimulus presentation period. These findings have useful implications for the field, and the experiments and analyses presented in this manuscript validly support the authors' claims.
Strengths:
The experiments were well-designed and executed with meticulous control. The analyses of both behavioural and electrophysiological data align with the standards in the field.
We thank the reviewer for their enthusiasm about our study and their constructive comments, which we address below.
Weaknesses:
Many of the findings appear to be incremental compared to previous literature, including the authors' own work. While incremental findings are not necessarily a problem, the manuscript lacks clear statements about the extent to which the dataset, analysis, and findings overlap with the authors' prior research. For example, one of the main findings, which suggests that V4 neurons exhibit larger visual responses in hit trials (as shown in Fig. 3), appears to have been previously reported in their 2017 paper. Additionally, it seems that the entire Fig1-S1 may have been reused from the 2017 paper. These overlaps should have been explicitly acknowledged and correctly referenced.
While the raw data used in this paper overlaps entirely with Nandy et al. (2017), all the analyses and findings in this manuscript are new and have not been previously reported. Figure 1-figure supplement 1 is modified and reproduced from that paper only to allow readers to understand the recording methods used to collect the data without needing to go back to the previous paper. We have added an explicit acknowledgment of this to the figure caption.
Previous studies have demonstrated that attention leads to decorrelation in V4 population activity. The authors should have discussed how and why the high coherence across layers observed in the current study can coexist with this decorrelation.
We have updated the discussion section (Lines 347-351) to further elaborate on this interpretation.
Furthermore, the manuscript does not explore potentially interesting aspects of the dataset. For instance, the authors could have investigated instances where monkeys made 'false' reports, such as executing saccades towards visual stimuli when no orientation change occurred. It would be valuable to provide the fraction of the monkeys' responses in a session, including false reports and correct rejections in catch trials, to allow for a broader analysis that considers the perceptual component of neural activity over pure sensory responses.
We appreciate this feedback. While we agree these are interesting directions, we decided to limit the scope of this study to trials at threshold with an orientation change, and we are considering these directions for future studies.
Reviewer #2 (Recommendations For The Authors):
• Figure Design: Since eLife does not impose space limitations, it is advisable for the authors to avoid using very small font sizes. Consistency in font size throughout the figures is recommended. Some figures are challenging to discern, for example, the mean ± SEM in Fig. 2B, and the alpha values of green and purple colours for superficial/deep layers are too high, making them too transparent or pale.
We have increased the size of some small fonts and improved font size consistency throughout the figures. We have changed the layer colors to improve legibility.
• Line 119: trail,
This has been fixed.