Neuroscience

Clustered functional domains for curves and corners in cortical area V4

  1. Rundong Jiang
  2. Ian Max Andolina
  3. Ming Li
  4. Shiming Tang (corresponding author)
  1. Peking University School of Life Sciences, China
  2. Peking-Tsinghua Center for Life Sciences, China
  3. IDG/McGovern Institute for Brain Research at Peking University, China
  4. Key Laboratory of Machine Perception (Ministry of Education), Peking University, China
  5. The Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Key Laboratory of Primate Neurobiology, Institute of Neuroscience, Chinese Academy of Sciences, China
  6. Beijing Normal University Faculty of Psychology, China
Research Article
Cite this article as: eLife 2021;10:e63798 doi: 10.7554/eLife.63798

Abstract

The ventral visual pathway is crucially involved in integrating low-level visual features into complex representations for objects and scenes. At an intermediate stage of the ventral visual pathway, V4 plays a crucial role in supporting this transformation. Many V4 neurons are selective for shape segments like curves and corners; however, it remains unclear whether these neurons are organized into clustered functional domains, a structural motif common across other visual cortices. Using two-photon calcium imaging in awake macaques, we confirmed and localized cortical domains selective for curves or corners in V4. Single-cell resolution imaging confirmed that curve- or corner-selective neurons were spatially clustered into such domains. When tested with hexagonal-segment stimuli, we found that stimulus smoothness is the cardinal difference between curve and corner selectivity in V4. Combining cortical population responses with single-neuron analysis, our results reveal that curves and corners are encoded by neurons clustered into functional domains in V4. This functionally specific population architecture bridges the gap between the early and late cortices of the ventral pathway and may serve to facilitate complex object recognition.

Introduction

The visual system faces the daunting task of combining highly ambiguous local patterns of contrast into robust, coherent, and spatially extensive complex object representations (Connor et al., 2007; Haxby et al., 1991; Mishkin et al., 1983). Such information is predominantly processed along the ventral visual pathway (areas V1, V2, V4, and inferotemporal cortex [IT]). At early stages of this cortical pathway, neurons are tuned to a local single orientation (Hubel and Livingstone, 1987; Hubel and Wiesel, 1968) or a combination of orientations (Anzai et al., 2007; Ito and Komatsu, 2004). Orientation responses are functionally organized into iso-orientation domains that form pinwheel structures in V1 (Ts'o et al., 1990). At later stages like IT, neurons are selective for complex objects, predominantly organized categorically (Desimone et al., 1984; Freiwald and Tsao, 2010; Fujita et al., 1992; Kobatake and Tanaka, 1994; Tsao et al., 2003; Tsao et al., 2006). Such complex object organization is embodied using combinations of structurally separated feature columns (Fujita et al., 1992; Rajalingham and DiCarlo, 2019; Tanaka, 2003; Tsunoda et al., 2001; Wang et al., 1996). Positioned in-between the local orientation architecture of V1 and the global object architecture of IT lies cortical area V4, exhibiting visual selectivity that demonstrates integration of simple-towards-complex information (Pasupathy et al., 2019; Roe et al., 2012; Yue et al., 2014), and extensive anatomical connectivity across the visual hierarchy (Gattass et al., 1990; Ungerleider et al., 2008).

Functional organization within V4 has previously been visualized by intrinsic signal optical imaging (ISOI), and cortical representations of low-level features for orientation, color, and spatial frequency have been systematically demonstrated (Conway et al., 2007; Li et al., 2014; Li et al., 2013; Lu et al., 2018; Tanigawa et al., 2010). Such functional clustering suggests that the intracortical organizational motifs in V4 bear some similarity to V1. It remains unknown how more complex feature-selective neurons in V4 are spatially organized, and whether feature-like columns found in IT also exist in V4. Because intrinsic imaging is both spatially and temporally limited, it is unable to measure selective responses of single neurons. Using electrophysiology, early studies in V4 using bar and grating stimuli found that V4 neurons are tuned for orientation, size, and spatial frequency (Desimone and Schein, 1987). Subsequent studies revealed V4 selectivity for complex gratings and shapes in natural scenes (David et al., 2006; Gallant et al., 1993; Kobatake and Tanaka, 1994). In particular, Gallant and colleagues discovered V4 neurons with significant preferences for concentric, radial, and hyperbolic gratings (Gallant et al., 1993; Gallant et al., 1996). Reconstruction of the electrode penetrations showed that neurons with similar preferences were spatially clustered (Gallant et al., 1996). These results were extended by later studies confirming the systematic tuning of V4 neurons for shape segments such as curves and corners, as well as combinations of these segments, using parametric stimulus sets consisting of complex shape features (Cadieu et al., 2007; Carlson et al., 2011; Oleskiw et al., 2014; Pasupathy and Connor, 1999; Pasupathy and Connor, 2001; Pasupathy and Connor, 2002). Temporally varying heterogeneous fine-scale tuning within the spatial-temporal receptive field has also been observed (Nandy et al., 2016; Nandy et al., 2013; Yau et al., 2013).
More recently, artificial neural networks were used to generate complex stimuli that characterize the selectivity of V4 neurons (Bashivan et al., 2019). However, whether such complex feature-selective neurons are spatially organized in V4 remains poorly understood.

In this study, we aimed to confirm the presence of functional domains in V4 encoding complex features such as curves and corners. We utilized two-photon (2P) calcium imaging in awake macaque V4, which provides visualization of the spatial distribution and clustering within the cortical population alongside substantially enhanced spatial resolution for functional characterization at the single-cell level (Garg et al., 2019; Li et al., 2017; Nauhaus et al., 2012; Ohki et al., 2005; Seidemann et al., 2016; Tang et al., 2018). We scanned a large cortical area in dorsal V4 using a low-power objective lens to search for patches selectively activated by curves or corners. We subsequently imaged these patches using a high-power objective lens to record single neurons’ responses in order to examine whether spatially clustered curve or corner-selective neurons could be found. If such neural clusters were found, we further aimed to understand how different curves and corners are encoded and differentiated in greater detail.

Results

We injected AAV1-hSyn-GCaMP into dorsal V4 (V4d) of two rhesus macaques — GCaMP6f for monkey A and GCaMP5G for monkey B. An imaging window and head posts were implanted 1–2 months after viral injection (see Materials and methods). Subjects were trained to initiate and maintain fixation within a 1° circular window for 2 s: the first second contained the fixation spot alone, and then stimuli appeared for 1 s on an LCD monitor positioned 45 cm away (17 inch, 1280 × 960 pixels, 30 pixels/°). Neuronal responses were recorded using 2P calcium imaging, with differential images generated using ΔF = F – F0, where F0 is the average fluorescence 0.5–0 s before stimulus onset, and F is the average response 0.5–1.25 s after stimulus onset.
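As an illustrative sketch of the differential-image computation just described (the study's analysis code is MATLAB, per the key resources table, so this Python version is schematic; `frames`, `fps`, and `onset_idx` are hypothetical names):

```python
import numpy as np

def differential_response(frames, fps, onset_idx):
    """ΔF/F0 with F0 = mean fluorescence 0.5-0 s before stimulus onset
    and F = mean fluorescence 0.5-1.25 s after onset, as in the text."""
    f0 = frames[onset_idx - int(0.5 * fps):onset_idx].mean(axis=0)
    f = frames[onset_idx + int(0.5 * fps):onset_idx + int(1.25 * fps)].mean(axis=0)
    return (f - f0) / f0

# toy check: baseline fluorescence 1.0, doubling at stimulus onset
fps, onset_idx = 20, 20                 # 20 frames/s, onset at t = 1 s
frames = np.ones((60, 8, 8))            # (time, height, width)
frames[onset_idx:] = 2.0
dff = differential_response(frames, fps, onset_idx)
```

In the toy example, the response doubles at onset, so ΔF/F0 is 1.0 at every pixel.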

Cortical mapping of curve-biased and corner-biased patches in V4

We first identified the retinal eccentricity using drifting gratings for our sites and found they were positioned with an eccentricity of ~0.7° from the fovea in monkey A and ~0.9° in monkey B. We next used a low-power (4×) objective lens to identify and localize any cortical subregions selectively activated by curves or corners. Using a large range of contour feature stimuli including bars, curves, and corners (Figure 1A), we scanned a large area (3.4 × 3.4 mm) in V4d (Figure 1B, C, Figure 1—figure supplement 1) between the lunate sulcus (LS) and the terminal portion of the inferior occipital sulcus (IOS). We obtained global activation maps by Gaussian smoothing the ΔF/F0 maps (standard deviation σ = 10 pixels, 67 μm). We observed that orientation is organized into linear iso-orientation domains or pinwheel-like patterns (Figure 1—figure supplement 2), as previously reported using ISOI in V4 (Roe et al., 2012).
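The smoothing step can be sketched as follows; `scipy.ndimage.gaussian_filter` is assumed here as a stand-in for the actual smoothing routine, and the "activated patch" is synthetic:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)
raw = rng.normal(0.0, 1.0, size=(512, 512))   # noisy single-condition ΔF/F0 map
raw[200:260, 200:260] += 2.0                  # an embedded "activated patch"

# σ = 10 pixels, as in the text; smoothing suppresses pixel noise while
# preserving patch-scale structure
smoothed = gaussian_filter(raw, sigma=10)
```

After smoothing, the patch stands out clearly against the background, which is what makes the subsequent subtraction maps interpretable at the patch scale.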

Figure 1 with 3 supplements
Cortical mapping of curve-biased and corner-biased patches in V4 using a 4× objective lens.

(A) The stimulus set used for initial cortical mapping consisting of bars, corners, and smooth curves. (B) Vascular map. LS: lunate sulcus; IOS: inferior occipital sulcus. The black box indicates the imaging site in each subject. (C) Two-photon fluorescence images of the two monkeys. Scale bar = 400 μm. (D) Left: subtraction map showing curve-selective activation in monkey A, derived by the average response (ΔF/F0) to all curves minus the average response to all other stimuli (corners and bars). Right: subtraction map showing corner-selective activation in monkey A. (E) The equivalent of (D) for monkey B. (F) Left: significant curve patches in monkey A. For each pixel, independent t-tests were performed to compare the responses to all curves against all corners and against all bars. The Benjamini-Hochberg procedure was used to compute the pixel FDR (false discovery rate; see Materials and methods). Threshold q = 0.01. The white box indicates the imaging site selected for 16× objective single-cell mapping. Right: significant corner patches in monkey A. (G) The equivalent of (F) for monkey B.

We then examined the responses to curve and corner stimuli. Using map subtraction, we computed the curve-selective activation as the average response (ΔF/F0) to all curves minus all corners and bars, and corner-selective activation as the average response to all corners minus all curves and bars (Figure 1D, E). The subtraction maps we obtained clearly revealed several possible candidates for curve- or corner-selective patches. To statistically detect and locate the curve and corner patches, we performed pixel-level FDR tests to examine the curve or corner preference. For each pixel, we performed independent t-tests to compare the responses to all curves, all corners, and all bars, obtaining the p-value maps for curve and corner selectivity (see Materials and methods and Figure 1—figure supplement 3). We then computed the FDR (false discovery rate) using the Benjamini-Hochberg procedure (Benjamini and Hochberg, 1995), with the threshold level q = 0.01 to locate the significant patches. Cluster permutation tests were also performed to exclude patches with too few significant pixels (Nichols and Holmes, 2002). We found several patches significantly selective to curves or corners in dorsal V4 (Figure 1F, G). These curve- or corner-selective patches were considered candidates for functional domains encoding shape segments in V4.
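A minimal sketch of the pixelwise statistics on synthetic data follows. Combining the two comparisons by taking the larger p-value (pixel selective vs. both corners and bars) is an assumption of this sketch, and the cluster-permutation step is omitted:

```python
import numpy as np
from scipy import stats

def bh_reject(pvals, q=0.01):
    """Benjamini-Hochberg: boolean mask of rejected hypotheses at FDR q."""
    p = np.asarray(pvals)
    order = np.argsort(p)
    m = len(p)
    thresh = q * np.arange(1, m + 1) / m          # BH step-up thresholds
    below = p[order] <= thresh
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    mask = np.zeros(m, dtype=bool)
    mask[order[:k]] = True                        # reject the k smallest p-values
    return mask

rng = np.random.default_rng(1)
n_pix, n_trials = 1000, 20
curves = rng.normal(0, 1, (n_pix, n_trials))
curves[:50] += 2.0                                # 50 truly curve-selective pixels
corners = rng.normal(0, 1, (n_pix, n_trials))
bars = rng.normal(0, 1, (n_pix, n_trials))

p_vs_corner = stats.ttest_ind(curves, corners, axis=1).pvalue
p_vs_bar = stats.ttest_ind(curves, bars, axis=1).pvalue
p_curve = np.maximum(p_vs_corner, p_vs_bar)       # must beat both comparisons
significant = bh_reject(p_curve, q=0.01)
```

With this construction, nearly all of the 50 planted pixels survive the q = 0.01 threshold while false positives among the null pixels are held near zero.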

Single-cell mapping of curve- and corner-selective neurons reveals they are spatially clustered

To confirm that neurons within these patches were indeed curve or corner selective, we next performed single-cell resolution imaging with a high-power objective lens (16×) to record neuronal responses (ΔF/F0) as well as their spatial organization (Figure 2—figure supplement 1). The imaging sites (850 × 850 μm) in both subjects were chosen to include both curve- and corner-selective domains found by our 4× imaging (Figure 1F, G). In total, 535 visually responsive neurons (292 from monkey A and 243 from monkey B) were recorded. Each stimulus was repeated 10 times and averaged to derive neuronal responses (Figure 2—figure supplement 2). To characterize neurons' curve and corner selectivity, we calculated a curve selectivity index (CVSI) and corner selectivity index (CNSI). A positive CVSI value indicates a neuron's maximal response to curves is stronger than its maximal response to other stimuli: a CVSI = 0.33 signifies a response twice as strong, and a CVSI = 0.2 is 1.5 times as strong. The same definition applies to CNSI. Of neurons with CVSI > 0.2, 70.5% (74 out of 105) significantly (one-way ANOVA, p<0.05) preferred curves over corners and bars, as did 76.9% (120 out of 156) of neurons with CNSI > 0.2 (Figure 2—figure supplement 3A, B). We found neurons with high CVSI or CNSI were spatially clustered (Figure 2A–D), and these neurons were also selective for the orientation of the integral curves or corners (Figure 2E–H; 91.6% of the neurons are significantly tuned to the orientation of curves or corners; one-way ANOVA, p<0.05). Their overall spatial distribution was consistent with the spatial distribution of curve and corner domains revealed by 4× imaging (Figure 2A–D vs. Figure 1F, G), especially considering the possible loss of detailed spatial information during Gaussian smoothing of 4× images. This parsimoniously suggests that the observed cortical activation was evoked by responsive neuronal clusters.
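The numbers quoted above (CVSI = 0.33 for a twofold response, 0.2 for 1.5-fold) are consistent with a standard contrast index; the formula below is an inference from those numbers, not the paper's published definition:

```python
def selectivity_index(r_pref_max, r_other_max):
    """Contrast index (Rp - Ro) / (Rp + Ro), where Rp is the maximal
    response to the preferred class (curves for CVSI, corners for CNSI)
    and Ro the maximal response to all remaining stimuli."""
    return (r_pref_max - r_other_max) / (r_pref_max + r_other_max)

twice = selectivity_index(2.0, 1.0)          # twofold response ratio
one_and_half = selectivity_index(1.5, 1.0)   # 1.5-fold response ratio
```

Plugging in the stated ratios reproduces the quoted index values: a twofold ratio gives 1/3 ≈ 0.33, and a 1.5-fold ratio gives 0.2.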

Figure 2 with 3 supplements
Single-cell mapping of curve- and corner-selective neurons using a 16× objective lens.

(A) Cell map of curve selectivity index (CVSI). Responsive neurons are labeled at their spatial location and colored according to their CVSI. Neurons with high positive CVSI (high curve preference) were clustered in the upper part of the imaging area. The white line indicates the curve-biased patches derived by 4× imaging (Figure 1F). Scale bar = 100 μm. (B) Cell map of corner selectivity index (CNSI). Neurons with high positive CNSI (high corner preference) were clustered in the lower part of the imaging area. (C, D) Equivalent maps for monkey B. (E–H) Responses of four example neurons preferring curves or corners, their locations labeled in (A–D), respectively. (I) Neuronal pairwise tuning correlation (mean ± SE, averaging all neurons every 100 μm) plotted against spatial distances. The average correlation between different repeats of the same neuron is 0.71 (Figure 2—figure supplement 2). The dashed curve indicates the average when neuronal positions were shuffled. Significance levels were determined by a permutation test. (J) Absolute CVSI value differences (mean ± SE) plotted against distances. (K) Absolute CNSI value differences (mean ± SE) plotted against distances.

We next assessed this clustering quantitatively by examining how neuronal responses correlate with spatial distance. For each neuronal pair recorded from the same subject, we computed the pairwise tuning correlation and the absolute differences in CVSI and CNSI, plotted against the pairwise distances. We found that neurons close to each other (within approximately 300 μm) often had more correlated tuning (Figure 2I) and generally exhibited more similar CVSI and CNSI values (Figure 2J, K). These results indicate that curve-selective and corner-selective neurons are spatially clustered, potentially forming curve and corner domains in V4 that can be detected when imaged at a larger scale.
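The distance-binned correlation analysis can be sketched on synthetic data. Here a one-dimensional "cortex" with a smoothly drifting orientation preference stands in for the real two-dimensional imaging field; all names and parameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
pos = np.sort(rng.uniform(0, 800, n))        # cortical positions (μm), 1-D for simplicity
pref = pos / 800 * (np.pi / 2)               # preferred orientation drifts across cortex
theta = np.linspace(0, np.pi, 16, endpoint=False)
# tuning curve = cosine of doubled orientation difference, plus noise
tuning = np.cos(2 * (theta[None, :] - pref[:, None])) + rng.normal(0, 0.3, (n, 16))

corr = np.corrcoef(tuning)                   # pairwise tuning correlations
i, j = np.triu_indices(n, k=1)
dist = np.abs(pos[i] - pos[j])

# average correlation in 100 μm bins, in the spirit of Figure 2I
edges = np.arange(0, 900, 100)
binned = [corr[i, j][(dist >= lo) & (dist < lo + 100)].mean() for lo in edges[:-1]]
```

Because tuning drifts smoothly with position, nearby pairs show high correlation and distant pairs low (here negative) correlation, reproducing the qualitative distance dependence described in the text.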

Out of all 535 neurons recorded from the two animals, the majority (346 neurons, 64.7%) significantly preferred curve and corner stimuli over single bars, and only 1.5% (eight neurons) significantly preferred bars over curves and corners (Figure 3—figure supplement 1A), indicating that neurons in these areas were much more likely to encode complex shape features than simple orientation. We therefore made a combined cell map depicting curve and corner selectivity (Figure 3A), neglecting bar responses, by calculating a curve/corner index (CVCNI). As with CVSI and CNSI, a positive CVCNI value indicates that a neuron's maximum response to curves is stronger than its maximum response to corners, and vice versa. As expected, neurons with similar CVCNI values were spatially clustered. Neurons that fell into the 4×-defined curve domains generally had positive CVCNI values (Figure 3B), and those in the 4× corner domains generally had negative CVCNI values (Figure 3C). We also performed a one-way ANOVA comparing neurons' maximum curve and corner responses, and found that neurons with CVCNI > 0.2 or < −0.2 (i.e., a maximum response at least 1.5 times as strong) predominantly showed significant preferences (p<0.05) for curves or corners over the other kind (Figure 3D). The curve- or corner-selective neurons (red and blue neurons in Figure 3D) have very diverse curve or corner tuning, and could be either selective or invariant to the radius and radian of curves or the bar length and separation angle of corners (Figure 3—figure supplement 1B–D), potentially enabling the encoding of multiple shape segments. Notably, neurons that were heavily biased toward curves or corners tended to respond only weakly to single bars (Figure 3E), implying that they might detect more complex, integral shape features rather than local orientation. These results suggest that curves and corners are encoded by different neuronal clusters organized into curve and corner domains, and that these domains are distinct from those representing single orientations.

Figure 3 with 1 supplement
Combined maps of curve/corner preference.

(A) Cell map of curve/corner index (CVCNI). Positive CVCNI indicates preference for curves over corners and vice versa. Curve-selective neurons and corner-selective neurons are spatially clustered. Scale bar = 100 μm. (B) Histogram of CVCNI for neurons located within the curve-biased domains. Mean = 0.15 ± 0.03 S.E. (C) Histogram of CVCNI for neurons located within the corner-biased domains. Mean = −0.20 ± 0.02 S.E. (D) Scatterplot of maximum responses to bars (normalized to 0–1 by the maximum responses to all contour features) against CVCNI. Red dots indicate neurons showing significant preference for curves (ANOVA p<0.05, n = 10) and blue for corners. The majority of neurons (74.5%) with CVCNI < −0.2 or >0.2 were significantly selective. Neurons that highly preferred curves over corners or corners over curves did not respond strongly to single-oriented bars. (E) Neurons' maximum bar responses were negatively correlated with the absolute values of CVCNI. The red line represents the linear regression line.

Curve-preferring neurons are selective for smoothness

Curves and corners both differ from single bars in that they potentially contain multiple local orientations, yet we found them to be encoded by different neuronal clusters in V4. This suggests that V4 neurons are not merely recognizing shapes with more than one local orientation, but computing a more fundamental feature difference. To investigate what distinguishes curves from corners in V4, we tested hexagonal segments (Π-shape stimuli; Figure 4A) that closely resemble curves except for a lack of smoothness (Nandy et al., 2013). We found that neurons highly selective for smooth curves did not respond strongly to Π-shape stimuli (Figure 4A), suggesting that they were selective for smoothness rather than for multiple orientations. In the same way as CVCNI, we calculated a curve/Π-shape index (CVPII), which characterizes a neuron's preference for smooth curves over Π-shape stimuli. Neurons' CVPII values were highly correlated with their CVCNI values (R = 0.72, p<0.001, Figure 4B), meaning that neurons preferring smooth curves over corners also preferred smooth curves over Π-shape stimuli. As a result, the CVPII maps were also consistent with the CVCNI maps (Figure 4C vs. Figure 3A). K-means clustering of population responses likewise showed that smooth curves are encoded differently from rectilinear shapes, including Π-shapes and corners (Figure 4—figure supplement 1). Therefore, smoothness is central to the distinct encoding of curves and corners in the curve and corner domains of V4.
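The clustering analysis can be sketched on synthetic population responses. A small deterministic two-means routine is hand-rolled here to keep the sketch self-contained; the study presumably used a standard k-means implementation, and the data below are entirely synthetic:

```python
import numpy as np

def two_means(X, iters=20):
    """Plain two-cluster k-means with deterministic farthest-point init
    (a stand-in for a library k-means call)."""
    c0 = X[0]
    c1 = X[((X - c0) ** 2).sum(axis=1).argmax()]   # farthest point from X[0]
    centers = np.stack([c0, c1])
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)                  # assign to nearest center
        centers = np.stack([X[labels == k].mean(axis=0) for k in (0, 1)])
    return labels

rng = np.random.default_rng(3)
# 20 "smooth" and 20 "rectilinear" stimuli; each class drives a different
# half of a 100-neuron population
smooth = np.hstack([rng.normal(1.0, 0.2, (20, 50)), rng.normal(0.1, 0.2, (20, 50))])
rectilinear = np.hstack([rng.normal(0.1, 0.2, (20, 50)), rng.normal(1.0, 0.2, (20, 50))])
X = np.vstack([smooth, rectilinear])               # stimuli × neurons

labels = two_means(X)
```

When the population carries a categorical distinction of this kind, the two stimulus classes fall cleanly into separate clusters, mirroring the curve-vs.-rectilinear split reported in the figure supplement.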

Figure 4 with 1 supplement
Curve-preferring neurons are selective for smoothness.

(A) Left: responses of an example curve-preferring neuron to bars, corners, smooth curves, and Π-shape stimuli, indicated by the white circle in (C). The neuron responded strongly to smooth curves but not to Π-shapes, which closely resemble curves despite lacking smoothness. Right: an example neuron responding to rectilinear corners and Π-shapes, indicated by the white square in (C). (B) Scatterplot of curve/corner index (CVCNI) against curve/Π-shape index (CVPII), which characterizes neuronal preference for smooth curves over Π-shape stimuli. The red dashed line represents the linear regression line. The two values were highly correlated, indicating that neurons preferring curves over corners also preferred curves over Π-shape stimuli. (C) Cell map of CVPII. Scale bar = 100 μm. Neurons are clustered similarly to CVCNI (Figure 3A).

Curve and corner selectivity is related to concentric and radial grating preference

Early studies in V4 demonstrated that many V4 neurons are selective for non-Cartesian gratings (David et al., 2006; Gallant et al., 1993; Gallant et al., 1996). Because concentric gratings closely resemble curves and radial gratings resemble corners, these findings already hinted at a curve/corner preference. We therefore wondered whether these two types of gratings are also separately encoded by neurons in curve and corner domains. In addition to contour feature stimuli, we tested concentric, radial, and Cartesian gratings (Figure 5—figure supplement 1A). The resulting selectivity maps were consistent with the contour feature maps, as predicted. 48.4% of the neurons recorded in the imaging areas significantly preferred concentric or radial gratings over Cartesian gratings, while only 2.2% significantly preferred Cartesian gratings (Figure 5—figure supplement 1B). In addition, many were heavily biased toward one over the other. Similar to CVCNI, we computed a concentric/radial index (CRI) to characterize this bias. CRI and CVCNI values were correlated (R = 0.38, p<0.001; Figure 5B, Figure 5—figure supplement 2), and their cell maps were accordingly consistent (Figure 5A vs. Figure 3A), suggesting that classical polar grating selectivity is closely related to curve and corner selectivity. Meanwhile, to assess whether the observed selectivity is related to different spatial frequencies, we examined the CRI map at 1, 2, and 4 cycles/°. The CRI values of all neurons at the three spatial frequencies were highly correlated (Pearson correlation, all R > 0.5, p<0.001), and the map structures remained consistent across the three spatial frequencies (Figure 5C), implying that this selectivity is not directly related to spatial frequency.
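The consistency check across spatial frequencies can be sketched as follows: compute a per-neuron CRI at each frequency and Pearson-correlate the indices between frequencies. The responses below are synthetic, driven by a shared latent bias, and the index form mirrors the CVCNI-style contrast described above:

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(4)
n = 200
bias = rng.normal(0, 0.5, n)          # latent concentric/radial bias per neuron

def cri(conc, rad):
    """Concentric/radial contrast index, mirroring the CVCNI form."""
    return (conc - rad) / (conc + rad)

cri_by_sf = []
for sf in (1, 2, 4):                  # cycles/°; responses re-drawn per frequency
    conc = 1.0 + np.exp(bias) + rng.normal(0, 0.1, n)
    rad = 1.0 + np.exp(-bias) + rng.normal(0, 0.1, n)
    cri_by_sf.append(cri(conc, rad))

r12 = pearsonr(cri_by_sf[0], cri_by_sf[1])[0]
r13 = pearsonr(cri_by_sf[0], cri_by_sf[2])[0]
```

If the bias is a property of the neuron rather than of the spatial frequency, the per-neuron indices correlate strongly between frequencies, as reported (all R > 0.5).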

Figure 5 with 2 supplements
Concentric and radial gratings preference.

(A) Cell map of concentric/radial index (CRI). Positive CRI indicates preference for concentric over radial gratings and vice versa. Concentric grating-selective neurons and radial grating-selective neurons are spatially clustered, and the overall distribution was consistent with curve/corner selectivity (Figure 3A). Scale bar = 100 μm. (B) Scatterplot of curve/corner index (CVCNI) against CRI, which were positively correlated. The red dashed line represents the linear regression line. (C) CRI cell maps at spatial frequencies of 1, 2, and 4 cycles/°. The map structure remained consistent.

Discussion

Using 2P calcium imaging, we identified cortical patches in macaque V4d selective for curves or corners (Figure 1F, G), with individual curve- and corner-selective neurons consistently clustered in space (Figure 3A). These neurons exhibited diverse curve or corner selectivity (Figure 3—figure supplement 1B–D) and could potentially be involved in the encoding and processing of a large variety of curves and corners. These results demonstrate the existence of functionally specific curve and corner domains in V4d.

Functional organization for low-order orientation and spatial frequency representations in macaque V4 had previously been visualized using ISOI (Lu et al., 2018; Roe et al., 2012). For more complex shape features, very few studies have characterized the functional organization of V4, let alone at single-cell resolution. We report here the existence of cortical micro-domains consisting almost entirely of neurons selective for curves. This finding at the single-cell level is consistent with an fMRI study that reported curvature-biased patches in macaque V4 (Yue et al., 2014). The patches we found were smaller in size (about 300 μm) than those observed using fMRI, we suspect owing to the improved spatial resolution afforded by 2P imaging. Additionally, we found cortical domains in V4d selective for corners. Beyond this fMRI study, one reason we chose the curve/corner contrast to probe functional domains in V4 is that, in our recent study using natural images, we found smooth curves and rectilinear corners to be among the dominant features encoded by many V4 neurons (Jiang et al., 2019). Here, we directly demonstrated and visualized the combined functional organization of smooth curves and rectilinear corners in V4 at both the cortical and single-cell levels.

Two recent papers have also reported curvature domains in anesthetized macaque V4. Hu et al., 2020 used ISOI, finding functional domains that prefer curved over straight gratings. Tang et al., 2020 used both ISOI and 2P imaging, finding functional domains that prefer circles over rectilinear triangles. Together, these two imaging studies and our own clearly replicate the core importance of curvature as an organizing principle in the functional architecture of V4. Compared with ISOI, 2P imaging has the advantage of higher spatial resolution, making it possible to characterize the transition between domains more precisely than Gaussian-smoothed ISOI. We found that the transition takes place within around 300 µm, with differences remaining relatively elevated thereafter (Figure 2—figure supplement 3, Figure 5—figure supplement 2). Comparing the different stimulus sets (curves vs. corners, concentric vs. radial), the transition in monkey A's CRI map (Figure 5A) looked sharper, probably because many neurons had negative CRI values. CRI and CVCNI were correlated but not identical. Since concentric gratings contain only full 360° circles, whereas some neurons may prefer short arcs (small radian, Figure 3—figure supplement 1), such neurons may respond weakly to concentric gratings and thus tend to have negative CRI values.

A number of electrophysiology studies have reported that some neurons in V4d are selective for more complex features (Gallant et al., 1993; Hegdé and Van Essen, 2007; Kobatake and Tanaka, 1994; Pasupathy and Connor, 1999). Our results, consistent with these works, identified many curve- or corner-selective neurons. In addition, given the ability of 2P imaging to quantify the spatial relationships between neurons, we confirm that they are spatially clustered. We also observed some deviations from earlier studies. First, the percentage of complex feature-selective neurons in our study is higher than previously observed (Gallant et al., 1993; Pasupathy and Connor, 1999); in our hands, the vast majority of neurons preferred curves and corners over bars, and concentric and radial gratings over Cartesian gratings. Second, although Π-shape stimuli have sometimes been regarded as curved contours (Nandy et al., 2013), we found that V4 neurons respond to them differently. We think these two deviations arise primarily from sampling neurons within or close to curve and corner domains (which are difficult to detect with classical electrophysiology). We do not claim that curve and corner stimuli are encoded only by neurons in the curve and corner domains while other neurons are uninvolved. But we have demonstrated that neurons in the curve and corner domains are tuned to more complex, integral features rather than to local orientation, spatial frequency, or multiple orientations alone, supporting the encoding of shape segments of intermediate complexity in V4 (Bushnell and Pasupathy, 2012; El-Shamayleh and Pasupathy, 2016; Oleskiw et al., 2014; Rust and DiCarlo, 2010).

Complexity increases as visual shape information is processed along the ventral visual pathway. Neurons in V1 are tuned to low-order orientation and spatial frequency and are organized in iso-orientation domains and orientation pinwheels (Nauhaus et al., 2012; Ts'o et al., 1990). Neurons in IT are selective for complex features and objects and are organized in feature columns and face patches (Tanaka, 2003; Tsao et al., 2006; Tsunoda et al., 2001). The simple-to-complex transformation and integration take place in the intermediate stages between V1 and IT. Researchers have reported that some V2 neurons are selective for combinations of multiple local orientations, from which corner selectivity might emerge (Anzai et al., 2007; Ito and Komatsu, 2004). Our results in V4d show that intermediate shape segments like curves and corners are separately encoded by neurons in specific functional domains, and that the curve- and corner-selective neurons are tuned to the integral features rather than to local orientation or combinations of orientations. It is possible that these complex feature-selective neurons receive inputs or modulation from nearby neurons or downstream areas to form a recurrent network, which might underlie previous findings that the response profiles of V4 neurons are temporally heterogeneous (Nandy et al., 2016; Yau et al., 2013). Such evidence has also recently been accumulating for IT cortex (Kar and DiCarlo, 2021). Unfortunately, this question is difficult to address given the temporal resolution of existing calcium imaging techniques. One possible solution is to use genetically encoded voltage indicators (Xu et al., 2017; Yang and St-Pierre, 2016), which, once successfully applied in macaques, could help to reveal the simple-to-complex integration of neurons.

Given that we recorded neurons whose stimulation was not isolated to the 'optimal' spatial location in their receptive fields (i.e., the RF locations of some neurons might deviate from the population RF), the nature of the domains may also be modulated by stimulus translation variance, and future studies addressing positional variance and stimulus encoding are warranted. Our sample of V4d was also near-foveal in eccentricity. It is well established that ventral pathway connectivity to IT favors central rather than peripheral visual space (Ungerleider et al., 2008), but the relationship of visual eccentricity to these functional domains remains unknown. The existence of curve and corner domains for neuronal encoding in V4d provides significant support for the integration of shape information at intermediate stages of the visual hierarchy. These findings provide a more comprehensive understanding of the functional architecture of V4 feature selectivity.

Finally, our results may also help to explore later stages of the visual hierarchy. These data suggest that higher-order pattern domains may emerge gradually along the ventral pathway. The specificity of clustered patches/domains in the cortex has been proposed as an important organizing principle for some, though not all, domains of cortical processing (Kanwisher, 2010). A recent study has suggested that, at least for face and color processing, such functional domains are causally specific for human visual recognition (Schalk et al., 2017). The curve and corner domain responses in V4 could form the basis for more complex feature columns, object domains, and face patches in IT. This is consistent with a growing body of evidence from the ventral stream (Bao et al., 2020; Rajalingham and DiCarlo, 2019; Yue et al., 2014). Recent explorations of neuronal response fields in artificial neural networks have likewise found a prevalence of curve detectors with increasing complexity along the processing hierarchy (Cammarata et al., 2020). Studying such functional cross-areal connectivity (both bottom-up and top-down) remains a critical goal for future studies of the visual system. It would also be interesting to identify why smooth curves and rectilinear corners are separated as early as V4. One possible explanation is that smooth curves are more prevalent in animals or foods that are of particular interest to primates, while corners are often found in background structures such as stones or branches. Such differences may underlie statistical regularities in natural images of objects (Long et al., 2018; Levin et al., 2001; Yetter et al., 2020; Zachariou et al., 2018). Such comparisons will provide a basis for future investigations comparing statistical feature relationships for natural images between V4 and IT functional domains.

Materials and methods

Key resources table
Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information
Strain, strain background (Macaca mulatta) | Macaca mulatta | Beijing Prima Biotech Inc | http://www.primasbio.com/en/Home |
Recombinant DNA reagent | AAV9.Syn.GCaMP6f.WPRE.SV40 | Penn Vector Core | CS1001 |
Recombinant DNA reagent | AAV1.Syn.GCaMP5G.WPRE.SV40 | Penn Vector Core | V4102MI-R |
Software, algorithm | MATLAB R2018b | MathWorks | https://www.mathworks.com |
Software, algorithm | Code for data analysis | This paper | https://github.com/RJiang1994/macaque-v4-2P (Jiang, 2021; copy archived at swh:1:rev:57dfeac5e81b91c93ef0687f8cf04010d3f47f8c) |

All procedures involving animals were in accordance with the guidelines of the Institutional Animal Care and Use Committee (IACUC) of Peking University Laboratory Animal Center and were approved by the Peking University Animal Care and Use Committee (LSC-TangSM-5).

Animal preparation

The subjects in this study were two adult male rhesus monkeys (Macaca mulatta, 4 and 5 years of age), purchased from Beijing Prima Biotech Inc and housed at Peking University Laboratory Animal Center. Two sequential surgeries were performed on each animal under general anesthesia. In the first surgery, we performed a craniotomy over V4 and opened the dura. We injected 200 nl of AAV9.Syn.GCaMP6f.WPRE.SV40 (CS1001, titer 7.748e13 GC/ml, Penn Vector Core) or AAV1.Syn.GCaMP5G.WPRE.SV40 (V4102MI-R, titer 2.37e13 GC/ml, Penn Vector Core) at a depth of about 350 μm and a speed of 5–10 nl/s. Injection and surgical protocols followed our previous study (Li et al., 2017). After the injections, we sutured the dura, replaced the skull cap with titanium screws, and closed the scalp. The animal was then allowed to recover and received the antibiotic ceftriaxone sodium (Youcare Pharmaceutical Group Co. Ltd., China) for 1 week. Forty-five days later, we performed the second surgery to implant the imaging window and head posts. The dura was removed, and a glass coverslip was placed directly on the cortex, without an artificial dura, and glued to a titanium ring. We then glued the titanium ring to the skull using dental acrylic. The detailed design of the chamber and head posts can be found in our previous study (Li et al., 2017). Monkeys were ready for recording about 1 week after the second surgery.

Behavioral task

Monkeys were trained to maintain fixation on a small white spot (0.1°) for a juice reward while seated in a primate chair with head restraint. Eye positions were monitored by an ISCAN ETL-200 infrared eye-tracking system (ISCAN Inc, Woburn, MA) at a 120 Hz sampling rate. Trials in which the eye position deviated by 1° or more from the fixation point were terminated, and the same condition was repeated immediately. Only data from successful trials were used.

Visual stimuli

The visual stimuli were displayed on an LCD monitor 45 cm from the animal’s eyes (Acer v173Db, 17 inch, 1280 × 960 pixels, 30 pixels/°, 80 Hz refresh rate). After fixation was acquired, only the gray background (32 cd/m2) was presented for the first 1 s to obtain the fluorescence baseline, and the visual stimulus was then displayed for a further 1 s. No inter-trial interval was used. Stimuli were presented in pseudo-random order. We used square-wave drifting gratings (0.4° diameter circular patch, full contrast, 4 cycles/°, 3 cycles/s), generated and presented by the ViSaGe system (Cambridge Research Systems, Rochester, UK), to measure the retinal eccentricity, which was about 1° below and to the left of the fovea for both monkeys.

Contour feature stimuli were generated using MATLAB (The MathWorks, Natick, MA) and presented using the ViSaGe system (Cambridge Research Systems). The contour feature stimuli were two pixels wide. The lengths of the bars and corner edges were 10 and 20 pixels (30 pixels/°, i.e., 0.33° and 0.67°), and the radii of the curve stimuli were also 10 and 20 pixels. For each of the two sizes, the curve stimuli varied in arc angle (120° and 180° for 4× imaging; 60°, 90°, 120°, and 180° for 16× imaging). The corner stimuli varied in three separation angles (45°, 90°, and 135°). All contour feature stimuli were rotated to eight orientations (0°, 45°, 90°, 135°, 180°, 225°, 270°, 315° for curves and corners; 0°, 22.5°, 45°, 67.5°, 90°, 112.5°, 135°, 157.5° for bars).

The Cartesian (eight orientations, 0°, 45°, 90°, 135°, 180°, 225°, 270°, 315°), concentric, and radial grating stimuli were full contrast sinusoidal gratings (edge blurred), which were 90 pixels (3°) in diameter, with spatial frequencies (SF) of 1, 2, and 4 cycle/°. The concentric gratings were generated as

CG = sin(2π · SF · √(x² + y²))

The radial gratings were generated as

RG = sin(2π · SF · arctan(y/x))
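
As an illustrative sketch only (the stimuli were generated in MATLAB and presented via ViSaGe; this Python stand-in, with assumed grid size and function names, is not the authors’ code), the concentric and radial grating equations above can be evaluated on a pixel grid as follows:

```python
import numpy as np

def make_gratings(size_px=90, sf_cyc_per_deg=2.0, px_per_deg=30.0):
    """Return (concentric, radial) grating images with values in [-1, 1]."""
    half = size_px / 2.0
    coords = (np.arange(size_px) - half + 0.5) / px_per_deg  # degrees of visual angle
    x, y = np.meshgrid(coords, coords)
    # Concentric: CG = sin(2*pi*SF*sqrt(x^2 + y^2))
    cg = np.sin(2 * np.pi * sf_cyc_per_deg * np.sqrt(x ** 2 + y ** 2))
    # Radial: RG = sin(2*pi*SF*arctan(y/x)); arctan2 covers all four quadrants
    rg = np.sin(2 * np.pi * sf_cyc_per_deg * np.arctan2(y, x))
    return cg, rg

cg, rg = make_gratings()
```

In practice the patch would be multiplied by a circular aperture with blurred edges, as described above.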

The data for the contour feature stimuli were recorded on one day, and the data for the gratings on another day.

2P imaging

2P imaging was performed using a Prairie Ultima IV 2P laser-scanning microscope (Bruker Corporation, Billerica, MA). A 1000 nm mode-locked laser (Spectra-Physics, Santa Clara, CA) was used for excitation of the GCaMPs, and resonant galvo scanning (512 × 512 pixels, 32 frames/s) was used to record the fluorescence images (8 fps, averaging every four frames). A 4× objective (Nikon Corporation, Tokyo, Japan) was used for cortical-scale recording (3.4 × 3.4 mm, 6.7 μm/pixel), and a 16× objective (Nikon Corporation) for neural population recording at single-cell resolution (850 × 850 μm, 1.7 μm/pixel). We used a Neural Signal Processor (Cerebus system, Blackrock Microsystems, Salt Lake City, UT) to record the time stamp of each 2P imaging frame, as well as the time stamps of visual stimulus onsets, for synchronization.

Image data processing

Image data were processed in MATLAB. The 2P images were first aligned to a template image by a 2D cross-correlation algorithm (Li et al., 2017) to eliminate motion artifacts during recording sessions. For all successful trials, we found the corresponding 2P images by synchronizing the time stamps of stimulus onset recorded by the Neural Signal Processor (Cerebus system, Blackrock Microsystems). The differential fluorescence image was calculated as ΔF = F – F0, where the basal fluorescence image F0 was defined as the average image over 0–0.5 s before stimulus onset, and F as the average over 0.5–1.25 s after stimulus onset, both averaged across all repeats of each stimulus.
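
The ΔF/F0 computation just described can be sketched as follows (a Python stand-in on toy data, with an assumed frame layout; the actual analysis was done in MATLAB):

```python
import numpy as np

def delta_f_over_f0(frames, onset_frame, fps=8):
    """frames: (n_frames, H, W) fluorescence stack, already averaged over repeats.

    F0 averages 0-0.5 s before stimulus onset; F averages 0.5-1.25 s after."""
    pre = frames[onset_frame - int(0.5 * fps):onset_frame]
    post = frames[onset_frame + int(0.5 * fps):onset_frame + int(1.25 * fps)]
    f0 = pre.mean(axis=0)
    return (post.mean(axis=0) - f0) / f0

rng = np.random.default_rng(0)
stack = rng.uniform(100, 110, size=(24, 4, 4))  # toy 24-frame, 4x4-pixel movie
dff = delta_f_over_f0(stack, onset_frame=8)
```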

For 4× imaging, the ΔF/F0 maps were smoothed with a low-pass Gaussian filter (σ = 10 pixels) to obtain the activation maps. For 16× imaging, to identify responding cell bodies (ROIs), the differential image (ΔF) for each stimulus was passed through a band-pass Gaussian filter (σ = 2 and 5 pixels, respectively; used only for identifying ROIs) and then binarized using a pixel-value threshold of 3 SD. Connected components (>25 pixels) were identified as candidate active ROIs. An ROI was discarded if its maximum response (ΔF/F0) was below 0.3. The roundness of these ROIs was calculated as

C = P² / (4πS)

where P is the perimeter of the ROI and S is its area. Only ROIs with C < 1.1 were identified as cell bodies. We also tested this criterion by ANOVA, comparing the fluorescence 0–0.5 s before and 0.5–1.25 s after stimulus onset (same definition as for ΔF), pooling all trials; 533 of the 535 identified neurons had p<0.05.
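
A hedged sketch of this ROI-identification pipeline (band-pass filtering, 3 SD binarization, connected components >25 pixels, and the roundness criterion C = P²/(4πS)); the perimeter here is a crude boundary-pixel count, and the function name and toy image are assumptions, not the authors’ implementation:

```python
import numpy as np
from scipy import ndimage

def find_roi_candidates(dF, sd_thresh=3.0, min_px=25):
    """Return a list of (mask, roundness) for candidate ROIs in a dF image."""
    # Band-pass: difference of Gaussians with sigma = 2 and 5 pixels
    band = ndimage.gaussian_filter(dF, 2) - ndimage.gaussian_filter(dF, 5)
    mask = band > sd_thresh * band.std()          # binarize at 3 SD
    labels, n = ndimage.label(mask)               # connected components
    rois = []
    for i in range(1, n + 1):
        blob = labels == i
        area = blob.sum()
        if area <= min_px:                        # keep components > 25 pixels
            continue
        # Crude perimeter: blob pixels lost under one binary erosion
        boundary = blob & ~ndimage.binary_erosion(blob)
        roundness = boundary.sum() ** 2 / (4 * np.pi * area)  # C = P^2/(4*pi*S)
        rois.append((blob, roundness))
    return rois

# Toy test image: one bright disk on a flat background
yy, xx = np.mgrid[:64, :64]
dF = (((yy - 32) ** 2 + (xx - 32) ** 2) <= 36).astype(float)
rois = find_roi_candidates(dF)
```

A near-circular blob yields C close to 1; elongated or ragged components score higher and would be rejected by the C < 1.1 criterion.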

Curve and corner domains

All trials (stimulus number × repeat number) in 4× imaging were first categorized as curves (32 stimuli), corners (48 stimuli), or bars (16 stimuli). Curve patches: for each pixel, independent t-tests were performed comparing the responses to all curves against all corners and against all bars, respectively; the larger of the two p-values was chosen if the mean response to curves was stronger than to both corners and bars. FDR was computed following the Benjamini–Hochberg procedure, using the MATLAB command mafdr, in which q_i = p_i × (512 × 512)/rank(p_i). Corner patches were identified by the same procedure.
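
A sketch of this per-pixel test (a Python stand-in on toy data; array shapes, names, and trial counts are assumptions, and the paper used MATLAB’s mafdr): for each pixel, take the larger of the two t-test p-values, keep only pixels whose mean curve response is the largest, then apply the Benjamini–Hochberg correction q_i = p_i × N / rank(p_i).

```python
import numpy as np
from scipy import stats

def curve_pixel_qvals(resp_curve, resp_corner, resp_bar):
    """Inputs: (n_trials, n_pixels) responses. Returns per-pixel q-values."""
    _, p1 = stats.ttest_ind(resp_curve, resp_corner, axis=0)
    _, p2 = stats.ttest_ind(resp_curve, resp_bar, axis=0)
    p = np.maximum(p1, p2)                  # keep the larger of the two p-values
    prefers_curve = ((resp_curve.mean(0) > resp_corner.mean(0)) &
                     (resp_curve.mean(0) > resp_bar.mean(0)))
    p = np.where(prefers_curve, p, 1.0)     # only pixels preferring curves count
    # Benjamini-Hochberg: q_i = p_i * N / rank(p_i), then enforce monotonicity
    n = p.size
    order = np.argsort(p)
    q = np.empty(n)
    q[order] = p[order] * n / (np.arange(n) + 1)
    q[order] = np.minimum.accumulate(q[order][::-1])[::-1]
    return np.clip(q, 0.0, 1.0)

rng = np.random.default_rng(1)
curve = rng.normal(1.0, 0.2, (40, 100))
curve[:, :20] += 1.0                        # 20 genuinely curve-selective pixels
corner = rng.normal(1.0, 0.2, (40, 100))
bar = rng.normal(1.0, 0.2, (40, 100))
q = curve_pixel_qvals(curve, corner, bar)
```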

Cluster permutation tests were then performed to exclude patches with too few significant pixels. For each permutation, all trials (stimulus number × repeat number) were randomly relabeled as curves, corners, or bars, keeping the total trial number within each of the three groups unchanged. Independent t-tests as in Figure 1F were performed, with an uncorrected p=0.01 as threshold, and the cluster (connected component) with the maximum pixel count was recorded. 60,000 random permutations were performed, yielding 60,000 maximum cluster sizes as the null distribution. The top 5% (3000th largest) of the null distribution was used as the threshold, and patches with fewer pixels than this were regarded as insignificant and excluded.
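
The permutation logic can be sketched as follows (a simplified two-group Python stand-in on a small toy map; the paper’s version relabels three stimulus groups and runs 60,000 permutations on a 512 × 512 map):

```python
import numpy as np
from scipy import stats, ndimage

def max_cluster_size(group_a, group_b, shape, p_thresh=0.01):
    """Largest connected component of pixels with uncorrected p below threshold."""
    _, p = stats.ttest_ind(group_a, group_b, axis=0)
    labels, n = ndimage.label((p < p_thresh).reshape(shape))
    return 0 if n == 0 else np.bincount(labels.ravel())[1:].max()

def cluster_null(all_trials, n_a, shape, n_perm=200, seed=0):
    """Shuffle trial labels n_perm times; record each maximum cluster size."""
    rng = np.random.default_rng(seed)
    sizes = np.empty(n_perm, dtype=int)
    for k in range(n_perm):
        idx = rng.permutation(all_trials.shape[0])
        sizes[k] = max_cluster_size(all_trials[idx[:n_a]],
                                    all_trials[idx[n_a:]], shape)
    return sizes

rng = np.random.default_rng(2)
trials = rng.normal(0.0, 1.0, (60, 16 * 16))   # 60 trials on a toy 16x16 map
null_sizes = cluster_null(trials, n_a=30, shape=(16, 16))
threshold = np.percentile(null_sizes, 95)       # top-5% critical cluster size
```

Observed patches smaller than `threshold` pixels would then be discarded.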

Quantification and statistical analysis

Two tests were performed to determine whether a neuron was selective for the orientation of curves or corners. First, we performed an ANOVA comparing the fluorescence 0–0.5 s before and 0.5–1.25 s after stimulus onset (same definition as for ΔF) using all trials for curve and corner stimuli. We then found the neuron’s optimal curve or corner stimulus and used an ANOVA to compare its responses across the eight orientations of this optimal form. The p-value was then Bonferroni-corrected (14 comparisons: 6 corners and 8 curves). Only neurons passing both ANOVA tests (p<0.05) were deemed tuned to the orientation of curves or corners.

CVSI is used to characterize a neuron’s preference for curves over other stimuli (bars and corners), defined as

CVSI = (MaxResp_curve − MaxResp_other) / (MaxResp_curve + MaxResp_other).

where MaxResp_curve is the neuron’s maximum response to curve stimuli and MaxResp_other is its maximum response to other stimuli (bars and corners). CVSI ranges from −1 to 1; a positive value indicates that the neuron’s response to its optimal curve stimulus is greater than its response to its optimal bar or corner stimulus.

CNSI is defined as

CNSI = (MaxResp_corner − MaxResp_other) / (MaxResp_corner + MaxResp_other).

where MaxResp_corner is the neuron’s maximum response to corner stimuli and MaxResp_other is its maximum response to other stimuli (bars and curves).

CVCNI is defined as

CVCNI = (MaxResp_curve − MaxResp_corner) / (MaxResp_curve + MaxResp_corner).

We also performed a one-way ANOVA comparing each neuron’s maximum response to curve stimuli with its maximum response to corner stimuli in Figure 3D, with a threshold of p=0.05 and n = 10 repeats. The same tests were also applied to CVSI and CNSI in Figure 2—figure supplement 3.
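
The CVSI, CNSI, and CVCNI defined above share a single contrast form; a minimal illustration on toy maximum responses (variable names are assumptions for illustration only):

```python
import numpy as np

def selectivity_index(max_resp_pref, max_resp_other):
    """(pref - other) / (pref + other); ranges from -1 to 1."""
    return (max_resp_pref - max_resp_other) / (max_resp_pref + max_resp_other)

# Toy maximum responses for a hypothetical curve-preferring neuron
max_curve, max_corner, max_bar = 0.9, 0.3, 0.2
cvsi = selectivity_index(max_curve, max(max_corner, max_bar))
cnsi = selectivity_index(max_corner, max(max_curve, max_bar))
cvcni = selectivity_index(max_curve, max_corner)
```

For this toy neuron CVSI is positive (curve-preferring) and CNSI is the same magnitude but negative.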

CVPII is defined as

CVPII = (MaxResp_curve − MaxResp_Π-shape) / (MaxResp_curve + MaxResp_Π-shape).

where MaxResp_Π-shape is the neuron’s maximum response to Π-shape stimuli. The Pearson correlation between CVCNI and CVPII was calculated in Figure 4B, and the regression line was derived by minimizing the summed squared perpendicular distances, Σ[(Δx)² + (Δy)²].
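
Minimizing Σ[(Δx)² + (Δy)²] is a total-least-squares fit, whose solution is the first principal axis of the point cloud; one way to sketch it (a Python stand-in on toy data, not the authors’ code):

```python
import numpy as np

def orthogonal_fit(x, y):
    """Total-least-squares line: first principal axis of the centered points."""
    pts = np.column_stack([x, y])
    center = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - center, full_matrices=False)
    direction = vt[0]                       # axis of maximum variance
    slope = direction[1] / direction[0]
    intercept = center[1] - slope * center[0]
    return slope, intercept

rng = np.random.default_rng(3)
x = rng.uniform(-1.0, 1.0, 200)
y = 0.8 * x + 0.1 + rng.normal(0.0, 0.05, 200)   # toy index pairs, slope 0.8
slope, intercept = orthogonal_fit(x, y)
```

Unlike ordinary least squares, this treats the two indices symmetrically, which is appropriate when neither variable is the “predictor”.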

CRI is defined as

CRI = (MaxResp_concentric − MaxResp_radial) / (MaxResp_concentric + MaxResp_radial).

where MaxResp_concentric is the neuron’s maximum response to concentric gratings and MaxResp_radial is its maximum response to radial gratings. The Pearson correlation between CVCNI and CRI was calculated in Figure 5B, and the regression line was again derived by minimizing Σ[(Δx)² + (Δy)²].

Clustering analysis

We analyzed 2922 neuron pairs from monkey A and 2432 neuron pairs from monkey B in Figure 2I–K. Pairwise tuning correlation was calculated as the Pearson correlation of the two neurons’ responses to all bar, curve, and corner stimuli, and was plotted against pairwise cortical distance (averaging over all pairs in 100 μm bins).
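
A sketch of this pairwise analysis (a Python stand-in; array shapes, bin range, and names are toy assumptions):

```python
import numpy as np

def pairwise_corr_vs_distance(responses, positions, bin_um=100, max_um=500):
    """responses: (n_neurons, n_stim); positions: (n_neurons, 2) in micrometres.

    Returns lower bin edges and the mean pairwise Pearson correlation per bin."""
    n = responses.shape[0]
    corr = np.corrcoef(responses)
    n_bins = max_um // bin_um
    binned = [[] for _ in range(n_bins)]
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(positions[i] - positions[j])
            k = int(d // bin_um)
            if k < n_bins:
                binned[k].append(corr[i, j])
    edges = np.arange(n_bins) * bin_um
    return edges, [np.mean(b) if b else np.nan for b in binned]

rng = np.random.default_rng(4)
resp = rng.normal(0.0, 1.0, (30, 96))      # toy tuning vectors (96 stimuli)
pos = rng.uniform(0.0, 400.0, (30, 2))     # toy cortical positions in um
edges, mean_corr = pairwise_corr_vs_distance(resp, pos)
```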

Similarly, the differences in CVSI and CNSI were also plotted against pairwise spatial distances:

|ΔCVSI_ij| = |CVSI_i − CVSI_j|,

where CVSIi is the CVSI of neuron i and CVSIj is the CVSI of neuron j.

|ΔCNSI_ij| = |CNSI_i − CNSI_j|,

where CNSI_i is the CNSI of neuron i and CNSI_j is the CNSI of neuron j.

A permutation test was performed to evaluate the significance of each average |ΔCVSI| and |ΔCNSI|. The |ΔCVSI| or |ΔCNSI| values were randomly re-paired with distances 100,000 times to build the null distribution of bin averages. A point was considered significant if it was higher than the top 100 or lower than the bottom 100 values of the null distribution (p<0.001).
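
The shuffling step can be sketched as follows (a Python stand-in on toy data with fewer permutations; names and bin labels are assumptions):

```python
import numpy as np

def bin_mean_null(values, bin_ids, target_bin, n_perm=2000, seed=0):
    """Null distribution of a bin's mean after shuffling the value-distance pairing."""
    rng = np.random.default_rng(seed)
    in_bin = bin_ids == target_bin
    null = np.empty(n_perm)
    for k in range(n_perm):
        null[k] = rng.permutation(values)[in_bin].mean()
    return null

rng = np.random.default_rng(5)
dcvsi = np.abs(rng.normal(0.0, 0.3, 1000))     # toy |dCVSI| values
bin_ids = rng.integers(0, 5, 1000)             # toy distance-bin labels
null = bin_mean_null(dcvsi, bin_ids, target_bin=0)
observed = dcvsi[bin_ids == 0].mean()
# Significant at p < 0.001 if outside the 0.1%-99.9% bounds of the null
significant = (observed > np.quantile(null, 0.999)) or \
              (observed < np.quantile(null, 0.001))
```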

K-means analysis

We performed K-means analysis to cluster the stimulus forms and the neurons. The responses of 535 neurons to 20 forms (two bars, eight curves, six corners, and four Π-shapes, each at eight orientations) were used to construct the response matrix R as

R = | r_{1,1}    ⋯  r_{1,535}  |
    |    ⋮       ⋱     ⋮       |
    | r_{20,1}   ⋯  r_{20,535} |

where r_{i,j} is the response of neuron j to stimulus form i. Only the maximum response among the eight orientations was used.

We used population response vectors (RP, rows of matrix R) to cluster the forms. For form i,

RP_i = (r_{i,1}, r_{i,2}, …, r_{i,535}).

We used neuron response vectors (RN, columns of matrix R) to cluster the neurons. For neuron j,

RN_j = (r_{1,j}, r_{2,j}, …, r_{20,j})^T.

The number of clusters was determined using the Calinski–Harabasz criterion with squared Euclidean distance. The maximum iteration number was 10,000. Clustering was repeated 10,000 times with new initial cluster centroids, and the solution with the lowest within-cluster sum was used.
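
A sketch of this model-selection step (a Python stand-in with a hand-rolled Calinski–Harabasz score, toy data, and far fewer restarts; the paper used MATLAB):

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def calinski_harabasz(X, labels):
    """Between-cluster over within-cluster dispersion ratio."""
    clusters = np.unique(labels)
    n, k = X.shape[0], len(clusters)
    overall = X.mean(axis=0)
    bg = sum((labels == c).sum() * np.sum((X[labels == c].mean(0) - overall) ** 2)
             for c in clusters)
    wg = sum(np.sum((X[labels == c] - X[labels == c].mean(0)) ** 2)
             for c in clusters)
    return (bg / (k - 1)) / (wg / (n - k))

def cluster_rows(R, k_range=(2, 3, 4), seed=0):
    """Pick the k (and labels) with the best Calinski-Harabasz score."""
    best = None
    for k in k_range:
        _, labels = kmeans2(R, k, minit='++', seed=seed, iter=100)
        if len(np.unique(labels)) < 2:
            continue
        score = calinski_harabasz(R, labels)
        if best is None or score > best[0]:
            best = (score, labels)
    return best[1]

rng = np.random.default_rng(6)
# Two well-separated groups of toy "forms" in a 20-dimensional response space
R = np.vstack([rng.normal(0.0, 0.1, (10, 20)), rng.normal(3.0, 0.1, (10, 20))])
labels = cluster_rows(R)
```

The same routine applies to the rows (forms, via RP) or the columns (neurons, via RN) of the response matrix.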

Multi-dimensional scaling

Classical multi-dimensional scaling (MDS) was performed to visualize the clustering of stimulus forms derived by K-means. The distances (dissimilarity matrix) were computed as

D_{i,j} = 1 − corrcoef(RP_i, RP_j)

where D_{i,j} is the distance between forms i and j, corrcoef is the Pearson correlation, and RP_i is the population response vector of form i. Classical MDS was performed using a singular value decomposition (SVD) algorithm.

The normalized stress was computed as

Stress = √( Σ_{i,j} (D̂_{i,j} − D_{i,j})² / Σ_{i,j} D_{i,j}² )

where D_{i,j} is the distance in the original space and D̂_{i,j} is the corresponding distance in the new MDS space.
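
The whole pipeline, from correlation-based dissimilarities through classical MDS to the normalized stress, can be sketched as follows (a Python stand-in on toy data, using the standard double-centering formulation; the paper used MATLAB):

```python
import numpy as np

def classical_mds(D, dims=2):
    """Classical MDS via eigendecomposition of the double-centered matrix."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J                 # double centering
    w, v = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:dims]            # largest eigenvalues first
    return v[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

def normalized_stress(D, Y):
    """sqrt( sum (Dhat - D)^2 / sum D^2 ), with Dhat from the embedding Y."""
    dhat = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
    return np.sqrt(np.sum((dhat - D) ** 2) / np.sum(D ** 2))

rng = np.random.default_rng(7)
RP = rng.normal(0.0, 1.0, (8, 50))              # 8 toy forms x 50 neurons
D = 1.0 - np.corrcoef(RP)                       # dissimilarity matrix
Y = classical_mds(D, dims=2)
stress = normalized_stress(D, Y)
```

A low stress value indicates that the 2D embedding preserves the original dissimilarity structure well.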

Data availability

The data and MATLAB code used in this study can be found on GitHub (https://github.com/RJiang1994/macaque-v4-2P; copy archived at https://archive.softwareheritage.org/swh:1:rev:57dfeac5e81b91c93ef0687f8cf04010d3f47f8c).

The following data sets were generated
    Jiang R, Tang S (2020) macaque-v4-2P. GitHub. ID: RJiang1994/macaque-v4-2P.


Decision letter

  1. Martin Vinck
    Reviewing Editor; Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, Germany
  2. Tirin Moore
    Senior Editor; Stanford University, United States
  3. Timo van Kerkoerle
    Reviewer
  4. Ed Connor
    Reviewer; Johns Hopkins University, United States
  5. Jack L Gallant
    Reviewer; University of California, Berkeley, United States

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

Two-photon imaging in area V4 of awake monkeys was used to characterize the organization of tuning for distinct shape elements (curves, corners, and bars). The authors use a combination of wide field/low resolution imaging, to visualize large scale organization, with smaller field/high resolution imaging, to measure tuning and organization of individual neurons underlying the wide field results. At both scales, they establish that most V4 neurons are more responsive to curves and corners than to bars, and they establish anatomical segregation between neurons tuned for curves and neurons tuned for bars. These findings advance our understanding of the topographic organization of neuronal feature selectivity in area V4 of the macaque monkey.

Decision letter after peer review:

Thank you for submitting your article "Clustered Functional Domains for Curves and Corners in Cortical Area V4" for consideration by eLife. Your article has been reviewed by 4 peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Tirin Moore as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Timo van Kerkoerle (Reviewer #2); Ed Connor (Reviewer #3); Jack L Gallant (Reviewer #4).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

As the editors have judged that your manuscript is of interest, but as described below that additional experiments are required before it is published, we would like to draw your attention to changes in our revision policy that we have made in response to COVID-19 (https://elifesciences.org/articles/57162). First, because many researchers have temporarily lost access to the labs, we will give authors as much time as they need to submit revised manuscripts. We are also offering, if you choose, to post the manuscript to bioRxiv (if it is not already there) along with this decision letter and a formal designation that the manuscript is "in revision at eLife". Please let us know if you would like to pursue this option. (If your work is more suitable for medRxiv, you will need to post the preprint yourself, as the mechanisms for us to do so are still in development.)

Summary:

A prominent aspect of the visual cortex is the topographic organization in feature dimensions such as orientation, color, motion etc. The present study uses 2-photon imaging in area V4 of awake monkeys, which is a novel application of 2-photon, to characterize the organization of tuning for established shape elements (curves, corners, and bars). The authors use a combination of wide field/low resolution imaging, to visualize large scale organization, with smaller field/high resolution imaging, to measure tuning and organization of individual neurons underlying the wide field results. At both scales, they establish that most V4 neurons are more responsive to curves and corners than to bars, and they establish anatomical segregation between neurons tuned for curves and neurons tuned for bars.

Overall the reviewers made positive comments about this study especially noting the technological advance and the application of a new high-resolution imaging modality to the question of topographic organisation in area V4, although reviewers also commented that the present study is largely a replication of previous work.

Nonetheless, because of the technology used here, the reviewers assess that the work is of significant interest. The main comments of the reviewers pertained to the statistical analyses in this manuscript, which will require extensive revisions and data analyses.

Essential revisions:

1. Statistics

Major improvements will be required on the level of statistical and data analyses. In light of these concerns, we require the authors publish the data and software underlying the figures so that the statistical analyses become transparent and can be verified by the reviewers.

Reviewers commented that statistical analysis is almost completely lacking and is potentially wrong where it is provided. Complex results such as the ones presented by the authors need to be accompanied by appropriate spatial statistics. This will likely require substantial revision to the data analysis and the text. If necessary, the authors should consult a statistical/data science specialist for advice on how to perform the statistical analyses. It remains unclear whether the main claims will survive after appropriate analysis.

More specifically, it is unclear whether the ANOVA tests for significance of curvature- and corner-selective patches have been performed correctly. It appears that the authors identified curvature-selective patches by subtraction, and then performed the ANOVA on these patches. It is unclear whether this procedure is correct and may amount to double-dipping because regions are pre-selected before statistics are run. This kind of analysis can dramatically increase the Type 1 error rate and lead to false conclusions. Therefore, the significance values that are reported here are likely far more extreme than they would be otherwise. Many tutorials regarding how to do these sorts of tests correctly can be found in the neuroimaging literature, where this sort of problem has been extensively discussed and where it is standard to address it appropriately. The authors should consult one of those tutorials and implement a strictly correct (probably FDR-based) procedure. For instance, here is a possible starting point (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3221040/).

Reviewers commented that the statistical analysis to determine the significance of clustering also appear to be problematic. In fact, it is not clear from the details of the paper what has been precisely done. It appears that the CVSI and CNSI were not evaluated statistically, but ANOVA was used to evaluate tuning in some way, which remains unclear. It furthermore appears that spatial clustering was not assessed statistically at all. The lack of statistics and unclarity about statistics does not meet the prevailing standards of the field. Single neuron tuning needs to be assessed with the correct statistical tests, as does spatial clustering. Given these data and the pre-selection methods that were used to identify targets for the high-resolution analysis, this could be tricky. The authors should consult a statistical/data science specialist for advice on how do to these analyses correctly.

In general the results with the 16x objective likely suffer from double dipping, as they are preselected, therefore the statistical claims about large-scale topographic organization remain unconvincing.

2. Limitations were noted about the fine-grained analyses with the 16x objective. A limitation of the present work is that there is only one pair of patches for each animal imaged at 16x. The analyses also need to be extended. The authors should further analyze the topographic organization at the local scale: is the transition sharp or gradual, what is the variability, etc. It seems that there is a rather sharp boundary and that the tuning stays relatively flat, looking at Figures 2A-D, 3A, 4C, and this seems particularly clear in 5A and C (using concentric versus radial gratings). However, there is no real quantification of this. Figure 2I-K shows the tuning over distance, but this analysis seems to be performed without taking the shape of the domain into account. One possibility would be to show a similar plot, but where the axis is taken perpendicular to the boundary of the domain. It seems that the interpretation that the authors give of the data would predict that the selectivity shows a sharp transition at the boundary and stays elevated within the domain. Furthermore, it would be relevant to get an estimate of the averaged selectivity as well as the variability within the domain, separately for the two animals. Finally, it would be relevant to compare both the sharpness of the transition as well as the mean and variability within the domain between the different stimulus sets (curves versus angles, and concentric versus radial gratings).

3. Data visualisation

The authors should show more raw data (high-resolution fluorescence images with the field of view used for the main analyses), as well as traces of fluorescence as a function of time as is standard with imaging to appreciate the quality of the fluorescence traces (over tens of seconds). In addition showing dF/F responses for single neurons to different stimuli would be important.

4. Bar tuning

It needs to be very clear if the small amount of bar tuning reported is only in the ROIs that are defined by subtracting bars (where this would be therefore expected) or overall, in the discussion it currently sounds like this is the case overall which was not clear from the results.

5. Choice of stimuli

The exact choice of stimulus needs to be discussed: why only black (other studies used only white stimuli), why only lines (not surfaces as in e.g. Pasupathy et al. study that is referred to), why no colors. Is it assumed that this will not matter for the results and why?

The bar length is matched to the radius of the curve stimuli, which implies to me that the overall number of black pixels is never matched for bars vs the other categories? The authors should discuss if this is a problem.

Do you expect more curve/corner functional domains if you use different color or luminance contrast, or do you expect the non-significantly curve/corner clustered parts of V4 to contain other functional domains?

6. Temporal dynamics

The imaging technique confines analyses to a late time window. If possible refer to literature demonstrating that response preferences remain similar across time for these stimuli, since tuning can be dynamic over time (e.g. Nandy et al. 2016, Issa and DiCarlo eLife).

7. Introduction and Discussion:

Intro and Discussion read quite well and a lot of the relevant literature is referred to. But Intro and Discussion could include further/more explicit clarification why exactly this contrast (curve/corner) was used to study functional domains (or is this just a starting point), what other functional domains there could be.

The paper needs to cite literature relating curve/corner to animate/inanimate contrasts you discuss (e.g. Zachariou et al. 2018, and other work from Yue lab). You may consider a brief discussion/mention the potential use or function of functional topographic clustering (e.g. Kanwisher, DiCarlo), which is proposed to be related to naturalistic experience that is also discussed here without references.

The history presented in the introductory section of this paper is very strange. The first paper that reported curvature tuning in V4 was the Gallant et al. 1993 paper that is cited ambiguously here. It is true that paper used gratings rather than curved lines, but a neuron that is selective for curved gratings is also likely selective for curved lines. A similar principle holds for the hyperbolic grating selectivity reported in Gallant et al. 1993. The authors should address this directly and acknowledge the relationship late in their paper.

Similarly, in their subsequent 1996 longer report Gallant et al. argued that neurons selective for curved and hyperbolic gratings were spatially clustered. The data presented in the paper under review is far better than the data that were available to Gallant et al. way back in 1996, but this result was anticipated by that earlier 1996 report, however this finding is not cited.

The authors should discuss the recent paper by the Roe lab on curvature patches using intrinsic optical imaging, which has just been published in eLife: https://elifesciences.org/articles/57261. This paper is relevant to the points above, as they claim that there is a smooth transition from rectilinear to low curvature to high curvature (figure 7).

The authors should furthermore discuss this recent eLife paper on curvature domains, using both intrinsic and 2-photon imaging: https://elifesciences.org/articles/57502

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Clustered Functional Domains for Curves and Corners in Cortical Area V4" for further consideration by eLife. Your revised article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by Tirin Moore as the Senior Editor, and a Reviewing Editor.

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below:

Reviewer #2:

The authors replied sufficiently to most of the comments.

One answer is not clear to me, in response to the comment:

"Some more details about the expression levels would be useful. Most importantly, it is unclear from Figure 1C-F how homogenous the expression was in the selected region. Could you show a separate image where it is possible to judge the level of expression? Also, would it be possible to give an estimate of the general expression levels in terms of percentage of total neurons, as well as the percentage of neurons that were nucleus filled? Finally, it would be relevant to know injection speed in this regard."

First of all, they still do not provide the injection speed.

Also, they write: "Most of the neurons that are clearly visible in an average image are not nucleus filled (Figure 1—figure supplement 2)."

However, Figure 1—figure supplement 2 does not show any individual cells. Nor do any of the other supplementary figures provide an image where it is possible to judge the structure of the labelling in individual cells, so allowing to see whether they have a clear donut shape, or are nucleus filled. It would therefore still be relevant to see a large / high resolution image where this can be judged.

Reviewer #4:

The authors have put a lot of work into this revision and the paper is substantially improved over the initial submission. The paper is still largely replicative and confirmatory, but there is a place in the literature for such papers.

It is reported that the V4 receptive fields sampled here were very close to the fovea. That implies that the viewing window was very far lateral, much farther than most prior V4 studies. My intuition is that the ear would have had to be removed in order to access V4 at this location. If the authors recorded more medially then I suggest that they recheck their reported eccentricity to be sure that it is correct.

The indexes that are used here have a pretty unintuitive and unusual scaling range. (For example, an index of 0.2 indicates a 1.5 times difference.) The paper would probably be easier to understand if they had a more intuitive range/form. (For example, if 1.5 indicated a 1.5 times difference.) However, this is up to the authors' discretion.

Figure 2I "significant" is misspelled. There are also a few places throughout the manuscript where pronouns are missing. (I commend the authors on the English though, it is generally quite good!)

Also in Figure 2, please spell out what "CVSI" and "CNSI" mean in the caption. In this and other captions, it is best if the reader can generally understand the caption on its own, w/o having to wade through the text.

The use of hexagonal segments to try to understand differences in tuning for curves versus angles is a weak approach, because hexagonal shapes are a poor intermediate model for these feature classes. A much more powerful method for understanding these differences would be to use an explicit computational model. But that seems to be beyond the scope of this paper…

https://doi.org/10.7554/eLife.63798.sa1

Author response

Essential revisions:

1. Statistics

Major improvements will be required on the level of statistical and data analyses. In light of these concerns, we require the authors publish the data and software underlying the figures so that the statistical analyses become transparent and can be verified by the reviewers.

Reviewers commented that statistical analysis is almost completely lacking and is potentially wrong where it is provided. Complex results such as the ones presented by the authors need to be accompanied by appropriate spatial statistics. This will likely require substantial revision to the data analysis and the text. If necessary, the authors should consult a statistical/data science specialist for advice on how to perform the statistical analyses. It remains unclear whether the main claims will survive after appropriate analysis.

More specifically, it is unclear whether the ANOVA tests for significance of curvature- and corner-selective patches have been performed correctly. It appears that the authors identified curvature-selective patches by subtraction, and then performed the ANOVA on these patches. It is unclear whether this procedure is correct and may amount to double-dipping because regions are pre-selected before statistics are run. This kind of analysis can dramatically increase the Type 1 error rate and lead to false conclusions. Therefore, the significance values that are reported here are likely far more extreme than they would be otherwise. Many tutorials regarding how to do these sorts of tests correctly can be found in the neuroimaging literature, where this sort of problem has been extensively discussed and where it is standard to address it appropriately. The authors should consult one of those tutorials and implement a strictly correct (probably FDR-based) procedure. For instance, here is a possible starting point (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3221040/).

We agree that the map subtraction used in the previous version of the manuscript is more appropriate as a visualization of data trends than as a strict statistical analysis. In the revised manuscript, we have now performed an independent t-test on each pixel, with Benjamini–Hochberg FDR correction and cluster permutation testing, following the recommendations in Nichols and Holmes 2002 (Hum. Brain Mapp.). The new FDR q-value maps, replacing the previous SD maps, can be found in Figure 1F-G. The details of the new analysis can be found in the Methods – Curve and corner domains section, lines 491-507. In brief, a p-value was computed for each pixel by comparing the responses to all curves, all corners and all bars, and corrected by BH FDR (Figure 1 — figure supplement 3). Cluster permutation tests were then performed to exclude patches (q-value < 0.01) containing too few pixels (Author response image 1). We used uncorrected p-values, rather than q-values, when building the null distribution in the permutations, because the cluster sizes in the permutations would otherwise be too small when compared against the real FDR clusters. Using uncorrected p-values yields larger cluster sizes in the null distribution and therefore a larger critical value, making the Type 1 error rate even lower.
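To make the corrected procedure concrete, the three steps (pixel-wise t-test, Benjamini–Hochberg FDR, cluster-size permutation) can be sketched roughly as below. This is an illustrative reconstruction on synthetic data, not our analysis code; the array sizes, thresholds, effect size and permutation count are placeholders:

```python
import numpy as np
from scipy import stats
from scipy.ndimage import label

rng = np.random.default_rng(0)

# Synthetic stand-in data: trials x height x width response maps
curve_resp = rng.normal(0.0, 1.0, size=(20, 32, 32))
other_resp = rng.normal(0.0, 1.0, size=(20, 32, 32))
curve_resp[:, 8:14, 8:14] += 3.0         # an artificial "curve patch"

# 1) pixel-wise independent t-test
_, p = stats.ttest_ind(curve_resp, other_resp, axis=0)

# 2) Benjamini-Hochberg FDR correction of the pixel-wise p-values
p_flat = p.ravel()
order = np.argsort(p_flat)
m = p_flat.size
q_sorted = p_flat[order] * m / np.arange(1, m + 1)
q_sorted = np.minimum.accumulate(q_sorted[::-1])[::-1]  # step-up monotonicity
q = np.empty_like(p_flat)
q[order] = q_sorted
q = q.reshape(p.shape)

# 3) cluster-size permutation: shuffle trial labels, threshold the
#    UNCORRECTED p-map each time, and record the largest cluster
pooled = np.concatenate([curve_resp, other_resp], axis=0)
n = curve_resp.shape[0]
max_sizes = []
for _ in range(1000):                    # far fewer than in the paper, for speed
    perm = rng.permutation(pooled.shape[0])
    _, pp = stats.ttest_ind(pooled[perm[:n]], pooled[perm[n:]], axis=0)
    lab, nlab = label(pp < 0.01)
    sizes = np.bincount(lab.ravel())[1:]  # drop the background label
    max_sizes.append(sizes.max() if nlab else 0)

crit = np.percentile(max_sizes, 95)      # top-5% largest null cluster size
lab, nlab = label(q < 0.01)
kept = [i for i in range(1, nlab + 1) if (lab == i).sum() > crit]
print(f"critical cluster size: {crit:.0f} px, surviving clusters: {len(kept)}")
```

Only clusters of FDR-significant pixels larger than the permutation-derived critical size survive, which is the conservative direction of the procedure.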

As the shapes of the domains changed slightly under the corrected procedure, example neuron 2 in Figure 2G of the old version now falls outside the previous curve domain, so we have replaced it with another neuron.

Author response image 1
The null distribution of the cluster permutation test (in descending rank order).

The 3000th-largest (top 5%) cluster size (in pixels) is chosen as the threshold.

Reviewers commented that the statistical analysis used to determine the significance of clustering also appears to be problematic. In fact, it is not clear from the details of the paper what precisely has been done. It appears that the CVSI and CNSI were not evaluated statistically, but ANOVA was used to evaluate tuning in some way that remains unclear. It furthermore appears that spatial clustering was not assessed statistically at all. The lack of statistics, and the lack of clarity about the statistics, does not meet the prevailing standards of the field. Single-neuron tuning needs to be assessed with the correct statistical tests, as does spatial clustering. Given these data and the pre-selection methods that were used to identify targets for the high-resolution analysis, this could be tricky. The authors should consult a statistical/data science specialist for advice on how to do these analyses correctly.

In general, the results with the 16x objective likely suffer from double dipping, as they are preselected; therefore, the statistical claims about large-scale topographic organization remain unconvincing.

We agree that CVSI and CNSI should be statistically analyzed. In fact, we only performed an ANOVA comparing the responses to the optimal curve against the optimal corner (Figure 3D) to evaluate CVCNI (as we were comparing only two conditions, this is equivalent to an independent t-test). We compared only the optimal stimuli for each neuron, rather than all stimuli as in the 4x imaging, because single neurons are often very selective, while 4x signals are Gaussian smoothed and therefore reflect contributions from many nearby cell bodies and neurites. We have added the same analysis for CVSI and CNSI in Figure 2 — figure supplement 3A-B. For CVSI, we compared the maximum response to curves against the maximum response to corners and bars (one-way ANOVA, repeat = 10, p < 0.05): 70.5% (74 out of 105) of neurons with CVSI > 0.2 significantly preferred curves over corners and bars. For CNSI, it was 76.9% (120 out of 156).

We also added a permutation test to the clustering analysis in Figure 2I-K. For instance, in Figure 2I, tuning correlations and distances were randomly paired 100,000 times to build the null distribution, which was averaged to derive the dashed curve. A point was considered significant if it was higher than the top 100 of the null distribution (p < 0.001) or lower than the bottom 100 (Methods, lines 556-559). In the old version we stated that the curve and corner domains are about 400 μm in size; we have now changed this to 300 μm.
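This shuffling procedure can be sketched as follows. The data here are synthetic pair data and all numbers are placeholders; this is an illustration of the method, not the analysis code:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-ins for the real measurements: pairwise cortical distance (um)
# and tuning (signal) correlation for each neuron pair
n_pairs = 2000
dist = rng.uniform(0, 600, n_pairs)
corr = np.clip(0.4 - dist / 1500 + rng.normal(0, 0.15, n_pairs), -1, 1)

# Bin pairs by cortical distance and take the mean correlation per bin
bins = np.arange(0, 601, 100)
masks = [(np.digitize(dist, bins) - 1) == b for b in range(len(bins) - 1)]
observed = np.array([corr[m].mean() for m in masks])

# Null: break the distance-correlation pairing by shuffling, many times
n_perm = 10_000                           # 100,000 in the manuscript
null = np.empty((n_perm, len(masks)))
for i in range(n_perm):
    shuf = rng.permutation(corr)
    null[i] = [shuf[m].mean() for m in masks]

chance = null.mean(axis=0)                # the dashed "null" curve
null_sorted = np.sort(null, axis=0)
hi = null_sorted[-int(n_perm * 0.001)]    # top 0.1% bound (p < 0.001)
lo = null_sorted[int(n_perm * 0.001)]     # bottom 0.1% bound
significant = (observed > hi) | (observed < lo)
```

With these synthetic numbers, nearby pairs fall significantly above the null curve and distant pairs below it, mirroring the clustering result.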

2. Limitations were noted about the fine-grained analyses with the 16x objective. A limitation of the present work is that there is only one pair of patches imaged at 16x for each animal. The analyses also need to be extended.

The possible choices for 16x imaging are quite limited (Figure 1F-G and Figure 1 — figure supplement 3). Yue et al. 2014 reported only one significant curvature patch in dorsal V4, and we would not be surprised if there are only a few such patches in V4d. We also have to avoid dense blood vessels in 16x imaging, as they heavily affect the visualization of clustering. In fact, we had another monkey, but GCaMP6s was expressed in that monkey. GCaMP6s saturates easily, making it hard to differentiate strong from weak responses, and may therefore be less suitable for quantitative and semi-quantitative analysis, though its absolute responses can be very strong.

The authors should further analyze the topographic organization at the local scale: is the transition sharp or gradual, what is the variability, etc.? It seems that there is a rather sharp boundary and that the tuning stays relatively flat, looking at Figures 2A-D, 3A and 4C, and this seems particularly clear in 5A and C (using concentric versus radial gratings). However, there is no real quantification of this. Figure 2I-K shows the tuning over distance, but this analysis seems to be performed without taking the shape of the domain into account. One possibility would be to show a similar plot where the axis is taken perpendicular to the boundary of the domain. The interpretation that the authors give of the data would seem to predict that the selectivity shows a sharp transition at the boundary and stays elevated within the domain. Furthermore, it would be relevant to get an estimate of the averaged selectivity as well as the variability within the domain, separately for the two animals. Finally, it would be relevant to compare both the sharpness of the transition and the mean and variability within the domain between the different stimulus sets (curves versus angles, and concentric versus radial gratings).

This is a very good point. We now show the plots over distance to the boundary in Figure 2 — figure supplement 3 and Figure 5 — figure supplement 2. In our view the transition is rather gradual: selectivity generally took around 300 μm to become elevated. The transition in the CRI maps of monkey A in Figure 5A looked intuitively sharp, probably because many neurons had negative CRI values. CRI and CVCNI were correlated but not identical. Since concentric gratings contain only 360° full circles, while some neurons might prefer short arcs (small radians, Figure 3 — figure supplement), it is possible that such neurons do not respond strongly to concentric gratings. We have added these discussions to the revised manuscript (lines 313-321).

3. Data visualisation

The authors should show more raw data (high-resolution fluorescence images with the field of view used for the main analyses), as well as traces of fluorescence as a function of time as is standard with imaging to appreciate the quality of the fluorescence traces (over tens of seconds). In addition showing dF/F responses for single neurons to different stimuli would be important.

Yes, we have added the raw fluorescence traces of neurons in Figure 2 — figure supplement 1. We show the responses to the optimal curves, corners and bars together in figure supplement 2.

4. Bar tuning

It needs to be very clear whether the small amount of bar tuning reported is only in the ROIs defined by subtracting bars (where it would therefore be expected) or overall; in the discussion it currently sounds like this is the case overall, which was not clear from the results.

It is only in the FOVs defined by the white boxes in Figure 1F-G; we apologize for not making this clear. In the old version (line 297) we wrote: "This was not the case for our data, which we infer is primarily due to sampling neurons within or close to curve and corner domains." We intended this as the explanation for both of the above points, but it unfortunately read ambiguously. We have rewritten this part in the revised manuscript (lines 327-335).

5. Choice of stimuli

The exact choice of stimulus needs to be discussed: why only black (other studies used only white stimuli), why only lines (not surfaces as in e.g. Pasupathy et al. study that is referred to), why no colors. Is it assumed that this will not matter for the results and why?

The Bushnell and Pasupathy 2012 study demonstrated that shape encoding is largely consistent across colors, so we think the curve/corner preference may be color invariant to some extent and the choice of color may not matter too much. Moreover, in our natural-image results we did not find a dominant color dimension in the curve and corner domains, and the preferred natural images of curve-selective neurons can be of various colors. Our V1 study used mostly black lines (Tang et al. 2018), so we used black lines here as well.

The bar length is matched to the radius of the curve stimuli, which implies to me that the overall number of black pixels is never matched for bars vs the other categories? The authors should discuss if this is a problem.

Yes, but the long bars are precisely matched to the small corners in pixel number, and very closely matched to the 60° curves (length ratio = 1:π/3 ≈ 1:1.05). We compared the maximum responses to 60° curves against bars for curve-selective neurons in Figure 3D and Figure 3 — figure supplement 1C-D, and found that 52.4% significantly preferred 60° curves to bars, while only 7.1% preferred bars to 60° curves (Author response image 2). An example of this is Neuron 3 in Figure 2G. Note that many neurons preferred longer curves and might not even respond to 60° curves (Figure 3 — figure supplement 1D). We therefore think that even with similar pixel numbers many neurons still preferred curves over bars, and that curve preference is not a mere artifact of a greater pixel count.
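As a quick check on the quoted ratio (not from the analysis code): a 60° circular arc of radius r has length r·π/3, so a bar matched to the radius differs from the 60° curve by only about 5% in length:

```python
import math

# Length of a 60-degree circular arc of radius r, relative to a bar of length r
arc_over_bar = math.radians(60)   # = pi/3
print(round(arc_over_bar, 3))     # → 1.047
```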

Author response image 2
Scatterplot showing the maximum response to 60° curves against the maximum response to bars for the curve-selective neurons.

Do you expect more curve/corner functional domains if you use different color or luminance contrast, or do you expect the non-significantly curve/corner clustered parts of V4 to contain other functional domains?

We are not entirely sure whether more curve/corner domains could be found, but Bushnell and Pasupathy 2012 reported that shape tuning remains consistent across colors, and in our natural-image results the neurons in a given curve or corner domain could respond to images containing curves or corners of various colors.

There are certainly other functional domains in V4, such as orientation (Tanigawa et al. 2010), color (Conway et al. 2007), spatial frequency (Lu et al. 2018) and 3D vision (Srinath et al. 2020).

6. Temporal dynamics

The imaging technique confines analyses to a late time window. If possible refer to literature demonstrating that response preferences remain similar across time for these stimuli, since tuning can be dynamic over time (e.g. Nandy et al. 2016, Issa and DiCarlo eLife).

Yes, the time taken for calcium influx and accumulation limits the temporal resolution, so we only record late responses. In fact, it is known that the early and late responses of V4 neurons can be quite different (Yau et al. 2013). The early responses are considered feed-forward signals and are therefore more tuned to local orientation, whereas complex pattern preference emerges gradually and is more likely to follow a recurrent model. We are currently undertaking loose-patch recordings in V4 and find that the early and late responses can indeed differ. This limitation of temporal resolution cannot at present be overcome using calcium-based fluorescence; the newly developed neurotransmitter or voltage sensors may eventually enable high-temporal-resolution imaging. We have added some of this to the discussion in the revised manuscript (lines 353-362).

7. Introduction and Discussion:

Intro and Discussion read quite well and a lot of the relevant literature is referred to. But Intro and Discussion could include further/more explicit clarification why exactly this contrast (curve/corner) was used to study functional domains (or is this just a starting point), what other functional domains there could be.

We used the curve/corner contrast based on the early version of this manuscript posted on bioRxiv in 2019: https://www.biorxiv.org/content/10.1101/808907v2.full (the two recent eLife papers from Roe and Lu both cite this manuscript). In brief, we recorded V4 neurons' responses to thousands of natural images and performed purely data-driven dimensionality reduction on the population responses, without any a priori assumptions. Apart from feature dimensions encoding simple orientation, we also found a dimension encoding curves and corners (at opposite ends of the axis). The curve- and corner-selective neurons were separately spatially clustered. However, there are very limited curve and corner domains, and many other functional domains, such as orientation (Tanigawa et al. 2010), color (Conway et al. 2007) and spatial frequency (Lu et al. 2018), can be found in V4. To replicate the result, one would have to first localize the curve/corner domains and then record natural-image responses (otherwise other feature dimensions might dominate), which runs against the idea of being "purely data-driven without a priori assumptions". We therefore removed the natural-image part and simplified the manuscript to focus only on the curve and corner domains in V4. We have added some of these points to the discussion in the revised manuscript (lines 299-302).

The paper needs to cite literature relating curve/corner to animate/inanimate contrasts you discuss (e.g. Zachariou et al. 2018, and other work from Yue lab). You may consider a brief discussion/mention the potential use or function of functional topographic clustering (e.g. Kanwisher, DiCarlo), which is proposed to be related to naturalistic experience that is also discussed here without references.

We have restructured the final paragraph of the discussion to encompass the thread of ideas a little bit more clearly, and we have added references related to the concepts of functional specialization, including its specificity in terms of cognition.

The history presented in the introductory section of this paper is very strange. The first paper that reported curvature tuning in V4 was Gallant et al. 1993, which is cited ambiguously here. It is true that that paper used gratings rather than curved lines, but a neuron that is selective for curved gratings is also likely selective for curved lines. A similar principle holds for the hyperbolic grating selectivity reported in Gallant et al. 1993. The authors should address this directly and acknowledge the relationship later in their paper.

Similarly, in their subsequent, longer 1996 report, Gallant et al. argued that neurons selective for curved and hyperbolic gratings were spatially clustered. The data presented in the paper under review are far better than the data available to Gallant et al. back in 1996, but this result was anticipated by that earlier report; however, that finding is not cited.

Yes, we should have directly acknowledged these findings. These studies are now referred to on line 50-53 and line 250-253 of the revised manuscript.

The authors should discuss the recent paper from the Roe lab on curvature patches using intrinsic optical imaging, which has just been published in eLife: https://elifesciences.org/articles/57261. This paper is relevant for the points above, as they claim that there is a smooth transition from rectilinear to low curvature to high curvature (their Figure 7).

The authors should furthermore discuss this recent eLife paper on curvature domains, using both intrinsic and 2-photon imaging: https://elifesciences.org/articles/57502

These papers are now referred to on lines 305-311 (both papers cite our 2019 preprint and were not yet published when we submitted our manuscript to eLife). Roe's paper used curved gratings vs straight gratings, whereas in our case it is mainly curves vs corners. The 90° corners and the Π-shapes we used had levels of curvature comparable to the smooth curves, but were encoded more in the corner domains. We therefore feel that our data may not be suitable for addressing the low-to-high curvature transition question.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Reviewer #2:

The authors replied sufficiently to most of the comments.

One answer is not clear to me, in response to the comment:

"Some more details about the expression levels would be useful. Most importantly, it is unclear from Figure 1C-F how homogenous the expression was in the selected region. Could you show a separate image where it is possible to judge the level of expression? Also, would it be possible to give an estimate of the general expression levels in terms of percentage of total neurons, as well as the percentage of neurons that were nucleus filled? Finally, it would be relevant to know injection speed in this regard."

First of all, they still do not provide the injection speed.

We are sorry for missing this in the previous revision. The injection speed was 5-10 nl/s. We have added this to the Materials and methods (line 408) of the revised manuscript.

Also, they write: "Most of the neurons that are clearly visible in an average image are not nucleus filled (Figure 1—figure supplement 2)."

However, Figure 1—figure supplement 2 does not show any individual cells. Nor do any of the other supplementary figures provide an image where it is possible to judge the structure of the labelling in individual cells, allowing one to see whether they have a clear donut shape or are nucleus filled. It would therefore still be relevant to see a large, high-resolution image where this can be judged.

We first apologize for our mistaken figure labelling: Figure 1—figure supplement 2 here should be Figure 2—figure supplement 1A. We have now rearranged the layout of Figure 2—figure supplement 1 in the manuscript to make the image larger, and have uploaded the source TIF file in the system. We hope this revised figure addresses your concern.

Reviewer #4:

The authors have put a lot of work into this revision and the paper is substantially improved over the initial submission. The paper is still largely replicative and confirmatory, but there is a place in the literature for such papers.

It is reported that the V4 receptive fields sampled here were very close to the fovea. That implies that the viewing window was very far lateral, much farther than most prior V4 studies. My intuition is that the ear would have had to be removed in order to access V4 at this location. If the authors recorded more medially then I suggest that they recheck their reported eccentricity to be sure that it is correct.

Yes, the optical window is indeed quite lateral. As can be seen in Figure 1B, a large part of the IOS and PIT is included in the 10 mm diameter window, and we were imaging in the lower half. Nevertheless, since the imaging window is smaller than those used for ISOI, we did not remove any of the auricle, though the edge of the window was very close to it. In fact, we can also implant such windows to image PIT without removing the auricle.

The indexes used here have a rather unintuitive and unusual scaling range (for example, an index of 0.2 indicates a 1.5-times difference). The paper would probably be easier to understand if they had a more intuitive range/form (for example, if 1.5 indicated a 1.5-times difference). However, this is up to the authors' discretion.

We agree that the selectivity indexes can be unintuitive, but such a definition is often used to characterize orientation selectivity (OSI, orientation selectivity index) and is also common in recent imaging studies such as Wilson et al. 2018 and Garg et al. 2019. We therefore think this precedent justifies the use of such indexes.
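For readers unfamiliar with this convention, and assuming the standard contrast-index form (R_pref − R_nonpref)/(R_pref + R_nonpref) that the reviewer's "0.2 indicates a 1.5-times difference" example implies, the index and the response ratio are related as follows (a small illustrative sketch, not code from the study):

```python
def selectivity_index(r_pref, r_nonpref):
    """Contrast-style index: (R_pref - R_nonpref) / (R_pref + R_nonpref)."""
    return (r_pref - r_nonpref) / (r_pref + r_nonpref)

def response_ratio(index):
    """Invert the index back to the response ratio R_pref / R_nonpref."""
    return (1 + index) / (1 - index)

print(selectivity_index(1.5, 1.0))    # → 0.2
print(round(response_ratio(0.2), 6))  # → 1.5
```

So, under this assumed form, a neuron with CVSI = 0.2 responds 1.5 times more strongly to its best curve than to its best corner or bar.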

Figure 2I "significant" is misspelled. There are also a few places throughout the manuscript where pronouns are missing. (I commend the authors on the English though, it is generally quite good!)

We have corrected the misspelling in Figure 2I, and thank you for your commendation.

Also in Figure 2, please spell out what "CVSI" and "CNSI" mean in the caption. In this and other captions, it is best if the reader can generally understand the caption on its own, without having to wade through the text.

We agree this is helpful to the reader. We now spell them out in the captions of Figure 2A-B, as well as Figure 3A and Figure 5A.

The use of hexagonal segments to try to understand differences in tuning for curves versus angles is a weak approach, because hexagonal shapes are a poor intermediate model for these feature classes. A much more powerful method for understanding these differences would be to use an explicit computational model. But that seems to be beyond the scope of this paper…

This is a very valid criticism. We are still working on a computational model to understand how neurons encode curves and corners differently, and also to interpret our natural-image data. But, as you said, this is beyond the current scope of this paper. We hope to address it better in future work.

https://doi.org/10.7554/eLife.63798.sa2

Article and author information

Author details

  1. Rundong Jiang

    1. Peking University School of Life Sciences, Beijing, China
    2. Peking-Tsinghua Center for Life Sciences, Beijing, China
    3. IDG/McGovern Institute for Brain Research at Peking University, Beijing, China
    4. Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, China
    Contribution
    Conceptualization, Data curation, Software, Formal analysis, Investigation, Writing - original draft
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0002-9217-0749
  2. Ian Max Andolina

    The Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Key Laboratory of Primate Neurobiology, Institute of Neuroscience, Chinese Academy of Sciences, Shanghai, China
    Contribution
    Formal analysis, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0001-9985-3414
  3. Ming Li

    Beijing Normal University Faculty of Psychology, Beijing, China
    Contribution
    Investigation
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0001-5173-1602
  4. Shiming Tang

    1. Peking University School of Life Sciences, Beijing, China
    2. Peking-Tsinghua Center for Life Sciences, Beijing, China
    3. IDG/McGovern Institute for Brain Research at Peking University, Beijing, China
    4. Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, China
    Contribution
    Conceptualization, Resources, Data curation, Software, Supervision, Funding acquisition, Methodology, Project administration, Writing - review and editing
    For correspondence
    tangshm@pku.edu.cn
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0003-0294-3259

Funding

National Natural Science Foundation of China (31730109)

  • Shiming Tang

National Basic Research Program of China (2017YFA0105201)

  • Shiming Tang

National Natural Science Foundation of China (U1909205)

  • Shiming Tang

Beijing Municipal Commission of Science and Technology (Z181100001518002)

  • Shiming Tang

Peking-Tsinghua Center for Life Sciences

  • Shiming Tang

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank the Peking University Laboratory Animal Center for animal care. We acknowledge the Genetically Encoded Calcium Indicator (GECI) project at Janelia Farm Research Campus, Howard Hughes Medical Institute. We thank Niall Mcloughlin and Cong Yu for their comments and suggestions on the manuscript.

Ethics

Animal experimentation: All procedures involving animals were in accordance with the Guide of Institutional Animal Care and Use Committee (IACUC) of Peking University Laboratory Animal Center, and approved by the Peking University Animal Care and Use Committee (LSC-TangSM-5).

Senior Editor

  1. Tirin Moore, Stanford University, United States

Reviewing Editor

  1. Martin Vinck, Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, Germany

Reviewers

  1. Timo van Kerkoerle
  2. Ed Connor, Johns Hopkins University, United States
  3. Jack L Gallant, University of California, Berkeley, United States

Publication history

  1. Received: October 7, 2020
  2. Accepted: May 16, 2021
  3. Accepted Manuscript published: May 17, 2021 (version 1)
  4. Version of Record published: June 3, 2021 (version 2)

Copyright

© 2021, Jiang et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.


