Contrary neuronal recalibration in different multisensory cortical areas

  1. Fu Zeng
  2. Adam Zaidel (corresponding author)
  3. Aihua Chen (corresponding author)
  1. Key Laboratory of Brain Functional Genomics (Ministry of Education), East China Normal University, China
  2. Gonda Multidisciplinary Brain Research Center, Bar-Ilan University, Israel

Abstract

The adult brain demonstrates remarkable multisensory plasticity by dynamically recalibrating itself based on information from multiple sensory sources. After a systematic visual–vestibular heading offset is experienced, the unisensory perceptual estimates for subsequently presented stimuli are shifted toward each other (in opposite directions) to reduce the conflict. The neural substrate of this recalibration is unknown. Here, we recorded single-neuron activity from the dorsal medial superior temporal (MSTd), parietoinsular vestibular cortex (PIVC), and ventral intraparietal (VIP) areas in three male rhesus macaques during this visual–vestibular recalibration. Both visual and vestibular neuronal tuning curves in MSTd shifted – each according to their respective cues’ perceptual shifts. Tuning of vestibular neurons in PIVC also shifted in the same direction as vestibular perceptual shifts (cells were not robustly tuned to the visual stimuli). By contrast, VIP neurons demonstrated a unique phenomenon: both vestibular and visual tuning shifted in accordance with vestibular perceptual shifts, such that visual tuning shifted, surprisingly, contrary to visual perceptual shifts. Therefore, while unsupervised recalibration (to reduce cue conflict) occurs in early multisensory cortices, higher-level VIP reflects only a global shift, in vestibular space.

Editor's evaluation

This important study combines quantitative behavior and single-unit recordings in nonhuman primates to investigate the role of three cortical areas in cross-modal sensory calibration, a form of neural plasticity that is important for perception and learning. The results convincingly demonstrate key similarities and striking differences across the three areas and provide the first evidence for this form of calibration (in correspondence with behavior) at the level of single neurons. The work will be of broad interest to neuroscientists and psychologists studying multisensory perception, plasticity, and the role of sensory and association cortices in perceptual decisions.

https://doi.org/10.7554/eLife.82895.sa0

Introduction

Our different sensory systems each continuously adapt to changes in the environment (Webster, 2012). Thus, to maintain stable and coherent perception in a multisensory and ever-changing world, the brain needs to dynamically adjust for sensory discrepancies between the different modalities. This process of multisensory recalibration takes place continually and is perhaps more fundamental than multisensory integration because integration would not be beneficial when the underlying cues are biased. While the neural bases of multisensory integration have received a lot of attention (Chen et al., 2013a; Gu et al., 2008; Stein et al., 2014), the neural bases of multisensory recalibration have been explored to a much lesser degree.

Cross-modal recalibration has been observed in a variety of multisensory settings. One well-known example is the ventriloquist aftereffect, in which exposure to a consistent spatial discrepancy between auditory and visual stimuli induces a subsequent shift in the perceived location of sounds (Bertelson and De Gelder, 2004; Canon, 1970; Kramer et al., 2019; Radeau and Bertelson, 1974; Recanzone, 1998; Watson et al., 2021). Also, the rubber-hand illusion leads to an offset in hand proprioception in the direction of the visually observed rubber hand (Abdulkarim et al., 2021; Botvinick and Cohen, 1998; Kennett et al., 2001; Thériault et al., 2022; Tsakiris and Haggard, 2005). Although it was initially thought that only the non-visual cues recalibrate to vision, termed visual dominance (Brainard and Knudsen, 1993; Rock and Victor, 1964), further work in a variety of paradigms has revealed both visual and non-visual recalibration (Atkins et al., 2003; Lewald, 2002; van Beers et al., 2002; Zaidel et al., 2011).

Most of what we know about multisensory recalibration is described at the behavioral level (Lewald, 2002), with little known about its neuronal underpinnings. Recent studies in humans have shed some light on this question. In the ventriloquism aftereffect, cross-modal (audio-visual) recalibration of auditory signals (fMRI) is seen in low-level auditory cortical areas (Zierul et al., 2017). According to that study and another recent (EEG) study (Park and Kayser, 2021), higher-level parietal regions also play a central role in cross-modal spatial recalibration. Moreover, Park and Kayser (2021) further suggest that frontal regions consolidate the behavioral shift under sustained multisensory discrepancies. However, these methods (fMRI and EEG) lack the resolution to probe recalibration at the level of single neurons.

In a series of classic studies, Knudsen and Brainard investigated multisensory plasticity at the neuronal and circuit levels in the barn owl (Knudsen, 2002; Knudsen and Brainard, 1991; Linkenhoker and Knudsen, 2002). They found profound neuronal plasticity in juvenile owls reared with prismatic lenses that systematically displaced their field of view. In that case, the auditory space map in the optic tectum was recalibrated to be aligned with the displaced visual field (Knudsen and Brainard, 1991). However, multisensory plasticity is not limited to development, and the neuronal bases of how multiple sensory systems continuously adapt to one another in the adult brain remain fundamentally unknown.

Self-motion perception (the subjective feeling of moving through space) relies primarily on visual and vestibular cues (Butler et al., 2015; Butler et al., 2010; de Winkel et al., 2010; Fetsch et al., 2011; Fetsch et al., 2009; Gu et al., 2007; Warren et al., 1988). Multisensory integration of visual and vestibular signals can improve heading perception (Butler et al., 2015; Dokka et al., 2015; Gu et al., 2008). However, conflicting or inconsistent visual and vestibular information often leads to motion sickness (Oman, 1990; Reason and Brand, 1975). Interestingly, this subsides after prolonged exposure to the sensory motion conflict, presumably through brain mechanisms of multisensory recalibration (Held, 1961; Shupak and Gordon, 2006). Thus, self-motion perception – a vital skill for everyday function with intrinsic plasticity – offers a prime substrate to study cross-modal recalibration.

We previously investigated and found robust perceptual recalibration of both visual and vestibular cues in response to a systematic vestibular–visual heading discrepancy (Zaidel et al., 2011). Similar results were seen for both humans and monkeys. In that paradigm, no external feedback was given. Thus, the need for recalibration arose solely because of the cue discrepancy (we therefore call this condition unsupervised). This led to shifts in subsequent visual and vestibular perceptual estimates toward each other, presumably to reduce the conflict. This is in line with the notion that unsupervised recalibration aims to maintain ‘internal consistency’ between the cues (Burge et al., 2010). However, the neuronal basis of this everyday multisensory plasticity is unknown. This study aimed to test unsupervised recalibration of visual and vestibular neuronal tuning, and how it may differ across multisensory cortical areas.

In line with human neuroimaging studies that showed cross-modal (auditory–visual) recalibration in relatively early sensory areas (Amedi et al., 2002; Zierul et al., 2017), and because unsupervised recalibration is sensory driven (occurs as a result of the cross-modal discrepancy, in the absence of overt feedback), we expected to observe neural correlates of unsupervised visual–vestibular recalibration in relatively early cortical areas with self-motion signals. Previous studies with monkeys identified two relatively early multisensory cortical areas involved in self-motion perception: the dorsal medial superior temporal area (MSTd, Gu et al., 2006) and the parietoinsular vestibular cortex (PIVC, Chen et al., 2010). Neurons in MSTd respond to large optic flow stimuli, conducive to the visual perception of self-motion (Gu et al., 2006). Vestibular responses are also present in MSTd; however, visual self-motion signals dominate (Gu, 2018; Gu et al., 2012). PIVC has strong vestibular responses, without strong tuning to visual optic flow (Chen et al., 2010).

The ventral intraparietal (VIP) area also has robust responses to visual and vestibular self-motion stimuli; however, it is marked by strong choice signals (Chen et al., 2016; Gu, 2018; Zaidel et al., 2017). It is thus considered a higher-level multisensory area, possibly involved in perceptual decision-making or higher-order perceptual functions. Accordingly, and in line with findings of parietal involvement in human cross-modal recalibration (Park and Kayser, 2021; Zaidel et al., 2021; Zierul et al., 2017), we were interested to see what correlates of unsupervised recalibration we would see in parietal neurons. Different types of multisensory recalibration observed in VIP vs. lower-level (MSTd and PIVC) multisensory areas can provide important insights into their differential underlying functions. Thus, in this study, we focused on these three multisensory cortical areas. We examined whether and how their visual and vestibular neural tuning changed in accordance with corresponding perceptual shifts during a single session (~1 hr) of unsupervised cross-modal recalibration.

Results

Three monkeys performed a heading discrimination task before, during, and after undergoing cross-modal recalibration to spatially conflicting vestibular–visual signals. Simultaneous to behavioral performance, we recorded from single neurons extracellularly in areas MSTd (upper bank of the superior temporal sulcus, N = 83 total; 19 from monkey D, 64 from monkey K), PIVC (upper bank and the tip of the lateral sulcus, N = 160 total; 91 from monkey D, 69 from monkey B), and VIP (lower bank and tip of the intraparietal sulcus, N = 118 total; 103 from monkey D, 15 from monkey B).

The experiment paradigm followed similar methodology as our previous behavioral study (Zaidel et al., 2011). It consisted of three consecutive blocks: pre-recalibration (Figure 1A), recalibration (Figure 1B), and post-recalibration (Figure 1C). In the recalibration block, the monkeys were presented with combined stimuli (simultaneous visual and vestibular cues) with a systematic discrepancy between the visual and vestibular heading directions. In the pre- and post-recalibration blocks, the unisensory perception was measured using visual-only or vestibular-only cues. The effects of recalibration on visual and vestibular perception were measured by the shifts in the post- vs. pre-recalibration psychometric curves. We first (in the next section) present the monkeys’ perceptual recalibration results. Thereafter, we present the neural correlates thereof.

Figure 1. Multisensory recalibration paradigm.

(A) Pre-recalibration block. The vestibular stimulus was elicited by moving the motion platform (schematic in the middle, viewed from above). The visual stimulus, presented on the screen in front of the monkey, corresponded to optic flow (schematic on the right) as it would be experienced during self-motion (without motion of the platform). The self-motion stimuli comprised linear motions (of either vestibular or visual stimuli) in a forward motion with a small leftward or rightward component (black arrows, schematic in the middle). Monkeys were required to fixate on a central target (yellow circle) presented on the screen during the stimulus and then to report their perceived heading by making a saccade to one of two targets (left or right relative to straight ahead). The heading angle (θ) was varied across trials. (B) Recalibration block. Simultaneous vestibular and visual stimuli (combined) with a systematic discrepancy (Δ) between the vestibular and visual headings were presented. Only one discrepancy orientation (Δ+ or Δ−) was used per session. The blue and red arrows represent the vestibular and visual headings, respectively. The gray arrows represent the headings (varied across trials) from which the vestibular and visual cues were offset (to either side by Δ/2). The black dashed lines represent straight ahead. (C) Post-recalibration block. The single-cue trials (like in A) were interleaved with combined-cue trials (like in B).
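As a rough illustration of the offset described in panel B, the sketch below splits a discrepancy Δ evenly around a base heading. The function name, the sign convention (positive = rightward, positive Δ = vestibular to the right of visual), and the 10° value are assumptions made for the example; this section does not state the discrepancy magnitude.

```python
def conflicting_headings(base_heading_deg, delta_deg):
    """Return (vestibular, visual) headings offset by delta/2 to either side.

    Assumed sign convention: positive headings are rightward, and a positive
    delta (the Δ+ condition) places the vestibular heading to the right of the
    visual heading.
    """
    half = delta_deg / 2.0
    return base_heading_deg + half, base_heading_deg - half


# Hypothetical 10 deg discrepancy; the actual value is not given in this section.
vestibular, visual = conflicting_headings(0.0, 10.0)
print(vestibular, visual)  # 5.0 -5.0
```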

Vestibular and visual perceptual estimates shift toward each other

Figure 2 shows example psychophysical data from two experimental sessions. Replicating our previous behavioral results (Zaidel et al., 2011), we found that both visual and vestibular psychometric functions shifted in the direction required to reduce cue conflict. Namely, when the vestibular and visual heading stimuli were systematically offset, such that they consistently deviated to the right and the left, respectively (Δ+, Figure 2A), the vestibular post-recalibration curve (blue) was shifted rightward vs. pre-recalibration (black). Note that a rightward shift of the psychometric curve indicates a leftward perceptual shift (identified by a lower propensity for ‘rightward’ choices at 0° heading for the blue curve). Complementarily, the visual post-recalibration psychometric curve (red) shifted leftward vs. pre-recalibration (black), albeit to a lesser degree, indicating a rightward perceptual shift. In a reverse manner, when the vestibular and visual heading stimuli were offset to the left and right, respectively (Δ−, Figure 2B), the vestibular post-recalibration curve (blue) shifted to the left, and the visual post-recalibration curve shifted to the right.

Figure 2. Multisensory recalibration behavior.

(A, B) Perceptual recalibration in two example sessions, with (A) Δ+ (vestibular and visual headings offset to the right and left, respectively; monkey D, session #10) and (B) Δ− (vestibular and visual headings offset to the left and right, respectively; monkey D, session #48). Psychometric curves (cumulative Gaussian distribution functions) were fitted to the data (circles), and represent the proportion of rightward choices, as a function of stimulus heading direction. Pre-recalibration heading judgments are depicted by black curves, in the left and right columns for vestibular and visual cues, respectively. After recalibration, vestibular and visual curves (blue and red, respectively) were shifted in relation to the pre-recalibration curves. (C) Blue and red histograms represent the distributions of the point of subjective equality (PSE) shifts (post-recalibration minus pre-recalibration PSE) for vestibular and visual cues, respectively. Histograms above and below the abscissa represent sessions with Δ+ and Δ−, respectively. Inverted triangles (▼) and upright triangles (▲) with error bars represent mean ± standard error of the mean (SEM) shifts for sessions with Δ+ and Δ−, respectively. The numbers on the plots represent the mean PSE shifts. Asterisk symbols indicate significant shifts (p < 0.05). For the vestibular cue, p = 2.3 × 10⁻¹⁷, N = 241 sessions (Δ+ condition), and p = 1.0 × 10⁻²⁸, N = 227 sessions (Δ− condition), paired t-test. For the visual cue, p = 6.5 × 10⁻¹¹ (Δ+ condition), and p = 5.4 × 10⁻²³ (Δ− condition), paired t-test. Summary statistics for the individual animals are presented in Figure 2—source data 1.

Figure 2—source data 1

Individual monkey summary statistics of behavioral shifts.

https://cdn.elifesciences.org/articles/82895/elife-82895-fig2-data1-v1.doc

These perceptual shifts were quantified by the difference between the post- vs. pre-recalibration curves’ PSEs (points of subjective equality). A psychometric curve’s PSE represents the heading angle of equal right/left choice proportion, that is, the heading that participants would supposedly perceive as straight ahead. The vestibular and visual psychometric shifts were positive and negative, 3.40° and −1.01°, respectively, in Figure 2A, and negative and positive, −3.68° and 1.00°, respectively, in Figure 2B. Thus, in both cases (Figure 2A, B), both the vestibular and the visual cues shifted in the direction required to reduce the cue conflict (i.e., in opposite directions).
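The PSE-shift computation can be sketched as follows. This is a minimal illustration, not the fitting procedure used in the study: it assumes a two-parameter cumulative Gaussian fitted by a coarse grid-search maximum likelihood, and the heading angles, trial counts, and helper names (`fit_psychometric`, `synthetic_counts`) are hypothetical.

```python
import math


def cum_gauss(x, mu, sigma):
    """Cumulative Gaussian: modeled proportion of 'rightward' choices at heading x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fit_psychometric(headings, n_right, n_total):
    """Coarse grid-search maximum-likelihood fit; returns (pse, sigma) in degrees."""
    best_mu, best_sigma, best_ll = 0.0, 1.0, -math.inf
    for mu in (m / 10.0 for m in range(-100, 101)):        # PSE grid: -10 to 10 deg
        for sigma in (s / 10.0 for s in range(5, 101)):    # width grid: 0.5 to 10 deg
            ll = 0.0
            for x, k, n in zip(headings, n_right, n_total):
                p = min(max(cum_gauss(x, mu, sigma), 1e-6), 1.0 - 1e-6)
                ll += k * math.log(p) + (n - k) * math.log(1.0 - p)
            if ll > best_ll:
                best_mu, best_sigma, best_ll = mu, sigma, ll
    return best_mu, best_sigma


def synthetic_counts(headings, pse, sigma, n=100):
    """Noise-free choice counts for an observer with a known PSE (illustration only)."""
    return [round(n * cum_gauss(x, pse, sigma)) for x in headings]


headings = [-16, -8, -4, -2, 0, 2, 4, 8, 16]   # hypothetical heading angles (deg)
totals = [100] * len(headings)
pre_pse, _ = fit_psychometric(headings, synthetic_counts(headings, 0.0, 3.0), totals)
post_pse, _ = fit_psychometric(headings, synthetic_counts(headings, 3.4, 3.0), totals)

# Positive value = rightward shift of the curve, i.e. a leftward perceptual shift.
pse_shift = post_pse - pre_pse
```

With the synthetic post-recalibration observer placed at 3.4°, the recovered shift lands close to that value, mirroring the sign convention used for the vestibular cue in Figure 2A.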

In each session, only one discrepancy orientation was tested. Namely, vestibular and visual headings were either offset to the right and left (respectively), or vice versa. These discrepancies were arbitrarily defined as positive (Δ+) or negative (Δ−), respectively. In total, we collected data from 241 sessions with Δ+ and 227 sessions with Δ−. Distributions of the vestibular and visual PSE shifts across sessions are presented in Figure 2C (above and below the abscissa for Δ+ and Δ−, respectively). The vestibular PSEs were shifted significantly to the right for the Δ+ condition (mean ± SE = 1.12° ± 0.12°; p = 2.3 × 10⁻¹⁷, paired t-test), and significantly to the left for the Δ− condition (mean ± SE = −1.76° ± 0.14°; p = 1.0 × 10⁻²⁸, paired t-test). The visual PSEs were shifted significantly to the left for the Δ+ condition (mean ± SE = −0.73° ± 0.11°; p = 6.5 × 10⁻¹¹, paired t-test), and significantly to the right for the Δ− condition (mean ± SE = 1.10° ± 0.10°; p = 5.4 × 10⁻²³, paired t-test). Thus, consistent with our previous study, both cues shifted (in opposite directions) to reduce cue conflict.

Comparing the vestibular vs. visual shift magnitudes (pooled by flipping the vestibular and visual shift signs in the Δ− and Δ+ conditions, respectively) demonstrated significantly larger vestibular vs. visual shifts (1.43° ± 0.09° and 0.91° ± 0.07°, respectively; p = 6.8 × 10⁻⁵, paired t-test). This result is also consistent with our previous study. Thus, the behavioral results from the original study (performed in the Angelaki laboratory) were replicated in these experiments (in the Chen laboratory) using a new set of monkeys, with simultaneous neuronal recording. In the following sections, we present how neuronal responses in areas MSTd, PIVC, and VIP (Figures 3, 4 and 5, respectively) were recalibrated in comparison to the perceptual shifts.
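The sign-flip pooling and the paired comparison of shift magnitudes can be sketched as below. The data layout and function names are assumptions for illustration; the first two rows reuse the example-session shifts reported for Figure 2A, B and the remaining rows are made up. Only the t statistic is computed here; a statistics library would be needed for the p-value.

```python
import math
from statistics import mean, stdev


def pooled_magnitudes(sessions):
    """Flip signs so that every shift is positive when it reduces the conflict.

    sessions: iterable of (delta_sign, vestibular_shift, visual_shift), with
    delta_sign = +1 for Δ+ sessions and -1 for Δ− sessions.
    """
    vestibular, visual = [], []
    for delta_sign, vest_shift, vis_shift in sessions:
        vestibular.append(delta_sign * vest_shift)   # sign flipped in Δ− sessions
        visual.append(-delta_sign * vis_shift)       # sign flipped in Δ+ sessions
    return vestibular, visual


def paired_t_statistic(a, b):
    """t statistic of the paired differences a - b (degrees of freedom = n - 1)."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Rows 1-2: example sessions of Figure 2A, B. Rows 3-4: made-up values.
sessions = [(+1, 3.40, -1.01), (-1, -3.68, 1.00), (+1, 1.2, -0.9), (-1, -1.5, 0.4)]
vest, vis = pooled_magnitudes(sessions)
t_stat = paired_t_statistic(vest, vis)   # positive when vestibular shifts are larger
```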

Figure 3. Dorsal medial superior temporal (MSTd) neuronal recalibration.

(A) An example recalibration session (Δ+) with simultaneous recording from MSTd. The left column depicts the behavioral responses, pre- and post-recalibration. The vestibular psychometric curve shifted 3.01° (to the right) and the visual curve shifted −2.71° (to the left). The middle column depicts neuronal responses as a function of heading (pre- and post-recalibration). Circles and error bars represent average firing rates (FRs, baseline subtracted) ± standard error of the mean (SEM). The right column shows corresponding neurometric curves with fitted cumulative Gaussian functions. Each data point shows the proportion of trials in which an ideal observer would make a rightward choice given the FRs of the neurons. The vestibular neuronal shift was 4.73° (to the right) and the visual neuronal shift was −1.22° (to the left). (B) Correlations between neuronal point of subjective equality (PSE) shifts and perceptual PSE shifts for the vestibular and visual cues (left and right plots, respectively). Only neurons that passed screening (had significant responses and reliable neurometric PSEs, see Methods for details) were included in this analysis. Solid symbols represent sessions with Δ+ and open symbols represent Δ−. The solid lines illustrate the regression lines of the data. r, Pearson’s correlation coefficient. Summary statistics for the individual animals, and linear mixed model (LMM) results, are presented in Figure 3—source data 1 and Figure 3—source data 2, respectively.

Figure 3—source data 1

Individual monkey summary statistics for dorsal medial superior temporal (MSTd) correlations.

https://cdn.elifesciences.org/articles/82895/elife-82895-fig3-data1-v1.doc
Figure 3—source data 2

Comparison of pooled model (PM) and linear mixed model (LMM) for dorsal medial superior temporal (MSTd).

https://cdn.elifesciences.org/articles/82895/elife-82895-fig3-data2-v1.docx
Figure 4. Parietoinsular vestibular cortex (PIVC) neuronal recalibration.

(A) An example recalibration session (Δ+) with simultaneous recording from PIVC (conventions are the same as Figure 3). The vestibular and visual psychometric curves shifted 1.37° and −0.51° (to the right and left, respectively). The vestibular neurometric curve shifted 5.37° (to the right). Although a visual neurometric curve is presented for this example, no visual neurometric shift was calculated, and the neuron was excluded from subsequent visual cue analyses, because it did not pass the screening for significant tuning to visual stimuli. (B) Correlations between neuronal point of subjective equality (PSE) shifts and perceptual PSE shifts for the vestibular and visual cues. Summary statistics for the individual animals, and linear mixed model (LMM) results, are presented in Figure 4—source data 1 and Figure 4—source data 2, respectively.

Figure 4—source data 1

Individual monkey summary statistics for parietoinsular vestibular cortex (PIVC) correlations.

https://cdn.elifesciences.org/articles/82895/elife-82895-fig4-data1-v1.doc
Figure 4—source data 2

Comparison of pooled model (PM) and linear mixed model (LMM) for parietoinsular vestibular cortex (PIVC).

https://cdn.elifesciences.org/articles/82895/elife-82895-fig4-data2-v1.docx
Figure 5. Ventral intraparietal (VIP) neuronal recalibration.

(A) An example recalibration session (Δ+) with simultaneous recording from VIP (conventions are the same as Figure 3). The vestibular and visual psychometric curves shifted 4.81° and −1.13° (to the right and left, respectively). The vestibular and visual neurometric curves shifted 15.18° and 7.58°, respectively (both to the right). (B) Correlations between neuronal point of subjective equality (PSE) shifts and perceptual PSE shifts for the vestibular and visual cues. Summary statistics for the individual animals, and linear mixed model (LMM) results, are presented in Figure 5—source data 1 and Figure 5—source data 2, respectively.

Figure 5—source data 1

Individual monkey summary statistics for ventral intraparietal (VIP) correlations.

https://cdn.elifesciences.org/articles/82895/elife-82895-fig5-data1-v1.doc
Figure 5—source data 2

Comparison of pooled model (PM) and linear mixed model (LMM) for ventral intraparietal (VIP).

https://cdn.elifesciences.org/articles/82895/elife-82895-fig5-data2-v1.docx

Vestibular and visual tuning in MSTd shifted according to their respective perceptual shifts

Responses of an example neuron recorded from MSTd during unsupervised recalibration are presented in Figure 3A. Behaviorally, the vestibular PSE shifted rightward and the visual PSE shifted leftward (left column, Figure 3A). Shifts in neuronal tuning could be subtle; therefore, we used neurometrics to expose and quantify the neuronal shifts. Specifically, we calculated neurometric responses for the heading stimuli using the neuron’s firing rates (FRs), and fit these with a cumulative Gaussian function. The neurometric PSE reflects the heading direction at which the fitted neurometric curve crosses 0.5, that is, estimated straight ahead according to the neuronal data, in reference to the mean pre-recalibration FRs (see Methods for details). Neurometric curves for this example neuron are presented in the rightmost column of Figure 3A.
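One simplified way to picture a neurometric data point is sketched below. It is an assumption-laden stand-in, not the procedure defined in Methods (which is not reproduced in this section): an observer is assumed to report 'rightward' whenever the firing rate exceeds a fixed reference, taken here as the mean pre-recalibration FR, and the firing rates are made up.

```python
from statistics import mean


def neurometric_points(fr_by_heading, reference_fr, prefers_right=True):
    """Proportion of trials per heading on which an observer reading this neuron
    would choose 'rightward' (firing rate above the reference; ties count half)."""
    points = {}
    for heading, rates in sorted(fr_by_heading.items()):
        above = sum(r > reference_fr for r in rates)
        ties = sum(r == reference_fr for r in rates)
        p_right = (above + 0.5 * ties) / len(rates)
        points[heading] = p_right if prefers_right else 1.0 - p_right
    return points


# Made-up firing rates (spikes/s) for a neuron preferring rightward headings.
pre = {-4: [8, 9, 10], 0: [14, 15, 16], 4: [20, 21, 22]}
post = {-4: [6, 7, 8], 0: [11, 12, 13], 4: [17, 18, 19]}

# The same pre-recalibration reference is used for both blocks.
reference = mean(r for rates in pre.values() for r in rates)
pre_curve = neurometric_points(pre, reference)
post_curve = neurometric_points(post, reference)
```

In this toy case the post-recalibration point at 0° drops from 0.5 to 0, which corresponds to a rightward shift of the neurometric curve. Fitting a cumulative Gaussian to such points would then yield the neurometric PSE.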

For this MSTd neuron, the vestibular neurometric curve shifted to the right, while the visual neurometric curve shifted to the left. Thus, the shifts in vestibular and visual tuning were consistent with the perceptual shifts. For subsequent (group) analyses, only neurons that both: (1) were significantly tuned to the respective (visual or vestibular) stimulus, and (2) had reliable neurometric PSEs, were included (see Methods and Figure 3—figure supplement 1 for details). Neuronal shifts were calculated, similar to perceptual shifts, by the difference between the post- vs. pre-recalibration neurometric curves’ PSEs. The MSTd neuronal shifts were significantly correlated with the perceptual shifts, both for vestibular and visual cues (r = 0.62, p = 0.019, N = 14, and r = 0.38, p = 2.7 × 10⁻³, N = 59, respectively; Pearson correlations, data pooled across monkeys). Similar results were found when analyzing the monkeys individually (Figure 3—source data 1) and when using a linear mixed model (LMM) which took into account differences between individual monkeys (the LMM did not provide a better fit vs. the pooled model; Figure 3—source data 2). Therefore, in area MSTd neuronal recalibration occurs in accordance with perceptual recalibration, both for vestibular and visual cues.
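The population analysis reduces to a Pearson correlation between per-neuron neuronal PSE shifts and the perceptual PSE shifts of the corresponding sessions. The sketch below uses made-up shift values and a hand-written `pearson_r` helper; it only illustrates how the sign of r is read.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up PSE shifts (deg), one entry per neuron/session pair.
perceptual = [-2.0, -1.0, 0.5, 1.5, 3.0]
neuronal_same = [2.0 * p + 1.0 for p in perceptual]   # shifts with perception
neuronal_contrary = [-p for p in perceptual]          # shifts against perception

r_same = pearson_r(perceptual, neuronal_same)          # positive r: same direction
r_contrary = pearson_r(perceptual, neuronal_contrary)  # negative r: contrary direction
```

A positive coefficient indicates that neuronal and perceptual curves shifted in the same direction, and a negative coefficient indicates contrary shifts.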

Vestibular tuning in PIVC shifted in accordance with vestibular perceptual shifts

In PIVC, a similar result was observed for vestibular tuning. The example vestibular neurometric curve (Figure 4A, top right) shifted to the right, which was consistent with the vestibular perceptual shift (Figure 4A, top left). Across the population of PIVC neurons, a significant positive correlation was seen between the neuronal and perceptual shifts for the vestibular cue (r = 0.80, p = 9.7 × 10⁻⁶, N = 30, Pearson correlation, data pooled across monkeys; Figure 4B, left panel). Similar results were found when analyzing the monkeys individually (Figure 4—source data 1) and when using an LMM (the LMM did not provide a better fit vs. the pooled model; Figure 4—source data 2).

In general, the PIVC neurons did not demonstrate robust responses to the visual stimuli. This example neuron was not significantly tuned to the visual stimuli (Figure 4A, bottom, middle), thus it had poor visual neurometric curves (Figure 4A, bottom, right) and was excluded from further visual (group) analyses. The correlation between the neuronal and perceptual shifts (performed for those neurons that did pass screening) was not significant for the visual cue (r = 0.26, p = 0.47, N = 10, Pearson correlation, data pooled across monkeys). Similar results were found when analyzing the monkeys individually (Figure 4—source data 1) and when using an LMM (the LMM did not provide a better fit vs. the pooled model; Figure 4—source data 2). A Bayesian Pearson correlation (BF10 = 0.49) supported neither the alternative hypothesis (H1) of a correlation between neuronal and perceptual shifts for the visual cue, nor the null hypothesis (H0). The lack of support for or against visual recalibration in PIVC primarily reflects the lack of robust tuning to visual heading stimuli in PIVC.

Neuronal tuning in VIP to both vestibular and visual stimuli shifted according to vestibular perceptual shifts

Figure 5A presents an example neuron from VIP. The vestibular neurometric curve shifted rightward (Figure 5A, top right), in accordance with the vestibular perceptual shift (Figure 5A, top left). Surprisingly, the visual neurometric curve also shifted rightward (Figure 5A, bottom right). This was unexpected because the visual psychometric curve shifted leftward (Figure 5A, bottom left). Thus, while the vestibular and visual behavioral psychometric curves shifted in opposite directions (toward each other) the vestibular and visual neurometric curves shifted together, in accordance with the vestibular (not visual) perceptual shift.

Across the population of VIP neurons, the vestibular neurometric shifts were significantly positively correlated with the vestibular perceptual shifts (r = 0.77, p = 2.7 × 10⁻⁸, N = 37, Pearson correlation, data pooled across monkeys; Figure 5B, left). Similar results were found when analyzing the monkeys individually (Figure 5—source data 1) and when using an LMM (the LMM did not provide a better fit vs. the pooled model; Figure 5—source data 2). Like in MSTd and PIVC, the positive correlation coefficient indicates that neuronal and behavioral curves shifted in the same direction for the vestibular cue.

By contrast, the visual neurometrics in VIP shifted in the opposite direction to the visual perceptual shifts. Neuronal and perceptual shifts for the visual cue were negatively correlated (r = −0.68, p = 8.4 × 10⁻⁷, N = 42, Pearson correlation, data pooled across monkeys; Figure 5B, right). Similar results were found when analyzing the monkeys individually (Figure 5—source data 1) and when using an LMM (the LMM did not provide a better fit vs. the pooled model; Figure 5—source data 2). This exposes a striking mismatch between visual neuronal responses in VIP and visual perceptual function. It also exposes a striking mismatch between visual tuning in MSTd (which shifted in the same direction as visual perception) in comparison to visual tuning in area VIP (which shifted contrary to visual perception).

To test whether this mismatch between behavior and tuning for visual cues in VIP relates to specific subtypes of neurons, we sorted the VIP data into three subsets: multisensory neurons (respond significantly to both vestibular and visual stimuli), and two groups of unisensory neurons (respond significantly exclusively to vestibular or visual stimuli). Similar results were seen for both multisensory and unisensory neurons (the neuronal–perceptual correlations remained consistently positive and negative for vestibular and visual cues, respectively; Figure 5—figure supplement 1A). We further sorted the multisensory neurons into those with congruent and opposite vestibular and visual heading preferences (Chen et al., 2011a; Gu et al., 2006) with no observable differences (Figure 5—figure supplement 1B). Therefore, the contrary shifts of visual tuning in VIP seem to reflect a general feature of this cortical area, rather than an anomaly of a subgroup of neurons.

Temporal evolution of the correlation between neuronal and perceptual shifts

The neurometric curves in Figures 3–5 were calculated using mean FRs averaged across the stimulus duration. However, the self-motion stimuli generated by the platform and optic flow followed a specific dynamic time course, specifically, a Gaussian velocity profile and correspondingly a biphasic acceleration profile (see bottom row, Figure 6). Therefore, we further examined whether the correlations between neurometric and perceptual shifts depend on the time point within the stimulus interval.
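The relation between a Gaussian velocity profile and its biphasic acceleration can be made concrete with a short sketch. The stimulus duration, peak time, width, and peak velocity below are hypothetical values chosen for illustration; the section does not state them.

```python
import math


def gaussian_velocity(t, t_peak, sigma, v_peak):
    """Gaussian velocity profile peaking at t_peak."""
    return v_peak * math.exp(-((t - t_peak) ** 2) / (2.0 * sigma ** 2))


def acceleration(t, t_peak, sigma, v_peak):
    """Analytic time derivative of the Gaussian velocity profile (biphasic)."""
    return -(t - t_peak) / sigma ** 2 * gaussian_velocity(t, t_peak, sigma, v_peak)


# Hypothetical parameters: 1 s stimulus, velocity peak at 0.5 s, sigma of 0.15 s.
T_PEAK, SIGMA, V_PEAK = 0.5, 0.15, 0.3
times = [i / 100.0 for i in range(101)]
acc = [acceleration(t, T_PEAK, SIGMA, V_PEAK) for t in times]

t_max_acc = times[acc.index(max(acc))]   # peak acceleration, at T_PEAK - SIGMA
t_max_dec = times[acc.index(min(acc))]   # peak deceleration, at T_PEAK + SIGMA
```

Acceleration is positive before the velocity peak, crosses zero at the peak, and is negative afterward, so the two acceleration extrema flank the middle of the stimulus by one sigma on either side.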

Figure 6. Recalibration of neuronal responses within the stimulus time course.

Pearson correlations between neuronal and perceptual point of subjective equality (PSE) shifts, using the neuronal activity at specific time points during the stimulus, for (A) dorsal medial superior temporal (MSTd), (B) parietoinsular vestibular cortex (PIVC), and (C) ventral intraparietal (VIP). Top row: vestibular (blue histograms), middle row: visual (red histograms), bottom row: stimulus (acceleration and velocity) time course. Vertical dashed lines mark peak acceleration and peak deceleration. ‘*’ symbols mark significant correlations.

For MSTd neurons, positive correlations (between neuronal and perceptual shifts) were seen for both vestibular and visual cues during the stimulus (Figure 6A). Correlations increased toward the middle of the stimulus, and dropped off rapidly at the end of the stimulus. Significant correlations (blue and red asterisk markers for vestibular and visual cues, respectively) were only seen around the middle of the stimulus. Thus neural recalibration in MSTd (in accordance with behavioral recalibration) could reflect the velocity responses.

For PIVC neurons, positive correlations (between neuronal and perceptual shifts) were seen only for vestibular cues, during the stimulus (upper panel in Figure 6B). Like MSTd, the vestibular correlations seemed to follow the velocity profile of the stimulus, with significant values around the middle of the stimulus. Correlations in the visual condition were very weak and not significant (middle panel in Figure 6B).

A very different profile was seen in VIP. Firstly, as described above, correlations between neuronal and perceptual recalibration were positive for the vestibular cue (upper panel in Figure 6C) and negative for the visual cue (middle panel in Figure 6C). Furthermore, the time course of these correlations was different in VIP: they increased in size gradually (positively for vestibular and negatively for visual), reaching a maximum around the middle of the stimulus epoch (the velocity peak), but significant correlations were found for time intervals beyond the end of the stimulus. This pattern is in line with sustained neuronal activity described previously for VIP. However, here this sustained activity correlated with subsequent vestibular choices, and was contrary to visual choices.

VIP choice signals are reduced after cross-modal recalibration

Previous studies have found that neuronal responses in VIP are strongly influenced (sometimes even dominated) by choice signals (Chen et al., 2021; Zaidel et al., 2017). Hence our finding here, that neuronal tuning recalibrated contrary to perceptual shifts for the visual cue, was surprising and counterintuitive. We, therefore, wondered what happened to the strong choice signals for which VIP is renowned, which would predict that neuronal tuning (also for visual cues) would shift with behavior.

To visualize choice tuning for an example VIP neuron, we plotted ‘choice-conditioned’ tuning curves, namely, neuronal responses as a function of heading, separately for rightward and leftward choices (Figure 7). In the pre-recalibration block vestibular responses were strongly choice related (Figure 7A, left panel) – neuronal responses to the same heading stimulus were larger when followed by rightward (►, blue) vs. leftward (◄, cyan) choices (the blue line lies above the cyan line). After recalibration, the choice effect decreased (Figure 7A, right panel) – the choice-conditioned tuning curves were no longer separate. Similarly, visual responses were strongly choice-related pre-recalibration, and this decreased post-recalibration (Figure 7B). To quantify the choice (and sensory) components of neuronal activity, and to observe how these changed after recalibration, we applied a partial correlation analysis (Zaidel et al., 2017). For this example neuron, the partial choice correlation values (Rc, presented on the plots) were reduced both for vestibular and visual cues.

Choice tuning is reduced post-recalibration in an example ventral intraparietal (VIP) neuron.

Neuronal responses for an example VIP neuron to (A) vestibular and (B) visual heading stimuli, pre- and post-recalibration (left and right columns, respectively). Blue and cyan curves depict choice-conditioned tuning curves (neuronal responses followed by rightward and leftward choices, respectively) for the vestibular cue. Red and magenta curves depict choice-conditioned tuning curves for the visual cue. Black curves (in the corresponding plots) represent all responses (not sorted by choice). Partial heading (Rh) and partial choice (Rc) correlations (with corresponding p values) are presented on the plots.

Across our sample of VIP neurons, the choice partial correlations in the post-recalibration block were significantly reduced compared to the pre-recalibration block, for both vestibular and visual cues (p = 6.0 × 10⁻⁴ and p = 1.3 × 10⁻³, respectively, paired t-tests; Figure 8B). However, the heading partial correlations (Rh) did not differ significantly from pre- to post-recalibration, for either vestibular or visual cues (p = 0.96 and p = 0.85, respectively, paired t-tests; Figure 8A). For these statistical comparisons and for plotting we used the squared partial correlations (which quantify the amount of unique variance explained by choice or heading). We did not observe any significant changes in partial correlations in areas PIVC and MSTd (Figure 8—figure supplement 1). Lastly, there was no evidence for differences between post- and pre-recalibration baseline FRs in any of the three areas (Figure 8—figure supplement 2). Thus, shifts in neuronal tuning are not explained by changes in baseline activity.

Figure 8 with 2 supplements
Choice tuning is reduced in ventral intraparietal (VIP) post-recalibration.

(A) Heading and (B) choice partial correlation coefficients (squared) are depicted post- vs. pre-recalibration. Blue and red circles (top and bottom rows) represent vestibular and visual cues, respectively. Filled (empty) circles indicate significant (non-significant) partial correlations for heading or choice. p values are presented on the corresponding plots (two-tailed paired t-tests).

Discussion

This study provides the first demonstration of unsupervised (cross-modal) neuronal recalibration, in conjunction with perceptual recalibration, in single sessions. Single neurons from MSTd, PIVC, and VIP revealed clear but different patterns of recalibration. In MSTd, neuronal responses to vestibular and visual cues shifted – each according to their respective cues’ perceptual shifts. In PIVC, vestibular tuning similarly shifted in the same direction as vestibular perceptual shifts (the PIVC cells were not robustly tuned to visual stimuli). However, recalibration in VIP was notably different: both vestibular and visual neuronal tuning shifted in the direction of the vestibular perceptual shifts. Thus, visual neuronal tuning shifted, surprisingly, contrary to visual perceptual shifts. These results indicate that neuronal recalibration differs profoundly across multisensory cortical areas.

Neural correlates of vestibular–visual recalibration

To investigate the neuronal bases of unsupervised cross-modal recalibration, we first replicated the perceptual results from our previous study (Zaidel et al., 2011). Indeed, in the presence of a systematic vestibular–visual heading offset (with no external feedback) vestibular and visual cues both shifted in the direction required to reduce the cue conflict. And, as before, the vestibular shifts were larger compared to the visual shifts. Thus we confirmed robust recalibration of vestibular and visual cues, resulting from a systematic discrepancy between the cues’ headings in an unsupervised context (i.e., without external feedback).

Since there was no external feedback regarding which cue was (in)accurate, unsupervised recalibration is driven by the cue conflict, presumably through an internal mechanism to maintain consistency between vestibular and visual perceptual estimates (Zaidel et al., 2011). Accordingly, we expected to see neuronal correlates of perceptual recalibration in early multisensory areas related to self-motion perception (Zierul et al., 2017), specifically: MSTd, which primarily responds to visual (but also vestibular) self-motion stimuli, and PIVC, which primarily responds to vestibular stimuli. We further expected that the neuronal recalibration in MSTd and PIVC would propagate to higher-level multisensory area VIP.

In MSTd, we indeed found that both visual and vestibular neuronal signals shifted, each in accordance with their corresponding cue’s perceptual shifts. Hence, recalibration of visual self-motion responses was observed at least at the level of MSTd, which is the primary area in the visual hierarchy to respond to large field optic flow stimuli (Britten, 2008; Britten and van Wezel, 1998; Britten and Van Wezel, 2002; Duffy and Wurtz, 1995; Gu et al., 2008; Gu et al., 2012; Gu et al., 2006; Wurtz and Duffy, 1992). We cannot ascertain whether recalibration of visual responses occurred already in earlier visual regions, such as the middle temporal visual area, which projects to MSTd (Maunsell and van Essen, 1983; Ungerleider and Desimone, 1986), or whether it occurred only at the level of MSTd. Because MSTd is mainly a visual area, the recalibration of vestibular signals observed in MSTd likely occurred in upstream vestibular areas that project to MSTd, such as PIVC (Chen et al., 2010; Chen et al., 2011a). Indeed, robust vestibular recalibration (that was in line with the vestibular perceptual shifts) was observed in PIVC. Hence, neuronal correlates of perceptual recalibration were observed in relatively early multisensory areas related to self-motion perception (MSTd and PIVC).

Modality-specific recalibration of vestibular and visual cues

Results from this experiment exposed modality-specific neuronal recalibration (in MSTd and PIVC). Namely, visual and vestibular tuning curves shifted differently (in opposite directions). This provides neuronal evidence against ‘visual dominance’, even for short-term recalibration (in single sessions). Rather, it supports the idea that cross-modal neuronal recalibration occurs also for visual (and not only for non-visual) cues. Furthermore, it exposes neuronal mechanisms to maintain internal consistency between vestibular and visual cues. This dynamic cross-modal plasticity may underlie our adept ability to adapt to sensory conflict commonly experienced in many modes of transport (on land, at sea, or in flight).

In a recent set of complementary studies, we tested supervised self-motion recalibration, by providing external feedback regarding cue accuracy (Zaidel et al., 2021; Zaidel et al., 2013). There, we found that supervised recalibration is a high-level cognitive process that compares the combined-cue (multisensory) estimate to feedback from the environment. Behaviorally, this resulted in ‘yoked’ recalibration – both cues shifted in the same direction, to reduce conflict between the combined estimate and external feedback (Zaidel et al., 2013). Neuronally, robust recalibration of both vestibular and visual neuronal tuning was seen in VIP, such that tuning for both cues shifted together, in accordance with the behavior (Zaidel et al., 2021).

However, because the shifts for both vestibular and visual cues were in the same direction in the supervised recalibration studies, neuronal tuning was also expected to shift in the same direction for both cues. Thus, we could not dissociate there whether neuronal shifts for a particular cue (e.g., visual) indeed followed the behavioral shifts for that cue (visual) or, less intuitively, the other cue (vestibular). By contrast, the unsupervised paradigm, tested in this study, elicits visual and vestibular shifts in opposite directions. It could thereby expose the (unexpected) finding that visual tuning in VIP actually shifts with vestibular (rather than visual) behavioral shifts.

The results here therefore also shed new light on the neuronal shifts observed in VIP after supervised recalibration (Zaidel et al., 2021). They indicate that yoking of visual and vestibular tuning is observed in VIP irrespective of the paradigm (supervised or unsupervised). Hence, yoked recalibration may be a feature of VIP, not just a feature of supervised recalibration.

Contrary recalibration in higher-level area VIP

VIP is a higher-level multisensory area (Bremmer et al., 2002; Colby et al., 1993; Duhamel et al., 1998; Schlack et al., 2002; Schlack et al., 2005; Schroeder and Foxe, 2002) with clear vestibular and visual heading selectivity (Chen et al., 2011a; Chen et al., 2011b). But the nature of these self-motion signals in VIP is not fully understood. In contrast to our prediction that recalibrated signals in MSTd and PIVC would simply propagate to VIP, we found a different and unexpected pattern of recalibration in VIP. While vestibular tuning shifted in line with vestibular perceptual shifts (like MSTd and PIVC), visual tuning shifted opposite in direction to the visual perceptual shifts (and opposite in direction to MSTd visual recalibration). These findings indicate that visual responses in VIP do not reflect a simple feed-forward projection from MSTd. They also suggest that visual responses in VIP are not decoded for heading perception (otherwise these would not shift in opposite directions). This interpretation is in line with findings that inactivation (Chen et al., 2016) and microstimulation (Yu and Gu, 2018) in VIP do not affect perceptual decisions. Thus, the convergence of visual and vestibular signals in VIP likely serves purposes other than cue integration.

We previously found strong choice-related activity in VIP neurons (Zaidel et al., 2017). Accordingly, we considered that shifts in VIP neuronal tuning (after supervised recalibration) might simply reflect the altered choices (Zaidel et al., 2021). However, choice-related activity cannot explain the results here, because the predicted shifts in neuronal tuning would be in the same direction as the altered choices (perceptual shifts), whereas we found contrary visual recalibration. To understand contrary shifts that could arise despite strong choice-related activity in VIP, we investigated choice tuning pre- and post-recalibration in VIP neurons. We found that choice tuning in VIP decreased after unsupervised recalibration. This allowed contrary shifts to be exposed, and opens up new and fascinating questions regarding the purpose of contrary visual recalibration in VIP.

Because visual and vestibular tuning in VIP both shifted in the same direction (in accordance with vestibular perceptual shifts) we speculate that VIP recalibration reflects a global shift in the vestibular reference frame. This notion is consistent with suggestions that VIP encodes self-motion and tactile stimuli in head or body-centered coordinates (Avillac et al., 2005; Avillac et al., 2004; Chen et al., 2013b; Chen et al., 2018; Zhang et al., 2004), and that visual signals in VIP are remapped within these coordinates (Avillac et al., 2005; Sereno and Huang, 2014). Accordingly, visual responses in VIP are transformed into a vestibular-recalibrated space. This leads to a remarkable dissociation between visual tuning in VIP and MSTd. Interestingly, visual self-motion perception follows the MSTd (not VIP) recalibration. This is in line with a causal connection between MSTd and visual heading discrimination (Britten and van Wezel, 1998; Gu et al., 2012).

What purpose might such visual signals in VIP serve? One possible idea is that they might reflect an expectation signal – for example, predicted vestibular or somatosensory sensation, based on the current visual signal. During combined stimuli (in the recalibration and post-recalibration blocks), the visual signal always appeared together with the vestibular sensory input. Thus, if visual responses in VIP reflect vestibular expectations, then these would shift together with vestibular (rather than visual) recalibration.

Limitations and future directions

Our results revealed correlations between neuronal recalibration and perceptual recalibration. However, they do not establish any causal connections. Therefore, whether these cortical areas are actively involved in cross-modal recalibration (i.e., play a causal role) vs. simply reflecting the recalibrated signals (without playing a causal role) requires further research. To probe more directly for causal links, direct manipulation of neuronal activity might be required. For example, would reversible inactivation or microstimulation (of one or a combination of these multisensory areas) eliminate (or bias) unsupervised recalibration? In addition, future studies are needed to examine how the systematic error between vestibular and visual heading signals is detected. This likely involves additional brain areas, for example, the cerebellum, implicated in internal-model-based error monitoring (Markov et al., 2021; Rondi-Reig et al., 2014), and/or the anterior cingulate cortex, implicated in conflict monitoring (Bush et al., 2000; Holroyd and Coles, 2002). Thus, a wide-ranging effort to record and manipulate neural activity across a variety of brain regions will be necessary to tease apart the circuitry underlying this complex and important function.

The lack of evidence for (or against) visual recalibration in PIVC primarily reflects the lack of robust tuning to visual heading stimuli. We interpret the observed shifts in vestibular tuning in PIVC as lower-level, sensory, recalibration (similar to MSTd) based on the broader understanding that PIVC encodes lower-level vestibular signals, with transient time courses, and impoverished visual tuning (Chen et al., 2016; Chen et al., 2021). Our results are in line with this interpretation, and there is no reason to suspect that PIVC reflects more complex multisensory recalibration (like VIP). Nonetheless, the data could also be in line with alternative interpretations. A broader range of headings, and analyses beyond neurometrics, would be required to better understand whether (and how) visual signals in PIVC might be recalibrated.

The most surprising and intriguing finding in this study was the contrary recalibration of visual tuning in VIP. We propose that yoked recalibration of visual and vestibular responses in VIP (despite differential perceptual recalibration) might reflect a global shift in vestibular space. Accordingly, we suggest that visual responses in VIP might reflect an expectation signal (in vestibular space), for example, a simulation of the expected corresponding vestibular response (or integrated position, because VIP responses are sustained beyond the stimulus period). However, this idea is speculative, and the data from this study cannot address this question. Hence, further research is needed to investigate this idea, for example, by conditioning expectations for vestibular motion on other (non-motion) cues, and investigating whether these cues can induce simulated vestibular responses. If this hypothesis turns out to be true, it could greatly contribute to our understanding regarding the functions of the parietal cortex, and the brain mechanisms of perceptual inference.

Concluding remarks

This study exposed modality-specific recalibration of neuronal signals, resulting from a cross-modal (visual–vestibular) cue conflict. It further revealed profound differences in neuronal recalibration across multisensory cortical areas MSTd, PIVC, and VIP. The results therefore provide novel insights into adult multisensory plasticity, and deepen our understanding regarding the different functions of these multisensory cortical areas.

Methods

Subjects and surgery

Three male rhesus monkeys (Macaca mulatta, monkeys D, B, and K) weighing 8–10 kg participated in the experiment. The monkeys were first trained to sit in a custom primate chair and gradually exposed to the laboratory environment. Then the monkeys were chronically implanted with a head-restraint cap and a scleral coil for measuring eye movements. After full recovery, the monkeys were trained to perform experimental tasks. All animal surgeries and experimental procedures were approved by the Institutional Animal Care and Use Committee at East China Normal University (IACUC protocol number: Mo20200101).

Equipment setup and motion stimuli

During the experiments, the monkeys were head-fixed and seated in a primate chair which was secured to a six degrees of freedom motion platform (Moog, East Aurora, NY, USA; MB-E-6DOF/12/1000 kg). The chair was also inside a magnetic field coil frame (Crist Instrument Co, Inc, Hagerstown, MD, USA) mounted on the platform for measuring eye movements with the scleral coil technique (for details, see Zhao et al., 2021).

Vestibular stimuli corresponded to linear movements of the platform (for details, see Chen et al., 2013a; Gu et al., 2006; Zhao et al., 2021). Visual stimuli were presented on a large computer screen (Philips BDL4225E, Royal Philips, Amsterdam, Netherlands), attached to the field coil frame. The display (62.5 cm × 51.5 cm) was viewed from a distance of 43 cm, thus subtending a visual angle of 72° × 62°. The sides of the coil frame were covered with a black enclosure, so the monkey could only see the visual stimuli on the screen (Gu et al., 2006; Zhao et al., 2021). The display had a pixel resolution of 1920 × 1080 and was updated at 60 Hz. Visual stimuli were programmed in OpenGL to simulate self-motion through a 3D cloud of ‘stars’ that occupied a virtual cube space 80 cm wide, 80 cm tall, and 80 cm deep, centered on the central fixation point on the screen. The ‘star’ density was 0.01/cm³. Each ‘star’ comprised a triangle (base × height: 0.15 cm × 0.15 cm). Monkeys wore custom stereo glasses made from Wratten filters (red #29 and green #61; Barrington, NJ, USA), such that the optic flow stimuli could be rendered in three dimensions as red-green anaglyphs.
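The display geometry stated above can be checked with a short calculation. The following is a minimal sketch (not the authors' code), assuming the screen is flat and centered on the line of sight; the function name `visual_angle_deg` is hypothetical. It also computes the expected number of ‘stars’ implied by the stated density and cube size, assuming a uniform density.

```python
import math

def visual_angle_deg(size_cm, distance_cm):
    """Full angle (deg) subtended by a flat extent centered on the line of sight."""
    return 2.0 * math.degrees(math.atan((size_cm / 2.0) / distance_cm))

# Display dimensions and viewing distance taken from the text.
width_deg = visual_angle_deg(62.5, 43.0)
height_deg = visual_angle_deg(51.5, 43.0)

# Expected star count: density (stars per cm^3) times cube volume (cm^3).
# Assumes uniform density throughout the 80 cm cube.
n_stars = 0.01 * 80 ** 3

print(f"{width_deg:.1f} deg x {height_deg:.1f} deg, ~{n_stars:.0f} stars")
```

Under these assumptions the computed angles round to the 72° × 62° reported in the text.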

The self-motion stimulus was either vestibular-only, visual-only, or combined (visual and vestibular stimuli). In the vestibular-only condition, there was no optic flow on the screen and the monkey was translated by the motion platform. In the visual-only condition, the motion platform remained stationary while optic flow was presented on the screen. For the combined condition, the monkeys experienced both translation and optic flow simultaneously. Each motion stimulus followed a Gaussian velocity profile with a duration of 1 s, and a displacement amplitude of 13 cm (bottom row, Figure 6). The peak velocity was 0.41 m/s, and the peak acceleration was 2.0 m/s².
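The stated stimulus parameters can be related to one another with a minimal sketch. It assumes an untruncated Gaussian velocity profile centered at t = 0.5 s; the width parameter `sigma_s` is not given in the text and is inferred here from the displacement and peak velocity, so it should be read as an assumption rather than a reported value.

```python
import math

DISPLACEMENT_M = 0.13      # total displacement, from the text
PEAK_VELOCITY_MPS = 0.41   # peak velocity, from the text

# For v(t) = v_peak * exp(-(t - t0)^2 / (2 * sigma^2)), the integral over time
# is v_peak * sigma * sqrt(2 * pi). Solving for sigma (inferred, not reported):
sigma_s = DISPLACEMENT_M / (PEAK_VELOCITY_MPS * math.sqrt(2.0 * math.pi))

# |dv/dt| is maximal at t0 +/- sigma, where it equals v_peak / (sigma * sqrt(e)).
peak_accel = PEAK_VELOCITY_MPS / (sigma_s * math.sqrt(math.e))

def velocity(t, t0=0.5):
    """Gaussian velocity (m/s) at time t (s); t0 = 0.5 s is an assumed midpoint."""
    return PEAK_VELOCITY_MPS * math.exp(-((t - t0) ** 2) / (2.0 * sigma_s ** 2))

# Numerical check: integrate velocity over the 1-s stimulus (rectangle rule).
dt = 0.001
travelled = sum(velocity(k * dt) * dt for k in range(1000))

print(f"sigma = {sigma_s:.3f} s, peak accel = {peak_accel:.2f} m/s^2, "
      f"displacement = {travelled:.3f} m")
```

With these assumptions the inferred peak acceleration comes out close to the reported 2.0 m/s², and the two acceleration extrema (peak acceleration and peak deceleration) fall symmetrically around the velocity peak.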

Task and recalibration protocol

The monkeys were trained to report their perceived direction of self-motion in a two-alternative forced-choice (2AFC) heading discrimination task (for details, see Chen et al., 2013a; Gu et al., 2008). In each trial, the monkey primarily experienced a forward motion with a small leftward or rightward component. During stimulation, the monkey was required to maintain fixation on a central point, within a 3° × 3° window. At the end of the trial (after a 300-ms delay period beyond the end of the stimulus), the monkeys needed to make a saccade to one of two targets (located 5° to the left and right of the central fixation point) to report their motion percept as leftward or rightward relative to straight ahead. The saccade endpoint had to remain within 2.5° of the target for at least 150 ms to be considered a valid choice. Correct responses were rewarded with a drop of liquid.

To elicit recalibration, we used an unsupervised cue-conflict recalibration protocol previously tested behaviorally in humans and monkeys (Zaidel et al., 2011). Each experimental session consisted of three consecutive blocks, as described below.

Pre-recalibration block

This block was used to deduce the baseline performance (psychometric curve) of each modality for the monkeys, thus only a single-cue (vestibular-only or visual-only) stimulus was presented (Figure 1A). Across trials, the heading angle was varied in small steps around straight ahead. Ten logarithmically spaced heading angles were tested for each monkey (±16°, ±8°, ±4°, ±2°, and ±1°). To accustom the monkeys to not getting a reward for all the trials, they were rewarded with 95% probability for correct choices, and not rewarded for incorrect choices.

Recalibration block

Only combined vestibular–visual cues were presented in this block (Figure 1B). A discrepancy (Δ) between the vestibular and visual cues was introduced gradually from 2° to 10° (or −2° to −10°) in steps of 2°, and then held at ±10° for the rest of the block. This gradual introduction was applied to prevent the monkeys from noticing the discrepancy. The sign of Δ represents the orientation of the discrepancy: positive Δ (marked by Δ+) indicates that the vestibular and visual cues were systematically offset to the right and to the left, respectively, and vice versa for negative Δ (Δ−). Only one discrepancy orientation (Δ+ or Δ−) was used per session. The combined stimulus headings followed the same ten headings as the single-cue stimuli in the pre-recalibration block. For the combined stimuli, the vestibular and visual headings were each offset by Δ/2 (to opposite sides), such that the combined heading was defined in the middle between the vestibular and visual headings. Unlike the pre-recalibration block, monkeys only needed to maintain fixation on the central fixation point during the stimulus presentation and did not need to make choices at the end of trials. They were rewarded for all the trials for which they maintained fixation. 7–10 repetitions were run for each Δ increment, and an additional 10–16 repetitions were run for maximum Δ (±10°).
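The construction of the conflict stimuli can be sketched as follows. This is an illustration under the sign convention that positive headings are rightward; the function names `ramp_schedule` and `combined_cue_headings` are hypothetical and not taken from the experimental software.

```python
def ramp_schedule(sign, step_deg=2.0, max_delta_deg=10.0):
    """Discrepancy values introduced gradually: 2, 4, ..., 10 deg (times sign)."""
    n_steps = int(max_delta_deg / step_deg)
    return [sign * step_deg * k for k in range(1, n_steps + 1)]

def combined_cue_headings(combined_deg, delta_deg):
    """Return (vestibular, visual) headings for a combined-cue trial.

    Each cue is offset by delta/2 to opposite sides of the combined heading.
    Positive delta places the vestibular cue to the right (positive) and the
    visual cue to the left (negative), as described in the text.
    """
    vestibular = combined_deg + delta_deg / 2.0
    visual = combined_deg - delta_deg / 2.0
    return vestibular, visual

# Example: a straight-ahead combined heading with the full +10 deg discrepancy.
print(ramp_schedule(+1))                 # gradual introduction of the offset
print(combined_cue_headings(0.0, 10.0))  # vestibular right, visual left
```

Note that the difference between the vestibular and visual headings always equals Δ, irrespective of the combined heading.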

Post-recalibration block

During this block, performance for the individual (visual/vestibular) modalities was once again tested using single-cue trials (as in the pre-recalibration block). Responses to these trials were used to measure recalibration. The single-cue trials were interleaved with combined-cue trials (with a ±10° discrepancy, like the end of the recalibration block, Figure 1C). The combined-cue trials were interleaved to maintain recalibration while it was measured (for details, see Zaidel et al., 2011). To avoid perturbing the recalibrated behavior, we adjusted the reward probability for single-cue trials as follows. If the single-cue heading was of relatively large magnitude, such that the other cue would also have lain to the same side (right or left) had the heading been part of a combined-cue trial, monkeys were rewarded as in the pre-recalibration block (95% reward probability for correct choices; no reward for incorrect choices). If, however, the heading for the other modality would have been to the opposite side, the monkeys were rewarded stochastically (70% reward probability, regardless of their choices).
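One reading of this reward rule is sketched below. It assumes that the implied heading of the other cue is obtained by applying the full discrepancy Δ (vestibular minus visual) to the tested single-cue heading, and that positive headings are rightward. The function names are hypothetical, and the treatment of an implied heading of exactly 0° (which does not arise with the headings used) is an arbitrary choice.

```python
def other_cue_heading(heading_deg, cue, delta_deg):
    """Heading the other cue would have had in a combined-cue trial.

    In combined trials, vestibular - visual = delta (assumed convention).
    """
    if cue == "vestibular":
        return heading_deg - delta_deg
    if cue == "visual":
        return heading_deg + delta_deg
    raise ValueError("cue must be 'vestibular' or 'visual'")

def reward_probability(heading_deg, cue, delta_deg, choice_correct):
    """Reward probability for a single-cue trial in the post-recalibration block."""
    other = other_cue_heading(heading_deg, cue, delta_deg)
    same_side = heading_deg * other > 0  # both cues left, or both right
    if same_side:
        # Rewarded as in the pre-recalibration block.
        return 0.95 if choice_correct else 0.0
    # Cues would lie on opposite sides: reward is independent of the choice.
    return 0.70

# Example with delta = +10 deg: a 16 deg vestibular heading implies a 6 deg
# visual heading (same side), whereas an 8 deg vestibular heading implies a
# -2 deg visual heading (opposite side).
print(reward_probability(16, "vestibular", 10, choice_correct=True))
print(reward_probability(8, "vestibular", 10, choice_correct=False))
```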

Electrophysiological recordings

We recorded extracellular activity from isolated single neurons in areas MSTd, PIVC, and VIP using tungsten microelectrodes (Frederick Haer Company, Bowdoin, ME, USA; tip diameter ~3 μm; impedance, 1–2 MΩ at 1 kHz). The microelectrode was advanced into the cortex through a transdural guide tube, using a hydraulic microdrive (Frederick Haer Company). Raw neural signals were amplified, band-pass filtered (400–5000 Hz), and digitized at 25 kHz using the AlphaOmega system (AlphaOmega Instruments, Nazareth Illit, Israel). Spikes were sorted online, and spike times along with all behavioral events were collected with 1-ms resolution using the Tempo system. If the online sorting was not adequate, offline spike sorting was performed.

The target areas (MSTd, PIVC, and VIP) were identified based on the patterns of gray and white matter transitions, magnetic resonance imaging scans, stereotaxic coordinates, and physiological response properties as described previously (MSTd: Gu et al., 2006; PIVC: Chen et al., 2010; VIP: Chen et al., 2011a).

Data analysis

Data analysis was performed with custom scripts in MATLAB R2016a (The MathWorks, Natick, MA, USA). Psychometric plots were constructed by fitting the proportion of ‘rightward’ choices as a function of heading angle with a cumulative Gaussian distribution function, using the psignifit toolbox for MATLAB (version 2.5.6). Separate psychometric functions were constructed for each cue (visual and vestibular) and block (pre- and post-recalibration). The psychophysical threshold and PSE were defined, respectively, by the standard deviation (SD, σ) and mean (μ) of the fitted Gaussian function. The PSE represents the heading angle that would be perceived as straight ahead, also known as the ‘bias’. Vestibular and visual recalibration (PSE shift) was calculated for each session by subtracting the pre-recalibration PSE from the post-recalibration PSE:

$$\text{PSE shift} = \text{PSE}_{\text{post}} - \text{PSE}_{\text{pre}} \tag{1}$$
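To illustrate how the PSE and threshold are obtained from choice data, the following is a minimal sketch of a cumulative Gaussian fit. It is not the psignifit procedure used in the study: it substitutes a plain maximum-likelihood grid search, written in Python rather than MATLAB, and omits lapse rates and confidence intervals. The function names, grids, and clipping constant are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def fit_cumulative_gaussian(headings, n_right, n_total, mu_grid, sigma_grid):
    """Grid-search maximum-likelihood fit of P(rightward) = Phi((h - mu) / sigma).

    Returns (mu, sigma): mu is the PSE and sigma the threshold.
    """
    best = None
    for mu in mu_grid:
        for sigma in sigma_grid:
            dist = NormalDist(mu, sigma)
            loglik = 0.0
            for h, k, n in zip(headings, n_right, n_total):
                # Clip probabilities to keep the logarithms finite.
                p = min(max(dist.cdf(h), 1e-6), 1.0 - 1e-6)
                loglik += k * math.log(p) + (n - k) * math.log(1.0 - p)
            if best is None or loglik > best[0]:
                best = (loglik, mu, sigma)
    return best[1], best[2]

def pse_shift(pse_pre, pse_post):
    """Equation 1: post-recalibration PSE minus pre-recalibration PSE."""
    return pse_post - pse_pre

# Hypothetical search grids (deg); resolution and range are arbitrary choices.
mu_grid = [0.25 * i for i in range(-20, 21)]
sigma_grid = [0.5 + 0.25 * i for i in range(31)]
```

Applying `fit_cumulative_gaussian` separately to the pre- and post-recalibration choices for one cue, and passing the two fitted means to `pse_shift`, yields the recalibration measure for that cue and session.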

Neuronal tuning curves were constructed for vestibular and visual cues, pre- and post-recalibration, by calculating the average (baseline subtracted) FR responses, as a function of heading. FR responses were calculated over the duration of stimulus presentation (t = 0–1 s), and baseline FRs were calculated (per block) by the average FR in the 1-s window before stimulus onset. A neuron was considered ‘tuned’ if the linear regression of FR responses by heading (over the narrow range of stimuli presented: −16° to 16°) had a significant slope (p < 0.05).

This selection criterion was selective for neurons that have sloped tuning around straight ahead, and excluded neurons with flat tuning, or a tuning preference, straight ahead. In the cortical areas of interest in this study, a disproportionately large number of neurons have steep tuning slopes around straight ahead (Chen et al., 2011b; Gu et al., 2010). These neurons are most informative for heading discrimination (large Fisher information, Gu et al., 2010). By contrast, neurons with relatively flat tuning around straight ahead are less informative for heading discrimination (low Fisher information). Accordingly, small shifts can be readily detected in neurons with sloped tuning (but not in those with flat tuning) around straight ahead. Therefore, in this study we focused on the prevalent neurons with sloped tuning around straight ahead.

Neurometrics

For each neuron recorded, neurometric curves (per cue and block) were constructed from the FRs (Chen et al., 2013a; Fetsch et al., 2011; Gu et al., 2008; Gu et al., 2007). For this, the FRs were first normalized (z-scored) by subtracting the pre-recalibration mean, and dividing by the pre-recalibration SD. The same (pre-recalibration) mean and SD values were used to normalize both the pre- and post-recalibration FRs (per cue). A common reference (pre-recalibration mean, corresponding to z-score = 0) was needed to expose PSE shifts (calculating neurometric curves by comparing responses to positive vs. corresponding negative headings assumes PSE = 0°).
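The normalization step can be sketched as follows. This assumes that FRs are pooled across trials (per cue) within each block and that the sample SD is used; neither detail is specified in the text, and the function name is hypothetical.

```python
from statistics import mean, stdev

def zscore_to_pre(pre_frs, post_frs):
    """Z-score pre- and post-recalibration FRs using the PRE-block mean and SD.

    Using the same reference for both blocks keeps z = 0 at the
    pre-recalibration mean, so a shift in post-block responses stays visible.
    """
    mu = mean(pre_frs)
    sd = stdev(pre_frs)  # sample SD (assumption; population SD is also plausible)
    z_pre = [(fr - mu) / sd for fr in pre_frs]
    z_post = [(fr - mu) / sd for fr in post_frs]
    return z_pre, z_post

# Toy firing rates (spikes/s), for illustration only.
z_pre, z_post = zscore_to_pre([8.0, 10.0, 12.0], [12.0, 14.0, 16.0])
print(z_pre, z_post)
```

Had each block been normalized by its own mean instead, both blocks would be centered on zero and any overall shift between them would be removed.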

Then, for each heading, an ROC (receiver operating characteristic) curve was computed by moving a ‘criterion’ value from the minimum to the maximum z-score (in 100 steps), and plotting the probability that the z-scores exceeded the criterion vs. whether z-score = 0 (the pre-recalibration mean) exceeded that same criterion, or not (1 or 0, respectively). A single point on the ROC curve was produced for each increment in the criterion. The area under the ROC curve reflects the probability that an ideal observer would discriminate the neuronal responses for the given heading to the neuron’s preferred (vs. non-preferred) side (right/left), in relation to the pre-recalibration mean. Then these values were mapped onto the probability of a rightward choice and fitted with a cumulative Gaussian function (similar to perceptual psychometrics).
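The ROC step can be sketched as below. This is one possible reading of the description, not the authors' implementation: the criterion range (`z_min`, `z_max`) is passed in explicitly, the end points (0, 0) and (1, 1) are appended so that the curve always spans the unit square, and the `prefers_right` flag used to map the area onto a rightward-choice probability is a hypothetical parameter.

```python
def roc_area_vs_reference(z_scores, z_min, z_max, n_steps=100, reference=0.0):
    """Area under the ROC curve comparing z-scored responses with a fixed reference.

    For each criterion, the y value is the fraction of z-scores exceeding it,
    and the x value is 1 if the reference (z = 0, the pre-recalibration mean)
    exceeds it, else 0.
    """
    points = [(0.0, 0.0)]
    for i in range(n_steps + 1):
        # Sweep the criterion from high to low so the curve runs from (0,0) to (1,1).
        criterion = z_max - (z_max - z_min) * i / n_steps
        hit = sum(z > criterion for z in z_scores) / len(z_scores)
        ref = 1.0 if reference > criterion else 0.0
        points.append((ref, hit))
    points.append((1.0, 1.0))
    # Trapezoidal integration over the ROC points.
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

def prob_rightward(area, prefers_right):
    """Map the ROC area onto the probability of a 'rightward' decision."""
    return area if prefers_right else 1.0 - area

# Toy z-scores for one heading, for illustration only.
area = roc_area_vs_reference([0.5, 1.0, 2.0, -0.2], z_min=-3.0, z_max=3.0)
print(area, prob_rightward(area, prefers_right=True))
```

Because the reference is a single value, the area is essentially the fraction of responses that exceed the pre-recalibration mean, which is why a common reference can expose a PSE shift.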

Neuronal shifts

For subsequent analyses, that is, calculating neurometric shifts (and comparing them to perceptual shifts), only neurons that passed both of the following screening criteria (per cue) were included: (1) significant tuning to the corresponding cue (either pre- or post-recalibration; see Data analysis subsection above for details), and (2) reliable PSEs from both the pre- and post-recalibration neurometrics (bootstrapped SD of the PSE <10°, both pre- and post-recalibration). The bootstrapped SDs of the PSEs (for the neurons that passed the first criterion, of significant tuning) are presented in Figure 3—figure supplement 1. This resulted in 14 and 59 MSTd neurons for vestibular and visual cues, respectively (Figure 3); 30 and 10 PIVC neurons for vestibular and visual cues, respectively (Figure 4); 37 and 42 VIP neurons for vestibular and visual cues, respectively (Figure 5).

Neuronal shifts were measured by the difference between the post- and pre-recalibration neurometric PSEs (similar to perceptual shifts, see Equation 1). For each recording area (MSTd, PIVC, and VIP) and cue (vestibular and visual) neuronal shifts were compared to perceptual shifts, using Pearson correlations (pooling data across monkeys). Additionally, to assess the relationship between neuronal and perceptual shifts, while taking into account the differences of individual monkeys, we used an LMM, which allowed for random effects in slope and intercept for the different monkeys. The goodness of fit was assessed for the LMM and the pooled model (which did not take into account differences of individual monkeys) using AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion) (Vrieze, 2012). The LMM did not provide better fits vs. the pooled model, and the results (fixed effects) remained similar compared to the pooled model (Figure 3—source data 2, Figure 4—source data 2, and Figure 5—source data 2).
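The quantities used in this comparison can be written out in a minimal sketch. The Pearson coefficient and the standard AIC and BIC definitions are shown; fitting the LMM itself is not reproduced here, and the function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln(n) - 2 ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Toy values (deg), for illustration only; these are not data from the study.
perceptual_shifts = [1.0, 2.0, 3.0, 4.0]
neuronal_shifts = [0.5, 2.5, 2.0, 5.0]
print(pearson_r(neuronal_shifts, perceptual_shifts))
```

Both criteria penalize additional parameters, so the LMM (with extra random-effect terms per monkey) is preferred only if its likelihood improves enough to offset that penalty.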

To measure neuronal shifts at different time points during the stimulus, we calculated neurometric shifts based on FRs in narrow (200 ms) windows, in increments of 100 ms. The time index (the center of the window) ranged from t = 0.1 s to t = 1.2 s (relative to stimulus onset). This range did not include the choice saccade, which could only take place after t = 1.3 s because of the delay period (300 ms) between the offset of the stimulus and the onset of the saccade targets. All neurons that passed both of the screening criteria (described above) were included in this analysis.
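
A minimal Python sketch of the sliding-window scheme follows, assuming spike times are given in seconds relative to stimulus onset. The function names and the half-open window boundaries are choices made for this example and are not taken from the authors' code.

```python
def window_centers(start=0.1, stop=1.2, step=0.1):
    """Window centers (s) from t = 0.1 s to t = 1.2 s in 100 ms increments."""
    n = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 10) for i in range(n)]

def windowed_rates(spike_times, centers, width=0.2):
    """Firing rate (spikes/s) in a 200 ms window around each center."""
    half = width / 2.0
    rates = []
    for center in centers:
        # Count spikes in the half-open interval [center - half, center + half).
        count = sum(center - half <= t < center + half for t in spike_times)
        rates.append(count / width)
    return rates
```

With these defaults there are 12 windows, and adjacent windows overlap by 100 ms, so neighboring time points are not independent.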

Partial correlation analysis

To dissociate the unique contributions of heading stimuli and choices to the neural responses (FRs), we computed Pearson partial correlations between these variables (for details, see Chen et al., 2021; Zaidel et al., 2017). This produced: (1) a heading partial correlation (Rh) that captured the linear relationship between FRs and headings, given the monkey’s choices, and (2) a choice partial correlation (Rc) that captured the linear relationship between FRs and choices, given the stimulus headings. Partial correlations were calculated based on data acquired over the entire stimulus duration. Positive (negative) heading partial correlations indicate that FRs were greater (smaller) for rightward vs. leftward headings (given the choices). Likewise, positive (negative) choice partial correlations indicate that FRs were greater (smaller) for rightward vs. leftward choices (given the stimulus headings).
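
For three variables, a partial correlation can be computed from the three pairwise Pearson coefficients using the standard first-order formula. The sketch below uses that textbook formula (the cited papers should be consulted for the authors' exact procedure); the variable names and the toy data are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation between x and y after accounting for z."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Toy data (hypothetical): firing rates driven by heading, not by choice.
headings = [-2, -2, -1, -1, 1, 1, 2, 2]      # deg, negative = leftward
choices = [0, 0, 0, 1, 0, 1, 1, 1]           # 1 = rightward choice
firing = [5, 3, 6, 8, 14, 12, 15, 17]        # spikes/s

r_heading = partial_corr(firing, headings, choices)  # Rh
r_choice = partial_corr(firing, choices, headings)   # Rc
```

In this toy example the firing rates correlate with the choices only because the choices follow the headings, so Rh is large while Rc is close to zero.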

Statistical analysis

To evaluate differences in monkey behavior (PSE), heading, or choice partial correlations, between pre- and post-recalibration, we used two-tailed paired t-tests. Possible differences in spontaneous (baseline) FRs between pre- and post-recalibration were evaluated using Bayesian paired-samples t-tests (BF10 values). Relationships between neuronal and perceptual shifts were tested using Pearson’s correlation coefficients and LMMs. Statistical analysis was conducted using JASP (Version 0.16.3) and R (Version 4.2.2).

Data availability

The data and analysis code for this study have been uploaded to GitHub and can be found at https://github.com/FuZengBio/Recalibration (copy archived at Zeng, 2022).

References

  1. Book
    1. Bertelson P
    2. De Gelder B
    (2004) The psychology of multimodal perception
    In: Charles Spence, Jon Driver, editors. Crossmodal Space and Crossmodal Attention. Oxford University Press. pp. 141–177.
    https://doi.org/10.1093/acprof:oso/9780198524861.001.0001
  2. Book
    1. Reason JT
    2. Brand JJ
    (1975)
    Motion Sickness
    Cambridge, Massachusetts: Academic Press.
  3. Journal
    1. Shupak A
    2. Gordon CR
    (2006)
    Motion sickness: Advances in pathogenesis, prediction, prevention, and treatment
    Aviation, Space, and Environmental Medicine 77:1213–1223.
  4. Journal
    1. Warren WH
    2. Morris MW
    3. Kalish M
    (1988) Perception of translational heading from optical flow
    Journal of Experimental Psychology. Human Perception and Performance 14:646–660.
    https://doi.org/10.1037//0096-1523.14.4.646

Decision letter

  1. Christopher R Fetsch
    Reviewing Editor; Johns Hopkins University, United States
  2. Michael J Frank
    Senior Editor; Brown University, United States
  3. Umberto Olcese
    Reviewer; University of Amsterdam, Netherlands
  4. Robbe Goris
    Reviewer; The University of Texas at Austin, Austin, United States

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Contrary neuronal recalibration in different multisensory cortical areas" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Michael Frank as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Umberto Olcese (Reviewer #2); Robbe Goris (Reviewer #3).

The reviewers' assessment of the work is overall quite positive; congratulations on an interesting and potentially impactful study. The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

1) Please report summary statistics (means and correlation coefficients) and p values for individual animals, for the data in Figure 2C, 3B, 4B, and 5B. This is not to say that the results must be significant at the individual animal level in order to support the study's main conclusions; we are all aware of the practicalities and accepted conventions of the field. Yet we feel it is important to be up front about this limitation, if indeed some of the individual monkey results fail to reach significance.

To be clear, the reasoning behind this request is that repeated observations from the same subject/condition are not independent of each other, thus correlations calculated from pooled data might obscure, inflate, or even reverse the true relationship between the variables (e.g., Simpson's Paradox).

Apart from individual subject analyses, another approach (indeed our strong recommendation) is to perform a hierarchical analysis. There are multiple options for this; probably the easiest one is to calculate linear mixed regression models (LMM) with one variable as predictor, the other as outcome and no intercept. This can be done in the frequentist way using (for example) the lmer function (lme4 package) in R, or in a Bayesian way using Stan, which even has an automated module for GLMMs. Partial correlations can be achieved by adding the variable that is partialized out as a predictor.

In summary, the authors may choose one (or more) of the three mentioned approaches to enhance statistical rigor: (1) separate t-tests and correlations for each animal and condition, (2) frequentist hierarchical linear models, and (3) Bayesian hierarchical linear models. It is worth pointing out that Bayesian approaches have advantages over frequentist methods, for instance quantifying the evidence for the null hypothesis (and thus better evaluating negative results such as Figure 4B-right).

2) Please consider applying a goodness-of-fit criterion to the neurometric curves before inclusion of their PSEs in the neuron-behavior correlation analyses – AND/OR evaluate the reliability of the PSEs using standard error obtained from the fitting procedure, or a bootstrap-based confidence interval. We would not require individual neuron shifts (δ-PSE) to be significant according to such SEs (i.e., through error propagation), but a reanalysis after removing particularly poor fits seems appropriate. The criterion to use for this is difficult for us to specify, and thus you can use your best judgment.

3) Depending on how the first two points are handled, and the outcome thereof, it may be necessary to scrutinize the role of outliers in generating the correlation coefficients and p values obtained. For instance, is the correlation of Figure 3B-left still significant without the upper four points, and 3B-right without the rightmost three points? A hierarchical analysis and/or neurometric goodness-of-fit criterion could reduce the role of outliers, in which case no formal outlier correction or other way to address this is needed.

Reviewer #1 (Recommendations for the authors):

The results are really interesting, yet, the manuscript in its current form needs revisions along two dimensions, 1) data analysis and 2) writing.

Methods

I am aware that basically all analyses in this manuscript have been used in published papers. The problems outlined below hold for those papers, too, sorry. Bad (or good) luck having this time a reviewer from another field.

Correlations derived from data that includes multiple repetitions per subject (and condition) require a hierarchical analysis because repeated observations from the same observer are not independent of each other. Subject A might on average have higher shifts (behavioral and neurometric) than subject B. In that case, a non-hierarchical analysis might return a significant positive correlation even if within subject A negative behavioral shifts are associated with positive neurometric shifts and vice versa. There are multiple ways to account for intra-subject correlations; probably the easiest one is to calculate linear mixed regression models (LMM) with one variable as predictor, the other as outcome and no intercept. This can be done in the frequentist way using the lmer function (lme4 package) in R and in a Bayesian way using Stan, which even has an automated module for GLMMs. Partial correlations can be achieved by adding the variable that is partialized out as a predictor. Of course, it is also possible to just analyze each subject separately.

The same holds for any comparisons of mean values (across heading offsets during the recalibration phase or across modalities). For example, a t-test that includes perceptual shifts from all sessions and all monkeys is not valid as the values from a single monkey are correlated. Again, there are several ways to account for this. For example, an LMM with cross-modal discrepancy as predictor (and a free intercept) or each monkey's data could be tested separately.

The procedure to derive neurometric curves, which are central to the study, should be explained better. In the main text and in the figures, it should be clarified that each data point shows the proportion of trials in which an ideal observer would make a rightwards choice given only the firing rates of the neurons (and assumed anti-neurons). In the methods section, it should be explained in detail how the ROC curve is derived. Readers unfamiliar with the method should understand that the ROC incorporates decisions for a range of decision thresholds and that the area under the ROC curve (AUROC, not ROC value) corresponds to a theoretical observer's ability to discriminate between leftwards and rightwards motion given firing rates across repetitions of a specific motion direction. Finally, those AUROC values (ranging from 0.5 to 1) are mapped onto the probability of a rightwards choice (ranging from 0 to 1) given a heading direction.

The use of PSEs for the correlational analysis should be conditional on the goodness of fit of the sigmoid and the SD should fall within a reasonable range. A PSE derived from an ill-fitting or very flat curve has no informational value and is likely to be extreme, which in turn is a huge problem for any correlational analysis.

The methods section states that only neurons tuned to heading directions as indicated by cues in a specific modality were included in the neurometric analysis for that modality. If that exclusion criterion was applied, why are the neurometric curves for 'the other modality' so bad? I would have expected that neurons tuned to visual heading cues result in a good neurometric function for visual heading no matter where these neurons are located in the brain. In turn, I would have expected flat neurometric functions for visual heading direction in PIVC if also neurons tuned to vestibular heading information were included in the analysis. What am I missing?

Another instance of me probably not getting something: to identify neurons tuned for heading directions as cued by one of the two modalities, spike rates were regressed onto heading directions and only those neurons with a significant slope were included. I get that the range of tested headings is smaller than the width of a typical tuning curve and thus a linear regression makes more sense than trying to fit a full tuning curve. However, it seems to me that this selection method excludes neurons tuned to straight-ahead as their slope should be flat for a symmetric range of headings.

Regression and correlational analyses are extremely sensitive to outliers. To be sure that the results are robust, please repeat the analyses after outlier correction, e.g., based on the 1.5xIQR rule along each dimension or based on each data points influence on the regression (Cook's distance).

Looking at the currently depicted sigmoids, I suggest increasing the maximal value for the lapse rate. Even though it is not fully clear to me what a lapse rate of a neurometric curve actually means, achieving a better fit for the curves seems essential given that the PSEs are at the center of the results.

The methods section provides information about the number of neurons per area included in the analyses in general. Additional information about the number of neurons per neurometric curve would be useful.

Please add in l. 698 how the baseline activity for each neuron was determined, e.g., based on a single interval at the beginning of the session or based on recordings in between trials.

Probably just a typo but in l. 689 Pearson's correlation has nothing to do with linear regression.

Writing

Generally, I find the manuscript to be well-written and organized. However, in its current form, the manuscript is geared towards a small audience, electrophysiologists familiar with most publications by the Angelaki and DeAngelis labs. A much wider audience could be reached by 1) phrasing more precisely and 2) providing more information, reasoning, and explanations.

The verb 'recalibrate' is often used in a way that doesn't match the literature and the way recalibration is thought of. Information cannot be recalibrated; cues cannot recalibrate actively; responses do not recalibrate (together). A system is recalibrated and then it interprets incoming information differently. In most instances, a simple replacement of recalibrated with "shifted" will help -- AND/OR include a statement early on defining these more colloquial uses of recalibrate (e.g., "we refer to this pattern of neural activity as 'recalibration'…)

A few examples for more common and precise phrasing:

l. 35 -> recalibrating itself based on information from …

l. 36 -> estimates for subsequently presented unisensory stimuli are shifted towards each other.

l. 25 -> the tuning of neural responses to … cues was shifted in the same direction as the monkeys' perceptual judgments of subsequently presented unisensory stimuli.

l. 43 -> tuning of vestibular neurons was shifted in the same direction as vestibular heading perception.

This issue is present throughout the manuscript but especially so in the abstract, in brief section, and the discussion. I doubt that someone not familiar with the literature can understand the abstract. There is a section in the discussion (l. 483f) that is written in a very abstract manner but phrased according to the literature, i.e., in accordance with the ways most people think about recalibration. Please adjust the rest of the text accordingly.

Comments in chronological order:

l. 57 Behavioral papers cited in a sentence about the neuronal basis of multisensory integration.

l. 76 Just saying that results exist is rather unusual. In a few words, what did the neuroimaging studies find?

l. 91 Burge et al. is not about heading perception but about visual-haptic slant perception.

l. 103 Again, the typical phrasing would be that perception was recalibrated as indicated by shifts in subsequent perceptual estimates.

l. 108f It would be easier for readers to learn about supervised recalibration in the discussion, as now they have to shift mentally from unsupervised to supervised and back to unsupervised recalibration.

l. 119 Why would diverging shifts in the tuning of neural mechanisms between cortical areas not be detectable when both perceptual estimates are shifted in the same direction?

l. 126 Isn't that what the neuroimaging studies showed? Changes in relatively early areas?

l. 140 "therefore we expected to see perceptual shifts resulting from unsupervised recalibration in MSTd and PIVC". That is a surprising claim, which needs further explanation as it implies that perceptual decisions are based on neural activity in these relatively early areas. Maybe leave that conclusion entirely to the discussion.

l. 145 The claim that VIP underlies cognitive processing will startle many. Why not simply say that VIP seems to be involved in perceptual decision making or higher order perceptual functions some of which yet have to be understood?

l. 161 Please explain that 1) during the recalibration block, the monkeys are exposed to visual-vestibular stimulus pairs with a consistent discrepancy in heading direction and that 2) in pre- and post-recalibration blocks the perception of heading direction indicated by unisensory stimuli is measured and that 3) recalibration effects are measured as the difference between pre- and post-recalibration results. The text should make sense to a reader who has never performed a recalibration study. No reader should be forced to read another paper just to understand the most crucial aspects of the current one!

l. 181f It does not seem necessary to describe the figure in detail in the text. Explanations of how to read a figure are best placed in its caption. Given that these are selected curves of a single subject in one of several sessions, the size of the shifts should not be compared or discussed in the text.

l. 200 At this point the reader has never heard that the monkeys underwent several sessions nor in which way the sessions differed from each other. Please add that information.

l. 202 Why force the reader to learn what δ + means in this manuscript? It is much better to just speak of 'sessions in which visual heading was rightwards of vestibular heading during the recalibration phase'. Please apply that thought throughout.

l. 232 What does 'tuning recalibrate with perceptual shifts' mean? A good piece of advice I got from a book on scientific writing is to take sentences literally even when they describe scientific matter. One possible general phrasing would be that the neural tuning shifted in the same (opposite for VIP) direction as heading perception. A more results-oriented phrasing is that the neurometric functions shifted in the same direction as the psychometric functions for each modality. Please repair this throughout the text!

l. 240 This description of the neurometric analysis is not sufficient (see comment in the methods section). In addition, "PSEs were extracted similar to" is confusing, PSEs always correspond to the 50% point of a psychometric function (e.g., the mean of the Gaussian distribution). More importantly, it should be explained to readers that the PSE of a neurometric curve is the physical heading direction at which the chances to make a rightward judgment based on the firing rates of the neurons are fifty-fifty, i.e., straight-ahead according to the neuronal data.

l. 245 I think this is the first instance in which the term "behavioral shifts" is used instead of "perceptual shifts" but then it occurs consistently and is even present in the figures. In psychophysics, the term "behavioral shifts" would be used if there is any reason to suspect that the behavior does not correspond to the percepts, e.g., because participants show a response bias rather than a perceptual bias. If this might be the case, it should be discussed, otherwise please use perceptual.

l. 286 Again, please phrase more precisely.

l. 315f I like the analysis and the paragraph could function as a prototype for the subsequent paragraphs regarding its degree of abstraction and briefness. Yet, again more precise phrasing would be nice, e.g., neurons don't respond visually they exclusively respond when visual stimuli are presented.

l. 334f Again, in my view it is not necessary to describe figures and do so panel by panel.

l. 339f The last sentence of the paragraph does not parse.

The claim that neural recalibration follows the velocity profile of the stimulus is too strong and not correct as the figures show correlations not neural recalibration. For the earlier areas, the claim that the significant correlation between neurometric and psychometric shifts is driven by firing rates during the period of maximal stimulus velocity might be correct.

l. 350f Similarly, for VIP neurons it is also not necessary to describe the figure.

l. 356 The claim that the correlations between neurometric and psychometric shifts given firing rates recorded at the end of the stimulus presentation have nothing to do with choice behavior but reflect neuronal recalibration is not substantiated at this point; an anti-correlation is a strong relation. The conclusion might be drawn based on the last section of the results.

l. 364 "During recalibration" means during the recalibration phase but I think that is not what the authors mean because there are no behavioral choices during the recalibration phase. Please search the text for this phrasing, it probably is not adequate at other instances, too.

l. 412 Not sure if this is the first instance, but cross-sensory is not a word used in the literature, please use either multisensory or cross-modal or across the senses. Some authors will point out that multisensory should only be used in the case of perceptual fusion. Please replace the word throughout the text.

l. 412f This holds for the full Discussion: See above comments, the phrasing is very uncommon, especially the use of the verb 'recalibrate'. Almost all instances of 'recalibrated' should be replaced with 'changed' or 'shifted'. "Together with" means "shifts in same direction as perceptual shifts" and so on. It gets much better from line 478 on.

l. 422 Not sure if vision scientists would call MSTd a multisensory area.

l. 432f Why does 'unsupervised' imply changes in early areas? I wondered the same in the introduction. And how does that go together with the suggestion that the conflict is detected in ACC? It might be easiest to just refer to the neuroimaging studies or simply not make a claim.

l. 457f What is "individualized recalibration"? The literature uses "modality-specific". I fail to see what this section adds that is not in the previous section. Why would the yoking found in the supervised recalibration study for both perceptual and neuronal shifts predict uniformity in the neuronal shifts in a paradigm that leads to non-yoked perceptual shifts?

l. 499 Please phrase the conclusion as a possibility, as it remains unclear to which degree supervised and unsupervised recalibration correspond.

l. 513 'Shift in reference frame' does this refer to a change in the supramodal definition of straight-ahead based on the vestibular tuning in lower sensory areas? The idea that VIP is tuned in a vestibular reference frame fits with earlier studies investigating visual-tactile reference frames (e.g., Avillac et al., Sereno and Huang, and also Graziano recording in F4).

l. 531 what does "cross-sensory recalibration, vs. simply reflecting the recalibrated signals" mean? What are recalibrated signals and why are they different from cross-modal recalibration?

General writing guidelines:

All figures should be optimized for the outlet, which in the case of eLife is wide and short figures as the text is never set in two-columns.

All acronyms should be defined when they are used for the first time, e.g., point of subjective equivalence (PSE). It can be very helpful for readers to treat abstract, significance statement, main text, and methods as separate and define acronyms anew.

A number and its unit are separated by a space, e.g., 300 ms instead of 300ms. Please check this throughout the text including the figures.

Figure 1

A: A visualization of the optic flow stimuli (e.g., two subsequent frames or little arrows indicating the motion vectors) would be nice.

B: It should be indicated either in the figure or in the caption that δ was constant within a single session, but theta was varied within a session and could take on all values depicted in A.

The grey vector corresponds to the combined direction if both cues have the same reliability, which probably wasn't exactly the case given Figure 2A.

C: 'no choice' is misplaced.

All of them, please add a space between a number and its unit.

Figure 2

A,B please indicate the monkey and the session number.

A,B usually rightward choice ratio would be interpreted as the ratio of rightward to leftward choices, I assume the proportion of rightward responses is shown.

C indicate how the shift was calculated.

C I cannot see the error bars referred to in the caption.

C please indicate the distribution of each monkey to assure readers that the results hold within and across subjects (see comments on the statistical analysis).

Reviewer #2 (Recommendations for the authors):

Aside from the major comments outlined in the public review, I found that some figure legends should be expanded. For instance, in Figure 8 it is unclear what the empty and filled circles indicate, respectively. I would recommend the authors to check the manuscript carefully.

While in general the manuscript is well written, I found the "in brief" section rather difficult and not suitable for a broader audience. I would suggest rewriting the section accordingly.

https://doi.org/10.7554/eLife.82895.sa1

Author response

Essential revisions:

1) Please report summary statistics (means and correlation coefficients) and p values for individual animals, for the data in Figure 2C, 3B, 4B, and 5B. This is not to say that the results must be significant at the individual animal level in order to support the study's main conclusions; we are all aware of the practicalities and accepted conventions of the field. Yet we feel it is important to be up front about this limitation, if indeed some of the individual monkey results fail to reach significance.

To be clear, the reasoning behind this request is that repeated observations from the same subject/condition are not independent of each other, thus correlations calculated from pooled data might obscure, inflate, or even reverse the true relationship between the variables (e.g., Simpson's Paradox).

Apart from individual subject analyses, another approach (indeed our strong recommendation) is to perform a hierarchical analysis. There are multiple options for this; probably the easiest one is to calculate linear mixed regression models (LMM) with one variable as predictor, the other as outcome and no intercept. This can be done in the frequentist way using (for example) the lmer function (lme4 package) in R, or in a Bayesian way using Stan, which even has an automated module for GLMMs. Partial correlations can be achieved by adding the variable that is partialized out as a predictor.

In summary, the authors may choose one (or more) of the three mentioned approaches to enhance statistical rigor: (1) separate t-tests and correlations for each animal and condition, (2) frequentist hierarchical linear models, and (3) Bayesian hierarchical linear models. It is worth pointing out that Bayesian approaches have advantages over frequentist methods, for instance quantifying the evidence for the null hypothesis (and thus better evaluating negative results such as Figure 4B-right).

Thank you for raising this important point. In response, we 1) added summary statistics for the individual monkeys, and 2) applied the linear mixed model (LMM) analysis to the data, with neuronal shifts as the dependent variable, perceptual shifts as the fixed predictor, and monkeys as a random effects grouping factor. Results from both of these approaches indicate that the findings are consistent across monkeys. We added these results as supplementary tables (see source data associated with each figure) to the revised manuscript. The LMM analyses (with monkeys as a random effects factor) did not provide better fits than the pooled model (PM) analyses (with pooled data across monkeys). These additional analyses support the main findings of the manuscript.

2) Please consider applying a goodness-of-fit criterion to the neurometric curves before inclusion of their PSEs in the neuron-behavior correlation analyses – AND/OR evaluate the reliability of the PSEs using standard error obtained from the fitting procedure, or a bootstrap-based confidence interval. We would not require individual neuron shifts (δ-PSE) to be significant according to such SEs (i.e., through error propagation), but a reanalysis after removing particularly poor fits seems appropriate. The criterion to use for this is difficult for us to specify, and thus you can use your best judgment.

Thank you for this valid suggestion. To quantify the reliability of each neurometric PSE we used the standard deviation (SD) of the bootstrapped PSEs obtained from the fitting procedure. We considered that this measure would be a more direct estimate of PSE reliability (the parameter of interest) than more general goodness-of-fit measures (such as pseudo-R²), which are influenced by other parameters (e.g., thresholds and lapse rates) that are less relevant for assessing PSE estimate reliability. We required that the SD of the neuron’s bootstrapped PSE was < 10°, in both the pre- and post-recalibration blocks. This cutoff was chosen empirically, based on the distribution of SD values across neurons (see Figure 3—figure supplement 1). This screening was applied in addition to (after) the initial screening for significant tuning (as described in the original manuscript). We have now updated the manuscript to say that the neurons were first screened for significant response tuning, and then the remaining neurons were screened for neurometric PSE reliability (please see the revised Methods subsection “Neuronal shifts”, paragraph 1).

For MSTd, the neurometric PSE reliability screening removed 9 neurons from the vestibular data and 9 neurons from the visual data. For PIVC, this removed 14 neurons from the vestibular data and 11 neurons from the visual data. For VIP, this removed 16 neurons from the vestibular data and 24 neurons from the visual data. We redid all the analyses including only the neurons that passed this additional screening (the results remained similar), updated the figures, and revised the manuscript accordingly.

3) Depending on how the first two points are handled, and the outcome thereof, it may be necessary to scrutinize the role of outliers in generating the correlation coefficients and p values obtained. For instance, is the correlation of Figure 3B-left still significant without the upper four points, and 3B-right without the rightmost three points? A hierarchical analysis and/or neurometric goodness-of-fit criterion could reduce the role of outliers, in which case no formal outlier correction or other way to address this is needed.

The robustness of the results was assessed with the systematic approaches described above (in response to the two previous comments). Namely: additional screening to remove neurons with low-reliability estimates for the neuronal PSE (see the response to Comment #2), and additional analysis (LMM) to account for the random effects of different monkeys (see the response to Comment #1). These specific data points passed this screening, and the results were robust to further analyses that took into account the random effects of monkeys. In addition, we tested the influence of outliers using Cook's distance (Author response image 1). Redoing the analyses after removing data points with Cook’s distances larger than three times the mean provided results similar to those obtained before (summarized in Author response table 1).

Author response image 1
Cook's distances.

Cook’s distances for the vestibular and visual data (top and bottom row, respectively) from areas (A) MSTd, (B) PIVC, and (C) VIP. The green dashed line marks three times the mean Cook's distance.

Author response table 1
‘Cook's distance’ outlier analysis.
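| Area | Cue | Data | r | p | N |
|---|---|---|---|---|---|
| MSTd | Vestibular | All | 0.62 | 0.019 * | 14 |
| MSTd | Vestibular | Outliers removed | 0.63 | 0.021 * | 13 |
| MSTd | Visual | All | 0.38 | 2.7 × 10⁻³ *** | 59 |
| MSTd | Visual | Outliers removed | 0.36 | 7.8 × 10⁻³ *** | 53 |
| PIVC | Vestibular | All | 0.80 | 9.7 × 10⁻⁸ *** | 30 |
| PIVC | Vestibular | Outliers removed | 0.81 | 9.3 × 10⁻⁸ *** | 29 |
| PIVC | Visual | All | 0.26 | 0.47 | 10 |
| PIVC | Visual | Outliers removed | 0.26 | 0.47 | 10 |
| VIP | Vestibular | All | 0.77 | 2.7 × 10⁻⁸ *** | 37 |
| VIP | Vestibular | Outliers removed | 0.80 | 1.1 × 10⁻⁸ *** | 35 |
| VIP | Visual | All | -0.68 | 8.4 × 10⁻⁷ *** | 42 |
| VIP | Visual | Outliers removed | -0.70 | 8.0 × 10⁻⁷ *** | 38 |

N = number of neurons; r and p-values are from Pearson correlations. *** p < 0.001; * p < 0.05.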

Reviewer #1 (Recommendations for the authors):

The results are really interesting, yet, the manuscript in its current form needs revisions along two dimensions, 1) data analysis and 2) writing.

Methods

I am aware that basically all analyses in this manuscript have been used in published papers. The problems outlined below hold for those papers, too, sorry. Bad (or good) luck having this time a reviewer from another field.

Correlations derived from data that includes multiple repetitions per subject (and condition) require a hierarchical analysis because repeated observations from the same observer are not independent of each other. Subject A might on average have higher shifts (behavioral and neurometric) than subject B. In that case, a non-hierarchical analysis might return a significant positive correlation even if within subject A negative behavioral shifts are associated with positive neurometric shifts and vice versa. There are multiple ways to account for intra-subject correlations, probably the easiest one is to calculate linear mixed regression models (LMM) with one variable as predictor, the other as outcome and no intercept. This can be done in the frequentist way using the lmer package in R and in a Bayesian way using Stan, which even has an automated module for GLMMs. Partial correlations can be achieved by adding the variable that is partialized out as a predictor. Of course, it is also possible to just analyze each subject separately.

Thank you for raising this valid point. In response, we: (1) calculated the individual monkey summary statistics, and report these in the revised manuscript (Figure 2–source data 1 presents the behavioral shift data; Figure 3–source data 1, Figure 4–source data 1, and Figure 5–source data 1 present the correlations between neuronal and behavioral shifts for MSTd, PIVC, and VIP, respectively). (2) We performed the linear mixed model (LMM) analysis, and present the results, and comparison thereof to the pooled model (PM) with pooled data across monkeys, in the respective supplementary tables (see source data). Please see our response to Essential Revision #1 (above) for further details.
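The reviewer's concern about pooling across subjects can be illustrated with a small sketch. This is not the LMM used in the revision: it only contrasts a pooled Pearson correlation with a correlation computed after subtracting each subject's mean (a crude stand-in for a per-subject random intercept). The function name and the data are hypothetical and synthetic.

```python
import numpy as np

def pooled_and_within_r(x, y, subject):
    """Pearson r on pooled data, and on data centered within each subject."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    subject = np.asarray(subject)
    pooled = np.corrcoef(x, y)[0, 1]
    xc, yc = x.copy(), y.copy()
    for s in np.unique(subject):
        m = subject == s
        xc[m] -= x[m].mean()   # remove each subject's mean shift
        yc[m] -= y[m].mean()
    within = np.corrcoef(xc, yc)[0, 1]
    return pooled, within

# Synthetic data: within each subject the relation is negative,
# but subject B has larger values on both axes than subject A.
x_a = np.array([1.0, 2.0, 3.0, 4.0])
y_a = np.array([4.0, 3.0, 2.0, 1.0])
x_all = np.concatenate([x_a, x_a + 10.0])
y_all = np.concatenate([y_a, y_a + 10.0])
subj = np.array(["A"] * 4 + ["B"] * 4)
pooled_r, within_r = pooled_and_within_r(x_all, y_all, subj)
```

Here the pooled correlation is strongly positive while the within-subject correlation is negative, which is the situation a hierarchical analysis is meant to guard against.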

The same holds for any comparisons of mean values (across heading offsets during the recalibration phase or across modalities). For example, a t-test that includes perceptual shifts from all sessions and all monkeys is not valid as the values from a single monkey are correlated. Again, there are several ways to account for this. For example, an LMM with cross-modal discrepancy as predictor (and a free intercept) or each monkey's data could be tested separately.

The procedure to derive neurometric curves, which are central to the study, should be explained better. In the main text and in the figures, it should be clarified that each data point shows the proportion of trials in which an ideal observer would make a rightwards choice given only the firing rates of the neurons (and assumed anti-neurons). In the methods section, it should be explained in detail how the ROC curve is derived. Readers unfamiliar with the method should understand that the ROC incorporates decisions for a range of decision thresholds and that the area under the ROC curve (AUROC, not ROC value) corresponds to a theoretical observer's ability to discriminate between leftwards and rightwards motion given firing rates across repetitions of a specific motion direction. Finally, those AUROC values (ranging from 0.5 to 1) are mapped onto the probability of a rightwards choice (ranging from 0 to 1) given a heading direction.

In response to this comment, we added more detailed explanations about the procedure to derive the neurometric curves (please see the revised Methods subsection “Neurometrics”, paragraphs 1-2). We also added to the figure legends (regarding neurometric curves) that each data point shows the proportion of trials in which an ideal observer would make a rightward choice given the firing rates of the neurons. We would like to clarify that for this calculation we did not compare neurons to “anti-neurons” (which would assume PSE = 0°). Rather, we z-scored the data in reference to the pre-recalibration firing rates. Using a common reference (pre-recalibration mean, corresponding to z-score = 0) to normalize both pre- and post-recalibration data was needed to allow for and expose PSE shifts. Then we calculated ROC curves by moving a ‘criterion’ value from the minimum to the maximum z-score (in 100 steps), and plotting the probability that the z-scores exceeded the criterion vs. whether z-score = 0 (the pre-recalibration mean) exceeded that same criterion, or not (1 or 0, respectively). We have also added this clarification to the revised manuscript (see Methods subsection “Neurometrics”, paragraph 2).
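The criterion sweep described above can be sketched as follows. This is an illustrative reading of the description, not the authors' code. Several details are assumptions: the z-score uses the pre-recalibration standard deviation, ties count as exceeding the criterion, the curve is closed with the points (0, 0) and (1, 1), and the area is computed with the trapezoid rule. Function names are hypothetical.

```python
import numpy as np

def zscore_to_pre(rates, pre_rates):
    """Z-score firing rates against the pre-recalibration mean (and SD, assumed)."""
    pre = np.asarray(pre_rates, dtype=float)
    return (np.asarray(rates, dtype=float) - pre.mean()) / pre.std(ddof=1)

def roc_area_vs_reference(z, n_steps=100):
    """Area under the ROC built by sweeping a criterion over the z-scores.

    y-axis: proportion of z-scores at or above the criterion.
    x-axis: 1 if the reference (z = 0) is at or above the criterion, else 0.
    """
    z = np.asarray(z, dtype=float)
    criteria = np.linspace(z.min(), z.max(), n_steps)
    p_resp = np.array([(z >= c).mean() for c in criteria])
    p_ref = (0.0 >= criteria).astype(float)
    # Reverse so that x runs from 0 to 1, then close the curve at both ends.
    x = np.concatenate(([0.0], p_ref[::-1], [1.0]))
    y = np.concatenate(([0.0], p_resp[::-1], [1.0]))
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))

# Synthetic z-scores for three hypothetical headings.
area_above = roc_area_vs_reference([0.5, 1.0, 2.0, 3.0])
area_below = roc_area_vs_reference([-3.0, -2.0, -1.0, -0.5])
area_mixed = roc_area_vs_reference([-2.0, -1.0, 1.0, 2.0])
```

Because the reference is a single value rather than a distribution, the x-axis is a step function, so the area is close to the proportion of responses above the pre-recalibration mean.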

The use of PSEs for the correlational analysis should be conditional on the goodness of fit of the sigmoid and the SD should fall within a reasonable range. A PSE derived from an ill-fitting or very flat curve has no informational value and is likely to be extreme, which in turn is a huge problem for any correlational analysis.

In response to this comment we further screened the neurons by calculating the SD of the neurometric PSE (from bootstrapped values) and excluded neurons with SDs > 10° (either pre- or post-recalibration). We considered that this would be a more direct measure of PSE reliability (the parameter of interest) vs. goodness-of-fit of the whole psychometric function, which could be affected by other parameters (such as thresholds and lapse rates). Please see our response to Essential Revision #2 (above) for further details.
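The screening rule can be written as a short check. This sketch assumes the bootstrapped PSE values (in degrees) have already been obtained from resampled neurometric fits. The function name and the use of the sample standard deviation (`ddof=1`) are assumptions.

```python
import numpy as np

def pse_is_reliable(pre_boot_pse, post_boot_pse, max_sd_deg=10.0):
    """Keep a neuron only if the bootstrap SD of its PSE is at most
    max_sd_deg in both the pre- and post-recalibration blocks."""
    sd_pre = np.std(np.asarray(pre_boot_pse, dtype=float), ddof=1)
    sd_post = np.std(np.asarray(post_boot_pse, dtype=float), ddof=1)
    return bool(sd_pre <= max_sd_deg and sd_post <= max_sd_deg)

# Synthetic bootstrap samples (degrees), for illustration only.
tight = [1.0, 2.0, 3.0, 2.0, 1.5]
wide = [-40.0, 35.0, 2.0, -20.0, 50.0]
keep_neuron = pse_is_reliable(tight, tight)
drop_neuron = pse_is_reliable(tight, wide)
```

A neuron with an unstable PSE in either block is excluded, which matches the "either pre- or post-recalibration" wording above.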

The methods section states that only neurons tuned to heading directions as indicated by cues in a specific modality were included in the neurometric analysis for that modality. If that exclusion criterion was applied, why are the neurometric curves for 'the other modality' so bad? I would have expected that neurons tuned to visual heading cues result in a good neurometric function for visual heading no matter where these neurons are located in the brain. In turn, I would have expected flat neurometric functions for visual heading direction in PIVC if also neurons tuned to vestibular heading information were included in the analysis. What am I missing?

In the revised Methods subsection “Neuronal shifts”, paragraph 1, we now better explain the inclusion criteria for the different stages of analysis. First, we calculated neurometric functions for all recorded cells, and both modalities, whether or not the cell was significantly tuned. For the subsequent neuronal vs. behavioral shift analyses, we only included neurons that satisfied both the following criteria (the second criterion was added to the revised manuscript in response to Essential Revisions, Comment #2): i) significant tuning to the corresponding cue, and ii) reliable neurometric PSE estimates (SD < 10°, both in the pre- and post-recalibration blocks). The example neuron from PIVC in Figure 4 responded significantly to vestibular (but not visual) stimuli. Thus, although neurometric fits are presented for both cues in Figure 4, this PIVC neuron was included only in the vestibular (but not the visual) PSE correlation analysis.

Another instance of me probably not getting something: to identify neurons tuned for heading directions as cued by one of the two modalities, spike rates were regressed onto heading directions and only those neurons with a significant slope were included. I get that the range of tested headings is smaller than the width of a typical tuning curve and thus a linear regression makes more sense than trying to fit a full tuning curve. However, it seems to me that this selection method excludes neurons tuned to straight-ahead as their slope should be flat for a symmetric range of headings.

Indeed our selection criteria were selective for neurons that have sloped tuning around straight-ahead, and exclude neurons that could be tuned to forward motion stimuli (i.e., with a tuning preference straight-ahead). The reasons for this are: (1) neurons with sloped tuning straight-ahead are most informative for heading discrimination (large Fisher information, Gu et al., 2010). By contrast, neurons with a tuning preference to straight-ahead stimuli have relatively flat responses around straight-ahead (low Fisher information) and are thus less informative for heading discrimination. Accordingly, small shifts can be readily detected in neurons with sloped tuning (but not in those with flat tuning) around straight ahead. (2) In the cortical areas of interest in this study, a disproportionately large number of neurons have steep tuning slopes around straight ahead, presumably to support heading discrimination (Chen et al., 2011b; Gu et al., 2010). (3) The standard neurometric analysis wouldn’t work for neurons with a tuning preference to straight-ahead (rightward and leftward headings around straight-ahead would elicit ambiguous firing rates). Thus estimating tuning shifts in these neurons would require different (heterogeneous and more complex) models, and the results would be very noisy. Therefore, in this study, we focused on the prevalent neurons with sloped tuning around straight-ahead. We have added these points to the revised Methods subsection “Data analysis”, paragraph 3.
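The Fisher information argument in point (1) can be illustrated numerically. This sketch relies on assumptions that are not taken from the manuscript: Poisson spiking (for which Fisher information is the squared tuning slope divided by the mean rate) and a Gaussian tuning curve with hypothetical amplitude, width, and baseline.

```python
import numpy as np

def gaussian_tuning(theta_deg, pref_deg, amp=30.0, width_deg=60.0, base=5.0):
    """Hypothetical tuning curve: rate (spikes/s) and slope (spikes/s per deg)."""
    g = np.exp(-0.5 * ((theta_deg - pref_deg) / width_deg) ** 2)
    rate = base + amp * g
    slope = -amp * (theta_deg - pref_deg) / width_deg ** 2 * g
    return rate, slope

def poisson_fisher_information(theta_deg, pref_deg):
    """Fisher information of a Poisson neuron: slope^2 / rate."""
    rate, slope = gaussian_tuning(theta_deg, pref_deg)
    return slope ** 2 / rate

# Evaluate both neurons at straight-ahead (0 deg).
fi_peak_at_straight_ahead = poisson_fisher_information(0.0, pref_deg=0.0)
fi_sloped_at_straight_ahead = poisson_fisher_information(0.0, pref_deg=90.0)
```

Under these assumptions, the neuron that prefers straight-ahead carries no Fisher information at 0°, whereas the neuron with a lateral preference (sloped tuning at 0°) carries a positive amount.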

Regression and correlational analyses are extremely sensitive to outliers. To be sure that the results are robust, please repeat the analyses after outlier correction, e.g., based on the 1.5xIQR rule along each dimension or based on each data points influence on the regression (Cook's distance).

In this revision, we added: 1) summary statistics of the individual monkeys, and performed an LMM analysis (in response to Essential Revisions, Comment #1), and 2) we screened the neurons based on their PSE estimate reliability (in response to Essential Revisions, Comment #2). In addition, we also tested the influence of outliers using ‘Cook's distance’ (see details in response to Essential Revisions, Comment #3). Results from these analyses support the robustness of the results.

Looking at the currently depicted sigmoids, I suggest increasing the maximal value for the lapse rate. Even though it is not fully clear to me what a lapse rate of neurometric curve actually means, achieving a better fit for the curves seems essential given that the PSEs are at the center of the results.

We assessed the effect of allowing for a broader range of lapse rates on the neurometric fits, and PSEs in particular. Allowing for a broader range of lapse rates increased pseudo-R² values (expected when increasing the number of model parameters) and sometimes affected threshold estimates (thresholds could be under- or overestimated, depending on whether lapse rates are allowed or not, respectively). However, it had little influence on the PSE estimates. For example, the post-recalibration neurometric curve in Figure 3A (blue; redrawn in Author response image 2A) has a threshold = 17.8°. If larger lapse rates are allowed (Author response image 2B), this could provide an unreasonably low threshold estimate (threshold = 0.6°). Thus, allowing (or not) large lapse rates can drastically affect threshold estimates. By contrast, the post-recalibration neurometric PSE was 3.6° and 2.8° for the two curves, respectively. Hence, in general, the effect of lapse rates on PSE estimates is small. Because PSE (not threshold) is the parameter of interest in this study, we opted for a more conservative neurometric fit (with fewer parameters, i.e. tighter lapse rates) and exclusively used the PSE (not threshold) estimates. In addition, we added screening of neuronal PSE estimate reliability (via bootstrapping) in the revised manuscript (please see details in response to Essential Revisions, Comment #2).
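One way to see why lapse rates interact with the threshold more than with the PSE is to write out a common parameterization of the fitted curve. The exact functional form used for the fits is not stated here, so the following is an assumption: a cumulative Gaussian with a symmetric lapse rate, where `threshold` denotes the Gaussian SD. The function name is hypothetical.

```python
import math

def cumulative_gaussian_with_lapse(heading_deg, pse_deg, threshold_deg, lapse=0.0):
    """P(rightward) = lapse + (1 - 2*lapse) * Phi((heading - PSE) / threshold)."""
    z = (heading_deg - pse_deg) / (threshold_deg * math.sqrt(2.0))
    phi = 0.5 * (1.0 + math.erf(z))
    return lapse + (1.0 - 2.0 * lapse) * phi

# At the PSE the curve passes through 0.5 for any symmetric lapse rate.
p_no_lapse = cumulative_gaussian_with_lapse(3.0, pse_deg=3.0, threshold_deg=5.0, lapse=0.0)
p_with_lapse = cumulative_gaussian_with_lapse(3.0, pse_deg=3.0, threshold_deg=5.0, lapse=0.2)

# Away from the PSE, a larger lapse rate flattens the curve's asymptotes.
far_no_lapse = cumulative_gaussian_with_lapse(30.0, pse_deg=3.0, threshold_deg=5.0, lapse=0.0)
far_with_lapse = cumulative_gaussian_with_lapse(30.0, pse_deg=3.0, threshold_deg=5.0, lapse=0.2)
```

In this parameterization the lapse rate and the threshold both shape the steepness and asymptotes of the curve, so a fit can trade one against the other, while the 50% point stays at the PSE.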

Author response image 2
Influence of lapse rates on example neurometric fit.

(A) Neurometric fit of vestibular responses, without allowing large lapse rates, for the example MSTd neuron (same as Figure 3A). (B) Neurometric fit of the same data, allowing large lapse rates.

The methods section provides information about the number of neurons per area included in the analyses in general. Additional information about the number of neurons per neurometric curve would be useful.

We added this information in the Methods subsection “Neuronal shifts”, paragraph 1.

Please add in l. 698 how the baseline activity for each neuron was determined, e.g., based on a single interval at the beginning of the session or based on recordings in between trials.

Baseline FRs were calculated by taking the average FR in the 1 s window before stimulus onset. We have added this to the revised Methods, subsection “Data analysis”, paragraph 2.
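A minimal sketch of this baseline calculation is given below. The data layout (one array of spike times per trial, in seconds, plus one stimulus-onset time per trial) and the averaging across trials are assumptions made for illustration. The function name is hypothetical.

```python
import numpy as np

def baseline_firing_rate(spike_times_per_trial, stim_onsets_s, window_s=1.0):
    """Mean firing rate (spikes/s) in the window_s seconds before stimulus onset."""
    rates = []
    for spikes, onset in zip(spike_times_per_trial, stim_onsets_s):
        spikes = np.asarray(spikes, dtype=float)
        in_window = (spikes >= onset - window_s) & (spikes < onset)
        rates.append(np.count_nonzero(in_window) / window_s)
    return float(np.mean(rates))

# Synthetic spike times (seconds) for two trials.
trials = [[0.1, 0.5, 0.9, 1.2], [2.5, 2.6, 3.5]]
onsets = [1.0, 3.0]
baseline = baseline_firing_rate(trials, onsets)
```

In the synthetic example, the first trial has three spikes and the second has two spikes in the 1 s pre-stimulus window, giving a baseline of 2.5 spikes/s.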

Probably just a typo but in l. 689 Pearson's correlation has nothing to do with linear regression.

We have corrected this.

Writing

Generally, I find the manuscript to be well-written and organized. However, in its current form, the manuscript is geared towards a small audience, electrophysiologists familiar with most publications by the Angelaki and DeAngelis labs. A much wider audience could be reached by 1) phrasing more precisely and 2) providing more information, reasoning, and explanations.

In response to this comment and thanks to the comments below, we now provide more information and explanations, with improved phrasing in our revised manuscript.

The verb 'recalibrate' is often used in a way that doesn't match the literature and the way recalibration is thought of. Information cannot be recalibrated; cues cannot recalibrate actively; responses do not recalibrate (together). A system is recalibrated and then it interprets incoming information differently. In most instances, a simple replacement of recalibrated with "shifted" will help -- AND/OR include a statement early on defining these more colloquial uses of recalibrate (e.g., "we refer to this pattern of neural activity as 'recalibration'…").

In the revised manuscript we now use the term ‘recalibrate’ more carefully. This entailed replacing many instances of “recalibrated” with “shifted”.

A few examples for more common and precise phrasing:

l. 35 -> recalibrating itself based on information from …

Thank you for the improved phrasing. We have updated the text.

l. 36 -> estimates for subsequently presented unisensory stimuli are shifted towards each other.

Thank you for the improved phrasing. We have updated that sentence.

l. 25 -> the tuning of neural responses to … cues was shifted in the same direction as the monkeys' perceptual judgments of subsequently presented unisensory stimuli.

Thank you for the improved phrasing. We have updated that sentence.

l. 43 -> tuning of vestibular neurons was shifted in the same direction as vestibular heading perception.

This issue is present throughout the manuscript but especially so in the abstract, in brief section, and the discussion. I doubt that someone not familiar with the literature can understand the abstract. There is a section in the discussion (l. 483f) that is written in a very abstract manner but phrased according to the literature, i.e., in accordance with the ways most people think about recalibration. Please adjust the rest of the text accordingly.

Thanks for bringing this to our attention. We have updated the relevant text accordingly.

Comments in chronological order:

l. 57 Behavioral papers cited in a sentence about the neuronal basis of multisensory integration.

We updated those references.

l. 76 Just saying that results exist is rather unusual. In a few words, what did the neuroimaging studies find?

We agree and have now added a brief explanation of these neuroimaging studies in the revised Introduction.

l. 91 Burge et al. is not about heading perception but about visual-haptic slant perception.

Thank you. We removed this reference from the paper.

l. 103 Again, the typical phrasing would be that perception was recalibrated as indicated by shifts in subsequent perceptual estimates.

Thanks. We have updated that sentence.

l. 108f It would be easier for readers to learn about supervised recalibration in the discussion, as now they have to shift mentally from unsupervised to supervised and back to unsupervised recalibration.

We agree and have moved this paragraph about supervised recalibration to the Discussion.

l. 119 Why would diverging shifts in the tuning of neural mechanisms between cortical areas not be detectable when both perceptual estimates are shifted in the same direction?

We now better clarify this point in the revised manuscript. In the supervised calibration paradigm, the overall shifts for both vestibular and visual cues were “yoked” in the same direction. Thus, we could not dissociate whether neuronal shifts for a particular cue (e.g., vestibular) follow the behavioral shifts for that cue (vestibular) or the other cue (visual). Both predict shifts in the same direction, and it would be difficult to dissociate these based on small differences in the expected shift magnitudes (because of noise). This (unsupervised) paradigm elicits behavioral shifts in opposite directions, and thus can more readily discern if the vestibular neurometrics shift with visual (rather than vestibular) behavioral shifts. We have added this explanation to the revised Discussion subsection “Modality-specific recalibration of vestibular and visual cues”, paragraph 3.

l. 126 Isn't that what the neuroimaging studies showed? Changes in relatively early areas?

Yes, we have updated the sentence by referencing neuroimaging studies that found auditory-visual recalibration in relatively early sensory areas (Amedi et al., 2002; Zierul et al., 2017).

l. 140 "therefore we expected to see perceptual shifts resulting from unsupervised recalibration in MSTd and PIVC". That is a surprising claim, which needs further explanation as it implies that perceptual decisions are based on neural activity in these relatively early areas. Maybe leave that conclusion entirely to the discussion.

We now justify this hypothesis based on the neuroimaging studies that found auditory-visual recalibration in relatively early sensory areas. Also, we further explain that because unsupervised plasticity is sensory-driven (occurs as a result of the cross-modal discrepancy, in the absence of overt feedback) we expected to observe neural correlates of recalibration in these lower-level sensory areas, and for these to propagate along the perceptual decision-making hierarchy.

l. 145 The claim that VIP underlies cognitive processing will startle many. Why not simply say that VIP seems to be involved in perceptual decision making or higher order perceptual functions some of which yet have to be understood?

We thank the reviewer for this refinement, and have updated that explanation accordingly.

l. 161 Please explain that 1) during the recalibration block, the monkeys are exposed to visual-vestibular stimulus pairs with a consistent discrepancy in heading direction and that 2) in pre- and post-recalibration blocks the perception of heading direction indicated by unisensory stimuli is measured and that 3) recalibration effects are measured as the difference between pre- and post-recalibration results. The text should make sense to a reader who has never performed a recalibration study. No reader should be forced to read another paper just to understand the most crucial aspects of the current one!

We agree and have now added more details to explain the experimental procedure.

l. 181f It does not seem necessary to describe the figure in detail in the text. Explanations of how to read a figure are best placed in its caption. Given that these are selected-curves of a single-subject in one of several sessions, the size of the shifts should not be compared or discussed in the text.

We have removed the unnecessary details from the text, and also removed the shift size comparison for the example plots. We also modified the figure legends.

l. 200 At this point the reader has never heard that the monkeys underwent several sessions nor in which way the sessions differed from each other. Please add that information.

Thank you. We have now added information (specifically the number of sessions, and how the sessions differed) so that the reader can better understand the results which follow (please see updated Results subsection “Vestibular and visual perceptual estimates shift toward each other”, paragraph 3).

l. 202 Why force the reader to learn what δ + means in this manuscript? It is much better to just speak of 'sessions in which visual heading was rightwards of vestibular heading during the recalibration phase'. Please apply that thought throughout.

We have now added (in response to the Reviewer’s previous comment) a better explanation regarding the two types of sessions, and the meaning of δ + and δ -. Because these terms appear many times throughout the text, and there is not enough space in the figures to write explicitly 'sessions in which visual heading was rightwards of vestibular heading during the recalibration phase’ etc., we think that it is clearer to keep this nomenclature.

l. 232 What does 'tuning recalibrate with perceptual shifts' mean? A good advice I got from a book on scientific writing is to take sentences literally even when they describe scientific matter. One possible general phrasing would be that the neural tuning shifted in the same (opposite for VIP) direction as heading perception. A more results-oriented phrasing is that the neurometric functions shifted in the same direction as the psychometric functions for each modality. Please repair this throughout the text!

We have amended the language accordingly.

l. 240 This description of the neurometric analysis is not sufficient (see comment in the methods section). In addition, "PSEs were extracted similar to" is confusing, PSEs always correspond to the 50% point of a psychometric function (e.g., the mean of the Gaussian distribution). More importantly, it should be explained to readers that the PSE of a neurometric curve is the physical heading direction at which the chances to make a rightward judgment based on the firing rates of the neurons are fifty-fifty, i.e., straight-ahead according to the neuronal data.

We have updated the text accordingly.

l. 245 I think this is the first instance in which the term "behavioral shifts" is used instead of "perceptual shifts" but then it occurs consistently and is even present in the figures. In psychophysics, the term "behavioral shifts" would be used if there is any reason to suspect that the behavior does not correspond to the percepts, e.g., because participants show a response bias rather than a perceptual bias. If this might be the case, it should be discussed, otherwise please use perceptual.

We agree with the reviewer’s suggestion. We have replaced the term "behavioral shifts" with "perceptual shifts".

l. 286 Again, please phrase more precisely.

We now changed this subtitle to “Neuronal tuning in VIP to both vestibular and visual stimuli shifted according to vestibular perceptual shifts”.

l. 315f I like the analysis and the paragraph could function as a prototype for the subsequent paragraphs regarding its degree of abstraction and briefness. Yet, again more precise phrasing would be nice, e.g., neurons don't respond visually they exclusively respond when visual stimuli are presented.

We have revised the phrasing accordingly.

l. 334f Again, in my view it is not necessary to describe figures and do so panel by panel.

We have removed unnecessary descriptions of the figure from the main text.

l. 339f The last sentence of the paragraph does not parse.

The claim that neural recalibration follows the velocity profile of the stimulus is too strong and not correct as the figures show correlations not neural recalibration. For the earlier areas, the claim that the significant correlation between neurometric and psychometric shifts is driven by firing rates during the period of maximal stimulus velocity might be correct.

We agree with this point and have refined the explanation in line with the reviewer’s suggestion.

l. 350f Similarly, for VIP neurons it is also not necessary to describe the figure.

Thanks, we have refined this paragraph.

l. 356 The claim that the correlations between neurometric and psychometric shifts given firing rates recorded at the end of the stimulus presentation has nothing to do with choice behavior but reflects neuronal recalibration is not substantiated at this point, an anti-correlation is a strong relation. The conclusion might be drawn based on the last section of the results.

We have removed that sentence.

l. 364 "During recalibration" means during the recalibration phase but I think that is not what the authors mean because there are no behavioral choices during the recalibration phase. Please search the text for this phrasing, it probably is not adequate at other instances, too.

We agree with the reviewer’s suggestion. We have replaced the phrase "during recalibration" with "after recalibration" where appropriate in the text.

l. 412 Not sure if this is the first instance, but cross-sensory is not a word used in the literature, please use either multisensory or cross-modal or across the senses. Some authors will point out that multisensory should only be used in the case of perceptual fusion. Please replace the word throughout the text.

We agree with the reviewer’s suggestion. We have replaced the phrase "cross-sensory" with "cross-modal" throughout the text.

l. 412f This holds for the full Discussion: See above comments, the phrasing is very uncommon, especially the use of the verb 'recalibrate'. Almost all instances of 'recalibrated' should be replaced with 'changed' or 'shifted'. "Together with" means "shifts in same direction as perceptual shifts" and so on. It gets much better from line 478 on.

We agree with the reviewer’s suggestion. We replaced many instances of the word “recalibrate” with “shifted”. We also replaced phrases like “recalibrated together with the corresponding vestibular perceptual shifts” with “shifted in the same direction as vestibular perceptual shifts” etc.

l. 422 Not sure if vision scientists would call MSTd a multisensory area.

Although MSTd is part of the extra-striate visual cortex, it has been extensively shown to have multisensory (including vestibular) responses. We back up this claim in the manuscript with relevant references.

l. 432f Why does 'unsupervised' imply changes in early areas? I wondered the same in the introduction. And how does that go together with the suggestion that the conflict is detected in ACC? It might be easiest to just refer to the neuroimaging studies or simply not make a claim.

We now justify this hypothesis (in the revised Introduction) based on the neuroimaging studies, as recommended by the reviewer.

l. 457f What is "individualized recalibration"? The literature uses "modality-specific". I fail to see what this section adds that is not in the previous section. Why would the yoking found in the supervised recalibration study for both perceptual and neuronal shifts predict uniformity in the neuronal shifts in a paradigm that leads to non-yoked perceptual shifts?

We agree with the reviewer’s suggested phrasing and replaced "individualized recalibration" with "modality-specific recalibration". In the supervised calibration paradigm, the overall shifts for both vestibular and visual cues were “yoked” in the same direction. Thus, we could not dissociate whether neuronal shifts for a particular cue (e.g., vestibular) follow the behavioral shifts for that cue (vestibular) or the other cue (visual). Both predict shifts in the same direction, and it would be difficult to dissociate these based on the expected shift magnitudes (because of noise). This (unsupervised) paradigm elicits behavioral shifts in opposite directions, and thus can more readily discern if the vestibular neurometrics shift with visual (rather than vestibular) behavioral shifts.

l. 499 Please phrase the conclusion as a possibility, as it remains unclear to which degree supervised and unsupervised recalibration correspond.

We agree with the reviewer’s suggestion and have amended the language accordingly.

l. 513 'Shift in reference frame' does this refer to a change in the supramodal definition of straight-ahead based on the vestibular tuning in lower sensory areas? The idea that VIP is tuned in a vestibular reference frame fits with earlier studies investigating visual-tactile reference frames (e.g., Avillac et al., Sereno and Huang, and also Graziano recording in F4).

We do not know if this necessarily means “a change in the supramodal definition of straight-ahead based on the vestibular tuning in lower sensory areas”. We just point out that neuronal signals in VIP have been previously shown to be in head/body centered coordinates, and that the shifts we observe here are in line with that notion. We have clarified this section, and added these relevant references, in the revised Discussion subsection “Contrary recalibration in higher-level area VIP”, paragraph 3. Thank you.

l. 531 what does "cross-sensory recalibration, vs. simply reflecting the recalibrated signals" mean? What are recalibrated signals and why are they different from cross-modal recalibration?

By “cortical areas actively involved in cross-modal recalibration” we mean that they had an instrumental (causal) role in vestibular-visual recalibration. By “simply reflecting the recalibrated signals” we mean that they were not instrumental (causal) for recalibration, and thus only reflect signals that were recalibrated in/by other brain areas. We have now refined and better explained this idea in the revised Discussion.

General writing guidelines:

All figures should be optimized for the outlet, which in the case of eLife is wide and short figures as the text is never set in two-columns.

We have changed the layout of Figures 3, 4, and 5 accordingly.

All acronyms should be defined when they are used for the first time, e.g., point of subjective equivalence (PSE). It can be very helpful for readers to treat abstract, significance statement, main text, and methods as separate and define acronyms anew.

A number and its unit are separated by a space, e.g., 300 ms instead of 300ms. Please check this throughout the text including the figures.

We thank the reviewer for these suggestions, and have amended the text accordingly.

Figure 1

A: A visualization of the optic flow stimuli (e.g., two subsequent frames or little arrows indicating the motion vectors) would be nice.

We added a schematic (rightmost in Figure 1A) to visualize the optic flow.

B: It should be indicated either in the figure or in the caption that δ was constant within a single session, but theta was varied within a session and could take on all values depicted in A.

We added these points to the legend of Figure 1.

The grey vector corresponds to the combined direction if both cues have the same reliability, which probably wasn't exactly the case given Figure 2A.

The grey vector was not meant to reflect the perceived ‘combined cue’, rather, it is just a convention for defining the stimuli (visual and vestibular stimuli were offset ± 5° relative to this). We have clarified this in the legend.

C: 'no choice' is misplaced.

Thanks. We have corrected this.

All panels: please add a space between each number and its unit.

OK. We have updated accordingly.

Figure 2

A, B: Please indicate the monkey and the session number.

We added this information to the figure legend.

A, B: A "rightward choice ratio" would usually be interpreted as the ratio of rightward to leftward choices; I assume the proportion of rightward responses is shown.

Yes. We replaced "ratio" with "proportion" in the y-label.
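To illustrate the distinction raised here, the following is a minimal sketch of the two quantities. The function names and the choice counts are hypothetical and are not taken from the study; the sketch only shows why "proportion" is the accurate label for a value bounded between 0 and 1.

```python
def rightward_proportion(n_right: int, n_left: int) -> float:
    """Fraction of all choices that were rightward (always within [0, 1])."""
    total = n_right + n_left
    if total == 0:
        raise ValueError("no choices recorded")
    return n_right / total


def rightward_ratio(n_right: int, n_left: int) -> float:
    """Rightward choices divided by leftward choices (unbounded above)."""
    if n_left == 0:
        raise ValueError("ratio is undefined without leftward choices")
    return n_right / n_left


# Hypothetical counts for one heading direction: 30 rightward, 10 leftward.
print(rightward_proportion(30, 10))  # 0.75
print(rightward_ratio(30, 10))       # 3.0
```

A psychometric curve plotted against heading uses the first quantity, which is why the y-axis label was changed.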

C: Indicate how the shift was calculated.

We added to the legend how the shift was calculated.

C: I cannot see the error bars referred to in the caption.

The error bars are indeed difficult to see because the SEs are small. To improve visibility, we reduced the triangle symbol size and made the error bars thicker.

C: Please indicate the distribution for each monkey to assure readers that the results hold within and across subjects (see comments on the statistical analysis).

We updated Figure 2C to display the distribution of each monkey (using different texture patterns). In addition, the summary statistics for each monkey are presented in Figure 2-source data 1.

Reviewer #2 (Recommendations for the authors):

Aside from the major comments outlined in the public review, I found that some figure legends should be expanded. For instance, in Figure 8 it is unclear what the empty and filled circles indicate. I would recommend that the authors check the manuscript carefully.

We thank the reviewer for bringing this to our attention. Filled and empty circles indicate significant and non-significant partial correlations, respectively. We have added this and other details to the figure legends.

While in general the manuscript is well written, I found the "in brief" section rather difficult and not suitable for a broader audience. I would suggest rewriting the section accordingly.

We removed the "in brief" and "highlights" sections from the manuscript because these are not standard in eLife.

https://doi.org/10.7554/eLife.82895.sa2

Article and author information

Author details

  1. Fu Zeng

    Key Laboratory of Brain Functional Genomics (Ministry of Education), East China Normal University, Shanghai, China
    Contribution
    Data curation, Formal analysis, Validation, Visualization, Methodology, Writing - original draft
    Competing interests
    No competing interests declared
    ORCID: 0009-0000-6857-6485
  2. Adam Zaidel

    Gonda Multidisciplinary Brain Research Center, Bar-Ilan University, Ramat Gan, Israel
    Contribution
    Conceptualization, Formal analysis, Supervision, Funding acquisition, Validation, Methodology, Writing - review and editing
    For correspondence
    adam.zaidel@biu.ac.il
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-4405-8717
  3. Aihua Chen

    Key Laboratory of Brain Functional Genomics (Ministry of Education), East China Normal University, Shanghai, China
    Contribution
    Conceptualization, Data curation, Formal analysis, Supervision, Funding acquisition, Investigation, Methodology, Writing - original draft, Project administration, Writing - review and editing
    For correspondence
    ahchen@brain.ecnu.edu.cn
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-5066-2844

Funding

Ministry of Science and Technology of the People's Republic of China (2021ZD0202600)

  • Aihua Chen

National Natural Science Foundation of China (32171034)

  • Aihua Chen

National Natural Science Foundation of China (32061143003)

  • Aihua Chen

Israel Science Foundation (3318/20)

  • Adam Zaidel

The funders had no role in study design, data collection, and interpretation, or the decision to submit the work for publication.

Acknowledgements

This work was supported by grants from the 'STI2030-major projects' (No. 2021ZD0202600), the National Basic Research Program of China (No. 32171034) to AC, and the ISF-NSFC joint research program to AC (No. 32061143003) and AZ (No. 3318/20). We thank Prof. Dora Angelaki for the helpful comments. We are also grateful to Minhu Chen for outstanding computer programming.

Ethics

All animal surgeries and experimental procedures were approved by the Institutional Animal Care and Use Committee at East China Normal University (IACUC protocol number: Mo20200101).

Senior Editor

  1. Michael J Frank, Brown University, United States

Reviewing Editor

  1. Christopher R Fetsch, Johns Hopkins University, United States

Reviewers

  1. Umberto Olcese, University of Amsterdam, Netherlands
  2. Robbe Goris, The University of Texas at Austin, Austin, United States

Version history

  1. Received: August 22, 2022
  2. Preprint posted: September 27, 2022
  3. Accepted: February 21, 2023
  4. Version of Record published: March 6, 2023 (version 1)

Copyright

© 2023, Zeng et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

Fu Zeng, Adam Zaidel, Aihua Chen (2023)
Contrary neuronal recalibration in different multisensory cortical areas
eLife 12:e82895.
https://doi.org/10.7554/eLife.82895
