Using the past to estimate sensory uncertainty

  1. Ulrik Beierholm (corresponding author)
  2. Tim Rohe
  3. Ambra Ferrari
  4. Oliver Stegle
  5. Uta Noppeney
  1. Psychology Department, Durham University, United Kingdom
  2. Department of Psychiatry and Psychotherapy, University of Tübingen, Germany
  3. Department of Psychology, Friedrich-Alexander University Erlangen-Nuernberg, Germany
  4. Centre for Computational Neuroscience and Cognitive Robotics, University of Birmingham, United Kingdom
  5. Max Planck Institute for Intelligent Systems, Germany
  6. European Molecular Biology Laboratory, Genome Biology Unit, Germany
  7. Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany
  8. Donders Institute for Brain, Cognition and Behaviour, Radboud University, Netherlands
9 figures, 2 tables and 2 additional files

Figures

Figure 1 with 1 supplement
Audiovisual localization paradigm and Bayesian causal inference model for learning visual reliability.

(A) Visual (V) signals (cloud of 20 bright dots) were presented every 200 ms for 32 ms. The cloud’s mean location was resampled independently over time from five possible locations (−10°, −5°, 0°, 5°, 10°), with an inter-trial asynchrony jittered between 1.4 and 2.8 s. In synchrony with the change in the cloud’s mean location, the dots changed their color and a sound was presented (AV signal), which the participants localized using five response buttons. The location of the sound was sampled from the two locations adjacent to the visual cloud’s mean location (i.e. ±5° AV spatial disparity). (B) The generative model for the Bayesian learner explicitly modeled the potential causal structures, that is whether the visual (Vi) signals and an auditory (A) signal were generated by one common audiovisual source St (C = 1) or by two independent sources SVt and SAt (C = 2) (n.b. only the model component for the common-source case is shown to illustrate the temporal updating; for the complete generative model, see Figure 1—figure supplement 1). Importantly, the reliability (i.e. 1/variance) of the visual signal at time t (λt) depends on the reliability of the previous visual signal (λt-1) for both model components (i.e. common and independent sources).
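
For readers who want to trace the computation, below is a minimal numerical sketch of the causal inference step. It uses the standard closed-form expressions from Körding et al., 2007, assuming a zero-mean spatial prior and model averaging as the read-out; the parameter names sigma_a, sigma_0 and p_common mirror Table 2, but the function is illustrative rather than the authors' fitting code.

```python
import numpy as np

def causal_inference_estimate(x_a, x_v, sigma_a, sigma_v, sigma_0, p_common):
    """Auditory location estimate under Bayesian causal inference
    (Körding et al., 2007). A zero-mean spatial prior and model averaging
    are assumed here; parameter names mirror Table 2."""
    va, vv, v0 = sigma_a ** 2, sigma_v ** 2, sigma_0 ** 2
    # marginal likelihood of the signal pair under a common source (C = 1)
    denom = va * vv + va * v0 + vv * v0
    like_c1 = np.exp(-0.5 * ((x_a - x_v) ** 2 * v0 + x_a ** 2 * vv + x_v ** 2 * va)
                     / denom) / (2 * np.pi * np.sqrt(denom))
    # marginal likelihood under two independent sources (C = 2)
    like_c2 = (np.exp(-0.5 * x_a ** 2 / (va + v0)) / np.sqrt(2 * np.pi * (va + v0))
               * np.exp(-0.5 * x_v ** 2 / (vv + v0)) / np.sqrt(2 * np.pi * (vv + v0)))
    post_c1 = p_common * like_c1 / (p_common * like_c1 + (1 - p_common) * like_c2)
    # reliability-weighted fusion (C = 1) vs. auditory-only estimate (C = 2)
    s_c1 = (x_a / va + x_v / vv) / (1 / va + 1 / vv + 1 / v0)
    s_c2 = (x_a / va) / (1 / va + 1 / v0)
    # average the two structure-conditional estimates by their posterior
    return post_c1 * s_c1 + (1 - post_c1) * s_c2
```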

Figure 1—figure supplement 1
Generative model for the Bayesian learner.

The Bayesian Causal Inference model explicitly models whether auditory and visual signals are generated by one common (C = 1) or two independent sources (C = 2) (for further details see Körding et al., 2007). We extend this Bayesian Causal Inference model into a Bayesian learning model by making the visual reliability (λV,t, i.e. the inverse of uncertainty or variance) of the current trial dependent on the previous trial.
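
The paper's actual inference uses the variational scheme detailed in Appendix 2. As a rough, self-contained illustration of the idea that λV,t is tethered to λV,t-1, the sketch below tracks the visual precision across trials with a discretized-grid filter and an assumed random walk in log precision; the coupling parameter is named kappa only by loose analogy with the κ of Table 2.

```python
import numpy as np

def grid_filter_precision(dot_clouds, kappa, grid=np.logspace(-3, 1, 200)):
    """Track the visual precision lambda_V,t across trials on a discretized grid.

    Illustrative only (not the paper's variational scheme): log precision is
    assumed to follow a random walk, log lambda_t ~ N(log lambda_{t-1}, 1/kappa),
    and each trial contributes the Gaussian likelihood of its dot cloud.
    """
    log_grid = np.log(grid)
    post = np.full(grid.size, 1.0 / grid.size)       # flat prior over lambda
    # transition matrix of the assumed log-space random walk
    trans = np.exp(-0.5 * kappa * (log_grid[:, None] - log_grid[None, :]) ** 2)
    trans /= trans.sum(axis=0, keepdims=True)
    means = []
    for dots in dot_clouds:                          # dots: 1-D array of dot azimuths
        post = trans @ post                          # predict: diffuse the posterior
        n, dev2 = dots.size, np.sum((dots - dots.mean()) ** 2)
        # Gaussian likelihood of the dot scatter (cloud mean profiled out)
        loglik = 0.5 * (n - 1) * log_grid - 0.5 * grid * dev2
        post *= np.exp(loglik - loglik.max())
        post /= post.sum()
        means.append(grid @ post)                    # posterior mean precision
    return np.array(means)
```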

Figure 2 with 1 supplement
Time course of visual noise and relative auditory weights for continuous sequences of visual noise.

The visual noise (i.e. STD of the cloud of dots, right ordinate) and the relative auditory weights (mean across participants ± SEM, left ordinate) are displayed as a function of time. The STD of the visual cloud was manipulated as (A) a sinusoid (period 30 s, N = 25), (B) a random walk (RW1, period 120 s, N = 33) and (C) a smoothed random walk (RW2, period 30 s, N = 19). The overall dynamics, as quantified by the power spectrum, are faster for RW2 than for RW1 (peak in the frequency range [0, 0.2] Hz: Sinusoid 0.033 Hz, RW1 0.025 Hz, RW2 0.066 Hz). The RW1 and RW2 sequences were mirror-symmetric around the half-time (i.e. the second half was the reversed first half). The visual clouds were presented every 200 ms (i.e. at 5 Hz). The trial onsets, that is audiovisual (AV) signals (color change with sound presentation, black dots), were interspersed with an inter-trial asynchrony jittered between 1.4 and 2.8 s. On each trial observers localized the sound. The relative auditory weights were computed from regression models of the sound localization responses, separately for each of the 20 temporally adjacent bins covering the entire period, within each participant. The relative auditory weights vary between one (i.e. pure auditory influence on the localization responses) and zero (i.e. pure visual influence). For illustration purposes, the clouds of dots for the lowest (i.e. V signal STD = 2°) and the highest (i.e. V signal STD = 18°) visual variance are shown in (A).
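
A sketch of how such a binned weight can be obtained from regression coefficients is given below; the exact regression design (binning, covariates; cf. Figure 2—figure supplement 1) may differ from this minimal version.

```python
import numpy as np

def relative_auditory_weight(responses, loc_a, loc_v):
    """Relative auditory weight wA within one time bin (sketch).

    Regress sound-localization responses on the auditory and visual signal
    locations; wA = beta_A / (beta_A + beta_V) is 1 for a purely auditory
    and 0 for a purely visual influence on the responses.
    """
    loc_a, loc_v = np.asarray(loc_a, float), np.asarray(loc_v, float)
    X = np.column_stack([np.ones_like(loc_a), loc_a, loc_v])
    beta, *_ = np.linalg.lstsq(X, np.asarray(responses, float), rcond=None)
    return beta[1] / (beta[1] + beta[2])
```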

Figure 2—figure supplement 1
Time course of the relative auditory weights for continuous sequences of visual noise when controlling for location of the cloud of dots in the previous trial.

Relative auditory weights (mean across participants ± SEM, left ordinate) and visual noise (i.e. STD of the cloud of dots, right ordinate) are displayed as a function of time, as shown in Figure 2 of the main text. To compute the relative auditory weights, the sound localization responses were regressed on the A and V signal locations within bins of 1.5 s (A, B) or 6 s (C) width, across sequence repetitions, within each participant. To control for a potential effect of past visual locations, the location of the visual cloud of dots in the previous trial was included in this regression model as a covariate (Supplementary file 1-Table 3).

Figure 3
Observers’ relative auditory weights for continuous sequences of visual noise.

Relative auditory weights wA of the 1st (solid) and the flipped 2nd half (dashed) of a period (binned into 20 bins) plotted as a function of the normalized time in the sinusoidal (red), the RW1 (blue), and the RW2 (green) sequences. Relative auditory weights were computed from auditory localization responses of human observers.

Figure 4
Observed and predicted relative auditory weights for continuous sequences of visual noise.

Relative auditory weights wA of the 1st (solid) and the flipped 2nd half (dashed) of a period (binned into 20 bins) plotted as a function of the normalized time in the sinusoidal (red), the RW1 (blue) and the RW2 (green) sequences. Relative auditory weights were computed from the auditory localization responses of human observers (A) and of the Bayesian (B), exponential (C), and instantaneous (D) learning models. For comparison, the standard deviation of the visual signal is shown in (E). Please note that all models were fitted to observers’ auditory localization responses (i.e. not the auditory weight wA). (F) Bayesian model comparison – Random effects analysis: The matrix shows the protected exceedance probability (color coded and indicated by the numbers) for pairwise comparisons of the Instantaneous (Inst), Bayesian (Bayes) and Exponential (Exp) learners, separately for each of the four experiments. Across all experiments, the Bayesian or the Exponential learner outperformed the Instantaneous learner (protected exceedance probability >0.94), indicating that observers used the past to estimate sensory uncertainty. However, it was not possible to arbitrate reliably between the Exponential and the Bayesian learner across all experiments (protected exceedance probability in bottom row).
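
One common reading of the Exponential learner, sketched below under that assumption, is a leaky average of the trial-wise sample variances, with γ (Table 2) setting the discounting rate; the Instantaneous learner then corresponds to the special case γ = 1, which uses only the current trial.

```python
import numpy as np

def exponential_variance_estimates(sample_vars, gamma):
    """Exponential-discounting estimate of visual variance across trials.

    Illustrative form: var_t = gamma * s2_t + (1 - gamma) * var_{t-1},
    where s2_t is the sample variance of the current dot cloud and gamma
    controls how quickly past evidence is discounted.
    """
    sample_vars = np.asarray(sample_vars, float)
    est = np.empty_like(sample_vars)
    est[0] = sample_vars[0]
    for t in range(1, len(sample_vars)):
        est[t] = gamma * sample_vars[t] + (1 - gamma) * est[t - 1]
    return est
```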

Figure 5 with 2 supplements
Time course of visual noise and relative auditory weights for sinusoidal sequence with intermittent jumps in visual noise (N = 18).

(A) The visual noise (i.e. STD of the cloud of dots, right ordinate) is displayed as a function of time. Each cycle included one abrupt increase and decrease in visual noise. The sequence of visual clouds was presented every 200 ms (i.e. at 5 Hz), while audiovisual (AV) signals (black dots) were interspersed with an inter-trial asynchrony jittered between 1.4 and 2.8 s. (B, C) Relative auditory weights wA of the 1st (solid) and the flipped 2nd half (dashed) of a period (binned into 15 bins) plotted as a function of the time in the sinusoidal sequence with intermittent inner (light gray), middle (gray), and outer (dark gray) jumps. Relative auditory weights were computed from auditory localization responses of human observers (B) and the Bayesian learning model (C). Please note that all models were fitted to observers’ auditory localization responses (i.e. not the auditory weight wA).
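
For simulation purposes, one period of such a noise sequence can be generated as below. The period, presentation rate and STD range follow the Figure 2 caption; the jump timing and size arguments are illustrative placeholders, not the experiment's three jump positions.

```python
import numpy as np

def sinusoid_with_jump(duration_s=30.0, rate_hz=5.0, std_min=2.0, std_max=18.0,
                       jump_at_s=7.5, jump_to=18.0, jump_len_s=1.0):
    """One period of the sinusoidal visual-noise STD with one intermittent jump.

    Sketch only: the jump segment produces one abrupt increase and, at its
    end, one abrupt decrease in visual noise, as described in the caption.
    """
    t = np.arange(0.0, duration_s, 1.0 / rate_hz)
    mid, amp = (std_max + std_min) / 2, (std_max - std_min) / 2
    std = mid + amp * np.sin(2 * np.pi * t / duration_s)
    in_jump = (t >= jump_at_s) & (t < jump_at_s + jump_len_s)
    std[in_jump] = jump_to        # abrupt increase, then return to the sinusoid
    return t, std
```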

Figure 5—figure supplement 1
Time course of relative auditory weights and visual noise for the sinusoidal sequence with intermittent jumps in visual noise for the exponential and instantaneous learning models.

Relative auditory weights wA,bin (mean across participants) of the 1st (solid) and the flipped 2nd half (dashed) of a period (binned into 15 time bins) plotted as a function of the time in the sinusoidal sequence with intermittent inner (light gray), middle (gray), and outer (dark gray) jumps. Relative auditory weights were computed from auditory localization responses of the exponential (A) or instantaneous (B) learning models. For comparison, the standard deviation of the visual signal is shown in (C). Please note that all models were fitted to observers’ auditory localization responses (i.e. not the auditory weight wA).

Figure 5—figure supplement 2
Time course of relative auditory weights and root mean squared error of the computational models before and after the jumps in the sinusoidal sequence with intermittent jumps.

(A) Relative auditory weights wA (mean across participants) shown as a function of time around the up-jumps (left panel) and the down-jumps (right panel) for observers’ behavior and the instantaneous, exponential and Bayesian learners. Relative auditory weights were computed from auditory localization responses, for the behavioral data and for the predictions of the three computational models, in time bins of 200 ms (i.e. the 5 Hz rate of the visual clouds). Trials from the three types of up- and down-jumps were pooled to increase the reliability of the wA estimates. Because some time bins included only a few trials in some participants, individual wA values that deviated from the median by more than three times the scaled median absolute deviation were excluded from the analysis. Note that the up-jumps occurred around the steepest increase in visual noise, so that the Bayesian and exponential learners underestimated the visual noise (Figure 5C), leading to smaller wA than for the instantaneous learner already before the up-jump. (B) Root mean squared error (RMSE; computed across participants) between the wA computed from behavior and the models’ predictions (as shown in A), shown as a function of time around the up-jumps (left panel) and the down-jumps (right panel). Please note that all models were fitted to observers’ auditory localization responses (i.e. not the auditory weight wA).
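
The outlier rule and error metric described above can be sketched as follows, per time bin with jump types pooled; this is a sketch, not the authors' analysis code.

```python
import numpy as np
from scipy.stats import median_abs_deviation

def rmse_behavior_vs_model(wa_behavior, wa_model):
    """RMSE across participants between behavioral and model-predicted wA
    for one 200 ms time bin, after excluding values beyond three times the
    scaled median absolute deviation (the caption's outlier rule)."""
    def mad_filter(x):
        x = np.asarray(x, float)
        mad = median_abs_deviation(x, scale='normal', nan_policy='omit')
        return np.where(np.abs(x - np.nanmedian(x)) <= 3 * mad, x, np.nan)
    b, m = mad_filter(wa_behavior), mad_filter(wa_model)
    return np.sqrt(np.nanmean((b - m) ** 2))
```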

Figure 6
Time course of the relative auditory weights, the standard deviation (STD) of the visual cloud and the STD of the visual uncertainty estimates.

(A) Relative auditory weights wA of the 1st (solid) and the flipped 2nd half (dashed) of a period (binned into 15 bins) plotted as a function of the time in the sinusoidal sequence. Relative auditory weights were computed from the predicted auditory localization responses of the Bayesian (blue) or exponential (green) learning models fitted to the simulated localization responses of a Bayesian learner based on visual clouds of 5 dots. (B) Relative auditory weights wA computed as in (A) for the sinusoidal sequence with intermittent jumps. Only the outer-most jump (dark gray in Figure 5B/C and Figure 5—figure supplement 1) is shown. (C, D) STD of the visual cloud of 5 dots (gray) and the STD of observers’ visual uncertainty as estimated by the Bayesian (blue) and exponential (green) learners (fitted to the simulated localization responses of a Bayesian learner) as a function of time for the sinusoidal sequence (C) and the sinusoidal sequence with intermittent jumps (D). Note that only an example time course from 600 to 670 s after experiment start is shown.

Appendix 2—figure 1
Generative model for one (C = 1) or two sources (C = 2).
Appendix 2—figure 2
Approximation of θ using the Laplace approximation.
Appendix 2—figure 3
Comparing the variational Bayes approximation with a numerical discretised-grid approximation.

Top row: example visual stimuli over eight subsequent trials. Middle row: the distribution of the estimated sample variance, with no learning over trials. Bottom row: the distribution of λV,t for the Bayesian model that incorporates learning across trials. The red line shows the numerical comparison when using a discretised grid to estimate the variance, as opposed to the variational Bayes approximation (green line).
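
The discretised-grid baseline (red line) can be reproduced in spirit with a few lines: evaluate the Gaussian likelihood of one trial's dots on a grid of candidate STDs and normalize. A flat prior over the STD is assumed here for simplicity.

```python
import numpy as np

def grid_posterior_variance(dots, sigma_grid=np.linspace(0.5, 30, 300)):
    """Posterior over the visual cloud's STD on a discretized grid (sketch).

    Gaussian likelihood of the observed dot scatter, with the cloud mean
    profiled out, evaluated on a grid of candidate STDs under a flat prior.
    """
    dots = np.asarray(dots, float)
    n, dev2 = dots.size, np.sum((dots - dots.mean()) ** 2)
    loglik = -(n - 1) * np.log(sigma_grid) - dev2 / (2 * sigma_grid ** 2)
    post = np.exp(loglik - loglik.max())
    return post / np.trapz(post, sigma_grid)   # normalized density over the grid
```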

Tables

Table 1
Analyses of the temporal asymmetry of the relative auditory weights across the four sequences of visual noise using repeated measures ANOVAs with the factors sequence part (1st vs. flipped 2nd half), bin and jump position (only for the sinusoidal sequences with intermittent jumps).
Sequence | Effect | F | df1 | df2 | p | Partial η²
Sinusoid | Part | 12.162 | 1 | 24 | 0.002 | 0.336
 | Bin | 92.007 | 3.108 | 74.584 | <0.001 | 0.793
 | Part × Bin | 2.167 | 2.942 | 70.617 | 0.101 | 0.083
RW1 | Part | 14.129 | 1 | 32 | 0.001 | 0.306
 | Bin | 76.055 | 4.911 | 157.151 | <0.001 | 0.704
 | Part × Bin | 1.225 | 4.874 | 155.971 | 0.300 | 0.037
RW2 | Part | 2.884 | 1 | 18 | 0.107 | 0.138
 | Bin | 60.142 | 3.304 | 59.467 | <0.001 | 0.770
 | Part × Bin | 3.385 | 4.603 | 82.849 | 0.010 | 0.158
Sinusoid with intermittent jumps | Jump | 28.306 | 2 | 34 | <0.001 | 0.625
 | Part | 24.824 | 1 | 17 | <0.001 | 0.594
 | Bin | 76.476 | 1.873 | 31.839 | <0.001 | 0.818
 | Jump × Part | 0.300 | 2 | 34 | 0.743 | 0.017
 | Jump × Bin | 8.383 | 3.309 | 56.247 | <0.001 | 0.330
 | Part × Bin | 1.641 | 3.248 | 55.222 | 0.187 | 0.088
 | Jump × Part × Bin | 0.640 | 5.716 | 97.175 | 0.690 | 0.036
  1. Note: The factor Bin comprised nine levels in the first three sequences and seven levels in the fourth. In the fourth sequence, the factor Jump comprised three levels. If Mauchly tests indicated significant deviations from sphericity (p<0.05), we report Greenhouse-Geisser corrected degrees of freedom and p values.
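
For reference, the Greenhouse-Geisser epsilon that rescales the degrees of freedom can be computed from the covariance of the within-subject condition scores; the corrected dfs are then ε(k − 1) and ε(k − 1)(n − 1). A minimal implementation of the standard formula:

```python
import numpy as np

def greenhouse_geisser_epsilon(data):
    """Greenhouse-Geisser epsilon for a one-way repeated-measures factor.

    data: (n_subjects, k_levels) array of condition scores. Epsilon
    multiplies df1 and df2, giving corrected dfs like those in Table 1.
    """
    S = np.cov(data, rowvar=False)                 # k x k covariance of levels
    k = S.shape[0]
    C = np.eye(k) - np.ones((k, k)) / k            # centering matrix
    D = C @ S @ C                                  # double-centered covariance
    eig = np.linalg.eigvalsh(D)
    return eig.sum() ** 2 / ((k - 1) * np.sum(eig ** 2))
```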

Table 2
Model parameters (median), absolute WAIC and relative ΔWAIC values for the three candidate models in the four sequences of visual noise.

Sequence | Model | σA | Pcommon | σ0 | κ or γ | WAIC | ΔWAIC
Sinusoid | Instantaneous learner | 5.56 | 0.63 | 8.95 | - | 81931.2 | 109.9
 | Bayesian learner | 5.64 | 0.65 | 9.03 | κ: 7.37 | 81821.3 | 0
 | Exponential discounting | 5.62 | 0.64 | 9.02 | γ: 0.23 | 81866.9 | 45.6
RW1 | Instantaneous learner | 6.30 | 0.69 | 8.46 | - | 110051.2 | 89.0
 | Bayesian learner | 6.29 | 0.72 | 8.68 | κ: 8.06 | 109962.2 | 0
 | Exponential discounting | 6.26 | 0.70 | 8.75 | γ: 0.33 | 109929.9 | −32.3
RW2 | Instantaneous learner | 6.36 | 0.72 | 10.79 | - | 62576.4 | 201.3
 | Bayesian learner | 6.49 | 0.78 | 10.9 | κ: 6.7 | 62375.2 | 0
 | Exponential discounting | 6.46 | 0.73 | 11.0 | γ: 0.25 | 62421.5 | 46.3
Sinusoid with intermittent jumps | Instantaneous learner | 6.38 | 0.65 | 8.19 | - | 83891.4 | 94.9
 | Bayesian learner | 6.45 | 0.68 | 8.26 | κ: 6.13 | 83796.5 | 0
 | Exponential discounting | 6.43 | 0.67 | 8.20 | γ: 0.24 | 83798.1 | 1.64
  1. Note: WAIC values were computed for each participant and summed across participants. A low WAIC indicates a better model. ΔWAIC is relative to the WAIC of the Bayesian learner.
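
For completeness, a standard WAIC computation from pointwise log-likelihoods is sketched below, assuming posterior samples of the model parameters are available; the authors' exact estimator may differ.

```python
import numpy as np
from scipy.special import logsumexp

def waic(log_lik):
    """WAIC from pointwise log-likelihoods for one participant.

    log_lik: (n_posterior_samples, n_trials) array of log p(response | params).
    Returns -2 * (lppd - p_waic), so lower is better; per the note above,
    participant-wise values are summed to give the table entries.
    """
    n_samples = log_lik.shape[0]
    lppd = np.sum(logsumexp(log_lik, axis=0) - np.log(n_samples))
    p_waic = np.sum(np.var(log_lik, axis=0, ddof=1))  # effective parameter count
    return -2 * (lppd - p_waic)
```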


Cite this article

Ulrik Beierholm, Tim Rohe, Ambra Ferrari, Oliver Stegle, Uta Noppeney (2020) Using the past to estimate sensory uncertainty. eLife 9:e54172. https://doi.org/10.7554/eLife.54172