Separable neural signatures of confidence during perceptual decisions

  1. Tarryn Balsdon (corresponding author)
  2. Pascal Mamassian
  3. Valentin Wyart
  1. Laboratoire des Systèmes Perceptifs (CNRS UMR 8248), DEC, ENS, PSL University, France
  2. Laboratoire de Neurosciences Cognitives et Computationnelles (Inserm U960), DEC, ENS, PSL University, France
13 figures, 2 tables and 1 additional file



(a) Stimulus presentation: stimuli were presented at an average rate of 3 Hz, but with variable onset and offset (vs: 83, 133 ms; vss + ves−1: 216 ms; see Materials and methods). Stimuli were presented within a circular annulus which acted as a colour guide for the category distributions. The colour guide and the fixation point were present throughout the trial. (b) Free task: on each trial observers were presented with a sequence of oriented Gabors, which continued until the observer entered their response (or 40 samples were shown). 100 sequences were predefined and repeated three times. (c) Replay task: The observer was presented with a specific number of samples and could only enter their response after the cue (fixation changing to red). The number of samples (x) was determined relative to the number the observer chose to respond to on that same sequence in the Free task (p). There were three intermixed conditions, Less (x = pmin − 2; where pmin is the minimum p of the three repeats), Same (x = pmed; where pmed is the median p) and More (x = pmax + 4; where pmax is the maximum p of the three repeats of that predefined sequence). (d) Categories were defined by circular Gaussian distributions over the orientations, with means -45° (ψ1, blue) and 45° (ψ2, orange), and concentration κ=0.5. The distributions overlapped such that an orientation of 45° was most likely drawn from the orange distribution but could also be drawn from the blue distribution with lower likelihood. (e) The optimal observer accumulates the difference in the evidence for each category, which is defined as the log probability of the sample orientation (θ) given the distributions. The perceptual decision is determined by the sign of the accumulated evidence, where the evidence accumulated across more samples better differentiates the true categories (example evidence traces are coloured by the true category).
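The optimal observer described in panels (d) and (e) can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the category means and concentration come from the caption, while the doubling of the angle (so that the density repeats every 180°) and all function names are assumptions made for the example.

```python
import numpy as np

KAPPA = 0.5                        # concentration, from the caption
MU_BLUE, MU_ORANGE = -45.0, 45.0   # category means in degrees, from the caption


def log_density(theta_deg, mu_deg, kappa=KAPPA):
    """Unnormalised log density of a circular Gaussian over orientation.

    Assumption: the angle is doubled so the density has a period of 180 deg.
    The normalising constant is dropped because it cancels in the difference.
    """
    delta = np.deg2rad(np.asarray(theta_deg, dtype=float) - mu_deg)
    return kappa * np.cos(2.0 * delta)


def sample_evidence(theta_deg):
    """Per-sample log-likelihood difference, positive when favouring orange."""
    return log_density(theta_deg, MU_ORANGE) - log_density(theta_deg, MU_BLUE)


def accumulate(theta_deg):
    """Running sum of the per-sample evidence across a sequence."""
    return np.cumsum(sample_evidence(theta_deg))


def decide(theta_deg):
    """Category chosen from the sign of the accumulated evidence."""
    return "orange" if accumulate(theta_deg)[-1] > 0 else "blue"
```

Under these assumptions a 45° sample contributes the largest possible evidence for orange, a 0° sample contributes none, and longer sequences tend to move the running sum further from zero.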

Behaviour and computational modelling.

(a) Proportion correct in each condition of the Replay task, relative to the Free task (orange horizontal lines). Individual data are shown as scattered points; error bars show 95% between- (thin) and 95% within- (thick) subject confidence intervals. Open red markers show the model prediction. (b) Distributions of the number of samples per trial in the Free task, and Replay task conditions (over all observers). (c) Difference in log-likelihood of the models utilising a covert bound relative to the models with no covert bound. On the left, the model fitting perceptual decisions only. The middle bar shows the difference in log-likelihood of the fit to confidence ratings with identical perceptual and confidence bounds. The right bar shows the difference in log-likelihood of the fit to confidence ratings of the model with an independent bound for confidence evidence accumulation. Error bars show 95% between-subject confidence intervals. (d) The computational architecture of perceptual and confidence decisions, based on model comparison. Perceptual and confidence decisions accumulate the same noisy perceptual evidence, but confidence is affected by additional noise (εc) and a separate temporal bias (αc). This partial dissociation allows confidence evidence accumulation to continue after the observer has committed to a perceptual decision. (e) Predicted proportion correct compared to actual proportion correct for each observer, based on the fitted model parameters of the final computational model. The left panel shows proportion correct split by condition, and the right, split by confidence rating. (f) Regression coefficients from the GLM analysis showing the relationship between the optimal evidence L, and observers’ perceptual (top) and confidence (bottom) responses for trials split by condition. The right set of bars show the same analysis but with evidence accumulated up to four samples from the response cue.
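The GLM in panel (f) relates the optimal evidence L to binary responses. A minimal sketch of such a regression is given below, assuming a logistic link fitted by Newton's method; the caption does not state the link function, and the simulated data, the generating slope of 0.8 and the function name are purely illustrative.

```python
import numpy as np


def fit_logistic(evidence, responses, n_iter=25):
    """Logistic regression of binary responses on evidence.

    Returns an array [intercept, slope] estimated with Newton's method.
    """
    x = np.column_stack([np.ones(len(evidence)), np.asarray(evidence, float)])
    y = np.asarray(responses, dtype=float)
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(x @ beta)))    # predicted response probability
        weights = p * (1.0 - p)
        gradient = x.T @ (y - p)
        hessian = x.T @ (x * weights[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta


# Hypothetical data: responses generated from evidence with a known slope.
rng = np.random.default_rng(0)
evidence = rng.normal(0.0, 2.0, size=2000)
p_choose = 1.0 / (1.0 + np.exp(-0.8 * evidence))
responses = rng.random(2000) < p_choose
intercept, slope = fit_logistic(evidence, responses)
```

A larger slope indicates responses that follow the optimal evidence more closely, which is the quantity compared across conditions in the panel.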

EEG signatures of premature perceptual decisions.

(a) Classifier AUC training at each time-point in the Free task and testing across time in the Less (top), Same (middle), and More (bottom) conditions of the Replay task. Black contours encircle regions where the mean is 3.1 standard deviations from chance (0.5; 99% confidence). (b) Difference in AUC between the More and Less conditions. Cluster-corrected significant differences are highlighted. (c) The relationship between the evidence accumulated up to n samples prior to the response cue and the strength of the neural signature of response execution in each condition. Error bars show 95% within- (thick) and between-subject (thin) confidence intervals.
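The train-on-one-task, test-across-time scheme in panel (a) can be illustrated as follows. This is only a sketch: the classifier here is a simple difference of class means, which is an assumption and not necessarily the classifier used in the study, and the simulated data (200 trials, 4 electrodes, 5 time points) are hypothetical.

```python
import numpy as np


def auc(scores, labels):
    """Area under the ROC curve from all positive/negative score pairs."""
    scores, labels = np.asarray(scores, float), np.asarray(labels)
    diff = scores[labels == 1][:, None] - scores[labels == 0][None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))


def temporal_generalisation(train_x, train_y, test_x, test_y):
    """AUC for every (training time, testing time) pair.

    Arrays are shaped (trials, electrodes, time points).
    """
    n_times = train_x.shape[2]
    scores = np.empty((n_times, n_times))
    for t_train in range(n_times):
        x = train_x[:, :, t_train]
        # Assumed classifier: project onto the difference of class means.
        weights = x[train_y == 1].mean(axis=0) - x[train_y == 0].mean(axis=0)
        for t_test in range(n_times):
            scores[t_train, t_test] = auc(test_x[:, :, t_test] @ weights, test_y)
    return scores


def simulate(rng, n_trials=200):
    """Hypothetical data with a response signal on electrode 0 at time index 2."""
    y = rng.integers(0, 2, size=n_trials)
    x = rng.normal(size=(n_trials, 4, 5))
    x[:, 0, 2] += 3.0 * y
    return x, y


rng = np.random.default_rng(0)
free_x, free_y = simulate(rng)       # stands in for the Free task
replay_x, replay_y = simulate(rng)   # stands in for one Replay condition
auc_map = temporal_generalisation(free_x, free_y, replay_x, replay_y)
```

In this toy example the map is high only where training and testing both fall on the time point that carries the signal, which is the kind of structure the contours in the figure summarise.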

Representation of decision variables.

(a) Representation precision (Fisher-transformed correlation coefficient, z) of stimulus orientation (blue, left), momentary decision update (green, middle), and accumulated decision evidence (purple, right). The encoded variables are shown in the insets (the accumulated evidence is the cumulative sum of the momentary evidence signed by the response; only one example sequence is shown). Shaded regions show 95% between-subject confidence intervals. (b) Relative electrode representation precision over three characteristic time windows (100–200 ms, left; 400–600 ms, middle; and 600–800 ms, right). (c) Representation precision for epochs leading to optimal and suboptimal perceptual (T1) and confidence (T2) responses. Lighter lines show perceptual decisions that match the optimal response, dashed lines show suboptimal confidence ratings. Dashed red horizontal lines show significant interactions between perceptual and confidence suboptimality. The light red horizontal line shows the significant effect of suboptimal perception and the dark red horizontal line shows the significant effect of suboptimal confidence. Shaded regions show 95% within-subject confidence intervals. (d) Difference in decoding precision between the More and the Less conditions for epochs corresponding to the last four samples of the trial. The purple horizontal line shows the significant difference in decoding of accumulated evidence.
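Representation precision, as used in this figure, is a Fisher-transformed correlation between an encoded variable and its EEG-based estimate. The sketch below shows one way such a value could be computed at a single time point. The ridge penalty, the interleaved cross-validation folds and the simulated data are assumptions for illustration only.

```python
import numpy as np


def fisher_z(r):
    """Fisher transform of a correlation coefficient."""
    return np.arctanh(r)


def representation_precision(eeg, encoded, n_folds=5, ridge=1.0):
    """Cross-validated precision of a linear read-out at one time point.

    eeg: (epochs, electrodes); encoded: (epochs,) variable to recover.
    """
    eeg, encoded = np.asarray(eeg, float), np.asarray(encoded, float)
    folds = np.arange(len(encoded)) % n_folds   # assumed fold assignment
    decoded = np.empty(len(encoded))
    penalty = ridge * np.eye(eeg.shape[1])
    for k in range(n_folds):
        train, test = folds != k, folds == k
        x = eeg[train]
        weights = np.linalg.solve(x.T @ x + penalty, x.T @ encoded[train])
        decoded[test] = eeg[test] @ weights
    r = np.corrcoef(decoded, encoded)[0, 1]
    return fisher_z(r)


# Hypothetical epochs: three electrodes carry the encoded variable.
rng = np.random.default_rng(0)
encoded = rng.normal(size=300)
pattern = np.array([1.0, -1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0])
eeg = np.outer(encoded, pattern) + rng.normal(size=(300, 8))
z_signal = representation_precision(eeg, encoded)
z_noise = representation_precision(rng.normal(size=(300, 8)), encoded)
```

Repeating the computation at each time point of the epoch would give a time course comparable in form to the traces in panel (a).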

Clusters of behaviourally relevant representations and their sources.

(a) Log likelihood ratio (LLR) of the data given the hypothesis that decoding precision varies with behavioural suboptimalities, against the null hypothesis that decoding precision varies only with measurement noise. Perceptual (Type-I) behaviour is shown on top and confidence (Type-II) behaviour is shown on the bottom. Clusters where the log posterior odds ratio outweighed the prior are circled; only the bold area of the perceptual cluster was further analysed. Time series (left) show the maximum LLR of electrodes laterally, with frontal polar electrodes at the top descending to occipital electrodes at the bottom. Scalp maps (right) show the summed LLR over the indicated time windows. (b) Left: representation precision (z) training and testing on signals within the clusters. Colours correspond to the circles in (a), with the dark green bar showing the combined decoding precision of the anterior and posterior confidence clusters, and the black bar showing the combined representation precision of all clusters. Right: Representation precision of the last four samples in the Less and the More conditions for the combined confidence representation and the perceptual representation. Error bars show 95% within-subject confidence intervals. (c) ROIs (defined by Mindboggle coordinates; Klein et al., 2017): lateral occipital cortex (blue); superior parietal cortex (green); orbitofrontal cortex (orange); and rostral middle frontal cortex (red). (d) ROI time series for Noise Max (black) and Noise Min (coloured) epochs, taking the average rectified normalised current density (z) across participants. Shaded regions show 95% within-subject confidence intervals, red horizontal lines indicate cluster-corrected significant differences. Standardised within-subject differences are traced above the x-axis, with the shaded region marking z = 0 to z = 1.96 (95% confidence). (e) Standardised regression weight (t-statistic) of the GLM comparing observers’ confidence ratings to those predicted from the activity localised to the orbitofrontal cortex. The shaded region shows the 95% between-subject confidence interval, and the red horizontal line marks the time-window showing cluster-corrected significant differences from 0.

Appendix 1—figure 1
Schematic of possible relationships between perceptual (Type-I) and confidence (Type-II) evidence accumulation.

(a) Same evidence accumulation processes: Type-I (perceptual) and Type-II (confidence) decisions are different responses to the same evidence: each sample of perceptual evidence is disrupted by a sample of sensory noise (εs) drawn from a zero-mean Gaussian with standard deviation σ, and accumulated with a temporal bias described by αs. (b) Parallel processing: Type-I and Type-II decisions rely on entirely separate processing of the same physical stimulus: the confidence decision also incurs noise and temporal integration bias (with subscript c), but these may vary independently of the perceptual processing suboptimalities (subscript s). (c) Partial dissociation: Type-I and Type-II decisions rely on partially dissociable accumulation of the same evidence.
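The three schematics differ only in which noise samples and temporal biases the confidence accumulator shares with the perceptual accumulator. The sketch below makes that contrast explicit. The exact form of the temporal bias is not given in this caption, so the multiplicative weighting used here is an assumption, as are the function names and architecture labels.

```python
import numpy as np


def accumulate(samples, alpha):
    """Accumulate evidence with a temporal bias.

    Assumption: the bias weights the running total, L_n = alpha * L_(n-1) + x_n,
    so alpha = 1 corresponds to unbiased accumulation.
    """
    total, trace = 0.0, []
    for x in samples:
        total = alpha * total + x
        trace.append(total)
    return np.array(trace)


def simulate(llr, architecture, sigma_s, alpha_s, sigma_c, alpha_c, rng):
    """Return (perceptual, confidence) evidence traces for one trial."""
    llr = np.asarray(llr, dtype=float)
    noisy = llr + rng.normal(0.0, sigma_s, llr.shape)        # sensory noise
    perceptual = accumulate(noisy, alpha_s)
    if architecture == "same":          # (a) one accumulator, two responses
        confidence = perceptual.copy()
    elif architecture == "parallel":    # (b) independent noise and bias
        independent = llr + rng.normal(0.0, sigma_c, llr.shape)
        confidence = accumulate(independent, alpha_c)
    elif architecture == "partial":     # (c) shared noisy evidence, extra noise
        extra = noisy + rng.normal(0.0, sigma_c, llr.shape)
        confidence = accumulate(extra, alpha_c)
    else:
        raise ValueError(f"unknown architecture: {architecture}")
    return perceptual, confidence
```

With no additional confidence noise and matching temporal biases, the partial architecture reduces to the same-process architecture, which is one way to see it as the intermediate case.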

Appendix 2—figure 1
Model simulation of accumulated evidence for perceptual and confidence decisions.

(a) Example trial from one observer showing simulated evidence traces agreeing with the observer’s response (blue) and a sample of example traces which did not agree (red). The perceptual decision is shown on the left. An evidence trace was taken to agree with the observer’s decision if the corresponding bound was reached prior to the opposing bound, or if no bound was reached but the final accumulated evidence was in favour of the chosen option. The median evidence trace (thick blue line) was calculated assuming the evidence that reached the bound early was maintained until the response was entered. For the confidence rating (right) we compared the median evidence from traces where the final accumulator (plus one additional sample of noise) agreed with the observer’s confidence rating. We examined the difference from the ideal accumulated evidence (thick green line) relative to the likelihood of the observers’ rating given all simulated evidence traces. (b) Median final simulated accumulated evidence for the perceptual decision (abscissa), and the confidence decision (ordinate) for all trials of the example observer; colours indicate the condition. (c) Correlation (Fisher transformed z) between perceptual and confidence evidence for each observer. The example observer is highlighted in orange.
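The agreement rule for perceptual traces in panel (a) can be written out directly. The following is a minimal sketch, assuming symmetric bounds at plus and minus a given value and responses coded as +1 or -1; the function names and the example traces are hypothetical.

```python
import numpy as np


def first_crossing(trace, bound):
    """Index of the first sample where |trace| reaches the bound, else None."""
    hits = np.flatnonzero(np.abs(np.asarray(trace, float)) >= bound)
    return int(hits[0]) if hits.size else None


def agrees(trace, bound, response):
    """True if the trace supports the response (+1 or -1).

    The bound that is reached first decides; if no bound is reached,
    the sign of the final accumulated evidence decides.
    """
    idx = first_crossing(trace, bound)
    decisive = trace[-1] if idx is None else trace[idx]
    return bool(np.sign(decisive) == response)


def hold_after_bound(trace, bound):
    """Keep the evidence at its bound-crossing value until the trial ends."""
    held = np.array(trace, dtype=float)
    idx = first_crossing(held, bound)
    if idx is not None:
        held[idx:] = held[idx]
    return held


def median_agreeing_trace(traces, bound, response):
    """Median over the held traces that agree with the observer's response."""
    kept = [hold_after_bound(t, bound) for t in traces if agrees(t, bound, response)]
    return np.median(kept, axis=0) if kept else None


# Hypothetical simulated traces for one trial with a bound at 1.0.
traces = [[0.5, 1.2, -3.0], [0.2, -0.3, 0.4], [-0.4, -1.5, -1.0]]
median_trace = median_agreeing_trace(traces, bound=1.0, response=+1)
```

The first example trace counts as agreeing with a positive response even though it ends negative, because the positive bound was reached first and the evidence is held from that point.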

Appendix 3—figure 1
Confidence behaviour.

(a) Proportion correct (in the perceptual decision) by confidence rating. (b) Decision evidence (based on the presented samples) by confidence rating. (c) Number of samples presented by confidence rating. In all plots, error bars show 95% within-subject confidence intervals. Red circles show the predictions of the best fitting confidence model (Appendix 1). (d) Confidence responses of two observers (top and bottom panels) on all trials sorted by the confidence evidence of the optimal observer. The median confidence evidence (shown by a black vertical line) defines an optimal confidence observer who gives a high rating whenever the confidence evidence exceeds this median. Observers’ high confidence ratings are shown in blue and low confidence ratings in green. Suboptimal confidence ratings, where human and optimal confidence observers do not match, are indicated with small vertical segments (green for Type-II misses and blue for Type-II false alarms). Negative confidence evidence corresponds to incorrect perceptual decisions. The observer shown on top clearly has fewer suboptimal responses compared with the observer below, and the frequency of suboptimal responses decreases further from the median. (e) Model estimated confidence error by confidence rating suboptimality (0 = the observer’s confidence rating was the same as the optimal observer, 1 = suboptimal confidence rating). (f) The effect of response bias on the analysis of suboptimal confidence in the EEG representation of accumulated evidence. Observers’ confidence ratings were compared to an unbiased optimal observer (purple), and two biased (but otherwise optimal) observers, who respond with high confidence on 35% and 65% of trials (making the human observers relatively more liberal and conservative with their response strategy in comparison). Thick lines show the within-subject difference in precision (Fisher transformed correlation) between trials where the human observers’ confidence ratings correspond to the (un/biased) optimal observer and suboptimal confidence ratings. Shaded regions show the 95% between-subject confidence intervals on the difference.
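The comparison with an optimal confidence observer in panels (d) and (f) amounts to a criterion placed on the sorted confidence evidence. A minimal sketch is given below, assuming the biased observers of panel (f) can be modelled by moving the criterion to the matching quantile; the function names and the example values are hypothetical.

```python
import numpy as np


def optimal_high_confidence(conf_evidence, high_fraction=0.5):
    """Ratings of an optimal observer who rates `high_fraction` of trials high.

    With high_fraction = 0.5 the criterion is the median confidence evidence.
    """
    evidence = np.asarray(conf_evidence, dtype=float)
    criterion = np.quantile(evidence, 1.0 - high_fraction)
    return evidence > criterion


def classify_ratings(conf_evidence, rated_high, high_fraction=0.5):
    """Flag Type-II misses and false alarms relative to the optimal observer."""
    optimal = optimal_high_confidence(conf_evidence, high_fraction)
    rated_high = np.asarray(rated_high, dtype=bool)
    return {
        "miss": optimal & ~rated_high,          # low rating where optimal is high
        "false_alarm": ~optimal & rated_high,   # high rating where optimal is low
        "suboptimal": optimal != rated_high,
    }


# Hypothetical trials: confidence evidence and the observer's ratings.
evidence = [-1.0, 0.5, 1.0, 2.0, 3.0, 4.0]
ratings = [False, True, False, True, False, True]
flags = classify_ratings(evidence, ratings)
```

Changing `high_fraction` to 0.35 or 0.65 yields the conservative and liberal reference observers against which the same ratings can be re-scored.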

Appendix 4—figure 1
Amplitude modulations with task variables.

(a) The lateralised readiness potential by condition (top), perceptual decision accuracy (middle) and reported confidence (bottom). Horizontal red lines mark significant differences in amplitude. (b) Central Parietal Positivity, with the same comparisons. Shaded regions show 95% within-subject confidence intervals, and the region of slope comparison for the CPP is highlighted in grey.

Appendix 5—figure 1
Response Classification analysis.

(a) Classifier AUC training and testing at each time point (abscissa) based on the power (dB) in each frequency band (ordinate). Clusters where average performance is greater than 3.1 standard deviations (99% confidence) from baseline (0.5) are circled in black. (b) Scalp map of the difference in power for right- compared to left-handed responses averaged over 8 to 32 Hz and −0.5 to 0.5 s around the response. (c) Classifier performance (AUC) training and testing at each time point, in each condition of the Replay task and in the Free task.

Appendix 6—figure 1
Encoding variable regression.

(a) Encoded variables used to regress EEG signals. The encoded orientation (Cθ, left) and encoded momentary decision update (Cl, middle) were dependent on the orientation presented to the observer. The encoded accumulated evidence (Cz) varied over all presented orientations in a trial; the figure on the right shows only one example. (b) Representation precision of encoding variables using different low-pass filters. (c) Cross correlation between encoding variables over consecutive samples. (d) Temporal generalisation of representations: the regression weights were calculated on EEG signals at each time point and precision was tested across time. Colour scales are relative to the maximal precision, with zero precision in white and negative in grey (a sign flip of the regression weights). (e) Representation precision of the accumulated evidence for the first (left) and last (right) four stimuli of the Less and More conditions. Shaded error bars show the 95% within-subject confidence intervals, red horizontal bars mark cluster-corrected significant differences between conditions. (f) Representation precision of the previous (n-1), current (n) and future (n+1) accumulated evidence, based on the EEG signals locked to the current epoch. (g) Representation precision of the momentary decision update (top) and the accumulated evidence (bottom) for epochs separated by the timing of the subsequent stimulus, shown in coloured bars (317 ms, red, left; 333 ms, green, middle; and 350 ms, blue, right).

Appendix 8—figure 1
Estimating inference error.

(a) Two approaches to estimate inference error. It is assumed the observer’s behaviour is based on a suboptimal inference over the physical stimulus. We do not have access to the single-sample inference error, but can estimate it using the measured variables: the physical stimulus properties, the behaviour, and the EEG signals. Two approaches are outlined: The EEG inference error estimate, which relies on the error of the representation of the accumulated evidence, in clusters where the precision of the representation is related to suboptimal behaviour; and the model error, which relies on simulating the processing of the evidence based on the fitted model parameters, and taking the median of simulated traces which concur with the observer’s response. (b) Correlation between variables measured from behaviour, the stimulus input, and the estimated inference error. (c) Effect size on the difference between Noise Min and Noise Max epochs.

Appendix 9—figure 1
Regions of interest and corresponding current density.

(a) Average rectified normalised current density in Noise Min epochs for the corresponding time windows, filtered above the half-maximum amplitude. (b) Regions of interest based on Mindboggle coordinates. (c) Average normalised rectified current density in the right (top) and left (bottom) hemispheres. Noise Min epochs are shown coloured, Noise Max in black, with shaded regions showing the 95% within-subject confidence interval.


Appendix 1—table 1
Average parameter values.

The table shows the average parameter values and the sum of BIC across participants. The large difference in the average log-likelihood (LLH) across tasks arises because the Free task model was fit to both when and what observers responded, whereas in the Replay task only the response was fit. Red values show the fixed parameters.

                                                                                  LLH      BIC
Free task
c = 0                      0.80  0.92  0.00  452  0.54  0.…                       −736.86  30387.02
a = 0.1                    0.77  0.92  0.03  417  0.52  0.10  5.78  2.20   0.016  −735.48  30331.75
b = 5.5                    0.78  0.94  0.02  410  0.64  0.13  5.50  1.79   0.013  −742.18  30599.67
l = 0.001                  0.82  0.98  0.01  400  0.48  0.10  4.77  2.22   0.001  −730.66  30139.17
c = 0; l = 0.001           0.79  0.94  0.00  397  0.51  0.10  4.52  2.26   0.001  −732.66  30104.74
c = 0; l = 0.001; a = 0.1  0.77  0.94  0.00  403  0.52  0.10  5.37  2.13   0.001  −742.42  30381.13
Replay task - no bound
c = 0                      0.48  0.90  0.00  ~    ~     ~     ~     ~      0.009  −82.73   3651.38
l = 0.001                  0.50  0.91  0.06  ~    ~     ~     ~     ~      0.001  −82.05   3624.39
c = 0; l = 0.001           0.51  0.90  0.00  ~    ~     ~     ~     ~      0.001  −83.64   3573.67
Replay task - bound
c = 0; l = 0.001           0.48  0.88  0.00  ~    ~     0.13  8.58  15.55  0.001  −82.22   3859.24
c = 0; l = 0.001; a = 0.1  0.48  0.88  0.00  ~    ~     0.10  8.91  15.88  0.001  −82.38   3751.55
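The BIC totals in Appendix 1—table 1 combine each participant's log-likelihood with a penalty for the number of free parameters. As a reminder of how the two quantities relate, the sketch below uses the standard BIC definition; whether the authors applied exactly this form is an assumption, and the participant values in the example are invented for illustration.

```python
import math


def bic(log_likelihood, n_free_params, n_observations):
    """Standard Bayesian information criterion: k * ln(n) - 2 * LLH."""
    return n_free_params * math.log(n_observations) - 2.0 * log_likelihood


def summed_bic(per_participant):
    """Sum BIC over (log_likelihood, n_free_params, n_observations) tuples."""
    return sum(bic(*entry) for entry in per_participant)


# Hypothetical participants, not values from the table.
example_total = summed_bic([(-100.0, 3, 50), (-120.0, 3, 50)])
```

Because the penalty grows with the number of free parameters, fixing a parameter (as in the rows labelled with fixed values) can lower the summed BIC even when the log-likelihood is slightly worse.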
Appendix 1—table 2
Average parameter values for perceptual and confidence behaviour.

Bound parameters with subscript c describe the criteria for confidence ratings, which take the same form as the perceptual decision bound. They have the same minimum and scale, but different rates of decline, such that λc1 determines the upper bound on a confidence rating of 1, and the lower bound on a rating of 2. Apart from the ‘Serial’ and ‘Serial continued’ models, parameters for perceptual decisions were fixed to those fit in the winning perceptual decision model and the listed parameters affect only the confidence evidence accumulation.

Serial continued              0.67  0.91  0.13  9.60  17.98  0.06  0.53  0.66  7.08  30.41  −424.88  18135.83
Partial - same sigma          0.00  0.89  ~     ~     ~      0.06  0.47  1.03  6.77  25.92  −446.41  18540.68
Partial - accumulation noise  0.45  0.91  ~     ~     ~      0.03  0.58  0.50  7.71  31.03  −421.59  17662.25
Partial - read-out noise      0.12  0.90  ~     ~     ~      0.02  0.52  1.85  8.63  37.39  −417.94  17516.29
Partial - same alpha          0.12  0.88  ~     ~     ~      0.02  0.52  0.98  8.22  35.16  −423.02  17605.29

  1. Tarryn Balsdon
  2. Pascal Mamassian
  3. Valentin Wyart
Separable neural signatures of confidence during perceptual decisions
eLife 10:e68491.