Figures and data

Data set #1: Cue-target 2AFC task and results.
(A) Events during a single trial. While pupil dilation was recorded, participants predicted the orientation (left/right) of the upcoming target (Gabor patch) based on the visual and/or auditory cues (in the data analyzed here, only the visual cue had predictive validity). Predictions were given by a button press with the corresponding finger on the left or right hand. Two mapping conditions (condition 1 or condition 2) were counterbalanced across participants such that a participant in condition 1 was shown the square cue followed by a left-oriented target on 80% of the trials, while the square cue was followed by a right-oriented target on 20% of the trials. Gray box indicates the feedback event of interest. (B) Accuracy (fraction of correct responses) as a function of cue-target frequency. Data points are individual participants; stats, paired-samples Wilcoxon signed-rank test. (C) RT as a function of both cue-target frequency and accuracy (error/correct); stats, repeated-measures ANOVA. (D) Feedback-locked pupil response time course, plotted as a function of cue-target frequency and accuracy. Shading represents the standard error of the mean across participants. Light gray boxes, time windows of interest; early time window, [0.75, 1.25]; late time window, [2.5, 3.0]. The black horizontal bar indicates a significant interaction term (cluster-corrected, permutation test). (E) Early time window, average feedback-locked pupil response as a function of cue-target frequency and accuracy; stats, repeated-measures ANOVA. (F) As E, for the late time window. ANOVA results (multiple panels): top, main effect of frequency; middle, main effect of accuracy; bottom, frequency x accuracy interaction. Error bars, standard error of the mean across participants. *p < 0.05, **p < 0.01, ***p < 0.001.

Data set #2: Letter-color 2AFC task and results.
(A) Left, an independent learning phase was administered in the form of an odd-ball detection task during which six letters together with six shades of green as background (squares) were presented in three frequency conditions (33%, 50%, and 84%) on most trials (91%). Participants had to quickly respond to odd-ball targets (numbers and/or non-green color, 9% of trials). Letter-color mapping conditions were randomized per participant. Right, events during a single trial of the subsequent letter-color decision 2AFC task. While pupil dilation was recorded, participants indicated whether the letter “matched” the colored square with a button press. A match was correct when the letter and color had occurred most often together in the preceding odd-ball task. Gray box indicates the feedback event of interest. (B) Accuracy (fraction of correct responses) as a function of letter-color frequency. Dashed line represents chance level. Data points are individual participants; stats, paired-samples t-test. (C) RT as a function of both letter-color frequency and accuracy; stats, repeated-measures ANOVA. (D) Feedback-locked pupil response time course, plotted as a function of letter-color frequency and accuracy. Shading represents the standard error of the mean across participants. Dark gray box, duration of the auditory feedback stimulus (0.3 s). Light gray boxes, time windows of interest; early time window, [0.75, 1.25]; late time window, [2.5, 3.0]. The purple horizontal bar indicates a significant two-way interaction effect (uncorrected for multiple comparisons). No significant time points remained after correction using the false discovery rate (FDR). (E) Early time window, average feedback-locked pupil response as a function of letter-color frequency and accuracy. (F) As E, for the late time window. ANOVA results (multiple panels): top, main effect of frequency; middle, main effect of accuracy; bottom, frequency x accuracy interaction. Error bars, standard error of the mean across participants. *p < 0.05, **p < 0.01, ***p < 0.001.

Results of the three-way repeated-measures ANOVA on the feedback-locked pupil response in the cue-target 2AFC task (data set #1).
The three-way repeated-measures ANOVA included factors: time window (levels: early vs. late), frequency (levels: 20% vs. 80%) and accuracy (levels: error vs. correct).

Results of the three-way repeated-measures ANOVA on the feedback-locked pupil response in the letter-color 2AFC task (data set #2).
The three-way repeated-measures ANOVA included factors: time window (levels: early vs. late), frequency (levels: 33%, 50%, and 84%) and accuracy (levels: error vs. correct). Greenhouse-Geisser statistics are reported when assumptions of sphericity were violated.

Correlations between the feedback-locked pupil response time course and the information-theoretic variables.
Left column, results for the cue-target 2AFC task. Right column, results for the letter-color 2AFC task. (A) The information gain, surprise, and entropy parameters are shown as a function of task trial. Model parameter units are in bits. (B) The mean information gain, surprise, and entropy parameters are shown as a function of frequency condition. (C) Average trial-by-trial correlations at the group level between the ideal learner model parameters (information gain, surprise, and entropy) and the feedback-locked pupil response at each time point. (D) Average trial-by-trial correlations at the group level between the information gain parameter and the feedback-locked pupil response, separately for the error and correct trials. (E) As D, for the surprise parameter. (F) As D, for the entropy parameter. (G-L) As A-F for the letter-color 2AFC task. (C-L) Shading represents the standard error of the mean across participants. Light gray boxes, time windows of interest; early time window, [0.75, 1.25]; late time window, [2.5, 3.0]. The colored horizontal bars indicate time periods of significant correlation coefficients tested against zero for each model parameter or condition of interest (cluster-corrected, permutation test). The black horizontal bar indicates a difference between conditions (cluster-corrected, permutation test).
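
To make the three quantities in panels A and B concrete, the following is a minimal sketch of how per-trial surprise, entropy, and information gain can be computed in bits. It is an illustration under stated assumptions, not the model used to produce the figure: the function name `ideal_learner`, the Dirichlet pseudo-count updating, the default uniform prior, and the choice to measure information gain as the Kullback-Leibler divergence between the predictive distributions after and before each outcome are all assumptions that this section does not specify.

```python
import numpy as np

def ideal_learner(outcomes, n_categories, prior_counts=None):
    """Return per-trial surprise, entropy, and information gain (all in bits).

    Hypothetical sketch: a counting learner with Dirichlet pseudo-counts.
    """
    # Assumption: a uniform prior (one pseudo-count per category) unless given.
    if prior_counts is None:
        counts = np.ones(n_categories, dtype=float)
    else:
        counts = np.array(prior_counts, dtype=float)

    surprise, entropy, info_gain = [], [], []
    for x in outcomes:
        p_before = counts / counts.sum()  # prediction before the outcome
        # Shannon surprise of the outcome that actually occurred.
        surprise.append(-np.log2(p_before[x]))
        # Entropy of the prediction, i.e. uncertainty before the outcome.
        entropy.append(-np.sum(p_before * np.log2(p_before)))
        counts[x] += 1.0                  # update the counts with the outcome
        p_after = counts / counts.sum()   # prediction after the outcome
        # Information gain as KL(after || before); an assumed definition.
        info_gain.append(np.sum(p_after * np.log2(p_after / p_before)))
    return np.array(surprise), np.array(entropy), np.array(info_gain)

# Toy sequence of outcome indices (0 or 1), not real data.
surprise, entropy, info_gain = ideal_learner([0, 0, 1, 0], n_categories=2)
```

Using base-2 logarithms is what places all three quantities in bits. Passing non-uniform `prior_counts` is one way to express prior knowledge carried into the task.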

Linear mixed model results for the cue-target orientation 2AFC task.
Explanation of abbreviations, rows: I, Shannon surprise predictor variable; H, entropy; DKL, information gain; Baseline, pre-feedback baseline pupil dilation; RT, reaction times. Columns: 95% CI, the 95% credible interval of the median posterior distribution; pd, the probability (in percentage) of direction; ESS, effective sample size; *, indicates strong evidence that the parameter has a positive/negative effect on the post-feedback pupil response.

Linear mixed model results for the letter-color 2AFC task.
Explanation of abbreviations, rows: I, Shannon surprise predictor variable; H, entropy; DKL, information gain; Baseline, pre-feedback baseline pupil dilation; RT, reaction times. Columns: 95% CI, the 95% credible interval of the median posterior distribution; pd, the probability (in percentage) of direction; ESS, effective sample size; *, indicates strong evidence that the parameter has a positive/negative effect on the post-feedback pupil response.
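
As a reading aid for the pd column in both tables, the sketch below computes the probability of direction from posterior draws of a single coefficient, using the common definition of pd as the larger of the posterior probabilities that the parameter is positive or negative. The function name `probability_of_direction` and the toy draws are hypothetical, and the software used to obtain the reported values is not stated in this section.

```python
import numpy as np

def probability_of_direction(samples):
    """Return pd, in percent, for posterior draws of one parameter."""
    samples = np.asarray(samples, dtype=float)
    p_positive = np.mean(samples > 0)  # share of draws above zero
    p_negative = np.mean(samples < 0)  # share of draws below zero
    # pd is the share of the posterior on the dominant side of zero.
    return 100.0 * max(p_positive, p_negative)

# Toy draws, not real results: three of the four draws are positive.
pd_percent = probability_of_direction([1.0, 2.0, 3.0, -1.0])
```

Under this definition, pd ranges from 50% (posterior centered on zero) to 100% (all draws on one side of zero).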

Control tasks for data set #2: letter-color 2AFC task.
Left column, results from the control task for colors. Right column, results from the control task for feedback tones. (A) Mean tone-locked pupil response across all trials. (B) Feedback-locked pupil response time course plotted as a function of color used in the main letter-color 2AFC task (hexadecimal codes are given in the legend). (C) As A, for the control task for feedback tones. (D) Pupil response time courses plotted as a function of feedback tone used for error and correct trials in the main letter-color 2AFC task. All panels, dark gray boxes indicate the duration of the stimuli; 0.7 s, for the colors (group average); 0.3 s, auditory stimulus for the tones. Light gray boxes indicate time windows of interest; early time window, [0.75, 1.25]; late time window, [2.5, 3.0]. Shading represents the standard error of the mean across participants. The black and green horizontal bars indicate a significant effect of interest (cluster-corrected, permutation test).

Sanity checks on pupil pre-processing for (A) the cue-target 2AFC task and (B) the letter-color 2AFC task.
All plots, feedback-locked pupil response time course plotted as a function of cue-target frequency and accuracy for different pre-processing stages. Shading represents the standard error of the mean across participants. Light gray boxes, time windows of interest; early time window, [0.75, 1.25]; late time window, [2.5, 3.0]. The black horizontal bar indicates a significant interaction term (cluster-corrected, permutation test). Top row, the raw and band-pass filtered pupil signal before interpolation. Second row, the interpolated and band-pass filtered pupil signal but without the nuisance regression. Third row, the fully pre-processed pupil (as in the main results) for the conservative analysis in which only trials containing at least 60% of original (non-interpolated) data were included. Bottom row, the nuisance predictors based on blink and saccade events estimated by deconvolution.

Main effects of frequency and accuracy in the feedback-locked pupil time courses.
Left column, results from the cue-target 2AFC task. Right column, results from the letter-color 2AFC task. (A) Mean feedback-locked pupil response across all trials. (B) Feedback-locked pupil response time course plotted as a function of accuracy. (C) Feedback-locked pupil response time course plotted as a function of stimulus-pair frequency. (D, E, F) As A, B, C for the letter-color 2AFC task. Gray boxes, time windows of interest; early time window, [0.75, 1.25]; late time window, [2.5, 3.0]. Shading represents the standard error of the mean across participants. The black horizontal bars indicate a significant effect of interest (panels A-E, cluster-corrected using permutation tests; panel F, a one-way repeated-measures ANOVA was conducted at each time point and corrected for multiple comparisons with the false discovery rate).
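
For readers unfamiliar with false discovery rate correction across time points, as used for panel F, here is a minimal sketch of the Benjamini-Hochberg step-up procedure applied to a vector of per-time-point p-values. The caption does not name the FDR procedure that was used, so the choice of Benjamini-Hochberg, the function name `fdr_bh`, and the `alpha` default are assumptions made for illustration.

```python
import numpy as np

def fdr_bh(p_values, alpha=0.05):
    """Return a boolean mask of p-values rejected under Benjamini-Hochberg."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)                          # indices sorting p ascending
    ranked = p[order]
    thresholds = alpha * np.arange(1, n + 1) / n   # alpha * rank / n
    below = ranked <= thresholds
    reject = np.zeros(n, dtype=bool)
    if below.any():
        # Largest rank whose p-value meets its threshold; reject up to it.
        k = np.nonzero(below)[0].max()
        reject[order[:k + 1]] = True
    return reject

# Toy p-values for four time points, not real results.
significant = fdr_bh([0.001, 0.04, 0.03, 0.5])
```

The returned mask is in the original order of the input, so it can be used directly to mark significant time points along a time axis.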

Individual differences analysis between accuracy and pupil responses.
Top row, cue-target 2AFC task. Bottom row, letter-color 2AFC task. Left column, early time window. Right column, late time window. The average feedback-locked pupil response frequency difference (80-20% and 84-33% frequency conditions for the cue-target and letter-color 2AFC tasks, respectively) is plotted against the frequency difference in accuracy. Data points, individual participants.

Correlations between the feedback-locked pupil response time course and the information-theoretic variables using a uniform prior distribution in the letter-color 2AFC task.
(A) The information gain, surprise, and entropy parameters are shown as a function of task trial. Model parameter units are in bits. (B) The mean information gain, surprise, and entropy parameters are shown as a function of frequency condition. (C) Average trial-by-trial correlations at the group level between the ideal learner model parameters (information gain, surprise, and entropy) and the feedback-locked pupil response at each time point. (D) Average trial-by-trial correlations at the group level between the information gain parameter and the feedback-locked pupil response, separately for the error and correct trials. (E) As D, for the surprise parameter. (F) As D, for the entropy parameter. (C-F) Shading represents the standard error of the mean across participants. Gray boxes, time windows of interest; early time window, [0.75, 1.25]; late time window, [2.5, 3.0]. The colored horizontal bars indicate time periods of significant correlation coefficients tested against zero for each model parameter or condition of interest (cluster-corrected, permutation test). No differences between error and correct trials were obtained (cluster-corrected, permutation test).

Linear mixed model comparisons for the cue-target 2AFC task.
