The regime-shift detection task.

A. Trial sequence. In each trial, subjects saw a sequence of red and/or blue signals and were told that these signals were drawn from one of two regimes, a Red regime and a Blue regime. Both regimes were described as urns containing red and blue balls. The Red regime contained more red balls, while the Blue regime contained more blue balls. Each trial always started in the Red regime but could shift to the Blue regime in any of the 10 periods according to a transition probability (q). At the beginning of a trial, the transition probability (shown as “switch” probability in the illustration) and the signal diagnosticity (shown as “color ratio”) were revealed to the subjects. In this example, the transition probability is 0.1 and the signal diagnosticity is 1.5. See main text for more detailed descriptions. B. Manipulation of the system parameters, i.e., transition probability (q) and signal diagnosticity (d). We independently manipulated q (3 levels) and d (3 levels), resulting in a 3×3 factorial design. C. An example of a particular combination of the system parameters from the 3×3 design. Here the system that produces the signals has a transition probability of q = 0.01 and a signal diagnosticity of d = 1.5. Signals were presented to subjects sequentially. After each new signal appeared (a period), subjects provided a probability estimate (Pt) of a regime shift. D. Two example trial sequences. The example on the left shows a sequence of 10 periods of blue and red signals where d = 1.5 and q = 0.01. In this example, the regime never shifted. The example on the right shows a sequence of periods where d = 9 and q = 0.1. In this example, the regime shifted from the Red to the Blue regime in Period 3, so the signals shown from this period onward were drawn from the Blue regime. E. We performed three fMRI experiments (30 subjects in each experiment) to investigate the neural basis of regime-shift judgments.
Experiment 1 was the main experiment on regime-shift detection, which corresponds to P(Change) in the Venn diagram, while Experiments 2 and 3 were control experiments that ruled out additional confounds. In both Experiments 1 and 2, subjects had to estimate the probability that the signals came from the Blue regime. Unlike Experiment 1, however, no regime shift was possible in Experiment 2, which corresponds to P(Blue). In Experiment 3, subjects were simply asked to enter a number with a button-press setup identical to that of Experiments 1 and 2. Experiment 3 (Motor) therefore allowed us to rule out motor confounds.
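The generative process behind a trial can be sketched in a few lines. This is a minimal illustration and not the authors' task code. It assumes that a shift to the Blue regime is permanent within a trial, and that the “color ratio” d is the ratio of majority to minority balls, so the majority color is drawn with probability d / (1 + d). The function name simulate_trial is hypothetical.

```python
import random


def simulate_trial(q, d, n_periods=10, rng=None):
    """Simulate the regimes and signals of one trial.

    ASSUMPTIONS (not stated in the caption): a shift to Blue is permanent,
    and the majority color is drawn with probability d / (1 + d).
    """
    rng = rng or random.Random()
    p_major = d / (1.0 + d)
    shifted = False
    regimes, signals = [], []
    for _ in range(n_periods):
        # A shift may occur at the start of any period, before the signal is drawn.
        if not shifted and rng.random() < q:
            shifted = True
        regime = "blue" if shifted else "red"
        # Draw the regime's majority color with probability p_major.
        if rng.random() < p_major:
            signal = regime
        else:
            signal = "red" if regime == "blue" else "blue"
        regimes.append(regime)
        signals.append(signal)
    return regimes, signals


if __name__ == "__main__":
    regimes, signals = simulate_trial(q=0.1, d=9, rng=random.Random(0))
    print(regimes)
    print(signals)
```

Passing a seeded random.Random instance makes a simulated trial reproducible.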

Behavioral results for Experiment 1.

A. Illustrations of over- and underreactions. Left column: a stable environment (q = 0.01) with noisy signals (d = 1.5) and the 10 periods of red and blue signals a subject encountered. Right column: an unstable environment (q = 0.1) with precise signals (d = 9). Top row: a subject’s actual probability estimates (Pt, solid line) and the normative Bayesian posterior probability (Pt*, dashed line). Bottom row: belief revision shown by the subject (ΔPt = Pt − Pt−1, solid line) and the Bayesian belief revision (ΔPt*, dashed line). The orange bars represent ΔPt − ΔPt*, which we define as the Index of Overreaction (IO; vertical axis in orange on the right). B. Over- and underreactions to change. The mean IO (across all 30 subjects) is plotted as a function of transition probability and signal diagnosticity. Subjects overreacted to change if IO > 0 and underreacted if IO < 0. Error bars represent ±1 standard error of the mean. C. Parameter estimates of the system-neglect model. Left graph: weighting parameter (α) for transition probability. Right graph: weighting parameter (β) for signal diagnosticity. Dashed lines indicate parameter values equal to 1, which is required for Bayesian updating. D-F. Sensitivities to transition probability and signal diagnosticity are independent. D. Correlation between α and β estimates at different levels of transition probability (q1 to q3) and signal diagnosticity (d1 to d3). None of the pairwise Pearson correlation coefficients (shown as color-coded values in the table) was significantly different from 0 (p > .05). E. Pearson correlation coefficients of α estimates between different levels of transition probability. All pairwise correlations were significantly different from 0 (p < .05). F. Pearson correlation coefficients of β estimates between different levels of signal diagnosticity. All pairwise correlations were significantly different from 0 (p < .05).
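As a rough illustration of how the Index of Overreaction can be computed, the sketch below derives a Bayesian posterior of a regime shift by recursive filtering and subtracts the Bayesian belief revision from a subject's belief revision. It rests on assumptions the caption does not state: the shift is permanent once it occurs, it may occur before the signal of any period, and the majority color is drawn with probability d / (1 + d). All function names are hypothetical, and the subject estimates in the usage example are made up.

```python
def bayes_posteriors(signals, q, d):
    """Posterior probability that the regime has shifted, after each signal.

    signals: list of +1 (blue) or -1 (red).
    ASSUMPTIONS: the shift is permanent; the majority color has probability d / (1 + d).
    """
    p_major = d / (1.0 + d)
    p = 0.0  # every trial starts in the Red regime
    posteriors = []
    for s in signals:
        # Prior for this period: already shifted, or shifting now with probability q.
        prior = p + (1.0 - p) * q
        like_blue = p_major if s == 1 else 1.0 - p_major  # P(signal | Blue regime)
        like_red = 1.0 - like_blue                        # P(signal | Red regime)
        p = prior * like_blue / (prior * like_blue + (1.0 - prior) * like_red)
        posteriors.append(p)
    return posteriors


def belief_revisions(estimates):
    """Period-to-period change in the estimates, from the second period on."""
    return [estimates[t] - estimates[t - 1] for t in range(1, len(estimates))]


def index_of_overreaction(subject_estimates, bayes_estimates):
    """IO per period: subject's belief revision minus the Bayesian belief revision."""
    return [
        ds - db
        for ds, db in zip(belief_revisions(subject_estimates),
                          belief_revisions(bayes_estimates))
    ]


if __name__ == "__main__":
    signals = [-1, 1, 1, -1, 1]
    bayes = bayes_posteriors(signals, q=0.1, d=9)
    subject = [0.05, 0.40, 0.80, 0.60, 0.90]  # made-up estimates for illustration
    print(index_of_overreaction(subject, bayes))
```

Positive values mean the subject revised the estimate more than the Bayesian model did in that period, and negative values mean less.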

Neural representations for the updating of beliefs about regime shift.

A. An example. Updating is captured by the difference in probability estimates between two adjacent periods (ΔPt). The blue bars show the probability estimate in each period (Pt), while the yellow bars show ΔPt. B. Whole-brain results from the main experiment (Experiment 1) showing brain regions that significantly correlated with the updating of beliefs about change (ΔPt). The clusters in orange represent activity correlated with ΔPt. The blue clusters represent activity correlated with the regime-shift probability estimates (Pt). The magenta clusters represent the overlap between the Pt and ΔPt clusters. C-D. Comparison between experiments. To rule out visual and motor confounds on the Pt results described in B, we compared the Pt contrast between the main experiment (Experiment 1) and the two control experiments (Experiments 2 and 3). C. Whole-brain results on the effect of Pt between Experiments 1 and 2 (Experiment 1 – Experiment 2 on the negative Pt contrast). D. Whole-brain results on the effect of Pt between Experiments 1 and 3 (Experiment 1 – Experiment 3 on the negative Pt contrast).

A frontoparietal network represented key variables for regime-shift estimation.

A. Variable 1: strength of evidence in favor of or against regime shifts (strength of change evidence), as measured by the interaction between signal diagnosticity and signal. Left: two examples of the interaction between signal diagnosticity (d) and sensory signal (s), where a blue signal is coded as 1 and a red signal is coded as −1. The x-axis represents the time periods, from the first to the last period, in a trial. The y-axis represents the interaction, ln(d) × s. Right: whole-brain results showing brain regions in a frontoparietal network that significantly correlated with ln(d) × s. B. Variable 2: intertemporal prior probability of change. Two examples of the intertemporal prior are shown in the left graphs. To examine the effect of the intertemporal prior, we performed an independent region-of-interest analysis (leave-one-subject-out, LOSO) on the brain regions identified as representing strength of evidence (5A). Because of the LOSO procedure, individual subjects’ ROIs (each a cluster of contiguous voxels) differed slightly from one another. To visualize these differences, we used red to indicate voxels shared by all individual subjects’ ROIs and orange to indicate voxels included in at least one subject’s ROI. The ROI analysis examined the regression coefficients (mean PE) of the intertemporal prior. The * symbol indicates p < .05, ** indicates p < .01. dmPFC: dorsomedial prefrontal cortex; lIPS: left intraparietal sulcus; rIPS: right intraparietal sulcus; lIFG: left inferior frontal gyrus; rIFG: right inferior frontal gyrus.
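The two variables in this figure can be sketched as period-by-period regressors. The strength-of-evidence term follows the caption directly: ln(d) × s with s = 1 for blue and s = −1 for red. The intertemporal prior is shown here as the probability that a shift has occurred by period t before any signal is considered, 1 − (1 − q)^t. That form is an assumption for illustration, since the caption does not give the exact definition, and both function names are hypothetical.

```python
import math


def strength_of_evidence(signals, d):
    """ln(d) * s for each period, with s = +1 (blue) or -1 (red)."""
    return [math.log(d) * s for s in signals]


def intertemporal_prior(q, n_periods=10):
    """ASSUMED form: probability that a shift has occurred by period t,
    ignoring the signals, i.e. 1 - (1 - q) ** t."""
    return [1.0 - (1.0 - q) ** t for t in range(1, n_periods + 1)]


if __name__ == "__main__":
    signals = [-1, -1, 1, 1, -1, 1, 1, 1, -1, 1]
    print(strength_of_evidence(signals, d=9))
    print(intertemporal_prior(q=0.1))
```

Under this form the prior rises monotonically with time and rises faster for larger q, while the evidence term changes sign with the signal color and scales with diagnosticity.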

Estimating and comparing neural measures of sensitivity to system parameters with behavioral measures of sensitivity.

A. Behavioral measures of sensitivity to system parameters. For each system parameter, we plot the subjectively weighted system parameter against the system parameter level (top row: signal diagnosticity; bottom row: transition probability). For each subject and each system parameter, we estimated the slope (how the subjectively weighted system parameter changes as a function of the system parameter level) and used it as a behavioral measure of sensitivity to the system parameter (behavioral slope). We also show the slope of a Bayesian decision maker (no system neglect; dark green) and the slope of a decision maker who completely neglects the system parameter (light green; the slope would be 0). A subject with stronger neglect would have a behavioral slope closer to complete neglect. B. Comparison of behavioral and neural measures of sensitivity to the system parameters. To estimate neural sensitivity, for each subject and each system parameter, we regressed the neural activity of an ROI against the parameter level and used the slope estimate as a neural measure of sensitivity to that system parameter (neural slope). We also estimated the neural slope separately for blue-signal periods (when the subject saw a blue signal) and red-signal periods. We computed the Pearson correlation coefficient (r) between the behavioral slope and the neural slope and used it to test statistically whether the behavioral and neural slopes matched. C. The frontoparietal network selectively represented individuals’ sensitivity to signal diagnosticity (left two columns), but not transition probability (right two columns). Further, neural sensitivity to signal diagnosticity (neural slope) correlated with behavioral sensitivity (behavioral slope) only when a signal in favor of potential change (blue) appeared: all the regions except the right IPS showed a statistically significant match between the behavioral and neural slopes.
By contrast, sensitivity to transition probability was not represented in the frontoparietal network. D. The vmPFC selectively represented individuals’ sensitivity to transition probability (r = −0.38, p = 0.043 for red signals; r = −0.37, p = 0.047 for blue signals), but not signal diagnosticity (r = 0.28, p = 0.13 for red signals; r = 0.26, p = 0.17 for blue signals). The ventral striatum did not show selectivity to either transition probability or signal diagnosticity. Error bars represent ±1 standard error of the mean.
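The slope-matching analysis described in panels A and B can be sketched as two steps: an ordinary least-squares slope per subject, followed by a Pearson correlation across subjects. The sketch below uses made-up numbers and hypothetical function names. Coding the parameter levels as 1, 2, 3 is an assumption for illustration.

```python
import math


def ols_slope(x, y):
    """Least-squares slope of y regressed on x (with an intercept)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    levels = [1, 2, 3]  # ASSUMED coding of the three parameter levels
    # Made-up per-subject data: subjectively weighted parameter and ROI activity per level.
    behavior = [[0.9, 1.4, 2.1], [1.0, 1.1, 1.2], [0.8, 1.9, 3.1]]
    neural = [[0.1, 0.3, 0.4], [0.2, 0.2, 0.3], [0.0, 0.4, 0.9]]
    behavioral_slopes = [ols_slope(levels, b) for b in behavior]
    neural_slopes = [ols_slope(levels, n) for n in neural]
    print(pearson_r(behavioral_slopes, neural_slopes))
```

In this reading, a reliable correlation across subjects between the two sets of slopes is what would count as a match between behavioral and neural sensitivity.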

Model comparison.

A-E. Modeling results from five competing models. For each model, we plot subjects’ belief revision (ΔPt) and the model-estimated ΔPt. Light-colored dots and dashed lines represent the model-estimated ΔPt at the individual and group levels, respectively. Dark-colored dots and solid lines indicate individual subjects’ ΔPt and group-averaged behavioral data, respectively. Blue indicates data and model estimates at change-consistent signals; red indicates data and model estimates at change-inconsistent signals. A. Bayesian model. B. Original system-neglect model (SN-original). C. Signal-dependent β system-neglect model (SN-SigDep-β). D. Signal-dependent α system-neglect model (SN-SigDep-α). E. Signal-dependent α and β system-neglect model (SN-SigDep-αβ). F. Model comparison based on the Akaike Information Criterion (AIC). Lower AIC values indicate better model fits. The bars indicate the group mean AIC (averaged across all subjects), while the black dots indicate individual subjects’ AIC values. Error bars represent ±1 standard error of the mean. The * symbol indicates p < .05, ** indicates p < .01 (paired t-test; see Table S13 in SI for a summary of statistical tests).
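For reference, AIC is defined as 2k − 2 ln L, where k is the number of free parameters and L is the maximized likelihood. The sketch below computes it from a Gaussian log-likelihood of model residuals. The Gaussian error model, the parameter counts, and the residual values are assumptions for illustration only, since the caption does not specify the likelihood used.

```python
import math


def gaussian_log_likelihood(residuals, sigma):
    """Log-likelihood of residuals under zero-mean Gaussian noise with SD sigma."""
    n = len(residuals)
    sse = sum(r * r for r in residuals)
    return -0.5 * n * math.log(2.0 * math.pi * sigma ** 2) - sse / (2.0 * sigma ** 2)


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


if __name__ == "__main__":
    # Made-up residuals (data minus model prediction) for two hypothetical models.
    resid_simple = [0.10, -0.12, 0.08, -0.15, 0.11]
    resid_flexible = [0.05, -0.06, 0.04, -0.07, 0.05]
    sigma = 0.1  # ASSUMED noise SD
    print(aic(gaussian_log_likelihood(resid_simple, sigma), n_params=1))
    print(aic(gaussian_log_likelihood(resid_flexible, sigma), n_params=3))
```

The 2k term penalizes extra parameters, so a more flexible model obtains a lower AIC only if its gain in log-likelihood outweighs that penalty.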

Probability estimates from all subjects are plotted as histograms separately for each condition—a combination of transition probability and signal diagnosticity.

The blue bars represent the actual probability estimates, while the orange bars correspond to the probability estimates predicted by the Bayesian model.

Experiment 2: estimates of the weighting parameter for signal diagnosticity (β) in the system-neglect model.

Dashed lines indicate parameter value equal to 1.

Parameter recovery.

We simulated probability estimates according to the system-neglect model, using each subject’s parameter estimates as the parameter values for the simulation. Using the simulated data, we then estimated the parameters (α and β) of the system-neglect model. To examine parameter recovery, we plotted the parameter values used to simulate the data against the parameter estimates obtained from the simulated data and computed their Pearson correlation. Further, we added Gaussian white noise at different levels, with standard deviation 𝜎 = [0.01, 0.05, 0.1, 0.2, 0.3], to the simulated data and show the results in panels A through E, respectively. For each noise level, we show the parameter estimates in the left two graphs. In the right two graphs, we plot the parameter estimates based on simulated data against the parameter values used to simulate the data. A. Noise 𝜎 = 0.01. B. Noise 𝜎 = 0.05. C. Noise 𝜎 = 0.1. D. Noise 𝜎 = 0.2. E. Noise 𝜎 = 0.3. F. Empirically estimated noise (𝜎) of each subject. Each bar represents a subject’s estimated noise level.
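The logic of parameter recovery (simulate with known parameters, add noise, refit, compare) can be sketched as follows. This is not the authors' procedure. The functional form of the system-neglect model used here is an assumption: the Bayesian recursion with q replaced by α·q and d replaced by d^β. A permanent shift, clipping of noisy estimates to [0, 1], and grid-search least squares as the fitting method are also assumptions. All names are hypothetical.

```python
import random


def sn_posteriors(signals, q, d, alpha, beta):
    """Probability estimates under an ASSUMED system-neglect form:
    the Bayesian recursion with q -> alpha * q and d -> d ** beta."""
    q_subj = alpha * q
    p_major = d ** beta / (1.0 + d ** beta)
    p, out = 0.0, []
    for s in signals:  # s = +1 (blue) or -1 (red)
        prior = p + (1.0 - p) * q_subj
        like_blue = p_major if s == 1 else 1.0 - p_major
        p = prior * like_blue / (prior * like_blue + (1.0 - prior) * (1.0 - like_blue))
        out.append(p)
    return out


def simulate(trials, alpha, beta, sigma, rng):
    """Model estimates plus Gaussian noise, clipped to [0, 1] (clipping is an assumption)."""
    data = []
    for signals, q, d in trials:
        clean = sn_posteriors(signals, q, d, alpha, beta)
        data.append([min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in clean])
    return data


def fit(trials, data, grid):
    """Grid-search least-squares estimate of (alpha, beta)."""
    best, best_sse = None, float("inf")
    for a in grid:
        for b in grid:
            sse = 0.0
            for (signals, q, d), obs in zip(trials, data):
                pred = sn_posteriors(signals, q, d, a, b)
                sse += sum((o - p) ** 2 for o, p in zip(obs, pred))
            if sse < best_sse:
                best, best_sse = (a, b), sse
    return best


# Hypothetical trial set: random signal sequences for the q and d values named in the captions.
_rng = random.Random(1)
trials = [([_rng.choice([1, -1]) for _ in range(10)], q, d)
          for q in (0.01, 0.1) for d in (1.5, 9)]
grid = [i / 10 for i in range(1, 31)]

if __name__ == "__main__":
    for sigma in (0.01, 0.05, 0.1):
        data = simulate(trials, alpha=0.5, beta=1.5, sigma=sigma, rng=random.Random(0))
        print(sigma, fit(trials, data, grid))
```

Repeating this over many simulated subjects and correlating the true values with the recovered ones would give a recovery plot of the kind described in the caption.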

Impact of noise homoscedasticity on parameter estimation.

A. Empirically estimated residual standard deviation. The mean residual standard deviations (across subjects, black data points) in the five probability intervals, [0.0–0.2), [0.2–0.4), [0.4–0.6), [0.6–0.8), and [0.8–1.0], were 0.1015, 0.1296, 0.1987, 0.1929, and 0.2061, respectively. Error bars represent ±1 standard error of the mean. B. Parameter recovery results assuming heteroscedastic noise. We performed parameter recovery using the empirically estimated, probability-dependent residual variance shown in A (the mean residual standard deviation estimates). Conventions are the same as in Fig. S3.
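Probability-dependent noise of this kind could be applied to simulated estimates roughly as shown below. The five standard deviations are the values reported in panel A. The function names, the clipping of noisy estimates to [0, 1], and the use of the noise-free estimate to select the interval are assumptions for illustration.

```python
import random

# Upper edges of the first four intervals and the mean residual SDs reported in panel A.
UPPER_EDGES = [0.2, 0.4, 0.6, 0.8]
RESIDUAL_SDS = [0.1015, 0.1296, 0.1987, 0.1929, 0.2061]


def noise_sd(p):
    """Residual SD for the probability interval that contains p."""
    for edge, sd in zip(UPPER_EDGES, RESIDUAL_SDS):
        if p < edge:
            return sd
    return RESIDUAL_SDS[-1]  # interval [0.8, 1.0]


def add_heteroscedastic_noise(estimates, rng):
    """Add Gaussian noise whose SD depends on the noise-free estimate.
    Clipping to [0, 1] is an ASSUMPTION."""
    return [min(1.0, max(0.0, p + rng.gauss(0.0, noise_sd(p)))) for p in estimates]


if __name__ == "__main__":
    clean = [0.05, 0.25, 0.50, 0.70, 0.95]
    print(add_heteroscedastic_noise(clean, random.Random(0)))
```

The lookup compares against interval edges rather than computing a bin index, which keeps boundary values such as 0.2 in the interval that is closed on the left.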

Probability estimates from the actual and simulated data.

A. Histograms of subjects’ actual probability estimates collapsed across all conditions (left graph) and of simulated probability estimates under three different noise levels (Noise 𝜎 = 0.01, 0.05, 0.1). The simulation procedure is described in Supplementary Fig. S3. B. Subjects’ data plotted as histograms separately for each condition.

The system-neglect model describes subjects’ over- and underreactions to change well.

We fit the system-neglect model to each individual subject’s probability estimates and used the resulting parameter estimates to compute each subject’s probability estimates under the system-neglect model (Pt^SN). We then used Pt^SN to compute the Index of Overreaction (IO). Here, IO was computed by subtracting the belief revision predicted by the Bayesian model (ΔPt*) from the belief revision estimated by the system-neglect model (ΔPt^SN). Formally, IO = ΔPt^SN − ΔPt*. The mean IO (across all subjects; indicated by the bars) is plotted as a function of transition probability and signal diagnosticity. Solid symbols represent data points from individual subjects. Error bars represent ±1 standard error of the mean. The patterns of over- and underreactions here resembled those based on actual data (Fig. 2B), suggesting that the system-neglect model can describe subjects’ over- and underreactions well.

Whole-brain results on activity that significantly correlated with probability estimates of regime shift (Pt) in GLM-2.

In GLM-2, we implemented variables important for estimating regime shifts as parametric regressors in addition to Pt. In particular, GLM-2 included the intertemporal prior, which captures how the prior probability of change increases as a function of time. After controlling for these variables, we still observed Pt correlates in the vmPFC and ventral striatum. Color for significant activations: blue indicates negative correlation with Pt; yellow indicates positive correlation with Pt. Familywise error-corrected at p < .05 using Gaussian random field theory with a cluster-forming threshold z > 3.1.

Whole-brain results on activity that significantly correlated with the subjects’ log odds estimates of regime shift, ln(Pt⁄(1 - Pt)), where Pt represents subjects’ probability estimates.

In this analysis, we replaced the parametric regressor of Pt with the log odds of regime shifts in GLM-1. Color for significant activations: blue indicates negative correlation with ln(Pt⁄(1 - Pt)). Familywise error-corrected at p < .05 using Gaussian random field theory with a cluster-forming threshold z > 3.1.

Probability estimates of regime shifts (Pt) separately analyzed for change-consistent (blue) and change-inconsistent (red) signals.

Whole-brain results on activity that significantly correlated with probability estimates of regime shift for change-consistent and change-inconsistent signals. We observed that vmPFC correlated with Pt for both types of signals. Blue color showing significant activations indicates negative correlation with Pt. Familywise error-corrected at p < .05 using Gaussian random field theory with a cluster-forming threshold z > 3.1.

Independent region-of-interest (ROI) analysis in vmPFC and ventral striatum on probability estimates across the three experiments (Experiment 1 was the main experiment on regime-shift detection, Experiments 2 and 3 were control experiments).

The vmPFC ROI was based on Bartra et al. (2013). The ventral striatum mask was based on the nucleus accumbens mask from the Harvard-Oxford Cortical Structural Atlas. For each subject and each ROI, we extracted the mean parameter estimate (PE) of the probability estimates contrast in GLM-1. A. vmPFC ROI. Experiment 1: one-sample t test, t(29) = −3.82, p < 0.01; Experiment 2: one-sample t test, t(29) = 0.36, p = 0.71; Experiment 3: one-sample t test, t(29) = −1.11, p = 0.28; Experiment 1 − Experiment 2: two-sample t test, t(58) = −3.67, p < 0.01; Experiment 1 − Experiment 3: two-sample t test, t(58) = −3.12, p < 0.01. B. Ventral striatum ROI. Experiment 1: t(29) = −3.06, p < 0.01; Experiment 2: t(29) = 0.44, p = 0.67; Experiment 3: t(29) = −0.93, p = 0.36; Experiment 1 − Experiment 2: t(58) = −2.55, p = 0.01; Experiment 1 − Experiment 3: t(58) = −1.95, p = 0.06. The * symbol indicates p < 0.05 (two-tailed), and the ** symbol indicates p < 0.01 (two-tailed).

Key variables for regime-shift computations.

A. Whole-brain results on the intertemporal prior of regime shift. B. Using the intertemporal prior ROI (left graph: magenta indicates voxels shared by the LOSO ROI of all subjects; blue indicates voxels of LOSO ROI of at least one subject) to examine the regression coefficients of the strength of evidence in favor of change, ln (d) × signal. The mean parameter estimates (mean PE), i.e., regression coefficient, was not significantly different from 0 (one-sample t test, t(29) = 0.54, p = 0.59, two-tailed).

Experiment 1: Probability estimates (Pt) and belief revision (ΔPt) contrasts based on GLM-1.

Cluster-level inference using Gaussian random field theory (familywise error corrected at p < .05 with a cluster-forming threshold z > 3.1).

Experiment 1: Probability estimates (Pt) and belief revision (ΔPt) contrasts based on GLM-1.

Permutation tests based on threshold-free-cluster-enhancement (TFCE) statistic.

Experiment 1: Probability estimates (Pt) and belief revision (ΔPt) contrasts based on GLM-1.

Permutation tests based on cluster extent.

Experiment 3: Instructed number (𝐼𝑁t) and difference in instructed number (Δ𝐼𝑁t) contrasts based on GLM-1.

For Experiment 3, 𝐼𝑁t represents the two-digit number subjects were instructed to press at each period, and Δ𝐼𝑁t represents the difference in number between successive periods. 𝐼𝑁t is the control for Pt in Experiment 1, and Δ𝐼𝑁t is the control for ΔPt in Experiment 1. Cluster-level inference using Gaussian random field theory (familywise error corrected at p < .05 with a cluster-forming threshold z > 3.1).

Experiment 1 - Experiment 2 based on the probability estimates (Pt) contrast in GLM-1.

Cluster-level inference using Gaussian random field theory (familywise error corrected at p < .05 with a cluster-forming threshold z > 3.1).

Experiment 1 - Experiment 3 based on the probability estimates (Pt) contrast in GLM-1.

Cluster-level inference using Gaussian random field theory (familywise error corrected at p < .05 with a cluster-forming threshold z > 3.1).

Experiment 1: GLM-2.

Cluster-level inference using Gaussian random field theory (familywise error corrected at p < .05 with a cluster-forming threshold z > 3.1).

Experiment 1: GLM-2.

Permutation tests based on the threshold-free-cluster-enhancement (TFCE) statistic.

Experiment 1: GLM-2.

Permutation tests based on cluster extent.

Experiment 1: GLM-1 without the action-handedness regressor.

Cluster-level inference using Gaussian random field theory (familywise error corrected at p < .05 with a cluster-forming threshold z > 3.1).

Experiment 1: GLM-2 without Pt and ΔPt regressors.

Cluster-level inference using Gaussian random field theory (familywise error corrected at p < .05 with a cluster-forming threshold z > 3.1).

Model-fitting summary.

Model comparison using paired t-test with Bonferroni correction.