Pupil diameter tracks perceptual change.

A) Example trial showing the continuous change from a stable image (plane) into a shark; Lower: the probability of detecting a switch (Δ) as a function of Image – most switches occur around the mid-point, but not exclusively so, leading to our prediction of heightened locus coeruleus activity at the switch point; B) Representation of the locus coeruleus (red), its diffuse projections to the whole brain network and its link to pupil dilation. C) Group-average evoked pupil diameter response, time-locked to the perceptual change (dark line, t = 0). Significance is shown in the top grey bar (pFDR < 0.05), indicating that the two images around the perceptual change (Δ) differ from the null model. The averages of the first two and last two images are shown in the left and right sections of the plot, respectively (dotted line). We observed an increase of the pupillary response that peaked after the perceptual change. D) Group average of evoked pupillary responses to image switches – red represents the fastest response, when the switch occurs at image 6; green indicates a medium response, with the switch at image 8; and blue denotes the slowest response, with the switch at image 10.

A recurrent neural network model of perceptual switching.

A) We trained a continuous-time E/I recurrent neural network (RNN) to categorise linearly changing inputs representing two discrete categories (e.g., output z1 and output z2). B) Softmax of network outputs on an example trial with γ = .6; the dotted line shows the timing of the perceptual switch. C) Following training, the firing rates of the excitatory units were clearly separated into two stimulus-selective clusters - those that responded maximally to u1 (blue) and those that responded maximally to u2 (orange). Inhibitory units demonstrated a similar modular clustering but were sorted by the selectivity of the excitatory units they inhibited. D) Dynamics of gain on an example trial with γ = .6, which peaks close to the perceptual switch (inset shows similarity to pupil diameter). E) Simplified network structure implied by the selectivity analysis. Excitatory units (blue) form two stimulus-selective modules. Each excitatory cluster is inhibited by a cluster of inhibitory units and a third non-selective inhibitory population. Pipettes show lesion targets. F) Switch time as a function of γ magnitude (i.e., magnitude of uncertainty forcing). The lower black line shows a speeding effect of heightened γ (and therefore heightened gain at the perceptual switch). Teal lines show switch time for lesions to the inhibitory population targeting the initially dominant population (dark teal, upper), and lesions to the inhibitory population selective for the stimulus the input is morphing into (light teal, middle).
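As an illustration of how gain enters such a model, the following is a minimal sketch of a rate RNN whose tanh activation has an adjustable slope, driven by an input that morphs linearly from u1 to u2. It is not the trained network from this figure: the weights are random and untrained, there is no E/I (Dale's law) constraint, gain is held fixed within a trial rather than driven by γ, and all function names, the Euler discretisation, and the values of tau and dt are assumptions made for illustration.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D vector of network outputs."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def morphing_input(n_steps):
    """Two-channel input that changes linearly from [1, 0] to [0, 1]."""
    s = np.linspace(0.0, 1.0, n_steps)
    return np.column_stack([1.0 - s, s])

def simulate_rnn(W, W_in, W_out, inputs, gain=1.0, tau=10.0, dt=1.0):
    """Euler-integrate tau * dx/dt = -x + W r + W_in u, with r = tanh(gain * x).

    `gain` is the slope of the tanh activation (assumed constant here).
    Returns the softmax of the two read-out units at every time step.
    """
    x = np.zeros(W.shape[0])
    outputs = np.zeros((inputs.shape[0], W_out.shape[0]))
    for t, u in enumerate(inputs):
        r = np.tanh(gain * x)
        x = x + (dt / tau) * (-x + W @ r + W_in @ u)
        outputs[t] = softmax(W_out @ np.tanh(gain * x))
    return outputs

# Hypothetical, untrained weights for a 100-unit network (illustration only).
rng = np.random.default_rng(0)
n_units = 100
W = rng.normal(0.0, 1.0 / np.sqrt(n_units), (n_units, n_units))
W_in = rng.normal(0.0, 1.0, (n_units, 2))
W_out = rng.normal(0.0, 1.0 / np.sqrt(n_units), (2, n_units))

u = morphing_input(200)
p_low_gain = simulate_rnn(W, W_in, W_out, u, gain=1.1)
p_high_gain = simulate_rnn(W, W_in, W_out, u, gain=1.5)
```

With trained weights, the switch time on a trial could be read off as the first step at which the second softmax output exceeds the first; with the random weights above, the outputs only demonstrate the mechanics.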

Analysis of RNN dynamical regime.

A) Contour map of convergence time across the full gain by Δinput parameter space averaged across 100 initialisations with random initial conditions. Example parameter trajectories shown in white for high and low γ trials. B) Contour map of convergence proportion across the full parameter space. C-E) Example dynamics with gain = 1.1 and Δinput ≈ [1, 0], [.5, .5], and [0, 1] respectively. F-H) Example dynamics with gain = 1.5 and Δinput ≈ [1, 0], [.5, .5], and [0, 1] respectively.

Allocentric and egocentric energy landscape dynamics underlying the perceptual speeding effect of heightened gain.

A) Example network trajectory projected onto PC1 and averaged across trials for low (0.1; solid blue), medium (0.5; dotted green), and high (0.9; solid red) γ for the u1u2 condition. B) (abs) Velocity of PC1 trajectories across low (0.1), medium (0.5), and high (0.9) γ. C-D) Allocentric landscapes for low (0.1; blue) and high (0.9; red) γ conditions. Trial averaged PC1 trajectory shown in black. For purposes of visualisation energy values > 6 are set to a constant value. E-F) Egocentric landscapes for low (0.1; blue) and high (0.9; red) γ conditions. G) (Allocentric) neural work for low (0.1), medium (0.5), and high (0.9) γ, averaged across networks and conditions. H) Egocentric AUC for low (0.1), medium (0.5), and high (0.9) γ, averaged across networks and conditions.

Low-dimensional switch-related dynamics and connectivity.

A) Spatial loadings of PC1 (green), PC2 (red) and PC3 (blue); B) Mean absolute β loading (solid lines) and group standard error (shaded) of PC1 (green), PC2 (red) and PC3 (blue), organized around the image switch point (Δ) - the dotted grey lines show the 95th percentile of the null distribution of a block-resampling permutation; C) Radar plot showing the partial correlations of PC1 (green), PC2 (red) and PC3 (blue); D) Evoked brain activity of PC2 + PC3 during the perceptual switch. E) Group-averaged functional connectivity and module assignments using a Louvain analysis - three clusters were observed. F) Pearson’s correlation between the sum of PC2 and PC3 (per subject) and a joint-histogram comparing Integration (participation coefficient) and Segregation (module-degree Z-score); p < 0.05 following permutation testing.
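For readers unfamiliar with the two graph measures named in panel F, the sketch below computes them from a connectivity matrix and a module assignment using the standard definitions: participation coefficient P_i = 1 - Σ_s (k_is / k_i)² and within-module degree z-score z_i = (k_i(s_i) - mean) / SD over nodes of the same module. The function names and the toy graph are hypothetical, and the sketch assumes a symmetric, non-negative matrix; it is not the pipeline used for this figure.

```python
import numpy as np

def participation_coefficient(A, modules):
    """P_i = 1 - sum_s (k_is / k_i)^2, where k_is is node i's strength to module s."""
    A = np.asarray(A, dtype=float)
    modules = np.asarray(modules)
    k = A.sum(axis=1)                      # total strength of each node
    P = np.ones(len(k))
    for m in np.unique(modules):
        k_m = A[:, modules == m].sum(axis=1)
        frac = np.divide(k_m, k, out=np.zeros_like(k), where=k > 0)
        P -= frac ** 2
    P[k == 0] = 0.0                        # isolated nodes get P = 0 by convention
    return P

def module_degree_zscore(A, modules):
    """z-score of each node's within-module strength, relative to its own module."""
    A = np.asarray(A, dtype=float)
    modules = np.asarray(modules)
    z = np.zeros(A.shape[0])
    for m in np.unique(modules):
        idx = modules == m
        k_within = A[np.ix_(idx, idx)].sum(axis=1)
        sd = k_within.std()
        if sd > 0:                         # leave z = 0 when the module is uniform
            z[idx] = (k_within - k_within.mean()) / sd
    return z

# Toy 5-node graph with edges 0-1, 0-2, 0-3, 3-4 and two modules.
A_toy = np.zeros((5, 5))
for i, j in [(0, 1), (0, 2), (0, 3), (3, 4)]:
    A_toy[i, j] = A_toy[j, i] = 1.0
modules_toy = np.array([0, 0, 0, 1, 1])

P_toy = participation_coefficient(A_toy, modules_toy)
z_toy = module_degree_zscore(A_toy, modules_toy)
```

Higher participation indicates connections spread across modules (integration), whereas a high module-degree z-score indicates a node that is strongly connected within its own module (segregation).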

Confirmation of model predictions in whole-brain BOLD data.

A) Analysis of the RNN also predicted that the energy landscape dictating the likelihood of state transitions should be flat (i.e., have a small attractor depth) at the switch point; B) the energy landscape was demonstrably flatter (quantified as surprisal over brain activity displacement) at the switch point; C) by interrogating the low-dimensional trajectories in the RNN, we predicted that there should be a peak in the gradient of the loadings in principal component space at the switch point between output #1 and output #2; D) the gradient (Δ×PC) of the β loading of PC2 as a function of the switch point.

Overall Analysis Flow.

Top Row (orange) - pupil diameter was collected in a cohort of 35 individuals while they performed the Ambiguous Figures task. We observed a large peak in pupil dilation at the perceptual change point, which led us to make the prediction that there should be an increase in inter-regional gain at the switch point. Middle Row (blue) - we trained a 100-node RNN to perform a similar classification task in the presence of shifting perceptual ambiguity, and then tested the network at different levels of gain (i.e., the slope of the tanh activation function). We observed early switches with heightened gain, as well as altered attractor dynamics that caused a flattening of the energy landscape characterising state switches. Bottom Row (green) - we tested the predictions of the RNN using BOLD data from 17 subjects performing the same task. After filtering the BOLD data through a principal component analysis (in which we retained the top 5 principal components; PC1–5), we observed an increase in the gradient of PC loading around the switch point using an FIR model, as well as a flattening of the energy landscape, thus confirming our original predictions.

Difference in mean firing rates between stimulus selective excitatory clusters.

To examine the effect of manipulating gain on the operation of the network (A), we averaged over the firing rates of the excitatory neurons in each stimulus-selective cluster (rc1, rc2) and looked for the point at which rc2 > rc1 (and vice versa). In line with expectations, the speeding and slowing effects of gain on network output time were straightforwardly reflected in the mean firing rates. B) Difference in mean firing rates for high (red), intermediate (green), and low (blue) gain, for gain manipulations targeting both excitatory and inhibitory neurons. Notice that the switch from rc1 > rc2 to rc2 > rc1 occurs earlier for high gain and later for low gain. C) Manipulating excitatory gain in isolation led to slower switches in mean firing rates for low gain, but high gain did not speed switches. D) Manipulating inhibitory gain in isolation led to slower switches in mean firing rates for low gain and speeded switches under high gain.
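The crossing-point measure described in this caption can be sketched as follows: average the firing rates within each stimulus-selective cluster and take the first time step at which rc2 exceeds rc1. The function name, the array layout (time × units), and the synthetic rates are assumptions for illustration only.

```python
import numpy as np

def switch_time(rates, cluster1, cluster2):
    """First time index at which mean rate of cluster 2 exceeds that of cluster 1.

    rates    : array of shape (n_steps, n_units)
    cluster1 : indices of units selective for the first stimulus
    cluster2 : indices of units selective for the second stimulus
    Returns None if rc2 never exceeds rc1.
    """
    rc1 = rates[:, cluster1].mean(axis=1)
    rc2 = rates[:, cluster2].mean(axis=1)
    crossed = np.flatnonzero(rc2 > rc1)
    return int(crossed[0]) if crossed.size else None

# Synthetic example: cluster 1 ramps down from 10 to 0, cluster 2 ramps up from 0 to 10.
t = np.arange(11, dtype=float)
rates_demo = np.column_stack([10.0 - t, 10.0 - t, t, t])
t_switch = switch_time(rates_demo, cluster1=[0, 1], cluster2=[2, 3])
```

Comparing this index across gain conditions gives the earlier or later switches summarised in panels B-D.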

Relationship between Attractor Depth and Energy Landscape.

We simulated a simple model (the normal form of a pitchfork bifurcation; the middle row shows the corresponding potential) and set the α term to two different values: on the left (red), α = 0.25, which corresponds to relatively shallow attractors; on the right (blue), α = 0.75, which corresponds to relatively deep attractors. We simulated the model in the presence of light noise, and then calculated the energy landscape of the timeseries (see methods in main paper) (bottom row; z-axis). As can be seen by comparing the middle and bottom rows, deeper attractors relate to higher energy barriers.
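A minimal sketch of this kind of simulation is given below, assuming the supercritical normal form dx/dt = αx - x³ with additive Gaussian noise, integrated with the Euler-Maruyama scheme. Under that assumption the potential is V(x) = -αx²/2 + x⁴/4, with minima at x = ±√α and a barrier of height α²/4 at x = 0, so α = 0.75 gives a deeper well than α = 0.25. The noise level, step size, and function names are illustrative choices, and the energy-landscape estimation itself (described in the main paper) is not reproduced here.

```python
import numpy as np

def potential(x, alpha):
    """Potential V(x) whose negative gradient is alpha * x - x**3."""
    return -0.5 * alpha * x ** 2 + 0.25 * x ** 4

def barrier_height(alpha):
    """Depth of each well relative to the barrier at x = 0: V(0) - V(sqrt(alpha))."""
    return alpha ** 2 / 4.0

def simulate_pitchfork(alpha, sigma=0.1, dt=0.01, n_steps=10000, x0=0.0, seed=0):
    """Euler-Maruyama integration of dx = (alpha * x - x**3) dt + sigma dW."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps)
    x[0] = x0
    for t in range(1, n_steps):
        drift = alpha * x[t - 1] - x[t - 1] ** 3
        x[t] = x[t - 1] + drift * dt + sigma * np.sqrt(dt) * rng.standard_normal()
    return x

# The two alpha values used in the figure; noise level is an assumed placeholder.
x_shallow = simulate_pitchfork(alpha=0.25)
x_deep = simulate_pitchfork(alpha=0.75)
```

Because the barrier scales as α², tripling α from 0.25 to 0.75 raises the barrier ninefold under this form, which is consistent with the caption's point that deeper attractors correspond to higher energy barriers.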

PC2 evoked displacement and surprise around the perceptual change.

A) Pearson’s correlation between each PC and the evoked brain activity at the perceptual switch (β values); dashed line at PC2. B) Pearson’s correlation between the inverted brain maps using βPC (PC(i-i) × βPC(i-i)). The dashed line shows that the correlation reaches 0.94 using the first 3 PCs (Pearson’s r = 0.94, p < 0.001). C) Mean absolute β loading (red) and group standard error (shaded red). D) Mean surprise, calculated as -log(1 - p-values) for each regressor. The dotted black line marks the perceptual switch point.
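The surprise measure in panel D can be written out as a short sketch. It applies -log(1 - p) element-wise, as stated in the caption, and then averages across subjects for each regressor. The array layout (subjects × regressors), the function name, and the example values are assumptions; the natural logarithm is also assumed, since the caption does not specify the base.

```python
import numpy as np

def mean_surprise(p_values):
    """Mean of -log(1 - p) across subjects (rows) for each regressor (columns).

    Uses log1p(-p) for numerical accuracy when p is close to 0.
    Note that p = 1 would give infinite surprise.
    """
    p = np.asarray(p_values, dtype=float)
    return (-np.log1p(-p)).mean(axis=0)

# Hypothetical p-values for 2 subjects and 3 regressors.
p_demo = np.array([
    [0.0, 1.0 - np.exp(-1.0), 0.5],
    [0.0, 1.0 - np.exp(-1.0), 0.5],
])
surprise_demo = mean_surprise(p_demo)
```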