Abstract
Perceptual updating has been hypothesized to rely on a network reset modulated by bursts of ascending neuromodulatory neurotransmitters, such as noradrenaline, abruptly altering the brain’s susceptibility to changing sensory activity. To test this hypothesis at a large scale, we analysed an ambiguous figures task using pupillometry and functional magnetic resonance imaging (fMRI). Behaviourally, qualitative shifts in the perceptual interpretation of an ambiguous image were associated with peaks in pupil diameter, an indirect readout of phasic bursts in neuromodulatory tone. We further hypothesized that stimulus ambiguity drives neuromodulatory tone, leading to heightened neural gain and hastening perceptual switches. To explore this hypothesis computationally, we trained a recurrent neural network (RNN) on an analogous perceptual categorisation task, allowing gain to change dynamically with classification uncertainty. As predicted, higher gain accelerated perceptual switching by transiently destabilizing the network’s dynamical regime in periods of maximal uncertainty. We leveraged a low-dimensional readout of the RNN dynamics to develop two novel macroscale predictions: perceptual switches should occur with peaks in low-dimensional brain state velocity and with a flattened egocentric energy landscape. Using fMRI, we confirmed these predictions, highlighting the role of the neuromodulatory system in the large-scale network reconfigurations mediating adaptive perceptual updates.
Introduction
The overwhelming majority of neurons in our brains have only indirect interactions with the external world. This means that the identity of sensory inputs is inherently ambiguous1–5. The equivocal nature of perceptual input is overcome by incorporating prior information about the causal structure of the world into sensory inferences. This is clearly evidenced in laboratory experiments that present participants with sensory inputs that offer two equally valid yet mutually exclusive perceptual interpretations (e.g. the Necker cube illusion and binocular rivalry): in these ambiguous scenarios, observers periodically switch between mutually exclusive percepts6–8.
Outside of conditions of extreme perceptual ambiguity, perceptual awareness is remarkably stable, suggesting that the nervous system can rapidly (and flexibly) identify the best ‘match’ between visual data and a stable (likely known) stimulus category6,9. Importantly, this process of combining ambiguous sensory input with prior information must be dynamic: adaptive behaviour requires that the relative reliability of prior information and current sensory input is made suitably context dependent10–13. In ecological settings, the problem is even more pronounced: not only does the reliability of the sensory input vary, the urgency of perceptual decision making also changes between contexts14–16.
Neuroimaging studies investigating perceptual updating and switches have typically identified a distributed set of regions within the cerebral cortex17,18. These cortical regions are presumed to play a role in attentional shifts driving switches in perceptual contents by selectively boosting activity within the relevant circuits18,19. This interpretation is complemented by behavioural evidence showing that attention plays a prominent role in determining the contents of perception in bistable perception tasks where competition is not resolved at low levels of the visual hierarchy21,22. Similarly, computational models of perceptual decision making typically consist of winner-take-all competition between cortical populations23–25. Yet, the ability to flexibly respond to ambiguous visual inputs according to changing task demands is a feature that is present across phylogeny26 and hence is present in a wide variety of animals that have poorly developed cerebral cortices27. Indeed, phasic changes in the highly conserved ascending arousal system have been linked to moment-by-moment adaptive updates in the relative weighting of prior information, sensory input, and the urgency of the perceptual decision process through neuromodulatory-mediated alterations in neural gain28–31.
The ascending neuromodulatory system, and specifically the noradrenergic locus coeruleus (LC), is well-suited to modulate the large-scale, brain state switches required to flexibly alter perceptual contents32,33. While the cell bodies of the LC are located in the brainstem, the nucleus sends projections throughout the central nervous system, wherein its axons release noradrenaline, which in turn modulates the excitability of targeted regions11. In previous work, it has been argued that the phasic release of noradrenaline from the LC acts as a “network reset” signal, which effectively disrupts ongoing processing, and hence allows animals to reconfigure their ongoing neural dynamics towards more salient (and hopefully, behaviourally-relevant) processes34–36. This mechanism is of critical importance in ecological contexts in which an animal needs to be able to both focus on the current task in an exploitative mode (such as foraging), while being able to rapidly modify its internal, attentional and behavioural state when required (e.g., if resources are depleted, or in the presence of a predator).
Preliminary evidence in the context of bistable perception has shown that when a stimulus is task-relevant, pupil diameter (a non-specific and indirect readout of phasic LC activity37–40 and neuromodulatory tone) is tightly linked to switches in the content of perception31,41,42. In line with this, recent modelling has shown that linking perceptual updates to fluctuations in neuromodulatory tone recapitulates the phasic-tonic firing rate pattern known to characterise LC spiking dynamics and improves performance in reinforcement learning tasks28. Thus, whilst the LC could plausibly mediate perceptual switches in a task-relevant setting, we still need a more robust test of this hypothesis.
Based on previous work34–36,43,44, and the projections of the LC to many of the regions implicated in whole-brain imaging studies of perceptual uncertainty38,45,46, we hypothesized that task-related perceptual switches can be modulated by phasic bursts of LC activity, which act as a ‘network reset’36, flattening the whole-brain energy landscape47 and thus allowing cortical dynamics to evolve into a new state thereby changing the contents of perception.
To test this hypothesis, we leveraged a cognitive task designed to investigate switches in perceptual categorisation48. We observed that pupil diameter peaked at the point of perceptual switches and predicted their timing. We then trained a recurrent neural network (RNN) to perform an analogous change detection task. Based on previous modelling and theory, we allowed the gain of the activation function (an established mechanism for the action of noradrenaline on the cerebral cortex11,49,50) to vary as a function of the uncertainty in the pretrained network’s perceptual categorisation. This revealed that heightened gain facilitated earlier perceptual switches by transiently destabilizing the network’s dynamics under conditions of maximal uncertainty. Further analyses translated these neural dynamics into two predictions that could be tested in fMRI data17,48: (1) heightened gain increases the velocity of low-dimensional neural trajectories around perceptual switches, and (2) it flattens the energy landscape of the neural state space47. Overall, our results support the hypothesis that phasic bursts of neuromodulatory activity act as a “network reset”34,36, dynamically disrupting stable network states and facilitating switches in perceptual categorisation. This reset mechanism highlights the role of neuromodulatory systems in transiently reorganizing network dynamics to enhance flexibility and adaptability in response to uncertainty.
Results
Evoked pupil dilations coincide with the resolution of perceptual ambiguity
To assess the role of ascending arousal activity during task performance, we analysed a dataset of 35 participants who performed an ambiguous figures task while pupil diameter was simultaneously recorded with an eye tracker (SR Research, 1000 Hz). Briefly, the task consisted of a set of continuously transforming images that transition from an initial object (e.g., a shark) into a second object (e.g., a plane), while preserving basic psychophysical attributes (Fig. 1A). Crucially, even though the task stimuli change incrementally and linearly, with maximal ambiguity at the mid-point of each trial (the peak of the dotted line curve in Fig. 1A), awareness of a change in the stimulus is known to ‘pop out’, often at different times on each trial48. When these perceptual switches occurred, subjects were instructed to change the button they were pressing, thus indicating a change in perceptual interpretation across stimuli. Participants viewed 20 unique sets of images, each of which morphed from a starting image into a second image through 15 equally spaced intermediate stages (Fig. 1A). For each participant, we identified the first and last time they viewed a sequence of images, as well as the three images leading up to and following an identified perceptual switch, irrespective of the categories associated with each specific object switch. The rest of the analyses in this manuscript are organised around this perceptual transition.
Given the known (admittedly non-specific) relationship between LC activity and the dynamics of pupil diameter51 (Fig. 1B), we were able to test the hypothesis that neuromodulatory tone is associated with perceptual switching. The linear nature of the morphing procedure meant that luminance levels (which could otherwise bias pupil diameter32,37,38) were kept constant across all trials. Additionally, motor preparation was controlled by requiring subjects to press a button on each image (indicating the content of their perception). Mapping all blink-corrected, filtered and normalized trials over time.

Pupil diameter tracks perceptual change.
A) Example trial showing the continuous change from a stable image (plane) into a shark; Lower: the probability of detecting a switch (Δ) as a function of Image – most switches occur around the mid-point, but not exclusively so, leading to our prediction of heightened locus coeruleus activity at the switch point; B) Representation of the locus coeruleus (red), its diffuse projections to the whole brain network and its link to pupil dilation. C) Pupil diameter group average evoked response time locked to the perceptual change (dark line, t = 0); significance is shown in the top grey bar (pFDR < 0.05), showing that the two images around the perceptual change (Δ) differ from the null model. The average of the first and last two images is shown in the left (right) section of the plot (dotted line). We observed an increase of the pupillary response that peaked after the perceptual change. D) Group average of evoked pupillary responses to image switches – red represents the faster response when the switch occurs at image 6; green indicates a medium response with the switch at image 8; and blue denotes the slowest response with the switch at image 10.
We observed a clear increase in the phasic pupillary response approximately three trials before participants switched to a new perceptual category, potentially reflecting the onset of increased ambiguity toward a new object (Fig. 1C). This response peaked at the point of the perceptual switch, corresponding to the maximum pupil diameter (Fig. 1C). Further analysis revealed a significant increase in the mean pupil response starting three images before the change point (mean β = 0.22; t(32) = 8.02, p = 2.3 × 10⁻¹⁹), before returning to baseline levels.
Next, we sought to elucidate the relationship between ascending arousal, quantified by pupil diameter, and the temporal dynamics of perceptual shifts on a trial-by-trial basis. Given the pivotal role of the LC in modulating sensory processing and perceptual switches (Fig. 1B), we hypothesized that the speed of a perceptual switch would correlate with neuromodulatory tone. Specifically, we predicted that trials with faster perceptual switches would be associated with an increase in pupil diameter, while slower switches would correspond to a decrease.
To test this prediction, we performed a two-level linear model analysis. The peak pupil diameter observed during the perceptual switch was designated as the independent variable, and the trial on which the perceptual shift was reported served as the regressor for each subject. To control for potential confounds, such as impulsive premature responses, and to address reduced statistical power in extreme response epochs (both early and late), we limited our analysis to responses within two images of the median switch point (9 ± 2; 84.1% of total trials). At the group level, we conducted a one-tailed t-test on the regression coefficients (β) from the subject-level linear models. As expected, we observed an inverse relationship between evoked pupil diameter and the trial marking the perceptual switch (mean β = -0.19, t(27) = -2.6452, p = 6.7 × 10⁻³, SD = 0.3880). Earlier responses were associated with a larger evoked pupil diameter during the switch epoch (Fig. 1D red), whereas later responses were associated with a more constricted pupil (Fig. 1D blue). In summary, these results provide indirect evidence for our hypothesis that ascending neuromodulation – such as LC activity – is associated with the speed of perceptual switches.
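The two-level logic described above can be sketched in a few lines. This is a minimal illustration only, not the analysis code used for the reported statistics: the function names, the choice to treat the switch image as the predictor and peak pupil diameter as the outcome, and the use of ordinary least squares at the first level are all assumptions made for the example.

```python
import math
import numpy as np

def subject_slope(switch_image, peak_pupil, centre=9, half_width=2):
    """First level (one subject): least-squares slope relating peak pupil
    diameter to the image on which the switch was reported.

    Only trials whose switch image lies within centre +/- half_width are used,
    mirroring the 9 +/- 2 window described in the text.
    """
    x = np.asarray(switch_image, dtype=float)
    y = np.asarray(peak_pupil, dtype=float)
    keep = np.abs(x - centre) <= half_width
    slope, _intercept = np.polyfit(x[keep], y[keep], deg=1)
    return float(slope)

def group_t(slopes):
    """Second level: one-sample t statistic of the subject slopes against zero.

    Returns the statistic and its degrees of freedom.
    """
    b = np.asarray(slopes, dtype=float)
    t = b.mean() / (b.std(ddof=1) / math.sqrt(b.size))
    return float(t), b.size - 1
```

A one-tailed p-value for a predicted negative slope would then be the lower tail of Student's t distribution evaluated at the returned statistic and degrees of freedom (for instance via `scipy.stats.t.cdf`, if SciPy is available).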
Computational evidence for neuromodulatory-mediated perceptual switches in a recurrent neural network
Our initial results provided confirmatory evidence implicating the neuromodulatory tone of the ascending arousal system in perceptual switches. There is evidence, however, suggesting that simply changing stimulus categories can also induce similar pupillary dilations41,52,53. What we need, therefore, is a more mechanistic means of both framing and testing our network reset hypothesis in the context of perceptual switching. Along with others54–57, we have used a combination of computational modelling49,58, neurobiological theory11, and multi-modal neuroimaging19,27,31,32 to suggest that noradrenaline alters neural gain50,61, which in turn affects inter-regional communication, flattening the energy landscape traversed by the brain’s dynamics and allowing the brain state to jump between perceptual attractors more easily. Whether these signatures of large-scale network reconfiguration are mechanistically related to network reset remains an important and open question.
To test whether our hypothesised neuromodulatory mechanism could recapitulate the behaviour we observed in the ambiguous figures task, we trained 50 continuous-time recurrent neural networks (RNNs) constrained to respect Dale’s law (i.e., 80/20 split of purely excitatory/inhibitory units62,63) to perform a perceptual change detection task analogous to the task performed by our participants (Fig. 2A). The input and readout weights were constrained to be purely excitatory and only the firing rate of excitatory units contributed to the readout64 (see Materials and Methods).
Each network was provided with a two-dimensional input u(t) = [u1 u2]ᵀ, with each element representing the “sensory evidence” for one of the two stimulus categories (Fig. 2A). The task lasted for 1 second of simulation time (we used a shorter time period for the simulation than the empirical task so that we could keep the integration step relatively small, making the training and simulations more numerically tractable). To mimic the linear transition between image categories in our task, each trial began with maximum evidence for one of the two categories and minimum evidence for the other (e.g., u1 = 1, u2 = 0), and the evidence then changed linearly over the course of the trial such that by the final time-step the evidence for each category had switched (e.g., u1 = 0, u2 = 1). At each time point, the network was trained to output a categorical response indicating which input dimension had a higher value (Fig. 2B). Following training, all networks achieved near perfect behavioural accuracy (0.97 ± 0.02).
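For concreteness, one trial of this input can be generated as below. This is a sketch under stated assumptions: the function name `make_trial` and the default of 100 time steps are hypothetical (the integration step is not specified in this section), and the target is simply the index of the larger input.

```python
import numpy as np

def make_trial(n_steps=100, start_category=0):
    """Build one trial of linearly morphing two-dimensional evidence.

    n_steps is an illustrative choice, not a value taken from the text.
    Returns time (s), inputs u with shape (2, n_steps), and the target
    category (index of the larger input) at every time step.
    """
    t = np.linspace(0.0, 1.0, n_steps)      # 1 s of simulated time
    falling, rising = 1.0 - t, t            # evidence ramps from 1 to 0 and 0 to 1
    if start_category == 0:
        u = np.stack([falling, rising])     # u1 -> u2 trial
    else:
        u = np.stack([rising, falling])     # u2 -> u1 trial
    # argmax picks the first index on exact ties; with an even n_steps the
    # two ramps never coincide exactly, so no tie occurs.
    target = np.argmax(u, axis=0)
    return t, u, target
```

By construction the two evidence values sum to one at every time step, so the mid-point of the trial is the point of maximal ambiguity.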
We next sought to test our hypothesis about the role of neural gain in perceptual switches. In previous work, we (and others) have argued that the impact of neuromodulators (such as NA) on population-level activity can be approximated by steepening (or flattening) the sigmoid activation function, thus mimicking the effect NA has on neuronal excitability by liberating intracellular calcium stores and/or opening (or closing) voltage-gated ion channels11,65. As a first test of this hypothesis, we manipulated the gain of the sigmoid activation function for all units in the network across a range of gain values (0.5 to 1.5) in a static manner. As predicted, increased gain (red; corresponding to heightened adrenergic tone) led to earlier ‘perceptual switches’ in the network output, whereas low gain caused later switches (Fig. S1).
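The effect of gain on the activation function can be illustrated with a logistic sigmoid whose argument is scaled by the gain. The exact parameterisation used in the networks is given in Materials and Methods; the multiplicative form below is an assumption made for illustration.

```python
import numpy as np

def sigmoid(x, gain=1.0):
    """Logistic activation with a multiplicative gain on its argument.

    Assumed form: sigma(x) = 1 / (1 + exp(-gain * x)). Higher gain steepens
    the curve around x = 0; lower gain flattens it.
    """
    x = np.asarray(x, dtype=float)
    return 1.0 / (1.0 + np.exp(-gain * x))

def slope_at_zero(gain):
    """Analytic slope of the assumed sigmoid at x = 0, which equals gain / 4."""
    return gain / 4.0
```

Under this form the output at zero input is unchanged by gain, while the same supra-threshold input produces a larger response at high gain than at low gain, which is the sense in which gain amplifies differences between competing populations.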
Having confirmed that static manipulations of gain alter the speed of perceptual switches, we constructed a more precise test of our hypothesis. Specifically, inspired by previous theoretical and experimental work showing that sensory prediction errors (i.e. transient increases in perceptual uncertainty) lead to phasic bursts in the noradrenergic locus coeruleus28,30, we made gain time dependent, with dynamics governed by a linear ODE with a forcing term proportional to the uncertainty (i.e. the entropy) of the network’s readout.
When the network’s readout becomes uncertain approaching the perceptual switch (i.e. has high entropy), gain increases in a phasic manner (with magnitude γ), and in the absence of the forcing, gain decays exponentially to its tonic value (gtonic = 1). This modification resulted in gain dynamics reminiscent of the participants’ pupil diameter (Fig. 2D), and crucially, the speed of perceptual switches increased with the magnitude of the uncertainty-driven forcing term (γ; Fig. 2F).
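A minimal sketch of such uncertainty-driven gain dynamics is given below. The specific equation is an assumption chosen to match the verbal description (exponential decay to the tonic value plus a forcing term scaled by γ): τ dg/dt = −(g − gtonic) + γ H(p), where H is the Shannon entropy of the softmax readout p. The time constant `tau`, the step size `dt`, the use of bits, and the forward Euler scheme are illustrative choices, not values from the paper.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax of a 1-D readout vector."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

def entropy_bits(p):
    """Shannon entropy in bits; equals 1 for a maximally uncertain binary readout."""
    p = np.clip(np.asarray(p, dtype=float), 1e-12, 1.0)
    return float(-(p * np.log2(p)).sum())

def simulate_gain(readouts, gamma, g_tonic=1.0, tau=0.05, dt=0.01):
    """Forward-Euler integration of the assumed gain equation.

    readouts: array of shape (n_steps, n_categories) of network outputs.
    tau and dt (seconds) are hypothetical values chosen for illustration.
    """
    g = g_tonic
    trace = []
    for z in readouts:
        forcing = gamma * entropy_bits(softmax(z))
        g = g + (dt / tau) * (-(g - g_tonic) + forcing)
        trace.append(g)
    return np.array(trace)
```

With γ = 0 the gain stays at its tonic value. With γ > 0 and a readout that crosses over mid-trial, the gain rises transiently around the point of maximal entropy and then relaxes back, which is the qualitative profile described in the text.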

A recurrent neural network model of perceptual switching.
A) We trained a continuous-time E/I recurrent neural network (RNN) to categorise linearly changing inputs representing two discrete categories (e.g., output z1 and output z2). B) Softmax of network outputs on example trial with γ = .6; dotted line shows the timing of the perceptual switch. C) Following training, the firing rate of the excitatory units was clearly separated into two stimulus-selective clusters: those that responded maximally to u1 (blue) and those that responded maximally to u2 (orange). Inhibitory units demonstrated a similar modular clustering but were sorted by the selectivity of the excitatory units they inhibited. D) Dynamics of gain on example trial with γ = .6, which peaks close to the perceptual switch (inset shows similarity to pupil diameter). E) Simplified network structure implied by selectivity analysis. Excitatory units (blue) form two stimulus-selective modules. Each excitatory cluster is inhibited by a cluster of inhibitory units and a third non-selective inhibitory population. Pipettes show lesion targets. F) Switch time as a function of γ magnitude (i.e. magnitude of uncertainty forcing). Lower black line shows a speeding effect of heightened γ (and therefore heightened gain at the perceptual switch). Teal lines show switch time for lesions to the inhibitory population targeting the initially dominant population (dark teal upper), and lesions to the inhibitory population selective for the stimulus the input is morphing into (light teal middle).
Having confirmed our hypothesis that increasing gain as a function of the network uncertainty increased the speed of perceptual switches, we next sought to understand the mechanisms governing this effect, starting with the circuit level and working our way up to the population level (cf. Sherringtonian and Hopfieldian modes of analysis66). Because of the constraint that the input and output weights were strictly positive, we could use their (normalised) value as a measure of stimulus selectivity. Inspection of the firing rates sorted by input weights revealed that the networks had learned to complete the task by segregating both excitatory and inhibitory units into two stimulus-selective clusters (Fig. 2C). As the inhibitory units could not contribute to the network’s readout, we hypothesised that they likely played an indirect role in perceptual switching by inhibiting the population of excitatory neurons selective for the currently dominant stimulus, allowing the competing population to take over and a perceptual switch to occur.
To test this hypothesis, we sorted the inhibitory units by the selectivity of the excitatory units they inhibit (i.e. by the normalised value of the readout weights). Inspecting the histogram of this selectivity metric revealed a bimodal distribution with peaks at each extreme, with units at each peak strongly inhibiting one stimulus-selective excitatory population to the exclusion of the other (Fig. S2). Because both the input and the firing rate of the dominant population are higher than those of the competing population in the lead-up to the perceptual switch, we hypothesized that gain likely speeds perceptual switches by actively inhibiting the currently dominant population rather than exciting/disinhibiting the competing population. We predicted, therefore, that lesioning the inhibitory units selective for the initially dominant stimulus (i.e. with normalised selectivity > 0.5) would dramatically slow perceptual switches, whilst lesioning the inhibitory units selective for the stimulus the input is morphing into would have a comparatively minor slowing effect on switch times, since that population does not receive sufficient input to take over until approximately halfway through the trial irrespective of the inhibition it receives. As selectivity is not entirely one-to-one, we expected both lesions to slow perceptual switches but to differ in magnitude. In line with our prediction, lesioning the inhibitory units strongly selective for the initially dominant population greatly slowed perceptual switches (Fig. 2F upper), whereas lesioning the population selective for the stimulus the input morphs into removed the speeding effect of gain but had a comparatively small slowing effect on perceptual switches (Fig. 2F middle).
Having found a circuit level explanation for the speeding effect of gain we next sought to understand the network’s behaviour at a population level by interrogating the parameter space (with dimensions defined by network input and gain) traversed by the network. Unlike standard non-linear dynamical systems with stationary or (very) slowly time varying parameters, input and gain change rapidly over the course of each trial dynamically shifting the location and existence of the attractors shaping the network dynamics. Each trial is, therefore, characterised by a trajectory through a two-dimensional parameter space with dimensions corresponding to the gain of the activation function and the mismatch between input dimensions (Δinput).

Analysis of RNN dynamical regime.
A) Contour map of convergence time across the full gain by Δinput parameter space averaged across 100 initialisations with random initial conditions. Example parameter trajectories shown in white for high and low γ trials. B) Contour map of convergence proportion across the full parameter space. C-E) Example dynamics with gain = 1.1 and Δinput ≈ [1, 0], [.5, .5], and [0, 1] respectively. F-H) Example dynamics with gain = 1.5 and Δinput ≈ [1, 0], [.5, .5], and [0, 1] respectively.
Based on the selectivity of the network firing rates, we hypothesised that the dynamics were shaped by a fixed-point attractor, whose location and existence were determined by gain and Δinput, and changed dynamically over the course of a single trial67–70. Because of the large size of the network, we could not solve for the fixed points or study their stability analytically. Instead, we opted for a numerical approach and characterised the dynamical regime (i.e. the location and existence of approximate fixed-point attractors) across all combinations of gain and Δinput visited by the network. Specifically, for each combination of elements in the parameter space
Across gain values, when Δinput had unambiguous values (u1 ≫ u2 or u2 ≫ u1), the network rapidly converged across all initialisations (Fig. 3A & 3C-H). When Δinput became ambiguous, however, the dynamics acquired a decaying (inhibition-driven) oscillation and on many trials did not converge within the time frame of the simulation. As gain increased, the range of Δinput values characterised by oscillatory dynamics broadened. Crucially, for sufficiently high values of gain, ambiguous Δinput values transitioned the network into a regime characterised by high amplitude oscillations (Fig. 3D & 3G). Each trial can, therefore, be characterised by a trajectory through this 2-dimensional parameter space, with dynamics shaped by the dynamical regimes of each location visited (Fig. 3A-B).
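The numerical characterisation of convergence can be sketched as follows. This is an illustration under explicit assumptions rather than the procedure used to produce Fig. 3: the rate equation τ dx/dt = −x + W σ(gain · x) + drive, the convergence criterion (norm of the state derivative falling below `tol`), and all default parameter values are hypothetical. Only the overall design, holding gain and input fixed and simulating from many random initial conditions, follows the text.

```python
import numpy as np

def sigmoid(x, gain):
    return 1.0 / (1.0 + np.exp(-gain * x))

def convergence_time(W, drive, gain, x0, tau=0.05, dt=0.001, t_max=1.0, tol=1e-3):
    """Simulate with fixed gain and input; return the time (s) at which the
    state derivative first drops below tol, or NaN if that never happens
    within t_max. Model form and criterion are assumptions for illustration.
    """
    W = np.asarray(W, dtype=float)
    drive = np.asarray(drive, dtype=float)
    x = np.array(x0, dtype=float)
    for step in range(int(round(t_max / dt))):
        dx = (-x + W @ sigmoid(x, gain) + drive) / tau
        if np.linalg.norm(dx) < tol:
            return step * dt
        x = x + dt * dx                      # forward Euler update
    return np.nan

def scan(W, drives, gains, n_init=100, seed=0):
    """Mean convergence time and proportion of converged runs for every
    (gain, drive) pair, averaged over random initial conditions."""
    rng = np.random.default_rng(seed)
    n_units = np.asarray(W).shape[0]
    mean_time = np.full((len(gains), len(drives)), np.nan)
    proportion = np.zeros((len(gains), len(drives)))
    for i, g in enumerate(gains):
        for j, drive in enumerate(drives):
            times = np.array([
                convergence_time(W, drive, g, rng.standard_normal(n_units))
                for _ in range(n_init)
            ])
            converged = ~np.isnan(times)
            proportion[i, j] = converged.mean()
            if converged.any():
                mean_time[i, j] = times[converged].mean()
    return mean_time, proportion
```

The two returned arrays correspond conceptually to the convergence-time and convergence-proportion maps: regions of the parameter space where runs fail to settle within the simulation window show up as a reduced proportion of converged runs.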
When uncertainty had a small impact on gain (low γ), the network’s trajectory passed through three regimes: an initial regime characterised by rapid convergence to a fixed point where the population representing the initial stimulus dominated whilst the other was silent (Fig. 3C); an uncertain regime characterised by oscillations with all neurons partially activated (Fig. 3D); and, after passing through the oscillatory regime, a new fixed-point regime where the population representing the initial stimulus was silent whilst the other was dominant (Fig. 3E).
For high γ trials, the network again started and finished in states characterised by rapid convergence to a fixed point representing the dominant input dimension (Fig. 3F-H). However, it differed in how it transitioned between these states. Uncertain inputs generated high amplitude oscillations, causing the network to flip-flop between active and silent states (Fig. 3G). We hypothesised that, within the task, this mechanism silenced the initially dominant population, while boosting the competing population. To test this, we initialised each network with parameter values well inside the oscillatory regime (u ≈ [.5 .5], gain = 1.5) with initial conditions determined by the selectivity of each unit. Excitatory units selective for u1, as well as the associated inhibitory units projecting to this population, were fully activated, whilst the excitatory units selective for u2 (and the associated inhibitory units) were silenced (and vice versa for u2 → u1 trials). As we predicted, when initialised in this state the network dynamics displayed an out-of-phase oscillation where the initially dominant population was rapidly silenced and the competing population was boosted after a brief delay (219 ± 114 ms; Fig. S3).
At the population level, therefore, heightened gain at points of ambiguity accelerates perceptual switches by transiently pushing the dynamics into an unstable regime. This regime replaces the fixed-point attractor representing the input with an oscillatory regime that actively inhibits the currently dominant population and boosts the competing population, before transitioning back to a stable (approximate) fixed-point attractor representing the new stimulus (Fig. 3F-H & Fig. S3).
Large-scale neural predictions of recurrent neural network model
Having confirmed the behavioural component of our gain modulation hypothesis in our model, and characterised both the circuit and population level mechanisms, we next sought to test our hypothesis that the speeding effect of uncertainty-driven gain on perceptual switches is mediated by a flattening of the energy landscape traversed by the network dynamics. Crucially, translating the dynamics of the RNN into an energy-based framework also allowed us to generate a series of predictions that we could later test in functional neuroimaging data.
In recent work47,71, we have shown that peaks in BOLD within the LC precede large changes in brain state dynamics. Dynamical systems theory72 treats the brain as a dynamical system whose state (i.e., an instantaneous snapshot of the activity of all regions of the system) evolves over time, shaped by the presence (or absence) of attractors. Viewed through this lens, the effect of the LC can be conceptualised as akin to lowering the energy barrier required to escape a fixed-point attractor, or as a transient injection of kinetic energy via an external force, allowing the brain to reach a novel location in state space47. Crucially, there are two complementary viewpoints from which we can construct an energy landscape: the first, allocentric (i.e., third-person view) perspective quantifies the energy associated with each position in state space, whereas the second, egocentric (i.e., first-person view) perspective quantifies the energy associated with relative changes, independent of the direction of movement or the location in state space. The allocentric perspective is straightforwardly comparable to the potential function of a dynamical system but can only be applied to low-dimensional data in settings where a position-like quantity is meaningfully defined. The egocentric perspective is analogous to taking the point of view of a single particle in a physical setting and quantifying the energy associated with movement relative to the particle’s initial location. An egocentric framework is thus more applicable when signal magnitude is relative rather than absolute. See Materials and Methods and Fig. S4 for an intuitive explanation of the allocentric and egocentric energy landscape analyses on a toy dynamical system.

Allocentric and egocentric energy landscape dynamics underlying the perceptual speeding effect of heightened gain.
A) Example network trajectory projected onto PC1 and averaged across trials for low (0.1; solid blue), medium (0.5; dotted green), and high (0.9; solid red) γ for the u1 → u2 condition. B) (abs) Velocity of PC1 trajectories across low (0.1), medium (0.5), and high (0.9) γ. C-D) Allocentric landscapes for low (0.1; blue) and high (0.9; red) γ conditions. Trial averaged PC1 trajectory shown in black. For purposes of visualisation energy values > 6 are set to a constant value. E-F) Egocentric landscapes for low (0.1; blue) and high (0.9; red) γ conditions. G) (Allocentric) neural work for low (0.1), medium (0.5), and high (0.9) γ, averaged across networks and conditions. H) Egocentric AUC for low (0.1), medium (0.5), and high (0.9) γ, averaged across networks and conditions.
To characterise the energy landscape traversed by the network dynamics, we ran both time-resolved allocentric and egocentric energy landscape analyses. For the allocentric analysis, we first had to reduce the dimensionality of the RNN’s dynamics by performing a Principal Component Analysis (PCA) on the concatenated activity of the network at gain = 1. The set of PCs was low-dimensional, with 80.58 ± 6.34% of the variance explained by the first principal component (PC1). Based on this information, we projected the network activity on each trial and for each gain value and timepoint onto the first PC. The resultant low-dimensional trajectories all showed a change in direction around the timepoint of the switch in network output from category 1 to category 2 (and vice versa; Fig. 4A). This recapitulates a system jumping between attractors, occurring earlier as a function of heightened gain associated with heightened values of γ (Fig. 4A). This switch not only occurred sooner as a function of heightened gain, it also occurred at a higher neural “speed”, with the velocity of the trajectory peaking sharply at the point of the switch under high γ, whereas the transition between states was comparatively gradual under low γ (Fig. 4B).
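The projection and velocity steps can be sketched as follows. This is an illustrative implementation, not the code used for Fig. 4: the function names are hypothetical, PCA is computed here by a singular value decomposition of the mean-centred activity, and velocity is taken as the absolute finite-difference derivative of the PC1 trajectory.

```python
import numpy as np

def fit_pc1(activity):
    """Fit the first principal component.

    activity: array (n_samples, n_units), e.g. concatenated network activity.
    Returns the unit mean, the PC1 loading vector, and the fraction of
    variance explained by PC1.
    """
    X = np.asarray(activity, dtype=float)
    mean = X.mean(axis=0)
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    explained = float(s[0] ** 2 / np.sum(s ** 2))
    return mean, vt[0], explained

def project(activity, mean, pc1):
    """Project activity (n_time, n_units) onto a previously fitted PC1."""
    return (np.asarray(activity, dtype=float) - mean) @ pc1

def speed(trajectory, dt):
    """Absolute velocity of a 1-D trajectory sampled every dt seconds."""
    return np.abs(np.gradient(trajectory, dt))
```

Because the sign of a principal component is arbitrary, the absolute value is used so that the speed profile does not depend on which direction along PC1 happens to be labelled positive.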
With a low-dimensional description of our data in hand, we leveraged the relationship between probability and energy in statistical mechanics to construct a measure of the allocentric energy landscape (Fig. 4C-D) traversed by the low-dimensional dynamics.
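One common way to turn occupancy probabilities into an energy is the Boltzmann-style relation E = −ln P, under which frequently visited positions have low energy. The sketch below applies this relation to PC1 positions at each time point. It is an assumption-laden illustration: the histogram-based probability estimate, the bin edges, and the function name are hypothetical, and the estimator used in the paper is described in Materials and Methods.

```python
import numpy as np

def allocentric_energy(positions, bin_edges):
    """Time-resolved energy over position bins.

    positions: array (n_trials, n_time) of PC1 values.
    bin_edges: 1-D array of histogram bin edges along PC1.
    Returns an array (n_bins, n_time); bins never visited at a given time
    point are assigned infinite energy.
    """
    positions = np.asarray(positions, dtype=float)
    n_bins = len(bin_edges) - 1
    energy = np.full((n_bins, positions.shape[1]), np.inf)
    for k in range(positions.shape[1]):
        counts, _ = np.histogram(positions[:, k], bins=bin_edges)
        total = counts.sum()
        if total == 0:
            continue                         # nothing fell inside the bins
        p = counts / total
        visited = p > 0
        energy[visited, k] = -np.log(p[visited])
    return energy
```

For plotting, very large or infinite values would typically be capped at a constant, in the same spirit as the capping of high energy values described in the Fig. 4 legend.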
Although explanatorily useful in understanding the operation of the RNN, the allocentric landscape is not straightforwardly applicable to non-invasive neuroimaging data. In order to compare our network dynamics to neuroimaging data, and with previous work from our group, we inferred an estimate of the egocentric energy landscape (Fig. 4E-F) traversed by the dynamics. Specifically, we calculated the mean-squared displacement
These results reinforce our previous work and clearly demonstrate that the implementation of neuromodulatory-mediated dynamics in the RNN acted in a similar fashion to previously observed patterns in resting-state fMRI26. In addition, our results confirm that the putative impact of the release of noradrenaline from the locus coeruleus can change the manner in which brain states evolve over time, facilitating the navigation of otherwise difficult state transitions26.
The low-dimensional signature of ambiguity resolution and perceptual change
Having confirmed our hypothesis about the speeding effect of gain in our RNN model, we next sought to test the predictions in the human brain - i.e., examining whether the increase in neural speed and the flattening of the energy landscape observed in the RNN were also present in functional neuroimaging data. To this end, we re-analysed an existing BOLD dataset collected while participants performed a similar version of the ambiguous figures task to identify the low-dimensional patterns that occur during the perceptual change.
We were, however, left with a dilemma: RNNs provide a proof-in-principle of how computations can be instantiated in neural networks; however, there are key differences between artificial neural networks and the human brain that require careful consideration73,74. While both RNNs and the brain are thought to compute through dynamics75, the human brain is comprised of highly specialised neural circuits that have been shaped over evolutionary time to perform a range of highly idiosyncratic functions that matter for adaptive behaviour75, but are not necessarily related to task-switching. So where in the brain should we look for the same low-dimensional signatures we observed in the RNN as a function of gain? Rather than select a particular region a priori, we instead opted for a data-driven approach - principal components analysis (PCA) - which summarizes regional timeseries concatenated across all subjects and trials into a set of low-dimensional patterns that can then be interrogated in a similar fashion to the activity of the RNN (see Methods for details). Consistent with previous work76, a small number of principal components (PCs) mapped onto distributed regions across the brain (Fig. 5A) and explained a substantial proportion of the variance observed in the task (PC1–3 explained 32% of the total variance).

Low-dimensional switch-related dynamics and connectivity.
A) spatial loadings of PC1 (green), PC2 (red) and PC3 (blue); B) Mean absolute β loading (solid lines) and group standard error (shaded) of PC1 (green), PC2 (red) and PC3 (blue), organized around the image switch point (Δ) - the dotted grey lines show the 95th percentile of the null distribution of a block-resampling permutation; C) radar plot showing the partial correlations of PC1 (green), PC2 (red) and PC3 (blue); D) Evoked Brain activity of PC2 + PC3 during the perceptual switch. E) Group averaged functional connectivity and module assignments using a Louvain analysis - three clusters were observed. F) Pearson’s correlation between the sum of PC2 and PC3 (per subject) and a joint-histogram comparing Integration (participation coefficient) and Segregation (module-degree Z-score); p < 0.05 following permutation testing.
To isolate the low-dimensional component that best reflected the task (Fig. 1A), we performed a principal component regression77 that modelled the switch point of each trial using the loadings of the top 3 PCs calculated from fMRI data. While PC1 was not selectively aligned with switches, both PC2 and PC3 showed a pronounced, isolated peak around the switch point across trials (Fig. 5B), with PC2 showing the most robust task-related engagement (Fig. 5B & Fig. S5). To ensure that these results could not be explained by the spatial autocorrelation inherent within the PC maps, we created a null distribution of regression coefficients calculated using the same statistical model but with block-resampling applied to the switch times in the design matrix. The dotted grey line in Fig. 5B denotes the 95th percentile of the null distribution, and clearly shows that the engagement of both PC2 and PC3 during the switch point was greater than expected by chance. Furthermore, to validate that the perceptual switch was predominantly represented by PC2 and PC3 (Fig. 5D), we conducted a regression with these two PCs as predictors and the evoked activity derived from the original BOLD time-series as the dependent variable. The resulting variance accounted for was 88% (R² = 0.88, β = 0.99, p = 9.2×10⁻¹⁷⁸).
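The logic of the block-resampling null can be illustrated with a minimal sketch. It assumes a single PC time series and a binary switch regressor sampled once per image; the function names (`glm_beta`, `block_permutation_null`), the block length of 15 images, and the synthetic data are illustrative assumptions rather than the analysis code used in this study.

```python
import numpy as np

def glm_beta(regressor, signal):
    """Least-squares slope of `signal` on `regressor`, fitted with an intercept."""
    design = np.column_stack([np.ones(len(regressor)), regressor])
    beta, *_ = np.linalg.lstsq(design, signal, rcond=None)
    return beta[1]

def block_permutation_null(regressor, signal, block_len, n_perm, rng):
    """Null |beta| values obtained by shuffling whole blocks of the regressor."""
    n_blocks = len(regressor) // block_len
    usable = n_blocks * block_len
    blocks = np.asarray(regressor[:usable]).reshape(n_blocks, block_len)
    null = np.empty(n_perm)
    for i in range(n_perm):
        shuffled = blocks[rng.permutation(n_blocks)].ravel()
        null[i] = abs(glm_beta(shuffled, signal[:usable]))
    return null

# Synthetic example (hypothetical data): 20 picture sets of 15 images,
# one switch per set at a variable position within the set.
rng = np.random.default_rng(0)
block_len, n_sets = 15, 20
switch = np.zeros(block_len * n_sets)
for s in range(n_sets):
    switch[s * block_len + rng.integers(4, 12)] = 1.0
pc_ts = 2.0 * switch + rng.normal(0.0, 0.5, size=switch.size)

observed = abs(glm_beta(switch, pc_ts))
null = block_permutation_null(switch, pc_ts, block_len, n_perm=500, rng=rng)
threshold = np.percentile(null, 95)  # analogue of the dotted grey line
```

Shuffling whole blocks rather than individual samples preserves the within-set temporal structure of the regressor, which is the property a block-resampling null is meant to retain.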
To determine whether PC2 or PC3 was a better index of perceptual switching, we then correlated the spatial loadings of PC2 and PC3 with the spatial map associated with the term “switching” from a meta-analysis performed on the Neurosynth database78. We observed a significant positive correlation between the map for “switching” and both PC2 (r = 0.453, p = 3.041×10⁻¹⁸), and PC3 (r = 0.115, p = 0.037); however, the correlation for PC3 was much lower than that for PC2, suggesting that PC2 was a better match for “switching”. The spatial map of PC2 was also positively correlated with other terms putatively associated with the ambiguous figures task (notably, “effort”, “load” and “attention”; all r > 0.2; and not with “episodic”, which was included as a negative control). A partial correlation analysis revealed that PC2 was selectively associated with “switching” and “attention” (Fig. 5C). Given the multifaceted nature of the ambiguous figures task, the convergence between brain maps for “switching”, “attention”, and “effort” was to be expected, and we therefore did not try to dissociate them in further analysis.
Before turning to the predictions of the RNN we first sought to establish the face validity of focusing on a limited number of principal components. In previous work, we have linked the impacts of NA on systems-level neural dynamics to alterations in network topology47,79,80, with NA increasing large-scale network integration. Given that PCA naturally captures patterns of covariance between regions, we expected that the observed time signatures of PC engagement at the switch-point should coincide with similar measures of network integration. To test this hypothesis, we clustered the time-averaged functional connectivity matrix using a hierarchical modular decomposition approach (see Methods) - doing so revealed three main clusters (Fig. 5E). For each participant, we used this matrix and the three clusters to estimate the amount of integration (using the participation coefficient) and segregation (using the module-degree Z-score, see Methods) of each region. We then correlated a joint histogram of these measures with the sum of subject-specific regression coefficients for PC2 and PC3 and observed a robust correlation with integration (Fig. 5F; p < 0.05 following permutation testing). These results clearly demonstrate the highly convergent nature of PCA and our previous network-based approaches.
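For readers less familiar with these graph measures, the following is a minimal sketch of the standard definitions of the participation coefficient and the module-degree Z-score. It assumes a symmetric connectivity matrix with non-negative weights and a precomputed module assignment; the function names and the toy graph are illustrative, and the handling of negative correlations used in the actual analysis is not reproduced here.

```python
import numpy as np

def participation_coefficient(W, modules):
    """P_i = 1 - sum_m (k_im / k_i)^2, with k_im the strength of node i to module m."""
    W = np.asarray(W, float)
    modules = np.asarray(modules)
    strength = W.sum(axis=1)
    safe_strength = np.where(strength > 0, strength, 1.0)
    frac_sq = np.zeros(len(W))
    for m in np.unique(modules):
        k_m = W[:, modules == m].sum(axis=1)
        frac_sq += (k_m / safe_strength) ** 2
    P = 1.0 - frac_sq
    P[strength == 0] = 0.0  # isolated nodes have no connections to distribute
    return P

def module_degree_zscore(W, modules):
    """Within-module strength of each node, z-scored inside its own module."""
    W = np.asarray(W, float)
    modules = np.asarray(modules)
    z = np.zeros(len(W))
    for m in np.unique(modules):
        idx = modules == m
        k_within = W[np.ix_(idx, idx)].sum(axis=1)
        sd = k_within.std()
        if sd > 0:
            z[idx] = (k_within - k_within.mean()) / sd
    return z

# Toy example: four fully connected nodes split into two modules.
W_toy = np.ones((4, 4)) - np.eye(4)
modules_toy = np.array([0, 0, 1, 1])
P_toy = participation_coefficient(W_toy, modules_toy)
z_toy = module_degree_zscore(W_toy, modules_toy)
```

A high participation coefficient indicates that a region spreads its connections across modules (integration), whereas a high module-degree Z-score indicates strong connectivity within its own module (segregation).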
Confirmation of model predictions in whole-brain BOLD data
Based on the patterns observed in the RNN (i.e., those in Fig. 2–4), we hypothesized that the energy landscape topography would decrease, and the velocity of the low-dimensional brain patterns would peak at the switch point. Given its prominent role in switching, we focused our analysis on the PC2 time series. To estimate the (egocentric) energy landscape, we first estimated the mean displacement of PC2 by averaging the β value around the switch-point and then divided this term by the logarithm of the inverse probability of the loading of PC2, which was also inferred from the GLM. Using this approach, we observed that PC2 was maximally displaced at the perceptual change, suggesting that the brain state showed a substantial shift from baseline during the perceptual change. Energy (log[1/pswitch]; see Methods) showed a U-shaped pattern around the perceptual change point - i.e., with a minimum value at the perceptual change along with the first and last images (Fig. 6B-D). To relate this measure to the energy landscape framework, and to control for the specific displacement occurring at each image, we then calculated the ratio between energy and the mean displacement (i.e., energy landscape ‘depth’; Fig. 6A). As predicted, the brain state reduced the amount of energy per displacement towards its minimum around the perceptual change (Fig. 6B). We interpret this set of results as the system flattening the energy landscape, reducing the energy required for large displacement values (i.e., larger system changes become more common) and effectively generating a ‘network reset’34,36,81 of the brain state, which ultimately facilitated an updating of the content of perception.

Confirmation of model predictions in whole-brain BOLD data.
A) analysis of the RNN also predicted that the energy landscape dictating the likelihood of state transitions should be flat (i.e., have a small attractor depth) at the switch point; B) the energy landscape was demonstrably flatter (quantified as surprisal over brain activity displacement) at the switch-point; C) by interrogating the low-dimensional trajectories in the RNN, we predicted that there should be a peak in the gradient of the loadings in principal component space at the switch point between output #1 and output #2; D) the gradient (Δ×PC) of the β loading of PC2 as a function of the switch point.
To analyse speed-evoked changes in brain trajectories, we used a GLM to analyse each PC time series as a function of each perceptual switch. Our design matrix included the first and last images seen in each set, along with the three images leading up to the switch, the switch trial itself and the three images following the switch (see Methods for details). This approach thus allowed us to track the low-dimensional signature of the brain through the processing and resolution of perceptual ambiguity. As predicted (Fig. 6C), we found evidence that PC2 showed a peak in velocity at the change point (Fig. 6D) providing confirmatory evidence that the low-dimensional brain state dynamics observed in whole-brain fMRI were highly similar to those observed in the trained RNN.
Discussion
Here, we studied the relationship between the ascending arousal system, low-dimensional neuronal trajectories and energy landscape dynamics during a perceptual switch task. Our results provide evidence that the ascending arousal system is involved in the modulation of dynamic brain state topography during task-relevant perceptual switches. We found that pupil diameter tracked with ambiguity of task stimuli and was directly related to the speed of perceptual switches (Fig. 1). Next, we confirmed that this process could be replicated in an RNN model (Fig. 2) of perceptual change detection where the gain of the activation function was updated dynamically by the uncertainty of the network’s classification output (Fig. 2–3). We then used this model to generate two key predictions: around the time of the perceptual switch brain state velocity should peak, and the egocentric energy landscape should flatten, both of which we confirmed in neuroimaging data (Fig. 4–6). Together, these results suggest that the ascending arousal system facilitates state changes in the content of perception by transiently increasing neural gain - acting in a manner analogous to an external forcing function transiently increasing kinetic energy in the system - flattening the egocentric energy landscape and thereby reducing the energy needed to reset the system topography in an adaptive and task-dependent manner.
The relationship between perception and pupil diameter found here is consistent with the role of ascending neuromodulation in cognition and attention61,65. For instance, the LC dynamically changes its activity according to external and cognitive demands imposed on the system51,61,65,82,83. Importantly, our results extend these findings by suggesting a more precise role for LC-mediated alterations in neural gain. Specifically, based on the pupil dynamics in our task and previous experimental and theoretical work, we hypothesised that neural gain should change dynamically as a function of uncertainty (operationalised here as perceptual ambiguity) via the recruitment of the LC (along with other structures in the ascending arousal system), which then subsequently increases brain-wide communication by increasing the gain in targeted brain regions11,16,49,83. In the pupillometry data, pupil diameter (which is an indirect marker of the noradrenergic system32,37) increased as a function of perceptual ambiguity, which rose sharply in the few images prior to the reported perceptual change (Fig. 1D). Based upon this finding we then implemented an analogous mechanism in our pretrained RNN by making gain depend upon the entropy of the network’s classification. This dependence acted as a forcing function, transiently increasing gain when the input became ambiguous, which, in line with our hypothesis, led to earlier perceptual switches. We chose to use an RNN, instead of a simpler (more transparent) model, as we wanted to use the RNN as a means of both hypothesis generation and hypothesis testing. Specifically, unlike more standard neuronal models which are handcrafted to reproduce a specific effect, when building an RNN the modeller only specifies the network inputs, labels, and the parameter constraints (e.g. Dale’s law) in advance. The dynamics of the RNN are entirely determined by optimisation.
Post-training manipulations of the RNN are not built in, or in any way guaranteed to work, making them more analogous to experimental manipulations of an approximately task-optimal brain-like system. Confirmatory results are arguably, therefore, a first step towards an in vitro experimental test.
Thus, we provide early empirical and computational evidence that ascending neuromodulatory activity facilitates state changes in perception under conditions of perceptual ambiguity31,45,46 when a stimulus is task relevant. Importantly, we do not expect that our results will generalise to experimental settings in which a stimulus is not task relevant. We can make sense of this computationally by imagining the gain dynamics in our model if we added a second, task-irrelevant condition where at the beginning of each trial the model was given a cue indicating whether it would have to simply “maintain fixation” or read out the category of the input. In the presence of the task-irrelevant cue the model would read out the “maintain fixation” action with high certainty and thus not ramp up gain. We hypothesise therefore that the pupil dynamics observed in the task will depend on participants’ task-set. Indeed, there is evidence from a recent multistable perception experiment showing that arousal-related changes in pupil dilation disappear when the stimulus is not task-relevant. The authors of the study attribute the arousal-dependent pupil dilation to task-execution. This explanation, however, could not explain the ramping of pupil diameter in our task where the participants perform an action on every trial. Instead, based upon the workings of our computational model, we hypothesise that arousal-based changes in pupil diameter are driven by task-set related uncertainty and thus will depend on task-relevance rather than task-execution per se.
A core neuroanatomical property of the LC noradrenergic system is that a relatively small number of neurons (~fifty thousand in an adult human) projects to almost all brain regions38,84. This organisation implies that the LC acts as a low-dimensional modulator of the much more high-dimensional cerebral cortex. Subtle changes in the activity of LC can have significant effects on how different brain regions communicate49,65,82,85–87. The mechanism of gain modulation in our model was, likewise, dependent on a low-dimensional process, with the network output altering the gain uniformly across the full network. At a neuronal level NA increases excitability by liberating intracellular calcium and opening (or closing) voltage-gated ion channels11,65. In our model this global increase in excitability increased the speed of perceptual switches by recruiting inhibitory units to more rapidly inhibit the population encoding the initially dominant stimulus. At a population level the interaction between excitatory and inhibitory units led to the emergence of a gain-dependent oscillatory regime which suppresses the currently active population encoding the initially dominant stimulus and boosts the competing quiescent population. At the scale of the full network the gain-mediated changes resemble the transient application of an external forcing function pushing the network trajectory in the direction of the new percept which, from the perspective of the allocentric landscape, manifests as a spike in neural work at turning points in the network’s low-dimensional trajectory leading up to and following the perceptual switch. From the egocentric perspective this is characterised by a flattening of the landscape analogous to an externally driven increase in kinetic energy making large changes in the location of a particle more likely.
In line with the predictions of the RNN, in our analysis of the BOLD data we showed that the velocity of the low-dimensional brain state trajectory most associated with perceptual switching increased significantly at the point of reported perceptual change (Fig. 5B), which we interpret as the brain moving from one attractor to another (Fig. 6A). Importantly, we showed that around the perceptual switch, the energy needed for each unit of change in brain state (i.e., displacement) is smaller than at other points in the task (Fig. 6A-B). Under the (egocentric) energy landscape framework47,60, this tells us that the landscape is flattened, and the energy required to transition between states is reduced. Together with the pupillary findings (Fig. 1), the computational model (Fig. 2–4), and replication of former results47,60 (Fig. 5E-F), we propose that the ascending neuromodulatory system is responsible for the large-scale flattening of the egocentric energy landscape, which facilitates changes in task-relevant perceptual content.
This work is not without limitations. First, the pupil diameter dataset and the fMRI analysis came from different participants, such that the link between the pupil diameter and the fMRI results is inherently indirect. Moreover, differences in task timing, structure, and instructions between the fMRI and pupil experiments add complexity to interpreting the results. For instance, the fMRI task includes jittered inter-trial intervals (ITIs) and catch trials, features absent in the pupil task, which presents a more rapid stimulus sequence. These differences may have influenced perceptual switch points and task behavior across experiments. Additionally, the specificity of the pupil diameter as a marker of LC activity is under active debate37. For instance, there is evidence suggesting a role of the superior colliculus, the dorsal raphe nucleus and central cholinergic system in driving pupil dilations43,75,76, although it remains uncertain whether these other nuclei are directly related to pupil dilation, or only indirectly via their connections with other neural regions and nuclei. Despite this, we believe that our pupillometry dataset captures an important function of the noradrenergic system in cases of task-relevant perceptual ambiguity, as there is strong evidence showing that pupil diameter is a reliable marker of noradrenergic activity during evoked cognitive tasks44,85,89–91. Additionally, the sample size of our fMRI study makes it difficult to generalize our results. In spite of this, the converging evidence from the pupillometry dataset, the fMRI dataset and the computational model supports the role of ascending neuromodulation in mediating task-relevant perceptual switches. Future work is needed both in humans, with larger sample sizes utilizing fMRI and eye-tracking recordings, as well as in animal studies, to directly modulate and record LC activity in a task manipulating perceptual uncertainty.
Conclusion
In summary, we provide computational and empirical evidence for the association between neuromodulation, pupil dilation, and (egocentric) energy landscape flattening in task-relevant perceptual switches. Our results strengthen our understanding of the neurobiological processes underpinning moment-by-moment adaptive changes to perception. Specifically, we suggest that the widespread excitatory projections of the noradrenergic arousal system mediate the systems-level reconfigurations of cortical network architecture84,85 via uncertainty-driven alterations in neural gain. This suggests that more highly conserved features of the nervous system may play a role in driving task-relevant switches in the contents of perception.
Methods
Overview of empirical data
There were two independent groups analysed in this study: 35 subjects performed a perceptual decision-making task while pupil diameter was recorded; and a separate group of 17 subjects performed a version of the task adapted for the MRI scanner.
Perceptual Task
Twenty picture sets were used in which line drawings of common objects morphed over 15 iterations into a different object (Fig. 1A). Picture sets were selected from a larger set validated in an earlier study48. In the original study, participants reported verbally what they saw by typing in the name of the object. This reporting method guaranteed that participants could freely indicate what they saw without being restricted by categories (e.g., forced choice). Picture sets for the current study were selected with the criterion that all sets were perceived categorically in the normative study (i.e., that the majority of participants in the normative study categorized each picture they saw as either the first object or second object in the set17). Selecting only the categorically perceived image sets guaranteed that pictures in the middle of the morphing sequence were not simply ‘noisier’ than pictures at the beginning or end. In other words, the ambiguous images were still easily categorised by participants as either object 1 or object 2. All images were a standard size (316×316 pixels) and were displayed on a white background. In addition, in the fMRI study, participants were presented with two kinds of control picture sets to ensure that they were responding to changes in the pictures in the set rather than simply to the position in the set (e.g., always switching after the 8th picture). In these control picture sets, a salient deviating picture was presented either after three pictures or after thirteen pictures resulting in an early or late abrupt shift. Those sets served as controls and were not analysed further.
The picture morphing task consisted of five experimental runs. We randomized the order in which the picture sets were presented in each run and kept this randomized order consistent across participants. Picture morphing in each picture set occurred over fifteen discrete steps, each corresponding with the acquisition of a whole-brain image. In the fMRI experiment, each picture within a set was presented for two seconds. Pictures were randomly intermixed with eight interstimulus-intervals (2, 4, 6 or 8 seconds) during which participants saw a fixation cross. In the eye-tracking experiments, each picture was presented for 500 ms, followed by a fixation cross of 2 seconds. Participants provided their responses in the scanner using two buttons on a four-button Cedrus fibre-optic response system. In a two-alternative forced-choice task, participants were asked to press the first button when they ‘saw the first object’ and the second button when they ‘saw the second object’ - this ensured that motor responses were not confounded with perceptual switches, as responses occurred on both switch and non-switch trials. Participants were not informed beforehand of the identity of the second object in each picture set. At the end of each set of 15 images the word ‘END’ was presented for 2 s to indicate that the picture set had concluded and that the next would begin shortly.
Participants
A total of seventeen (6 male) neurologically healthy participants with normal or corrected to normal vision took part in the fMRI study (mean age 27.65 ± 8.01). Fifteen were right-hand dominant. A separate cohort of 35 participants performed the task while simultaneous pupil diameter was recorded using an eye tracker device (SR Research, 1000 Hz). None of the participants had a history of brain injury. Participants received $30 for their participation. All participants provided informed consent prior to participation. The research protocol was approved by the Office of Research Ethics at the University of Waterloo and the Tri-Hospital Research Ethics Board of the Region of Waterloo in Ontario, Canada.
Pupillometry
Fluctuations in pupil diameter of the left eye were collected using an Eyelink 1000 (SR Research Ltd., Mississauga, Ontario, Canada), with a 1 kHz sampling frequency. Blinks, artifacts, and outliers were removed and linearly interpolated92. High-frequency noise was smoothed using a second-order 2.5-Hz low-pass Butterworth filter. To obtain the pupil diameter average profile, data from each participant were normalized across each trial (corresponding to a set of 15 consecutive images). This allowed us to correct for low-frequency baseline changes without eliminating the load effect and baseline differences due to load manipulations93,94.
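Two of these preprocessing steps, linear interpolation over removed samples and within-trial normalisation, can be sketched as follows. This is a minimal illustration under stated assumptions: removed samples are marked as NaN, normalisation is taken to mean z-scoring within a trial (the exact normalisation is not specified above), and the function names are hypothetical. The Butterworth filtering step is omitted.

```python
import numpy as np

def interpolate_gaps(trace):
    """Linearly interpolate over samples marked as NaN (e.g. blinks)."""
    trace = np.asarray(trace, float).copy()
    bad = np.isnan(trace)
    idx = np.arange(trace.size)
    trace[bad] = np.interp(idx[bad], idx[~bad], trace[~bad])
    return trace

def normalise_trial(trace):
    """Z-score a pupil trace within one trial (assumed form of normalisation)."""
    trace = np.asarray(trace, float)
    return (trace - trace.mean()) / trace.std()

# Toy trace with two blink-like gaps.
raw = np.array([1.0, np.nan, 3.0, np.nan, np.nan, 6.0])
clean = interpolate_gaps(raw)
normed = normalise_trial(clean)
```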
Recurrent Neural Network Modelling
We used PyTorch95 to implement and train 50 continuous-time recurrent neural networks that we constrained to respect Dale’s law (NE+I = 40, 80% excitatory NE = 32, and 20% inhibitory NI = 8) using the procedure set out in63. The dynamics of each network evolved according to the following system of stochastic differential equations:
Where x ∈ ℝN×1 represents the sub-threshold activation of each unit, u ∈ ℝ2×1 the external input into the network, Wrec ∈ ℝ40×40 the recurrent weights, Win ∈ ℝ40×2 the input weights, and τ the time constant which we set to 100 ms. In addition to task input each unit in the network was driven by a Wiener process dW. The subthreshold activation variable x was converted into a vector of instantaneous firing rates by applying a sigmoid function
We imposed Dale’s law on the recurrent weights of the network by parametrising the weight matrix with a mask Wmask ∈ ℝ40×40 which contained zeros in the leading diagonal (removing self-connections), +1 in all non-diagonal entries of the first 32 rows/columns and -1 in the remaining 8 rows/columns. We obtained the constrained recurrent weight matrix by multiplying the absolute value of the trained weights element-wise with the mask, Wrec = |W| ⊙ Wmask, where W denotes the unconstrained trained weights and ⊙ the element-wise product.
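The masking procedure can be sketched in a few lines. This is a minimal NumPy illustration rather than the PyTorch training code: it assumes that columns index presynaptic units (so that each unit's outgoing weights share one sign), and the function names and random weights are hypothetical.

```python
import numpy as np

N_UNITS, N_EXC = 40, 32  # 32 excitatory and 8 inhibitory units, as in the text

def dale_mask(n_units=N_UNITS, n_exc=N_EXC):
    """Sign mask: +1 for excitatory columns, -1 for inhibitory, 0 on the diagonal."""
    mask = np.ones((n_units, n_units))
    mask[:, n_exc:] = -1.0      # assumption: columns index presynaptic units
    np.fill_diagonal(mask, 0.0)  # remove self-connections
    return mask

def constrain(raw_weights, mask):
    """Element-wise product of |raw weights| with the sign mask."""
    return np.abs(raw_weights) * mask

rng = np.random.default_rng(0)
raw = rng.normal(size=(N_UNITS, N_UNITS))  # stand-in for trained weights
w_rec = constrain(raw, dale_mask())
```

Because the absolute value is applied before masking, the sign constraint holds for any value of the underlying trainable parameters, so it is preserved throughout gradient-based optimisation.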
Following standard practice63, we simulated the network by discretising the system using a Euler-Maruyama integration scheme where
Each network was trained by optimising
The task consisted of a simple change detection paradigm analogous to the task performed by our human participants. Specifically, at each time point the network was fed a two-dimensional input u(t) = [u1 u2]T with each entry representing the “sensory evidence” for one of the two stimulus categories. The task lasted for 1 second of simulation time, beginning with maximum evidence for one of the two categories, u(t) = [1 0]T. Over the course of each trial the input changed linearly, so that at the halfway point of the simulation the sensory evidence for the two stimulus categories was perfectly matched, u(t) = [.5 .5]T, and by the final time-step it consisted of maximum evidence for the second stimulus category, u(t) = [0 1]T. We trained the network to output a response for stimulus category 1 whenever u1 > 0.5, and u2 < 0.5, and category 2 whenever u1 < 0.5, and u2 > 0.5.
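The input ramp and target labels for one trial can be sketched as below. The number of time steps (101, i.e. a 10 ms step over 1 s) and the function name `make_trial` are assumptions made for illustration; the exactly ambiguous midpoint is left unlabelled because the training rule above does not assign it a category.

```python
def make_trial(n_steps=101):
    """Return (inputs, labels) for one u1 -> u2 trial with a linear evidence ramp."""
    inputs, labels = [], []
    for i in range(n_steps):
        u2 = i / (n_steps - 1)  # evidence for category 2 rises from 0 to 1
        u1 = 1.0 - u2           # evidence for category 1 falls from 1 to 0
        inputs.append((u1, u2))
        if u1 > 0.5:
            labels.append(1)
        elif u1 < 0.5:
            labels.append(2)
        else:
            labels.append(None)  # perfectly matched evidence: no target defined
    return inputs, labels

inputs, labels = make_trial()
```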
To test our hypothesis that perceptual uncertainty increases neuromodulatory tone via phasic bursts in the noradrenergic locus coeruleus we made gain time dependent with dynamics governed by a linear ODE with a forcing term proportional to the uncertainty (i.e. the entropy H(z) = −Σi p(z)i ln(p(z)i)) of the network’s readout.
Where p(z) is obtained by passing z(t) through a softmax function at each time step of the simulation.
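To make the mechanism concrete, the sketch below integrates one possible linear gain ODE with an entropy forcing term, τg dg/dt = −(g − g0) + γ H(p(z)). The specific form of the ODE, the baseline gain g0 = 1, the time constant τg, the step size, and the synthetic readout sequence are all assumptions made for illustration and may differ from the equation used in the model.

```python
import math

def softmax(z):
    """Numerically stable softmax over a sequence of readout values."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(p):
    """Shannon entropy H = -sum_i p_i ln p_i (in nats)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

def simulate_gain(readouts, gamma, g0=1.0, tau_g=0.1, dt=0.01):
    """Euler integration of tau_g * dg/dt = -(g - g0) + gamma * H(softmax(z))."""
    g = g0
    trace = []
    for z in readouts:
        h = entropy(softmax(z))
        g += (dt / tau_g) * (-(g - g0) + gamma * h)
        trace.append(g)
    return trace

# Hypothetical readout that moves from confident category 1 to confident category 2.
readouts = [(5.0 - 0.1 * i, -5.0 + 0.1 * i) for i in range(101)]
gain_low = simulate_gain(readouts, gamma=0.1)
gain_high = simulate_gain(readouts, gamma=0.9)
gain_off = simulate_gain(readouts, gamma=0.0)
```

In this sketch gain stays at baseline when γ = 0 and rises transiently around the point of maximal ambiguity otherwise, with the size of the transient scaling with γ.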
To study how the population dynamics of the trained networks changed as a function of gain in a shared space we performed a Principal Component Analysis (PCA) on the concatenated activity of the network at γ = 0. The set of principal components was low-dimensional, with 80.58 ± 6.34% of the variance explained by the first principal component (PC1). We then projected the trial-averaged activity at each gain value at each timepoint onto the top PC.
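The projection step, and the absolute velocity of the resulting trajectory, can be sketched as follows. This is a minimal illustration using an SVD-based PCA on synthetic activity; the function names, the tanh-shaped latent switch, and the noise level are hypothetical and stand in for the simulated network activity.

```python
import numpy as np

def fit_pc1(activity):
    """Fit PCA via SVD on a (time, units) array; return mean, PC1 and its variance share."""
    mean = activity.mean(axis=0)
    _, s, vt = np.linalg.svd(activity - mean, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return mean, vt[0], explained[0]

def project(activity, mean, pc1):
    """Project activity onto PC1 using the mean of the reference condition."""
    return (activity - mean) @ pc1

def abs_velocity(trajectory, dt):
    """Absolute time derivative of a one-dimensional trajectory."""
    return np.abs(np.gradient(trajectory, dt))

# Synthetic activity: a single latent switch at t = 0.5 s embedded in 40 units.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 101)
latent = np.tanh(20.0 * (t - 0.5))
direction = rng.normal(size=40)
direction /= np.linalg.norm(direction)
activity = np.outer(latent, direction) + rng.normal(0.0, 0.001, size=(101, 40))

mean, pc1, var_explained = fit_pc1(activity)
trajectory = project(activity, mean, pc1)
speed = abs_velocity(trajectory, dt=t[1] - t[0])
```

Fitting the components on one reference condition and reusing the same mean and PC1 for every gain value keeps all trajectories in a shared space, which is what allows them to be compared directly.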
Energy Landscape Analysis
Leveraging previous work from our group47 we constructed a measure of the energy landscape traversed by each network through an analogy to the relationship between probability and energy in statistical mechanics98 given by the Boltzmann distribution:
$$p_i = \frac{1}{z} e^{-\beta E_i}$$
Where p_i denotes the probability of each state, E_i the energy of each state, β the thermodynamic beta, and z the canonical partition function. Solving for E_i we obtain:
$$E_i = -\frac{1}{\beta} \ln(z\, p_i)$$
Instead of inferring the probability distribution from the energy of a state as is done in physics we used the fitdist function in MATLAB with a Gaussian kernel
For the allocentric landscape analysis we defined the state of the system in terms of the trial averaged loadings on PC1 which we divided into 250 ms windows. For the egocentric landscape analysis, we calculated the mean-squared displacement (MSD) of the activity of the RNN at each time point τ0 relative to reference point τ0 + τ.
For congruency with the allocentric analysis we increased τ and τ0 in steps of 250 ms starting 1s into the trial and ending with a maximum difference between τ and τ0 of 5 s to ensure that all steps had equivalent window sizes.
Following the physical analogy, we think of the state of the system, PC1 loadings in the allocentric analysis, and MSD in the egocentric analysis, as akin to the location and movement of a particle respectively. Positions in state space with low energy have a higher probability of being occupied, and systems with a higher average energy have a more uniform probability distribution making large jumps in the position of a particle more likely (i.e. lower energy for large MSD values). See supplementary material (Fig. S4).
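The egocentric pipeline (mean-squared displacement, kernel density estimate, then energy as negative log probability) can be sketched as below. This is a Python stand-in, not the MATLAB code used here: the hand-written Gaussian kernel estimator replaces `fitdist`, the bandwidth is an arbitrary choice, the energy is computed with β = 1 and the partition function absorbed into the density, and the random-walk activity is synthetic.

```python
import numpy as np

def mean_squared_displacement(activity, t0, lag):
    """MSD between the state at t0 and the state at t0 + lag, averaged over units."""
    diff = activity[t0 + lag] - activity[t0]
    return float(np.mean(diff ** 2))

def gaussian_kde_pdf(samples, grid, bandwidth):
    """Gaussian kernel density estimate of `samples`, evaluated on `grid`."""
    samples = np.asarray(samples, float)[None, :]
    grid = np.asarray(grid, float)[:, None]
    kernels = np.exp(-0.5 * ((grid - samples) / bandwidth) ** 2)
    norm = samples.shape[1] * bandwidth * np.sqrt(2.0 * np.pi)
    return kernels.sum(axis=1) / norm

def energy_from_probability(p, floor=1e-12):
    """E = -ln p, with a floor on p to avoid taking the log of zero."""
    return -np.log(np.maximum(p, floor))

# Synthetic activity: a random walk over 200 time points and 40 units.
rng = np.random.default_rng(2)
activity = np.cumsum(rng.normal(0.0, 0.1, size=(200, 40)), axis=0)
lag = 5
msd = np.array([mean_squared_displacement(activity, t0, lag)
                for t0 in range(200 - lag)])
grid = np.linspace(0.0, msd.max() * 1.5, 200)
pdf = gaussian_kde_pdf(msd, grid, bandwidth=0.2 * msd.std())
energy = energy_from_probability(pdf)
```

Frequently observed displacements receive low energy and rare ones high energy, so a flatter landscape corresponds to large displacements becoming relatively more probable.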
To quantify the effect of gain mediated alterations to the topography of the allocentric energy landscape we devised a novel measure - neural work - of the force (which in classical mechanics is equal to the negative gradient of potential energy) exerted on the low dimensional neural trajectory by the vector field quantified by the allocentric energy landscape at each time point in the trial.
Where st is the displacement of the PC trajectory, and
MRI Data
Functional data were acquired using gradient echo-planar T2*-weighted images collected on a 1.5 T Philips scanner located at Grand River Hospital in Waterloo, Ontario (TR = 2000 ms; TE = 40 ms; slice thickness = 5 mm with no gap; 26 slices/volume; FOV = 220 × 220 mm²; voxel size = 2.75 × 2.75 × 5 mm³; flip angle = 90°). Each experimental run consisted of 26 slices per volume and 285 volumes. At the beginning of each run, a whole-brain T1-weighted anatomical image was collected for each participant (TR = 7.4 ms; TE = 3.4 ms; voxel size = 1 × 1 × 1 mm³; FOV = 240 × 240 mm²; 150 slices with no gap; flip angle = 8°). The experimental protocol was programmed using E-Prime experimental presentation software (v1.1 SP3; Psychology Software Tools, Pittsburgh, PA). Stimuli were presented on an Avotec Silent Vision™ fibre-optic presentation system using binocular projection glasses (Model SV-7021). The onset of each trial was synchronized with the onset of data collection for the appropriate functional volume using trigger pulses from the scanner.
fMRI Data Preprocessing
After realignment (using FSL’s MCFLIRT), we used FEAT to unwarp the EPI images in the y-direction with a 10% signal loss threshold and an effective echo spacing of 0.333. Following noise cleaning with FIX (custom training set for scanner, threshold 20, including regression of estimated motion parameters), the unwarped EPI images were then smoothed at 6-mm FWHM, and nonlinearly co-registered with the anatomical T1 to 2-mm isotropic MNI space. Temporal artifacts were identified in each dataset by calculating framewise displacement (FD) from the derivatives of the six rigid-body realignment parameters estimated during standard volume realignment99, as well as the root mean square change in BOLD signal from volume to volume (DVARS). Frames associated with FD > 0.25 mm or DVARS > 2.5% were identified; however, as no participants were identified with greater than 10% of the resting time points exceeding these values, no trials were excluded from further analysis. There were no differences in head motion parameters between the five runs (p > 0.500). Following artifact detection, nuisance covariates associated with the six linear head movement parameters (and their temporal derivatives), DVARS, physiological regressors (created using the RETROICOR method), and anatomical masks from the cerebrospinal fluid and deep cerebral white matter were regressed from the data using the CompCor strategy100. Finally, in keeping with previous time-resolved connectivity experiments101, a temporal band-pass filter (0.0071 < f < 0.125 Hz) was applied to the data.
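The frame-flagging step can be sketched as below. This follows the commonly used definition of FD (sum of absolute frame-to-frame changes in the six realignment parameters, with rotations converted to millimetres on a sphere) and is not taken from the authors' pipeline: the 50 mm radius, the column order (rotations in radians first, then translations in mm) and the function names are assumptions.

```python
def framewise_displacement(params, radius_mm=50.0):
    """FD per frame from rigid-body realignment parameters.

    params: one row per volume, [rot_x, rot_y, rot_z, trans_x, trans_y, trans_z],
            rotations in radians and translations in mm (assumed column order).
    """
    fd = [0.0]  # the first volume has no predecessor
    for prev, curr in zip(params, params[1:]):
        rot = sum(abs(c - p) for p, c in zip(prev[:3], curr[:3])) * radius_mm
        trans = sum(abs(c - p) for p, c in zip(prev[3:], curr[3:]))
        fd.append(rot + trans)
    return fd

def flag_frames(fd, dvars_pct, fd_thresh=0.25, dvars_thresh=2.5):
    """Indices of frames exceeding either threshold quoted in the text."""
    return [
        i for i, (f, d) in enumerate(zip(fd, dvars_pct))
        if f > fd_thresh or d > dvars_thresh
    ]

# Toy motion estimates (hypothetical values).
motion = [
    [0.000, 0.0, 0.0, 0.00, 0.0, 0.0],
    [0.001, 0.0, 0.0, 0.05, 0.0, 0.0],  # FD = 0.001 * 50 + 0.05 = 0.10 mm
    [0.001, 0.0, 0.0, 0.45, 0.0, 0.0],  # FD = 0.40 mm
]
fd = framewise_displacement(motion)
print(flag_frames(fd, dvars_pct=[0.0, 1.0, 1.0]))
```

The fraction of flagged frames per participant can then be compared against the 10% criterion described above.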
Brain Parcellation
Following preprocessing, the mean time series was extracted from 375 predefined regions of interest (ROIs) for each participant in the study. To ensure whole-brain coverage, we extracted the following: (a) 333 cortical parcels (161 and 172 regions from the left and right hemispheres, respectively) using the Gordon atlas102; (b) 14 subcortical regions from the Harvard-Oxford subcortical atlas (bilateral thalamus, caudate, putamen, ventral striatum, globus pallidus, amygdala, and hippocampus; https://fsl.fmrib.ox.ac.uk/); and (c) 28 cerebellar regions from the SUIT atlas103.
Neuroimaging Analysis
In order to analyse task-evoked activity related to stimulus presentations, we first performed a principal component analysis (PCA)76 on the pre-processed BOLD time-series (per subject/session) to extract orthogonal low-dimensional time-series. The top 3 PCs explained ~30.6% of the variance. The time-series of these PCs were entered into a general linear model, in which we modelled the following 9 event-types across an entire session, centred around the perceptual switch point, which changed on a trial-by-trial basis: the first two images (modelled as a single regressor), the seven images surrounding each perceptual change (i.e., the switch trial and the three images on either side of the change point, modelled as seven separate regressors) and the last two images (modelled as a single regressor). Each of the event onset times was convolved with a canonical hemodynamic response function. This left us with nine unique β values per principal component, which we used to determine how each PC was differentially engaged as a function of the task. To test the hypothesis that the rate of change of PC engagement peaked at the perceptual change point, we calculated the difference between the β values of successive event-types for each of the top 3 PCs, and then plotted the resultant series in order to identify whether a peak occurred at the perceptual switch point (i.e., the middle β value in the series). Statistical significance was assessed with a block-resampling permutation test (n = 5,000 permutations; p < 0.05).
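One way to read the rate-of-change test is as a first difference across the nine β values of a PC. The sketch below takes that reading as an assumption; the β values are made up for illustration, and the function names are hypothetical. It reports the event-type at which the largest absolute change arrives.

```python
def successive_differences(betas):
    """First difference of the beta series: beta[k] - beta[k - 1]."""
    return [b - a for a, b in zip(betas, betas[1:])]

def peak_change_event(betas):
    """Index of the event-type at which the largest absolute change in beta arrives."""
    diffs = successive_differences(betas)
    k = max(range(len(diffs)), key=lambda i: abs(diffs[i]))
    return k + 1  # diffs[k] is the change from event k to event k + 1

# Nine hypothetical beta values for one PC; index 4 is the switch trial.
betas = [0.10, 0.10, 0.15, 0.20, 0.90, 1.00, 1.00, 1.05, 1.00]
print(peak_change_event(betas))
```

In the toy series the largest jump lands on the middle event-type, which is the pattern the hypothesis predicts for the switch trial.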
Spatial maps associated with the terms “switching”, “effort”, “attention”, “perception” and “load” were downloaded from the Neurosynth repository78 and mapped into our parcellation space by calculating the mean value within each independent parcel. These values were then correlated with the spatial loading of each of the top 3 PCs. A separate partial correlation analysis was conducted in which the same correlation was estimated after controlling for each of the other spatial maps.
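The mapping into parcellation space followed by a spatial correlation can be sketched as below. The data are toy values, the function names are hypothetical, and the sketch assumes every parcel contains at least one voxel; the partial correlation step is not shown.

```python
import math

def parcel_means(voxel_values, voxel_labels, n_parcels):
    """Mean map value within each parcel (labels are 0-based parcel indices)."""
    sums = [0.0] * n_parcels
    counts = [0] * n_parcels
    for value, label in zip(voxel_values, voxel_labels):
        sums[label] += value
        counts[label] += 1
    # Assumes no parcel is empty; an empty parcel would raise ZeroDivisionError.
    return [s / c for s, c in zip(sums, counts)]

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy term map over 6 voxels belonging to 3 parcels (hypothetical values).
term_map = [1.0, 3.0, 2.0, 2.0, 5.0, 7.0]
labels = [0, 0, 1, 1, 2, 2]
pc_loading = [0.1, 0.2, 0.9]  # hypothetical spatial loading of one PC

means = parcel_means(term_map, labels, n_parcels=3)
print(means, round(pearson_r(means, pc_loading), 3))
```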
Topological analyses
A hierarchical modularity approach was used to collapse the mean time-averaged correlation matrix across participants into a set of four spatially non-overlapping modules. Briefly, this involved running the Louvain modularity algorithm, which iteratively maximizes the modularity statistic, Q, for different community assignments until the maximum possible score of Q has been obtained.
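For reference, the standard form of the modularity statistic with a resolution parameter γ is given below. This is the textbook expression and is offered as an assumption about the quantity being maximized; correlation matrices with negative weights are often handled with a signed variant that treats positive and negative weights separately, and the text does not state which variant was used.

```latex
Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \gamma \, \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)
```

Here $A_{ij}$ is the connection weight between regions $i$ and $j$, $k_i = \sum_j A_{ij}$ is the strength of region $i$, $m = \tfrac{1}{2}\sum_{i,j} A_{ij}$ is the total weight, $c_i$ is the community assigned to region $i$, and $\delta(c_i, c_j)$ equals 1 when the two regions share a community and 0 otherwise. Larger values of γ favour smaller communities.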
The community assignment for each region was then estimated 500 times across a range of γ values (0.5–2.0, in steps of 0.1). In order to identify multi-level structure in our data, we repeated the modularity analysis for each of the modules identified in the first step104. Finally, a consensus partition was identified using a fine-tuning algorithm from the Brain Connectivity Toolbox (http://www.brain-connectivity-toolbox.net/). We subsequently used this final module assignment to estimate the cartographic profile of each participant’s time-averaged adjacency matrix80. Specifically, we estimated Integration using the participation coefficient, which quantifies the extent to which a region connects across all modules (i.e., between-module strength105), and Segregation using the module-degree Z-score. These measures were entered into a joint histogram (101 × 101 unique bins, equally spaced between 0 and 1 [for Integration] and between −1 and 1 [for Segregation]). The value within each bin of this joint histogram was then correlated with the combined regression weights of PC2 and PC3 for each subject. A permutation test that scrambled the order of participants was used to assess statistical significance (p < 0.05).
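The two cartographic measures can be sketched in plain Python using their standard definitions. This is an illustration, not the Brain Connectivity Toolbox code: the toy graph is hypothetical, weights are assumed non-negative, and the population standard deviation is an assumption (implementations differ on this point).

```python
import math

def participation_coefficient(adj, modules):
    """P_i = 1 - sum_s (k_is / k_i)^2, with k_is the strength of node i to module s."""
    n = len(adj)
    labels = sorted(set(modules))
    result = []
    for i in range(n):
        k_i = sum(adj[i])
        if k_i == 0:
            result.append(0.0)  # isolated node: no connections to distribute
            continue
        k_is = [sum(adj[i][j] for j in range(n) if modules[j] == s) for s in labels]
        result.append(1.0 - sum((k / k_i) ** 2 for k in k_is))
    return result

def module_degree_zscore(adj, modules):
    """Within-module strength of each node, z-scored against its own module."""
    n = len(adj)
    within = [sum(adj[i][j] for j in range(n) if modules[j] == modules[i]) for i in range(n)]
    z = []
    for i in range(n):
        peers = [within[j] for j in range(n) if modules[j] == modules[i]]
        mean = sum(peers) / len(peers)
        sd = math.sqrt(sum((p - mean) ** 2 for p in peers) / len(peers))
        z.append((within[i] - mean) / sd if sd > 0 else 0.0)
    return z

# Toy graph (hypothetical): edges 0-1, 0-2, 0-3, 3-4; modules {0, 1, 2} and {3, 4}.
adj = [
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
modules = [0, 0, 0, 1, 1]
P = participation_coefficient(adj, modules)
Z = module_degree_zscore(adj, modules)
print([round(p, 3) for p in P], [round(z, 3) for z in Z])
```

Node 0 is both a within-module hub (high z-score) and a connector (non-zero participation), whereas nodes 1 and 2 connect only within their own module.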
Brain-state displacement and the energy landscape
To quantify the change in the evoked BOLD activity following each stimulus, we calculated the main BOLD displacement (MBD). The MBD is a measure of the absolute evoked deviation in BOLD activity. The evoked activity is measured through a general linear model using a canonical hemodynamic response function convolved with the design matrix. We are interested in the probability, p(MBD, re), of observing a given displacement in BOLD at a given regressor re. The probability is calculated through the null model of the general linear model (the probability that the observed evoked value of the corresponding region is different from 0). As described above, we then calculated the energy for each displacement value as EMBD, re = ln
Supplementary Files

Overall Analysis Flow.
Top Row (orange) - pupil diameter was collected in a cohort of 35 individuals while they performed the Ambiguous Figures task. We observed a large peak in pupil dilation at the perceptual change point, which led us to make the prediction that there should be an increase in inter-regional gain at the switch point. Middle Row (blue) - we trained a 100-node RNN to perform a similar classification task in the presence of shifting perceptual ambiguity, and then tested the network at different levels of gain (i.e., the slope of the tanh activation function). We observed early switches with heightened gain, as well as altered attractor dynamics that caused a flattening of the energy landscape characterising state switches. Bottom Row (green) - we tested the predictions of the RNN using BOLD data from 17 subjects performing the same task. After filtering the BOLD data through a principal component analysis (in which we retained the top 5 principal components; PC1–5), we observed an increase in the gradient of PC loading around the switch point using an FIR model, as well as a flattening of the energy landscape, thus confirming our original predictions.

Difference in mean firing rates between stimulus selective excitatory clusters.
A) To examine the effect of manipulating gain on the operation of the network, we averaged over the firing rate of the excitatory neurons in each stimulus-selective cluster (rc1, rc2) and looked for the point at which rc2 > rc1 (and vice versa). In line with expectations, the speeding and slowing effect of gain on network output time was straightforwardly reflected in the mean firing rates. B) Difference in mean firing rates for high (red), intermediate (green), and low (blue) gain, for gain manipulations targeting both excitatory and inhibitory neurons. Notice that the switch from rc1 > rc2 to rc2 > rc1 occurs sooner for high gain and later for low gain. C) Manipulating excitatory gain in isolation led to slower switches in mean firing rates for low gain, but high gain did not speed switches. D) Manipulating inhibitory gain in isolation led to slower switches in mean firing rates for low gain and speeded switches under high gain.

Relationship between Attractor Depth and Energy Landscape.
We simulated a simple model

PC2 evoked displacement and surprise around the perceptual change.
A) Pearson’s correlation between each PC and the evoked brain activity at the perceptual switch (β values); dashed line at PC2. B) Pearson’s correlation between the inverted brain maps using βPC (PC(i-i) × βPC(i-i)). The dashed line shows that the correlation reaches 0.94 using the first 3 PCs (Pearson’s r = 0.94, p < 0.001). C) Mean absolute β loading (red) and group standard error (shaded red). D) Mean surprise calculated as -log(1-pvalues) for each regressor. The dotted black line marks the perceptual switch point.
References
- 1. A tutorial on the free-energy framework for modelling perception and learning. J. Math. Psychol 76:198–211
- 2. Neural dynamics of visual ambiguity resolution by perceptual prior. eLife 8:e41861
- 3. A theory of cortical responses. Philos. Trans. R. Soc. B Biol. Sci 360:815–836
- 4. The Predictive Mind. OUP Oxford
- 5. Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behav. Brain Sci 36:181–204
- 6. Predictive coding explains binocular rivalry: An epistemological review. Cognition 108:687–701
- 7. Binocular Rivalry. MIT Press
- 8. Dynamics of perceptual bi-stability for stereoscopic slant rivalry and a comparison with grating, house-face, and Necker cube rivalry. Vision Res 45:29–40
- 9. Attention and Conscious Perception in the Hypothesis Testing Brain. Front. Psychol 3
- 10. Free Energy, Precision and Learning: The Role of Cholinergic Neuromodulation. J. Neurosci 33:8227–8236
- 11. Computational models link cellular mechanisms of neuromodulation to large-scale neural dynamics. Nat. Neurosci 24:765–776
- 12. The Anatomy of Inference: Generative Models and Brain Structure. Front. Comput. Neurosci 12:90
- 13. Uncertainty, epistemics and active inference. J. R. Soc. Interface 14:20170376
- 14. Unified Neural Dynamics of Decisions and Actions in the Cerebral Cortex and Basal Ganglia. https://doi.org/10.1101/2020.10.22.350280
- 15. The physics of optimal decision making: A formal analysis of models of performance in two-alternative forced-choice tasks. Psychol. Rev 113:700–765
- 16. Global gain modulation generates time-dependent urgency during perceptual choice in humans. Nat. Commun 7:1–15
- 17. A cortical network that marks the moment when conscious representations are updated. Neuropsychologia 79:113–122
- 18. A predictive coding account of bistable perception - a model-based fMRI study. PLoS Comput. Biol 13
- 19. The Normalization Model of Attention. Neuron 61:168–185
- 20. Neural mechanisms of selective visual attention. Annu. Rev. Neurosci 18:193–222
- 21. Can attention selectively bias bistable perception? Differences between binocular rivalry and ambiguous figures. J. Vis 4:539–551
- 22. Understanding attentional modulation of binocular rivalry: A framework based on biased competition. Front. Hum. Neurosci 5:1–12
- 23. A Recurrent Network Mechanism of Time Integration in Perceptual Decisions. J. Neurosci 26:1314–1328
- 24. Dimension Reduction and Dynamics of a Spiking Neural Network Model for Decision Making under Neuromodulation. SIAM J. Appl. Dyn. Syst 10:148–188
- 25. Probabilistic Decision Making by Slow Reverberation in Cortical Circuits. Neuron 36:955–968
- 26. Resynthesizing behavior through phylogenetic refinement. Atten. Percept. Psychophys 81:2265–2287. https://doi.org/10.3758/s13414-019-01760-1
- 27. Perceptual rivalry across animal species. J. Comp. Neurol 528:3123–3133
- 28. Locus Coeruleus tracking of prediction errors optimises cognitive flexibility: An Active Inference model. PLOS Comput. Biol 15:e1006267
- 29. With an eye on uncertainty: Modelling pupillary responses to environmental volatility. PLOS Comput. Biol 15:e1007126
- 30. The locus coeruleus broadcasts prediction errors across the cortex to promote sensorimotor plasticity. eLife 12
- 31. Pupil-Linked Arousal Determines Variability in Perceptual Decision Making. PLoS Comput. Biol 10
- 32. Functional Organization of the Sympathetic Pathways Controlling the Pupil: Light-Inhibited and Light-Stimulated Pathways. Front. Neurol 9
- 33. Modulators in concert for cognition: Modulator interactions in the prefrontal cortex. 69–91
- 34. The locus coeruleus and noradrenergic modulation of cognition. Nat. Rev. Neurosci 10:211–223
- 35. Monoaminergic Neuromodulation of Sensory Processing. Front. Neural Circuits 1–17
- 36. Network reset: A simplified overarching theory of locus coeruleus noradrenaline function. Trends Neurosci 28:574–582
- 37. Pupil Size as a Window on Neural Substrates of Cognition. Trends Cogn. Sci 24:466–480
- 38. Functional Neuroanatomy of the Noradrenergic Locus Coeruleus: Its Roles in the Regulation of Arousal and Autonomic Function Part II: Physiological and Pharmacological Manipulations and Pathological Alterations of Locus Coeruleus Activity in Humans. Curr. Neuropharmacol 6:254–285
- 39. Coupling of pupil- and neuronal population dynamics reveals diverse influences of arousal on cortical processing. eLife 11:e71890
- 40. Pupil-linked phasic arousal predicts a reduction of choice bias across species and decision domains. eLife 9:e54014
- 41. Pupil dilation reflects perceptual selection and predicts subsequent stability in perceptual rivalry. Proc. Natl. Acad. Sci. U. S. A 105:1704–1709
- 42. Pupil fluctuations track rapid changes in adrenergic and cholinergic activity in cortex. Nat. Commun 7:1–7
- 43. Neuromodulatory Influences on Integration and Segregation in the Brain. Trends Cogn. Sci 23:572–583
- 44. The ascending arousal system promotes optimal performance through meso-scale network integration in a visuospatial attentional task. Netw. Neurosci 5:890–910. https://doi.org/10.1162/netn_a_00205
- 45. Decision-related pupil dilation reflects upcoming choice and individual bias. Proc. Natl. Acad. Sci 111:E618–E625
- 46. Dynamic modulation of decision biases by brainstem arousal systems. eLife 6:e23232
- 47. The ascending arousal system shapes neural dynamics to mediate awareness of cognitive states. Nat. Commun 12:1–9
- 48. Assessing perceptual change with an ambiguous figures task: Normative data for 40 standard picture sets. Behav. Res. Methods 48:201–222
- 49. The modulation of neural gain facilitates a transition between functional segregation and integration in the brain. eLife 7:e31130
- 50. A Network Model of Catecholamine Effects: Gain, Signal-to-Noise Ratio, and Behavior. 892–895
- 51. Relationships between Pupil Diameter and Neuronal Activity in the Locus Coeruleus, Colliculi, and Cingulate Cortex. Neuron 89:221–234
- 52. Pupil dynamics during bistable motion perception. J. Vis 9:10–10
- 53. Pupil size tracks perceptual content and surprise. Eur. J. Neurosci 41:1068–1078
- 54. Thalamocortical excitability modulation guides human perception under uncertainty. Nat. Commun 12
- 55. The effects of neural gain on attention and learning. Nat. Neurosci 16:1146–1153
- 56. Pupil-linked arousal is driven by decision uncertainty and alters serial choice bias. Nat. Commun 8:14637
- 57. Pupil size tracks perceptual content and surprise. Eur. J. Neurosci 41:1068–1078
- 58. Diffuse neural coupling mediates complex network dynamics through the formation of quasi-critical brain states. Nat. Commun 11
- 59. The Dynamics of Functional Brain Networks: Integrated Network States during Cognitive Task Performance. Neuron 92:544–554
- 60. Structural connections between the noradrenergic and cholinergic system shape the dynamics of functional brain networks. NeuroImage 260:119455
- 61. An integrative theory of locus coeruleus-norepinephrine function: adaptive gain and optimal performance. Annu. Rev. Neurosci 28:403–450
- 62. Training Excitatory-Inhibitory Recurrent Neural Networks for Cognitive Tasks: A Simple and Flexible Framework. PLOS Comput. Biol 12:e1004792
- 63. Artificial Neural Networks for Neuroscientists: A Primer. Neuron 107:1048–1070
- 64. Training Excitatory-Inhibitory Recurrent Neural Networks for Cognitive Tasks: A Simple and Flexible Framework. PLOS Comput. Biol 12:e1004792
- 65. The role of the locus coeruleus in shaping adaptive cortical melodies. Trends Cogn. Sci 26:527–538
- 66. Two views on the cognitive brain. Nat. Rev. Neurosci 22:359–371
- 67. Codimension-2 parameter space structure of continuous-time recurrent neural networks. Biol. Cybern 116:501–515
- 68. Dynamical approaches to cognitive science. Trends Cogn. Sci 4:91–99
- 69. Neural circuits as computational dynamical systems. Curr. Opin. Neurobiol 25:156–163
- 70. Opening the Black Box: Low-Dimensional Dynamics in High-Dimensional Recurrent Neural Networks. Neural Comput 25:626–649
- 71. Structural connections between the noradrenergic and cholinergic system shape the dynamics of functional brain networks. NeuroImage
- 72. It’s about time: Linking dynamical systems with human neuroimaging to understand the brain. Netw. Neurosci 6:960–979
- 73. A deep learning framework for neuroscience. Nat. Neurosci 22:1761–1770
- 74. The neuroconnectionist research programme. Nat. Rev. Neurosci 24:431–450. https://doi.org/10.1038/s41583-023-00705-w
- 75. Computation Through Neural Population Dynamics. Annu. Rev. Neurosci 43:249–275
- 76. The low-dimensional neural architecture of cognitive complexity is related to activity in medial thalamic nuclei. Neuron 104:849–855
- 77. A Note on the Use of Principal Components in Regression. Appl. Stat 31:300
- 78. Large-scale automated synthesis of human functional neuroimaging data. Nat. Methods 8:665–670
- 79. The modulation of neural gain facilitates a transition between functional segregation and integration in the brain. eLife 7:e31130
- 80. The Dynamics of Functional Brain Networks: Integrated Network States during Cognitive Task Performance. Neuron 92:544–554
- 81. Locus Coeruleus in time with the making of memories. Curr. Opin. Neurobiol 35:87–94
- 82. Dynamic Lateralization of Pupil Dilation Evoked by Locus Coeruleus Activation Results from Sympathetic, Not Parasympathetic, Contributions. Cell Rep 20:3099–3112
- 83. Decision making, the P3, and the locus coeruleus-norepinephrine system. Psychol. Bull 131:510–532
- 84. Noradrenergic ensemble-based modulation of cognition over multiple timescales. Brain Res 1709:50–66. https://doi.org/10.1016/j.brainres.2018.12.031
- 85. Rapid Reconfiguration of the Functional Connectome after Chemogenetic Locus Coeruleus Activation. SSRN Electron. J. https://doi.org/10.2139/ssrn.3334983
- 86. Mapping neurotransmitter systems to the structural and functional organization of the human neocortex. Nat. Neurosci 25:1569–1581
- 87. Mapping gene transcription and neurocognition across human neocortex. Nat. Hum. Behav 5:1240–1250
- 88. Phasic Activation of Dorsal Raphe Serotonergic Neurons Increases Pupil Size. Curr. Biol 31:192–197. https://doi.org/10.1016/j.cub.2020.09.090
- 89. Pupil size signals mental effort deployed during multiple object tracking and predicts brain activity in the dorsal attention network and the locus coeruleus. J. Vis 14:1–20
- 90. Optogenetic silencing of locus coeruleus activity in mice impairs cognitive flexibility in an attentional set-shifting task. 1–8
- 91. Pupil Fluctuations Track Fast Switching of Cortical States during Quiet Wakefulness. Neuron 84:355–362
- 92. Pupil size tracks attentional performance in attention-deficit/hyperactivity disorder. Sci. Rep 7:1–9
- 93. Differential neurophysiological correlates of retrieval of consolidated and reconsolidated memories in humans: an ERP and pupillometry study. Neurobiol. Learn. Mem 107279. https://doi.org/10.1016/j.nlm.2020.107279
- 94. A pupil size, eye-tracking and neuropsychological dataset from ADHD children during a cognitive task. Sci. Data 6:25
- 95. PyTorch: An imperative style, high-performance deep learning library. Adv. Neural Inf. Process. Syst 32
- 96. Backpropagation Through Time: What It Does and How to Do It. Proc. IEEE 78:1550–1560
- 97. Adam: A method for stochastic optimization. 3rd Int. Conf. Learn. Represent. ICLR 2015 - Conf. Track Proc 1–15
- 98. Thermodynamics and signatures of criticality in a network of neurons. Proc. Natl. Acad. Sci. U. S. A 112:11508–11513
- 99. Methods to detect, characterize, and remove motion artifact in resting state fMRI. NeuroImage 84:320–341
- 100. A component based noise correction method (CompCor) for BOLD and perfusion based fMRI. NeuroImage 37:90–101
- 101. Controllability of structural brain networks. Nat. Commun 6:1–10
- 102. Generation and Evaluation of a Cortical Area Parcellation from Resting-State Correlations. Cereb. Cortex 26:288–303
- 103. A probabilistic MR atlas of the human cerebellum. NeuroImage 46:39–46
- 104. Modular and Hierarchically Modular Organization of Brain Networks. Front. Neurosci 4
- 105. Functional cartography of complex metabolic networks. Nature 433:895–900
Copyright
© 2024, Wainstein et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.