Modeling the hallucinatory effects of classical psychedelics in terms of replay-dependent plasticity mechanisms

eLife Assessment

This paper provides a useful new theory of the hallucinatory effects of 5-HT2A psychedelics. The authors present convincing evidence that a computational model trained with the Wake-Sleep algorithm can reproduce some features of hallucinations by varying the strength of top-down connections in the model, though it is not clear that this model applies to 5-HT2A hallucinogens in particular. The work will be of interest to researchers studying hallucinations or offline activity and plasticity more broadly.

https://doi.org/10.7554/eLife.105968.3.sa0

Significance of the findings:

Useful: Findings that have focused importance and scope

Landmark
Fundamental
Important
Valuable
Useful

Strength of evidence:

Convincing: Appropriate and validated methodology in line with current state-of-the-art

Exceptional
Compelling
Convincing
Solid
Incomplete
Inadequate

During the peer-review process the editor and reviewers write an eLife Assessment that summarises the significance of the findings reported in the article (on a scale ranging from landmark to useful) and the strength of the evidence (on a scale ranging from exceptional to inadequate). Learn more about eLife Assessments

Abstract
Introduction
Results
Discussion
Methods
Appendix 1
Data availability
References
Article and author information
Metrics

Abstract

Classical psychedelics induce complex visual hallucinations in humans, generating percepts that are coherent at a low level, but which have surreal, dream-like qualities at a high level. While there are many hypotheses as to how classical psychedelics could induce these effects, there are no concrete mechanistic models that capture the variety of observed effects in humans, while remaining consistent with the known pharmacological effects of classical psychedelics on neural circuits. In this work, we propose the ‘oneirogen hypothesis,’ which posits that the perceptual effects of classical psychedelics are a result of their pharmacological actions inducing neural activity states that truly are more similar to dream-like states. We simulate classical psychedelics’ effects via manipulating neural network models trained on perceptual tasks with the Wake-Sleep algorithm. This established machine learning algorithm leverages two activity phases: a perceptual phase (wake) where sensory inputs are encoded, and a generative phase (dream) where the network internally generates activity consistent with stimulus-evoked responses. We simulate the action of psychedelics by partially shifting the model to the ‘Sleep’ state, which entails a greater influence of top-down connections, in line with the impact of psychedelics on apical dendrites. The effects resulting from this manipulation capture a number of experimentally observed phenomena, including the emergence of hallucinations, increases in stimulus-conditioned variability, and large increases in synaptic plasticity. We further provide a number of testable predictions which could be used to validate or invalidate our oneirogen hypothesis.

Introduction

Classical psychedelics—including psilocybin, mescaline, DMT, and LSD—are a family of hallucinogenic compounds with a common mechanism of action: they are agonists for the 5-HT2a serotonin receptor commonly expressed on the apical dendrites of cortical pyramidal neurons (Jakab and Goldman-Rakic, 1998) and on parvalbumin (PV) interneurons (de Almeida and Mengod, 2007). These drugs induce numerous effects in human subjects, including complex visual, auditory, and tactile hallucinations; intense spiritual experiences; long-lasting alterations in mood; changes in personality; and increases in synaptic plasticity (Preller and Vollenweider, 2018; Shao et al., 2021; Grieco et al., 2022). Recently, they have been explored clinically as potential treatments for depression and anxiety (Muttoni et al., 2019), as well as PTSD (Krediet et al., 2020).

The 5-HT2a receptor plays a critical role in psychedelic-induced hallucinations. Indeed, behavioral measures of hallucinatory drug effects are induced selectively by cellular membrane-permeable 5-HT2a agonists (Vargas et al., 2023), and perceptual effects of classical psychedelics are largely eliminated by blocking 5-HT2a receptors in the cortex (Kraehenmann et al., 2017; Vargas et al., 2023) (though 5-HT2a agonists with mixed receptor selectivity are in some cases characterized by primarily non-hallucinatory effects [Green et al., 2003; Marona-Lewicka et al., 2002]). However, very little is understood about why highly structured hallucinations and changes in synaptic plasticity emerge from activating cortical 5-HT2a receptors: to explain this, it is necessary to develop mechanistic theories that are capable of linking changes in neuron-level properties (receptor agonism) to changes in perception and behavior. Psychedelic drug users and therapists have long noted the ‘dream-like’ qualities of psychedelic drug hallucinations, which are realistic but untethered from the external world; this observation leads naturally to speculation that these drugs are ‘oneirogens,’ or dream-manifesting compounds (Carhart-Harris, 2007). However, beyond perceptual phenomenology (and some evidence pointing to the effects of psychedelics on sleep cycles [Thomas et al., 2022; Dudysová et al., 2020; Barbanoj et al., 2008]), we lack a mechanistic proposal that could explain the similarity between dreams and psychedelic drug experiences. Here, we articulate the ‘oneirogen hypothesis,’ which describes one such potential mechanistic explanation. We propose that classical psychedelics induce a dream-like state by shifting the balance between bottom-up pathways transmitting sensory information and top-down pathways ordinarily used to create replay sequences in the brain. Replay sequences have been shown to be important for learning during sleep (Girardeau et al., 2009; Deuker et al., 2013; de Lavilléon et al., 2015; Maingret et al., 2016; Fernández-Ruiz et al., 2019): we propose that mechanisms supporting replay-dependent learning during sleep are key to explaining the increases in plasticity caused by psychedelic drug administration. In total, our model of the functional effect of psychedelics on pyramidal neurons could provide an explanation for the perceptual psychedelic experience in terms of learning mechanisms for consolidation during sleep (Walker and Stickgold, 2004), and cortical ‘replay’ phenomena (Nádasdy et al., 1999; Lee and Wilson, 2002; Foster, 2017; Ji and Wilson, 2007; Euston et al., 2007; Peyrache et al., 2009; Kenet et al., 2003; Xu et al., 2012; Hoffman and McNaughton, 2002; Louie and Wilson, 2001; Andrillon et al., 2015).

To explore the oneirogen hypothesis concretely, we use the aptly named Wake-Sleep algorithm (Hinton et al., 1995), which has historically been used to train artificial neural networks (ANNs) that possess both a bottom-up ‘recognition’ pathway and a top-down ‘generative’ pathway to learn a representation of incoming sensory data. It enables unsupervised learning in ANNs by alternating between periods of ‘waking perception’ (wherein bottom-up recognition pathways drive activity) and ‘dreaming sequences’ (wherein top-down generative pathways drive activity). With these alternate periods of distinct activity, connectivity parameters in each pathway are adjusted to match the activity of the opposite pathway. This way, the top-down pathway learns to generate activity consistent with that induced by sensory inputs, and the bottom-up pathway learns better representations thanks to generated activity.

In this work, we show that within a neural network trained via Wake-Sleep, it is possible to model the action of classical psychedelics (i.e. 5-HT2a receptor agonism) by shifting the balance during the wake state from the bottom-up pathways to the top-down pathways, thereby making the ‘wake’ network states more ‘dream-like’. Specifically, we model the effects of classical psychedelics by manipulating the relative influence of top-down and bottom-up connections in neural networks trained with the Wake-Sleep algorithm on images. Doing so, we capture a number of effects observed in experiments on individuals under the influence of psychedelics, including: the emergence of closed-eye hallucinations, increases in stimulus-conditioned variability, and large increases in synaptic plasticity. This data suggests that the oneirogen hypothesis may indeed help to explain why 5-HT2a agonists have the functional effects that they do. We subsequently identify several testable predictions that could be used to further validate the oneirogen hypothesis.

Results

Mapping the Wake-Sleep algorithm onto cortical architecture

The Wake-Sleep algorithm allows ANNs to optimize a global, unsupervised objective function for sensory representation learning—the Evidence Lower Bound (ELBO)—through local synaptic modifications to a bottom-up recognition pathway and a top-down generative pathway. As a precursor to the variational autoencoder (Rezende et al., 2014; Kingma and Welling, 2013), the Wake-Sleep algorithm provides a mechanism for learning a probabilistic latent representation $r$ responding to incoming sensory stimuli $s$ , which obeys representational characteristics that are ideal for a neural system (e.g. sparsity and metabolic efficiency Simoncelli, 2003, compression and coding efficiency Simoncelli and Olshausen, 2001; Ballé et al., 2016, or disentanglement DiCarlo et al., 2012; Higgins et al., 2017). To do this, Wake-Sleep optimizes the ELBO through an approximation of the Expectation Maximization (EM) algorithm (Ikeda et al., 1998) to train the two pathways (Figure 1a). (For readers who are unfamiliar with the Wake-Sleep algorithm, a tutorial can be found here Kirby, 2006).

Figure 1

Download asset Open asset

Mapping the Wake-Sleep algorithm onto cortical architecture.

Left: Network architecture. We model early sensory processing in the cortex with a multilayer network, $r$ , receiving stimuli $s$ . Center: individual pyramidal neurons receive top-down inputs (red) at the apical dendritic compartment, and bottom-up inputs at the basal dendritic compartment (blue). 5-HT2a receptors are expressed on the apical dendritic shaft (red bar), and on parvalbumin (PV) interneurons (red triangle); both sites may play a role in gating basal input. Right: Over the course of Wake-Sleep training, basal inputs dominate activity during the Wake phase ( $α = 0$ ) and are used to train apical synapses, whereas apical inputs dominate activity during the Sleep phase ( $α = 1$ ) and are used to train basal synapses.

Notably, the Wake-Sleep algorithm requires two phases of activity (i.e. ‘Wake’ and ‘Sleep’), where the network phase is controlled by a global state variable $α \in [0, 1]$ that regulates the balance between the bottom-up and top-down pathways. In the Wake phase ( $α = 0$ ), the network processes real sensory stimuli drawn from the environment, and network activity is sampled based on the bottom-up inputs (corresponding to the approximate inference distribution). In the Sleep phase ( $α = 1$ ), the network internally samples neural activity from its generative model, which then produces generated activity in the stimulus layer $s$ . We use this structure of the Wake-Sleep algorithm as a concrete model to express the oneirogen hypothesis. Specifically, we use changes to the value of α as a means of modeling a 5-HT2a agonist-induced shift to a more dream-like state, as we detail below.

Within the Wake-Sleep algorithm, neurons alternate between ‘Wake’ and ‘Sleep’ modes, where activity during each mode is dominated by the bottom-up and top-down pathways, respectively. We can determine the neural activity for a given intermediate layer $l$ with the following equation:

r^{(l)} = f (h (r), μ (r), α) + f (σ_{b}, σ_{p}, α) η,

where $h (r)$ defines bottom-up input, $μ (r)$ defines top-down input, $f (h, μ, α)$ is any interpolation function such that $f (h, μ, 0) = h$ and $f (h, μ, 1) = μ$ , $σ_{b}$ and $σ_{p}$ define the bottom-up and top-down activity standard deviations, and $η \sim N (0, 1)$ adds random noise to the neural activity (see Methods for more detail). Here, for notational conciseness, we treat $r$ as a concatenated vector of all $r^{(l)}$ vectors from each layer. This equation means that α controls whether bottom-up inputs or top-down inputs control the dynamics of individual neural units.

Thus, as α moves from a value of 0 to a value of 1, the activity of the neurons shifts from being driven by the bottom-up recognition pathway to being driven by the top-down generative pathway. How could this occur in the brain? Realistically, each neuron in the cortex would have its own α variable defining the relative influence of top-down and bottom-up inputs on its spiking activity; here, for simplicity, we will assign the entire network a single α value reflecting the ‘mean’ relative top-down/bottom-up influence averaged across neurons, as determined by the network state (Wake, Sleep, dose-dependent psychedelic administration). In the cortex, excitatory pyramidal neurons receive inputs from distinct sources: inputs that are from ‘higher order’ cortical areas target the apical dendrites, whereas inputs that are from ‘lower order’ cortical or sensory subcortical areas target the basal dendrites (Larkum, 2013). Thus, we can capture the core idea behind the oneirogen hypothesis using the Wake-Sleep algorithm, by postulating that the bottom-up basal synapses are predominantly driving neural activity during the Wake phase (when α is low), while top-down apical synapses are predominantly driving neural activity during the Sleep phase (when α is high; Figure 1) Aru et al., 2020; this is in agreement with several recent theoretical studies that have proposed that apical dendrites could serve as a site for integrating top-down learning signals (Körding and König, 2001; Urbanczik and Senn, 2014; Guerguiev et al., 2017; Sacramento et al., 2018; Richards and Lillicrap, 2019; Payeur et al., 2021), particularly those which propose that the top-down signal corresponds to a predictive or generative model of neural activity (Bredenberg et al., 2021; George et al., 2024). This proposed change in α does indeed appear to occur during both slow-wave (SW) (Seibt et al., 2017; Miyamoto et al., 2016) and rapid eye movement (REM) (Li et al., 2017; Zhou et al., 2020; Aime et al., 2022) sleep, where apical dendritic inputs have been observed to exert increased influence on neural activity that is critical for plasticity induction and consolidation of learned behaviors; during REM sleep, this increased influence has been shown to be mediated by potentiation of basal dendrite-targeting PV inhibitory interneurons (Aime et al., 2022).

Next, we ask: can we model the effects of classical psychedelics in terms of changes in α? Notably, 5-HT2a receptors are expressed in the apical dendrites of pyramidal neurons (Jakab and Goldman-Rakic, 1998) and PV interneurons (de Almeida and Mengod, 2007) and have an excitatory effect that positively modulates glutamatergic transmission due to apical dendritic inputs (Aghajanian and Marek, 1997; Aghajanian and Marek, 1999); furthermore, classical psychedelic administration has been shown to have an inhibitory effect on glutamatergic transmission due to basal dendritic inputs (Arvanov et al., 1999). These data suggest that 5-HT2a agonists could have a push-pull effect on cortical pyramidal neurons, increasing the relative influence of apical dendrites and decreasing the relative influence of basal dendrites (Hidalgo Jiménez et al., 2025) in much the same way as has been observed during SW and REM sleep. Hence, we can model these effects by increasing the α value in a Wake-Sleep trained network, and then ask whether the networks exhibit other phenomena that match the known impact of classical psychedelics on neural activity. We note that with this mapping of the Wake-Sleep algorithm to models of basal and apical processing, synaptic modifications at both apical and basal synapses correspond to minimizing a local prediction error between top-down and bottom-up inputs (see Methods).

Modeling hallucinations

To see whether a transition from waking to a more dream-like state would induce hallucinatory effects in our model, we trained multilayer neural networks with branched dendritic arbors (see Methods) on the MNIST digits dataset (Deng, 2012) using the Wake-Sleep algorithm and subsequently simulated hallucinatory activity by varying α (see Methods; Equation 8). We could visualize the effects of our simulated psychedelic with snapshots of the stimulus layer $s$ at a fixed point in time for various values of α (Figure 2; see also Video 1 and Video 2). As α increased, we observed that network activity gradually deformed away from the ground-truth stimulus in a highly structured way, adding strokes to the original digit that were not originally present. At the highest values of α tested, we found that network states were wholly divorced from the ground-truth stimulus but retained many characteristics of the MNIST digits on which the network was trained (e.g. smooth strokes and the rough form of digits). These results emphasize that hallucinations induced by a shift to a more dream-like state in these models are heavily influenced by the training dataset, which for an animal would correspond to the statistics of the sensory environment in which it learns its sensory representation. To emphasize this point, we further trained our networks on the CIFAR10 natural images dataset (Krizhevsky and Hinton, 2009; Figure 2c), to provide an example of a more naturalistic training dataset. In this case, our model was not powerful enough to reproduce realistic natural images—instead, we found that our modeled hallucinatory activity corresponded to ‘ripple’ effects, which are similar to the ‘breathing’ and ‘rippling’ phenomena reported by psychedelic drug users at low doses (Preller and Vollenweider, 2018).

Figure 2 with 2 supplements see all

Download asset Open asset

Visualizing the effects of psychedelics in the model.

We model the effects of classical psychedelics by progressively increasing α from 0 to 1 in our model, where $α = 1$ is equivalent to the Sleep phase. We visualize the effects of psychedelics on the network representation by inspecting the stimulus layer $s$ . (a) Example stimulus-layer activity (rows) in response to an MNIST digit presentation as psychedelic dose increases (columns, left to right). (b) Same as (a) but for ‘eyes-closed’ conditions where an entirely black image is presented. (**c–d**) Same as (**a–b**), but for the CIFAR10 dataset.

Video 1

Download asset

posterframe for video — Visualizing the effects of psychedelics in the MNIST-trained model.

Video 2

Download asset

These simulations were produced with a complex, multicompartmental neuron model; however, we found similar results with two alternative network architectures, one with within-layer recurrence (Figure 2—figure supplement 1a) and one which used a simpler single compartment neuron model (Figure 2—figure supplement 1b). We found that our single compartment model produced qualitatively less realistic generated images than the multicompartment and recurrent models, justifying our use of the more complex models (Figure 2—figure supplement 2). To demonstrate the importance of a learned top-down pathway to produce complex, structured hallucinations in the earliest layers of our network, we generated model hallucinations from two control networks: an untrained model and a trained network where psychedelic activity was alternatively modeled by a simple increase in the variance of individual neurons (we will refer to this latter control as the noise-based hallucination protocol). We found that hallucinations under these control conditions resembled additive white noise, rather than structured digit-like shapes (Figure 2—figure supplement 1c–d).

Psychedelic drug users also report observing the emergence of hallucinations while their eyes are closed (Preller and Vollenweider, 2018). Interestingly, we found that our model recapitulated these phenomena: as α increased, networks trained on MNIST gradually began revealing increasingly complex and digit-like patterns (Figure 2b), whereas CIFAR10-trained networks again predominantly produced ‘ripple’ hallucinations (Figure 2d).

Effects of psychedelics on single neurons

Having recapitulated hallucinatory phenomena in stimulus space, we next explored how our proposed mechanism affected neural activity in our network model, in order to establish markers that could be used to experimentally validate or invalidate the oneirogen hypothesis. To start, we investigated the effects of learning and psychedelic drug administration on the activity of single neurons in the model. As noted previously, the learning algorithm used here trains synapses so that top-down inputs to apical dendritic compartments match bottom-up inputs to basal dendritic compartments. As a consequence, we observed that after training, inputs to apical and basal dendritic compartments were much more correlated on the same neuron than they were for random neurons (Figure 3a), which was not observed in untrained models (Figure 3—figure supplement 1a). This form of strongly correlated tuning has been observed in both cortex and the hippocampus (Beaulieu-Laroche et al., 2019; O’Hare et al., 2024).

Figure 3 with 2 supplements see all

Download asset Open asset

Effects of psychedelics on single model neurons.

(a) Correlations between the apical and basal dendritic compartments of either the same network neuron or between randomly selected neurons. (b) Total plasticity for apical (left) and basal (right) synapses as α increases in the model when plasticity is either gated or not gated by α. Error bars indicate +/-1 s.e.m. (c) Cosine similarity between plasticity induced under psychedelic conditions compared to baseline for apical (left) and basal (right) synapses.

There are many indicators that psychedelic drug administration in humans and animals can induce marked, long-lasting changes in behavior, as well as large increases in synaptic plasticity (Shao et al., 2021; Nardou et al., 2023; de la Fuente Revenga et al., 2021; Vargas et al., 2023; Grieco et al., 2022). In Wake-Sleep learning, apical synapses learn during the Wake phase, whereas basal synapses learn during the Sleep phase—thus, plasticity at apical synapses is gated by $(1 - α)$ , whereas plasticity at basal synapses is gated by α (see Methods). However, learning is still theoretically possible without this explicit gating, though it may be noisier and less efficient; furthermore, it is conceivable that classical psychedelics could increase the relative influence of apical inputs on the activity of a neuron without affecting this gating mechanism. As a consequence, we modeled the dose-dependent effects of psychedelics on plasticity both with and without gating (Figure 3b). Consistent with recent experimental results (Shao et al., 2021), for intermediate doses, we found large increases in plasticity at both apical and basal synapses under both conditions, where plasticity was measured as a mean change in normalized synaptic strength across weight parameters in our network (see Methods). In our model, we found that the total evoked plasticity peaked at roughly $α = 0.5$ ; we further found that if gating was affected by psychedelics, apical plasticity would eventually be quenched at very high drug doses. We also found that plasticity induced by psychedelic drug administration gradually became unaligned from the weight updates that would have occurred in the absence of the drug (Figure 3c), indicating that these results were not simply due to modulation of the effective learning rate of the underlying plasticity. Rather, as has been suggested by other theoretical studies (Juliani et al., 2024), plasticity in the model likely increased because aberrant hallucinatory activity pulled the learning mechanism out of a local optimum in which plasticity was minimal, producing much more plasticity across the network. Importantly, we observed these increases in plasticity in all network architectures and training datasets we explored, including for our noise-based hallucination protocol (Figure 3—figure supplement 2), demonstrating that changes in apical dendritic influence within a Wake-Sleep learning framework are sufficient, but not necessary to induce increases in synaptic plasticity: for trained networks, it would seem that even simple increases in neural variability can have similar effects.

Effects of psychedelics on neural variability

Having observed that increasing our modeled drug dosage caused heightened fluctuations and deviations from the ground-truth stimulus in the sensory layer of our network (Figure 2), we next investigated whether variability was affected at the level of individual neurons in higher layers of the model. Indeed, we found that for a fixed stimulus, neural variability increased markedly as the simulated psychedelic drug dose increased (Figure 4a). This result is consistent with the data supporting the Entropic Brain Theory (Carhart-Harris and Friston, 2019; Lebedev et al., 2016; Carhart-Harris et al., 2014; Siegel et al., 2024), in which neural activity in resting state fMRI recordings becomes increasingly ‘entropic’ (i.e. variable) under the influence of psychedelics; however, it is important to note that our noise-based hallucination protocol also produced these effects (Figure 4—figure supplement 1a). Though most experimental data supporting the Entropic Brain Theory is taken from recordings with relatively poor spatial resolution, averaging activity over large cortical areas, our model predicts that this increase in variability should be reflected at the level of individual neurons; this increase in variability after psychedelic administration has been recently observed in auditory cortical neurons for active mice (Horrocks et al., 2024), but whether this phenomenon is general across tasks and cortical areas remains to be seen. We further found that this increase in variability corresponded to a decrease in ability to identify the stimulus being presented to the network: we trained a classifier to identify which MNIST digit was presented to our networks on Wake neural activity (see Methods), and found that the accuracy of our classifier decreased (Figure 4b) while the output variability of the classifier increased (Figure 4c) in response to drug administration.

Figure 4 with 1 supplement see all

Download asset Open asset

Effects of psychedelics on neural variability.

(a) Stimulus-conditioned variability for neurons in the network as α increases, as compared to variability in neural activity across stimuli (rightmost bar). Error bars indicate +/-1 s.e.m. (b) Proportion correct for a classifier trained to detect the label of presented MNIST digits as α increases. (c) Variability in the logit outputs of the trained classifier as α increases.

Within our model, this increase in variability is quite sensible: in the ordinary Wake state, neural activity is constrained to correspond to the singular sensory stimulus being presented, whereas during Sleep states, neural activity is completely unconstrained by any particular sensory stimulus, reflecting instead the full distribution of possible sensory stimuli. As increasing α in our model interpolates between Wake and Sleep states, we can expect intermediate values of α to produce network states which are less constrained by the particular sensory stimulus being presented, reflected in increased neural variability.

Network-level effects of psychedelics

We next investigated the effects of psychedelics on network-level and inter-areal dynamics within our model. We first identified an important negative result: the pairwise correlation structure between neurons was largely preserved across psychedelic doses (Figure 5a–b), as was the effective dimensionality of population activity (Figure 5c). This was sensible, because a network that has been well-trained with the Wake-Sleep algorithm will have the same marginal distribution of network states in the Wake mode as in the Sleep mode—thus, pairwise correlations between neurons should also not differ (as measures of the second order moments of the marginal distribution). We found empirically that even for intermediate values of α in which activity is a mixture of Wake and Sleep modes, these correlations are largely unchanged; in contrast, we observed large changes in correlation structure for untrained networks and increases in effective dimensionality for both untrained networks and for our simple noise-based hallucination protocol, suggesting that these results are more specific to our trained models in which hallucinations are caused by an increase in apical dendritic influence (Figure 5—figure supplement 1a–b). Interestingly, these results are consistent with a recent study that has shown only minimal functional connectivity and effective dimensionality changes in task-engaged humans being presented with audiovisual stimuli under the influence of psilocybin (Siegel et al., 2024).

Figure 5 with 1 supplement see all

Download asset Open asset

Network-level effects of psychedelics.

(a) Pairwise correlation matrices computed for neurons in layer 2 across stimuli for $α = 0$ (left), $α = 0.5$ (center), and $α = 1.0$ (right). (b) Correlation similarity metric between the pairwise correlation matrices of the network in the absence of hallucination ( $α = 0$ ) as compared to hallucinating network states ( $α > 0$ ). (c) Proportion of explained variability as a function of principal component (PC) number for $α \in {0, 0.5, 1}$ . (d) Ratio of across-stimulus variance in individual stimulus layer neurons when the apical dendrites have been inactivated, versus baseline conditions across different α values. (e) Ratio of across-stimulus variance in individual neurons in the stimulus layer when neurons at the deepest network layer have been inactivated, versus baseline conditions across different α values. Error bars indicate +/-1 s.e.m.

However, though the pairwise correlations between single neurons are largely preserved, the causal influence between lower and higher layers of our model network changes considerably both during hallucination and Sleep modes. Because psychedelic drug administration increases the influence of apical dendritic inputs on neural activity in our model, we found that silencing apical dendritic activity reduced across-stimulus neural variability more as the psychedelic drug dose increases (Figure 5d). Furthermore, we found that as α increased, inactivating the deepest network layer induced a large reduction in variability in the stimulus layer relative to baseline (Figure 5e), revealing that within our model, increases in top-down influence are responsible for much of the observed stimulus-conditioned variability at larger drug doses. These inactivations had no impact on neural variability in our noise-based hallucination protocol, but were observed for all network architectures and datasets that we tested in which hallucinations were caused by an increase in apical dendritic influence (Figure 5—figure supplement 1), suggesting that these results are quite specific to our model. Furthermore, these inactivations have not yet been performed in animals and consequently constitute a critical testable prediction of our model.

Modeling hallucinations in large-scale pretrained networks

While our trained model is capable of capturing several effects of classical psychedelics, it also has a clear limitation: our top-down generative model does not have sufficient expressive power to induce complex hallucinations of naturalistic stimuli, producing instead ‘ripples,’ or ‘breathing’ effects that preserve lower-order statistical features of the input data (Figure 5b). While psychedelic drug users do report these phenomena, they also report observing much more complex hallucinations, including people, animals, and scenes (Shanon, 2002; Diaz, 2010).

Generative models trained through backpropagation have been much more successful in producing more complex generated sensory stimuli (Kingma and Welling, 2013; Rezende et al., 2014; Goodfellow et al., 2020), and furthermore, hierarchical variational autoencoder models have a nearly identical top-down/bottom-up model architecture as our Wake-Sleep-trained networks (Sønderby et al., 2016; Vahdat and Kautz, 2020). Therefore, to see whether our proposed mechanism would induce complex, structured hallucinations in more powerful models, we induced hallucinations in Very Deep Variational Autoencoder (VDVAE) models (Child, 2020) that were pretrained through backpropagation on a large natural images dataset, Tiny ImageNet (Wu et al., 2017), and a large corpus of human faces, FFHQ-256 (Karras et al., 2019). These models have a few key differences compared to our Wake-Sleep-trained models: (1) they are trained through backpropagation, which is well-known to be biologically implausible (Lillicrap et al., 2020); (2) they exploit parameter sharing across spatial positions in convolutional layers for increased data efficiency during training, at the expense of further biological realism (Pogodin et al., 2021); (3) the ‘Wake’ stage inference process of these models incorporates inputs from both bottom-up and top-down sources, which both improves performance (Sønderby et al., 2016) and is more biologically realistic (Csikor et al., 2022; Larkum, 2013); (4) the models are trained on more complex, higher-resolution datasets. Finally, to induce more ‘abstract’ hallucinations, we increased the α parameter in these models selectively for higher layers of the network, whereas for the Wake-Sleep-trained models, we increased α evenly across layers (see Methods). Combined, these differences make for an effective model of high-level hallucination effects, at the expense of some biological realism.

We found that hallucinations generated by these pretrained models were much richer and more complex: increasing α in the Tiny ImageNet VDVAE caused the emergence of textural patterns and geometric shapes, while the FFHQ-256 VDVAE caused increasingly bizarre changes in facial features (Figure 6). Both models were also capable of reproducing closed-eyes hallucinations (Figure 6—figure supplement 1), where the content of these hallucinations was shaped by their respective training datasets.

Figure 6 with 2 supplements see all

Download asset Open asset

Visualizing the effects of psychedelics in pretrained Very Deep Variational Autoencoder (VDVAE) models.

Decoded outputs of a pretrained VDVAE model trained on Tiny ImageNet (Top) and FFHQ-256 (Bottom) based on hallucinations generated in the top 35 layers of the model. Image samples vary along rows, and hallucination intensity, parameterized by α, increases along columns.

To investigate the nature of hallucinations generated by the Tiny ImageNet VDVAE, we examined the Laplacian pyramid of decoded hallucination images at varying α values (Figure 6—figure supplement 2a). Essentially, a Laplacian pyramid decomposes an image into levels of decreasing resolution features, with each level encoding the residual produced by downsampling to the next-lowest resolution (level 0 corresponds to the base 64×64 pixel image, while level 5 corresponds to a 4×4 reduced-resolution set of features). We found that low-level pyramid features varied considerably at low α levels, while high-level pyramid features did not begin to vary until higher α doses (Figure 6—figure supplement 2b). This suggests that hallucinations within our model obey a fine-to-coarse structure, where low-dose hallucinations are confined to high-frequency, spatially localized changes, and progressively increasing doses begin to cause variations in more global image features.

Lastly, we were able to replicate our previous network-level results on the Tiny ImageNet VDVAE. We found that increasing psychedelic dose α caused an increase in stimulus-conditioned variance within the model (Figure 6—figure supplement 2c), and that across-stimulus correlation structure between network units was largely preserved across doses (Figure 6—figure supplement 2d). Furthermore, we found that the ratio of before- and after-inactivation across-stimulus variance decreased as the psychedelic dose α increased (though somewhat paradoxically, inactivation caused an increase in variance for $α = 0$ , likely due to the influence of top-down inputs during inference for this model). Combined, these results show that key testable predictions from our Wake-Sleep-trained model are preserved in the VDVAE, while this latter model is capable of producing some of the more complex hallucinations characteristic of psychedelic experience.

Discussion

Experimental results captured by our model

In this study, we have examined a hypothetical mechanism explaining how the 5-HT2a receptor agonism of classical psychedelics could induce the highly structured hallucinations reported by people who have consumed these drugs. Specifically, we have explored the ‘oneirogen hypothesis,’ which postulates that 5-HT2a agonists have the effects that they do because they shift the neocortex to a more dream-like state, wherein activity is more strongly driven by top-down inputs to apical dendrites than normally occurs during waking. To provide a concrete model to explore the ‘oneirogen hypothesis,’ we used the classic Wake-Sleep algorithm, which learns by toggling between a Wake phase, where activity is driven by bottom-up sensory inputs, and a Sleep phase, where activity is driven by top-down generative signals. We modeled the ‘oneirogen hypothesis’ by simulating psychedelic administration as an increase in a neuronal state variable (α) that switches neural activity between these two phases, such that the simulated psychedelic caused the network to enter a state somewhere between the Wake and Sleep phases, making activity during the Wake phase less tied to actual sensory inputs by increasing the relative influence of the top-down, apical compartment in the models (depending on the ‘dosage’). This formulation is consistent with anatomical wiring data (Larkum, 2013), as well as several recent theoretical studies which propose a specialized learning role for top-down projections to the apical dendrites of pyramidal neurons (Körding and König, 2001; Urbanczik and Senn, 2014; Guerguiev et al., 2017; Sacramento et al., 2018; Richards and Lillicrap, 2019; Payeur et al., 2021). It is also consistent with the known cellular mechanism of action of classical psychedelics (Jakab and Goldman-Rakic, 1998; Aghajanian and Marek, 1999; Aghajanian and Marek, 1997; Kraehenmann et al., 2017) and experiments that demonstrate a reduced responsivity to bottom-up stimuli in cortex after psychedelic drug administration (Evarts et al., 1955; Azimi et al., 2020; Michaiel et al., 2019). Using this model, we were able to produce both stimulus-conditioned and ‘closed-eye’ hallucinations that are consistent with the low-level effects reported by psychedelic drug users (Preller and Vollenweider, 2018), and we were also able to recapitulate the large increases in plasticity observed at both apical and basal synapses at moderate psychedelic doses (Shao et al., 2021).

Our model uses a particular functional form of synaptic plasticity at both apical and basal synapses, reminiscent of the classical delta rule (Widrow and Lehr, 1990), which seeks to minimize a prediction error between inputs in apical and basal synapses. There are many theoretical models of learning that propose similar forms of plasticity (Urbanczik and Senn, 2014; Guerguiev et al., 2017; Bredenberg et al., 2021), so while this plasticity is a necessary prediction of our model, it is not sufficient to validate it. Experimentally, plasticity dynamics which could, theoretically, minimize such a prediction error have been observed in cortex (Sjöström and Häusser, 2006; Froemke et al., 2010); we found that plasticity rules of this kind induce strong correlations between inputs to the apical and basal dendritic compartments of pyramidal neurons, which has been observed in both the hippocampus and cortex (Beaulieu-Laroche et al., 2019; O’Hare et al., 2024). Psychedelic administration within our model induced large increases in plasticity, which has also been observed experimentally (Shao et al., 2021; Grieco et al., 2022). Within our model, this plasticity should not be interpreted as ‘learning,’ since it arises from aberrant network activity and does not necessarily produce behavioral or perceptual improvements; it is likely closer to ‘noise,’ that may still be useful for helping neural networks escape from local minima in the loss optimization landscape for synaptic weights, with possible implications for individuals suffering from post-traumatic stress disorder, early life trauma, or the negative effects of sensory deprivation. Further work will be required to analyze the relationship within our model between psychedelic dosage, usage frequency, and the long-term stability of learned representations in neural networks.

Interestingly, we also found that increasing the influence of apical dendrites in the model increased stimulus-conditioned variability in our individual neurons. In the cortex, this effect has recently been shown at the level of single auditory neurons (Horrocks et al., 2024); furthermore, there have been numerous studies reporting similar increases in asynchronous variability (Carhart-Harris et al., 2014) (or, analogously, sample entropy Lebedev et al., 2016) and Lempel-Ziv complexity (Mediano et al., 2024) in resting-state human brain recordings, previously modeled using Entropic Brain Theory. This theory proposes that many of the effects of classical psychedelics on perception and learning can be explained in terms of increases in variability induced by drug administration (e.g. the increase in variability could introduce novel patterns of thinking, or perturb learning to allow it to break out of ‘local minima’). Our results are broadly consistent with this perspective, to which we have added explanatory layers that are both normative and mechanistic (Bredenberg and Savin, 2024; Levenstein et al., 2023): namely, we speculate that this variability under ordinary conditions results from an ethologically important mechanism underlying generative replay for unsupervised learning during sleep or quiescence, and we propose that mechanistically this increase in variability is caused by the increased influence of top-down synapses that are not tied to incoming sensory stimuli. Alternatively, such entropy increases could be caused by increases in attention or self-reflective thought, as supported by recent studies showing that task engagement significantly attenuates psychedelic-induced entropy increases (Siegel et al., 2024); though our model does not include cognitive or attention components, such an interpretation is potentially consistent with and complementary to our framework.

Testable predictions

While our results are broadly consistent with existing experimental evidence, there are many unconfirmed aspects of our model which could be tested to validate or invalidate it (summarized in Table 1). As mentioned in the previous section, our model predicts that single neurons should increase variability in response to psychedelic drug administration in any cortical area affected by psychedelic drugs, an effect that has not yet been investigated systematically throughout cortex or across task conditions. Second, we propose that psychedelic drugs should not push network dynamics into wildly different operating regimes than normal wakefulness, beyond any differences observed between wakefulness and replay (dreams) during sleep. In particular, we found that our simulated psychedelic drug administration did not perturb pairwise correlations between neurons within local circuits when averaged across an ecologically representative set of stimuli.

Table 1

Summarizing testable predictions of the ‘oneirogen hypothesis’.

Models: OH - oneirogen hypothesis; EC - Ermentrout and Cowan, 1979; REBUS - Relaxed Beliefs Under Psychedelics (Carhart-Harris and Friston, 2019); DD - DeepDream (Suzuki et al., 2017). Key: ✓ - model is consistent with the prediction; ✗ - model is inconsistent with the prediction; n/a - model is neither inconsistent nor consistent with the prediction.

Testable predictions	OH	EC	REBUS	DD
1. Psychedelic administration increases stimulus-conditioned variability of neurons.	✓	✓	✓	✓
2. Psychedelic administration preserves pairwise across-stimulus correlations between neurons.	✓	✗	✗	✗
3. Silencing apical dendritic compartments decreases neural variability more after psychedelic administration.	✓	n/a	n/a	n/a
4. Silencing higher-order cortical areas affects lower-order cortical activity more after psychedelic administration.	✓	✗	✗	✓
5. Psychedelic drug effects are mediated by the same circuitry responsible for inducing generative replay dynamics in cortex.	✓	✗	✗	✗

Within our model, psychedelic drug administration is expected to increase the relative influence of top-down projections. This prediction appears to be supported by slice experiments (Aghajanian and Marek, 1999; Aghajanian and Marek, 1997; Arvanov et al., 1999), but to our knowledge, this change in functional connectivity has not yet been shown via in vivo manipulations. This could be explored experimentally in several ways: first, we have shown that apical dendrite-targeted silencing experiments can identify the amount of influence apical dendritic inputs exert on neuronal dynamics; second, we have shown that increases in top-down influence can in principle be identified with interareal silencing experiments. We caution that interpreting results in this second vein may be difficult, as establishing a clean distinction between a ‘higher order’ and ‘lower order’ cortical area may be much more difficult in a densely recurrent system, such as the brain, compared to our simplified and fully observable network model.

Interestingly, if psychedelic drugs are genuinely co-opting circuitry ordinarily reserved for generative replay during periods of offline quiescence or sleep, we would expect that the same changes in functional connectivity observed during psychedelic drug administration would also occur during periods of replay. Replay has been observed and dreams have been documented during both SW (Lee and Wilson, 2002; Ji and Wilson, 2007) and REM (Louie and Wilson, 2001; Andrillon et al., 2015) sleep, with REM dreams exhibiting greater degrees of bizarreness, possibly indicating a more ‘generative’ form of replay (Stickgold et al., 2001). During SW sleep, increased top-down influence has been observed from secondary motor cortex to primary somatosensory cortex (Miyamoto et al., 2016), and from hippocampus to prefrontal cortex (Ji and Wilson, 2007); however, it should be noted that increased hippocampal-to-prefrontal functional coupling was not observed after classical psychedelic administration (Domenico et al., 2021). During REM sleep, increased top-down influence (or apical dendritic influence) has been observed in prefrontal, visual (Zhou et al., 2020), and motor (Li et al., 2017) cortices, with some top-down inputs originating from higher-order thalamic nuclei (Aime et al., 2022; Whyte et al., 2024); similarly, multiple non-invasive imaging studies have observed increases in top-down functional coupling from higher-order thalamic nuclei after psychedelic administration (Gaddis et al., 2022; Delli Pizzi et al., 2023). Therefore, increases in top-down coupling appear broadly consistent between REM sleep and classical psychedelic administration, while psychedelic states appear inconsistent with the hippocampal-cortical coupling during SW sleep; this latter result could potentially be explained in terms of a recent complementary learning systems model (Singh et al., 2022), in which SW sleep is responsible for orchestrating hippocampus-cortex-coupled episodic replay while REM sleep is responsible for orchestrating hippocampus-cortex-decoupled generative replay, but more experiments and theoretical work will likely be necessary to fully characterize this additional complexity. Given these data, it seems as though REM sleep replay is a moderately stronger candidate for sharing a mechanism of action with classical psychedelics, though it remains possible that replay events during SW sleep occur via a similarly shared mechanism.

To summarize, though we have provided a candidate explanation for several of the hallucinatory effects of psychedelic drugs with a model that displays a strong correspondence with existing empirical evidence, our model rests on a number of testable assumptions. Our goal here has been to articulate these assumptions as clearly as possible, to facilitate experimental efforts to test them.

Comparisons to alternative models

Here, we review prominent existing hypotheses as to how psychedelic drugs could induce hallucinations in neural networks and compare to our model (summarized in Table 1). The first alternative proposed that incredibly complex, geometric patterns formed by DMT administration could be attributed to pattern-formation effects in visual cortex caused by a disruption of the balance between excitation and inhibition in locally coupled topographic recurrent neural networks (Ermentrout and Cowan, 1979; Bressloff et al., 2001). Our work differs from this approach in several respects. First, rather than disrupting E-I balance, we propose that psychedelics increase the relative influence of apical dendrites and top-down projections on the dynamics of neural activity. Second, though their model is able to generate geometric patterns, it is not able to generate patterns that are statistically related to the features of the sensory environment (e.g. MNIST digits). Lastly, for simplicity, we avoided, including topographic (or convolutional) recurrent connectivity in our model; however, it would be a very fruitful direction for future research to extend our work to generative modeling of temporal video sequences, as in Keller and Welling, 2023; Keller et al., 2023. With such a development, it is conceivable that our model could directly generalize these pattern formation-based approaches.

Perhaps more closely related to our model is the ‘relaxed beliefs under psychedelics’ (REBUS) model, which proposes to explain the effects of classical psychedelics in terms of predictive coding theory (Carhart-Harris and Friston, 2019). Similar to the Wake-Sleep algorithm, predictive coding theory (Rao and Ballard, 1999) models sensory representation learning with neural dynamics and local synaptic modifications that collectively optimize an ELBO objective function. However, at a mechanistic level, there are numerous differences, the most easily distinguishable feature being that the Wake-Sleep algorithm requires periods of offline ‘generative replay’ to train bottom-up synapses in its network, whereas predictive coding learning occurs concomitantly with stimulus presentation. Furthermore, the REBUS model of psychedelic effects is described at a computational level, in terms of a decrease in the ‘precision-weighting of top-down priors.’ While it is more difficult to map the REBUS model directly onto cortical microcircuitry, and the hallucinatory effects of such a model have, to our knowledge, not been directly analyzed, it has been shown that the proposed mechanism causes an increase in bottom-up information flow between cortical areas (Rajpal et al., 2022), in direct contrast to the effects that we have shown in our model (Figure 5c–d), there is some evidence supporting this idea (Alamia et al., 2020), but noninvasive imaging studies are inconsistent on this question, with many studies showing by contrast an increase in top-down functional connectivity caused by classical psychedelic administration (Gaddis et al., 2022; Delli Pizzi et al., 2023), and with invasive recordings showing a decrease in the influence of bottom-up inputs (Evarts et al., 1955; Azimi et al., 2020; Michaiel et al., 2019). Because interareal causal influence can be difficult to analyze statistically due to dense recurrent connectivity (i.e. correlation does not imply causation), we stress that it would be more effective to distinguish between the REBUS model and our ‘oneirogen hypothesis’ by performing direct interventions on inputs to the apical and basal dendritic compartments of pyramidal neurons in cortex, and by exploring whether psychedelic drugs affect the same circuitry that induces ‘generative replay’ during periods of sleep and quiescence. More consistent with our model, a recent non-mechanistic approach based on the DeepDream algorithm has been used to generate realistic hallucinations via increased influence from a top-down learning signal (Suzuki et al., 2017); however, this model proposes no relationship between psychedelics and replay during sleep.

Lastly, it should be noted that the Wake-Sleep algorithm and our choice of network architecture constitute one particular model within a family of related models, all of which satisfy our key criteria for a good model of the ‘oneirogen hypothesis,’ namely that (1) the model has well-defined top-down and bottom-up pathways, (2) it learns a generative model of incoming sensory inputs, and (3) it uses periods of offline replay for learning through local synaptic plasticity. For example, in the Supplemental Materials, we have replicated all of our essential results for two alternative network architectures, also learned via the Wake-Sleep algorithm: one model uses within-layer recurrence to improve generative performance, while the other model uses a simpler single compartment neuron model. Furthermore, the closely related Contrastive Divergence learning algorithm for Boltzmann Machines (Ackley et al., 1985) also involves alternations between Wake and generative Sleep phases, learns through local synaptic plasticity, and has been used to model hallucination disorders like Charles Bonnet Syndrome (Reichert et al., 2013), though Boltzmann machines are computationally more cumbersome to train and require more non-biological network features than the Wake-Sleep algorithm. We feel as though it is important to recognize that models that satisfy these three criteria are more similar than they are different, and that it may be quite difficult to experimentally distinguish between them.

Limitations

While our model is capable of capturing several effects of classical psychedelics, it also has several clear limitations. First, while we have been able to model complex hallucination phenomena with backpropagation-trained networks, hallucinations generated by Wake-Sleep-trained networks were generally simpler, likely because the Wake-Sleep algorithm is well-known to be a less effective representation learning and generative modeling algorithm than backpropagation (Kingma and Welling, 2013), despite its superior biological realism. This suggests that while it is quite possible for generative modeling approaches to produce complex hallucinations through non-biological means, algorithmic or architectural improvements may be necessary in order to make the performance of the more plausible Wake-Sleep algorithm closer to that achieved by state-of-the-art models.

Our model also oversimplifies several aspects of biology. In particular, we do not use neurons that respect Dale’s law (O’Donohue et al., 1985; Cornford et al., 2020), and the majority of our efforts to map the Wake-Sleep algorithm onto biology focus on excitatory pyramidal neurons. Furthermore, though we do observe that neural dynamics can tolerate a significant amount of top-down input before disrupting perception, experiments and theoretical studies have shown that inputs to apical dendrites of pyramidal neurons do play an important role in waking perception (Larkum, 2013; Whyte et al., 2024; Munn et al., 2023), and are not just learning signals. We focused on clear distinctions between basally-driven Wake modes and apically-driven Sleep modes during training for computational efficiency reasons, and also due to the fact that parameter sharing across inference and generative networks in the Wake-Sleep algorithm is theoretically under-explored (though it is supported in closely related predictive coding approaches Rao and Ballard, 1999 and Boltzmann machines Ackley et al., 1985). Future elaborations on our model could incorporate feedback control (Podlaski and Machens, 2020), attention (Lindsay, 2020), or multimodal sensory inputs (Islah et al., 2025) into top-down projections; such inputs could help explore how psychedelic hallucinations interact with attentional or feedback control systems in the brain and have been shown to interact constructively with top-down learning signals in prior models (Gilra and Gerstner, 2017; Meulemans et al., 2021; Roelfsema and van Ooyen, 2005). Our use of VDVAEs is a positive step in this direction, but ideally, such network architectures would be made compatible with the Wake-Sleep algorithm.

Lastly, our modeling focus has been exclusively on cortical plasticity and hallucination effects: it should be noted that our model has little bearing on other important features of the psychedelic experience of potential therapeutic relevance, because we have not included the effects of psychedelics on subcortical structures, including the serotonergic system (Carhart-Harris and Nutt, 2017), which plays an important role in regulating mood and may be where psychedelics exert some of their antidepressant effects. Many studies of the effects of psychedelics on fear extinction focus on the hippocampus or the amygdala (Bombardi and Di Giovanni, 2013; Jiang et al., 2009; Kelly et al., 2024; Tiwari et al., 2024). These areas receive extensive innervation directly from serotonergic synapses originating from the dorsal raphe nucleus, which have been shown to play an important role in emotional learning (Lesch and Waider, 2012); because classical psychedelics may play a more direct role in modulating this serotonergic innervation, it is possible that fear conditioning results (in addition to the anxiolytic effects of psychedelics) cannot be attributed to a shift in balance between apical and basal synapses induced by psychedelic administration.

Conclusions

Here, we have proposed a hypothesis for the mechanism of action of psychedelic drugs in terms of its excitatory effects on the apical dendrites of pyramidal neurons, which we propose pushes network dynamics into a state normally reserved for offline replay and learning; we have also proposed a number of testable predictions which could be used to validate or invalidate our hypothesis. If validated, our model would describe a mechanism by which psychedelic drug administration causes ordinary sensory perception to become literally more dream-like; it further suggests that the plasticity increases observed during both sleep and psychedelic experience could occur via a common mechanism dedicated to sensory representation learning in the brain. Beyond classical psychedelics, further studying the balance between apical and basal dendritic inputs to pyramidal neurons in connection to replay during sleep may be relevant for explaining the hallucinatory effects of other drugs (such as ketamine) or mental disorders like schizophrenia (Corlett et al., 2009).

Methods

Model architecture and training

To model the effects of psychedelics on neural network dynamics and plasticity, we first constructed a simple model of the early visual system by training neural networks on two different image datasets (MNIST Deng, 2012 and CIFAR10 Krizhevsky and Hinton, 2009). Networks were trained with the Wake-Sleep algorithm (Hinton et al., 1995), which requires, for each layer, two modes of stochastic network activity: a ‘generative mode,’ and an ‘inference mode.’ For the ‘inference’ mode, we must specify a probability distribution $b (r^{(l)} | r^{(l - 1)})$ , while for the ‘generative’ mode, we must specify a separate distribution $p (r^{(l)} | r^{(l + 1)})$ (As a notational convention, we will use letters when referring to mathematical objects from the generative, top-down distribution, and their vertical reflection when referring to the inference, bottom-up distribution (e.g. p and b)). Notice here that activity in ‘inference’ mode is conditioned on ‘bottom-up’ network states ( $r^{(l - 1)}$ ), while activity in generative mode is conditioned on ‘top-down’ network states ( $r^{(l + 1)}$ ) (Figure 1a).

The ‘inference mode’ specifies a probability distribution over neural activity, conditioned on the next-lower layer (where the lowest layer is the stimulus layer, i.e., $r^{(0)} = s$ )—mechanistically, it corresponds to activity generated by feedforward projections. To increase the expressive power of our neural units, we use multicompartmental neuron models similar to Poirazi et al., 2003 with $N_{d}$ dendritic compartments, whose voltages are summed nonlinearly to form the full input to the basal dendrites. For $l > 0$ , layer activity is sampled from the distribution $r^{(l)} \sim N (h (r^{(l - 1)}), σ_{b}^{2} I)$ , where for neuron $i$ in layer $l$ , $h_{i} (r^{(l - 1)})$ is given by:

h_{i} (r^{(l - 1)}) = ϕ (\sum_{n = 0}^{N_{d}} w_{i n}^{(l)} ϕ_{d} (W_{i n}^{(l)} r^{(l - 1)} + c_{i n}^{(l)}) + b_{i}^{(l)}),

where $W_{i n}^{(l)}$ is a $1 \times N^{(l - 1)}$ matrix of synaptic weights onto dendrite $n$ , $c_{i n}$ is the corresponding bias for the nth dendritic compartment, $w_{i n}^{(l)}$ is the strictly positive weight given to the nth dendritic branch (roughly corresponding to a conductance), and $b_{i}^{(l)}$ is the bias for the entire basal compartment. $ϕ_{d} (\cdot)$ and $ϕ (\cdot)$ are nonlinearities for the dendritic branches and the total basal compartment, respectively: both are the sequential composition of the $\tanh$ nonlinearity, followed by batch normalization (Ioffe, 2015). For the dendritic branch nonlinearities, we allow for learnable affine parameters (scale and bias), but for the entire basal dendritic compartment, we constrain activity to be zero-mean and unit variance across batches in order to prevent indeterminacy between apical and basal scale parameters. For the final inference layer $r^{(L)}$ , as in the variational autoencoder (Rezende et al., 2014), we parameterize both the mean and a diagonal covariance matrix of the inference distribution: $r^{(L)} \sim N (h (r^{(L - 1)}), d i a g (h_{2} (r^{(L - 1)})))$ , where $h_{2} (\cdot)$ is also a multicompartmental model, in this case replacing the final batch normalization with an exponential nonlinearity to ensure positivity.

The ‘generative’ mode specifies a probability distribution over neural activity, conditioned on the next-higher layer—it corresponds mechanistically to activity generated by feedback projections. The highest layer, $r^{(L)}$ is sampled from an $N^{(L)}$ -dimensional independent standard normal distribution, $r^{(L)} \sim N (0, I)$ , and all subsequent layers are sampled from the distribution $r^{(l)} \sim N (μ (r^{(l + 1)}), σ_{p}^{2} I)$ , where for the ith neuron, $μ_{i} (r^{(l + 1)})$ is given by:

μ_{i} (r^{(l + 1)}) = ϕ (\sum_{n = 0}^{N_{d}} m_{i n}^{(l)} ϕ_{d} (M_{i n}^{(l)} r^{(l + 1)} + d_{i n}^{(l)}) + a_{i}^{(l)}),

where $M_{i n}^{(l)}$ is a $1 \times N^{(l + 1)}$ matrix of synaptic weights onto apical dendritic branch $n$ , $d_{i n}^{(l)}$ is the corresponding bias for the nth dendritic compartment, $m_{i n}^{(l)}$ is the strictly positive weight given to the nth dendritic branch, and $a_{i}^{(l)}$ is the bias for the entire apical compartment. Again, $ϕ_{d} (\cdot)$ and $ϕ (\cdot)$ are nonlinearities, identical to the inference (basal) pathway.

While the neuron model used here is more complicated than is normally used for single-unit neuron models, functions of this kind could feasibly be implemented by nonlinear dendritic computations (Poirazi et al., 2003); we further found that using this nonlinearity qualitatively improved generative performance (Figure 2—figure supplement 2). Given these parameterized probability distributions, we then determined the neural activity for each layer $l$ according to Equation 1. Our network trained on MNIST was composed of three layers, with widths [32, 16, 6], listed in ascending order. A full list of network hyperparameters for both our MNIST and CIFAR10-trained networks can be found in the Supplemental Methods.

All synaptic weights and parameters in our networks were trained via the Wake-Sleep algorithm (Hinton et al., 1995), which is known to produce ‘local’ parameter updates for a wide range of neuron models (and rate or spike-based output distributions), though the specific functional form of the update may vary depending on the neuron model chosen (Bredenberg et al., 2024). These updates, for reasonable choices of neural network architecture, can be interpreted as predictions for how synaptic plasticity should look in the brain, if learning were really occurring via the Wake-Sleep algorithm or some approximation thereof.

Consider a generic inference (basal dendrite) parameter for neuron $i$ , $θ_{b} \in {w_{i n}^{(l)}, W_{i n}^{(l)}, b_{i}^{(l)}, c_{i n}^{(l)} : n = 0, . . ., N_{d}}$ . The Wake-Sleep algorithm gives the following update, for a single stimulus presentation:

Δ θ_{b} = (α) η \frac{(r_{i}^{(l)} - h_{i} (r^{(l - 1)}, θ_{b}))}{σ_{b}^{2}} \frac{\partial h_{i} (r^{(l - 1)}, θ_{b})}{\partial θ_{b}},

where η is a learning rate, and the gate α ensures that learning only occurs during sleep mode. Furthermore, for reasons of computational efficiency, we average weight updates across a batch of 512 stimulus presentations; similar results could in principle be obtained with purely online updates (Williams et al., 2023), but we opted to present stimuli in batches here in order to parallelize computations. $\frac{\partial h_{i} (r^{(l - 1)}, θ_{b})}{\partial θ_{b}}$ changes depending on the parameter θ, reflecting that particular parameter’s contribution to basal dendritic activity. For a dendritic branch weight $w_{i n}^{(l)}$ , we have:

\frac{\partial h_{i} (r^{(l - 1)}, w_{i n}^{(l)})}{\partial w_{i n}^{(l)}} = ϕ^{'} (v_{i}^{t o t a l}) ϕ_{d} (v_{i n}),

where $v_{i}^{t o t a l}$ is the total input to the basal dendritic compartment, and $v_{i n} = W_{i n}^{(l)} r^{(l - 1)} + c_{i n}^{(l)}$ is the total input to the nth dendritic branch. This update has the functional form of a classical ‘delta’ learning rule (Widrow and Lehr, 1990), where a compartmental prediction error between local dendritic activity and neuronal firing rate is multiplicatively combined with branch-specific input to provide changes in the conductance for the nth branch. Similarly, for the jth synapse on the nth dendritic branch, $W_{i n j}^{(l)}$ , we have:

\frac{\partial h_{i} (r^{(l - 1)}, W_{i n j}^{(l)})}{\partial W_{i n j}^{(l)}} = ϕ^{'} (v_{i}^{t o t a l}) w_{i n}^{(l)} ϕ^{'} (v_{i n}) r_{j}^{(l - 1)} .

Unlike for simple one-compartment neuron models, the computation of parameter updates for dendritic synapses $W_{i n j}^{(l)}$ requires weighting the ‘delta’ error by the conductance of the corresponding dendritic branch ( $w_{i n}$ ), which could be approximated by the passive diffusion of signaling molecules from the principal basal dendritic compartment back along dendritic branches to individual synapses.

For generative parameters ( $θ_{p} \in {m_{i n}^{(l)}, M_{i n}^{(l)}, a_{i}^{(l)}, d_{i n}^{(l)} : n = 0, . . ., N_{d}}$ ), we have a nearly identical update for a single stimulus presentation:

Δ θ_{p} = (1 - α) η \frac{(r_{i}^{(l)} - μ_{i} (r^{(l + 1)}, θ_{p}))}{σ_{p}^{2}} \frac{\partial μ_{i} (r^{(l + 1)}, θ_{p})}{\partial θ_{p}},

where now input in the apical dendritic compartment, $μ_{i} (r^{(l + 1)})$ , is being compared to the activity of the neuron as a whole to determine the magnitude and sign of plasticity. The $(1 - α)$ gate in this case ensures that plasticity only occurs during the Wake mode. We provide pseudocode (Supplementary file 4) for our Wake-Sleep implementation, as well as a full list of algorithm and optimizer hyperparameters (Supplementary files 1 and 2) in the Supplemental materials (Code for reproducing all results from Wake-Sleep-trained models this study is available here: https://github.com/colinbredenberg/oneirogen-hypothesis, copy archived at Bredenberg, 2024).

Modeling hallucinations

During training, neural network activity is either dominated entirely by bottom-up inputs (Wake, $α = 0$ ) or by top-down inputs (Sleep, $α = 1$ ). As a consequence, sampling neural activity is computationally low-cost and can be performed in a single time step. During Wake, one can take a sampled stimulus variable $s$ , determine the activity at layer 1, then 2, and so on until layer $L$ , while during Sleep, one can sample a latent network state in layer $L$ and traverse the layers in reverse order, down to the stimulus layer. However, this is not possible if $α \notin {0, 1}$ , because activity in each layer $l$ should depend simultaneously on layer $l + 1$ and layer $l - 1$ . For this reason, we chose to model hallucinatory neural activity dynamically, as follows:

r_{t}^{(l)} = (1 - \frac{1}{τ}) r_{t - 1}^{(l)} + \frac{1}{τ} f (h (r_{t - 1}), μ (r_{t - 1}), α) + \frac{f (σ_{b}, σ_{p}, α)}{\sqrt{τ}} η_{t - 1},

where τ is a time constant that determines how much of the previous network state is retained, and $η_{t - 1} \sim N (0, I)$ . Critically, if we take $τ = 1$ , these dynamics reduce to the sampling procedure used during training (Equation 1). A priori, the choice of interpolation function $f (a, b, α)$ is arbitrary. We selected the following function:

f (a, b, α) = κ \log [(1 - α) \exp \frac{a}{κ} + α \exp \frac{b}{κ}],

where $κ = 0.35$ is a free parameter. This function is equivalent to linear interpolation as $κ \to \infty$ , and is equivalent to the maximum function between arguments $a$ and $b$ as $κ \to 0$ if $α = 0.5$ . By selecting $κ = 0.35$ , we are biasing the system towards registering positive inputs from apical or basal sources (in the inclusive sense). We found that this produced ‘hallucinatory’ percepts in stimulus space that did not reduce the intensity of input stimuli as α increased; rather, inputs maintained their intensity, and hallucinations were added on top if they were of greater intensity than the ground-truth image. All simulations were run for 800 timesteps, with $τ = 0.1$ . As a control, we compared our results to network dynamics produced purely by increases in noise, without increases in apical dendritic influence (which we refer to as our noise-based hallucination protocol). For these control simulations, we produced network activity time series with the following equation:

r_{t}^{(l)} = (1 - \frac{1}{τ}) r_{t - 1}^{(l)} + \frac{1}{τ} h (r_{t - 1}) + \frac{σ_{b} + α}{\sqrt{τ}} η_{t - 1},

so that the standard deviation of the injected noise increased linearly with α.

Apical and basal alignment

To measure the alignment between inputs in the apical and basal dendritic compartments of our model neurons, we computed the ‘Wake’ neural responses to the full test dataset and measured the activity in both the basal and apical compartments of our neurons ( $h (r^{(l - 1)})$ and $μ (r^{(l + 1)})$ , respectively). We then calculated the correlation coefficient between apical and basal compartments for the same neuron, compared to the correlation between compartments for two randomly selected neurons.

Quantifying plasticity

To quantify the total amount of plasticity induced in our model system by the administration of psychedelic drugs, we measured the change in relative parameter strength (averaging across all synapses in the network and an ensemble of 512 test images). For each test image, we simulated network dynamics according to Equation 8. Subsequently, for each parameter θ, we calculated the net amount of plasticity induced by viewing all test images, $Δ θ$ . We subsequently reported the relative change:

Δ θ_{r e l} = \frac{| Δ θ |}{| θ | + ϵ},

under conditions in which α values gate plasticity (as in ordinary Wake-Sleep) and under conditions in which psychedelic drug administration does not also affect plasticity gating. Here, we took $ϵ = 10^{- 2}$ to avoid numerical instabilities.

Classifier training

As we trained our neural network using the Wake-Sleep algorithm, we simultaneously trained a separate classifier network based on Wake-phase neural activity in the second network layer on a cross-entropy loss, to identify the stimulus class of the input to the system. For our classifier, we used a multilayer perceptron neural network with a single 256-unit hidden layer and $\tanh (\cdot)$ nonlinearities.

We then quantified the accuracy of the classifier on the test set, based on neural activity drawn from the final time step $T$ of hallucination simulations with various values of α. We further measured the average variance of the 10-dimensional output logits of the neural network.

Quantifying correlation matrix similarity before and after psychedelics

To quantify how similar the pairwise correlations between neurons in our model networks were before and after the administration of psychedelics, we recorded hallucinatory network dynamics for an ensemble of 512 test images and measured pairwise correlations between neurons in the first network layer. To compare these matrices, we then report the correlation coefficient between the flattened $N \times N$ matrices. For this metric, a value of 1 indicates that the correlation matrices are perfectly aligned, while a value of –1 indicates that pairwise correlations are fully inverted.

Quantifying interareal causality through inactivations

To quantify changes in interareal functional connectivity induced by psychedelics, we performed two different types of inactivation. In the first, we inactivated the apical dendritic compartments of all neurons in the stimulus layer and measured how this inactivation affected across-stimulus variability of neurons relative to the fully active state. In the second method, we inactivated all neurons in the deepest layer and measured the same effect in across-stimulus variability in the stimulus layer. For both inactivation schemes, we report the mean and standard error of the variance ratio:

V R = \frac{{Var}_{i n a c t} (r^{(0)}) + ϵ_{v}}{Var (r^{(0)}) + ϵ_{v}},

where we added $ϵ_{v} = 10^{- 3}$ to the denominator to prevent numerical instability and to the numerator to ensure that the ratio evaluates to 1 if the two variances are equivalent.

Generating hallucinations in hierarchical variational autoencoders

To model more complex hallucination phenomena than could be observed in our simpler Wake-Sleep-trained networks, we used pretrained VDVAE Child, 2020 models trained on Tiny ImageNet Wu et al., 2017, a 64×64 pixel variant of ImageNet, and FFHQ-256 Karras et al., 2019, a dataset of 256×256 pixel human faces. VDVAE models are very similar to our Wake-Sleep-trained models: they are trained on the same unsupervised representation learning objective function (the ELBO), and every layer of the multilayer network models are parameterized by a bottom-up inference distribution $b$ and a top-down generative distribution $p$ . VDVAE models are top-down VAEs (Sønderby et al., 2016), which means that the inference distribution is conditioned on bottom-up stimuli and latent network activity at higher layers, i.e., the distribution is written $b (r^{(l)} | h (s, r^{(l + 1)}))$ , where $h (\cdot)$ is a parameterized neural network. By contrast, the generative distribution is conditioned only on top-down inputs and is written $p (r^{(l)} | μ (r^{(l + 1)}))$ , where $μ (r^{(l + 1)})$ is also a parameterized neural network.

For our Wake-Sleep-trained networks, we modeled hallucinations by simulating a stochastic time series at each layer (Equation 8), but for the VDVAE models, we found this to be computationally infeasible. Instead, we modeled hallucinations with a single bottom-up and top-down pass through the network, as follows:

r^{(l)} = (1 - α) r_{b}^{(l)} + (α) r_{p}^{(l)},

where $r_{b}^{l} \sim b (r^{(l)} | h (s, r^{(l + 1)}))$ is a sample from the inference distribution, and $r_{p}^{l} \sim p (r^{(l)} | μ (r^{(l + 1)}))$ is a sample from the generative distribution. This generation scheme is simpler and less computationally expensive than our previous method, while still producing purely Wake-stage sampling when $α = 0$ and Sleep-stage sampling when $α = 1$ ; intermediate values of α correspond to modeled hallucinatory network states (Code for reproducing results obtained with pretrained VDVAE models is available here: https://github.com/colinbredenberg/vdvae, copy archived at Bredenberg, 2025). Our Laplacian pyramid analysis of generated images was performed using the Pyrtools package (Simoncelli et al., 2025).

Ethics declarations

Psychedelic drug research has a long history fraught with many instances of unethical research practice (Strauss et al., 2022). Furthermore, psychedelic drug use itself has long been stigmatized and punished through legal measures (Bauml and Schaefer, 2016), often at the expense of indigenous peoples, who have long incorporated psychoactive substances into their cultural and spiritual practices (Samorini, 2019). In the interest of avoiding a repetition of past mistakes, we feel compelled to provide explicit guidance on how our work should be interpreted and used. To do so, we will take inspiration from two principal ethical frameworks: the Montreal Declaration on Responsible AI (Dilhac et al., 2018), and the EQUIP framework for equity-oriented healthcare (Browne et al., 2015; Rea and Wallace, 2021). We strongly encourage anyone considering extending our research or using our work in any form of clinical setting to ensure that subsequent research adheres to these frameworks.

Below, drawing from these ethical frameworks, we will provide a set of guidelines for how our work should be interpreted and used. Though these guidelines are by no means exhaustive, our hope is that adherence to them will help promote the potential positive outcomes of our work while limiting potential negative consequences.

Guidelines for the ethical use of this study:

Do:

Ensure that the elements of our hypothesis have been adequately tested, as outlined in our discussion, before using our framework in any form of clinical or therapeutic setting.
Use our ideas to inform further basic neuroscience research on perception, learning, sleep, and replay phenomena.
Explore our ideas as an opportunity to inform your own understanding of cognition, learning, and perception, with the understanding that these ideas have not yet been fully validated experimentally.
Feel free to ask us if you are worried that your proposed use of our work may have negative impacts.

Do not:

Report our results as scientific fact. We have outlined a hypothesis, which is designed to be tested by the experimental neuroscience community.
Cite or interpret our results without an adequate understanding of the evidence supporting the various claims made in this study. Feel free to ask us if you are worried that you may be misinterpreting our results.
Use our results to extract undue or inequitable profit. The ideas developed in this paper are the product of decades of research and public funding, built upon centuries of exploration of psychedelics. Any knowledge or value contained within this paper is the common heritage of all humanity, with particular recognition due to the indigenous and marginalized communities that have historically suffered and are currently suffering from oppressive government and industry policies.
Use our results for any application that could violate human rights or harm human beings in any way.

Appendix 1

Supplementary materials

Recurrent network model

To explore the extent to which our results hold for different neuron models, and to give our generative model more expressive power than the traditional Helmholtz machine (Dayan et al., 1995), we constructed a network model with a single timestep of within-layer recurrent denoising in each layer, which gives our model some similarities to denoising diffusion approaches (Issa and Toosi, 2024). For both our ‘inference’ mode and our ‘generative’ mode, we specify both a denoised network state ${\bar{r}}^{(l)}$ and a noise-corrupted network state $r^{(l)}$ for layer $l$ ; specifying a neural network model is then equivalent to specifying, for each layer, a joint probability distribution over denoised and noise-corrupted network states for both the inference and generative modes, i.e., for the ‘inference’ mode we must specify a probability distribution $b ({\bar{r}}^{(l)}, r^{(l)} | r^{(l - 1)})$ , while for the ‘generative’ mode we must specify a separate distribution $p ({\bar{r}}^{(l)}, r^{(l)} | {\bar{r}}^{(l + 1)})$ (As a notational convention, we will use letters when referring to mathematical objects from the generative, top-down distribution, and their vertical reflection when referring to the inference, bottom-up distribution (e.g. p and b)). Notice here that activity in ‘inference’ mode is conditioned on ‘bottom-up’ network states ( $r^{(l - 1)}$ ), while activity in generative mode is conditioned on ‘top-down’ network states ( ${\bar{r}}^{(l + 1)}$ ) (Figure 1a).

h (r^{(l - 1)}) = \tanh (W^{(l)} r^{(l - 1)} + b^{(l)}) .

Subsequently, we add additional noise to get a noise-corrupted network state $𝐫^{(l)} \sim 𝒩 ((1 - {\bar{σ}}_{b}^{2} 𝐈) {\bar{𝐫}}^{(l)}, {\bar{σ}}_{b}^{2})$ ; while noise corruption is a natural feature of network dynamics in the brain (Faisal et al., 2008), we include it here in our model because it has been shown that denoising is a critical aspect of many powerful generative modeling approaches (Vincent, 2011; Kadkhodaie and Simoncelli, 2021; Rombach et al., 2022), and we have likewise found that it improves the quality of generated images in our learned networks (Figure 2—figure supplement 2).

The ‘generative’ mode specifies a probability distribution over neural activity, conditioned on the next-higher layer—it corresponds mechanistically to activity generated by feedback projections. The highest layer, $r^{(L)}$ is sampled from an $N^{(L)}$ -dimensional independent standard normal distribution, $r^{(L)} \sim N (0, I)$ , and all subsequent layers are sampled from the distribution $r^{(l)} \sim N (μ ({\bar{r}}^{(l + 1)}), σ_{p}^{2} I)$ , where $μ ({\bar{r}}^{(l + 1)})$ is given by:

μ ({\bar{𝐫}}^{(l + 1)}) = \tanh (𝐌^{(l)} {\bar{𝐫}}^{(l + 1)} + a^{(l)}),

where $m^{(l)}$ is a $N^{l} \times N^{(l + 1)}$ weight matrix, and $a$ is a bias term. Subsequently, the network goes through a single timestep of recurrent denoising, so that ${\bar{r}}^{(l)} \sim N (\bar{μ} (r^{(l)}), {\bar{σ}}_{p}^{2} I)$ , where $\bar{μ} (r^{(l)})$ is given by:

\bar{μ} (r^{(l)}) = r^{(l)} + σ (C_{1} r^{(l)} + c_{1}) \tanh (C_{2} r^{(l)} + c_{2}),

where $σ (\cdot)$ is a sigmoid nonlinearity that acts as a gating function similar to those used in the LSTM (Hochreiter and Schmidhuber, 1997) and GRU (Chung et al., 2014), $C_{1}$ and $C_{2}$ are $N^{(l)} \times N^{(l)}$ recurrent weight matrices, and $c_{1}$ and $c_{2}$ are biases. While this is a more complicated nonlinearity than is normally used for single-unit neuron models, functions of this kind could feasibly be implemented by nonlinear dendritic computations (Poirazi et al., 2003); we further found that using this nonlinearity qualitatively improved generative performance. Given these parameterized probability distributions, we then determined the neural activity for each layer $l$ according to Equation 1. As with our multicompartmental neuron model, inference and generative parameters were updated according to Equations 4 and 7, respectively. Recurrent network hyperparameters are available in Supplementary file 3.

Simplified neuron model

As a control, we also tested our results using a simplified multilayer perceptron neuron model, which used neither batch normalization nor multiple dendritic branches. For the ‘inference’ mode within the simplified model, for $l > 0$ , layer activity is sampled from the distribution $r^{(l)} \sim N (h (r^{(l - 1)}), σ_{b}^{2} I)$ , where for neuron $i$ in layer $l$ , $h_{i} (r^{(l - 1)})$ is given by:

h_{i} (r^{(l - 1)}) = \tanh (W_{i :}^{(l)} r^{(l - 1)} + b_{i}^{(l)}),

where $W_{i :}^{(l)}$ is a $1 \times N^{(l - 1)}$ matrix of basal synaptic weights onto neuron $i$ , and $b_{i}$ is the corresponding bias.

The simplified ‘generative’ mode likewise replaces the branched neuron model used in the main text with a multilayer perceptron model. The highest layer, $r^{(L)}$ is sampled from an $N^{(L)}$ -dimensional independent standard normal distribution, $r^{(L)} \sim N (0, I)$ , and all subsequent layers are sampled from the distribution $r^{(l)} \sim N (μ (r^{(l + 1)}), σ_{p}^{2} I)$ , where for the ith neuron, $μ_{i} (r^{(l + 1)})$ is given by:

μ_{i} (r^{(l + 1)}) = \tanh (M_{i :}^{(l)} r^{(l + 1)} + a_{i}^{(l)}),

where $M_{i :}^{(l)}$ is a $1 \times N^{(l + 1)}$ matrix of apical synaptic weights onto neuron $i$ , and $a_{i}^{(l)}$ is the corresponding bias. As with the branched neuron model, inference and generative parameters were updated according to Equations 4 and 7, respectively. For optimization, we used the identical hyperparameters to the multicompartment neuron model (Supplementary file 1).

Data availability

Code for reproducing all results from Wake-Sleep-trained models in this study is available here: https://github.com/colinbredenberg/oneirogen-hypothesis, copy archived at Bredenberg, 2024. Code for reproducing results obtained with pretrained VDVAE models is available here: https://github.com/colinbredenberg/vdvae, copy archived at Bredenberg, 2025.

References

(1985) A learning algorithm for boltzmann machines
Cognitive Science 9:147–169.

https://doi.org/10.1016/S0364-0213(85)80012-4
- Google Scholar
1. Aghajanian GK
2. Marek GJ
(1997) Serotonin induces excitatory postsynaptic potentials in apical dendrites of neocortical pyramidal cells
Neuropharmacology 36:589–599.

https://doi.org/10.1016/s0028-3908(97)00051-8
- PubMed
- Google Scholar
1. Aghajanian GK
2. Marek GJ
(1999) Serotonin and hallucinogens
Neuropsychopharmacology 21:16S–23S.

https://doi.org/10.1016/S0893-133X(98)00135-3
- PubMed
- Google Scholar
1. Aime M
2. Calcini N
3. Borsa M
4. Campelo T
5. Rusterholz T
6. Sattin A
7. Fellin T
8. Adamantidis A
(2022) Paradoxical somatodendritic decoupling supports cortical plasticity during REM sleep
Science 376:724–730.

https://doi.org/10.1126/science.abk2734
- PubMed
- Google Scholar
(2020) DMT alters cortical travelling waves
eLife 9:e59784.

https://doi.org/10.7554/eLife.59784
- PubMed
- Google Scholar
1. Andrillon T
2. Nir Y
3. Cirelli C
4. Tononi G
5. Fried I
(2015) Single-neuron activity and eye movements during human REM sleep and awake vision
Nature Communications 6:7884.

https://doi.org/10.1038/ncomms8884
- PubMed
- Google Scholar
1. Aru J
2. Siclari F
3. Phillips WA
4. Storm JF
(2020) Apical drive—A cellular mechanism of dreaming?
Neuroscience & Biobehavioral Reviews 119:440–455.

https://doi.org/10.1016/j.neubiorev.2020.09.018
- PubMed
- Google Scholar
1. Arvanov VL
2. Liang X
3. Russo A
4. Wang RY
(1999) LSD and DOB: interaction with 5-HT2A receptors to inhibit NMDA receptor-mediated transmission in the rat prefrontal cortex
The European Journal of Neuroscience 11:3064–3072.

https://doi.org/10.1046/j.1460-9568.1999.00726.x
- PubMed
- Google Scholar
1. Azimi Z
2. Barzan R
3. Spoida K
4. Surdin T
5. Wollenweber P
6. Mark MD
7. Herlitze S
8. Jancke D
(2020) Separable gain control of ongoing and evoked activity in the visual cortex by serotonergic input
eLife 9:e53552.

https://doi.org/10.7554/eLife.53552
- PubMed
- Google Scholar
Preprint
(2016) End-to-end optimized image compression
arXiv.

https://arxiv.org/abs/1611.01704
- Google Scholar
1. Barbanoj MJ
2. Riba J
3. Clos S
4. Giménez S
5. Grasa E
6. Romero S
(2008) Daytime Ayahuasca administration modulates REM and slow-wave sleep in healthy volunteers
Psychopharmacology 196:315–326.

https://doi.org/10.1007/s00213-007-0963-0
- PubMed
- Google Scholar
Book
1. Bauml JA
2. Schaefer SB
(2016)
Peyote: History, Tradition, Politics, and Conservation

Bloomsbury Publishing.
- Google Scholar
(2019) Widespread and highly correlated somato-dendritic activity in cortical layer 5 neurons
Neuron 103:235–241.

https://doi.org/10.1016/j.neuron.2019.05.014
- PubMed
- Google Scholar
1. Bombardi C
2. Di Giovanni G
(2013) Functional anatomy of 5-HT2A receptors in the amygdala and hippocampal complex: relevance to memory functions
Experimental Brain Research 230:427–439.

https://doi.org/10.1007/s00221-013-3512-6
- PubMed
- Google Scholar
Conference
(2021)
Impression learning: Online representation learning with synaptic plasticity

Advances in Neural Information Processing Systems. pp. 11717–11729.
- Google Scholar
Software
1. Bredenberg C
(2024) Oneirogen-hypothesis, version swh:1:rev:40dbd6de2ca131ebe291b47fd7ff7ff786a38f34
Software Heritage.

https://archive.softwareheritage.org/swh:1:dir:b547c3ae54a24cf1ea0ce203758891e736b364da;origin=https://github.com/colinbredenberg/oneirogen-hypothesis;visit=swh:1:snp:e83aef202fdd2f78f8d359c297143dec2bfe4074;anchor=swh:1:rev:40dbd6de2ca131ebe291b47fd7ff7ff786a38f34
Conference
(2024) Formalizing locality for normative synaptic plasticity models
Advances in Neural Information Processing Systems.

https://doi.org/10.52202/075280-0247
- Google Scholar
1. Bredenberg C
2. Savin C
(2024) Desiderata for normative models of synaptic plasticity
Neural Computation 36:1245–1285.

https://doi.org/10.1162/neco_a_01671
- PubMed
- Google Scholar
Software
1. Bredenberg C
(2025) Vdvae, version swh:1:rev:919a2360c6df9cb429a13570a12deb5cdf647d9b
Software Heritage.

https://archive.softwareheritage.org/swh:1:dir:8d88f316dde49e573de164615b838f8d28811fcb;origin=https://github.com/colinbredenberg/vdvae;visit=swh:1:snp:b8f188b1e42580a3a699698e3d39adce97c41617;anchor=swh:1:rev:919a2360c6df9cb429a13570a12deb5cdf647d9b
(2001) Geometric visual hallucinations, Euclidean symmetry and the functional architecture of striate cortex
Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 356:299–330.

https://doi.org/10.1098/rstb.2000.0769
- PubMed
- Google Scholar
(2015) EQUIP Healthcare: An overview of a multi-component intervention to enhance equity-oriented care in primary health care settings
International Journal for Equity in Health 14:152.

https://doi.org/10.1186/s12939-015-0271-y
- PubMed
- Google Scholar
1. Carhart-Harris R
(2007) Waves of the unconscious: The neurophysiology of Dreamlike phenomena and its implications for the psychodynamic model of the mind
Neuropsychoanalysis 9:183–211.

https://doi.org/10.1080/15294145.2007.10773557
- Google Scholar
(2014) The entropic brain: a theory of conscious states informed by neuroimaging research with psychedelic drugs
Frontiers in Human Neuroscience 8:20.

https://doi.org/10.3389/fnhum.2014.00020
- PubMed
- Google Scholar
1. Carhart-Harris RL
2. Nutt DJ
(2017) Serotonin and brain function: a tale of two receptors
Journal of Psychopharmacology 31:1091–1120.

https://doi.org/10.1177/0269881117725915
- PubMed
- Google Scholar
1. Carhart-Harris RL
2. Friston KJ
(2019) REBUS and the anarchic brain: toward a unified model of the brain action of psychedelics
Pharmacological Reviews 71:316–344.

https://doi.org/10.1124/pr.118.017160
- PubMed
- Google Scholar
Preprint
1. Child R
(2020) Very deep vaes generalize autoregressive models and can outperform them on images
arXiv.

https://arxiv.org/abs/2011.10650
- Google Scholar
Preprint
1. Chung J
2. Gulcehre C
3. Cho K
4. Bengio Y
(2014) Empirical evaluation of gated recurrent neural networks on sequence modeling
arXiv.

https://arxiv.org/abs/1412.3555
- Google Scholar
(2009) From drugs to deprivation: a Bayesian framework for understanding models of psychosis
Psychopharmacology 206:515–530.

https://doi.org/10.1007/s00213-009-1561-0
- PubMed
- Google Scholar
Preprint
(2020) Learning to live with dale’s principle: ANNs with separate excitatory and inhibitory units
bioRxiv.

https://doi.org/10.1101/2020.11.02.364968
- Google Scholar
Preprint
1. Csikor F
2. Meszéna B
3. Szabó B
4. Orbán G
(2022) Top-down inference in an early visual cortex inspired hierarchical variational autoencoder
arXiv.

https://arxiv.org/abs/2206.00436
- Google Scholar
1. Dayan P
2. Hinton GE
3. Neal RM
4. Zemel RS
(1995) The Helmholtz machine
Neural Computation 7:889–904.

https://doi.org/10.1162/neco.1995.7.5.889
- PubMed
- Google Scholar
1. de Almeida J
2. Mengod G
(2007) Quantitative analysis of glutamatergic and GABAergic neurons expressing 5-HT(2A) receptors in human and monkey prefrontal cortex
Journal of Neurochemistry 103:475–486.

https://doi.org/10.1111/j.1471-4159.2007.04768.x
- PubMed
- Google Scholar
(2021) Prolonged epigenomic and synaptic plasticity alterations following single exposure to a psychedelic in mice
Cell Reports 37:109836.

https://doi.org/10.1016/j.celrep.2021.109836
- PubMed
- Google Scholar
(2015) Explicit memory creation during sleep demonstrates a causal role of place cells in navigation
Nature Neuroscience 18:493–495.

https://doi.org/10.1038/nn.3970
- PubMed
- Google Scholar
(2023) LSD-induced changes in the functional connectivity of distinct thalamic nuclei
NeuroImage 283:120414.

https://doi.org/10.1016/j.neuroimage.2023.120414
- PubMed
- Google Scholar
1. Deng L
(2012) The MNIST database of handwritten digit images for machine learning research
IEEE Signal Processing Magazine 29:141–142.

https://doi.org/10.1109/MSP.2012.2211477
- Google Scholar
1. Deuker L
2. Olligs J
3. Fell J
4. Kranz TA
5. Mormann F
6. Montag C
7. Reuter M
8. Elger CE
9. Axmacher N
(2013) Memory consolidation by replay of stimulus-specific neural activity
The Journal of Neuroscience 33:19373–19383.

https://doi.org/10.1523/JNEUROSCI.0414-13.2013
- PubMed
- Google Scholar
1. Diaz JL
(2010) Sacred plants and visionary consciousness
Phenomenology and the Cognitive Sciences 9:159–170.

https://doi.org/10.1007/s11097-010-9157-z
- Google Scholar
(2012) How does the brain solve visual object recognition?
Neuron 73:415–434.

https://doi.org/10.1016/j.neuron.2012.01.010
- PubMed
- Google Scholar
Report
(2018)
Report of the Montréal Declaration for a responsible development of artificial intelligence

Montréal Declaration Activity Report.
- Google Scholar
1. Domenico C
2. Haggerty D
3. Mou X
4. Ji D
(2021) LSD degrades hippocampal spatial representations and suppresses hippocampal-visual cortical interactions
Cell Reports 36:109714.

https://doi.org/10.1016/j.celrep.2021.109714
- PubMed
- Google Scholar
(2020) The effects of daytime psilocybin administration on sleep: implications for antidepressant action
Frontiers in Pharmacology 11:602590.

https://doi.org/10.3389/fphar.2020.602590
- PubMed
- Google Scholar
1. Ermentrout GB
2. Cowan JD
(1979) A mathematical theory of visual hallucination patterns
Biological Cybernetics 34:137–150.

https://doi.org/10.1007/BF00336965
- PubMed
- Google Scholar
(2007) Fast-forward playback of recent memory sequences in prefrontal cortex during sleep
Science 318:1147–1150.

https://doi.org/10.1126/science.1148979
- PubMed
- Google Scholar
(1955) Some effects of lysergic acid diethylamide and bufotenine on electrical activity in the cat’s visual system
American Journal of Physiology-Legacy Content 182:594–598.

https://doi.org/10.1152/ajplegacy.1955.182.3.594
- PubMed
- Google Scholar
(2008) Noise in the nervous system
Nature Reviews. Neuroscience 9:292–303.

https://doi.org/10.1038/nrn2258
- PubMed
- Google Scholar
(2019) Long-duration hippocampal sharp wave ripples improve memory
Science 364:1082–1086.

https://doi.org/10.1126/science.aax0758
- PubMed
- Google Scholar
1. Foster DJ
(2017) Replay comes of age
Annual Review of Neuroscience 40:581–602.

https://doi.org/10.1146/annurev-neuro-072116-031538
- PubMed
- Google Scholar
1. Froemke RC
2. Letzkus JJ
3. Kampa BM
4. Hang GB
5. Stuart GJ
(2010) Dendritic synapse location and neocortical spike-timing-dependent plasticity
Frontiers in Synaptic Neuroscience 2:29.

https://doi.org/10.3389/fnsyn.2010.00029
- PubMed
- Google Scholar
(2022) Psilocybin induces spatially constrained alterations in thalamic functional organizaton and connectivity
NeuroImage 260:119434.

https://doi.org/10.1016/j.neuroimage.2022.119434
- PubMed
- Google Scholar
Preprint
1. George TM
2. Barry C
3. Stachenfeld K
4. Clopath C
5. Fukai T
(2024) A generative model of the hippocampal formation trained with theta driven local learning rules
bioRxiv.

https://doi.org/10.1101/2023.12.12.571268
- Google Scholar
1. Gilra A
2. Gerstner W
(2017) Predicting non-linear dynamics by stable local learning in a recurrent spiking neural network
eLife 6:e28295.

https://doi.org/10.7554/eLife.28295
- PubMed
- Google Scholar
(2009) Selective suppression of hippocampal ripples impairs spatial memory
Nature Neuroscience 12:1222–1223.

https://doi.org/10.1038/nn.2384
- PubMed
- Google Scholar
1. Goodfellow I
2. Pouget-Abadie J
3. Mirza M
4. Xu B
5. Warde-Farley D
6. Ozair S
7. Courville A
8. Bengio Y
(2020) Generative adversarial networks
Communications of the ACM 63:139–144.

https://doi.org/10.1145/3422622
- Google Scholar
(2003) The pharmacology and clinical pharmacology of 3,4-methylenedioxymethamphetamine (MDMA, “ecstasy”)
Pharmacological Reviews 55:463–508.

https://doi.org/10.1124/pr.55.3.3
- PubMed
- Google Scholar
1. Grieco SF
2. Castrén E
3. Knudsen GM
4. Kwan AC
5. Olson DE
6. Zuo Y
7. Holmes TC
8. Xu X
(2022) Psychedelics and neural plasticity: therapeutic implications
The Journal of Neuroscience 42:8439–8449.

https://doi.org/10.1523/JNEUROSCI.1121-22.2022
- PubMed
- Google Scholar
(2017) Towards deep learning with segregated dendrites
eLife 6:e22901.

https://doi.org/10.7554/eLife.22901
- PubMed
- Google Scholar
Preprint
(2025) Electrophysiological mechanisms of psychedelic drugs: a systematic review
bioRxiv.

https://doi.org/10.1101/2025.07.05.663289
- Google Scholar
Conference
1. Higgins I
2. Matthey L
3. Pal A
4. Burgess CP
5. Glorot X
6. Botvinick MM
7. Mohamed S
8. beta-vae LA
(2017)
Learning basic visual concepts with a constrained variational framework

ICLR.
- Google Scholar
1. Hinton GE
2. Dayan P
3. Frey BJ
4. Neal RM
(1995) The “wake-sleep” algorithm for unsupervised neural networks
Science 268:1158–1161.

https://doi.org/10.1126/science.7761831
- PubMed
- Google Scholar
1. Hochreiter S
2. Schmidhuber J
(1997) Long short-term memory
Neural Computation 9:1735–1780.

https://doi.org/10.1162/neco.1997.9.8.1735
- PubMed
- Google Scholar
1. Hoffman KL
2. McNaughton BL
(2002) Coordinated reactivation of distributed memory traces in primate neocortex
Science 297:2070–2073.

https://doi.org/10.1126/science.1073538
- PubMed
- Google Scholar
Preprint
(2024) The serotonergic psychedelic doi impairs deviance detection in the auditory cortex
bioRxiv.

https://doi.org/10.1101/2024.09.06.611733
- Google Scholar
Conference
1. Ikeda S
2. Si A
3. Nakahara H
(1998)
Convergence of the wake-sleep algorithm

Advances in Neural Information Processing Systems.
- Google Scholar
Preprint
1. Ioffe S
(2015) Batch normalization: accelerating deep network training by reducing internal covariate shift
arXiv.

https://arxiv.org/abs/1502.03167
- Google Scholar
1. Islah N
2. Etter G
3. Tugsbayar M
4. Gurbuz BT
5. Richards B
6. Muller EB
(2025) Learning to combine top-down context and feed-forward representations under ambiguity with apical and basal dendrites
Cerebral Cortex 35:bhaf134.

https://doi.org/10.1093/cercor/bhaf134
- Google Scholar
Conference
1. Issa E
2. Toosi T
(2024) Brain-like flexible visual inference by harnessing feedback feedforward alignment
Advances in Neural Information Processing Systems. pp. 56979–56997.

https://doi.org/10.52202/075280-2491
- Google Scholar
1. Jakab RL
2. Goldman-Rakic PS
(1998) 5-Hydroxytryptamine _2A serotonin receptors in the primate cerebral cortex: Possible site of action of hallucinogenic and antipsychotic drugs in pyramidal cell apical dendrites
PNAS 95:735–740.

https://doi.org/10.1073/pnas.95.2.735
- PubMed
- Google Scholar
1. Ji D
2. Wilson MA
(2007) Coordinated memory replay in the visual cortex and hippocampus during sleep
Nature Neuroscience 10:100–107.

https://doi.org/10.1038/nn1825
- PubMed
- Google Scholar
1. Jiang X
2. Xing G
3. Yang C
4. Verma A
5. Zhang L
6. Li H
(2009) Stress impairs 5-HT2A receptor-mediated serotonergic facilitation of GABA release in juvenile rat basolateral amygdala
Neuropsychopharmacology 34:410–423.

https://doi.org/10.1038/npp.2008.71
- PubMed
- Google Scholar
Preprint
1. Juliani A
2. Chelu V
3. Graesser L
4. Safron A
(2024) A dual-receptor model of serotonergic psychedelics: therapeutic insights from simulated cortical dynamics
bioRxiv.

https://doi.org/10.1101/2024.04.12.589282
- Google Scholar
Conference
1. Kadkhodaie Z
2. Simoncelli E
(2021)
Stochastic solutions for linear inverse problems using the prior implicit in a denoiser

Advances in Neural Information Processing Systems. pp. 13242–13254.
- Google Scholar
Conference
1. Karras T
2. Laine S
3. Aila T
(2019) A style-based generator architecture for generative adversarial networks
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 4401–4410.

https://doi.org/10.1109/CVPR.2019.00453
- Google Scholar
Preprint
(2023) Traveling waves encode the recent past and enhance sequence learning
arXiv.

https://arxiv.org/abs/2309.08045
- Google Scholar
Conference
1. Keller TA
2. Welling M
(2023)
Neural wave machines: learning spatiotemporally structured representations with locally coupled oscillatory recurrent neural networks

International Conference on Machine Learning. pp. 16168–16189.
- Google Scholar
1. Kelly TJ
2. Bonniwell EM
3. Mu L
4. Liu X
5. Hu Y
6. Friedman V
7. Yu H
8. Su W
9. McCorvy JD
10. Liu Q-S
(2024) Psilocybin analog 4-OH-DiPT enhances fear extinction and GABAergic inhibition of principal neurons in the basolateral amygdala
Neuropsychopharmacology 49:854–863.

https://doi.org/10.1038/s41386-023-01744-8
- PubMed
- Google Scholar
(2003) Spontaneously emerging cortical representations of visual attributes
Nature 425:954–956.

https://doi.org/10.1038/nature02078
- PubMed
- Google Scholar
Preprint
1. Kingma DP
2. Welling M
(2013) Auto-encoding variational bayes
arXiv.

https://arxiv.org/abs/1312.6114
- Google Scholar
Book
1. Kirby KG
(2006)
A Tutorial on Helmholtz Machines

Department of Computer Science, Northern Kentucky University.
- Google Scholar
1. Körding KP
2. König P
(2001) Supervised and unsupervised learning with two sites of synaptic integration
Journal of Computational Neuroscience 11:207–215.

https://doi.org/10.1023/a:1013776130161
- PubMed
- Google Scholar
(2017) Dreamlike effects of LSD on waking imagery in humans depend on serotonin 2A receptor activation
Psychopharmacology 234:2031–2046.

https://doi.org/10.1007/s00213-017-4610-0
- PubMed
- Google Scholar
(2020) Reviewing the Potential of Psychedelics for the Treatment of PTSD
The International Journal of Neuropsychopharmacology 23:385–400.

https://doi.org/10.1093/ijnp/pyaa018
- PubMed
- Google Scholar
Website
1. Krizhevsky A
2. Hinton G
(2009) Learning multiple layers of features from tiny images
Accessed April 8, 2009.

http://www.cs.utoronto.ca/~kriz/learning-features-2009-TR.pdf
1. Larkum M
(2013) A cellular mechanism for cortical associations: an organizing principle for the cerebral cortex
Trends in Neurosciences 36:141–151.

https://doi.org/10.1016/j.tins.2012.11.006
- PubMed
- Google Scholar
(2016) LSD-induced entropic brain activity predicts subsequent personality change
Human Brain Mapping 37:3203–3213.

https://doi.org/10.1002/hbm.23234
- PubMed
- Google Scholar
1. Lee AK
2. Wilson MA
(2002) Memory of sequential experience in the hippocampus during slow wave sleep
Neuron 36:1183–1194.

https://doi.org/10.1016/s0896-6273(02)01096-6
- PubMed
- Google Scholar
1. Lesch KP
2. Waider J
(2012) Serotonin in the modulation of neural plasticity and networks: implications for neurodevelopmental disorders
Neuron 76:175–191.

https://doi.org/10.1016/j.neuron.2012.09.013
- PubMed
- Google Scholar
1. Levenstein D
2. Alvarez VA
3. Amarasingham A
4. Azab H
5. Chen ZS
6. Gerkin RC
7. Hasenstaub A
8. Iyer R
9. Jolivet RB
10. Marzen S
11. Monaco JD
12. Prinz AA
13. Quraishi S
14. Santamaria F
15. Shivkumar S
16. Singh MF
17. Traub R
18. Nadim F
19. Rotstein HG
20. Redish AD
(2023) On the role of theory and modeling in neuroscience
The Journal of Neuroscience 43:1074–1088.

https://doi.org/10.1523/JNEUROSCI.1179-22.2022
- PubMed
- Google Scholar
1. Li W
2. Ma L
3. Yang G
4. Gan WB
(2017) REM sleep selectively prunes and maintains new synapses in development and learning
Nature Neuroscience 20:427–437.

https://doi.org/10.1038/nn.4479
- PubMed
- Google Scholar
(2020) Backpropagation and the brain
Nature Reviews. Neuroscience 21:335–346.

https://doi.org/10.1038/s41583-020-0277-3
- PubMed
- Google Scholar
1. Lindsay GW
(2020) Attention in psychology, neuroscience, and machine learning
Frontiers in Computational Neuroscience 14:29.

https://doi.org/10.3389/fncom.2020.00029
- PubMed
- Google Scholar
1. Louie K
2. Wilson MA
(2001) Temporally structured replay of awake hippocampal ensemble activity during rapid eye movement sleep
Neuron 29:145–156.

https://doi.org/10.1016/s0896-6273(01)00186-6
- PubMed
- Google Scholar
(2016) Hippocampo-cortical coupling mediates memory consolidation during sleep
Nature Neuroscience 19:959–964.

https://doi.org/10.1038/nn.4304
- PubMed
- Google Scholar
(2002) Re-evaluation of lisuride pharmacology: 5-hydroxytryptamine1A receptor-mediated behavioral effects overlap its other properties in rats
Psychopharmacology 164:93–107.

https://doi.org/10.1007/s00213-002-1141-z
- PubMed
- Google Scholar
1. Mediano PAM
2. Rosas FE
3. Timmermann C
4. Roseman L
5. Nutt DJ
6. Feilding A
7. Kaelen M
8. Kringelbach ML
9. Barrett AB
10. Seth AK
11. Muthukumaraswamy S
12. Bor D
13. Carhart-Harris RL
(2024) Effects of external stimulation on psychedelic state neurodynamics
ACS Chemical Neuroscience 15:462–471.

https://doi.org/10.1021/acschemneuro.3c00289
- PubMed
- Google Scholar
Conference
(2021)
Credit assignment in neural networks through deep feedback control

Advances in Neural Information Processing Systems. pp. 4674–4687.
- Google Scholar
(2019) A hallucinogenic serotonin-2A receptor agonist reduces visual response gain and alters temporal dynamics in mouse V1
Cell Reports 26:3475–3483.

https://doi.org/10.1016/j.celrep.2019.02.104
- PubMed
- Google Scholar
1. Miyamoto D
2. Hirai D
3. Fung CCA
4. Inutsuka A
5. Odagawa M
6. Suzuki T
7. Boehringer R
8. Adaikkan C
9. Matsubara C
10. Matsuki N
11. Fukai T
12. McHugh TJ
13. Yamanaka A
14. Murayama M
(2016) Top-down cortical input during NREM sleep consolidates perceptual memory
Science 352:1315–1318.

https://doi.org/10.1126/science.aaf0902
- PubMed
- Google Scholar
1. Munn BR
2. Müller EJ
3. Medel V
4. Naismith SL
5. Lizier JT
6. Sanders RD
7. Shine JM
(2023) Neuronal connected burst cascades bridge macroscale adaptive signatures across arousal states
Nature Communications 14:6846.

https://doi.org/10.1038/s41467-023-42465-2
- PubMed
- Google Scholar
(2019) Classical psychedelics for the treatment of depression and anxiety: A systematic review
Journal of Affective Disorders 258:11–24.

https://doi.org/10.1016/j.jad.2019.07.076
- PubMed
- Google Scholar
(1999) Replay and time compression of recurring spike sequences in the hippocampus
The Journal of Neuroscience 19:9497–9507.

https://doi.org/10.1523/JNEUROSCI.19-21-09497.1999
- PubMed
- Google Scholar
1. Nardou R
2. Sawyer E
3. Song YJ
4. Wilkinson M
5. Padovan-Hernandez Y
6. de Deus JL
7. Wright N
8. Lama C
9. Faltin S
10. Goff LA
11. Stein-O’Brien GL
12. Dölen G
(2023) Psychedelics reopen the social reward learning critical period
Nature 618:790–798.

https://doi.org/10.1038/s41586-023-06204-3
- PubMed
- Google Scholar
(1985) On the 50th anniversary of Dale’s law: multiple neurotransmitter neurons
Trends in Pharmacological Sciences 6:305–308.

https://doi.org/10.1016/0165-6147(85)90141-5
- Google Scholar
Preprint
1. O’Hare JK
2. Wang J
3. Shala MD
4. Polleux F
5. Losonczy A
(2024) Distal tuft dendrites shape and maintain new place fields
bioRxiv.

https://doi.org/10.1101/2024.02.26.582144
- Google Scholar
1. Payeur A
2. Guerguiev J
3. Zenke F
4. Richards BA
5. Naud R
(2021) Burst-dependent synaptic plasticity can coordinate learning in hierarchical circuits
Nature Neuroscience 24:1010–1019.

https://doi.org/10.1038/s41593-021-00857-x
- PubMed
- Google Scholar
(2009) Replay of rule-learning related neural patterns in the prefrontal cortex during sleep
Nature Neuroscience 12:919–926.

https://doi.org/10.1038/nn.2337
- PubMed
- Google Scholar
Conference
1. Podlaski B
2. Machens CK
(2020)
Biological credit assignment through dynamic inversion of feedforward networks

Advances in Neural Information Processing Systems. pp. 10065–10076.
- Google Scholar
Conference
(2021)
Towards biologically plausible convolutional networks

Advances in Neural Information Processing Systems. pp. 13924–13936.
- Google Scholar
(2003) Pyramidal neuron as two-layer neural network
Neuron 37:989–999.

https://doi.org/10.1016/s0896-6273(03)00149-1
- PubMed
- Google Scholar
Conference
1. Preller KH
2. Vollenweider FX
(2018) Phenomenology, structure, and dynamic of psychedelic states
Behavioral Neurobiology of Psychedelic Drugs. pp. 221–256.

https://doi.org/10.1007/7854_2016_459
- Google Scholar
(2022) Psychedelics and schizophrenia: distinct alterations to bayesian inference
NeuroImage 263:119624.

https://doi.org/10.1016/j.neuroimage.2022.119624
- PubMed
- Google Scholar
1. Rao RP
2. Ballard DH
(1999) Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects
Nature Neuroscience 2:79–87.

https://doi.org/10.1038/4580
- PubMed
- Google Scholar
1. Rea K
2. Wallace B
(2021) Enhancing equity-oriented care in psychedelic medicine: Utilizing the EQUIP framework
The International Journal on Drug Policy 98:103429.

https://doi.org/10.1016/j.drugpo.2021.103429
- PubMed
- Google Scholar
(2013) Charles Bonnet syndrome: evidence for a generative model in the cortex?
PLOS Computational Biology 9:e1003134.

https://doi.org/10.1371/journal.pcbi.1003134
- PubMed
- Google Scholar
Conference
(2014)
Stochastic backpropagation and approximate inference in deep generative models

International conference on machine learning PMLR. pp. 1278–1286.
- Google Scholar
1. Richards BA
2. Lillicrap TP
(2019) Dendritic solutions to the credit assignment problem
Current Opinion in Neurobiology 54:28–36.

https://doi.org/10.1016/j.conb.2018.08.003
- PubMed
- Google Scholar
1. Roelfsema PR
2. van Ooyen A
(2005) Attention-gated reinforcement learning of internal representations for classification
Neural Computation 17:2176–2214.

https://doi.org/10.1162/0899766054615699
- PubMed
- Google Scholar
Conference
1. Rombach R
2. Blattmann A
3. Lorenz D
4. Esser P
5. Ommer B
(2022) High-Resolution Image Synthesis with Latent Diffusion Models
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 10684–10695.

https://doi.org/10.1109/CVPR52688.2022.01042
- Google Scholar
Conference
(2018)
Dendritic cortical microcircuits approximate the backpropagation algorithm

Advances in Neural Information Processing Systems.
- Google Scholar
1. Samorini G
(2019) The oldest archeological data evidencing the relationship of Homo sapiens with psychoactive plants: A worldwide overview
Journal of Psychedelic Studies 3:63–80.

https://doi.org/10.1556/2054.2019.008
- Google Scholar
(2017) Cortical dendritic activity correlates with spindle-rich oscillations during sleep in rodents
Nature Communications 8:684.

https://doi.org/10.1038/s41467-017-00735-w
- PubMed
- Google Scholar
1. Shanon B
(2002)
Ayahuasca visualizations a structural typology

Journal of Consciousness Studies 9:3–30.
- Google Scholar
1. Shao LX
2. Liao C
3. Gregg I
4. Davoudian PA
5. Savalia NK
6. Delagarza K
7. Kwan AC
(2021) Psilocybin induces rapid and persistent growth of dendritic spines in frontal cortex in vivo
Neuron 109:2535–2544.

https://doi.org/10.1016/j.neuron.2021.06.008
- PubMed
- Google Scholar
1. Siegel JS
2. Subramanian S
3. Perry D
4. Kay BP
5. Gordon EM
6. Laumann TO
7. Reneau TR
8. Metcalf NV
9. Chacko RV
10. Gratton C
11. Horan C
12. Krimmel SR
13. Shimony JS
14. Schweiger JA
15. Wong DF
16. Bender DA
17. Scheidter KM
18. Whiting FI
19. Padawer-Curry JA
20. Shinohara RT
21. Chen Y
22. Moser J
23. Yacoub E
24. Nelson SM
25. Vizioli L
26. Fair DA
27. Lenze EJ
28. Carhart-Harris R
29. Raison CL
30. Raichle ME
31. Snyder AZ
32. Nicol GE
33. Dosenbach NUF
(2024) Psilocybin desynchronizes the human brain
Nature 632:131–138.

https://doi.org/10.1038/s41586-024-07624-5
- PubMed
- Google Scholar
1. Simoncelli EP
2. Olshausen BA
(2001) Natural image statistics and neural representation
Annual Review of Neuroscience 24:1193–1216.

https://doi.org/10.1146/annurev.neuro.24.1.1193
- PubMed
- Google Scholar
1. Simoncelli EP
(2003) Vision and the statistics of the visual environment
Current Opinion in Neurobiology 13:144–149.

https://doi.org/10.1016/S0959-4388(03)00047-3
- PubMed
- Google Scholar
Software
1. Simoncelli E
2. Young R
3. Broderick W
4. Fiquet P
5. Wang Z
6. Kadkhodaie Z
7. Parthasarathy N
8. Ward B
(2025) Pyrtools: tools for multi-scale image processing
Zenodo.

https://doi.org/10.5281/zenodo.15127019
(2022) A model of autonomous interactions between hippocampus and neocortex driving sleep-dependent memory consolidation
PNAS 119:e2123432119.

https://doi.org/10.1073/pnas.2123432119
- PubMed
- Google Scholar
1. Sjöström PJ
2. Häusser M
(2006) A cooperative switch determines the sign of synaptic plasticity in distal dendrites of neocortical pyramidal neurons
Neuron 51:227–238.

https://doi.org/10.1016/j.neuron.2006.06.017
- PubMed
- Google Scholar
Conference
(2016)
Ladder variational autoencoders

Advances in neural information processing systems.
- Google Scholar
1. Stickgold R
2. Hobson JA
3. Fosse R
4. Fosse M
(2001) Sleep, learning, and dreams: off-line memory reprocessing
Science 294:1052–1057.

https://doi.org/10.1126/science.1063530
- PubMed
- Google Scholar
(2022) Research abuses against people of colour and other vulnerable groups in early psychedelic research
Journal of Medical Ethics 48:728.

https://doi.org/10.1136/medethics-2021-107262
- PubMed
- Google Scholar
(2017) A deep-dream virtual reality platform for studying altered perceptual phenomenology
Scientific Reports 7:15982.

https://doi.org/10.1038/s41598-017-16316-2
- PubMed
- Google Scholar
(2022) Psilocin acutely alters sleep-wake architecture and cortical brain activity in laboratory mice
Translational Psychiatry 12:77.

https://doi.org/10.1038/s41398-022-01846-9
- Google Scholar
1. Tiwari P
2. Davoudian PA
3. Kapri D
4. Vuruputuri RM
5. Karaba LA
6. Sharma M
7. Zanni G
8. Balakrishnan A
9. Chaudhari PR
10. Pradhan A
11. Suryavanshi S
12. Bath KG
13. Ansorge MS
14. Fernandez-Ruiz A
15. Kwan AC
16. Vaidya VA
(2024) Ventral hippocampal parvalbumin interneurons gate the acute anxiolytic action of the serotonergic psychedelic DOI
Neuron 112:3697–3714.

https://doi.org/10.1016/j.neuron.2024.08.016
- PubMed
- Google Scholar
1. Urbanczik R
2. Senn W
(2014) Learning by the dendritic prediction of somatic spiking
Neuron 81:521–528.

https://doi.org/10.1016/j.neuron.2013.11.030
- PubMed
- Google Scholar
Conference
1. Vahdat A
2. Kautz J
(2020)
NVAE: A deep hierarchical variational autoencoder

Advances in Neural Information Processing Systems. pp. 19667–19679.
- Google Scholar
1. Vargas MV
2. Dunlap LE
3. Dong C
4. Carter SJ
5. Tombari RJ
6. Jami SA
7. Cameron LP
8. Patel SD
9. Hennessey JJ
10. Saeger HN
11. McCorvy JD
12. Gray JA
13. Tian L
14. Olson DE
(2023) Psychedelics promote neuroplasticity through the activation of intracellular 5-HT2A receptors
Science 379:700–706.

https://doi.org/10.1126/science.adf0435
- PubMed
- Google Scholar
1. Vincent P
(2011) A connection between score matching and denoising autoencoders
Neural Computation 23:1661–1674.

https://doi.org/10.1162/NECO_a_00142
- PubMed
- Google Scholar
1. Walker MP
2. Stickgold R
(2004) Sleep-dependent learning and memory consolidation
Neuron 44:121–133.

https://doi.org/10.1016/j.neuron.2004.08.031
- PubMed
- Google Scholar
(2024) Thalamic contributions to the state and contents of consciousness
Neuron 112:1611–1625.

https://doi.org/10.1016/j.neuron.2024.04.019
- PubMed
- Google Scholar
1. Widrow B
2. Lehr MA
(1990) 30 years of adaptive neural networks: perceptron, Madaline, and backpropagation
Proceedings of the IEEE 78:1415–1442.

https://doi.org/10.1109/5.58323
- Google Scholar
Conference
(2023)
Flexible phase dynamics for bio-plausible contrastive learning

International Conference on Machine Learning. pp. 37042–37065.
- Google Scholar
Report
1. Wu J
2. Zhang Q
3. Xu G
(2017)
Tiny imagenet challenge

Technical Report.
- Google Scholar
1. Xu S
2. Jiang W
3. Poo M
4. Dan Y
(2012) Activity recall in a visual cortical ensemble
Nature Neuroscience 15:449–455.

https://doi.org/10.1038/nn.3036
- PubMed
- Google Scholar
1. Zhou Y
2. Lai CSW
3. Bai Y
4. Li W
5. Zhao R
6. Yang G
7. Frank MG
8. Gan WB
(2020) REM sleep promotes experience-dependent dendritic spine elimination in the mouse cortex
Nature Communications 11:4819.

https://doi.org/10.1038/s41467-020-18592-5
- PubMed
- Google Scholar

Article and author information

Author details

Colin Bredenberg
1. Mila - Quebec AI Institute, Montreal, Canada
2. University of Montreal, Montreal, Canada
Contribution
Conceptualization, Software, Formal analysis, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing

For correspondence
colin.bredenberg@mila.quebec

Competing interests
No competing interests declared

"This ORCID iD identifies the author of this article:" 0000-0002-9749-9228
Fabrice Normandin

Mila - Quebec AI Institute, Montreal, Canada

Contribution
Software, Validation, Writing – review and editing

Competing interests
No competing interests declared
Blake Richards
1. Mila - Quebec AI Institute, Montreal, Canada
2. McGill University, Montreal, Canada
Contribution
Conceptualization, Supervision, Funding acquisition, Writing – review and editing

Contributed equally with
Guillaume Lajoie

Competing interests
is employed by Google Paradigms of Intelligence

"This ORCID iD identifies the author of this article:" 0000-0001-9662-2151
Guillaume Lajoie
1. Mila - Quebec AI Institute, Montreal, Canada
2. University of Montreal, Montreal, Canada
Contribution
Conceptualization, Supervision, Funding acquisition, Writing – review and editing

Contributed equally with
Blake Richards

Competing interests
is a visiting researcher at Google Paradigms of Intelligence

"This ORCID iD identifies the author of this article:" 0000-0003-2730-7291

Funding

Natural Sciences and Engineering Research Council of Canada (RGPIN-2018-04821)

Guillaume Lajoie

CIFAR AI Chair Program

Blake Richards
Guillaume Lajoie

Canada Research Chair in Neural Computations and Interfacing

Guillaume Lajoie

Natural Sciences and Engineering Research Council of Canada (RGPIN-2020-05105)

Blake Richards

Natural Sciences and Engineering Research Council of Canada (RGPAS-2020-00031)

Blake Richards

Arthur B. McDonald Fellowship (566355-2022)

Blake Richards

CIFAR Learning in Machine and Brains Fellowship

Blake Richards

FRQNT Strategic Clusters Program

Colin Bredenberg

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We would like to thank members of both GL and BR’s labs, as well as James M Shine, Brandon Munn, Christopher Whyte, Veronica Chelu, Jiameng Wu, Matthew Larkum, Santiago Jaramillo, Michael Wehr, Neil Savalia, Alexandra Klein, Sarah Cook, Conor Lane, Anousheh Bakhti-Suroosh, Runchong Wang, Michael Okun, and Jordan O’Byrne for insightful discussions and feedback. This work was supported by: [GL] NSERC Discovery Grant (RGPIN-2018-04821), Canada CIFAR AI Chair Program, Canada Research Chair in Neural Computations and Interfacing (CIHR, tier 2). [BR] NSERC (Discovery Grant:RGPIN-2020-05105; Discovery Accelerator Supplement: RGPAS-2020-00031; Arthur B McDonald Fellowship: 566355-2022) and CIFAR (Canada AI Chair; Learning in Machine and Brains Fellowship). [CB] is supported in part by the FRQNT Strategic Clusters Program (Centre UNIQUE - Quebec Neuro-AI Research Center). The authors acknowledge the material support of NVIDIA in the form of computational resources.

Version history

Preprint posted: January 13, 2025
Sent for peer review: February 3, 2025
Reviewed Preprint version 1: June 6, 2025
Reviewed Preprint version 2: January 13, 2026
Version of Record published: April 21, 2026

Cite all versions

You can cite all versions using the DOI https://doi.org/10.7554/eLife.105968. This DOI represents all versions, and will always resolve to the latest one.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.