Learning cortical representations through perturbed and adversarial dreaming

  1. Nicolas Deperrois (corresponding author)
  2. Mihai A Petrovici
  3. Walter Senn
  4. Jakob Jordan

  1. Department of Physiology, University of Bern, Switzerland
  2. Kirchhoff-Institute for Physics, Heidelberg University, Germany
15 figures, 1 table and 1 additional file

Figures

Cortical representation learning through perturbed and adversarial dreaming (PAD).

(a) During wakefulness (Wake), cortical feedforward pathways learn to recognize that low-level activity is externally driven and feedback pathways learn to reconstruct it from high-level neuronal representations. These high-level representations are stored in the hippocampus. (b) During non-rapid eye movement sleep (NREM), feedforward pathways learn to reconstruct high-level activity patterns replayed from the hippocampus affected by low-level perturbations, referred to as perturbed dreaming. (c) During rapid eye movement sleep (REM), feedforward and feedback pathways operate in an adversarial fashion, referred to as adversarial dreaming. Feedback pathways generate virtual low-level activity from combinations of multiple hippocampal memories and spontaneous cortical activity. While feedforward pathways learn to recognize low-level activity patterns as internally generated, feedback pathways learn to fool feedforward pathways.

Different objectives during wakefulness, non-rapid eye movement (NREM), and rapid eye movement (REM) sleep govern the organization of feedforward and feedback pathways in perturbed and adversarial dreaming (PAD).

The variable x corresponds to a 32 × 32 image, and z is a 256-dimensional vector representing the latent layer (higher sensory cortex). Encoder (E, green) and generator (G, blue) networks project bottom-up and top-down signals between lower and higher sensory areas. An oblique arrow indicates that learning occurs in a given pathway. (a) During Wake, low-level activities x are reconstructed. At the same time, E learns to classify low-level activity as external (red target ‘external!’) with its output discriminator d. The obtained latent representations z are stored in the hippocampus. (b) During NREM, the activity z stored during wakefulness is replayed from the hippocampal memory and regenerates visual input from the previous day perturbed by occlusions, modeled by squares of various sizes applied along the generated low-level activity with a certain probability (see Materials and methods). In this phase, E adapts to reproduce the replayed latent activity. (c) During REM, convex combinations of multiple random hippocampal memories (z and zold) and spontaneous cortical activity (ϵ), here with specific prefactors, generate a virtual activity in lower areas. While the encoder learns to classify this activity as internal (red target ‘internal!’), the generator adversarially learns to generate visual inputs that would be classified as external. The red minus on G indicates the inverted plasticity implementing this adversarial training.
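The REM mixing step in panel (c) can be sketched as a convex combination of two stored latent vectors and a noise vector. This is a minimal illustration and not the authors' implementation: the function name `mix_latents`, the default weights, and the Gaussian noise are assumptions, since the caption does not state the prefactors.

```python
import random


def mix_latents(z, z_old, eps, weights=(0.25, 0.25, 0.5)):
    """Convex combination of two stored latents and a noise vector.

    `weights` are hypothetical placeholders for the prefactors on
    (z, z_old, eps); they must be non-negative and sum to 1.
    """
    if min(weights) < 0 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    w_z, w_old, w_eps = weights
    return [w_z * a + w_old * b + w_eps * e for a, b, e in zip(z, z_old, eps)]


rng = random.Random(0)
dim = 256  # latent dimension given in the caption

# Stand-ins for two hippocampal memories and spontaneous cortical activity.
z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
z_old = [rng.gauss(0.0, 1.0) for _ in range(dim)]
eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]

z_mix = mix_latents(z, z_old, eps)  # would be passed to the generator G
```

Because the weights sum to 1, the mixed vector stays within the span of its three inputs, which is what makes the combination convex.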

Both non-rapid eye movement (NREM) and rapid eye movement (REM) dreams become more realistic over the course of learning.

(a) Examples of sensory inputs observed during wakefulness. Their corresponding latent representations are stored in the hippocampus. (b, c) Single episodic memories (latent representations of stimuli) during NREM from the previous day and combinations of episodic memories from the two previous days during REM are recalled from hippocampus and generate early sensory activity via feedback pathways. This activity is shown for early (epoch 1) and late (epoch 50) training stages of the model. (d) Discrepancy between externally driven and internally generated early sensory activity as measured by the Fréchet inception distance (FID) (Heusel et al., 2018) during NREM and REM for networks trained on CIFAR-10 (top) and SVHN (bottom). Lower distance reflects higher similarity between sensory-evoked and generated activity. Error bars indicate ±1 SEM over four different initial conditions.
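For reference, the Fréchet distance used in panel (d) compares two Gaussians fitted to network feature statistics and has the standard form below. The symbols are notation introduced here rather than taken from the caption: μ_w, Σ_w denote the mean and covariance of features for sensory-evoked inputs, and μ_g, Σ_g those for generated activity.

```latex
\mathrm{FID} = \lVert \mu_w - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_w + \Sigma_g - 2\,(\Sigma_w \Sigma_g)^{1/2} \right)
```

The distance is zero when both feature distributions share the same mean and covariance, which is why lower values indicate more realistic generated activity.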

Adversarial dreaming during rapid eye movement (REM) improves the linear separability of the latent representation.

(a) A linear classifier is trained on the latent representations z inferred from an external input x to predict its associated label (here, the category ‘car’). (b) Training phases and pathological conditions: full model (perturbed and adversarial dreaming [PAD], black), no REM phase (pink) and PAD with a REM phase using a single episodic memory only (‘w/o memory mix’, purple). (c, d) Classification accuracy obtained on test datasets (c: CIFAR-10; d: SVHN) after training the linear classifier to convergence on the latent space z for each epoch of the E-G-network learning. Full model (PAD): black line; without REM: pink line; with REM, but without memory mix: purple line. Solid lines represent mean, and shaded areas indicate ±1 SEM over four different initial conditions.

Perturbed dreaming during non-rapid eye movement (NREM) improves robustness of latent representations.

(a) A trained linear classifier (Figure 4) infers class labels from latent representations. The classifier was trained on latent representations of original images, but evaluated on representations of images with varying levels of occlusion. (b) Training phases and pathological conditions: full model (perturbed and adversarial dreaming [PAD], black), without NREM phase (w/o NREM, orange). (c, d) Classification accuracy obtained on the test dataset (c: CIFAR-10; d: SVHN) after 50 epochs for different levels of occlusion (0% to 100%). Full model (PAD): black line; w/o NREM: orange line. Error bars (SEM over four different initial conditions) overlap with the data points. Note that, due to an unbalanced distribution of samples, the highest performance of a naive classifier is 18.9% for the SVHN dataset.
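The naive-classifier baseline mentioned in the note can be read as the accuracy of always predicting the most frequent class. The sketch below illustrates that reading on toy labels; the function name and the label counts are hypothetical and do not reproduce the SVHN class distribution.

```python
from collections import Counter


def naive_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


# Toy, unbalanced label set (hypothetical): 5 x 'a', 3 x 'b', 2 x 'c'.
toy_labels = ["a"] * 5 + ["b"] * 3 + ["c"] * 2
baseline = naive_baseline_accuracy(toy_labels)
```

With balanced classes this baseline equals one over the number of classes; any imbalance raises it, which is why chance level differs between datasets.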

Effects of non-rapid eye movement (NREM) and rapid eye movement (REM) sleep on latent representations.

(a) Inputs x are mapped to their corresponding latent representations z via the encoder E. Principal component analysis (PCA; Jolliffe and Cadima, 2016) is performed on the latent space to visualize its structure (b–d). Clustering distances (e, f) are computed directly on latent features z. (b–d) PCA visualization of latent representations projected on the first two principal components. Full circles represent clean images, open circles represent images with 30% occlusion. Each color represents an object category from the SVHN dataset (purple: ‘0’; cyan: ‘1’; yellow: ‘2’; red: ‘3’). (e) Ratio between average intra-class and average inter-class distances in latent space for randomly initialized networks (no training, gray), full model (black), model trained without REM sleep (w/o REM, pink), and model trained without NREM sleep (w/o NREM, orange) for unoccluded inputs. (f) Ratio between average clean-occluded (30% occlusion) and average inter-class distances in latent space for the full model (black), w/o REM (pink), and w/o NREM (orange). Error bars represent SEM over four different initial conditions.
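The two ratios in panels (e) and (f) can be sketched as follows. This is one possible reading under stated assumptions: distances are Euclidean, intra-class and inter-class averages run over all pairs of latent vectors, and the clean-occluded distance is taken between each latent and that of its own occluded copy. All function names and the toy data are hypothetical.

```python
import math
from itertools import combinations


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def class_distances(latents, labels):
    """Return (mean intra-class distance, mean inter-class distance)."""
    intra, inter = [], []
    for (za, la), (zb, lb) in combinations(list(zip(latents, labels)), 2):
        (intra if la == lb else inter).append(math.dist(za, zb))
    return mean(intra), mean(inter)


def clean_occluded_distance(clean, occluded):
    """Mean distance between each clean latent and its occluded counterpart."""
    return mean(math.dist(c, o) for c, o in zip(clean, occluded))


# Toy 2D latents for two classes (hypothetical values).
clean = [(0.0, 0.0), (0.0, 1.0), (3.0, 0.0), (3.0, 1.0)]
occluded = [(0.0, 0.5), (0.0, 1.5), (3.0, 0.5), (3.0, 1.5)]
labels = [0, 0, 1, 1]

intra, inter = class_distances(clean, labels)
ratio_e = intra / inter  # panel (e): smaller means tighter clusters
ratio_f = clean_occluded_distance(clean, occluded) / inter  # panel (f)
```

Normalizing by the inter-class distance makes both ratios insensitive to the overall scale of the latent space, so networks with differently scaled activations remain comparable.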

Model features and physiological counterparts during Wake, non-rapid eye movement (NREM), and rapid eye movement (REM) phases.

ACh: acetylcholine; NA: noradrenaline. ‘Sign switch’ indicates that identical local errors lead to opposing weight changes between Wake and REM sleep.

Convolutional neural network (CNN) architecture of encoder/discriminator and generator used in perturbed and adversarial dreaming (PAD).

Varying size and intensity of occlusions on example images from CIFAR-10.

Image occlusions vary along two parameters: occlusion intensity, defined by the probability to apply a gray square at a given position, and square size (s).
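The two occlusion parameters can be illustrated with a small sketch. It assumes a single-channel image stored as a list of rows, a regular tiling into squares of side s, and a gray value of 0.5; none of these details are given in the caption, and the function name `occlude` is hypothetical.

```python
import random


def occlude(image, square_size, intensity, gray=0.5, rng=None):
    """Return a copy of `image` with gray squares applied at random.

    The image is tiled into square_size x square_size patches, and each
    patch is replaced by `gray` with probability `intensity`.
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # leave the input untouched
    for top in range(0, height, square_size):
        for left in range(0, width, square_size):
            if rng.random() < intensity:
                for r in range(top, min(top + square_size, height)):
                    for c in range(left, min(left + square_size, width)):
                        out[r][c] = gray
    return out


# A 32 x 32 all-white toy image, occluded with 30% intensity and s = 4.
image = [[1.0] * 32 for _ in range(32)]
occluded = occlude(image, square_size=4, intensity=0.3, rng=random.Random(0))
```

An intensity of 0 returns the image unchanged and an intensity of 1 grays out every patch, matching the 0% to 100% range of occlusion levels.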

Appendix 1—figure 1
Training losses for the full and pathological models with the CIFAR-10 dataset.

Evolution of training losses used to optimize E and G networks (see Materials and methods) over training epochs for the full and pathological models.

Appendix 1—figure 2
Training losses for the full and pathological models with the SVHN dataset.

Appendix 1—figure 3
Linear classification performance for the full model and all pathological conditions.

For details, see Figure 4.

Appendix 1—figure 4
Linear classification performance for different mixing strategies during rapid eye movement (REM).

Linear separability of latent representations with training epochs for perturbed and adversarial dreaming (PAD) trained with different REM phases: one driven by a convex combination of mixed memories and noise (black), one by pure noise (green), and one by mixed memories only (red). For details, see Figure 4.

Appendix 1—figure 5
Linear classification performance for different order of sleep phases.

Linear separability of latent representations with training epochs for perturbed and adversarial dreaming (PAD) trained when non-rapid eye movement (NREM) precedes rapid eye movement (REM) phase (Wake–NREM–REM, black) or when REM precedes NREM (Wake–REM–NREM, brown).

Appendix 1—figure 6
Importance of replaying single hippocampal memories during non-rapid eye movement (NREM).

Linear separability of latent representations at the end of learning with occlusion intensity for a model trained with all phases.

Tables

Appendix 1—table 1
Final classification performance for the full model and all pathological conditions for unoccluded images.

Mean and standard error of the mean (SEM) over four different initial conditions of linear separability of latent representations at the end of training (epoch 50) for perturbed and adversarial dreaming (PAD) and its pathological variants.

Dataset     PAD             W/o memory mix   W/o REM         W/o NREM        Wake only
CIFAR-10    58.25 ± 0.70    53.87 ± 0.85     46.00 ± 0.43    58.00 ± 0.34    42.25 ± 0.54
SVHN        78.92 ± 0.40    60.87 ± 5.07     42.30 ± 1.51    73.25 ± 0.22    41.93 ± 0.65

REM: rapid eye movement; NREM: non-rapid eye movement.
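Entries of the form mean ± SEM can be computed from per-run accuracies as sketched below. The accuracies are made-up values for four hypothetical runs, not data from the table, and the use of the sample standard deviation (n − 1 in the denominator) is an assumption, since the caption does not specify it.

```python
import math
import statistics


def mean_and_sem(values):
    """Return the mean and the standard error of the mean of `values`."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate the SEM")
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


# Hypothetical test accuracies (%) from four different initial conditions.
accuracies = [57.0, 58.0, 59.0, 60.0]
mean_acc, sem_acc = mean_and_sem(accuracies)
summary = f"{mean_acc:.2f} ± {sem_acc:.2f}"
```

With only four runs the SEM is itself a noisy estimate, so small differences between conditions with overlapping intervals should be read cautiously.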

Cite this article

Nicolas Deperrois, Mihai A Petrovici, Walter Senn, Jakob Jordan (2022)
Learning cortical representations through perturbed and adversarial dreaming
eLife 11:e76384.
https://doi.org/10.7554/eLife.76384