Peer review process
Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, public reviews, and a provisional response from the authors.
Editors
- Reviewing Editor: Srdjan Ostojic, École Normale Supérieure - PSL, Paris, France
- Senior Editor: Michael Frank, Brown University, Providence, United States of America
Reviewer #1 (Public Review):
Schmid et al. investigate how sensory learning in animals and artificial networks is driven both by passive exposure to the environment (unsupervised) and by reinforcing feedback (supervised), and how these two systems interact. They first demonstrate in mice that passive exposure to the same auditory stimuli used in a discrimination task modifies learning and performance in the task. Based on these data, they then tested how the interaction of supervised and unsupervised learning in an artificial network could account for the behavioural results.
Strengths:
The clear behavioural impact of passive exposure to sounds on accelerating learning is a major strength of the paper. Moreover, the observation that passive exposure had a positive impact on learning whether it occurred prior to the task or interleaved with learning sessions provides interesting constraints for modelling the interaction between supervised and unsupervised learning. A practical implication for labs performing long training procedures is that the periods of active learning that require water restriction could be reduced by using passive sessions. This could increase both experimental efficiency and animal well-being.
The modelling section clearly exhibits the differences between models and the step-by-step presentation building to the final model provides the reader with a lot of intuition about how supervised and unsupervised learning interact. In particular, the authors highlight situations in which the task-relevant discrimination does not align with the directions of highest variance, thus reinforcing the relevance of their conclusions for the complex structure of sensory stimuli. A great strength of these models is that they generate clear predictions about how neural activity should evolve during the different training regimes that would be exciting to test.
Weaknesses:
The experimental design presented cannot clearly show that the effect of passive exposure was due to the specific exposure to task-relevant stimuli, since there is no control group exposed to irrelevant stimuli. Studies have shown that exposure to a richer sensory environment, even in the adult, swiftly (i.e. within days) enhances responses, even when the stimuli are different from those used in the task (1-3). The authors conclude that their network models "build latent representations of features that are determined by statistical properties of the input distribution, as long as those features aid the decoding of task-relevant variables" (line 339, my emphasis). This conclusion, and therefore the link between the behaviour and the models, is weakened by the absence of a direct test of whether task-relevant stimuli need to be presented.
The conclusion that "passive exposure influences responses to sounds not used during training" (line 147) does not seem fully supported by the authors' analysis. The authors show that accuracy increases for intermediate sweep speeds even though the animals encounter these sounds for the first time in the active session. However, it seems impossible to exclude that this effect is simply due to the increased accuracy on the extreme sounds the animals had been trained on. For example, simply prolonging learning in stage 3 is likely to increase accuracy across sounds at stage 4; passive sessions may be mimicking this effect. Moreover, the authors point out that there is no effect on the slope of the psychometric curve. Such a sharpening would be predicted if the passive presentations were indeed enhancing the representations of intermediate sounds, making them more precise and more discriminable.
In the modelling section, the authors adjusted the hyper-parameters to maximize the difference between purely active and passive/active learning. This makes comparisons of learning rates between models somewhat confusing and raises the question of whether the differences highlight an interaction between the two types of learning or simply reflect the parameter choices. For example:
- Figure 5: although in model 3 passive listening enhances learning relative to the purely active condition, learning in the active condition is overall much slower than in model 2. This raises the question of whether the addition of unsupervised rules makes the models better at exploiting passive exposure, but at the cost of efficient active learning.
- Figure 6 & 7: model 5 only differs from model 4 by the addition of supervised learning at layer 1 and the use of what should be a harder task (stimuli spread over the first PCs) however model 5 clearly has much better performance for the P: A condition which is surprising given that the unsupervised and supervised learning periods are clearly separated.
1. Mandairon, N., Stack, C. & Linster, C. Olfactory enrichment improves the recognition of individual components in mixtures. Physiol. Behav. 89, 379-384 (2006).
2. Alwis, D. S. & Rajan, R. Environmental enrichment and the sensory brain: The role of enrichment in remediating brain injury. Front. Syst. Neurosci. 8, 1-20 (2014).
3. Polley, D. B., Kvašňák, E. & Frostig, R. D. Naturalistic experience transforms sensory maps in the adult cortex of caged animals. Nature 429, 67-71 (2004).
Reviewer #2 (Public Review):
Schmid et al. present a lovely study looking at the effect of passive auditory exposure on learning a categorization task.
The authors utilize a two-alternative choice task in which mice have to discriminate between upward- and downward-moving frequency sweeps. Once mice learn to discriminate easy stimuli, the task is made psychometric and additional intermediate stimuli are introduced (as is standard in the literature). The authors introduce two additional groups of animals: one that was passively exposed to the task stimuli before any behavioral shaping, and one that had passive exposure interleaved with learning. The major behavioral finding is that passive exposure to sounds improves learning speed. The authors show this in several ways through linear fits to the learning curves. Additionally, by breaking down performance based on the "extreme" vs "psychometric" stimuli, the authors show that passive exposure can influence responses to sounds that were not present during the initial training period. One limitation here is that the presented analysis is somewhat simplistic, does not include any detailed psychometric analysis (bias, lapse rates, etc.), and primarily focuses on learning speed. Ultimately, though, the behavioral results are interesting and seem supported by the data.
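As a concrete illustration of the kind of psychometric analysis referred to here, below is a minimal sketch of fitting a psychometric function with bias, slope, and lapse parameters to choice data. The functional form, variable names, and data are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.optimize import curve_fit

def psychometric(x, bias, slope, lapse_lo, lapse_hi):
    """P(choose 'up') vs. stimulus value x, with bias, slope, and lapse rates."""
    core = 1.0 / (1.0 + np.exp(-slope * (x - bias)))
    return lapse_lo + (1.0 - lapse_lo - lapse_hi) * core

# Illustrative choice data: stimulus values (e.g. sweep speeds) and the
# proportion of 'up' choices at each value (made-up numbers).
stim = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
p_up = np.array([0.05, 0.12, 0.30, 0.55, 0.78, 0.90, 0.96])

params, _ = curve_fit(psychometric, stim, p_up,
                      p0=[0.0, 1.0, 0.05, 0.05],
                      bounds=([-3.0, 0.1, 0.0, 0.0], [3.0, 10.0, 0.5, 0.5]))
bias, slope, lapse_lo, lapse_hi = params
print(f"bias={bias:.2f}, slope={slope:.2f}, lapses=({lapse_lo:.3f}, {lapse_hi:.3f})")
```

Tracking how the fitted bias, slope, and lapse parameters evolve across sessions would separate changes in sensory discriminability from changes in decision strategy.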
To investigate the neural mechanisms that may underlie their behavioral findings, the authors turn to a family of artificial neural network models and evaluate the consequences of different learning algorithms and schedules, network architectures, and stimulus distributions on the learning outcomes. The authors work through five different architectures that fail to recapitulate the primary behavioral findings before settling on a final model that uses a combination of supervised and unsupervised learning and is capable of reproducing the key aspects of the experiments. Ultimately, the behavioral results presented are consistent with network models that build latent representations of task-relevant features determined by statistical properties of the input distribution.
Reviewer #3 (Public Review):
Summary of Authors' Results/Intended Achievements
The authors were trying to ascertain the underlying learning mechanisms and network structure that could explain their primary experimental finding: passive exposure to a stimulus (independent of when the exposure occurs) can lead to improvements in active (supervised) learning. They modeled their task with five progressively more complex shallow neural networks classifying vectors drawn from multivariate Gaussian distributions.
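For readers unfamiliar with this kind of setup, here is a minimal sketch of a shallow network classifying vectors drawn from two multivariate Gaussians, with a supervised readout and an unsupervised (Oja-like) update that can be applied on passive trials. The learning rules, names, and parameter values are illustrative assumptions and are not taken from the authors' models.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, hidden = 10, 4

# Two stimulus classes: multivariate Gaussians with opposite means along one
# axis; most of the input variance lies off that task-relevant axis.
mu = np.zeros(dim); mu[0] = 1.0
cov = np.diag(np.linspace(2.0, 0.2, dim))

W = rng.normal(scale=0.1, size=(hidden, dim))   # first (representation) layer
v = rng.normal(scale=0.1, size=hidden)          # supervised readout weights

def sample(label):
    """Draw an input vector for class label in {-1, +1}."""
    return rng.multivariate_normal(label * mu, cov)

def passive_step(x, lr=1e-3):
    """Unsupervised, Oja-like update of the representation (no label needed)."""
    global W
    h = W @ x
    W += lr * (np.outer(h, x) - (h ** 2)[:, None] * W)

def active_step(x, label, lr=1e-2):
    """Supervised (delta-rule) update of the readout on a labelled trial."""
    global v
    h = W @ x
    y = np.tanh(v @ h)
    v += lr * (label - y) * (1.0 - y ** 2) * h

for _ in range(5000):
    label = rng.choice([-1, 1])
    x = sample(label)
    passive_step(x)          # e.g. interleaved passive exposure
    active_step(x, label)    # active, reinforced trial
```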
Account of Major Strengths:
Overall, the experimental findings were interesting, albeit not necessarily novel. The modelling was also appropriate, with a solid attempt at matching the experimental condition to simplified network models.
Account of Major Weaknesses:
I would say there are two major weaknesses of this work. The first is that even Model 5 differs from their data. For example, the A+P (interleaved passive) learning curve in Figure 7 seems to be non-monotonic, with an oscillatory, complex-eigenvalue-like decay to steady-state performance as trials increase. This wasn't present in their experimental data (Figure 2D) and implies a subtle but important difference. There also appear to be differences in how quickly the initial learning (during early trials) occurs for the A+P and A:P conditions. While both the A+P and A:P conditions learn faster than the A-only condition in M5, A+P and A:P seem to learn in different ways, which isn't supported by their data. The second major weakness is that the authors don't generate any predictions with M5. Can they test this model of learning somehow in follow-up behavioural experiments in mice?
Discussion of Likely Impact:
Without follow-up experiments to test their proposed mechanism for why passive exposure helps in a schedule-independent way, the impact of this paper will be limited.
Additional Context:
I believe the authors need to place this work in the context of the large existing literature on passive (unsupervised) and active (supervised) learning interactions. This field is broad both experimentally and computationally. For example, there is an entire sub-field of machine learning, called semi-supervised learning, that is not mentioned at all in this work.
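To make the connection concrete, a generic semi-supervised objective combines a supervised loss on labelled (active) trials with an unsupervised loss on unlabelled (passive) trials. The sketch below is an illustrative toy formulation of that idea, not a description of the authors' models; all names, weights, and values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, hidden = 10, 4

# Tiny encoder/decoder/classifier with random weights, just to define the loss.
W_enc = rng.normal(scale=0.1, size=(hidden, dim))
W_dec = rng.normal(scale=0.1, size=(dim, hidden))
w_cls = rng.normal(scale=0.1, size=hidden)

def semi_supervised_loss(x, y=None, alpha=0.5):
    """Supervised cross-entropy (if a label y in {0, 1} is given) plus
    alpha times an unsupervised reconstruction term (always applied)."""
    h = np.tanh(W_enc @ x)
    recon = np.mean((W_dec @ h - x) ** 2)            # unsupervised term
    if y is None:                                    # unlabelled / passive trial
        return alpha * recon
    p = 1.0 / (1.0 + np.exp(-w_cls @ h))             # predicted P(y = 1)
    ce = -(y * np.log(p) + (1 - y) * np.log(1 - p))  # supervised term
    return ce + alpha * recon

x_passive = rng.normal(size=dim)                     # trial without a label
x_active, y_active = rng.normal(size=dim), 1.0       # labelled trial
print(semi_supervised_loss(x_passive), semi_supervised_loss(x_active, y_active))
```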