Automated annotation of birdsong with a neural network that segments spectrograms
Abstract
Songbirds provide a powerful model system for studying sensory-motor learning. However, many analyses of birdsong require time-consuming, manual annotation of its elements, called syllables. Automated methods for annotation have been proposed, but these methods assume that audio can be cleanly segmented into syllables, or they require carefully tuning multiple statistical models. Here we present TweetyNet: a single neural network model that learns how to segment spectrograms of birdsong into annotated syllables. We show that TweetyNet mitigates limitations of methods that rely on segmented audio. We also show that TweetyNet performs well across multiple individuals from two species of songbirds, Bengalese finches and canaries. Lastly, we demonstrate that using TweetyNet we can accurately annotate very large datasets containing multiple days of song, and that these predicted annotations replicate key findings from behavioral studies. In addition, we provide open-source software to assist other researchers, and a large dataset of annotated canary song that can serve as a benchmark. We conclude that TweetyNet makes it possible to address a wide range of new questions about birdsong.
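As a rough illustration of what segmenting a spectrogram into annotated syllables can mean in practice, the sketch below assumes a model has already assigned one label to every time bin of a spectrogram, and collapses runs of identical labels into segments with onsets and offsets. It is not taken from the TweetyNet software. The function name `frames_to_segments`, the `"silence"` background label, and the time-bin duration are assumptions made for this example.

```python
from itertools import groupby


def frames_to_segments(frame_labels, time_bin_s, background="silence"):
    """Collapse per-time-bin labels into (onset_s, offset_s, label) segments.

    frame_labels: one label per spectrogram time bin (hypothetical model output).
    time_bin_s:   duration of one time bin in seconds (assumed constant).
    background:   label for non-syllable bins; those runs are skipped.
    """
    segments = []
    start = 0  # index of the first time bin in the current run
    for label, run in groupby(frame_labels):
        n_bins = len(list(run))
        if label != background:
            onset_s = start * time_bin_s
            offset_s = (start + n_bins) * time_bin_s
            segments.append((onset_s, offset_s, label))
        start += n_bins
    return segments


# Toy example: two syllables, "a" and "b", separated by silent gaps.
labels = ["silence", "a", "a", "a", "silence", "b", "b", "silence"]
segments = frames_to_segments(labels, time_bin_s=0.5)
for onset_s, offset_s, label in segments:
    print(f"{label}: {onset_s:.1f}-{offset_s:.1f} s")
```

Because the segment boundaries fall out of the labels themselves, an approach of this kind does not need the audio to be cleanly segmented beforehand, which is the limitation of earlier methods that the abstract describes.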
Data availability
Datasets of annotated Bengalese finch song are available at:
https://figshare.com/articles/Bengalese_Finch_song_repository/4805749
https://figshare.com/articles/BirdsongRecognition/3470165
Datasets of annotated canary song are available at:
https://doi.org/10.5061/dryad.xgxd254f4
Model checkpoints, logs, and source data files are available at:
http://dx.doi.org/10.5061/dryad.gtht76hk4
Source data files for figures are in the repository associated with the paper:
https://github.com/yardencsGitHub/tweetynet (version 0.4.3, 10.5281/zenodo.3978389).
- Song recordings and annotation files of 3 canaries used to evaluate training of TweetyNet models for birdsong segmentation and annotation. Dryad Digital Repository, doi:10.5061/dryad.xgxd254f4.
- Model checkpoints, logs, and source data files. Dryad Digital Repository, doi:10.5061/dryad.gtht76hk4.
- Bengalese Finch song repository. Figshare, https://doi.org/10.6084/m9.figshare.4805749.v6.
- BirdsongRecognition. Figshare, https://doi.org/10.6084/m9.figshare.3470165.v1.
Article and author information
Funding
National Institute of Neurological Disorders and Stroke (R01NS104925)
- Alexa Sanchioni
- Emily K Mallaber
- Viktoriya Skidanova
- Timothy J Gardner
National Institute of Neurological Disorders and Stroke (R24NS098536)
- Alexa Sanchioni
- Emily K Mallaber
- Viktoriya Skidanova
- Timothy J Gardner
National Institute of Neurological Disorders and Stroke (R01NS118424)
- Timothy J Gardner
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Ethics
Animal experimentation: All procedures were approved by the Institutional Animal Care and Use Committees of Boston University (protocol numbers 14-028 and 14-029). Song data were collected from adult male canaries (n = 5). Canaries were individually housed for the entire duration of the experiment and kept on a light-dark cycle matching the daylight cycle in Boston (42.3601 N). The birds were not used in any other experiments.
Copyright
© 2022, Cohen et al.
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.