Vocalization categorization behavior explained by a feature-based auditory categorization model

Abstract

Vocal animals produce multiple categories of calls with high between- and within-subject variability, over which listeners must generalize to accomplish call categorization. The behavioral strategies and neural mechanisms that support this ability to generalize are largely unexplored. We previously proposed a theoretical model that accomplished call categorization by detecting features of intermediate complexity that best contrasted each call category from all other categories. We further demonstrated that some neural responses in the primary auditory cortex were consistent with such a model. Here, we asked whether a feature-based model could predict call categorization behavior. We trained both the model and guinea pigs on call categorization tasks using natural calls. We then tested categorization by the model and guinea pigs using temporally and spectrally altered calls. Both the model and guinea pigs were surprisingly resilient to temporal manipulations, but sensitive to moderate frequency shifts. Critically, the model predicted about 50% of the variance in guinea pig behavior. By adopting different model training strategies and examining features that contributed to solving specific tasks, we could gain insight into possible strategies used by animals to categorize calls. Our results validate a model that uses the detection of intermediate-complexity contrastive features to accomplish call categorization.
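To make the "about 50% of the variance" statement concrete, the following minimal sketch shows one common way to quantify how much of the variance in behavioral performance a model explains: the squared Pearson correlation (R^2) between model and behavioral scores across stimulus conditions. This is an illustration under stated assumptions, not necessarily the exact analysis used in the study. The function name `variance_explained` and all numeric values are hypothetical and are not data from the paper.

```python
from math import sqrt


def variance_explained(model_scores, behavior_scores):
    """Squared Pearson correlation (R^2) between two equal-length sequences.

    Hypothetical helper for illustration only; the study's exact
    statistical procedure may differ.
    """
    if len(model_scores) != len(behavior_scores):
        raise ValueError("sequences must have the same length")
    n = len(model_scores)
    if n < 2:
        raise ValueError("need at least two paired observations")

    mean_m = sum(model_scores) / n
    mean_b = sum(behavior_scores) / n

    # Sums of cross-products and squared deviations about the means.
    cov = sum((m - mean_m) * (b - mean_b)
              for m, b in zip(model_scores, behavior_scores))
    var_m = sum((m - mean_m) ** 2 for m in model_scores)
    var_b = sum((b - mean_b) ** 2 for b in behavior_scores)
    if var_m == 0 or var_b == 0:
        raise ValueError("R^2 is undefined when one sequence is constant")

    r = cov / sqrt(var_m * var_b)
    return r * r


# Made-up performance scores for five stimulus conditions
# (illustrative values only, NOT data from the study).
model_perf = [2.1, 1.8, 0.9, 1.5, 0.4]
behavior_perf = [1.9, 1.2, 1.1, 1.7, 0.6]

r2 = variance_explained(model_perf, behavior_perf)
print(f"Fraction of behavioral variance explained: {r2:.2f}")
```

An R^2 near 0.5 in such an analysis would mean that roughly half of the condition-to-condition variation in behavior is captured by a linear relationship with the model's output.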

Data availability

All data generated or analyzed during this study are included in the manuscript and supporting file; Source Data files have been provided for Figures 3–12.

Article and author information

Author details

  1. Manaswini Kar

    Center for Neuroscience, University of Pittsburgh, Pittsburgh, United States
    Competing interests
    The authors declare that no competing interests exist.
  2. Marianny Pernia

    Department of Neurobiology, University of Pittsburgh, Pittsburgh, United States
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0002-9889-3577
  3. Kayla Williams

    Department of Neurobiology, University of Pittsburgh, Pittsburgh, United States
    Competing interests
    The authors declare that no competing interests exist.
  4. Satyabrata Parida

    Department of Neurobiology, University of Pittsburgh, Pittsburgh, United States
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0002-2896-2522
  5. Nathan Alan Schneider

    Center for Neuroscience, University of Pittsburgh, Pittsburgh, United States
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0002-9145-5427
  6. Madelyn McAndrew

    Center for the Neural Basis of Cognition, University of Pittsburgh, Pittsburgh, United States
    Competing interests
    The authors declare that no competing interests exist.
  7. Isha Kumbam

    Department of Neurobiology, University of Pittsburgh, Pittsburgh, United States
    Competing interests
    The authors declare that no competing interests exist.
  8. Srivatsun Sadagopan

    Center for Neuroscience, University of Pittsburgh, Pittsburgh, United States
    For correspondence
    vatsun@pitt.edu
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0002-1116-8728

Funding

National Institutes of Health (R01DC017141)

  • Srivatsun Sadagopan

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Ethics

Animal experimentation: All experimental procedures conformed to the NIH Guide for the Care and Use of Laboratory Animals and were approved by the institutional animal care and use committee of the University of Pittsburgh (protocol number 21069431).

Copyright

© 2022, Kar et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

Manaswini Kar, Marianny Pernia, Kayla Williams, Satyabrata Parida, Nathan Alan Schneider, Madelyn McAndrew, Isha Kumbam, Srivatsun Sadagopan (2022) Vocalization categorization behavior explained by a feature-based auditory categorization model. eLife 11:e78278. https://doi.org/10.7554/eLife.78278
