Parallel processing in speech perception with local and global representations of linguistic context
Abstract
Speech processing is highly incremental. It is widely accepted that human listeners continuously use the linguistic context to anticipate upcoming concepts, words, and phonemes. However, previous evidence supports two seemingly contradictory models of how a predictive context is integrated with the bottom-up sensory input: Classic psycholinguistic paradigms suggest a two-stage process, in which acoustic input initially leads to local, context-independent representations, which are then quickly integrated with contextual constraints. This contrasts with the view that the brain constructs a single coherent, unified interpretation of the input, which fully integrates available information across representational hierarchies, and thus uses contextual constraints to modulate even the earliest sensory representations. To distinguish these hypotheses, we tested magnetoencephalography responses to continuous narrative speech for signatures of local and unified predictive models. Results provide evidence that listeners employ both types of models in parallel. Two local context models uniquely predict some part of early neural responses, one based on sublexical phoneme sequences, and one based on the phonemes in the current word alone; at the same time, even early responses to phonemes also reflect a unified model that incorporates sentence level constraints to predict upcoming phonemes. Neural source localization places the anatomical origins of the different predictive models in non-identical parts of the superior temporal lobes bilaterally, with the right hemisphere showing a relative preference for more local models. These results suggest that speech processing recruits both local and unified predictive models in parallel, reconciling previous disparate findings. Parallel models might make the perceptual system more robust, facilitate processing of unexpected inputs, and serve a function in language acquisition.
Data availability
The raw data and predictors used in this study are available for download from Dryad at https://doi.org/10.5061/dryad.nvx0k6dv0
-
Data from: Parallel processing in speech perception with local and global representations of linguistic contextDryad Digital Repository, doi:10.5061/dryad.nvx0k6dv0.
Article and author information
Author details
Funding
University of Maryland (BBI Seed Grant)
- Jonathan Z Simon
- Ellen Lau
National Science Foundation (BCS-1749407)
- Ellen Lau
National Institutes of Health (R01DC014085)
- Jonathan Z Simon
National Science Foundation (SMA-1734892)
- Jonathan Z Simon
Office of Naval Research (MURI Award N00014-18-1-2670)
- Philip Resnik
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Reviewing Editor
- Virginie van Wassenhove, CEA, DRF/I2BM, NeuroSpin; INSERM, U992, Cognitive Neuroimaging Unit, France
Ethics
Human subjects: The study was approved by the IRB of the University of Maryland under the protocol titled "MEG Studies of Speech and Language Processing" (reference # 01153), on August 22, 2018 and September 9, 2019 (approval duration: 1 year). All participants provided written informed consent prior to the start of the experiment.
Version history
- Preprint posted: July 3, 2021 (view preprint)
- Received: July 8, 2021
- Accepted: January 16, 2022
- Accepted Manuscript published: January 21, 2022 (version 1)
- Version of Record published: February 10, 2022 (version 2)
- Version of Record updated: April 5, 2022 (version 3)
Copyright
© 2022, Brodbeck et al.
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.
Metrics
-
- 3,213
- views
-
- 568
- downloads
-
- 48
- citations
Views, downloads and citations are aggregated across all versions of this paper published by eLife.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading
-
- Neuroscience
Resting-state brain networks (RSNs) have been widely applied in health and disease, but the interpretation of RSNs in terms of the underlying neural activity is unclear. To address this fundamental question, we conducted simultaneous recordings of whole-brain resting-state functional magnetic resonance imaging (rsfMRI) and electrophysiology signals in two separate brain regions of rats. Our data reveal that for both recording sites, spatial maps derived from band-specific local field potential (LFP) power can account for up to 90% of the spatial variability in RSNs derived from rsfMRI signals. Surprisingly, the time series of LFP band power can only explain to a maximum of 35% of the temporal variance of the local rsfMRI time course from the same site. In addition, regressing out time series of LFP power from rsfMRI signals has minimal impact on the spatial patterns of rsfMRI-based RSNs. This disparity in the spatial and temporal relationships between resting-state electrophysiology and rsfMRI signals suggests that electrophysiological activity alone does not fully explain the effects observed in the rsfMRI signal, implying the existence of an rsfMRI component contributed by ‘electrophysiology-invisible’ signals. These findings offer a novel perspective on our understanding of RSN interpretation.
-
- Neuroscience
Deep neural networks have made tremendous gains in emulating human-like intelligence, and have been used increasingly as ways of understanding how the brain may solve the complex computational problems on which this relies. However, these still fall short of, and therefore fail to provide insight into how the brain supports strong forms of generalization of which humans are capable. One such case is out-of-distribution (OOD) generalization – successful performance on test examples that lie outside the distribution of the training set. Here, we identify properties of processing in the brain that may contribute to this ability. We describe a two-part algorithm that draws on specific features of neural computation to achieve OOD generalization, and provide a proof of concept by evaluating performance on two challenging cognitive tasks. First we draw on the fact that the mammalian brain represents metric spaces using grid cell code (e.g., in the entorhinal cortex): abstract representations of relational structure, organized in recurring motifs that cover the representational space. Second, we propose an attentional mechanism that operates over the grid cell code using determinantal point process (DPP), that we call DPP attention (DPP-A) – a transformation that ensures maximum sparseness in the coverage of that space. We show that a loss function that combines standard task-optimized error with DPP-A can exploit the recurring motifs in the grid cell code, and can be integrated with common architectures to achieve strong OOD generalization performance on analogy and arithmetic tasks. This provides both an interpretation of how the grid cell code in the mammalian brain may contribute to generalization performance, and at the same time a potential means for improving such capabilities in artificial neural networks.