Cortical activity during naturalistic music listening reflects short-range predictions based on long-term experience

  1. Pius Kern
  2. Micha Heilbron
  3. Floris P de Lange
  4. Eelke Spaak (corresponding author)
  1. Radboud University Nijmegen, Donders Institute for Brain, Cognition and Behaviour, Netherlands
17 figures, 1 table and 1 additional file

Figures

Figure 1
Overview of the research paradigm.

Listeners undergoing EEG (data from Di Liberto et al., 2020) or MEG measurement (novel data acquired for the current study) were presented with naturalistic music synthesized from MIDI files. To model melodic expectations, we calculated note-level surprise and uncertainty estimates via three computational models reflecting different internal models of expectations. We estimated the regression evoked response or temporal response function (TRF) for different features using time-resolved linear regression on the M|EEG data, while controlling for low-level acoustic factors.
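
As an illustrative sketch of the TRF approach described above: a stimulus feature is expanded into time-lagged copies, and linear regression of the neural signal on those copies yields one coefficient per lag, which together form the TRF. The sampling rate, lag window, and all names in the code below are assumptions for illustration, not the study's actual pipeline.

```python
# Minimal sketch of temporal response function (TRF) estimation via
# time-resolved (lagged) linear regression on simulated data.
import numpy as np

def lagged_design_matrix(stimulus, lags):
    """Stack time-shifted copies of a stimulus feature (n_samples,)
    into a design matrix of shape (n_samples, n_lags)."""
    n = len(stimulus)
    X = np.zeros((n, len(lags)))
    for j, lag in enumerate(lags):
        X[lag:, j] = stimulus[:n - lag] if lag > 0 else stimulus
    return X

rng = np.random.default_rng(0)
fs = 100                                                # sampling rate (Hz), illustrative
stimulus = (rng.random(10 * fs) < 0.02).astype(float)   # sparse note onsets
lags = np.arange(0, int(0.5 * fs))                      # lags spanning roughly 0-500 ms
X = lagged_design_matrix(stimulus, lags)

# Simulate one M/EEG channel: stimulus convolved with a response, plus noise
true_trf = np.exp(-((lags / fs - 0.15) ** 2) / 0.002)   # peak around 150 ms
y = X @ true_trf + 0.1 * rng.standard_normal(len(stimulus))

# Ordinary least-squares estimate of the TRF (regularization omitted here)
trf_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(trf_hat.shape)  # one coefficient per lag: the estimated TRF
```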

Figure 2
Model performance on the musical stimuli used in the MEG study.

(A) Comparison of music model performance in predicting upcoming note pitch, as composition-level accuracy (left; higher is better), median surprise across notes (middle; lower is better), and median uncertainty across notes (right). Context length for each model is the best performing one across the range shown in (B). Vertical bars: single compositions, circle: median, thick line: quartiles, thin line: quartiles ±1.5 × interquartile range. (B) Accuracy of note pitch predictions (median across 19 compositions) as a function of context length and model class (same color code as (A)). Dots represent maximum for each model class. (C) Correlations between the surprise estimates from the best models. (For similar results for the musical stimuli used in the EEG study, see Appendix 1—figure 2).
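
For concreteness, the two note-level quantities compared in (A) follow directly from a model's predictive distribution over the next pitch: surprise is the negative log probability of the note that actually occurred, and uncertainty is the entropy of the distribution. The distribution below is a toy stand-in, not output from any of the actual model classes.

```python
# Illustrative computation of note-level surprise and uncertainty.
import numpy as np

def surprise_and_uncertainty(prob_dist, observed_pitch):
    """prob_dist: (n_pitches,) predictive distribution over the next pitch;
    observed_pitch: index of the pitch that actually occurred."""
    p = np.clip(prob_dist, 1e-12, 1.0)           # avoid log(0)
    surprise = -np.log2(p[observed_pitch])       # bits
    uncertainty = -np.sum(p * np.log2(p))        # Shannon entropy, bits
    return surprise, uncertainty

# Toy example: a model that strongly expects pitch 60 (middle C)
probs = np.full(128, 0.001)
probs[60] = 1.0 - probs.sum() + probs[60]
s_expected, u = surprise_and_uncertainty(probs, observed_pitch=60)
s_unexpected, _ = surprise_and_uncertainty(probs, observed_pitch=72)
print(f"surprise if expected note occurs:   {s_expected:.2f} bits")
print(f"surprise if unexpected note occurs: {s_unexpected:.2f} bits")
print(f"uncertainty (entropy): {u:.2f} bits")
```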

Figure 3
Model performance on MEG data from 35 listeners.

(A) Cross-validated r for the Onset only model (top left). Difference in cross-validated r between the Baseline model including acoustic regressors and the Onset model (bottom left). Difference in cross-validated r between models including surprise estimates from different model classes (color-coded) and the Baseline model (right). Vertical bars: participants; box plot as in Figure 2. (B) Comparison between the best surprise models from each model class as a function of context length. Lines: mean across participants, shaded area: 95% CI. (C) Predictive performance of the Music Transformer (MT) on the MEG data (left y-axis, dark, mean across participants) and the music data from the MEG study (right y-axis, light, median across compositions).
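
A minimal sketch of the comparison logic in (A): each regression model is scored by the cross-validated correlation (Pearson r) between its predictions and held-out neural data, and the value of adding a regressor set is judged by the change in r relative to the Baseline model. The data and regressors below are simulated placeholders.

```python
# Cross-validated model comparison on simulated data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

def cross_validated_r(X, y, n_splits=5):
    rs = []
    for train, test in KFold(n_splits=n_splits).split(X):
        model = LinearRegression().fit(X[train], y[train])
        pred = model.predict(X[test])
        rs.append(np.corrcoef(pred, y[test])[0, 1])
    return np.mean(rs)

rng = np.random.default_rng(1)
n = 5000
onset = rng.random((n, 1))          # placeholder note-onset regressor
acoustics = rng.random((n, 2))      # placeholder acoustic regressors
surprise = rng.random((n, 1))       # placeholder surprise regressor
y = 0.5 * onset[:, 0] + 0.3 * surprise[:, 0] + rng.standard_normal(n)

r_baseline = cross_validated_r(np.hstack([onset, acoustics]), y)
r_surprise = cross_validated_r(np.hstack([onset, acoustics, surprise]), y)
print(f"Baseline r = {r_baseline:.3f}; Baseline + surprise r = {r_surprise:.3f}")
```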

Figure 4
Model performance on EEG data from 20 listeners.

All panels as in Figure 3, but applied to the EEG data and its musical stimuli.

Figure 5
Temporal response functions (TRFs, left column) and spatial topographies at four time periods (right column) for the best model on the MEG data.

(A) Note onset regressor. (B) Note repetition regressor. (C) Surprise regressor from the Music Transformer with a context length of eight notes. TRF plots: grey horizontal bars: time points at which at least one channel in the ROI was significant; lines: mean across participants and channels; shaded area: 95% CI across participants.

Figure 6
All panels as in Figure 5, but applied to the EEG data and its musical stimuli.

Figure 7
Source-level results for the MEG TRF data.

Volumetric density of estimated dipole locations across participants in the time window of interest identified in Figure 5 (180–240 ms), projected on the average Montreal Neurological Institute (MNI) template brain. MNI coordinates are given for the density maxima, with anatomical labels from the Automated Anatomical Labeling atlas.
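
As an illustration of the density computation, per-participant dipole coordinates can be binned into a volumetric grid and the maximum read off. The coordinates and voxel size below are simulated assumptions, not the study's actual source estimates.

```python
# Volumetric density of dipole locations via a 3-D histogram.
import numpy as np

rng = np.random.default_rng(2)
# Simulated dipole fits for 35 participants, clustered near auditory cortex
dipoles_mni = rng.normal(loc=[55, -20, 7], scale=8, size=(35, 3))

edges = [np.arange(-90, 91, 5)] * 3        # 5 mm voxels spanning MNI space (mm)
density, _ = np.histogramdd(dipoles_mni, bins=edges)

peak_voxel = np.unravel_index(np.argmax(density), density.shape)
peak_mni = [edges[d][peak_voxel[d]] + 2.5 for d in range(3)]  # voxel centers
print(f"density maximum near MNI {peak_mni} "
      f"({int(density[peak_voxel])} participants in that voxel)")
```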

Figure 8
Results for melodic uncertainty.

(A) Relationship between, and distributions of, surprise and uncertainty estimates from the Music Transformer (context length of eight notes). (B) Cross-validated predictive performance for the Baseline + surprise model (top), for the model with an added uncertainty regressor (middle), and for the model with the interaction between surprise and uncertainty (SxU, bottom). Adding uncertainty and/or the SxU interaction did not improve but rather worsened predictive performance on the MEG data.
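
A sketch of how the three nested regressor sets in (B) could be assembled; the gamma-distributed values below merely mimic the skewed shape of surprise and uncertainty estimates and are not model output.

```python
# Building surprise, uncertainty, and interaction (SxU) regressor sets.
import numpy as np

def zscore(x):
    return (x - x.mean()) / x.std()

rng = np.random.default_rng(3)
surprise = rng.gamma(2.0, 1.0, size=1000)       # placeholder estimates
uncertainty = rng.gamma(2.0, 1.0, size=1000)

# Z-scoring before multiplying keeps the interaction term from being
# dominated by the main effects' scale.
s, u = zscore(surprise), zscore(uncertainty)
X_surprise = np.column_stack([s])
X_plus_unc = np.column_stack([s, u])
X_interact = np.column_stack([s, u, s * u])     # SxU interaction term
print(X_surprise.shape, X_plus_unc.shape, X_interact.shape)
```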

Figure 9
Training (A) and fine-tuning (B) of the Music Transformer on the MAESTRO corpus and the MCCC, respectively.

Cross-entropy loss (average surprise across all notes) on the test (dark) and training (light) data as a function of training epoch.
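
Cross-entropy is thus simply the mean of the per-note surprise values. A hypothetical evaluation step, assuming the model outputs one next-note distribution per position:

```python
# Cross-entropy as average surprise over all notes (toy data).
import numpy as np

def cross_entropy(predicted_probs, target_notes):
    """predicted_probs: (n_notes, n_pitches) next-note distributions;
    target_notes: (n_notes,) indices of the notes that occurred."""
    p_correct = predicted_probs[np.arange(len(target_notes)), target_notes]
    return float(np.mean(-np.log(np.clip(p_correct, 1e-12, None))))

rng = np.random.default_rng(4)
probs = rng.dirichlet(np.ones(128), size=500)   # toy model outputs
targets = rng.integers(0, 128, size=500)
print(f"cross-entropy (mean surprise): {cross_entropy(probs, targets):.2f} nats")
```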

Appendix 1—figure 1
Comparison of the pitch (left) and pitch interval distributions (right) for the music data from the MEG study (top), EEG study (middle), and MCCC corpus (bottom).
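
Both distributions follow directly from a sequence of MIDI note numbers: pitches are the note numbers themselves, and pitch intervals are the differences between consecutive notes. The melody below is a made-up placeholder.

```python
# Pitch and pitch-interval distributions from a note sequence.
import numpy as np

midi_pitches = np.array([60, 62, 64, 65, 67, 65, 64, 62, 60])  # toy melody

pitch_counts = np.bincount(midi_pitches, minlength=128)
intervals = np.diff(midi_pitches)            # signed steps in semitones
interval_values, interval_counts = np.unique(intervals, return_counts=True)

print("most common pitch:", pitch_counts.argmax())
print("interval distribution:",
      dict(zip(interval_values.tolist(), interval_counts.tolist())))
```
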
Appendix 1—figure 2
Model performance on the musical stimuli used in the EEG study.

(A) Comparison of music model performance in predicting upcoming note pitch, as composition-level accuracy (left; higher is better), median surprise across notes (middle; lower is better), and median uncertainty across notes (right). Context length for each model is the best performing one across the range shown in (B). Vertical bars: single compositions, circle: median, thick line: quartiles, thin line: quartiles ±1.5 × interquartile range. (B) Accuracy of note pitch predictions (median across 10 compositions) as a function of context length and model class (same color code as (A)). Dots represent maximum for each model class. (C) Correlations between the surprise estimates from the best models.

Appendix 1—figure 3
Comparison of the MEG TRFs and spatial topographies for the surprise estimates from the best models of each model class.
Appendix 1—figure 4
Comparison of the EEG TRFs and spatial topographies for the surprise estimates from the best models of each model class.
Appendix 1—figure 5
Comparison of the predictive performance on the MEG data using ridge-regularized regression, with the optimal cost hyperparameter alpha estimated using nested cross-validation.

Results are shown for the best-performing model (MT, context length of 8 notes). Each line represents one participant. Lower panel: raw predictive performance (r). Upper panel: predictive performance expressed as percentage of a participant’s maximum.
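
A sketch of the nested cross-validation scheme named in the caption: an inner loop selects the ridge penalty alpha on training folds only, and the outer loop scores the selected model on held-out data. The alpha grid and data are illustrative.

```python
# Ridge regression with alpha chosen by nested cross-validation.
import numpy as np
from sklearn.linear_model import Ridge, RidgeCV
from sklearn.model_selection import KFold

rng = np.random.default_rng(5)
X = rng.standard_normal((2000, 40))
y = X[:, 0] - 0.5 * X[:, 1] + rng.standard_normal(2000)

alphas = np.logspace(-2, 4, 13)
outer_r = []
for train, test in KFold(n_splits=5).split(X):
    inner = RidgeCV(alphas=alphas).fit(X[train], y[train])  # inner CV picks alpha
    model = Ridge(alpha=inner.alpha_).fit(X[train], y[train])
    pred = model.predict(X[test])
    outer_r.append(np.corrcoef(pred, y[test])[0, 1])
print(f"nested-CV r = {np.mean(outer_r):.3f}")
```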

Appendix 1—figure 6
Comparison of the predictive performance on the EEG data using ridge-regularized regression, with the optimal cost hyperparameter alpha estimated using nested cross-validation.

Results are shown for the best-performing model (MT, context length of 7 notes). Each line represents one participant. Lower panel: raw predictive performance (r). Upper panel: predictive performance expressed as percentage of a participant’s maximum.

Author response image 1
Author response image 2

Tables

Appendix 1—table 1
Overview of the musical stimuli presented in the MEG study (top) and the EEG study (bottom).
Music (MEG)

| Composer | Composition | Year | Key | Time signature | Tempo (bpm) | Duration (sec) | Notes | Sound |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Benjamin Britten | Metamorphoses Op. 49, II. Phaeton | 1951 | C maj | 4/4 | 110 | 95 | 384 | Oboe |
| Benjamin Britten | Metamorphoses Op. 49, III. Niobe | 1951 | Db maj | 4/4 | 60 | 101 | 171 | Oboe |
| Benjamin Britten | Metamorphoses Op. 49, IV. Bacchus | 1951 | F maj | 4/4 | 100 | 114 | 448 | Oboe |
| César Franck | Violin Sonata, IV. Allegretto poco mosso | 1886 | A maj | 4/4 | 150 | 175 | 458 | Flute |
| Carl Philipp Emanuel Bach | Sonata for Solo Flute, Wq.132/H.564, III. | 1763 | A min | 3/8 | 98 | 275 | 1358 | Flute |
| Ernesto Köhler | Flute Exercises Op. 33a, V. Allegretto | 1880 | G maj | 4/4 | 124 | 140 | 443 | Piano |
| Ernesto Köhler | Flute Exercises Op. 33b, VI. Presto | 1880 | D min | 6/8 | 176 | 134 | 664 | Piano |
| Georg Friedrich Händel | Flute Sonata Op. 1 No. 5, HWV 363b, IV. Bourrée | 1711 | G maj | 4/4 | 132 | 84 | 244 | Oboe |
| Georg Friedrich Händel | Flute Sonata Op. 1 No. 3, HWV 379, IV. Allegro | 1711 | E min | 3/8 | 96 | 143 | 736 | Piano |
| Joseph Haydn | Little Serenade | 1785 | F maj | 3/4 | 92 | 81 | 160 | Oboe |
| Johann Sebastian Bach | Flute Partita BWV 1013, II. Courante | 1723 | A min | 3/4 | 64 | 176 | 669 | Flute |
| Johann Sebastian Bach | Flute Partita BWV 1013, IV. Bourrée angloise | 1723 | A min | 2/4 | 62 | 138 | 412 | Oboe |
| Johann Sebastian Bach | Violin Concerto BWV 1042, I. Allegro | 1718 | E maj | 2/2 | 100 | 122 | 698 | Piano |
| Johann Sebastian Bach | Violin Concerto BWV 1042, III. Allegro Assai | 1718 | E maj | 3/8 | 92 | 80 | 413 | Piano |
| Ludwig van Beethoven | Sonatina (Anh. 5 No. 1) | 1807 | G maj | 4/4 | 128 | 210 | 624 | Flute |
| Muzio Clementi | Sonatina Op. 36 No. 5, III. Rondo | 1797 | G maj | 2/4 | 112 | 187 | 915 | Piano |
| Modest Mussorgsky | Pictures at an Exhibition - Promenade | 1874 | Bb maj | 5/4 | 80 | 106 | 179 | Oboe |
| Pyotr Ilyich Tchaikovsky | The Nutcracker Suite - Russian Dance Trepak | 1892 | G maj | 2/4 | 120 | 78 | 396 | Piano |
| Wolfgang Amadeus Mozart | The Magic Flute K620, Papageno’s Aria | 1791 | F maj | 2/4 | 72 | 150 | 452 | Flute |
| Total | | | | | | 2589 | 9824 | |
Music (EEG)

| Composer | Composition | Year | Key | Time signature | Tempo (bpm) | Duration (sec) | Notes | Sound |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Johann Sebastian Bach | Flute Partita BWV 1013, I. Allemande | 1723 | A min | 4/4 | 100 | 158 | 1022 | Piano |
| Johann Sebastian Bach | Flute Partita BWV 1013, II. Corrente | 1723 | A min | 3/4 | 100 | 154 | 891 | Piano |
| Johann Sebastian Bach | Flute Partita BWV 1013, III. Sarabande | 1723 | A min | 3/4 | 70 | 120 | 301 | Piano |
| Johann Sebastian Bach | Flute Partita BWV 1013, IV. Bourrée | 1723 | A min | 2/4 | 80 | 135 | 529 | Piano |
| Johann Sebastian Bach | Violin Partita BWV 1004, I. Allemande | 1723 | D min | 4/4 | 47 | 165 | 540 | Piano |
| Johann Sebastian Bach | Violin Sonata BWV 1001, IV. Presto | 1720 | G min | 3/8 | 125 | 199 | 1604 | Piano |
| Johann Sebastian Bach | Violin Partita BWV 1002, I. Allemande | 1720 | Bb min | 4/4 | 50 | 173 | 620 | Piano |
| Johann Sebastian Bach | Violin Partita BWV 1004, IV. Gigue | 1723 | D min | 12/8 | 120 | 182 | 1352 | Piano |
| Johann Sebastian Bach | Violin Partita BWV 1006, II. Loure | 1720 | E maj | 6/4 | 80 | 134 | 338 | Piano |
| Johann Sebastian Bach | Violin Partita BWV 1006, III. Gavotte | 1720 | E maj | 4/4 | 140 | 178 | 642 | Piano |
| Total | | | | | | 1598 | 7839 | |

Cite this article

Pius Kern, Micha Heilbron, Floris P de Lange, Eelke Spaak (2022) Cortical activity during naturalistic music listening reflects short-range predictions based on long-term experience. eLife 11:e80935. https://doi.org/10.7554/eLife.80935