Experimental design.

(A) Participants were presented with polyphonic music made of two monophonic streams (PolyOrig condition), with control stimuli in which the average pitch heights of the two streams were inverted (PolyInv), and with the corresponding monophonic streams in isolation (Monophonic). All stimuli were synthesised from MIDI scores using high-quality virtual piano sounds. (B) Each experimental session was organised into two blocks: a listening block, in which EEG signals were recorded from participants during uninstructed music listening, followed by a melody identification block. The music pieces were selected at random across conditions, and a given piece was presented only once, in one of the three conditions, without repetition. In the melody identification block, participants heard short polyphonic snippets from the same pieces and were asked to sing back the melody and to indicate, on a five-point scale, whether it was the high- or low-pitch stream. This yielded a behavioural measurement of attention bias. (C) Attention bias was also measured from the neural signal. Attention decoding models were built on the Monophonic condition by fitting backward temporal response functions (TRFs) for each participant to reconstruct the sound envelope from the EEG signal. The TRF models were then applied to the polyphonic conditions to decode the attended melody, resulting in a neural measurement of attention bias.
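As an illustration only (not the study's pipeline), the backward TRF approach can be sketched in a few lines of numpy: EEG channels are time-lagged, a ridge-regularised linear map from lagged EEG to the stimulus envelope is fitted on training data, and the reconstruction is evaluated on held-out data. All variable names, lag ranges, and the regularisation value below are hypothetical; the toy "EEG" is a delayed, noisy copy of the envelope.

```python
import numpy as np

def lag_matrix(eeg, lags):
    """Stack time-lagged copies of each EEG channel: (samples, channels * lags)."""
    n, ch = eeg.shape
    X = np.zeros((n, ch * len(lags)))
    for i, lag in enumerate(lags):
        shifted = np.roll(eeg, lag, axis=0)
        if lag > 0:
            shifted[:lag] = 0          # zero out wrapped-around samples
        elif lag < 0:
            shifted[lag:] = 0
        X[:, i * ch:(i + 1) * ch] = shifted
    return X

def fit_backward_trf(eeg, envelope, lags, lam=1e2):
    """Ridge-regularised backward model mapping lagged EEG to the envelope."""
    X = lag_matrix(eeg, lags)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ envelope)

def reconstruct(eeg, w, lags):
    return lag_matrix(eeg, lags) @ w

# Toy data: each "EEG" channel is a delayed, noisy copy of the envelope.
rng = np.random.default_rng(0)
fs, n = 64, 64 * 120
envelope = rng.standard_normal(n)
eeg = np.stack([np.roll(envelope, d) + 0.5 * rng.standard_normal(n)
                for d in (4, 7, 10)], axis=1)

# Negative lags look "forward" in the EEG, since the response follows the stimulus.
lags = list(range(-15, 1))            # roughly 0-234 ms at 64 Hz
split = n // 2
w = fit_backward_trf(eeg[:split], envelope[:split], lags)
r = np.corrcoef(reconstruct(eeg[split:], w, lags), envelope[split:])[0, 1]
print(f"held-out envelope reconstruction r = {r:.2f}")
```

In the same spirit as the study's design, a model trained on one condition (here, the first half of the data) is applied to unseen data, and the reconstruction correlation serves as the decoding metric.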

Behavioural and neural indices of attention bias during uninstructed listening of polyphonic music.

(A) Illustration of the hypothesised attention bias scenarios for polyphonic music listening. A high-pitch bias was expected for PolyOrig by design. A dominant high-voice superiority effect and a dominant motif attractiveness were expected to lead to strong attention biases toward the high-pitch (Hp0) and low-pitch (Hp2) streams, respectively, while a substantial reduction in attention bias would reflect a comparable contribution of the two factors (Hp1). (B) Behavioural results. The behavioural attention bias metric (mean ± SEM; ***p<0.001) was derived from the subjective reports in the melody identification block. Subjective ratings indicated which stream was perceived as the main melody, from -2 (low pitch) to 2 (high pitch). (C) EEG decoding analysis. (Left) Envelope reconstruction correlations for the decoding analysis (mean ± SEM; *p<0.05, ***p<0.001) for individual streams (high and low pitch) and conditions (PolyOrig and PolyInv). Colours refer to the motif (red: main melody; blue: support stream). Note that the colours are inverted in the two conditions, reflecting the pitch inversion. (Right) Envelope reconstruction correlations for individual participants. (D) Neural attention bias index, obtained by subtracting the low-pitch from the high-pitch reconstruction correlations in (C) within each condition (Δenvelope reconstruction; mean ± SEM; **p<0.01). (E) Forward TRF model weights at channel Cz, providing insight into the temporal dynamics of the neural response to the two streams. Lines and shaded areas indicate the mean and SEM across participants, respectively. Thick black lines mark time points with statistically significant differences across conditions.
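A minimal sketch of how the two bias indices described above could be computed, using entirely hypothetical per-participant values (not the study's data): the behavioural index is the mean of the subjective ratings (-2 to 2), and the neural index is the per-participant difference between high- and low-pitch envelope reconstruction correlations.

```python
import numpy as np

# Hypothetical per-participant values for illustration only.
ratings = np.array([1.6, 1.2, 1.8, 0.9, 1.4])      # melody ratings, -2 (low) to 2 (high)
behavioural_bias = ratings.mean()                   # > 0 indicates a high-pitch bias

r_high = np.array([0.11, 0.09, 0.13, 0.08, 0.10])   # reconstruction r, high-pitch stream
r_low = np.array([0.06, 0.07, 0.05, 0.06, 0.04])    # reconstruction r, low-pitch stream
neural_bias = r_high - r_low                        # Δenvelope reconstruction per participant

print(f"behavioural bias = {behavioural_bias:.2f}, "
      f"neural bias = {neural_bias.mean():.3f} ± {neural_bias.std(ddof=1):.3f}")
```

Both indices share the same sign convention (positive = bias toward the high-pitch stream), which is what allows the behavioural and neural measurements to be compared across conditions.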

Low-frequency cortical encoding of melodic expectation during polyphonic music listening.

(A) Schematic of the analysis method. Canonical Correlation Analysis (CCA) was run to study the stimulus-EEG relationship. Match-vs-mismatch classification scores were derived to quantify the strength of the neural encoding of a given stimulus feature set. (B) The analysis was run separately for acoustic-only features (A) and acoustic plus melodic expectation features (AM) in each of the three experimental conditions. The distributions in the figure correspond to repetitions of the match-vs-mismatch procedure. (C) The gain in match-vs-mismatch classification after including the melodic expectation features, ΔClassification, is compared across conditions and models, indicating whether a monophonic or a polyphonic account of melodic expectations best fits the neural data. In the box plots, the bottom and top edges mark the 25th and 75th percentiles, respectively, while the mid-line indicates the median (***p<0.001).
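As an illustration only (not the study's implementation), the CCA plus match-vs-mismatch idea can be sketched in numpy: CCA projections are fitted on training data, and on held-out data each EEG segment is paired once with its true stimulus segment and once with a foil from another time; the pairing with the higher summed canonical correlation is classified as the match. All names, dimensions, and segment lengths below are hypothetical, and the toy "EEG" is a noisy linear mixture of the stimulus features.

```python
import numpy as np

def fit_cca(X, Y, k):
    """Plain CCA via per-view whitening + SVD; returns projection weights."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    Ux, Sx, Vxt = np.linalg.svd(X, full_matrices=False)
    Uy, Sy, Vyt = np.linalg.svd(Y, full_matrices=False)
    U, _, Vt = np.linalg.svd(Ux.T @ Uy)
    Wx = Vxt.T @ np.diag(1.0 / Sx) @ U[:, :k]
    Wy = Vyt.T @ np.diag(1.0 / Sy) @ Vt.T[:, :k]
    return Wx, Wy

def mm_score(Wx, Wy, stim_seg, eeg_seg):
    """Summed canonical correlations for one stimulus-EEG pairing."""
    a = (stim_seg - stim_seg.mean(axis=0)) @ Wx
    b = (eeg_seg - eeg_seg.mean(axis=0)) @ Wy
    return sum(np.corrcoef(a[:, i], b[:, i])[0, 1] for i in range(a.shape[1]))

# Toy data: "EEG" is a linear mixture of the stimulus features plus noise.
rng = np.random.default_rng(1)
n, seg = 4000, 200
stim = rng.standard_normal((n, 3))
eeg = stim @ rng.standard_normal((3, 8)) + rng.standard_normal((n, 8))

half = n // 2
Wx, Wy = fit_cca(stim[:half], eeg[:half], k=2)

# Match-vs-mismatch: does the true stimulus segment outscore a foil?
starts = list(range(half, n - seg, seg))
hits = 0
for i in starts:
    foil = half if i != half else half + seg       # stimulus from another time
    matched = mm_score(Wx, Wy, stim[i:i + seg], eeg[i:i + seg])
    mismatched = mm_score(Wx, Wy, stim[foil:foil + seg], eeg[i:i + seg])
    hits += matched > mismatched
accuracy = hits / len(starts)
print(f"match-vs-mismatch accuracy = {accuracy:.2f}")
```

Running the same procedure with and without the melodic expectation features would give two accuracies, whose difference corresponds to the ΔClassification metric in panel (C).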

Composers and titles of musical pieces used as stimuli.