Figures and data

Human incremental structural interpretations derived from continuation pre-tests.
(A) An example set of target sentences differing only in the transitivity of Verb1. HiTrans: high transitivity, LoTrans: low transitivity. Det: determiner, SN: subject noun, V1: Verb1, PP1-PP3: prepositional phrase, MV: main verb, END: the last word in the sentence. (B) Probability of a direct object (left) and a prepositional phrase (right) in the continuations after Verb1. (C) Probability of a main verb in the continuations after Verb1, which indicates an Active interpretation. (D) Correlations between multifaceted lexical constraints and probabilistic interpretations in the two pre-tests (Spearman rank correlation; black dots indicate significance determined by 10,000 permutations, PFDR < 0.05 corrected).
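The statistics named in (D) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, the permutation scheme (shuffling one variable) is an assumption, and the Benjamini-Hochberg procedure is assumed for the FDR correction, which the caption does not specify.

```python
import random

def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out

def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def permutation_p(x, y, n_perm=10000, seed=0):
    """Two-sided permutation p-value obtained by shuffling the ranks of y."""
    rng = random.Random(seed)
    rx, ry = ranks(x), ranks(y)
    observed = abs(pearson(rx, ry))
    shuffled = ry[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

def fdr_bh(pvals, alpha=0.05):
    """Benjamini-Hochberg (assumed here): flag which p-values survive FDR."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= alpha * rank / m:
            cutoff = rank
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            significant[idx] = True
    return significant
```

In use, each cell of the correlation matrix would get one `spearman` value and one `permutation_p` value, and `fdr_bh` would then be applied to the full set of p-values to decide which cells receive a black dot.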

Incremental interpretation of sentential structure by BERT.
(A) Context-free dependency parse trees of two plausible structural interpretations. Left: Passive interpretation where V1 is the head of a reduced relative clause. Right: Active interpretation where V1 is the main verb. (B) Incremental input to BERT, with the lightness of dots encoding different positions in the target sentences. Det: determiner, SN: subject noun, V1: Verb1, PP1-PP3: prepositional phrase, MV: main verb, END: the last word in the sentence. (C) Incremental interpretations of the dependency between SN and V1 in the model space consisting of the parse depth of Det, SN and V1. Upper: Each colored circle represents the parse depth vector up to V1 derived at a certain position in the sentence [with the same color scheme as in (B)]. The hollow triangle and circle represent the context-free dependency parse vectors for Passive and Active interpretations in (A). Lower: incremental interpretations of the two types of target sentences represented by the trajectories of median parse depth. (D) Distance from Passive and Active landmarks in the model space as the sentence unfolds [between each colored circle and the two landmarks in the upper panel of (C)] (two-tailed two-sample t-test, *: P < 0.05, **: P < 0.001, error bars represent SEM).
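The distance measure in (D) can be illustrated with a short sketch. Everything below is an assumption made for illustration: the landmark coordinates are hypothetical values obtained by counting tree depth from a root at depth 0 in the two parses of (A), Euclidean distance is assumed because the caption does not name the metric, and the function names are not from the source.

```python
import math

# Hypothetical landmark vectors: parse depth of (Det, SN, V1), root at depth 0.
# Passive: MV is the root, SN depends on MV, Det and V1 both depend on SN.
PASSIVE_LANDMARK = (2.0, 1.0, 2.0)
# Active: V1 is the root, SN depends on V1, Det depends on SN.
ACTIVE_LANDMARK = (2.0, 1.0, 0.0)

def landmark_distances(depth_vector):
    """Euclidean distance (assumed metric) from one parse depth vector to each landmark."""
    return {
        "passive": math.dist(depth_vector, PASSIVE_LANDMARK),
        "active": math.dist(depth_vector, ACTIVE_LANDMARK),
    }

def distance_trajectory(vectors_by_position):
    """Distances at each sentence position, given (label, depth_vector) pairs in order."""
    return [(label, landmark_distances(vec)) for label, vec in vectors_by_position]
```

Applied to the model-derived vectors at V1, PP1-PP3, MV and END, `distance_trajectory` would yield one pair of distances per position, which is the quantity plotted as the sentence unfolds.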

Correlation between incremental BERT structural measures and explanatory variables.
BERT structural measures include (A, B) BERT interpretative mismatch represented by each sentence’s distance from the two landmarks in model space (Fig. 1C); (C, D) Dynamic updates of BERT interpretative mismatch represented by each sentence’s movement toward the two landmarks; (E, F) Overall structural representations captured by the first two principal components (i.e., PC1 and PC2) of BERT parse depth vectors; (G, H) BERT Verb1 (V1) parse depth and its dynamic updates. Explanatory variables include lexical constraints derived from massive corpora and the main verb probability derived from human continuation pre-tests (Spearman correlation, permutation test, PFDR < 0.05, multiple comparisons corrected for all BERT layers; results shown here are based on layer 14; see Figs. S3-S5 for the results of all layers). PP1-PP3: prepositional phrase, MV: main verb, END: the last word in the sentence.

Neural dynamics underpinning the emerging structure and interpretation of an unfolding sentence.
(A-C) ssRSA results of BERT parse depth vector up to Verb1 (V1), the preposition (PP1) and the main verb (MV) in epochs separately time-locked to their onsets. (D-F) ssRSA results of the mismatch for the preferred structural interpretation (the specific BERT layer from which BERT structural measures were derived is denoted in parentheses). From top to bottom in each panel: vertex t-mass (each vertex’s summed t-value during its significant period); heatmap of time-series of ROI peak t-value (the highest t-value in an ROI at each time-point) with a green bar indicating effect onset and ROI t-mass (each ROI’s summed mean t-value during its significant period); cluster t-mass time-series (summed t-value of all the significant vertices of a cluster at each time-point) [cluster-based permutation test, vertex-wise P < 0.01, cluster-wise P < 0.05 in (A-E); marginal significance in (F) with cluster-wise P = 0.06]. Solid vertical lines indicate the timings of onset, average uniqueness point (UP), and average offset of the word time-locked in the epoch, with grey shades indicating the range of one SD. LH/RH: left/right hemisphere. See Table S2 for full anatomical labels. See Fig. S8 for the significant results of other BERT layers in the MV epoch.
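The summary quantities defined in this caption can be sketched as below. This is an illustrative reading of the definitions only: the data layout (a vertices-by-time-points table of t-values plus a same-shaped significance mask for one cluster) and all function names are assumptions, not the authors' implementation.

```python
def vertex_t_mass(t_values, significant):
    """Per vertex: sum of t-values over the time points where that vertex is significant."""
    return [
        sum(t for t, s in zip(t_row, s_row) if s)
        for t_row, s_row in zip(t_values, significant)
    ]

def cluster_t_mass_series(t_values, significant):
    """Per time point: sum of t-values over the cluster's significant vertices."""
    n_times = len(t_values[0])
    return [
        sum(t_values[v][k] for v in range(len(t_values)) if significant[v][k])
        for k in range(n_times)
    ]

def roi_peak_series(t_values, roi_vertices):
    """Per time point: highest t-value among the vertices belonging to one ROI."""
    n_times = len(t_values[0])
    return [max(t_values[v][k] for v in roi_vertices) for k in range(n_times)]
```

Here `t_values[v][k]` is the t-value of vertex `v` at time point `k`, `significant[v][k]` marks whether that vertex belongs to the significant cluster at that time point, and `roi_vertices` is a list of vertex indices in one region of interest.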

Neural dynamics updating the incremental structural interpretation.
(A) ssRSA results of BERT Verb1 (V1) parse depth change at the main verb (MV) relative to the parse depth of V1 when it is first encountered. (B) ssRSA results of the updated BERT V1 parse depth when the input sentence reaches MV. (C) Spatiotemporal overlap between the effects in (A) and (B) (cluster-based permutation test, vertex-wise P < 0.01, cluster-wise P < 0.05).

Neural dynamics of multifaceted probabilistic constraints underpinning incremental structural interpretations.
(A, B) ssRSA results of SN agenthood and SN patienthood (i.e., plausibility of SN being the agent or the patient of V1) in the PP1 and MV epochs separately. (C) ssRSA results of the non-directional index (i.e., interpretative coherence between SN and V1 regardless of the structure preferred) in the MV epoch. (D) ssRSA results of the Passive index (i.e., interpretative coherence for the Passive interpretation) in the MV epoch. (E) Influence of the Passive interpretative coherence on the emerging sentential structure in the MV epoch revealed by the Granger causal analysis (GCA) based on the non-negative matrix factorization (NMF) components of whole-brain ssRSA results (see Fig. S9 for more details) [(A-D) cluster-based permutation test, vertex-wise P < 0.01, cluster-wise P < 0.05; (E) permutation test PFDR < 0.05].