A. Experimental procedure and multivariate statistical analyses. The learning phase was flanked by a silent resting state (RS) and a random stream (RND) with even transition probabilities between syllables. This design accounted for the potential effects of elapsed time and of changes in vigilance state on neural entrainment measures. The learning segment consisted of a long structured (STR) stream in which syllables were organized into four three-syllable words presented in random order with no repetition. Following this, six test blocks were presented, each comprising 8 triplets from the word and part-word conditions, with 2-second silences interleaved between items. To sustain learning, short 30-second STR streams were interspersed between test blocks. A 4.5 s fade-in/out at the borders of each stream was included to minimize any perceptual anchor effect. The full procedure lasted ∼17 minutes. Arrow width schematically represents the transition probability (TP) magnitude. B. Pipeline for the longitudinal partial least squares correlation (PLS-C) analysis. Details are provided in the Methods section. ERP: Event-Related Potential.
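The structure of the STR stream described in panel A can be illustrated with a minimal sketch. The syllable labels, the stream length, and the function names below are hypothetical, and "no repetition" is assumed to mean that the same word never occurs twice in a row; this is not the authors' stimulus-generation code.

```python
import random
from collections import Counter

# Hypothetical syllable labels; the actual syllables are not given in the legend.
WORDS = [("A1", "A2", "A3"), ("B1", "B2", "B3"),
         ("C1", "C2", "C3"), ("D1", "D2", "D3")]

def structured_stream(n_words, seed=0):
    """Concatenate words in random order, never repeating a word back to back."""
    rng = random.Random(seed)
    order, previous = [], None
    for _ in range(n_words):
        word = rng.choice([w for w in WORDS if w != previous])
        order.append(word)
        previous = word
    syllables = [s for word in order for s in word]
    return order, syllables

def transition_probabilities(syllables):
    """Empirical TP(next | current) estimated from adjacent syllable pairs."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

order, syllables = structured_stream(n_words=600)
tps = transition_probabilities(syllables)
```

Under these assumptions, within-word transitions such as `("A1", "A2")` have a TP of 1, whereas transitions out of a word-final syllable are spread over the three other words, so each is close to 1/3.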

Neural entrainment to syllable rate (4 Hz): main effect across all participants (top row); group differences (bottom row).

A. Design salience (left) and brain salience (right topography) derived from the significant latent component (LC) for neural entrainment to the syllable rate, using the target frequency (4 Hz) versus adjacent frequencies as a contrast. The significance of each salience was established through bootstrapping. Bars represent the mean of 500 bootstrap samples of the saliences (resampling with replacement), and error bars indicate the 95% confidence interval. Yellow shading highlights variables that contribute significantly to the LC, defined by a bootstrap ratio (BSR; bootstrap mean divided by bootstrap standard deviation) > 2.3. The topography of the BSR values shows the electrodes that contribute significantly to the LC (black dots, BSR > 2.3). B. Individual raw phase-locking values (PLVs) extracted from the salient electrodes identified by the LC (black dots in the topography in panel A). The fitted curve, produced by a mixed-effects model and shown with its 95% confidence interval, is intended solely to aid visualization; the statistical relationships between EEG and behavioral variables are determined by the PLS-C analysis. C. PLS-C analysis of the differences between the HL and LL groups, presented in the same format as in A. D. Individual raw PLVs extracted from the salient electrodes (black dots in panel C), shown for visualization purposes only. At every age, a verbal developmental quotient (DQ) of 100 is expected in the general population.
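The bootstrap ratio used in panels A and C can be illustrated with a minimal sketch. It assumes a hypothetical list of per-participant values and uses a simple statistic (here, the mean) as a stand-in for the salience that a full PLS-C bootstrap would recompute on every resample. The function name, the toy data, and the choice of the sample standard deviation are assumptions, not the authors' implementation.

```python
import random
import statistics

def bootstrap_ratio(values, statistic, n_boot=500, seed=0):
    """Resample with replacement n_boot times and return (BSR, mean, sd).

    BSR = mean of the bootstrap estimates divided by their standard deviation.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = [rng.choice(values) for _ in values]
        estimates.append(statistic(resample))
    mean = statistics.fmean(estimates)
    sd = statistics.stdev(estimates)  # sample SD; an assumption of this sketch
    return mean / sd, mean, sd

# Toy data: one variable with a consistent effect, one centered on zero.
consistent = [1.0, 1.2, 0.9, 1.1, 1.05, 0.95]
centered = [-1.0, 1.0, -0.5, 0.5, -0.2, 0.2]

bsr_consistent, _, _ = bootstrap_ratio(consistent, statistics.fmean)
bsr_centered, _, _ = bootstrap_ratio(centered, statistics.fmean)
```

With the threshold from the legend, the first toy variable would be flagged (BSR > 2.3) and the second would not.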

Neural entrainment to word rate (1.3 Hz): main effect across all participants (top row); group differences (bottom row).

A. Design salience (left) and brain salience (right), derived through bootstrapping from the significant latent component (LC) for neural entrainment to the word rate. B. Individual raw PLVs extracted from the salient electrodes identified by the LC (black dots in panel A), with a fitted curve for visualization purposes only. C. PLS-C applied to PLVs at 1.3 Hz with adjacent frequencies subtracted, using group as a contrast and verbal outcome (verbal developmental quotient [DQ] collected at 18-21 months) as an additional design variable. D. Individual raw data extracted from the salient electrodes of the LC (black dots in panel C), with a linear regression line fitted for illustration purposes only.
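The "adjacent frequencies subtracted" measure in panel C can be sketched as follows. The PLV is computed with its standard definition (length of the mean unit phase vector across trials); the number of neighboring bins, the function names, and the toy values are assumptions, since the exact procedure is given in the Methods rather than in this legend.

```python
import cmath
import math

def phase_locking_value(phases):
    """PLV across trials: length of the mean unit phase vector (0 to 1)."""
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vector)

def neighbor_contrast(plv_by_bin, target, n_neighbors=2):
    """Target-bin PLV minus the mean PLV of up to n_neighbors bins on each side."""
    start = max(0, target - n_neighbors)
    stop = min(len(plv_by_bin), target + n_neighbors + 1)
    neighbors = [plv_by_bin[i] for i in range(start, stop) if i != target]
    if not neighbors:
        raise ValueError("no neighboring frequency bins available")
    return plv_by_bin[target] - sum(neighbors) / len(neighbors)

# Identical phases across trials give a PLV of 1; opposite phases cancel out.
aligned = phase_locking_value([0.3] * 10)
opposed = phase_locking_value([0.0, math.pi])

# Hypothetical PLV spectrum with a peak in the middle bin.
contrast = neighbor_contrast([0.1, 0.1, 0.5, 0.1, 0.1], target=2)
```

Subtracting the neighbors removes any broadband baseline in the PLV, so a positive contrast reflects a peak specific to the target frequency.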

Early event-related potential (ERP) to part-words compared to words across all participants.

A. Design and brain saliences derived from the significant latent component. Brain topographies of bootstrap ratios (BSR) are displayed at 250 ms intervals. Black dots indicate BSR > 2.3. B. Participants’ brain scores for the part-word and word conditions as a function of age. Brain scores are participants’ raw voltage data projected onto the electrode saliences; they illustrate how well individual EEG data fit the saliences derived from the latent component. Linear fits are shown for illustrative purposes only. C. Grand-averaged voltage (left) and differential (right) responses to the part-word and word conditions at each age bin.
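The projection that defines a brain score in panel B can be written as a dot product between a participant's voltages and the electrode saliences. The sketch below uses hypothetical values for three electrodes and a hypothetical function name; it only illustrates the projection, not the authors' full pipeline.

```python
def brain_score(voltages, saliences):
    """Project one participant's voltages onto the electrode saliences."""
    if len(voltages) != len(saliences):
        raise ValueError("voltages and saliences must cover the same electrodes")
    return sum(v * s for v, s in zip(voltages, saliences))

# Hypothetical data: three electrodes, one condition, one participant.
score = brain_score([1.0, 2.0, -1.0], [0.5, 0.5, 0.0])
```

A large positive score means the participant's topography resembles the salience pattern; an electrode with a salience of zero does not contribute.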

Late event-related potential (ERP) to part-words compared to words across all participants.

A. Design and brain saliences derived from the significant latent component. Brain topographies of bootstrap ratios (BSR) are displayed at 250 ms intervals. Black dots indicate BSR > 2.3. B. Participants’ brain scores for the part-word and word conditions as a function of age. For details, see Figure 4. C. Averaged voltage (left) and differential (right) responses to the part-word and word conditions at each age bin.

Late event-related potential (ERP) to novelty in infants at high and low likelihood for autism.

A. Design and brain saliences derived from the significant latent component. Brain topographies of bootstrap ratios (BSR) are displayed at 250 ms intervals. Black dots indicate BSR > 2.3. B. Participants’ brain scores for the part-word and word conditions as a function of age. For details, see Figure 4. C. Differential voltage responses to the part-word and word conditions at each age bin and within each group. HL: high likelihood for autism; LL: low likelihood for autism.

Sample characteristics

Statistical comparison between the LL and HL samples. For categorical variables, the chi-square (χ2) test was applied. For continuous variables, we used two-tailed independent-samples t-tests. P values < .05 are highlighted in bold.
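The two kinds of group comparison named above can be sketched with the standard library alone. The counts and scores are invented for illustration, the chi-square is computed without continuity correction, and the t statistic assumes pooled (equal) variances; the legend does not state which variants the authors used, so these are assumptions of the sketch.

```python
import math
import statistics

def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper-tail probability of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

def student_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / df
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (statistics.fmean(a) - statistics.fmean(b)) / se, df

# Hypothetical counts (rows: LL, HL; columns: two categories).
chi2_same, p_same = chi_square_2x2([[10, 10], [10, 10]])
chi2_diff, p_diff = chi_square_2x2([[20, 0], [0, 20]])

# Hypothetical continuous scores for two small groups.
t_value, df = student_t([1, 2, 3], [2, 3, 4])
```

The two-tailed p value for the t statistic requires the t distribution, which is not in the standard library; a statistics package would normally supply it.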

Neural entrainment data quality.

The LL mean trajectory is shown in red and the HL mean trajectory in blue, with their respective 95% confidence intervals. Included epochs are reported as absolute numbers.

ERP data quality.

The LL mean trajectory is shown in red and the HL mean trajectory in blue, with their respective 95% confidence intervals.

Neural entrainment at syllable rate (4 Hz) in STR and RND separately, PLS-C.

Yellow shading in the left panels and black dots in the middle panels indicate BSR > 2.3, which is considered significant.

Word entrainment within HL participants.

Supplementary PLS-C analysis comparing 1.33 Hz entrainment with neighboring frequencies within the HL population. The analysis confirmed the presence of a significant LC (p = .002, r = .50, 38.2% explained covariance) with the following BSRs: contrast: 12.9*; mean age: -3.2*; contrast*mean age: -5.3*; delta age: 5.9*; contrast*delta age: 6.7*; age²: 3.6*; contrast*age²: 1.9.

Neural entrainment time course over the experimental session considering all participants.

The filled squares below the plots mark the time samples at which PLVs are significantly greater than 0 (p < .05).

Group differences in the time course of syllable neural entrainment (4Hz).

Gray shading in the right panel indicates BSR < 2.3; BSR > 2.3 is considered significant. Vertical dashed lines indicate the transitions between random and structured streams.

ERP topographies across age bins and participants.

Figures S8 and S9 further divide ERP topographies by group, showing the response in both LL and HL at each age bin and for each condition.

ERP topographies in each group at 3mo and 6-9mo.

ERP topographies in each group at 12-15mo and 18-21mo.

Figures S10 and S11 display the PLS-C conducted within LL and HL groups separately, for the late time-window [1500-3000ms].

PLS-C conducted within LL for late response to novelty.

PLS-C conducted within HL for late response to novelty.