Summary of experiments

(A) Experiment 1: Letter familiarity task. Participants were shown upright and inverted versions of each letter and had to identify which one they had seen more often. (B) Experiment 2: Letter recognition task. Participants were shown upright letters and had to name each letter. (C) Experiment 3: Letter search task. Participants saw two boxes containing four letters each and had to identify the box containing the one odd or different item. Letters were either upright or inverted in interleaved trials. (D) Experiment 4: Rapid automatized naming (RAN) task. Participants were shown a card printed with a grid of digits and were asked to read them aloud from left to right as quickly as possible.

Letter shape familiarity and letter name knowledge in early readers

(A) Accuracy on the familiarity task (Experiment 1) plotted against accuracy on the letter recognition task (Experiment 2) across the 208 children who participated in both tasks. Each dot represents one child. Participants are divided into four groups based on familiarity and recognition accuracy for the subsequent analyses in panels D-G: high/low familiarity (cyan/blue) × high/low recognition (orange/red). The overall correlation across all participants is shown at the bottom left, with asterisks representing statistical significance (**** is p < .00005). (B) Same as (A) but using familiarity accuracy calculated only on the letters that were not recognized by each child. Recognition accuracy is calculated as before. Note that this means that familiarity accuracy is calculated over many more letters for children with low compared to high recognition accuracy. Nonetheless, children show high levels of letter shape familiarity even on letters that they did not recognize at all. This challenges the assumption that children become familiar with letter shapes only when they undergo formal reading instruction. (C) Same as (A) but using familiarity accuracy calculated only on recognized letters, shown for the sake of completeness. Note that familiarity accuracy is now calculated across many more letters for children with high compared to low recognition accuracy. As expected, children showed high levels of letter familiarity on letters that they did recognize. (D) Average search time for upright and inverted letter searches for participants in the high familiarity, low recognition group (n = 39). Error bars represent the standard error of the mean of the average search time across participants. Asterisks above the bars represent statistical significance, calculated using a sign-rank test on average search times for upright and inverted searches (** is p < .005).
(E) Same as (D) but for the high familiarity, high recognition group (n = 121, *** is p < .0005). (F) Same as (D) but for the low familiarity, low recognition group (n = 28). (G) Same as (D) but for the low familiarity, high recognition group (n = 20).
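The error bars and significance test in panels D-G can be illustrated with a minimal sketch. It assumes one average search time per child per condition, and it uses a normal approximation to the Wilcoxon signed-rank test without tie or continuity correction. The function names and the sample values are hypothetical, and the approximation is not necessarily the variant used for the figure.

```python
import math
import statistics


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def signed_rank_p(a, b):
    """Two-sided Wilcoxon signed-rank p-value (normal approximation).

    Assumption: no tie or continuity correction is applied, so the value
    is only approximate for small samples.
    """
    # Paired differences; zero differences carry no sign and are dropped.
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    if n == 0:
        return 1.0

    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1

    # Sum of ranks of positive differences, compared with its null mean.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean_w) / sd_w
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical per-child average search times in seconds (not real data).
upright = [2.1, 2.4, 1.9, 2.8, 2.2, 2.6]
inverted = [2.5, 2.9, 2.0, 3.1, 2.1, 3.0]
print(sem(upright), sem(inverted), signed_rank_p(upright, inverted))
```

For groups as small as n = 20 or n = 28, an exact signed-rank test from a statistics package may be preferable to this approximation.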

Factors that determine upright letter search advantage

(A) Correlation between the upright letter search advantage and each factor across children. Error bars represent the standard deviation estimated using a bootstrap analysis: participants were selected randomly with replacement 1,000 times and the correlation was calculated each time, and the error bar is taken as the standard deviation across these bootstrapped correlations. Statistically significant correlations are indicated using green bars with asterisks, and others using blue bars. Asterisks represent statistical significance (* is p < .05; ** is p < .005; *** is p < .0005, etc.). (B) Partial correlation between the upright search advantage and each factor across children. All other conventions are as before.
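The bootstrap error bar described for panel A can be sketched as follows. This is a minimal illustration that assumes a Pearson correlation (the caption does not name the correlation type) and uses hypothetical function names. The partial correlation shown here controls for a single covariate only, which is a simplification: panel B presumably controls for several factors at once, and that requires regressing out all of them.

```python
import math
import random
import statistics


def pearson(x, y):
    """Pearson correlation; returns None if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def bootstrap_corr_sd(x, y, n_boot=1000, seed=0):
    """SD of the correlation across bootstrap resamples of participants."""
    rng = random.Random(seed)
    n = len(x)
    corrs = []
    for _ in range(n_boot):
        # Resample participants with replacement, keeping (x, y) pairs intact.
        idx = [rng.randrange(n) for _ in range(n)]
        r = pearson([x[i] for i in idx], [y[i] for i in idx])
        if r is not None:  # skip degenerate resamples with zero variance
            corrs.append(r)
    return statistics.stdev(corrs)


def partial_corr_one_covariate(x, y, z):
    """First-order partial correlation of x and y, controlling for z only.

    Assumption: a single covariate; not the multi-factor case.
    """
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

Resampling whole participants, rather than resampling each variable independently, preserves the pairing between the search advantage and the factor, so the spread of the resampled correlations reflects sampling variability across children.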

Factors that determine RAN score & correlations between all factors

(A) Correlation between RAN score and each factor across children. Error bars represent the standard deviation estimated using a bootstrap analysis: participants were selected randomly with replacement 1,000 times and the correlation was calculated each time, and the error bar is taken as the standard deviation across these bootstrapped correlations. Statistically significant correlations are indicated using green bars with asterisks, and others using blue bars. Asterisks represent statistical significance (* is p < .05; ** is p < .005; *** is p < .0005, etc.). (B) Partial correlation between RAN score and each factor across children. All other conventions are as before. (C) Colormap of pairwise correlations between measures across all tasks. The color in each box represents the correlation coefficient between the corresponding factors. Asterisks inside the box represent statistical significance (* is p < .05, ** is p < .005, etc.).
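The asterisk convention used throughout these captions, and the pairwise matrix in panel C, can be sketched as below. This is an illustration only: it assumes Pearson correlations, caps the notation at four asterisks (the smallest threshold stated in the captions), and uses hypothetical function and measure names. The p-values are taken as given, since computing them needs a t or permutation distribution that is not shown here.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def stars(p, max_stars=4):
    """Map a p-value to asterisks: * p < .05, ** p < .005, and so on.

    Each extra asterisk lowers the threshold tenfold; max_stars is an
    assumed cap so that very small p-values stay readable.
    """
    count, threshold = 0, 0.05
    while count < max_stars and p < threshold:
        count += 1
        threshold /= 10
    return "*" * count


def correlation_matrix(measures):
    """Pairwise correlations for a dict of {measure name: per-child values}."""
    names = list(measures)
    return {(a, b): pearson(measures[a], measures[b])
            for a in names for b in names}


# Hypothetical measures for four children (not real data).
demo = {"familiarity": [0.6, 0.7, 0.9, 0.8], "recognition": [0.2, 0.5, 0.9, 0.7]}
for pair, r in correlation_matrix(demo).items():
    print(pair, round(r, 2))
```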