Figures and data

Visualisation of hypotheses.
We expected main effects on reading time of (a) cognitive load and (b) surprisal, as well as (c) an interaction of surprisal and cognitive load. Additionally, (d) we explored how these effects are modulated by age.

Experimental design and quantification of predictability as word surprisal using a large language model (GPT-2).
(a) Participants were asked to perform a self-paced reading task (Reading Only), which was complemented in some blocks by a secondary n-back task on the font colour of the words (Reading + 1-back and Reading + 2-back). The order of the blocks was pseudo-randomized, with Reading Only always being the first condition to be presented, followed by the two dual-task conditions, and another main block for each of the three conditions. Both dual-task paradigms (Reading + 1-back and Reading + 2-back) were first introduced in short single-task training sets. (b) We generated one surprisal score for each word in the reading material by using context chunks of two words as prompts for next-word predictions in GPT-2. The resulting probability for the actual next word in the text (here: “mail”, marked in teal) was then transformed into a surprisal score, which reflected how predictable the respective word was given the context. Additionally, based on the distribution of probabilities for all possible continuations, we computed an entropy score, which reflects the uncertainty in predicting the next word. Please note that the example sentence used here has been translated to English for better comprehensibility, while the original text materials were in German.
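The transformation from next-word probabilities to surprisal and entropy described in (b) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the probabilities are hypothetical toy values rather than GPT-2 output, and the use of base-2 logarithms (bits) is an assumption, as the caption does not state the log base.

```python
import math
from typing import Mapping


def surprisal(prob: float) -> float:
    """Surprisal of the observed word: negative log probability (base 2 assumed)."""
    return -math.log2(prob)


def entropy(distribution: Mapping[str, float]) -> float:
    """Shannon entropy over all candidate continuations (base 2 assumed)."""
    return -sum(p * math.log2(p) for p in distribution.values() if p > 0)


# Hypothetical next-word distribution for one two-word context.
# These values are invented for illustration and do not come from GPT-2.
next_word_probs = {"mail": 0.5, "door": 0.25, "news": 0.125, "dog": 0.125}

word_surprisal = surprisal(next_word_probs["mail"])  # how unexpected "mail" is
context_entropy = entropy(next_word_probs)           # uncertainty before the word

print(word_surprisal, context_entropy)
```

Surprisal depends only on the probability of the word that actually occurred, whereas entropy depends on the whole distribution and is therefore defined before the next word is seen.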

Estimated marginal effects of predictors age, cognitive load and surprisal on task performance and reading time.
Main effects of cognitive load and age on accuracy in the comprehension question task (panel a) and on n-back task performance (d-primes; panel b). Please note that we do not show d-primes for the Reading Only task as there was no n-back task in this condition. Reading time increased with increasing age and word surprisal (panel c; left: results from the linear mixed model, LMM; right: results from the generalised additive model, GAM; for an explanation see section Modelling Potential Non-Linear Distributions). In panel d, we show the 2-way interactions of cognitive load and surprisal (left) and of cognitive load and age (middle). In both cases, effects were strongest in the Reading Only condition (see barplot insets). Additionally, we show how age modulates the effect of surprisal on reading time (panel d, right). For raw and predicted individual trajectories, please see Figs. S2 and S3 in the Supplementary Material. Estimated marginal effects were adjusted for “Reading Only” as the reference level.
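For readers unfamiliar with d-prime, the sensitivity index used for n-back performance can be sketched as the difference between the z-transformed hit rate and false-alarm rate. The sketch below is illustrative only: the function name, the trial counts and the log-linear correction for extreme rates are assumptions, as the caption does not specify how d-primes were computed.

```python
from statistics import NormalDist


def d_prime(hits: int, misses: int, false_alarms: int, correct_rejections: int) -> float:
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Assumption: a log-linear correction (add 0.5 to each count) keeps both
    rates strictly between 0 and 1 so that the z-transform stays finite.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts for one participant in one n-back block.
example = d_prime(hits=18, misses=2, false_alarms=3, correct_rejections=37)
print(round(example, 2))
```

A d-prime of zero indicates that targets and non-targets were not discriminated at all; larger values indicate better n-back performance.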

Main results of the model for reading time (N = 175).

Results of the simple slopes analysis and exemplary marginal effects plots for three different ages.
In the Johnson-Neyman plot [46] on the left side of panel (a), we show the effect of surprisal on reading time across the whole age range, separated by cognitive load condition: Reading Only (top; blue), 1-back Dual Task (middle; yellow) and 2-back Dual Task (bottom; red). The stronger the surprisal effect for a certain age, the higher the value on the y-axis. Grey areas indicate age ranges for which we did not find an effect of surprisal on reading time in the respective condition, whereas blue areas indicate a significant surprisal effect (see inset on the right for a visualisation of a non-significant effect in a younger participant and a significant effect in an older participant). In panel (b), we show the predicted surprisal effect in each cognitive load condition for an average young (average age − 1 SD), middle-aged (average age) and older participant (average age + 1 SD). The bar plots illustrate the predicted effects of surprisal on reading time across the three cognitive load conditions for those three average participants.
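The logic behind a simple slopes analysis can be sketched for a simplified model with only surprisal, age and their interaction. In such a model, the surprisal effect at a given age is the surprisal coefficient plus the interaction coefficient times age, and its standard error follows from the coefficients' variances and covariance. All coefficient values below are hypothetical, age is assumed to be z-scored, and the critical value of 1.96 assumes a normal approximation; the study's actual model additionally includes cognitive load.

```python
import math


def simple_slope(b_surprisal: float, b_interaction: float, age: float) -> float:
    """Effect of surprisal at a given (z-scored) age."""
    return b_surprisal + b_interaction * age


def simple_slope_se(var_surprisal: float, var_interaction: float,
                    cov: float, age: float) -> float:
    """Standard error of the simple slope at a given age."""
    return math.sqrt(var_surprisal + age ** 2 * var_interaction + 2 * age * cov)


def is_significant(slope: float, se: float, critical: float = 1.96) -> bool:
    """Two-sided test under a normal approximation (an assumption)."""
    return abs(slope / se) > critical


# Hypothetical estimates, invented for illustration only.
B_SURPRISAL, B_INTERACTION = 0.02, 0.01
VAR_SURPRISAL, VAR_INTERACTION, COV = 1e-4, 2.5e-5, 0.0

for age_z in (-1.0, 0.0, 1.0):  # average age - 1 SD, average age, average age + 1 SD
    slope = simple_slope(B_SURPRISAL, B_INTERACTION, age_z)
    se = simple_slope_se(VAR_SURPRISAL, VAR_INTERACTION, COV, age_z)
    print(age_z, round(slope, 3), is_significant(slope, se))
```

A Johnson-Neyman plot extends this idea by evaluating the slope over the full age range and marking the intervals in which it differs significantly from zero.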

Results of the internal online replication in comparison with the results of the online sample of the original study.
Shown are estimates ± CI for all main effects of surprisal, age, and cognitive load as well as their 2-way and 3-way interactions. RO: Reading only. For full results please see Table S3. For a comparison of age distributions in the original online and lab sample and the online replication sample, please see Fig. S1. Please note that effects are grouped by their magnitude.

Cumulative effect of surprisal on reading time.
To illustrate the cumulative effect of surprisal on reading time over the course of a text, we predicted reading times for an average younger (27 years, M − 1 SD) and an average older (63 years, M + 1 SD) participant in the easy Reading Only condition (blue) and in the most challenging condition, the 2-back Dual Task (red), and computed the cumulative sum for a short example sentence. Panel a illustrates how total reading time gradually increases over the course of the sentence, with all predictors held constant at their average, except for age, cognitive load and word length. In panel b, we again show cumulative reading times, this time isolating the effect of surprisal. Please note that surprisal values are zero for the first two words, as our GPT-2 model estimates surprisal based on the two preceding words, which are unavailable at the beginning of the sentence. The example sentence used in both panels is the German translation of the opening line of Anna Karenina, “Happy families are all alike, every unhappy family is unhappy in its own way” [48].
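The cumulative sum described here amounts to a running total of the predicted per-word reading times. The sketch below uses hypothetical predicted values in milliseconds for the first five words of the English translation; the numbers are invented for illustration and are not model output.

```python
from itertools import accumulate

# First words of the example sentence (English translation).
words = ["Happy", "families", "are", "all", "alike"]

# Hypothetical predicted reading times per word, in milliseconds.
predicted_rt_ms = [310.0, 355.0, 290.0, 300.0, 340.0]

# Running total: element i is the predicted time needed to read words 0..i.
cumulative_rt_ms = list(accumulate(predicted_rt_ms))

for word, total in zip(words, cumulative_rt_ms):
    print(f"{word:<10}{total:>8.1f}")
```

Plotting such running totals for two participants or two conditions shows how small per-word differences add up over a sentence.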