1 Introduction

Ageing is a major risk factor for a wide range of major pathologies of the brain [López-Otín et al., 2013] including neurodegenerative disorders [Planche et al., 2022] and stroke [Ellekjær et al., 1997, Feigin et al., 2014]. Changes in human brain electrophysiology are observable through changes in the power spectrum of electromagnetic fields measured by EEG and MEG. These approaches offer a direct perspective on temporal synchronization of neuronal activity [Babiloni et al., 2020]. This synchronization depends on the integrity of network connections and is known to degrade throughout the ageing process [Maestú and Fernández, 2020]. Such effects have been observed in the neuronal power spectrum of both MEG [Brady et al., 2020, Gómez et al., 2013, Hoshi and Shigihara, 2020, Rempe et al., 2023, Stier et al., 2023, Osipova et al., 2005, Hughes et al., 2019] and EEG recordings [Klimesch, 1999, Park et al., 2024, Turner et al., 2023, Zibrandtsen and Kjaer, 2021]. Non-invasive measures of brain electrophysiology are a highly promising approach to contribute toward the global aim of allowing more people to realise their potential across the full lifespan [World Health Organization, 2015].

Electrophysiological signatures of ageing across studies in the literature are diverse, or even conflicting. It can be challenging to reconcile differences across differing analysis choices, frequency bands, and brain regions [Donoghue et al., 2021]. For example, increasing age has a longstanding association with a reduction in posterior alpha band (7-13Hz) power [Klimesch, 1999]. Yet, the broader literature reveals that this effect varies in magnitude, and may even reverse direction, across brain regions, frequency bands, and analysis pipelines [Gómez et al., 2013, Hoshi and Shigihara, 2020, Lodder and van Putten, 2011, Park et al., 2024, Pathak et al., 2022, Quinn et al., 2024, Rempe et al., 2023, Scally et al., 2018, Zibrandtsen and Kjaer, 2021]. Credible scientific claims must be supported by evidence of replicability in new datasets and robustness to reasonable variations in analysis [Botvinik-Nezer and Wager, 2023, Nosek and Errington, 2020, Nosek et al., 2022], but our ability to integrate results across studies is hindered by the large variety of analytic choices used in the literature. This makes it difficult to synthesize findings from multiple datasets, either as a manual comparison or a formal meta-analysis.

Neuroscience researchers working on MRI facilitate comparability and meta-analyses by computing and distributing whole-volume statistical maps along with results of individual studies [Gorgolewski et al., 2015]. A similar approach is possible for spectral analysis of MEG/EEG, e.g. computing and sharing full frequency and whole head estimates of critical effects without binning results into frequency bands or regions of interest. Features such as canonical frequency bands may be a useful guide for interpretation of results, but they can oversimplify the profile of the underlying spectrum and add a source of variability between studies that can prevent aggregation of results. Reporting of whole-head and full-frequency spectra of effect estimates would make it straightforward to aggregate across studies and eventually enable identification of sub-threshold effects that may be missed in single analyses but consistent across studies.

Effect sizes are another essential tool for comparing and aggregating results across studies. They quantify the magnitude, or practical significance, of an estimated effect without reducing it to a hypothesis test with a binary outcome that might depend on practical factors such as sample size [Smith and Nichols, 2018]. Critically, effect sizes allow efficient use of data by supporting planning for future data samples through power analyses. Statistically underpowered studies are more likely to lead to incorrect conclusions and inaccurate estimates of effect sizes even when all other aspects of experimental design and analysis have been conducted perfectly [Button et al., 2013, Cremers et al., 2017]. This is critical for neuroimaging analyses where different spatial, temporal or spectral components of the same signal feature can have different effect sizes [Baker et al., 2021]. Many guidelines on good practice in electrophysiology research encourage reporting effect sizes [Gross et al., 2013, Hillebrand et al., 2018, Baker et al., 2021], though the practice remains relatively uncommon.

Here, we use a GLM-Spectrum [Quinn et al., 2024] approach to quantify and visualise age effects in the CamCAN dataset [Shafto et al., 2020]. The method uses multiple regression to model the relationships between multiple explanatory variables and the oscillatory power at each spatial location and frequency bin. The approach is extended with between-subject effect size estimation and with replication across multiple datasets. An additional exploration examines how the age effect changes when other covariates are included in the model. The results are discussed in the context of the wider literature on age effects and lay the groundwork for rigorous evaluation and benchmarking of potential biomarkers for brain health in ageing.

2 Results

We first quantify the change in the MEG frequency spectrum across the adult lifespan using the CamCAN dataset in section 2.1. Section 2.2 explores the effect sizes of these effects and provides broad estimates of future sample sizes for planned replications of our results. We next demonstrate the reproducibility of these results across several MEG-UK datasets in section 2.3 and their robustness across analyses that control for a range of potential confounding variables in sections 2.4 and 2.5. Finally, we quantify the impact of including grey matter volume as a covariate on the effect size of the age effect in section 2.6.

2.1 A whole head and full-frequency depiction of age-related change to the neuronal power spectrum

MEG data from CamCAN [Shafto et al., 2020] for 557 participants during eyes-closed resting state were analysed. After data pre-processing and cleaning, a two-level GLM-Spectrum [Quinn et al., 2024] approach was used to characterise the average power and the linear slope effect of age across the spectrum for each sensor. Figure 1A uses the estimated GLM-Spectrum coefficients to show the model prediction of how the spectrum changes across age. This shows a characteristic pattern that combines a variety of previously reported results into a single analysis. Importantly, these effects are continuous across the spectrum and would be challenging to unpick without the full-spectrum approach used here.
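The group-level step of this analysis can be illustrated with a short mass-univariate regression. The following is a minimal sketch under stated assumptions, not the authors' implementation: it covers only the between-participant level, z-scores the age regressor (an assumption), and uses synthetic data with a hypothetical function name `fit_glm_spectrum`.

```python
import numpy as np

def fit_glm_spectrum(spectra, age):
    """Fit power ~ intercept + age separately at every sensor and frequency bin.

    spectra : array, shape (n_participants, n_sensors, n_freqs)
    age     : array, shape (n_participants,)
    Returns betas and t-values, each of shape (2, n_sensors, n_freqs).
    """
    n = spectra.shape[0]
    age_z = (age - age.mean()) / age.std()        # z-scored age (assumption)
    X = np.column_stack([np.ones(n), age_z])      # design matrix: mean + age
    Y = spectra.reshape(n, -1)                    # one column per sensor-frequency pair
    betas, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ betas
    dof = n - X.shape[1]
    sigma2 = (resid ** 2).sum(axis=0) / dof       # residual variance per test
    xtx_inv_diag = np.diag(np.linalg.inv(X.T @ X))
    se = np.sqrt(xtx_inv_diag[:, None] * sigma2[None, :])
    tvals = betas / se
    out_shape = (X.shape[1],) + spectra.shape[1:]
    return betas.reshape(out_shape), tvals.reshape(out_shape)

# Synthetic example: 200 participants, 4 sensors, 10 frequency bins,
# with an age effect injected at sensor 0, frequency bin 3.
rng = np.random.default_rng(0)
age = rng.uniform(18, 88, size=200)
spectra = rng.normal(size=(200, 4, 10))
spectra[:, 0, 3] += 0.05 * (age - age.mean())
betas, tvals = fit_glm_spectrum(spectra, age)
```

Here `betas[0]` is the mean spectrum and `betas[1]` is the age slope, so a model-predicted spectrum for a given z-scored age is `betas[0] + betas[1] * age_z`.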

Effect of age on the relative magnitude spectrum across space and frequency.

A) Model-predicted spectra (averaged across all sensors) for 4 equally spaced ages across the participant age range with inset histogram of participant ages within CamCAN. B) Spectrum of t-values quantifying the age effect across space and frequency. Non-parametric permutations with maximum statistics were used to control for multiple comparisons across sensors and frequency bins. While this permutation testing was not cluster-based, contiguous clusters of significant sensors were computed post-hoc for visualisation. The largest 6 spatially and spectrally contiguous areas of statistically significant effects are highlighted in frequency by black bands at the top of the spectrum and highlighted in space by sensors marked with a white circle in the adjoining topography. C) A 2D frequency-by-space map of all statistically significant effects. Sensor-frequency combinations that do not reach statistical significance have a faded colour scale. Blue regions indicate decreasing spectral power with age and red regions indicate increases.

The t-statistic spectrum (Figure 1B) is the test statistic for the linear effect of age at each point in frequency for each sensor. A positive t-value indicates that power in that frequency bin at that sensor position increases with age, and vice versa for negative t-values. The statistical significance of the t-statistic spectrum of the age effect (i.e., rejecting the hypothesis that there is no age effect) was computed using non-parametric permutations with maximum statistics to control for multiple comparisons across sensors and frequency bins. While this permutation testing was not cluster-based, contiguous clusters of significant sensors were computed post-hoc for visualisation. Figure 1B shows the t-value spectrum of the age effect with the largest post-hoc clusters overlaid and Figure 1C shows the effect organised into a 2D image across space (anterior to posterior) and frequency as a statistical parametric map [Litvak et al., 2011]. The 2D image representation simplifies visualisation as it does not rely on colour-coding of sensor location. Critically, both the t-statistic spectrum and 2D image are simple visualisations of the age effect that allow statistically significant and sub-threshold effects to be shared and reused in the literature.
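The logic of maximum-statistic permutation testing can be sketched as follows. This is an illustrative example under assumptions that are not taken from the paper: the null distribution is built by shuffling the age values across participants, the model contains only an intercept and age, and the function names and the number of permutations are hypothetical.

```python
import numpy as np

def age_tstats(Y, age):
    """t-statistic of the age slope for each column of Y (participants x tests)."""
    n = len(age)
    x = age - age.mean()
    Yc = Y - Y.mean(axis=0)
    sxx = (x ** 2).sum()
    beta = x @ Yc / sxx                            # slope per test
    resid = Yc - np.outer(x, beta)
    sigma2 = (resid ** 2).sum(axis=0) / (n - 2)    # residual variance per test
    return beta / np.sqrt(sigma2 / sxx)

def max_stat_permutation(Y, age, n_perms=500, alpha=0.05, seed=0):
    """Family-wise error control using the null distribution of the maximum |t|."""
    rng = np.random.default_rng(seed)
    observed = age_tstats(Y, age)
    null_max = np.empty(n_perms)
    for i in range(n_perms):
        shuffled = rng.permutation(age)            # break the age-data pairing
        null_max[i] = np.abs(age_tstats(Y, shuffled)).max()
    threshold = np.quantile(null_max, 1 - alpha)   # one threshold for all tests
    return observed, threshold, np.abs(observed) > threshold

# Synthetic example: 150 participants, 50 sensor-frequency tests,
# with a true age effect only in column 7.
rng = np.random.default_rng(1)
age = rng.uniform(18, 88, size=150)
Y = rng.normal(size=(150, 50))
Y[:, 7] += 0.04 * (age - age.mean())
observed, threshold, significant = max_stat_permutation(Y, age)
```

Because a single threshold is taken from the distribution of the maximum statistic over all tests, every sensor-frequency pair is compared against the same critical value, which is what controls for multiple comparisons without requiring clusters.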

The maximum-statistic permutation testing results identified a broad set of effects, including six contiguous regions within the age-effect spectrum (Figure 1B). Firstly, older adults have lower spectral magnitude than young adults at low frequencies (below 7 Hz). This is a spatially broad effect that peaks in central/frontal regions. This analysis does not separate different effects in low frequency bands such as delta and theta. Rather, they appear as a contiguous effect that spreads across the whole 1-7 Hz range. Older adults have relatively high magnitude around 7-8.5 Hz, and younger adults have higher magnitude around 9.5-12.5 Hz. Both effects peak in posterior/occipital regions, with the negative age effect at higher frequencies spreading more widely across the sensor array. These effects are visible as distinct peaks with opposite direction (Figure 1B) in the GLM-Spectrum of the age effect. Referring to the model-predicted spectra (Figure 1A) suggests that this may reflect a slowing of the alpha peak frequency. Older adults have higher spectral magnitude between 12.5-26 Hz in central/motor regions corresponding to sensorimotor band activity. Finally, two effects are identified with centre frequencies above 30 Hz. Older adults have higher relative magnitude in posterior central sensors at frequencies between 32.5 and 50 Hz and have lower relative magnitude in fronto-central regions in the gamma region between 51.5 and 80 Hz (frequencies above 100 Hz were not included in the model). This overall pattern of findings is consistent in an equivalent source space analysis using LCMV beamforming and parcellation (Figure 9).

This simple mass-univariate analytic approach allows a broad range of effects to be investigated within a single spectrum. Critically, this representation contains derived effects that are not explicitly modelled by the GLM. A strong reduction in alpha peak frequency is qualitatively visible in the model projected power spectrum (Figure 1A). Whilst no single parameter directly fits this feature in the GLM-Spectrum, this non-linear pattern is captured by the profile of linear parameters across channels and frequencies. The decrease in alpha peak frequency can be recovered by extrema detection carried out on the model projected spectra (Figure 2B) and shows a prominent decrease in alpha power and alpha frequency across the course of ageing. Critically the GLM-Spectrum representation retains non-linear changes in these features across age (Figure 2B & C).
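The extrema-detection step can be illustrated on model-projected spectra. The sketch below is hypothetical: the coefficient profiles are synthetic Gaussian bumps chosen for illustration, the 7-13 Hz search band follows the alpha range quoted in the introduction, and `scipy.signal.find_peaks` stands in for whichever peak detector was actually used.

```python
import numpy as np
from scipy.signal import find_peaks

def project_spectrum(beta_mean, beta_age, age_z):
    """Model-projected spectrum for a given z-scored age."""
    return beta_mean + beta_age * age_z

def alpha_peak(freqs, spectrum, band=(7.0, 13.0)):
    """Return (frequency, magnitude) of the largest local maximum within the band."""
    mask = (freqs >= band[0]) & (freqs <= band[1])
    f_band, s_band = freqs[mask], spectrum[mask]
    peaks, _ = find_peaks(s_band)
    if len(peaks) == 0:
        return np.nan, np.nan
    best = peaks[np.argmax(s_band[peaks])]
    return f_band[best], s_band[best]

def bump(freqs, centre, width):
    return np.exp(-0.5 * ((freqs - centre) / width) ** 2)

# Synthetic coefficients: a mean alpha peak at 10 Hz, and an age effect that is
# positive around 8 Hz and negative around 11 Hz (two purely linear terms).
freqs = np.arange(1, 40, 0.5)
beta_mean = bump(freqs, 10, 1.5)
beta_age = 0.2 * bump(freqs, 8, 1.0) - 0.2 * bump(freqs, 11, 1.0)

peak_young, mag_young = alpha_peak(freqs, project_spectrum(beta_mean, beta_age, -1.5))
peak_old, mag_old = alpha_peak(freqs, project_spectrum(beta_mean, beta_age, 1.5))
```

Although every coefficient enters the projection linearly, the location of the maximum moves with age, which is how a shift in peak frequency can emerge from a profile of linear parameters.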

Model projected age effect retains non-linear properties of decrease in alpha peak frequency and magnitude.

A) Model predicted spectra (averaged across all sensors) for the oldest and youngest participants in CamCAN with the alpha peak for intermediate ages overlaid in the thick line. B) The parameter estimates of age that combine to describe the peak frequency shift seen in A). C) Scatter plot of individual alpha peak frequencies against age. A linear regression line fitted directly to the individual alpha peaks is shown in red and the GLM-Spectrum derived alpha peaks from the model projected spectra is shown in blue. D) As C for individual alpha peak magnitude.

2.2 Effect size of age is variable across space and frequency, with consequences for sample size planning

Neuroimaging analyses, such as the results in Figure 1, typically focus on testing hypotheses about the presence of effects in brain imaging data, not estimating their magnitude. These hypothesis tests are sample-size dependent, so that practically insignificant effects may become significant in sufficiently large datasets. Effect sizes estimate the strength of a phenomenon or relationship, independently of sample size. They better reflect an effect’s importance in a clinical or behavioural context [Reddan et al., 2017] and can be used as a basis for power analysis (i.e. sample size planning) to inform future research. In this analysis, the effect size of the age regressor within the GLM-Spectrum was computed using Cohen’s F² [Cohen, 1988, Selya et al., 2012] for each sensor and frequency bin in the analysis.

Cohen’s F² quantifies the variance in the data that is uniquely explained by an explanatory variable (i.e., regressor) of interest within a multiple regression model, expressed relative to the variance left unexplained by the full model.
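For a single sensor and frequency bin, this effect size can be computed from the R² of two nested models, one with and one without the regressor of interest. The following is a minimal sketch with synthetic data; the function names are hypothetical and the example is not the authors' code.

```python
import numpy as np

def r_squared(X, y):
    """Coefficient of determination for an ordinary least squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    total = y - y.mean()
    return 1.0 - (resid @ resid) / (total @ total)

def cohens_f2(X_full, X_reduced, y):
    """Local Cohen's f2: uniquely explained variance over unexplained variance."""
    r2_full = r_squared(X_full, y)
    r2_reduced = r_squared(X_reduced, y)
    return (r2_full - r2_reduced) / (1.0 - r2_full)

# Synthetic example: power at one sensor-frequency pair depends linearly on age.
rng = np.random.default_rng(2)
n = 300
age = rng.uniform(18, 88, size=n)
power = 0.02 * (age - age.mean()) + rng.normal(size=n)

X_reduced = np.ones((n, 1))                        # intercept only
X_full = np.column_stack([np.ones(n), age])        # intercept + age
f2_age = cohens_f2(X_full, X_reduced, power)
r2_full = r_squared(X_full, power)
```

When the reduced model contains only an intercept, its R² is zero and the expression simplifies to R² / (1 - R²) of the full model.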

The spectrum of effect sizes for the age regressor (Figure 3A) indicates the strength or practical impact of ageing on the spectrum. The effect size spectrum contains local peaks that correspond to the effects highlighted in section 2.1. The largest effects, with local peaks at 5, 8, 10.5, 15, 36, and 60 Hz, have close spatial and spectral correspondence to extrema in the t-spectrum in Figure 1B. Low frequency, upper alpha and beta range results show effect sizes peaking between F² = 0.2 and F² = 0.3, indicating that ageing had a relatively large practical impact on the spectrum in these bands. In contrast, though they achieved statistical significance in the hypothesis tests (Figure 1), the lower alpha and gamma results had effect sizes of F² ≤ 0.075, indicating a relatively small practical impact of ageing on the spectrum. These effect sizes can be used to construct sample size estimates to inform future study planning. Power contours [Baker et al., 2021] across different effect sizes and sample sizes can provide a visual guide to the number of participants required to achieve a given statistical power in a future replication of the present analyses (Figure 3B).

Cohen’s F² effect size for age.

A) Cohen’s F² effect size for age across space and frequency. Each line is a sensor, with a colour matched to the sensors in the topography shown in the inset. The six topographies show spatial distributions across the sensor array at the peak frequency of the six significant effects identified in Figure 1B. B) Contours showing the relationship between effect size and sample size for five different experimental power levels. As sample sizes get larger, there is sufficient power to reliably detect smaller effect sizes. C) Effect size spectra with bootstrapped 95% confidence intervals at the peak sensor for the four largest effects identified in Figure 1B.

Effect sizes are estimates whose precision can vary depending on the dataset in question. To indicate this variability, bootstrapped confidence intervals were computed for each sensor and frequency pair. The confidence limits of the effect size estimates vary substantially across sensors and frequencies (Figure 3C shows the effect sizes with confidence intervals at the peak sensor for the first four significant effects).
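A bootstrap confidence interval for the effect size at one sensor and frequency bin can be sketched as below. The resampling scheme (participants drawn with replacement) and the percentile interval are assumptions made for illustration, as the text does not specify which bootstrap variant was used. The example uses a model with only an intercept and age, for which R² equals the squared Pearson correlation.

```python
import numpy as np

def f2_age_only(age, y):
    """Cohen's f2 for age in a model with intercept + age, where R2 = r**2."""
    r = np.corrcoef(age, y)[0, 1]
    r2 = r ** 2
    return r2 / (1.0 - r2)

def bootstrap_f2_ci(age, y, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval, resampling participants with replacement."""
    rng = np.random.default_rng(seed)
    n = len(age)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)           # resampled participant indices
        estimates[i] = f2_age_only(age[idx], y[idx])
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(estimates, [tail, 1.0 - tail])
    return lower, upper

# Synthetic example for a single sensor-frequency pair.
rng = np.random.default_rng(3)
age = rng.uniform(18, 88, size=300)
power = 0.02 * (age - age.mean()) + rng.normal(size=300)
f2_point = f2_age_only(age, power)
ci_lower, ci_upper = bootstrap_f2_ci(age, power)
```

Repeating this for every sensor-frequency pair gives a confidence band around the whole effect size spectrum, with wider bands expected in smaller samples.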

We use the 95% confidence intervals around the effect sizes to recommend sample sizes for future studies that plan to replicate these results. The observed power calculations should not impact the interpretation of the present results, and should only be used as a general guideline for planning future studies. Table 1 gives a full summary of the peak statistics and future sample size range for the six age effects identified in Figure 1B. These results have implications for future sample planning for resting state electrophysiology studies of ageing. Whilst the results with relatively large effect sizes would have well powered replications with sample sizes of around 50-60 participants, the smaller results would require several hundred participants to have the same probability of detecting the effect if it is true (Table 1). This indicates that study samples should be planned with the smallest effect of interest in mind and that ageing effects in different frequency bands may not all be well powered within the same sample.
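The link between an effect size and a required sample size can be illustrated with a standard power calculation for the F-test of a single regressor. This sketch rests on assumptions that are not stated in the text: an uncorrected alpha of 0.05, a noncentrality parameter of F² multiplied by the sample size (one common convention), and hypothetical function names. It will therefore not necessarily reproduce the values in Table 1.

```python
from scipy import stats

def power_for_f2(f2, n, n_predictors=1, alpha=0.05):
    """Power of the F-test for one regressor with effect size f2 and n participants."""
    df_num = 1                                     # one regressor under test
    df_den = n - n_predictors - 1                  # residual degrees of freedom
    critical = stats.f.ppf(1.0 - alpha, df_num, df_den)
    noncentrality = f2 * n                         # assumed convention
    return stats.ncf.sf(critical, df_num, df_den, noncentrality)

def required_n(f2, target_power=0.8, n_predictors=1, alpha=0.05, max_n=100000):
    """Smallest sample size whose power reaches the target."""
    n = n_predictors + 3
    while n < max_n and power_for_f2(f2, n, n_predictors, alpha) < target_power:
        n += 1
    return n

# Sample sizes for a large, a moderate and a small effect size.
n_large = required_n(0.25)
n_moderate = required_n(0.15)
n_small = required_n(0.05)
```

Applying `required_n` to the lower and upper limits of a bootstrapped confidence interval gives a range of sample sizes rather than a single number, which mirrors how the forecasts are presented in the text.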

Summary of significant spatio-spectral regions containing an age effect in CamCAN.

The estimated sample sizes are the range of sample sizes needed to have an 80% chance of detecting the effect in a future sample, assuming both that the effects here are true and that the effect sizes are well estimated. Sample size ranges are computed from bootstrapped 95% confidence intervals for the effect size estimate. The lower bound of the 95% confidence intervals can be taken as a smallest effect size of interest. The sample size forecasts are not informative about the present results and serve only as a guide for future study planning.

2.3 The spectral profile of age effects on brain electrophysiology is replicable across datasets

To explore the replicability of the ageing effect reported from the CamCAN dataset, the analysis was repeated on eyes-closed resting state recordings from three smaller datasets from the MEG-UK database (https://meguk.ac.uk/database/). Details on the similarities and differences between the datasets are summarised in Table 2. Replicability was quantified by the correlation of the whole spectral profile of the age effect between datasets.

Summary of significant spatio-spectral regions containing an age effect in CamCAN.

The estimated sample sizes are the range of sample sizes needed to have an 80% chance of detecting the effect in a future sample, assuming both that the effects here are true and that the effect sizes are well estimated. Sample size ranges are computed from bootstrapped 95% confidence intervals for the effect size estimate. The sample size forecasts are not informative about the present results and serve only as a guide for future study planning.

Overall, the spectral profile of the ageing effect is highly replicable across MEG datasets acquired on different systems and across different facilities. We identified high correlations between the spectral profiles of the age parameter estimates, inferential statistics, and effect sizes in each of the four datasets (Figure 4). In particular, the spectral profiles for the parameter estimates and t-statistics were strikingly similar across datasets (Pearson’s R between 0.67 and 0.97 for parameter estimates and between 0.76 and 0.92 for t-statistics), suggesting a high level of replicability. The spectral profile of effect sizes was also positively correlated across datasets but to a lesser extent (Pearson’s R between 0.53 and 0.84). Maximum and minimum t-statistics for CamCAN are substantially larger in magnitude than in the smaller MEG-UK datasets. This is an expected consequence of the larger sample size of CamCAN on null-hypothesis test statistics such as t-values.
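The replicability measure can be illustrated by correlating sensor-averaged spectral profiles between datasets. The sketch below uses synthetic profiles and hypothetical dataset labels, and it assumes that all datasets share a common frequency axis. Averaging over sensors follows the description in the Figure 4 caption and allows datasets with different sensor counts to be compared.

```python
import numpy as np

def profile_correlations(profiles):
    """Pearson correlation between sensor-averaged spectral profiles.

    profiles : dict mapping dataset name -> array (n_sensors, n_freqs),
               all sharing the same frequency axis.
    """
    names = list(profiles)
    averaged = np.vstack([profiles[name].mean(axis=0) for name in names])
    return names, np.corrcoef(averaged)

# Synthetic example: three datasets sharing one underlying age profile
# but with different sensor counts and noise levels.
rng = np.random.default_rng(4)
freqs = np.linspace(1, 95, 189)
true_profile = np.sin(freqs / 8.0) * np.exp(-freqs / 40.0)
profiles = {
    "dataset_a": true_profile + 0.05 * rng.normal(size=(102, freqs.size)),
    "dataset_b": true_profile + 0.20 * rng.normal(size=(64, freqs.size)),
    "dataset_c": true_profile + 0.40 * rng.normal(size=(32, freqs.size)),
}
names, corr = profile_correlations(profiles)
```

The same function can be applied to parameter estimates, t-statistics or effect sizes, giving one correlation matrix per quantity.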

Replicability of the parameter estimates, t-statistics, and effect sizes of ageing on the neuronal power spectrum.

Ai) Spectrum of parameter estimates across frequency (averaged over all sensors) for each of the four datasets. Aii) Correlation matrix indicating similarity of the sensor-averaged frequency profile of the age effect between datasets. Bi & Bii) As A for null hypothesis test statistics. Ci & Cii) As A for effect sizes.

The topography of age-related change is qualitatively reproduced across all datasets (Figure 5A) with some variability in the extent and spread of effects in specific datasets. Whilst the lower frequency effects are highly consistent, there is more variability in the higher frequency effects (above 30 Hz), which can change direction across participants (Figure 5B, C & D). The smaller data samples have more variable estimates for Cohen’s F² when compared to CamCAN. Whilst the interpretation of effect sizes doesn’t rely on reaching a predetermined significance threshold, they do depend on having enough data to be considered reliable. This is reflected in the greater variability in effect sizes seen in the smaller data samples. The overall profiles of Cohen’s F² are similar across datasets, yet the peak effect size in the low-frequency and beta ranges can still vary by a factor of two or three between data samples. This may arise from relatively poor estimates of the population level variability from smaller data samples.

The spatial and spectral profile of the effect of age on resting-state brain electrophysiology is replicable across datasets.

A) GLM-Spectrum parameter estimates quantifying the age effect across space and frequency for the four datasets in the replicability analysis with distribution of participant ages shown in the inset figure. The four datasets share the key features highlighted in Figure 1. B) As A) for t-statistics testing the hypothesis that the parameter estimate of the age effect is different to zero. C) As B) but visualised as a 2D image in which each row indicates a single sensor with their y-axis position sorted by spatial location on the anterior-posterior axis. D) As A) for Cohen’s F² effect sizes for the age regressor in the GLM-Spectrum.

2.4 The robustness of the age effect to head position correction

The results so far have looked at age effects on the relative spectral magnitude of each frequency compared to the sum across frequencies. Whilst the absolute spectral magnitude is simpler to interpret in terms of underlying neuronal activity, it can be confounded by the position of the participant relative to the MEG sensors (some sensors will be closer to neuronal sources than others), and by inter-individual variability in this position [Gross et al., 2013]. It is common to apply a correction or normalisation to sensor-space analyses to control for these effects, though this is an impactful decision which can change the profile of effects seen in the results.
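The relative normalisation described above can be written in a few lines. The sketch below uses synthetic data and a hypothetical per-sensor gain factor to illustrate why the relative spectrum is insensitive to frequency-independent scaling, such as the scaling that might arise from sensor-to-source distance.

```python
import numpy as np

def relative_spectrum(spectra):
    """Divide each spectrum by its sum across frequencies (last axis)."""
    return spectra / spectra.sum(axis=-1, keepdims=True)

# Synthetic example: 5 sensors x 20 frequency bins of positive magnitudes,
# then the same data rescaled by an arbitrary gain per sensor.
rng = np.random.default_rng(5)
absolute = rng.uniform(0.5, 2.0, size=(5, 20))
gain = rng.uniform(0.2, 3.0, size=(5, 1))          # hypothetical head-position scaling
relative_original = relative_spectrum(absolute)
relative_rescaled = relative_spectrum(absolute * gain)
```

The two relative spectra are identical because the gain cancels in the ratio. The cost is that a change in one frequency range alters the relative magnitude of every other range, which is one reason the relative and absolute analyses can show different profiles.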

The age effect on the absolute spectrum (Figure 6) presents a different pattern of results compared to the relative spectrum (Figures 1 & 3). Briefly, the beta effect is preserved or even enhanced compared to the relative magnitude effect, whilst the alpha effects are attenuated, and the low-frequency effect is decreased or removed altogether (Figure 6A, B). There is reduced spatial specificity in the age effects on absolute power. The largest post-hoc cluster of contiguous statistical significance for absolute power covers a broad range of frontal and temporal sensors and spans the whole 1-95 Hz range of the spectrum (Figure 6C), whilst the next four largest clusters have relatively homogeneous spatial patterns (Figure 6C). Spatial maps are positively correlated for all pairs of frequencies (Figure 6B), indicating that absolute scaling from head position might be making a large contribution to the overall effect.

The age effect computed on the absolute magnitude of the power spectrum.

A) Model-predicted spectra (averaged across all sensors) for 4 equally spaced ages across the participant age range with inset histogram of participant ages within CamCAN. B) The effect size for the age regressor computed on the absolute magnitude of the power spectrum. C, top) Spectrum of t-values quantifying the age effect across space and frequency. Non-parametric permutations with maximum statistics were used to control for multiple comparisons across sensors and frequency bins. While this permutation testing was not cluster-based, contiguous clusters of significant sensors were computed post-hoc for visualisation. The largest 6 spatially and spectrally contiguous areas of statistically significant effects are highlighted in frequency by black bands at the top of the spectrum and highlighted in space by sensors marked with a white circle in the adjoining topography. C, bottom) A 2D frequency-by-space map of all statistically significant effects. Sensor-frequency combinations that do not reach statistical significance have a faded colour scale. Blue regions indicate decreasing spectral power with age and red regions indicate increases.

A related decision involves rescaling data recordings to correct for differences in head position within (movement compensation) and between (transformation to reference position) data recordings. This is commonly applied using Signal Space Separation (SSS) with the maxfilter software [Taulu and Kajola, 2005, Taulu and Simola, 2006]. We additionally compared the age effect across absolute and relative spectral magnitude with three variants of SSS head-position correction (none, movement compensation, and movement compensation plus transformation to reference). The spectral profile of the absolute effect (without relative scaling) is relatively consistent across the three variants of SSS head-position correction, with the between-recording transformation to a reference head position creating the only noticeable difference (Supplemental Figure 14A, C & E). In contrast, the relative spectrum has the familiar spectral and spatial profile from Figure 1 and is unaffected by SSS head-position correction (Supplemental Figure 14B, D & F). There is substantially greater spectral specificity in the spatial profile of the age effect in relative power, indicated by greater variability in spatial correlations across frequency (Supplemental Figure 14G & H).

2.5 Between-participant covariates mediate the age effect differently across the spectrum

Next, we explored whether the age effect is robust when controlling for covariates other than age, i.e. putatively confounding explanatory variables. These alternative covariates may be correlated with age and may impact the estimated relationship between ageing and the observed neuronal power spectrum in complex ways. We selected a non-exhaustive range of alternative covariates split into five categories: demographic (sex), cardiac (heart rate, systolic and diastolic blood pressure), MEG data acquisition (head position in dewar), brain anatomy (brain volume, global grey matter volume, global white matter volume, hippocampal volume), and physiological (head radius, height, weight) factors. We do not distinguish between covariates representing neuronally relevant factors and those representing sources of non-neuronal variance.

The GLM-Spectrum was used to quantify the effect of each alternative covariate on the spectrum by fitting a separate GLM for each alternative covariate, containing only a regressor for that covariate alongside an intercept term. Brain volume, global grey matter volume (GGMV), systolic blood pressure, and sex had the strongest association with the observed MEG power spectrum, though the effect size varies strongly across frequency and space (Figure 7A & Supplemental Material).

Next, we compare the age effect in two models: the first includes an age regressor plus intercept, and the second includes an age regressor, a single alternative covariate regressor, and an intercept term. The difference between the Cohen’s F² effect size for age estimated in the simple linear regression model and the partial effect of age from the multiple regression model quantifies how robust the age effect is to the inclusion of the covariate. Including GGMV or systolic blood pressure strongly reduces the estimated age effect sizes, whilst inclusion of brain volume, white matter volume, or hippocampal volume leads to either increases or decreases in the observed age effect depending on the specific sensor and frequency bin (Figure 7B & Supplemental Material).
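The two-model comparison can be illustrated at a single sensor and frequency bin. In this synthetic sketch a hypothetical covariate is strongly correlated with age and fully accounts for the simulated signal, so the partial age effect shrinks once the covariate is modelled. The variable names and simulation parameters are illustrative assumptions, not estimates from CamCAN.

```python
import numpy as np

def r_squared(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    total = y - y.mean()
    return 1.0 - (resid @ resid) / (total @ total)

def f2_for_age(age, y, covariate=None):
    """Cohen's f2 for age, optionally as a partial effect alongside one covariate."""
    ones = np.ones(len(age))
    if covariate is None:
        reduced = ones[:, None]                        # intercept only
        full = np.column_stack([ones, age])            # intercept + age
    else:
        reduced = np.column_stack([ones, covariate])   # intercept + covariate
        full = np.column_stack([ones, covariate, age]) # ... + age
    r2_full = r_squared(full, y)
    return (r2_full - r_squared(reduced, y)) / (1.0 - r2_full)

# Synthetic example: the covariate declines with age, and power depends on the
# covariate only, so age explains little once the covariate is included.
rng = np.random.default_rng(6)
n = 400
age = rng.normal(size=n)
covariate = -0.8 * age + 0.6 * rng.normal(size=n)
power = 0.5 * covariate + rng.normal(size=n)

f2_age_alone = f2_for_age(age, power)
f2_age_partial = f2_for_age(age, power, covariate)
change_in_f2 = f2_age_partial - f2_age_alone
```

The sign and size of `change_in_f2` depend on how the covariate relates to both age and the signal, which is consistent with the observation that covariates with similar correlations to age can alter the age effect in different ways.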

Covariates with low correlation to age generally did not modify the estimated age effect (e.g. head position, heart rate, and weight). In contrast, including covariates with high correlation can lead to starkly different influences in different brain regions and frequency ranges. For example, including GGMV or systolic blood pressure strongly reduced the estimated effect size of age across the brain. In contrast, inclusion of white matter volume or brain volume in the model both increased and decreased the estimated age effect size, depending on the position and frequency being observed. Systolic blood pressure, white matter volume, and brain volume have comparable correlations with age, but the impact of that correlation on the estimates of the age effect is substantially different.

2.6 Isolating the component of the age effect on brain electrophysiology that is distinct from global reduction in grey matter volume

Global grey matter volume (GGMV) had the strongest mediating effect on the impact of ageing on the MEG spectrum of all the alternative covariates (Figure 7). Whilst there is overlap between age and GGMV in their effect on the MEG spectrum, each has a unique contribution that is distinct from the other. Modelling GGMV alongside age reduced the association between age and neuronal power spectra heterogeneously across different frequency bands and did not reduce the age effect to zero (Figure 8A and B).

Covariate effect sizes and their impact on the ageing effect.

A) Box and whisker plot with paired kernel density plot for the distributions of Cohen’s F² effect size for each alternative covariate, when the alternative covariate is included as the only regressor in a GLM, along with an intercept term. The Cohen’s F² distributions are collected over every single sensor-frequency pair from the sensor-space dataset (i.e. over 102 sensors and 189 frequency bins). Full GLM-Spectrum visualisations of the results are included in the appendix. For comparison, the same is shown for age using a GLM that contains regressors for age and an intercept. B) Change in the Cohen’s F² effect size of age between 1) a GLM that includes an age regressor plus intercept, and 2) a model that includes an age regressor, a single alternative covariate regressor and intercept. This is shown for each alternative covariate in turn. Age is excluded from this panel. C) Pearson’s correlation coefficient quantifying the univariate linear relationship between age and each covariate. Age and sex are excluded from this panel.

The age effect is reduced heterogeneously across space and frequency by including GGMV in the model.

A) The Cohen’s F² spectrum for age using a GLM that contains regressors for age and an intercept. Replicated from Figure 3. B) As A) when a grey matter volume regressor is added to the GLM. Accounting for grey matter volume broadly reduces the effect size of age across the spectra, but some moderate effects remain. C) Spectrum of t-values quantifying the age effect across space and frequency, replicated from Figure 1. Non-parametric permutations with maximum statistics were used to control for multiple comparisons across sensors and frequency bins. While this permutation testing was not cluster-based, contiguous clusters of significant sensors were computed post-hoc for visualisation. The largest 6 spatially and spectrally contiguous areas of statistically significant effects are highlighted in frequency by black bands at the top of the spectrum and highlighted in space by sensors marked with a white circle in the adjoining topography. D) As C) for the age effect in a model that includes a global grey matter volume covariate. Most effects are reduced in size with smaller contiguous statistically significant regions surviving the post-hoc clustering procedure.

Effect of age on the relative magnitude spectrum across space and frequency in LCMV source reconstructed and parcellated data.

A) Model-predicted spectra (averaged across all sensors) for 4 equally spaced ages across the participant age range. B) Spectrum of t-values quantifying the age effect across space and frequency. Source topographies are shown in the frequency bands with significant effects in sensor space. The permutation statistics are not repeated in source space. C) A 2D frequency-by-space map of all statistically significant effects. Sensor-frequency combinations that do not reach statistical significance have a faded colour scale. Blue regions indicate decreasing spectral power with age and red regions indicate increases.

Whilst most of the age effect-size spectrum was reduced by inclusion of GGMV, the extent of this reduction varied clearly across the six spatio-spectral regions that showed significant simple linear age effects (Table 2). Five of the six regions showed a reduction in peak effect size (between −15% and −83%), with a single effect showing a small increase (low gamma range, +15%). The regions with the largest initial age effect sizes (before GGMV inclusion) showed the greatest reductions when controlling for GGMV. Despite these reductions, the remaining effect sizes in these regions were still larger than the original effect sizes observed in the regions with initially smaller effects.

These results have a substantial practical impact on study planning. A sample that appears to be well powered for estimating an effect when considering the effect in isolation may in fact be underpowered when covariates and confounds are accounted for in the regression model. Using the present results from CamCAN, a study looking to replicate our findings about the component of the age effect that is linearly distinct from GGMV will require a larger sample than one replicating the age effects alone (Table 2). Most critically, two of the six identified age effects (low alpha and high gamma) drop below the significance threshold when accounting for GGMV (Figure 8). This suggests that a large part of the electrophysiological effects in these bands cannot be linearly separated from global structural change with ageing.

3 Discussion

A whole-head and full-frequency profile of the ageing effect is characterised and shown to be replicable across datasets. Six distinct ageing effects are identified, and the combination of linear parameters across the spectrum can represent more complex features such as alpha peak frequency. The corresponding effect sizes demonstrate that age impacts the spectrum heterogeneously and that the feature of interest must be considered when planning sample sizes for future experiments. This effect is robust to head position correction, but the spatial profile changes substantially between relative and absolute spectral power. Finally, the age effect is differentially modified by different factors, such as grey matter volume, brain volume, head size, and physiological factors. Individual variability in grey matter volume mediates the association between age and neuronal power spectra, leading to a reduction in the estimated effect size of age across the whole spectrum. Critically, including a grey matter volume covariate differentially impacts different brain regions and frequency bands, indicating that some regions have a strong and unique signature of ageing that reflects functional/synaptic changes over and above grey matter atrophy.

Overall, by analysing four independent datasets including 600 participants, we demonstrated that there is a replicable and robust effect of ageing on the MEG power spectrum across the whole head and a range of frequencies. These effects are well-powered at moderate sample sizes, replicable across different datasets and scanner types, and are reasonably robust to modelling additional anatomical, physiological, or acquisition factors as covariates.

3.1 Full frequency spectra of age effects synthesise diverse results across the literature

The profile of results across the spectrum of age effects replicates and simplifies a range of results across previous literature. This is done without pre-specification of frequency bands or sensors of interest and in a manner that is conducive to future study planning and meta-analyses. Here, we briefly review the core age effects from this analysis in relation to a representative sample of published results on the effects of healthy ageing on the resting-state EEG and MEG spectrum.

Low-frequency power decreases with age

This result is broadly consistent with literature demonstrating that ‘theta’ power decreases with healthy ageing [Beese et al., 2017, Cummins and Finnigan, 2007, Rempe et al., 2023, Vlahou et al., 2014] and may be associated with decreases in cognitive performance [Finnigan and Robertson, 2011]. One critical difference from this literature is that the full-frequency age profile identifies a single ‘low frequency’ age effect between 1-7 Hz and does not distinguish between the canonical delta (2-4 Hz) and theta (4-7 Hz) oscillations. A further difference is that we only see the low-frequency decrease in power in the relative spectrum (Figure 1). Computing the age effect from the absolute spectrum resulted in no changes across age at low frequencies (Figure 6).

A combination of age-related decreases in lower frequencies with increases in higher frequencies (notably the beta range) contributes to a flattening of the aperiodic component of the power spectrum [Aggarwal and Ray, 2023, Cesnaite et al., 2023, Dave et al., 2018, Voytek et al., 2015]. This flattening is proposed to reflect an age-related increase in neural noise affecting age-related working memory performance [Voytek et al., 2015]. Future research accounting for flattening of the 1/f-type component in the full frequency spectrum approach can better separate whether the low-frequency changes relate to oscillations or the aperiodic component of the spectrum.

Two dissociable and opposite effects in the alpha range

Papers that report results from a single alpha frequency band during rest include reports of a variety of contrasting age effects, including positive correlations with age [Rempe et al., 2023], negative correlations [Stier et al., 2023], or both positive and negative effects separated by space [Babiloni et al., 2006, Hoshi and Shigihara, 2020, Pathak et al., 2022]. Investigating age effects as a complete spectrum shows that two contrasting age effects coexist in close proximity in space and frequency: an increase with a small effect size in central occipital sensors around 7-8.5 Hz and a decrease with a large effect size across a broad set of occipital, temporal and frontal sensors between 9.5-12.5 Hz. Different data samples and different data analysis choices, particularly sensor choice or anatomical localisation, might emphasise one effect or the other in each analysis.

The two effects identified in the present analysis can combine to represent a decrease in power and frequency of a single alpha peak (Figure 2). This change in alpha peak frequency is highly replicable [Cesnaite et al., 2023, Dustman et al., 1993, Sahoo et al., 2020, Scally et al., 2018] and is a highly predictive spectral marker of ageing [Stier et al., 2024]. Decreases in alpha peak frequency have been linked to a decline in cognitive performance in healthy ageing [Cesnaite et al., 2023, Finley et al., 2024] and MCI [Garcés et al., 2013, López-Sanz et al., 2016, Puttaert et al., 2021].

Increase in beta power with age

Several papers report an increase in low beta range power with age [Gómez et al., 2013, Heinrichs-Graham and Wilson, 2016, Heinrichs-Graham et al., 2018, Hübner et al., 2018, Koyama et al., 1997, Rempe et al., 2023, Stier et al., 2023, Veldhuizen et al., 1993, Xifra-Porxas et al., 2019]. These spectrum changes may be associated with age-related changes in underlying bursting dynamics [Brady et al., 2020, Power et al., 2023].

In addition to changing in the resting state, ageing is associated with increases in movement-related beta desynchronisation [Bardouille and Bailey, 2019, Heinrichs-Graham and Wilson, 2016, Heinrichs-Graham et al., 2018, Xifra-Porxas et al., 2019, Rossiter et al., 2014]. Changes in post-movement beta rebound have been reported as well [Bardouille and Bailey, 2019, Xifra-Porxas et al., 2019]. The increased resting-state beta activity may require increased downregulation in order for older adults to perform movements [Heinrichs-Graham and Wilson, 2016].

Novel gamma band changes with age

There are relatively few publications exploring the effects of ageing on resting-state gamma band activity. Two EEG studies report increases in gamma power in older adults compared to younger adults [Aoki et al., 2022, Jabès et al., 2021]. The frequency band of these effects is consistent with our finding, though the EEG results indicate a frontoparietal and temporal shift whereas our gamma increase occurs in occipital sensors. There is a larger literature on task-related changes in gamma with age, which indicates that gamma responses to visual gratings decrease in power and frequency with age [Murty et al., 2020, Kumar and Ray, 2023].

Mediation of age effects by grey matter volume

There are substantial reductions in grey matter volume and cortical thickness across the adult lifespan [Frangou et al., 2022]. We show that the association between age and the resting spectrum is mediated by reductions in grey matter volume but that this effect is heterogeneous across space and frequency. Our results show that accounting for GGMV reduces the effect sizes of the age effects below 30 Hz by 57% to 83% (Figure 8; Table 2). This reduction replicates results showing that structural brain changes are associated with widespread changes in spectral power [Stier et al., 2023]. Our results are limited by the use of global grey matter volume as a blunt measure of grey matter change; grey matter decline in ageing occurs heterogeneously across the brain [Frangou et al., 2022] and oscillatory changes are likely linked to this pattern [Mahjoory et al., 2020].

3.2 Observed power analysis for future study planning

The observed power and sample size calculations in this paper should not be used to help interpret the present findings, or as a basis to speculate about what sample size would be required for a non-significant result to pass the significance threshold. Rather they are intended solely to guide decisions about the sample sizes of future studies in an accessible manner and to highlight the potential for this approach in analyses of neuronal power spectra.

We concur with Lenth [2001], who highlights that “Sample size planning is often important, and is nearly always difficult”. Most recommendations about sample size planning indicate that a priori or theoretically informed effect sizes should be used, as observed effect sizes are influenced by sampling error [Hoenig and Heisey, 2001]. Single estimates of observed effect sizes can be extremely noisy and will not necessarily reflect the ‘true’ underlying population effect. Moreover, using the single observed estimate simply recapitulates the information already present in the p-value of the test [Hoenig and Heisey, 2001].

As a result, we emphasise the bootstrapped 95% confidence intervals of the effect sizes to give a sense of the uncertainty around our estimates. The upper and lower bound of these confidence intervals are used in the observed power analysis as an example to guide future study planning. Until such a time when the field reaches consensus about what the smallest effect size of interest would be for electrophysiological power spectra, we propose that researchers can use the lower bound of the 95% confidence interval as the smallest effect size of interest for a replication. Working with multiple covariates and a requirement to correct for multiple comparisons both decrease the statistical power of a neuroimaging dataset [Cremers et al., 2017] so it is critical to be transparent about which effects are observable in our data. Confidence interval based sample projections from observed effect sizes must be handled with care but are a useful tool in a field where sample size plans are frequently based on resource constraints or heuristics. Projections can be supported with further sensitivity power analyses that explore a realistic range of expected effect sizes given the context of the planned project. Many excellent guides have been written to inform principled sample size planning in behavioural [Lakens, 2022] and neuroimaging contexts [Mumford, 2012].

3.3 Limitations

Whilst we have explored the impact of several analysis decisions on the profile of the ageing effect, we have not completed an exhaustive search of the processing options available to electrophysiology researchers. We have demonstrated that the age effect is robust to SSS-based head position correction and to a broad range of covariates but modified by the choice of relative or absolute power and the inclusion of grey matter volume as a covariate. A broad range of analysis options are still to be explored, notably including the impact of different source reconstruction methods beyond the single example presented here. This is not an exhaustive exploration of the garden of forking paths [Gelman and Loken] and future analysis can broaden this scope through approaches such as multiverse analysis methods [Clayson et al., 2021, Dafflon et al., 2022].

We have compiled results from a broad range of whole-head and full-frequency models to provide a comprehensive overview of the linear age effect. However, due to the complexity of the data, further analysis is still needed to explore various options. We have only explored the linear effects of age in these analyses, whilst there is evidence for quadratic age effects in power spectral features [Brady et al., 2020, Rempe et al., 2023]. Our current analysis pipeline may not adequately capture variability in individual data recordings by neglecting to incorporate first-level variance components into the group model. A mixed modelling approach would provide a more comprehensive and robust framework for analysing these data. Finally, we present results based on spectral magnitude rather than power or log-power as a compromise between suitability for Gaussian linear modelling and intuitive presentation of findings [Quinn et al., 2024].

3.4 Building towards robust and reproducible effects of age in brain electrophysiology

We propose that computing and sharing unthresholded whole-head and full-frequency statistical spectra of key effects can facilitate comparability between studies. In addition, the spectra of ageing effect sizes can support formal meta-analyses and planning of future sample sizes. We propose that this simple approach can provide a robust and replicable foundation stone on which clinical applications and more advanced analysis methodologies can be built and validated.

Similar approaches have been successful in facilitating the synthesis and crosstalk of results across a diverse literature in MRI [Gorgolewski et al., 2015]. To date, meta-analyses of spectral effects of ageing have been limited to features within specified frequency bands, such as alpha band peak frequency [Freschl et al., 2022] or power [Lejko et al., 2020]. These analyses are currently limited by variable and often incomplete reporting of results in addition to cumbersome decisions about how to combine results across heterogeneous channel sets, spatial regions, and frequency bands. Best practice reporting guidelines [Gross et al., 2013] and formal reporting standards such as COBIDAS-MEEG [Pernet et al., 2018] improve transparency, but there is still a need to share outputs in a form that can readily support synthesis across studies and establish consensus about core effects.

Conclusion

Overall, we have demonstrated that ageing effects on MEG power spectra are robust and replicable across multiple datasets. Different brain regions and frequency bands respond differently to ageing, with some having a strong unique signature of ageing that is dissociable from global anatomical changes. This is a necessary step towards adoption of electrophysiological measures in a healthcare setting.

4 Methods

4.1 Datasets

Four open-access cross-sectional datasets with participants covering a broad age range were analysed as part of this manuscript.

CamCAN

Data used in the preparation of this work were obtained from the CamCAN repository (available at http://www.mrc-cbu.cam.ac.uk/datasets/camcan/) [Taylor et al., 2017, Cam-CAN et al., 2014].

MEG-UK Oxford, Cambridge & Nottingham

Eyes open and eyes closed resting-state MEG recordings were acquired in four MEG laboratories in the United Kingdom as part of the wider MEG-UK consortium. The Oxford and Cambridge data were acquired on a 306 channel VectorView system whilst the Cardiff and Nottingham data were acquired on a CTF-275 channel system.

4.2 MEG Data Preprocessing

All MEG data pre-processing was carried out using MNE-Python [Gramfort, 2013] and OSL-ephys [Quinn et al., 2022, van Es et al., 2025] using the OSL batch pre-processing tools. The data from each recording was first converted from its raw format into MNE-Python Raw data objects. The first 35 seconds of continuous data were cropped out, and the remaining data were bandpass filtered between 0.25 and 150 Hz using an order-5 Butterworth filter.

50-Hz line noise was suppressed using the ZapLine method [de Cheveigné, 2020] as implemented in the MEEGKit toolbox. Two passes of ZapLine were applied to account for minor peaks around the 50 Hz artefact. One pass was centred on 49 Hz and the second on 50 Hz. Bad segments were identified by segmenting data into 2-second chunks and using the generalised extreme Studentized deviate (G-ESD) algorithm [Rosner, 1983] to identify outlier (bad) samples with high variance across channels. Bad segment detection was applied to both the raw time series and the differential of the time series, and separately to the magnetometers and gradiometers. An average of 2.82% of data samples were marked as ‘bad’ (standard deviation: 2.54%, min: 0.0%, max: 15.52%), equivalent to an average of 15.04 seconds per recording. Bad channels were then identified using the G-ESD routine to identify outliers in the distribution of variance per channel over time. The data were then resampled to 250 Hz to reduce space on disk and ease subsequent computations.
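The bad-segment step can be illustrated with a minimal sketch of the G-ESD test applied to a per-segment variance summary. This is not the OSL-ephys implementation: the function names `gesd_outliers` and `bad_segments`, the default cap on the number of outliers, and the choice of averaging the temporal variance across channels are all assumptions made for illustration.

```python
import numpy as np
from scipy import stats


def gesd_outliers(x, alpha=0.05, max_outliers=None):
    """Return indices of outliers in x using Rosner's generalised ESD test."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if max_outliers is None:
        max_outliers = n // 5  # hypothetical upper bound on the outlier count
    remaining = np.arange(n)
    removed, n_outliers = [], 0
    for i in range(1, max_outliers + 1):
        vals = x[remaining]
        dev = np.abs(vals - vals.mean())
        j = int(np.argmax(dev))
        r_i = dev[j] / vals.std(ddof=1)  # test statistic for the most extreme value
        # Critical value lambda_i derived from the t-distribution
        p = 1 - alpha / (2 * (n - i + 1))
        t = stats.t.ppf(p, n - i - 1)
        lam = (n - i) * t / np.sqrt((n - i - 1 + t**2) * (n - i + 1))
        removed.append(remaining[j])
        remaining = np.delete(remaining, j)
        if r_i > lam:
            n_outliers = i  # keep the largest i whose statistic exceeds lambda_i
    return np.array(removed[:n_outliers], dtype=int)


def bad_segments(data, sfreq, seg_sec=2.0, alpha=0.05):
    """Flag fixed-length segments whose channel-averaged variance is an outlier.

    data : (n_channels, n_samples) array for one sensor type.
    """
    seg_len = int(seg_sec * sfreq)
    n_seg = data.shape[1] // seg_len
    segs = data[:, : n_seg * seg_len].reshape(data.shape[0], n_seg, seg_len)
    metric = segs.var(axis=2).mean(axis=0)  # one variance summary per segment
    return gesd_outliers(metric, alpha=alpha)
```

Under these assumptions, the same function could be run on the temporal derivative of the data (for example `np.diff(data, axis=1)`) and separately on each sensor type, mirroring the description above.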

Independent Component Analysis (ICA) denoising was carried out using a 64-component FastICA decomposition [Hyvarinen, 1999] on the MEG channels. This decomposition explained an average of 99% of variance in the sensor data across datasets. Artefactual components relating to eye movements or the heart rate were automatically identified by correlation with the simultaneous EOG and ECG channels. Between 0 and 3 EOG components were rejected in each dataset, with an average of 0.99 (standard deviation: 0.79) across all datasets. Between 0 and 5 ECG components were rejected in each dataset, with an average of 2.25 (standard deviation: 0.84) across all datasets. The continuous sensor data were then reconstructed without the influence of the components labelled as artefacts.

To retain consistent dimensionality across the group, any bad channels were interpolated using a spherical spline interpolation [Perrin et al., 1989] as implemented in MNE-Python. The head position of the participant in device coordinates was extracted from each dataset.

This pipeline was repeated for both the SSS processed data with and without movement compensation and head position standardization.

The same pipeline was applied to the VectorView data from the Oxford and Cambridge sites of the MEGUK dataset. A very similar pipeline was applied to the CTF data from the Nottingham site of the MEGUK dataset, except for omitting the SSS processing in favour of 3rd order gradiometry and in the selection of planar gradiometer channels.

4.3 MRI Data Processing

All MR data were processed using the FMRIB Software Library [Woolrich et al., 2009]. T1-weighted MR scans from 804 individuals were analysed. Images were reoriented to MNI space, cropped, and bias-field corrected. FMRIB’s Linear Registration Tool, FLIRT [Jenkinson and Smith, 2001, Jenkinson et al., 2002], was used to register to standard space before brain extraction was performed using BET [Smith, 2002]. Tissue-type segmentation (grey matter, white matter and CSF) was conducted using FMRIB’s Automated Segmentation Tool, FAST [Zhang et al., 2001]. Subcortical structure volumes (hippocampus, amygdala, etc.) were derived using FMRIB’s Integrated Registration and Segmentation Tool, FIRST [Patenaude et al., 2011].

Total head volume was computed by extracting a volumetric skull mask using the FSL BETSurf tool and filling in the gap left by brain extraction. The total number of voxels within the head and brain was computed with fslmaths. The overall brain volume was normalized by the total head volume.

The voxel count for each tissue-type and subcortical structure was extracted and normalised using the individual’s total brain volume (computed by FAST) to compute a percentage. The final MRI metrics were: Total head volume, Brain volume (as a proportion of head volume), Global Grey Matter Volume (grey matter volume as a proportion of total brain volume), and Hippocampus volume (summed across left and right hemispheres as a proportion of total brain volume).

4.4 Source Reconstruction & Parcellation

First, a structural MRI (sMRI) was used to coregister the MEG data. This involved extracting surfaces for the outer skin, outer skull and inner skull from the sMRI using FSL, then aligning the Polhemus headshape points to the outer skin surface, as well as aligning the sMRI and Polhemus fiducials. Note that 31 out of 643 subjects from CamCAN were excluded at this stage due to missing sMRIs.

Following the coregistration, a single-shell forward model was computed. The preprocessed sensor-level data were source reconstructed using a Linearly Constrained Minimum Variance (LCMV) beamformer [Hillebrand et al., 2005, Van Veen and Buckley, 1988]. This involved projecting the sensor-level data onto an 8-mm dipole grid inside the inner skull. The forward model and sensor-level data covariance matrix were used to compute the LCMV beamformer. The covariance matrix was estimated using the entire recording for a subject and regularised to 60 components using principal component analysis (PCA) rank reduction.
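As a rough illustration of how the forward model and data covariance combine, the unit-gain LCMV weights for a single dipole can be written as W = (LᵀC⁻¹L)⁻¹LᵀC⁻¹ [Van Veen and Buckley, 1988]. The sketch below assumes a hypothetical function name `lcmv_weights` and random inputs; it does not reproduce the PCA rank reduction, the dipole grid, or any orientation selection used in the actual pipeline.

```python
import numpy as np


def lcmv_weights(leadfield, cov):
    """Unit-gain LCMV beamformer weights for one dipole location.

    leadfield : (n_sensors, n_orient) forward field of the dipole
    cov       : (n_sensors, n_sensors) sensor data covariance
    Returns an (n_orient, n_sensors) weight matrix W with W @ leadfield = I.
    """
    c_inv = np.linalg.pinv(cov)  # pseudo-inverse tolerates a rank-deficient covariance
    gram = leadfield.T @ c_inv @ leadfield
    return np.linalg.solve(gram, leadfield.T @ c_inv)


# Illustrative use with simulated sensor data (all values are made up)
rng = np.random.default_rng(0)
sensor_data = rng.standard_normal((20, 1000))        # 20 sensors, 1000 samples
cov = sensor_data @ sensor_data.T / sensor_data.shape[1]
leadfield = rng.standard_normal((20, 3))             # 3 dipole orientations
weights = lcmv_weights(leadfield, cov)
dipole_ts = weights @ sensor_data                    # (3, n_samples) source estimate
```

The unit-gain constraint means that activity at the target dipole passes through unchanged while the output variance from all other sources is minimised.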

Using the dipole time courses from source reconstruction, parcel time courses were calculated using an anatomical parcellation with 52 regions of interest (binary) [Kohl et al., 2023]. The first principal component across dipoles assigned to a parcel was used to generate the parcel time courses. Following this, the parcel time courses were orthogonalised using a symmetric multivariate correction for spatial leakage [Colclough et al., 2015]. Finally, a stochastic search algorithm was used to align the sign of each parcel time course across subjects by matching to a (median) template subject.
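The core idea behind symmetric orthogonalisation can be sketched as finding the orthonormal set of time courses closest to the original parcel time courses, which has a closed-form solution via the singular value decomposition. This is only the first step of the procedure described by Colclough et al. [2015], which additionally rescales the result iteratively; the function name is hypothetical.

```python
import numpy as np


def symmetric_orthogonalise(parcel_ts):
    """Closest orthonormal set of time courses (Löwdin orthogonalisation).

    parcel_ts : (n_samples, n_parcels) array of full column rank.
    Returns an array of the same shape with mutually orthonormal columns.
    """
    u, _, vt = np.linalg.svd(parcel_ts, full_matrices=False)
    return u @ vt


# Illustrative use: five parcels with simulated spatial leakage between them
rng = np.random.default_rng(0)
sources = rng.standard_normal((1000, 5))
mixing = np.eye(5) + 0.3 * rng.standard_normal((5, 5))
leaky = sources @ mixing
clean = symmetric_orthogonalise(leaky)
```

Unlike sequential (Gram-Schmidt style) orthogonalisation, no parcel is privileged by its position in the ordering, which is why the correction is described as symmetric.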

4.5 GLM-Spectrum - First level

A GLM-Spectrum [Quinn et al., 2024] was used to provide a statistical estimate of the MEG spectrum in the presence of covariates and confounds. A Short-Time Fourier Transform (STFT) was computed from the preprocessed sensor time series from each dataset using a 2-second segment length, a 1-second overlap between segments, and a Hanning taper. The 2-second segment length at the sample rate of 250 Hz gives a resolution of around 2 frequency bins per unit Hertz in the resulting spectrum. The short-time magnitude spectrum is computed from the complex-valued STFT and the frequency bins ranging between 0.1 and 100 Hz were taken forward as the dependent variable in the first-level GLM-Spectrum for that dataset.
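The segmentation and tapering described above can be sketched with NumPy alone. The function name `stft_magnitude` and the absence of any amplitude scaling are assumptions; the published analysis may normalise the spectrum differently.

```python
import numpy as np


def stft_magnitude(x, sfreq=250.0, seg_sec=2.0, overlap_sec=1.0, fmin=0.1, fmax=100.0):
    """Short-time magnitude spectrum of a 1D time series.

    Returns (freqs, mags) with mags shaped (n_segments, n_freqs).
    """
    seg_len = int(round(seg_sec * sfreq))
    step = seg_len - int(round(overlap_sec * sfreq))
    starts = np.arange(0, x.size - seg_len + 1, step)
    taper = np.hanning(seg_len)
    segs = np.stack([x[s : s + seg_len] * taper for s in starts])
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / sfreq)
    mags = np.abs(np.fft.rfft(segs, axis=1))  # magnitude of the complex STFT
    keep = (freqs >= fmin) & (freqs <= fmax)
    return freqs[keep], mags[:, keep]


# Illustrative use: 60 seconds of a 10 Hz sinusoid sampled at 250 Hz
t = np.arange(15000) / 250.0
freqs, mags = stft_magnitude(np.sin(2 * np.pi * 10.0 * t))
```

Each row of `mags` is one observation for the first-level model, so the number of overlapping segments determines the number of rows in the first-level design matrix.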

The GLM design matrix for each dataset is specified with five regressors. A single constant regressor models the intercept of the data, whilst two zero-mean parametric regressors model the effect of changing heart rate and a linear trend in time throughout the recording. The fourth and fifth regressors were non-zero mean parametric regressors modelling the effect of variance in the V-EOG channel and bad segments. These final regressors acted as confound variables that removed variance that can be linearly explained by the artefact sources from the mean term.

A set of simple contrasts isolated the main effect of each of the five regressors. The first-level GLM-Spectrum model parameters were estimated using the Moore-Penrose Pseudo-Inverse [Penrose, 1956] to compute beta-spectra. Finally, the beta-spectra were weighted by the contrasts to compute cope-spectra which were carried forward to the group level analyses.
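A minimal sketch of the first-level fit is given below, assuming a hypothetical `fit_glm` helper and simulated regressors rather than the real heart rate, V-EOG, and bad-segment signals. It illustrates why the confound regressors are left with a non-zero mean: the intercept then estimates the spectrum when the artefact regressor is zero, rather than the raw average.

```python
import numpy as np


def fit_glm(design, data, contrasts):
    """Ordinary least squares via the Moore-Penrose pseudo-inverse.

    design    : (n_obs, n_regressors)
    data      : (n_obs, n_freqs), e.g. a short-time magnitude spectrum
    contrasts : (n_contrasts, n_regressors)
    Returns (betas, copes): beta-spectra and contrast-weighted cope-spectra.
    """
    betas = np.linalg.pinv(design) @ data
    copes = contrasts @ betas
    return betas, copes


# Hypothetical first-level design for one recording of n_seg STFT segments
rng = np.random.default_rng(0)
n_seg = 200
heart_rate = rng.standard_normal(n_seg)
heart_rate -= heart_rate.mean()          # zero-mean parametric regressor
trend = np.linspace(-1.0, 1.0, n_seg)    # zero-mean linear trend over time
veog = rng.random(n_seg)                 # non-zero-mean V-EOG variance
bad = np.zeros(n_seg)
bad[::10] = 1.0                          # non-zero-mean bad-segment indicator
design = np.column_stack([np.ones(n_seg), heart_rate, trend, veog, bad])
contrasts = np.eye(5)                    # one simple contrast per regressor

# Simulated noise-free spectrum: baseline of 3 plus artefact-driven increases
data = 3.0 + 2.0 * bad[:, None] + 1.5 * veog[:, None] + np.zeros((n_seg, 4))
betas, copes = fit_glm(design, data, contrasts)
```

In this simulation the intercept cope-spectrum recovers the baseline value of 3 at every frequency, whereas the raw mean of `data` is inflated by the two artefact terms.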

4.6 GLM-Spectrum - Group level effect of age

A group-level design matrix was constructed with two regressors, one constant intercept term, and one parametric regressor containing z-standardized participant age. The input data were the first-level average magnitude spectrum from each participant. The model coefficients were computed using the Moore-Penrose Pseudo-Inverse before contrasts and t-statistics were computed. Non-parametric permutation statistics were used to compute the statistical significance by row-shuffling the age regressor. Multiple comparisons were controlled by taking a maximum-statistic across all sensors and frequency bins. Contiguous clusters of significant effects were identified to simplify the final visualization of the results. Cohen’s F² was computed for the Age regressor for all sensors and frequency bins.
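The group-level test can be sketched as follows, with sensors and frequency bins flattened into a single feature axis. The function names, the two-tailed use of the absolute t-value, and the number of permutations are assumptions made for illustration, not a description of the released analysis code.

```python
import numpy as np


def tstats(design, data, contrast):
    """t-statistics for one contrast of an OLS fit, per column of data."""
    pinv = np.linalg.pinv(design)
    betas = pinv @ data
    resid = data - design @ betas
    dof = design.shape[0] - np.linalg.matrix_rank(design)
    sigma2 = (resid**2).sum(axis=0) / dof
    c = np.asarray(contrast, dtype=float)
    varcope = (c @ pinv @ pinv.T @ c) * sigma2  # c (X'X)^-1 c' times residual variance
    return (c @ betas) / np.sqrt(varcope)


def max_stat_permutation(age, data, n_perms=1000, alpha=0.05, seed=0):
    """Max-statistic permutation test for the age effect.

    data : (n_participants, n_features), features = flattened sensor x frequency.
    Returns the observed t-values, the corrected threshold, and a significance mask.
    """
    rng = np.random.default_rng(seed)
    z_age = (age - age.mean()) / age.std()
    design = np.column_stack([np.ones(age.size), z_age])
    contrast = [0.0, 1.0]
    observed = tstats(design, data, contrast)
    null_max = np.empty(n_perms)
    for i in range(n_perms):
        shuffled = design.copy()
        shuffled[:, 1] = rng.permutation(z_age)  # row-shuffle the age regressor
        null_max[i] = np.abs(tstats(shuffled, data, contrast)).max()
    threshold = np.quantile(null_max, 1 - alpha)
    return observed, threshold, np.abs(observed) > threshold
```

Because each permutation contributes only its largest statistic across all features, a single threshold controls the family-wise error rate over every sensor and frequency bin at once.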

4.7 GLM-Spectrum - Group level effect between subject covariates

Eleven potential between-subject covariates were collated for group analysis and are summarized in Table 2.

Single Regressor Models

To explore the individual effect of each covariate, a separate two-regressor model containing a constant term and a covariate was fitted for each covariate in turn. Non-parametric permutation statistics were used to estimate statistical significance for each of the covariates in these simple models. The effect size of each covariate in the simple models was computed using the Cohen’s F² statistic.

The age model was computed twice, once using the full first-level GLM-Spectra and once without the additional covariates (Linear trend, EOG, ECG, and bad segments) in the first level.

Age plus covariate models

The effect of each covariate on the overall effect of age was explored by fitting a second set of models that contain a constant term, the age covariate, and each individual other covariate in turn. Statistical significances and effect sizes were computed for the age effect in each individual model. The change in effect size of the age covariate between the single regressor age model and the age-plus-covariate models was computed for each covariate.

4.8 Effect Size and Power Analyses

Effect sizes are computed using Cohen’s F² metric, which compares the R² of two models to establish the contribution of a particular regressor to the variance explained by the model [Cohen, 1988, Selya et al., 2012]. Cohen’s F² can be interpreted as the proportion of residual variance explained by the addition of the new regressor.

The effect size is defined as:

$$F^2 = \frac{R^2_{AB} - R^2_{A}}{1 - R^2_{AB}}$$

In which B refers to the regressor of interest and A refers to the set of all other regressors in the model. The numerator is then the proportion of additional variance explained by addition of the regressor of interest B. This is normalised by the denominator, which quantifies the total residual variance left unexplained by the full model [Selya et al., 2012]. A full spectrum of effect sizes can be computed from the GLM-Spectrum model [Quinn et al., 2024].
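The definition described above, the gain in R² from adding regressor B divided by the variance left unexplained by the full model, can be sketched directly from two model fits. The helper names `r_squared` and `cohens_f2` and the simulated regressors are hypothetical.

```python
import numpy as np


def r_squared(design, y):
    """Proportion of variance in y explained by an OLS fit of design."""
    resid = y - design @ (np.linalg.pinv(design) @ y)
    total = ((y - y.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - (resid**2).sum(axis=0) / total


def cohens_f2(design, y, col):
    """Cohen's f2 for the regressor in column `col` (B) given all others (A)."""
    r2_ab = r_squared(design, y)                          # full model: A and B
    r2_a = r_squared(np.delete(design, col, axis=1), y)   # reduced model: A only
    return (r2_ab - r2_a) / (1.0 - r2_ab)


# Illustrative use: effect size of age with a correlated covariate in the model
rng = np.random.default_rng(0)
n = 200
z_age = rng.standard_normal(n)
covariate = -0.6 * z_age + 0.8 * rng.standard_normal(n)
y = 0.5 * z_age + 0.3 * covariate + rng.standard_normal(n)
design = np.column_stack([np.ones(n), z_age, covariate])
f2_age = cohens_f2(design, y, col=1)
```

Because `y` may be a two-dimensional array with one column per sensor-frequency pair, the same call yields a full spectrum of effect sizes. For a single regressor of interest the value equals the squared t-statistic divided by the residual degrees of freedom.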

Bootstrapped confidence intervals for effect size estimates were computed by building a distribution of effect sizes using 1000 iterations of resampling with replacement. 95% confidence intervals were computed as percentiles across the bootstrapped distribution of effect size estimates. Confidence intervals for power and sample size calculations are computed using the bootstrapped Cohen’s F² metrics.
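A percentile bootstrap of this kind can be sketched for a single sensor-frequency pair as below. The function names and the seed are hypothetical, and the effect size is written in its algebraically equivalent residual-sum-of-squares form, (RSS_A − RSS_AB) / RSS_AB, to keep the example short.

```python
import numpy as np


def cohens_f2(design, y, col):
    """Cohen's f2 for column `col`, using residual sums of squares."""
    def rss(x):
        resid = y - x @ (np.linalg.pinv(x) @ y)
        return (resid**2).sum()
    rss_ab = rss(design)                          # full model
    rss_a = rss(np.delete(design, col, axis=1))   # model without the regressor
    return (rss_a - rss_ab) / rss_ab


def bootstrap_f2_ci(design, y, col, n_boot=1000, ci=95.0, seed=0):
    """Percentile bootstrap confidence interval for Cohen's f2 (1D y)."""
    rng = np.random.default_rng(seed)
    n = design.shape[0]
    boots = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample participants with replacement
        boots[b] = cohens_f2(design[idx], y[idx], col)
    tail = (100.0 - ci) / 2.0
    lower, upper = np.percentile(boots, [tail, 100.0 - tail])
    return lower, upper
```

Rows of the design matrix and the data are resampled together so that each bootstrap sample keeps every participant's age paired with their own spectrum.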

Power analysis for regressors in a general linear model was computed using in-house Python code validated against the pwr R package [Champely, 2020, Cohen, 1988]. Five variables were defined, of which any four can be used to compute the remaining fifth variable. For a GLM, these variables are the number of observations in the data, the number of regressors in the model, the estimated effect size, the alpha level (type 1 error), and the power (1 - type 2 error). We computed the future sample size estimates in Tables 1 and 2 using this approach. The number of observations in the dataset was computed from the number of regressors in the model (2 or 3), the Cohen’s F² estimate, an alpha level of 0.05, and a power of 80%.
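The relationship between these five variables can be sketched with the noncentral F distribution. This is not the in-house code referred to above: the function names are hypothetical, and the noncentrality parameter follows the convention we assume for `pwr.f2.test`, namely λ = F²(u + v + 1) with u numerator and v denominator degrees of freedom.

```python
import numpy as np
from scipy import stats


def glm_power(n_obs, n_regressors, f2, alpha=0.05, n_tested=1):
    """Power of the F-test on `n_tested` regressors in a GLM.

    n_regressors counts every column of the design matrix, intercept included.
    """
    u = n_tested
    v = n_obs - n_regressors  # residual degrees of freedom
    if v < 1:
        return 0.0
    noncentrality = f2 * (u + v + 1)
    f_crit = stats.f.ppf(1.0 - alpha, u, v)
    return float(stats.ncf.sf(f_crit, u, v, noncentrality))


def required_sample_size(n_regressors, f2, alpha=0.05, power=0.8, n_max=100000):
    """Smallest number of observations that reaches the target power."""
    for n_obs in range(n_regressors + 1, n_max + 1):
        if glm_power(n_obs, n_regressors, f2, alpha) >= power:
            return n_obs
    raise ValueError("target power not reached within n_max observations")


# Illustrative use with a conventional 'medium' effect size of 0.15
n_age_only = required_sample_size(n_regressors=2, f2=0.15)
n_age_plus_covariate = required_sample_size(n_regressors=3, f2=0.15)
```

Passing the lower and upper bounds of a bootstrapped effect-size interval to `required_sample_size` gives a corresponding range of projected sample sizes.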

5 Supplementary Material

5.1 Source space age effect

The sensor-space effect shown in Figure 1 is replicated in a source-projected analysis using an LCMV beamformer, with voxel data organised into parcels that are then orthogonalised. The average profile of the age effect in source space is highly consistent with the sensor-space result.

5.2 Details of covariate analyses

The GLM-Spectrum effects of the investigated covariates are summarised in Figure 7 and shown here in full detail. Figures 10, 11 and 12 show the results for models fitted with an intercept term and the z-transformed covariate. The full GLM-Spectrum is shown for the parameter estimates (Figure 10), t-statistics (Figure 11) and Cohen’s F² effect sizes (Figure 12). The results from Figure 12 are summarised in Figure 7A in the main text.

GLM spectrum of parameter estimates for all covariates.

GLM spectrum of t-statistics for all covariates.

GLM spectrum of effect sizes for all covariates.

Figure 13 shows how the Cohen’s F² estimates for age change between two models. The first model contains an intercept and the z-transformed ages as regressors and the second model additionally contains the z-transformed values of an additional covariate. Cohen’s F² for the age regressor is computed from both models and Figure 13 shows the difference. Age itself is excluded from this analysis. This result is summarised in Figure 7B in the main text.

GLM spectrum of change in estimated age effect size when including a covariate in the model.

Age itself is excluded from this figure.

5.3 Robustness to head position correction methods

The change in the GLM spectrum with respect to head position correction is discussed in section 2.4. The results in Figure 14 show the full GLM-Spectra for the age effect computed for each combination of two sensor normalisation conditions - no normalisation (absolute magnitude) and z-transformed time series (relative magnitude) - and three SSS-based head position correction conditions - no correction, within-recording head position correction (−movecomp), and within- and between-recording head position correction (−movecomp and −trans). Results from the analysis of the relative magnitude with both within- and between-recording head position correction are shown in Figure 1 of the main text. Results from the analysis of the absolute magnitude with both within- and between-recording head position correction are shown in Figure 5 of the main text. The SSS head position correction methods have little effect on the estimated group age effect whereas the sensor normalisation changes the overall pattern of results, as discussed in section 2.4.

GLM-Spectrum results showing the effect of age estimated for two different sensor normalisations and three different SSS head position correction types.

A) The age effect for the absolute magnitude with SSS applied without head position correction. B) The age effect for the relative magnitude with SSS applied without head position correction. C) The age effect for the absolute magnitude with SSS applied with head position correction applied within each dataset (−movecomp in maxfilter software). D) The age effect for the relative magnitude with SSS applied with head position correction applied within each dataset (−movecomp in maxfilter software). E) The age effect for the absolute magnitude with SSS applied with head position correction applied within each dataset (−movecomp in maxfilter software) and head position alignment to a reference datafile (−trans in maxfilter software). F) The age effect for the relative magnitude with SSS applied with head position correction applied within each dataset (−movecomp in maxfilter software) and head position alignment to a reference datafile (−trans in maxfilter software). G) The spectral generalisation of the topography of the absolute magnitude age effect. H) The spectral generalisation of the topography of the relative magnitude age effect.

The impact of the sensor normalisation is a change in the spectral specificity of spatial patterns in the age effect. The largest component of the absolute magnitude age effect is a consistent spatial pattern covering the whole frequency range, leading to high positive correlations in the spatial maps of the age effect across all frequencies (Figure 14G). In contrast, the spatial topography of the relative magnitude age effect varies strongly across frequency (Figure 14H).
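The contrast between the two generalisation patterns can be reproduced with a small simulation. The sketch below assumes that spectral generalisation is the Pearson correlation between the sensor topographies of the age effect at each pair of frequencies; this definition, the function name and the simulated maps are assumptions for illustration and are not taken from the analysis code.

```python
import numpy as np


def spectral_generalisation(age_effect):
    """Correlate age-effect topographies between all pairs of frequencies.

    age_effect : array, shape (n_freqs, n_sensors)
    Returns an (n_freqs, n_freqs) matrix of Pearson correlations.
    """
    return np.corrcoef(age_effect)


def mean_off_diagonal(matrix):
    """Average of all entries excluding the diagonal."""
    mask = ~np.eye(matrix.shape[0], dtype=bool)
    return matrix[mask].mean()


rng = np.random.default_rng(3)
n_freqs, n_sensors = 40, 102

# Absolute-like case (hypothetical): one topography shared by all
# frequencies, scaled by a frequency-dependent amplitude
topography = rng.normal(size=n_sensors)
amplitude = rng.uniform(1.0, 2.0, size=(n_freqs, 1))
shared_maps = amplitude * topography + 0.1 * rng.normal(size=(n_freqs, n_sensors))

# Relative-like case (hypothetical): an unrelated topography per frequency
specific_maps = rng.normal(size=(n_freqs, n_sensors))

gen_shared = spectral_generalisation(shared_maps)
gen_specific = spectral_generalisation(specific_maps)
```

A shared topography produces correlations close to one across the whole matrix, whereas frequency-specific topographies leave only the diagonal, which is the qualitative difference between the two generalisation panels.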

Data & Code Availability

Raw sensor-level MEG recordings are from the CamCAN Data Repository: https://opendata.mrc-cbu.cam.ac.uk/projects/camcan/ (eyes-closed resting-state data) and the MEG-UK Database: https://meguk.ac.uk/database/ (eyes-closed resting-state data from Oxford, Cambridge & Nottingham). Preprocessed and analysed data to reproduce the group-level analyses are archived on the Open Science Framework: https://osf.io/yspdg/. Code to run all processing, data analysis and visualisation is available on GitHub: https://github.com/OHBA-analysis/Quinn2025-RobustReplicableAge.

The analyses in this paper were carried out in Python 3.11 with numpy [Harris et al., 2020], scipy [Virtanen et al., 2020] and Matplotlib [Hunter, 2007] as core dependencies. MNE-Python [Gramfort, 2013] was used for EEG/MEG data processing with OSL batch processing tools [Quinn et al., 2022, van Es et al., 2025]. The spectrum analyses further depend on the Spectrum Analysis in Linear Systems toolbox [Quinn and Hymers, 2020] and glmtools (https://pypi.org/project/glmtools/).

Additional information

Funding

This research was supported by the National Institute for Health Research (NIHR) Oxford Health Biomedical Research Centre. The Wellcome Centre for Integrative Neuroimaging is supported by core funding from the Wellcome Trust (203139/Z/16/Z). CG is supported by the Wellcome Trust (215573/Z/19/Z). OK is supported by the Marie Sklodowska-Curie Innovative Training Network “European School of Network Neuroscience (euSNN)” (860563). MWJvE is supported by the Wellcome Trust (106183/Z/14/Z, 215573/Z/19/Z), the New Therapeutics in Alzheimer’s Diseases (NTAD) study supported by the UK MRC, and the Dementia Platform UK (RG94383/RG89702). ACN is supported by the Wellcome Trust (104571/Z/14/Z) and the James S. McDonnell Foundation (220020448). MWW is supported by the Wellcome Trust (106183/Z/14/Z, 215573/Z/19/Z), the New Therapeutics in Alzheimer’s Diseases (NTAD) study supported by the UK MRC, the Dementia Platform UK (RG94383/RG89702) and the NIHR Oxford Health Biomedical Research Centre (NIHR203316). The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care.

For the purpose of open access, the author has applied a CC-BY public copyright licence to any Author Accepted Manuscript version arising from this submission.

The computations described in this paper were performed using the University of Birmingham’s BlueBEAR HPC service, which provides a High Performance Computing service to the University’s research community. See http://www.birmingham.ac.uk/bear for more details.

Author Contributions

A.J.Q.: Conceptualisation, Methodology, Software, Formal analysis, Data Curation, Project administration, Writing – Original Draft, Writing – Review & Editing, and Visualisation. C.G.: Conceptualisation, Methodology, Software, Formal analysis, Data Curation, and Writing – Review & Editing. J.P.: Conceptualisation, Formal analysis, and Writing – Review & Editing. O.K.: Conceptualisation, Writing – Original Draft, and Writing – Review & Editing. M.v.E.: Conceptualisation, Writing – Review & Editing. A.C.N.: Conceptualisation, Writing – Review & Editing, Supervision. M.W.W.: Conceptualisation, Methodology, Writing – Original Draft, Writing – Review & Editing, and Supervision.

Funding

Wellcome Trust

https://doi.org/10.35802/215573

Wellcome Trust

https://doi.org/10.35802/106183

Wellcome Trust

https://doi.org/10.35802/104571

Wellcome Trust

https://doi.org/10.35802/203139

James S. McDonnell Foundation

https://doi.org/10.37717/220020448

Marie Curie Innovative Training Network (860563)

Medical Research Council (RG94383/RG89702)

National Institute for Health Research (NIHR203316)