Genetic loci and metabolic states associated with murine epigenetic aging

  1. Khyobeni Mozhui  Is a corresponding author
  2. Ake T Lu
  3. Caesar Z Li
  4. Amin Haghani
  5. Jose Vladimir Sandoval-Sierra
  6. Yibo Wu
  7. Robert W Williams
  8. Steve Horvath
  1. Department of Preventive Medicine, University of Tennessee Health Science Center, College of Medicine, United States
  2. Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, College of Medicine, United States
  3. Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, United States
  4. Department of Biostatistics, Fielding School of Public Health, University of California Los Angeles, United States
  5. YCI Laboratory for Next-Generation Proteomics, RIKEN Center for Integrative Medical Sciences, Japan
  6. University of Geneva, Switzerland


Changes in DNA methylation (DNAm) are linked to aging. Here, we profile highly conserved CpGs in 339 predominantly female mice belonging to the BXD family for which we have deep longevity and genomic data. We use a ‘pan-mammalian’ microarray that provides a common platform for assaying the methylome across mammalian clades. We computed epigenetic clocks and tested associations with DNAm entropy, diet, weight, metabolic traits, and genetic variation. We describe the multifactorial variance of methylation at these CpGs and show that high-fat diet augments the age-related changes. Entropy increases with age. The progression to disorder, particularly at CpGs that gain methylation over time, was predictive of genotype-dependent life expectancy. The longer-lived BXD strains had comparatively lower entropy at a given age. We identified two genetic loci that modulate epigenetic age acceleration (EAA): one on chromosome (Chr) 11 that encompasses the Erbb2/Her2 oncogenic region, and the other on Chr19 that contains a cytochrome P450 cluster. Both loci harbor genes associated with EAA in humans, including STXBP4, NKX2-3, and CUTC. Transcriptome and proteome analyses revealed correlations with oxidation-reduction, metabolic, and immune response pathways. Our results highlight concordant loci for EAA in humans and mice, and demonstrate a tight coupling between the metabolic state and epigenetic aging.

Editor's evaluation

This article used three newly generated epigenetic predictors to test how they differ between genetically diverse mice from the BXD family (by looking at metabolic traits and lifespan). The authors subsequently identified several quantitative trait loci for the different predictors, using linkage analysis, and performed transcriptome and proteome analyses of liver and adipose tissue. The described results provide some important new insights on the underlying biology of epigenetic mouse aging and may be used to inform future studies in other model organisms and humans focused on studying the relationship between epigenetic aging and metabolism.


Epigenetic clocks are widely used molecular biomarkers of aging (Horvath and Raj, 2018). These DNA methylation (DNAm) age predictors are based on the methylation levels of select CpGs that are distributed across the genome. Each CpG that is used in a clock model is assigned a specific weight, typically derived from supervised training algorithms (Bell et al., 2019; Thompson et al., 2018; Porter et al., 2021), and collectively, the methylation status across this ensemble of ‘clock CpGs’ is used to estimate the epigenetic age (DNAmAge). This estimate tracks closely, but not perfectly, with an individual’s chronological age. How much the DNAmAge deviates from the known chronological age can be a measure of the rate of biological aging. Denoted as ‘epigenetic age acceleration’ (EAA), a more accelerated clock (positive EAA) suggests an older biological age, and a decelerated clock (negative EAA) suggests a younger biological age. The extent of age acceleration has been associated with variation in health, fitness, exposure to stressors, body mass index (BMI), and even life expectancy (Zannas et al., 2015; Marioni et al., 2015; Dugué et al., 2018; Lu et al., 2019; Ryan et al., 2020).

DNAm clocks were initially reported for humans (Hannum et al., 2013; Horvath, 2013; Bocklandt et al., 2011). Since then, many different models of human DNAmAge have been developed, and this rapid expansion was made possible by reliable DNAm microarrays that provide a fixed CpG content – starting with the Illumina Infinium 27K to the current 850K EPIC array (Horvath, 2013; Zhang et al., 2019; Liu et al., 2020; Lee et al., 2020). These clock variants differ in the subset of CpGs that go into the age estimation model. Some clock models are specific to cells or tissues, others are multitissue. Some clocks perform better at predicting chronological age, others better capture biological aging and predict health and life expectancy (Lu et al., 2019; Horvath et al., 2018; Levine et al., 2018; Shireby, 2020). The performance of these clocks depends heavily on the training models, and the size and tissue types of the training set (Zhang et al., 2019).

The DNAm age biomarker has also been extended to model organisms, and this has opened up the possibility of directly testing the effects of different interventions such as caloric restriction, rapamycin, and genetic manipulation (Thompson et al., 2018; Petkovich, 2017; Stubbs et al., 2017; Wang, 2017; Levine et al., 2020; Han, 2018). However, one point to note is that model organisms have not benefited from a microarray platform comparable to that of the human methylation Infinium arrays. Most rodent studies have used enrichment-based DNAm sequencing, and this limits the transferability and reproducibility of clocks between datasets since the same CpGs are not always covered (Wang, 2017). Moreover, these studies are usually performed in a single inbred strain (for mouse, the canonical C57BL/6), or at most, a few genetic backgrounds, and this makes it impossible to carry out genetic mapping studies that can complement the human genome-wide association studies (GWAS) of epigenetic aging (Gibson, 2019; Lin, 2021; Lu et al., 2018; McCartney et al., 2021; Kuo et al., 2021).

A new microarray was recently developed to profile CpGs that have high conservation in mammals. This pan-mammalian DNAm array (HorvathMammalMethylChip40) surveys over 37K CpGs and provides a unifying platform to study epigenetic aging in mammals (Arneson et al., 2022). This array has been used to build multitissue universal clocks and lifespan predictors that are applicable to a variety of mammalian species (Lu et al., 2021; Li et al., 2021). Here, we use this array to examine the dynamism and variability of the conserved CpGs in a genetically diverse cohort of mice belonging to the BXD family (Williams et al., 2022; Roy et al., 2021).

The BXDs are one of the preeminent murine genetic reference panels used as the experimental paradigm of precision medicine (Ashbrook et al., 2021). They are a large family of recombinant inbred (RI) strains made by crossing the C57BL/6J (B6) and DBA/2J (D2) parental strains. The family has been expanded to 150 fully sequenced progeny strains (Ashbrook et al., 2021; Peirce et al., 2004). The individual members of the BXD family (e.g., BXD1, BXD27, BXD102) represent a replicable isogenic cohort. The family segregates for a high level of genetic variation, and likewise, family members have high variation in their metabolic profiles, responses to diet, aging rates, and life expectancies (Williams et al., 2022; Roy et al., 2021; Ashbrook et al., 2021; de Haan and Williams, 2005; Hsu et al., 2003; Lang et al., 2010). The availability of deep genome sequence data, and unrivaled multi-omic and phenomic data make the BXDs a powerful tool with which to evaluate the causal linkage between genome, epigenome, and aging rates.

In our previous work, we used an enrichment-based sequencing to assay the methylome in a modest number of BXD mice and reported rapid age-dependent methylation changes in mice on high-fat diet (HFD) and mice with higher body weight (Sandoval-Sierra et al., 2020). In this study, we start by testing the performance of new pan-tissue and liver-specific epigenetic mouse clocks and evaluate how these relate to metabolic states, genotype-dependent life expectancy, and methylome entropy. We also apply a multifactor analysis of site-specific CpG methylation to understand the interrelations among four key variables – chronological age, diet, weight, and lifespan – and the liver methylome. We perform quantitative trait locus (QTL) mapping, along with multi-omic gene expression analyses, and identify upstream gene loci that modulate the DNAm clocks.

Our results are consistent with a faster clock for cases on HFD and with higher body weight. This may be partly because exposure to HFD augmented the age-dependent gains in methylation at specific CpGs. We also observed that BXD genotypes with longer life expectancies tend to have lower methylation at CpGs that undergo age-dependent methylation gains, and the entropy computed from this set of CpGs has a significant inverse correlation with strain lifespan. QTL mapping uncovered loci on chromosomes (Chrs) 11 and 19 that are associated with EAA. A strong candidate gene in the chromosome (Chr) 11 interval (referred to as Eaa11) is Stxbp4, a gene that has been consistently linked to EAA by human GWAS (Gibson, 2019; Lu et al., 2018; McCartney et al., 2021). The Chr19 QTL (Eaa19) also harbors strong contenders including Cyp26a1, Myof, Cutc, and Nkx2–3, and the conserved genes in humans have been associated with longevity and EAA (McCartney et al., 2021; Benton et al., 2017; Yashin et al., 2018). We performed gene expression analyses using transcriptomic and proteomic data to clarify the molecular pathways linked to epigenetic aging, and this highlighted metabolic networks, and also apolipoproteins (including APOE) as strong expression correlates.


Description of samples

Liver DNAm data was from 321 females and 18 males belonging to 45 members of the BXD family, including both parental strains and F1 hybrids. Age ranged from 5.6 to 33.4 months. Mice were all weaned onto a normal chow (control diet [CD]), and a balanced subset of cases were then randomly assigned to HFD (see Roy et al., 2021 for details). Tissues were collected at approximately 6 months’ intervals (see Williams et al., 2022). Individual-level data are given in Supplementary file 1.

DNAm clocks, entropy, and chronological age prediction

We built three different clocks, and each was developed as a pair depending on whether the training set used all tissues (pan-tissue) or a specific tissue (in this case, liver). These are (1) a general DNAm clock (referred to simply as DNAmAge): clock trained without pre-selecting for any specific CpGs; (2) developmental clock (dev.DNAmAge): built from CpGs that change during development (defined as the period from prenatal to 1.6 months); and (3) interventional clock (int.DNAmAge): built from CpGs that change in response to aging-related interventions (caloric restriction and growth hormone receptor knockout). The clocks we report here were trained in a larger mouse dataset that excluded the BXDs and are therefore unbiased to the characteristics of the BXD family (Lu et al., 2021; Li et al., 2021; Haghani et al., 2022). The specific clock CpGs and coefficients for DNAmAge computation are given in Supplementary file 2. All clocks performed well in age estimation and had an average r of 0.89 with chronological age. However, the interventional clocks had higher deviation from chronological age and higher median predictive error (Table 1, Figure 1a). The age-adjusted EAA derived from these clocks showed wide individual variation (Figure 1b).

Table 1
Chronological age prediction and correlation with methylome-wide entropy.
Clock typeDNAmAge nameTissuer with age (n = 339)*Age prediction median errorr with entropy(n = 339)*,
Standard clocksDNAmAgePan0.890.120.43
Developmental clocksdev.DNAmAgePan0.870.140.39
Interventional clocksint.DNAmAgePan0.850.170.29
  1. *


  2. Methylome-wide entropy calculated from ~28K CpGs.

Figure 1 with 2 supplements see all
Correlates and modifiers of epigenetic clocks and methylome-wide entropy.

(a) Correlation between chronological age and predicted age (shown for the liver intervention clock or int.DNAmAge). Black circles are control diet (CD, n = 210); red crosses are high-fat diet mice (HFD, n = 128). (b) Violin plots of age-adjusted epigenetic age acceleration (EAA) (‘int,’ interventional; ‘dev,’ developmental). (c) Shannon entropy, calculated from the full set of high-quality CpGs, increases with age. (d) Methylome entropy has a direct correlation with EAA (shown for the liver int.EAA). (e) For 48 mice, initial body weight (BW0) was measured 1 or 3 days after introduction to HFD, and these showed significant correlation with EAA. (f) Weight was first measured at mean age of 4.5 ± 2.7 months (BW0), and then at 6.3 ± 2.8 months (BW1). Weight gains during this interval (deltaBW = BW1 – BW0) are a direct correlate of EAA. (g) For BXD genotypes with males and females samples, males have higher age acceleration. Bars represent mean ± standard error; 40 females (26 CD, 14 HFD) and 18 males (10 CD, 8 HFD). (h) Relative effects of different predictor variables on EAA shown as logworth scores (-log10p). The dashed lines correspond to p=0.01. Positive logworth values indicate positive regression estimates (for diet, positive means higher in HFD compared to CD). BWF, final weight; Chol, serum total cholesterol; Gluc, fasted glucose levels. (i) The residual plot displays association between methylome-wide entropy and the pan-tissue int.EAA after adjustment for diet, age, weight, glucose, cholesterol, and batch.

We next estimated the methylome-wide entropy as a measure of randomness and information loss. This was computed from 27,966 probes that provide high-quality data and have been validated to perform well in mice (Arneson et al., 2022). Consistent with previous reports (Hannum et al., 2013; Sziráki et al., 2018), this property increased with chronological age, and age accounted for about 6% (in CD) to 28% (in HFD) of the variance in entropy (Figure 1c). As direct correlates of chronological age, all the DNAmAge estimates also had significant positive correlations with entropy (Table 1). We hypothesized that higher entropy levels will be associated with higher EAA, and based on this bivariate comparison, most of the EAA showed a significant positive correlation with entropy (Figure 1d; Supplementary file 3).

How the epigenetic readouts relate to diet, sex, and metabolic traits


HFD was associated with higher EAA for four of the clocks (Table 2). For instance, the liver-specific interventional clock diverged between the diets (Figure 1a), and CD mice had an average of –0.04 years of age deceleration, and HFD mice had an average of +0.11 years of age acceleration (Table 2). The two clocks that were not affected by diet were the liver general and developmental clocks. Methylome-wide entropy was not different between the diets.

Table 2
Association with diet and weight, and heritability of the epigenetic readouts.
TypeEAADietMean (SD)Diet (p)rBW0*pBW0rBWF*pBWFh2Strain r
Mouse clocksEAA, panCD–0.05 ± 0.21<0.00010.190.0060.29<0.00010.490.54
HFD0.07 ±<0.00010.50
EAA, liverCD0 ± 0.17ns0.09ns0.200.0030.400.73
HFD0.03 ±<0.00010.52
dev.EAA, panCD–0.04 ± 0.230.0040.09ns0.220.0010.530.76
HFD0.03 ±<0.00010.61
dev.EAA, liverCD0 ± 0.2ns0.190.0020.29<0.00010.460.78
HFD0 ±<0.00010.60
int.EAA, panCD–0.05 ± 0.250.00030.03ns0.210.0020.270.66
HFD0.06 ± 0.330.220.010.46<0.00010.45
int.EAA, liverCD–0.04 ± 0.22<0.00010.05ns0.180.010.590.80
HFD0.11 ±<0.00010.54
Entropy-CD2.67 ± 0.02ns0.09ns0.05ns0.310.24(ns)
HFD2.67 ±
  1. CD, control diet; HFD, high-fat diet; EAA, epigenetic age acceleration; int, interventional; dev, developmental.

  2. *

    BW0 is body weight at about 4.5 months of age (n = 339; 210 CD and 129 HFD); BWF is final weight at tissue collection (1 HFD case missing data; n = 338; 210 CD and 128 HFD).

  3. Pearson correlation between strain means for n = 29 BXD genotypes kept on CD and HFD.

Body weight

Body weight was first measured when mice were at an average age of 4.5 ± 2.7 months. We refer to this initial weight as baseline body weight (BW0). For mice on HFD, this was usually before introduction to the diet, except for 48 cases that were first weighed 1 or 3 days after HFD (Supplementary file 1). In the CD group, only the EAA from the pan-tissue standard and liver developmental clocks showed modest but significant positive correlations with BW0 (Table 2). In the HFD group, the positive correlation with BW0 was more robust and consistent across all the clocks, and this may have been due to the inclusion of the 48 cases that had been on HFD for 1 or 3 days. Taking only these 48 cases, we found that higher weight even after 1 day of HFD had an age-accelerating effect on all the clocks, and this was particularly strong for the interventional clocks (r = 0.45, p=0.001 for the pan-tissue int.EAA; r = 0.58, p<0.0001 for the liver int.EAA) (Figure 1e). Second weight was measured 7.4 ± 5.2 weeks after BW0 (mean age 6.3 ± 2.8 months). We refer to this as BW1 and estimated the weight change as deltaBW = BW1 – BW0. DeltaBW was a positive correlate of EAA in both diets, albeit more pronounced in the HFD group (Figure 1f). The final body weight (BWF) was measured at the time of tissue harvest, and EAA from all the mouse clocks were significant correlates of BWF in both diets (Table 2). In contrast, entropy did not show an association with either BW0 or BWF. We do note that when stratified by diet the entropy level had a slight positive correlation with BW1 in the HFD group (r = 0.23, p=0.008), but not in the CD group (Supplementary file 3).


Four BXD genotypes (B6D2F1, D2B6F1, BXD102, B6) had cases from both males and females. We used these to test for sex effect. All the clocks showed significant age acceleration in male mice, and this was particularly strong for both dev.EAA and pan-tissue int.EAA (Figure 1g, Supplementary file 3). The effect of sex was independent of the higher BWF of males, and the higher age acceleration in males was detected after adjusting for BWF (Supplementary file 4a). There was no significant difference in entropy between the sexes.

Metabolic measures

278 cases with DNAm data also had fasted serum glucose and total cholesterol (Williams et al., 2022; Roy et al., 2021), and we examined whether these metabolic traits were associated with either the EAA measures or methylome entropy. Since these are highly dependent on diet, weight, and age, we applied a multivariable model to jointly examine how the different metabolic variables (cholesterol and glucose, as well as diet and weight) and entropy relate to EAA after adjusting for age. To test the robustness of associations, we also include the methylation array batch as another covariate (Supplementary file 1 has batch information; Supplementary file 5 has the full statistics). Figure 1h shows the relative strengths and directions of associations between these variables and the EAA traits. Except for the pan-tissue interventional clock, entropy had a strong positive correlation with EAA. For example, a plot of residuals between entropy and the liver int.EAA indicates that after adjusting for all the other covariates, the methylome-wide entropy explains 17% of the variance in int.EAA (Figure 1i). Since diet strongly influences BWF, the inclusion of BWF in the regression diminished the effect of diet. For the two clocks that were not influenced by diet (the liver EAA and liver dev.EAA), adjusting for the effect of BWF resulted in an inverse association with diet (i.e., the residual EAA values after accounting for BWF were slightly lower in the HFD group). Fasted glucose did not have a significant effect on EAA. Cholesterol had a positive association with the interventional clocks but the effects were modest (residual R2 = 0.02 and p=0.02 for cholesterol and the pan-tissue int.EAA).

We also performed a similar analysis with BW0 instead of BWF (Figure 1—figure supplement 1), and here, HFD remained as an accelerator of the pan-tissue EAA and liver int.EAA. Cholesterol also became a significant positive correlate of EAA for the interventional clocks. This would suggest that the effect of diet on EAA is mostly mediated by its impact on physiological and metabolic traits, and BWF becomes a prominent predictor of EAA.

To summarize, our results indicate that the degree of disorder in the methylome increases with age and may partly contribute to the epigenetic clocks as higher entropy is associated with higher EAA. The EAA traits were also associated with biological variables (i.e., body weight, diet, and sex). Of these, BWF was the strongest modulator of EAA.

How the epigenetic readouts relate to strain longevity

We next obtained longevity data from a parallel cohort of female BXD mice that were allowed to age on CD or HFD (Roy et al., 2021). Since the strain lifespan was determined from female BXDs, we restricted this to only the female cases. For strains with natural death data from n ≥ 5, we computed the minimum (minLS), 25th quartile (25Q-LS), mean, median lifespan, 75th quartile (75Q-LS), and maximum lifespan (maxLS) (Supplementary file 1). Specifically, we postulated an accelerated clock for strains with shorter lifespan. Overall, the EAA measures showed the expected inverse correlation trend with the lifespan statistics (Supplementary file 4b). However, these correlations were weak. The correlations were significant only for the pan-tissue general clock (Figure 1—figure supplement 2a) and the liver intervention clock, with explained variance in lifespan of only ~3% (Figure 1—figure supplement 2b and c). When separated by diet, these correlations became weaker, indicating that while we see the expected inverse relationship, the EAA is only weakly predictive of strain longevity. Entropy estimated from the full set of CpGs was unrelated to strain longevity.

Multifactor variance of the conserved CpGs

Both entropy and clock readouts capture the overall variation across multiple CpGs, and to gain insights into the underlying variance patterns, we performed a multivariable epigenome-wide association study (EWAS). For this, we applied a site-by-site regression on the 27,966 validated CpGs (Arneson et al., 2022) and tested for association with age, BWF, diet, and genotype-dependent strain median lifespan (full set of probes, annotations, and EWAS results in Supplementary file 6).

Age was clearly the most influential variable, and this is apparent from the volcano plots (Figure 2a–d). We used a cutoff of Bonferroni p≤0.05 to define differentially methylated CpGs (DMCs), and 6553 CpGs were associated with age (referred to as age-DMCs), 733 with weight (weight-DMCs), 321 with diet (diet-DMCs), and 236 with genotype-dependent lifespan (LS-DMCs). We note extensive overlap among the lists of DMCs that shows that variation at these CpGs are multifactorial in nature (Figure 1e). Majority of the age-DMCs (77%) gained methylation (or age-gain), and consistent with previous observations, age-gain CpGs tended to be in regions with low methylation, whereas age-DMCs that declined in methylation (age-loss) were in regions with high methylation (Figure 2f; Sandoval-Sierra et al., 2020; Sziráki et al., 2018; Slieker et al., 2016). By overlaying the volcano plots with the age-gain and age-loss information, we see distinct patterns in how these age-DMCs vary with weight (Figure 2b), diet (Figure 2c), and genotype lifespan (Figure 2d). While the majority of CpGs, including several age-loss CpGs, had negative regression estimates for weight (i.e., decrease in DNAm with unit increase in weight), HFD was associated with higher methylation levels (positive regression estimates) including at several age-DMCs (Figure 2c). This pattern of inverse correlation with weight but heightened methylation due to HFD is illustrated by the CpG in the 3′UTR of Mettl23 (cg10587537) (Figure 2g). Taking the 6553 age-DMCs, a comparison of the regression estimates for age (i.e., the change in methylation per day of aging) versus diet (difference in HFD relative to CD) shows that the age-gains were augmented in methylation by HFD (Figure 2h), and again, this is illustrated by the CpG in the Mettl23 3′UTR (Figure 2i). For the LS-DMCs, sites that had negative regression estimates for lifespan (i.e., lower DNAm per day increase in strain median longevity) had higher proportion of age-gain CpGs (Figure 2d). A comparison between the regression estimates for age versus the regression estimates for lifespan shows that CpGs that gain methylation with age tended to have lower methylation in strains with longer lifespan (Figure 2j).

Figure 2 with 2 supplements see all
Multivariable analysis of site-specific methylation.

(a) Volcano plot comparing regression estimates (change in methylation beta-value per day of age) versus the statistical significance for age effect. Dashed line denotes the Bonferroni p=0.05 for ~28K tests. Similar volcano plots for predictor variables: (b) final body weight (regression estimates are change per gram of weight), (c) diet (change in high-fat diet [HFD] compared to control diet [CD]), and (d) strain median lifespan (per day increase in median longevity). CpGs that were significantly associated with age are denoted by colored markers (red circles: age-gain; yellow triangles: age-loss). (e) Overlap among the lists of differentially methylated CpGs. (f) Each dot represents the mean methylation beta-values for the 5030 age-gain and 1523 age-loss CpGs. (g) Correlation between body weight and methylation beta-values for the CpG (cg10587537) located in the 3′UTR of Mettl23. Mice on HFD have higher methylation than mice on CD, but the inverse correlation with weight is consistent for both groups (r = –0.45, p<0.0001 for CD; r = –0.15, p=0.08 for HFD). (h) Contour density plot for the 6553 CpGs that are significantly associated with age (age-DMCs). This relates the pattern of change with age (x-axis) with change on HFD (y-axis). CpGs that gain methylation with age are also increased in methylation by HFD. (i) Correlation between age and methylation at the Mettl23 3′UTR CpG (r = 0.35 for CD; r = 0.46 for HFD). (j) For the 6553 age-DMCs, the contour density plot relates the pattern of change with age (x-axis) vs. change with median longevity (y-axis). CpGs that gain methylation with age have lower methylation with higher lifespan.

As in Sziráki et al., 2018, we divided the CpGs by age effect: age-gain, age-loss, and those that do not change strongly with age (age-ns; i.e., the remaining 21,413 CpGs that were not classified as age-DMCs). For these conserved CpGs, both sets of age-DMCs had significant increases in entropy with age regardless of diet (Figure 3a and b), and even the age-ns showed a modest entropy gain with age in the HFD group (Figure 3c). The reason for this increase in disorder becomes evident when we compare the density plots using the full set of CpGs for one of the younger mice (UT319; 0.56 years old) and one of the older mice (UT573; 2.3 years) (Figure 3d). Concordant with previous reports (Sziráki et al., 2018; Kerepesi et al., 2022), the older sample showed a subtle flattening of the bimodal peaks towards a slightly more hemi-methylated state. The entropy of the age-gain CpGs was modestly but significantly higher in the HFD group (p=0.001; Figure 3e). Entropy of the age-loss and age-ns CpGs was not different between the diets. Body weight, on the other hand, was associated specifically with the entropy score of the age-loss CpGs, and both higher BW0 (Figure 3f) and BWF predicted higher entropy for age-loss CpGs.

Entropy at age-associated CpGs.

Entropy values were calculated for the 5020 age-gain and 1523 age-loss CpGs separately. For both control diet (CD) and high-fat diet (HFD), there is a significant increase in entropy with age at the (a) age-gain and (b) age-loss CpGs. (c) The HFD mice also showed a slight increase in entropy at CpGs that were not strongly associated with age (age-ns). (d) The methylome-wide distribution of beta-values in a young adult mouse (0.6 years old; black dashed line) and an older mouse (2.3 years old; red line); both CD mice. The young mouse has higher peaks at the hypomethylated (closer to 0.1) and hypermethylated (around 0.9) beta-values compared to the older mouse. (e) The HFD group has higher entropy at the age-gain CpGs compared to the CD group. (f) Entropy at age-loss CpGs is higher with higher baseline weight (BW0). (g) Relative effects of predictor variables on entropy shown as logworth scores (-log10p). The dashed lines correspond to p=0.01. Positive values indicate positive regression estimates (for diet, positive value means higher in HFD). BWF, final weight; Chol, serum total cholesterol; Gluc, fasted glucose levels; LS, strain median lifespan. (h) The residual plot (adjusted for age, diet, BWF, glucose, cholesterol, and batch) shows the inverse association between entropy at age-gain sites and lifespan. Similar residual plots show the association between (i) BWF and age-loss entropy, and (j) between fasted serum glucose and age-gain entropy.

We applied a multivariable regression to compare the relative effects of age, diet, BWF, glucose, cholesterol, and strain median lifespan (Figure 3g; full statistics in Supplementary file 7). Entropy of age-gain CpGs was increased by HFD but was not associated with BWF. Strain median lifespan showed a significant inverse correlation with the entropy of age-gain CpGs with an explained variance of 6% (Figure 3h). Entropy of the age-loss CpGs had a significant positive correlation with BWF (Figure 3i), but was not associated with diet, and also had a modestly significant inverse correlation with median lifespan. Cholesterol was unrelated to the entropy values. Glucose, on the other hand, showed a significant inverse association with entropy of both age-gain (Figure 3j) and age-loss CpGs, and this suggests slightly lower entropy with higher fasted glucose.

Taken together, our results show that the conserved CpGs are influenced by multiple predictors. HFD augmented the age-dependent changes with a prominent effect on age-gain CpGs. Body weight showed a strong association with the age-loss CpGs. Additionally, strains with longer life expectancy tended to have lower methylation levels at age-gain CpGs with an overall lower entropy state at these CpGs that suggests a more ‘youthful’ methylome for longer lived genotypes.

Functional and genomic context of DMCs

To uncover the potential biological pathways represented by the DMCs, we performed genomic regions enrichment analyses for the CpGs (McLean et al., 2010). The age-gain CpGs were highly enriched in transcription factors, regulators of development and growth, menarche and menstrual phases, energy metabolism, and transcription factor networks such as HNF1 and HNF3B pathways (Supplementary file 8). The age-loss CpGs had somewhat modest enrichment and represented cell adhesion and cytoskeletal processes, endothelial cell proliferation, and p38 signaling. The BW-DMCs were enriched in actin and protein metabolism, and WNT, and platelet-derived growth factor (PDGF) and ErbB signaling. Similarly, the diet-DMCs were highly enriched in PDGF, epidermal growth factor (EGFR), and ErbB signaling, as well as the mTOR signaling pathway, and regulation of energy homeostasis (Supplementary file 8). Seeming to converge on common pathways, the LS-CpGs that were negatively correlated with lifespan had modest enrichment in cell signaling pathways such as EGFR, PDGF, and ErbB signaling. The LS-CpGs with positive correlation with lifespan were highly enriched in lipid metabolic genes and also included pathways related to chromosome maintenance and telomere expansion (Supplementary file 8).

We next examined the genomic annotations and chromatin states of the DMCs. Consistent with previous reports (Sandoval-Sierra et al., 2020; Sziráki et al., 2018), age-gain CpGs were enriched in promoter and 5′UTR CpGs, but depleted in 3′UTR, exon, and intergenic CpGs (Figure 2—figure supplement 1a, Supplementary file 9). Diet- and weight-DMCs were depleted in promoter regions and enriched in exons and 3′UTR, and along with the age-loss CpGs, enriched in introns. For chromatin states, we annotated the CpG regions using the 15-state chromatin data for neonatal (P0) mouse liver (Ernst and Kellis, 2012; Gorkin et al., 2020). Also included were regions labeled as no reproducible state (NRS); that is, regions that were not replicated (Supplementary file 6 has annotations for each site; Gorkin et al., 2020). Compared to the array content as background, the age-gain CpGs were selectively enrich in polycomb-associated heterochromatin (Hc-P) and bivalent promoters (Pr-Bi), chromatin states that were highly depleted among the other DMCs (Figure 2—figure supplement 1b). In contrast, strong and permissive transcription sites (Tr-S and Tr-P, respectively) were depleted among the age-gain CpGs and enriched among the BW- and diet-DMCs. Age-loss CpGs were enriched in Tr-P and Tr-I (transcription initiation). Distal enhancers (strong distal or En-Sd, and poised distal or En-Pd) were also highly enriched among the BW- and diet-DMCs, and also showed some enrichment among the age-DMCs.

For an overview of the general methylation and variance patterns by chromatin annotations, we used the full set of 27,966 CpGs and computed the average methylation beta-values, and average regression coefficients (i.e., change in beta-value per unit change in the respective predictor variable, or contrast between diets). As expected, promoter CpGs and Hc-P were sites with the lowest methylation. Hc-H, Tr-S, and Tr-P had higher methylation, and many of the enhancer sites were in the hemi-methylated zone. For age effect, mean regression estimates had a significant inverse linear fit with mean methylation (r = –0.63, p=0.009; Figure 2—figure supplement 1c), and this is consistent with the greater age-loss at hypermethylated CpGs and greater age-gains at hypomethylated CpGs (Figure 2f). The effects of diet and weight were not linearly related to the mean methylation of the chromatin states. Instead, both showed a U-shaped fit with a significant negative quadratic effect for diet (R2 = 0.69, p=0.0005, quadratic estimate = –0.05; Figure 2—figure supplement 1d) and a positive quadratic effect for weight (R2 = 0.50, p=0.01, quadratic estimate = 0.001; Figure 2—figure supplement 1e). Methylation variation as a function of strain longevity did not relate to mean methylation with either a linear or polynomial fit, and indicates that variance due to background genotype is less dependent on the chromatin and mean methylation status. While this is a very low-resolution and broad view of methylation levels and methylation variation, the observations show that while aging results in erosion of the hypo- and hypermethylated peaks, diet and body weight appear to have generally stronger associations with hemi-methylated sites.

Genetic analysis of epigenetic age acceleration

The EAA traits had moderate heritability at an averaged h2 of 0.50 (Table 2; Ashbrook et al., 2021). Another way to gauge the level of genetic correlation is to compare between members of strains maintained on different diets. All the EAA traits shared high strain-level correlations between diets, indicating an effect of background genotype that is robust to dietary differences (Table 2). The methylome-wide entropy had a heritability of ~0.30 and had no strain-level correlation between diets.

To uncover genetic loci, we applied QTL mapping using mixed linear modeling that corrects for the BXD kinship structure (Zhou and Stephens, 2014). First, we performed the QTL mapping for each EAA trait with adjustment for diet and body weight. EAA from the two interventional clocks had the strongest QTLs (Supplementary file 10). The pan-tissue int.EAA had a significant QTL on Chr11 (90–99 Mb) with the highest linkage at ~93 Mb (p=3.5E-06; equivalent to a LOD score of 4.7) (Figure 4a). Taking a genotype marker at the peak interval (BXD variant ID DA0014408.4 at Chr11, 92.750 Mb) (Ashbrook et al., 2021), we segregated the BXDs homozygous for either the D2 (DD) or the B6 (BB) alleles. Strains with DD genotype at this locus had significantly higher int.EAA (Figure 4a, inset). The liver int.EAA had the highest QTL on Chr19 (35–45 Mb) with the most significant linkage at markers between 38 and 42 Mb (p=9E-07; LOD score of 5.2) (Figure 4b). We selected a marker at the peak interval (rs48062674 at Chr19, 38.650 Mb), and the BB genotype had significantly higher liver int.EAA compared to DD (Figure 4b, inset).

Figure 4 with 1 supplement see all
Quantitative trait locus (QTL) maps for the DNAm readouts.

The Manhattan plots represent the location of genotyped markers (x-axis) and linkage –log10p (y-axis). (a) The peak QTL for age acceleration from the pan-tissue interventional clock (int.EAA) is on chromosome (Chr) 11 at ~93 Mb. The inset shows the mean (± standard error) trait values for BXDs homozygous for the C57BL/6J allele (BB; gray) versus BXDs homozygous for the DBA/2J allele (DD; black) on control diet (CD) and high-fat diet (HFD). (b) The liver-specific int.EAA has a peak QTL on Chr19 (~38 Mb). Trait means by genotype at this locus are shown in inset; BB has higher age acceleration. (c) Linkage statistics are weaker for the methylome-wide entropy. However, there is a nominally significant linkage on the Chr19 locus, but the peak markers are at ~47.5 Mb. Here, the BB genotype has higher entropy.

We performed a similar QTL mapping for methylome-wide entropy with adjustment for major covariates (diet, chronological age, and body weight). There were no genome-wide significant QTLs. A region on Chr19 that overlapped the liver int.EAA showed a modest peak (Figure 4c, Supplementary file 10). However, the peak markers for entropy were located slightly distal to the peak EAA QTL (~47.5 Mb at rs30567369, minimum p=0.0005). At this locus, the BB genotype had higher average entropy.

To identify regulatory loci that are consistent across the different EAA measures, we applied a multitrait analysis and derived the linkage meta-p-value using a p-value combination for the six EAA traits (Peirce et al., 2007). The peaks on Chrs 11 and 19 attained the highest consensus p-values (Figure 4—figure supplement 1a). There was another potential consensus peak at combined –log10 p>6 on Chr3 (~54 Mb). We focus on the Chrs 11 and 19 QTLs and refer to these as EAA QTL on Chr 11 (Eaa11), and EAA QTL on Chr 19 (Eaa19). Eaa11 extends from 90 to 99 Mb. For Eaa19, we delineated a broader interval from 35 to 48 Mb that also encompasses the peak markers for entropy.

We performed marker-specific linkage analyses for each of the clocks using a regression model that adjusted for diet. With the exception of the liver int.EAA, all the EAA traits had nominal to highly significant associations with the representative Eaa11 marker (DA0014408.4), and the DD genotype had higher age acceleration (Table 3). Mean plots by genotype and diet show that this effect was primarily in the CD mice (Figure 4—figure supplement 1b). The effect of this locus appeared to be higher for the pan-tissue clocks compared to the corresponding liver-specific clocks. For proximal Eaa19, the representative marker (rs48062674) was associated with all the EAA traits and the BB mice had higher age acceleration in both diets (Figure 4—figure supplement 1c). We also tested if these peak markers were associated with the recorded lifespan phenotype, and we found no significant association with the observed lifespan of the BXDs.

Table 3
Marker-specific linkage analyses for epigenetic age acceleration and body weight trajectory.
Linear regression*
PredictorOutcomeEstimateStandard errort ratiop
Eaa11DA0014408.4[DD]Chr11, 92.750 Mb(133 BB cases, and 173 DD cases)EAA, pan0.0960.0234.1843.8E-05
EAA, liver0.0670.0173.8800.0001
dev.EAA, pan0.0770.0253.0410.003
dev.EAA, liver0.0370.0201.8780.06
int.EAA, pan0.1530.0295.2782.5E-07
int.EAA, liver–0.0330.025–1.2840.20
Eaa19rs48062674[DD]Chr19, 38.650 Mb(238 BB cases, and 67 DD cases)EAA, pan–0.0830.028–2.9540.003
EAA, liver–0.1370.020–6.9722.0E-11
dev.EAA, pan–0.2060.029–7.2184.3E-12
dev.EAA, liver–0.1240.023–5.4619.9E-08
int.EAA, pan–0.1430.035–4.0287.1E-05
int.EAA, liver–0.2500.027–9.2384.6E-18
Mixed model for longitudinal change in body weight
PredictorOutcomeEstimateStandard errort ratiop
Eaa11DA0014408.4[DD]Number of observations = 6885; Number of individuals = 2112Body weight0.6190.3451.7940.07
Eaa19rs48062674[DD]Number of observations = 6132; Number of individuals = 1852Body weight–1.8470.374–4.9457.6E-07
  1. int, interventional; dev, developmental; EAA, epigenetic age acceleration.

  2. *

    Regression model: lm(EAA ~ genotype + diet).

  3. lmer(weight ~age + diet + genotype + (1|mouseID)).

Association of EAA QTLs with body weight trajectory

Since gain in body weight with age was an accelerator of the clocks, we examined whether the selected markers in Eaa11 and Eaa19 are also related to body weight change. We retrieved longitudinal weight data from a larger cohort of the aging BXD mice that were weighed at regular intervals. After excluding heterozygotes, we tested the effect of genotype. Concordant with the higher EAA for the DD genotype at Eaa11 in the CD group, the DD genotype in the CD group also had slightly higher mean weight in older adulthood (12 and 18 months; Figure 5a). However, this marker had no significant association with body weight when tested using a mixed-effects model (p=0.07; Table 3). In Eaa19, it was the BB genotype that consistently exhibited an accelerated clock in both diets, and also higher entropy, and the BB genotype had higher average body weight by 6 months of age (Figure 5b), and this locus had a significant influence on the body weight trajectory (p=7.6E-07; Table 3).

Body weight trajectory by diet and genotype.

Body weight was measured at regular age intervals (x-axis) from (a) 2112 BXD mice that were homozygous at the Eaa11 marker (DA0014408.4; 842 BB, 1279 DD) and (b) 1852 BXD mice that were homozygous at the proximal Eaa19 marker (rs48062674; 1252 BB, 600 DD). Mice were maintained on either control diet (CD) or high-fat diet (HFD). The graphs show the segregation of body weight over time by diet and genotype. Mean ± standard error; heterozygotes were excluded.

Candidate genes for epigenetic age acceleration

There are several positional candidate genes in Eaa11 and Eaa19. To narrow the list, we applied two selection criteria: genes that (1) contain missense and/or stop variants, and/or (2) contain non-coding variants and regulated by cis-acting expression QTLs (eQTL). For the eQTL analysis, we utilized an existing liver transcriptome data from the same aging cohort (Williams et al., 2022). We identified 24 positional candidates in Eaa11 that includes Stxbp4, Erbb2 (Her-2 oncogenic gene), and Grb7 (growth factor receptor binding) (Supplementary file 11, Figure 4a). Eaa19 has 81 such candidates that includes a cluster of cytochrome P450 genes, and Chuk (inhibitor of NF-kB) in the proximal region, and Pcgf6 (epigenetic regulator) and Elovl3 (lipid metabolic gene) in the distal region (Supplementary file 11, Figure 4b and c).

For further prioritization, we converted the mouse QTL regions to the corresponding syntenic regions in the human genome and retrieved GWAS annotations for these intervals (Buniello et al., 2019). We specifically searched for the traits: epigenetic aging, longevity, age of menarche/menopause/puberty, Alzheimer’s disease, and age-related cognitive decline and dementia. This highlighted five genes in Eaa11 and three genes in Eaa19 (Supplementary file 4c). We also identified a GWAS that found associations between variants near Myof-Cyp26a1 and human longevity (Yashin et al., 2018), and a meta-GWAS that found gene-level associations between Nkx2–3 and Cutc, and epigenetic aging (Supplementary file 4c; McCartney et al., 2021).

Gene expression correlates of EAA

A subset of the BXD cases had liver RNA-seq data (94 CD and 59 HFD) (Williams et al., 2022). Using this set, we performed transcriptome-wide correlation analysis for the general pan-tissue EAA and the more specific liver int.EAA. To gain insights into biological pathways, we selected the top 2000 transcriptome correlates for functional enrichment analysis (Supplementary file 12a). The common themes for both clocks were that (1) there were far fewer negative correlates (223 out of 2000 for pan-tissue EAA, and 337 out of 2000 transcripts for liver int.EAA) than positive correlates, and (2) the negative correlates were highly enriched (Bonferroni corrected p<0.05) in oxidation-reduction and mitochondrial genes (Supplementary file 12b and c). The pan-tissue general clock was also highly enriched in pathways related to steroid metabolism, epoxygenase p450 pathway, and xenobiotics, which are pathways that are particularly relevant to liver. The p450 genes included candidates that are in Eaa19 (e.g., Cyp2c29, Cyp2c37). The positive correlates were enriched in a variety of gene functions, including mitosis for both clocks, and immune and inflammatory response for the general pan-tissue clock (functions that are not specific to liver). 563 transcripts (315 unique genes) were correlated with both pan-tissue EAA and liver int.EAA. Based on hierarchical clustering (HC) of these common mRNA correlates of EAA, the transcripts could be clustered into three groups (Figure 6a; heatmap in Figure 6—figure supplement 1a). While none of these were significantly enriched in any particular Gene Ontology (GO), cluster 3 included several oxidation-reduction genes, including the Eaa11 candidate, Cyp2c29, and cluster 2 included several cell cycle genes (Figure 6a). To verify that these transcriptomic associations are robust to the effect of diet, we repeated the correlation and enrichment analysis in the CD group only for the pan-tissue general clock (n = 94). Again, taking the top 2000 correlates (|r| ≤ 0.22; p≤0.03), we found the same enrichment profiles for the positive correlates (immune, cell cycle) and the negative correlates (oxidation-reduction and mitochondrial).

Figure 6 with 1 supplement see all
Gene expression correlates of epigenetic age acceleration (EAA).

(a) mRNAs that are correlated with the acceleration of both pan-tissue general clock (pan EAA) and liver interventional clock (liver int.EAA) were grouped based on unsupervised hierarchical clustering (HC). Few representative genes and gene ontologies are highlighted. For liver proteome, the level of APOE is the strongest correlate for both (b) liver int.EAA and (c) pan-tissue EAA. (d) For liver proteins that are correlated with both pan-tissue EAA and liver int.EAA, HC grouped the proteins into clusters enriched in oxidation-reduction and lipid metabolism, and a cluster enriched in glycogen metabolism. In adipose tissue, the expression level of the APOE protein is higher with higher age acceleration for both (e) liver int.EAA and (f) pan-tissue EAA.

Liver proteome was also available for 164 of the BXDs, and 53 also had adipose proteome. The liver proteome data quantifies over 32,000 protein variants from 3940 unique genes and has been reported in Williams et al., 2022. Similar to the transcriptome-wide analysis, we extracted the top 2000 protein correlates of EAA (Supplementary file 12d) and performed functional enrichment analysis (Supplementary file 12b and c). For both liver int.EAA and pan-tissue EAA, the top liver protein correlate was APOE, and higher expression of APOE was associated with higher age acceleration (Figure 6b and c). Similar to the transcriptome, the negative correlates of EAA were highly enriched in oxidation-reduction (several cytochrome proteins), steroid metabolism, and epoxygenase 450 pathway. The positive correlates were also highly enriched in oxidation-reduction (several hydroxy-delta-5 steroid dehydrogenases proteins), lipid and carbohydrate metabolism, as well as phospholipid efflux (particularly enriched for the liver int.EAA). There was a high degree of overlap at the proteomic level for the two clocks, and 1241 protein variants (332 unique genes) were correlated with both pan-tissue EAA and liver int.EAA (Supplementary file 12d). For these common protein correlates, the HC divided the proteins into clusters that represented metabolic pathways mainly related to steroid metabolism, and also glycolysis and gluconeogenesis (Figure 6d; heatmap in Figure 6—figure supplement 1b).

Finally, we used the adipose proteome data for a proteome-wide correlational analysis for the pan-tissue EAA and liver int.EAA. We took only the top 1000 correlates (due to the small sample size), and a functional enrichment analysis showed consistent enrichment in metabolic pathways related to fatty acids and carbohydrates, and cell proliferation genes for the pan-tissue EAA (Supplementary file 12b and c). For the adipose proteome, the cytochrome p450 genes were no longer enriched. However, the overall functional profile highlighted metabolic pathways as important gene expression correlates of EAA. Furthermore, for both the liver and adipose proteomes, APOE levels were highly correlated with EAA that indicates a higher level of this apolipoprotein in both tissues is associated with higher age acceleration (Figure 6e and f).


Here, we have tested the performance of DNAm clocks derived from highly conserved CpGs and described the dynamism and variability of site-specific methylation. While age is a major source of variance, we detected joint modulation by diet, body weight, and genotype-by-diet life expectancy. HFD had an age-accelerating effect on the clocks, and this is concordant with our previous report where we found more rapid age-dependent changes in methylation (Sandoval-Sierra et al., 2020). This also concurs with studies in humans that have found that obesity accelerates epigenetic aging (Horvath et al., 2014; Nevalainen et al., 2017). However, when BWF was included in the regression term, the effect of diet became inconsistent. This suggests that the effect of diet on EAA is mediated by the changes in weight and metabolic traits. Body weight, in particular, had a strong age-accelerating effect. The effect of weight manifests early on, and even in the CD group, higher weight gains at younger age (between 4 and 6 months) correlated with higher EAA later in life.

We tested different mouse DNAm clocks, and the main difference between these clocks was the subsets of CpGs that were used for training. It is well-known that DNAm clocks have high level of degeneracy (Thompson et al., 2018; Liu et al., 2020). In other words, highly accurate predictors of chronological age can be built from entirely different sets of CpGs and different weight coefficients. This is likely because a large proportion of CpGs undergo some degree of change with age, and combinatorial information from any subset of such CpGs is informative of age. For instance, even at a very stringent cutoff of Bonferroni 0.05 that treated the 27,966 CpGs as ‘independent,’ we still detected 6553 CpGs as age-DMC, which is close to a quarter of the CpGs we tested. Clocks built from pre-selected CpGs that are at conserved sequences are known to be sensitive to the effects of pro-longevity interventions such as caloric restriction and growth hormone receptor deletion (Thompson et al., 2018; Wang and Lemos, 2019). And while all these DNAm clocks achieve reasonably high prediction of chronological age, the age divergence derived from these different clocks (EAA) can capture slightly different facets of biological aging, and the better a clock is at predicting chronological age, the lower its association with mortality risk (Zhang et al., 2019; Liu et al., 2020). In this study, we find that the interventional clocks deviated most from chronological age, and this is expected as these were built from a much smaller set of CpGs (see Materials and methods). The interventional clocks were also associated with BWF and cholesterol, but had weaker associations with BW0. The liver int.EAA had the highest positive correlation with methylome-wide entropy and was the clock that had the strongest inverse correlation with strain longevity. In contrast, the developmental clocks, which were based on CpGs that change early in life, showed a stronger association with BW0. The contrast between the interventional and developmental clocks suggests that while one is more modifiable, the other is more informative of baseline characteristics that influence aging later in life. The pan-tissue general clock, which was not constrained to any preselected set of CpGs or tissue, also performed well in capturing biological aging and was accelerated by both BW0 and BWF, diet, higher entropy, and had a modest but significant inverse correlation with strain lifespan.

Entropy, a measure of noise and information loss, increases as a function of time and age (Hannum et al., 2013; Xie et al., 2011; Jenkinson et al., 2017; Hayflick, 2007). In the context of the methylome, the shift to higher entropy represents a tendency for the highly organized hypo- and hypermethylated landscape to erode towards a more hemi-methylated state (Hannum et al., 2013; Sziráki et al., 2018; Kerepesi et al., 2022). This increase in disorder, particularly across CpGs that are highly conserved, could have important functional consequences. The entropy of age-gain CpGs predicted strain lifespan and was increased by HFD. Overall, we find that mice belonging to longer-lived BXD strains had a more ‘youthful’ methylome with lower entropy at the age-gain CpGs. The entropy of age-loss CpGs, on the other hand, was related to the body weight of mice, and both higher BW0 and BWF were associated with higher entropy. This leads us to suggest that the rate of noise accumulation, an aspect of epigenomic aging, can vary between individuals, and the resilience or susceptibility to this shift towards higher noise is modulated by diet as well as genetic factors.

Somewhat surprising was the inverse correlation between the entropy of age-DMCs and fasted glucose. This lower entropy of age-gain CpGs with higher glucose is counter to the general tendency for strains with shorter lifespan to have higher glucose (Roy et al., 2021). In biological systems, entropy is kept at bay by the uptake of chemical energy and investment in maintenance and repair (Hayflick, 2007), and we can only speculate that at least in mice, the higher amount of glucose after overnight fast is associated with a more ordered methylome. The centrality of bioenergetics for biological systems may explain why we detect this coupling between the DNAm readouts (i.e., the clocks and entropy), and indices of metabolism including weight, diet, levels of macronutrients, and even expression of metabolic genes. As cogently highlighted by Donohoe and Bultman, 2012, many metabolites (e.g., SAM, NAD+, ATP) are essential cofactors for enzymes that shape the epigenome, and these could serve as nutrient sensors and mechanistic intermediaries that regulate how the epigenome is organized in response to metabolic conditions. Close interactions between macro- and micronutrients, and DNAm is a conserved process and plays a critical role in defining both physiology and body morphology (Kucharski et al., 2008; Dolinoy et al., 2007). Overall, our results suggest that a higher metabolic state is associated with higher entropy and EAA, and potentially, lower lifespan.

For the BXDs, life expectancy is highly dependent on the background genotype, and mean lifespan varies from under 16 months for strains such as BXD8 and BXD13, to over 28 months in strains such as BXD91 and BXD175 (Roy et al., 2021; de Haan and Williams, 2005; Lang et al., 2010). The EAA showed the expected inverse correlation with lifespan, but the effect was modest and only significant for the pan-tissue EAA and the liver int.EAA. The correlation between lifespan and entropy of age-gain CpGs was slightly stronger. We must point out that the analysis between the epigenetic readouts and lifespan was an indirect comparison. Unlike the comparison with body weight and metabolic traits, which were traits measured from the same individual, the lifespan data are strain characteristics computed from a parallel cohort of mice that were allowed to survive till natural mortality, and this may partly explain the weaker associations with EAA. Nonetheless, our observations indicate that genotypes with higher life expectancy have generally lower entropy, and lower methylation levels at the age-gain CpGs, and these properties of the methylome are likely to be partly under genetic modulation.

Our goal was to take these different clocks and identify regulatory loci that were the most stable and robust to the slight algorithmic differences in building the clocks. A notable candidate in Eaa11 is Syntaxin binding protein 4 (Stxbp4, aka, Synip), located at 90.5 Mb. Stxbp4 is a high-priority candidate due to the concordant evidence from human genetic studies. The conserved gene in humans is a replicated GWAS hit for the intrinsic rate of epigenetic aging (Gibson, 2019; Lu et al., 2018; McCartney et al., 2021). In the BXDs, Stxbp4 contains several noncoding variants, and a missense mutation (rs3668623), and the expression of Stxbp4 in liver is modulated by a cis-eQTL. Stxbp4 plays a key role in insulin signaling (Holman, 1999) and has oncogenic activity and implicated in different cancers (Michailidou et al., 2017; Rokudai et al., 2018). Furthermore, GWAS have also associated STXBP4 with age of menarche (Kichaev et al., 2019; Perry et al., 2014). Eaa11 corresponds to the 17q12-21 region in humans, and the location of additional oncogenic genes, for example, ERBB2/HER2, GRB7, and BRCA1 (Albertsen et al., 1994). The mouse Brca1 gene is a little distal to the peak QTL region and is not considered a candidate here, although it does segregate for two missense variants in the BXDs. Erbb2 and Grb7 are in the QTL region, and Erbb2 contains a missense variant (rs29390172), and Grb7 is modulated by a cis-eQTL. Nr1d1 is another candidate in Eaa11, and the co-activation of Erbb1, Grb7, and Nr1d1 has been linked to breast and other cancers (Kauraniemi and Kallioniemi, 2006; Tanaka et al., 1997).

Eaa19 was consistently associated with EAA from all the clocks we evaluated, and also with body weight gains, irrespective of diet. DNAm entropy may also have a weak association with markers at this interval. The EAA traits have peak markers in the proximal part of Eaa19 (around the cytochrome cluster), and the methylome-wide entropy had a weak peak that was in the distal portion (over candidates like Elovl3, Pcgf3). Two candidates in Eaa19 have been implicated in epigenetic aging in humans based on gene-level meta-GWAS: NK homeobox 3 (Nkx2-3, a developmental gene) and CutC copper transporter (Cutc) (McCartney et al., 2021). Eaa19 is also the location of the Cyp26a1-Myof genes, and the human syntenic region is associated with longevity, metabolic traits, and lipid profiles (Yashin et al., 2018; de Vries et al., 2019; Richardson et al., 2020). Another noteworthy candidate in Eaa19 is Chuk, a regulator of mTORC2, that has been associated with age at menopause (Kichaev et al., 2019; Xu et al., 2013). Eaa19 presents a complex and intriguing QTL related to the DNAm readouts that may also influence body weight gains over the course of life. Both Eaa19 and Eaa11 exemplify the major challenge that follows when a genetic mapping approach leads to gene- and variant-dense regions (Gallagher and Chen-Plotkin, 2018; Lappalainen, 2015). Both loci have several biologically relevant genes, and identifying the causal gene (or genes) will require a more fine-scaled functional genomic dissection.

The gene expression analyses highlighted metabolic pathways. At the mRNA level, the negative correlates of EAA were highly enriched in metabolic genes related to oxidation-reduction and steroid metabolism, while the positive correlates were enriched in pathways related to mitosis, and immune response for the pan-tissue general EAA. This convergence of metabolic, immune, and cell division genes is very consistent with previous reports (Liu et al., 2020; Kuo et al., 2021; Slieker et al., 2016). Here, we should note that depending on the tissue(s) in which the clocks are trains, and the tissue from which the DNAmAge is estimated, the EAA derivative may put an emphasis on biological pathways or genes that are most relevant to that tissue. For instance, clocks optimized for neural tissue are more closely related to neurodegeneration and neuropathologies (Shireby, 2020; Grodstein et al., 2021). With the liver clocks, expression correlates highlighted aspects of metabolism that are relevant to liver function (e.g., the cytochrome p450 epoxygenase genes), and this is detected both at the transcriptomic and proteomic levels. For the adipose tissue proteome, the cytochrome genes become less prominent, but the enriched pathways still remained consistent (i.e., oxidation-reduction, lipid and carbohydrate metabolism, and cell proliferation for the positive correlates of the pan-tissue EAA). At the proteome level, we also find several phospholipid efflux genes (APOC1, APOA2, APOC3, APOA1, APOA4, APOE) that are positive correlates of EAA. For both the liver and adipose proteomes, APOE stands out as the top protein correlate of EAA. A recent human study has also identified the APOE locus as the strongest GWAS hit for two measures of biological age acceleration (the phenoAge and the bioAge) (Kuo et al., 2021). While more specific to liver, the cytochrome P450 genes presents as both positional candidates and expression correlates of EAA. These genes have high expression in liver and have major downstream impact on metabolism (Olona et al., 2018; Schuck et al., 2014; Wang et al., 2020). One caveat is that these CYP genes are part of a gene cluster in Eaa19 that includes transcripts with cis-eQTLs (e.g., Cyp2c66, Cyp2c39, Cyp2c68) and the tight clustering of the genes, and proximity of trait QTL and eQTLs may result in co-expression due to linkage disequilibrium (Mozhui et al., 2008). Nonetheless, the cytochrome genes in Eaa19 are strong candidate modulators of EAA derived from liver tissue that calls for further investigation.

Aside from Eaa11 and Eaa19, another locus with evidence for consensus QTL was detected on Chr3. We do not delve much into this here, but the Chr3 interval is near genes associated with human epigenetic aging (Ift80, Trim59, Kpna4) (Gibson, 2019; McCartney et al., 2021). However, this QTL is dispersed across a large interval, and the peak markers do not exactly overlap these human EAA GWAS hits. While we have focused on Eaa11 and Eaa19, the Chr3 locus presents a potentially important region for EAA.

In summary, we have identified two main QTLs – Eaa11 and Eaa19 – that contribute to variation in EAA. Eaa11 contains several genes with oncogenic properties (e.g., Stxbp4, Erbb2), while Eaa19 contains a dense cluster of metabolic genes (e.g., Elovl3, Chuk, the cytochrome genes). We demonstrate that metabolic profile and body weight are closely related to epigenetic aging and methylome entropy. The convergence of evidence from genetic and gene expression analyses indicates that genes involved in metabolism and energy balance contribute to the age-dependent restructuring of the methylome, which in turn forms the basis of the epigenetic clocks.

Materials and methods

Biospecimen collection and processing

Request a detailed protocol

Samples for this study were selected from a larger colony of BXD mice that were housed in a specific pathogen-free facility at the University of Tennessee Health Science Center (UTHSC). All animal procedures were in accordance with a protocol approved by the Institutional Animal Care and Use Committee (IACUC) at the UTHSC. Detailed description of housing conditions and diet can be found in Williams et al., 2022; Roy et al., 2021. Mice were given ad libitum access to water and either standard laboratory chow (Harlan Teklad; 2018, 18.6% protein, 6.2% fat, 75.2% carbohydrates) or high-fat chow (Harlan Teklad 06414; 18.4% protein, 60.3% fat, 21.3% carbohydrate). Animals were first weighed within the first few days of assignment to either diets, and this was mostly but not always prior to introduction to HFD. Following this, animals were weighed periodically and a final time (BWF) when animals were humanely euthanized (anesthetized with avertin at 0.02 ml/g of weight, followed by perfusion with phosphate-buffered saline) at specific ages for tissue collection. This work utilizes the biobanked liver specimens that were pulverized and stored at –80°C, and overlaps samples described in Williams et al., 2022. DNA was extracted using the DNeasy Blood & Tissue Kit from QIAGEN. Nucleic acid purity was inspected with a NanoDrop spectrophotometer and quantified using a Qubit fluorometer dsDNA BR Assay.

Methylation array and quality checks

Request a detailed protocol

DNA samples from ~350 BXD mice were profiled on the Illumina HorvathHumanMethylChip40 array. Samples were in 96-well plate format (Supplementary file 1), and the plates were randomized for major covariates such as age and diet. Details of this array are described in Arneson et al., 2022; Horvath and Haghani, 2021. The array contains probes that target ~36K CpGs that are highly conserved in mammals. Over 33K probes map to homologous regions in the mouse genome. For downstream statistical tests, we further filtered the probes and used only 27,966 probes that have been validated for the mouse genome using calibration data generated from synthetic mouse DNA (Arneson et al., 2022). Data was normalized using the SeSame method (Zhou et al., 2018). Unsupervised HC was performed to identify outliers and failed arrays, and those were excluded. We also performed strain verification as an additional quality check. While the majority of the probes were free of DNA sequence variants, we found 45 probes that overlapped variants in the BXD family. We leveraged these as proxies for genotypes and performed a principal component analysis (PCA). The top genotype principal components (genoPC1 and genoPC2; Supplementary file 1) segregated the samples by strain identity, and samples that did not cluster close to the reported strains were removed. After excluding outliers, failed arrays, and samples that failed strain verification, the final liver DNAm data consisted of 339 samples. The beta-values for these ~28K probes in the 339 samples show the expected bimodal distribution (Figure 2—figure supplement 2a), but for these highly conserved CpGs, we note a much higher representation of hypermethylated CpGs instead of the slightly hypomethylated state of the methylome when a wider spectrum of CpGs is assayed (Sziráki et al., 2018).

BXD-unbiased mouse clock estimation

Request a detailed protocol

Three different mouse clocks are reported here, and all three are based on penalized regression modeling using glmnet (Friedman et al., 2010). Training was done in a larger mouse dataset that excluded the BXDs (Lu et al., 2021; Li et al., 2021; Haghani et al., 2022). The clocks are therefore unbiased to the characteristics of the BXDs. For pan-tissue clocks, all mouse samples were used for training. For the liver-specific clocks, the training was limited to data from liver samples.

The general DNAmAge clock did not preselect for any CpGs, and the full set of CpGs that map to Mus musculus was used. First, a log-linear transformation was applied to the chronological age using the function:

f(Age)= {Age1.2+0.06+log(1.2+0.06)1.21.2+0.06,Age>1.2log(Age+0.06), Age1.2

This is similar to the age transformation described in the original Horvath pan-tissue human clock, but with offset at 0.06 and adult mouse age at 1.2 (Horvath, 2013). Following this transformation, an elastic net regression was implemented to regress the transformed chronological age on the CpG beta-values in the training data. The alpha was set at 0.5, and the optimal lamda parameter was determined by 10-fold cross-validation (function cv.glmnet). This selected subsets of clock CpGs and coefficients. DNAmAge was then calculated as:

DNAmAge= f1(b0+b1CpG1+b2CpG2++biCpGib0+b1+b2++bi)

where b0 is the intercept, b1 to bi are the coefficients, CpG1 to CpGi denote the beta-values for the respective clock CpGs, and f–1() denotes the inverse function of f().

A similar method was used to build the developmental and interventional clocks, but for these, the CpGs were pre-selected. For the liver-specific developmental clock, CpGs that change during mouse development were selected in liver samples based on Pearson correlation with age in mice that were <1.6 months old. The top 1000 negative and top 1000 positive correlates were then classified as ‘developmental CpGs’, and the training was done using only this subset of CpGs. For the pan-tissue dev.DNAmAge, the top 1000 positive and top 1000 negative developmental CpGs were based on a multitissue EWAS, also using Pearson correlation with age for mice <1.6 months old, and these are CpGs that are strongly correlated with age during the mouse developmental period when all available tissues are considered.

Training for the interventional clock started with 537 CpGs that relate to gold-standard anti-aging interventions (caloric restriction, growth hormone receptor knockout) (Haghani et al., 2022; Coschigano et al., 2003). These ‘interventional CpGs’ were identified from an independent mouse liver caloric restriction (n = 95) and one growth hormone receptor knockout (n = 71) data that were not included in the clock estimation (Haghani et al., 2022). Top CpGs associated with these interventions were identified, and the 537 CpGs are the sites that are consistently associated with these anti-aging interventions. Of the 537, 121 CpGs increased in methylation and 416 decreased in methylation with application of the pro-longevity interventions. Given the small number of CpGs that went into training for the int.DNAmAge, we expected this clock to be less correlated with chronological age, and possibly more responsive to variables such as diet.

Entropy calculation

Request a detailed protocol

Methylome-wide entropy was calculated from the 27,966 probes. The beta-values were discretized into 20 bins, and the Shannon entropy for each sample was estimated using the R package, ‘entropy’ (v1.2.1) with method = ‘ML’: maximum likelihood (Hausser and Strimmer, 2009). The optimal number of bins was determined using the Freedman–Diaconis rule (breaks = ‘FD’ for the hist() function in R). We also estimated the methylome-wide entropy after discretizing into 100 and 2000 bins (values provided in Supplementary file 1), and the results we report are consistent and robust to the number of bins. For the age-gain, age-loss, and age-ns CpGs, entropy for each set was estimated, also following discretization into 20 bins.


Statistical analyses were done using R or the JMP Pro software (version 15). Associations between the epigenetic predictors and continuous variables (body weight, strain lifespan, fasted serum glucose, and total cholesterol) were based on Pearson correlations, and t-test was used to evaluate the effect of categorical predictors (sex, diet). Multivariable regression models were also used to control for covariates (R regression equations provided with relevant tables and supplementary files). All these traits are directly accessible from GeneNetwork 2 (GN2; more information on how to retrieve these data from GN2 is provided in Supplementary file 13; Williams, 2022; Mulligan et al., 2017). Longevity data was obtained from a parallel cohort of BXD mice housed in the same UTHSC colony, and members of this ‘longevity cohort’ were allowed to age until natural death (more detail on the longevity cohort can be found in Roy et al., 2021). Males were excluded and strain-by-diet lifespan summary statistics were derived. Only strain-by-diet groups with five or more observations for lifespan were included in the correlational analyses with the epigenetic predictors.

Multivariable EWAS

Request a detailed protocol

Site-by-site differential methylation analysis (EWAS) was performed on the 27,966 CpGs using a multivariable regression model. As such genome-wide explorations are vulnerable to unmeasured confounders, we included the top PC derived from a PCA of the 27,966 probes (McClay et al., 2014). The top 10 PCs cumulatively accounted for ~62% of the variance (Figure 2—figure supplement 2b). A plot of PC1 (19% of variance) and PC2 (14% of variance) showed that PC1 captured some noise due to batch (Figure 2—figure supplement 2b). The remaining top PCs (PC2 onwards) were strongly associated with biological variables, particular age, and also weight and diet (top 10 PCs provided in Supplementary file 1). For this reason, we included PC1 as a correction factor in the EWAS. The regression model we used was lm(CpGi ~ age + median lifespan + diet + BWF + PC1), where CpGi is the ith CpG from 1 to 27,966. As lifespan was from female mice, this EWAS excluded the few male samples.

CpG annotation and enrichment

Request a detailed protocol

Functional annotation and enrichment analyses for the DMCs were done using the genomic region enrichment R package, rGREAT (version 3.0.0; McLean et al., 2010) with the array content (i.e., the 27,966 CpGs) as background. Enrichment p-values are based on hypergeometric tests, and categories with Benjamini–Hochberg adjusted p-values≤0.05 are reported. Annotations were for the GRCm38/mm10 reference genome.

For chromatin state annotation, we used bedtools to annotate the 27,966 CpGs coordinates using chromatin annotation.bed files for neonatal (P0) mouse liver tissue created by Gorkin et al., 2020; Quinlan and Hall, 2010. This provides the 15-state model using ChromHMM (Ernst and Kellis, 2012), and we downloaded the file for the ‘replicated set’ (here, the regions annotated as NRS are sites that did not produce replicable signal). Enrichment and depletion analyses for genomic annotations and chromatin annotations were based on the hypergeometric test (phyper R function). The R codes are provided with the results data (Supplementary file 9).

Genetic analyses

Request a detailed protocol

Heritability within diet was estimated as the fraction of variability that was explained by background genotype (Ashbrook et al., 2021; Ashbrook et al., 2018; Belknap, 1998). For this, we applied an ANOVA: aov(EAA ~ strain), and heritability was computed as h2 = SSqstrain/(SSqstrain + SSqresidual), where SSqstrain is the strain sum of squares, and SSqresidual is the residual sum of squares.

All QTL mapping was done on the GN2 platform (trait accession IDs provided in Supplementary file 13; Williams, 2022). On the GN2 home page, the present set of BXD mice belongs to the Group: BXD NIA Longevity Study, and GN2 provides a direct interface to the genotype data. All QTL mapping was done for genotypes with minor allele frequency ≥ 0.05 using the genome-wide efficient mixed model association (GEMMA) algorithm (Zhou and Stephens, 2014), which corrects for the BXD kinship matrix. For the EAA traits, diet, weight at 6 months, and final weight were fitted as cofactor. Chronological age had no correlation with EAA, and this was not included as a cofactor (including age does not change the results). Genome-wide linkage statistics were downloaded for the full set of markers that were available from GN2 (7320 markers in Supplementary file 10). For the combined p-values, QTL mapping was done separately using GEMMA for each EAA traits, then the Fisher’s p-value combination was applied to get the meta-p-value (Peirce et al., 2007). We used this method to simply highlight loci that had consistent linkage across the different EAA measures. QTL mapping for methylome-wide entropy was done using GEMMA with adjustment for chronological age, diet, weight at 6 months, and final weight.

For marker-specific linkage, we selected SNPs located at the peak QTL regions (DA0014408, rs48062674) and grouped the BXDs by their genotypes (F1 hybrids and other heterozygotes were excluded for this), and marker-specific linkage was tested using ANOVA and linear regression (R regression equation given in Table 3). rs48062674 is a reference variant that is already catalogued in dbSNP (Sherry et al., 2001) and is used as a marker in the QTL mapping. DA0014408.4 is an updated variant at a recombinant region in the Chr11 interval and within the peak QTL interval (Ashbrook et al., 2021). Genotypes at these markers for individual BXD samples are given in Supplementary file 1.

To test the effect of genotype on body weight change, body weight data measured at approximately 4 (baseline), 6, 12, 18, and 24 months were downloaded from GN2 (Supplementary file 13). Detailed description of these weight data is given in Roy et al., 2021. We then applied a mixed-effects regression model using the lme4 R package (Bates et al., 2021): lmer(weight ~ age + diet + genotype + (1|ID)), where ID is the identifier for individual mouse.

Bioinformatic tools for candidate genes selection

Request a detailed protocol

Sequence variation between B6 and D2 in the QTL intervals (Chr11:90–99 Mb and Chr19:35–48 Mb) was retrieved from the Wellcome Sanger Institute Mouse Genomes Project database (release 1505 for GRCm38/mm10) (Wellcome Sanger Institute Mouse Genome Project, 2022; Keane et al., 2011; Yalcin et al., 2011). Positional candidates were required to contain at least one coding variant (missense and/or nonsense variants) or have noncoding variants with evidence of cis-regulation in liver tissue of the BXDs. cis-eQTLs for the candidate genes were obtained from the liver RNA-seq data described in Williams et al., 2022. An interface to search and analyze this transcriptome data is available from GN2 and is catalogued under Group: BXD NIA Longevity Study; Type: Liver mRNA; and Dataset: UTHSC BXD Liver RNA-seq (Oct 19) TMP Log2.

For human GWAS annotations, we navigated to the corresponding syntenic regions on the human genome by using the coordinate conversion tool in the UCSC Genome Browser. The Chr11 90–95 Mb interval on the mouse reference genome (GRCm38/mm10) corresponds to human Chr17:50.14–55.75 Mb (GRCh38/hg38) (40.7% of bases; 100% span). The Chr11 95–99 Mb interval in the mouse corresponds to human Chr17:47.49–50.14 Mb (29.3% of bases, 57.9% span) and Chr17:38.19–40.39 Mb (20.7% of bases, 44.1% span). Likewise, for the Chr19 QTL, the mm10 35–40 Mb corresponds to hg38 Chr10:89.80–95.06 Mb (32.2% of bases, 89.2% span), 40–45 Mb corresponds to hg38 Chr10:95.23–100.98 Mb (46.6% of bases, 95.6% span), and 45–48 Mb corresponds to hg38 Chr10:100.98–104.41 Mb (46.5% of bases, 100% span). We then downloaded the GWAS data for these regions from the NHGRI-EBI GWAS catalog (Buniello et al., 2019) and retained the GWAS hits that were related to aging.

Transcriptome and proteome analyses

Request a detailed protocol

The liver RNA-seq data mentioned above was also used for the transcriptome-wide correlational analysis for EAA in the 153 cases that had both DNAm and RNA-seq data. We considered the top 2000 highest mRNA correlates (|r| = 0.24, p=0.003 for the pan-tissue EAA; |r| = 0.3, p=0.0002 for the liver int.EAA), and the list of transcripts was collapsed to a nonredundant list of gene symbols, which was uploaded to the DAVID Bioinformatics Database (version 2021 update) for GO enrichment analysis (Huang et al., 2009a; Huang et al., 2009b). Proteome correlational analysis was carried out using the data: Group: BXD NIA Longevity Study; Type: Liver Proteome; and Dataset: EPFL/ETHZ BXD Liver Proteome CD-HFD (Nov19). Detailed description of this data is given in Williams et al., 2022. 164 BXD cases had both DNAm and liver proteomics, and similar to the RNA-seq, we selected the top 2000 correlates (|r| = 0.24, p=0.002 for both the pan-tissue EAA and liver int.EAA) for enrichment analysis.

59 of the BXD cases also have proteome data from adipose tissue (Group: BXD NIA Longevity Study; Type: Adipose Proteome; and Dataset: Riken-Wu BXD Liver Proteome CD-HFD (Sep20)). While small in sample number, we used this data to test whether we could recapitulate the same functional enrichment profiles in a different tissue. Details of sample preparation and processing steps for the adipose proteome are provided in the dataset’s ‘Info’ page on GN2. In brief, protein was extracted from the adipose samples by first lysis in a buffer with protease inhibitor, followed by homogenization with a glass dounce and sonication. The protein fraction was isolated from the homogenate by centrifugation and processed for assay on a liquid chromatography tandem mass spectrometry (LC-M/MS) using a modified Phase Transfer Surfactant Method as described in Masuda et al., 2008 and Mostafa et al., 2020. Samples were measured using a Q Exactive Plus Orbitrap LC-MS/MS System (Thermo Fisher). For each sample, 600 ng was injected and the samples were measured with data-independent acquisition (DIA). A portion of the peptides from the samples was pooled and fractionated using a Pierce High pH Reversed-Phase (HPRP) Peptide Fractionation Kit (Thermo Fisher Scientific) to generate a spectral library. For the HPRP fractions, 450 ng was injected and the samples were measured with data-dependent acquisition (DDA). For protein identification, the raw measurement files were searched against a mouse database using the (uniprot-reviewed_Mus_musculus_10090_.fasta) with Proteome Discoverer v2.4 software (Thermo Fisher Scientific). Filtered output was used to generate a sample-specific spectral library using the Spectronaut software (Biognosys, Switzerland). Raw files from DIA measurements were used for quantitative data extraction with the generated spectral library, as previously described (Mostafa et al., 2020). The false discovery rate was estimated with the mProphet approach and set to 0.01 at both peptide precursor level and protein level (Reiter et al., 2011; Rosenberger et al., 2017). Due to the small sample size, for this dataset, we considered the top 1000 protein correlates of EAA (|r| = 0.25, p=0.06 for the pan-tissue EAA; |r| = 0.31, p=0.02 for the liver int.EAA).

Data availability

Request a detailed protocol

The normalized microarray data and raw files are available from the NCBI Gene Expression Omnibus (accession ID GSE199979). The HorvathMammalMethylChip40 array manifest files and genome annotations of CpGs can be found on GitHub at (Horvath and Haghani, 2021). Individual-level BXD data, including the processed microarray data, are available on (Williams, 2022) in FAIR+-compliant format; data identifiers and way to retrieve data are described in Supplementary file 13.

Data availability

The normalized microarray data and raw files are available from the NCBI Gene Expression Omnibus (accession ID GSE199979). The HorvathMammalMethylChip40 array manifest files and genome annotations of CpGs can be found on Github at (; copy archived at swh:1:rev:355098ec161ae668546a1a343e755c40a475b941). Individual level BXD data, including the processed microarray data are available on on FAIR+ compliant format; data identifiers, and way to retrieve data are described in Supplementary file 13.

The following data sets were generated
    1. Longevityteam
    (2021) BDL_10071
    ID 10071. Genetics and epigenetics of aging and longevity in BXD mice.
    1. Longevityteam
    (2021) BDL_10072
    ID 10072. Genetics and epigenetics of aging and longevity in BXD mice.
    1. Longevityteam
    (2021) BDL_10073
    ID 10073. Genetics and epigenetics of aging and longevity in BXD mice.
    1. Longevityteam
    (2021) BDL_10074
    ID 10074. Genetics and epigenetics of aging and longevity in BXD mice.
    1. Longevityteam
    (2021) BDL_10075
    ID 10075. Genetics and epigenetics of aging and longevity in BXD mice.
    1. Longevityteam
    (2021) BDL_10076
    ID 10076. Genetics and epigenetics of aging and longevity in BXD mice.
    1. Longevityteam
    (2022) BDL_10093
    ID 10093. Genetics and epigenetics of aging and longevity in BXD mice.
The following previously published data sets were used
    1. Longevityteam
    (2021) BDL_10066
    ID 10066. Genetics and epigenetics of aging and longevity in BXD mice.


    1. de Vries PS
    2. Brown MR
    3. Bentley AR
    4. Sung YJ
    5. Winkler TW
    6. Ntalla I
    7. Schwander K
    8. Kraja AT
    9. Guo X
    10. Franceschini N
    11. Cheng CY
    12. Sim X
    13. Vojinovic D
    14. Huffman JE
    15. Musani SK
    16. Li C
    17. Feitosa MF
    18. Richard MA
    19. Noordam R
    20. Aschard H
    21. Bartz TM
    22. Bielak LF
    23. Deng X
    24. Dorajoo R
    25. Lohman KK
    26. Manning AK
    27. Rankinen T
    28. Smith AV
    29. Tajuddin SM
    30. Evangelou E
    31. Graff M
    32. Alver M
    33. Boissel M
    34. Chai JF
    35. Chen X
    36. Divers J
    37. Gandin I
    38. Gao C
    39. Goel A
    40. Hagemeijer Y
    41. Harris SE
    42. Hartwig FP
    43. He M
    44. Horimoto A
    45. Hsu FC
    46. Jackson AU
    47. Kasturiratne A
    48. Komulainen P
    49. Kühnel B
    50. Laguzzi F
    51. Lee JH
    52. Luan J
    53. Lyytikäinen LP
    54. Matoba N
    55. Nolte IM
    56. Pietzner M
    57. Riaz M
    58. Said MA
    59. Scott RA
    60. Sofer T
    61. Stančáková A
    62. Takeuchi F
    63. Tayo BO
    64. van der Most PJ
    65. Varga TV
    66. Wang Y
    67. Ware EB
    68. Wen W
    69. Yanek LR
    70. Zhang W
    71. Zhao JH
    72. Afaq S
    73. Amin N
    74. Amini M
    75. Arking DE
    76. Aung T
    77. Ballantyne C
    78. Boerwinkle E
    79. Broeckel U
    80. Campbell A
    81. Canouil M
    82. Charumathi S
    83. Chen YDI
    84. Connell JM
    85. de Faire U
    86. de Las Fuentes L
    87. de Mutsert R
    88. de Silva HJ
    89. Ding J
    90. Dominiczak AF
    91. Duan Q
    92. Eaton CB
    93. Eppinga RN
    94. Faul JD
    95. Fisher V
    96. Forrester T
    97. Franco OH
    98. Friedlander Y
    99. Ghanbari M
    100. Giulianini F
    101. Grabe HJ
    102. Grove ML
    103. Gu CC
    104. Harris TB
    105. Heikkinen S
    106. Heng CK
    107. Hirata M
    108. Hixson JE
    109. Howard BV
    110. Ikram MA
    111. Jacobs DR
    112. Johnson C
    113. Jonas JB
    114. Kammerer CM
    115. Katsuya T
    116. Khor CC
    117. Kilpeläinen TO
    118. Koh WP
    119. Koistinen HA
    120. Kolcic I
    121. Kooperberg C
    122. Krieger JE
    123. Kritchevsky SB
    124. Kubo M
    125. Kuusisto J
    126. Lakka TA
    127. Langefeld CD
    128. Langenberg C
    129. Launer LJ
    130. Lehne B
    131. Lemaitre RN
    132. Li Y
    133. Liang J
    134. Liu J
    135. Liu K
    136. Loh M
    137. Louie T
    138. Mägi R
    139. Manichaikul AW
    140. McKenzie CA
    141. Meitinger T
    142. Metspalu A
    143. Milaneschi Y
    144. Milani L
    145. Mohlke KL
    146. Mosley TH
    147. Mukamal KJ
    148. Nalls MA
    149. Nauck M
    150. Nelson CP
    151. Sotoodehnia N
    152. O’Connell JR
    153. Palmer ND
    154. Pazoki R
    155. Pedersen NL
    156. Peters A
    157. Peyser PA
    158. Polasek O
    159. Poulter N
    160. Raffel LJ
    161. Raitakari OT
    162. Reiner AP
    163. Rice TK
    164. Rich SS
    165. Robino A
    166. Robinson JG
    167. Rose LM
    168. Rudan I
    169. Schmidt CO
    170. Schreiner PJ
    171. Scott WR
    172. Sever P
    173. Shi Y
    174. Sidney S
    175. Sims M
    176. Smith BH
    177. Smith JA
    178. Snieder H
    179. Starr JM
    180. Strauch K
    181. Tan N
    182. Taylor KD
    183. Teo YY
    184. Tham YC
    185. Uitterlinden AG
    186. van Heemst D
    187. Vuckovic D
    188. Waldenberger M
    189. Wang L
    190. Wang Y
    191. Wang Z
    192. Wei WB
    193. Williams C
    194. Wilson G
    195. Wojczynski MK
    196. Yao J
    197. Yu B
    198. Yu C
    199. Yuan JM
    200. Zhao W
    201. Zonderman AB
    202. Becker DM
    203. Boehnke M
    204. Bowden DW
    205. Chambers JC
    206. Deary IJ
    207. Esko T
    208. Farrall M
    209. Franks PW
    210. Freedman BI
    211. Froguel P
    212. Gasparini P
    213. Gieger C
    214. Horta BL
    215. Kamatani Y
    216. Kato N
    217. Kooner JS
    218. Laakso M
    219. Leander K
    220. Lehtimäki T
    221. Magnusson PKE
    222. Penninx B
    223. Pereira AC
    224. Rauramaa R
    225. Samani NJ
    226. Scott J
    227. Shu XO
    228. van der Harst P
    229. Wagenknecht LE
    230. Wang YX
    231. Wareham NJ
    232. Watkins H
    233. Weir DR
    234. Wickremasinghe AR
    235. Zheng W
    236. Elliott P
    237. North KE
    238. Bouchard C
    239. Evans MK
    240. Gudnason V
    241. Liu CT
    242. Liu Y
    243. Psaty BM
    244. Ridker PM
    245. van Dam RM
    246. Kardia SLR
    247. Zhu X
    248. Rotimi CN
    249. Mook-Kanamori DO
    250. Fornage M
    251. Kelly TN
    252. Fox ER
    253. Hayward C
    254. van Duijn CM
    255. Tai ES
    256. Wong TY
    257. Liu J
    258. Rotter JI
    259. Gauderman WJ
    260. Province MA
    261. Munroe PB
    262. Rice K
    263. Chasman DI
    264. Cupples LA
    265. Rao DC
    266. Morrison AC
    267. InterAct Consortium
    268. Lifelines Cohort, Groningen, The Netherlands Lifelines Cohort Study
    (2019) Multiancestry Genome-Wide Association Study of Lipid Levels Incorporating Gene-Alcohol Interactions
    American Journal of Epidemiology 188:1033–1054.
    1. Hausser J
    2. Strimmer K
    Entropy Inference and the James-Stein Estimator, with Application to Nonlinear Gene Association Networks
    Journal of Machine Learning Research: JMLR 10:1469–1484.
  1. Preprint
    1. Li CZ
    2. Haghani A
    3. Robeck TR
    4. Villar D
    5. Lu AT
    6. Zhang J
    7. Faulkes CG
    8. Vu H
    9. Ablaeva J
    10. Adams DM
    11. Ardehali R
    12. Arneson A
    13. Baker CS
    14. Belov K
    15. Blumstein DT
    16. Bors EK
    17. Breeze CE
    18. Brooke RT
    19. Brown JL
    20. Caulton A
    21. Cavin JM
    22. Chatzistamou I
    23. Chen H
    24. Chiavellini P
    25. Choi OW
    26. Clarke S
    27. DeYoung J
    28. Emmons CK
    29. Emmrich S
    30. Fei Z
    31. Ferguson SH
    32. Finno CJ
    33. Flower JE
    34. Gaillard JM
    35. Garde E
    36. Gladyshev VN
    37. Goya RG
    38. Hanson MB
    39. Haulena M
    40. Herrick K
    41. Hogan AN
    42. Hogg CJ
    43. Hore TA
    44. Jasinska AJ
    45. Jones G
    46. Jourdain E
    47. Kashpur O
    48. Katcher H
    49. Katsumata E
    50. Kaza V
    51. Kiaris H
    52. Kobor MS
    53. Kordowitzki P
    54. Koski WR
    55. Larison B
    56. Lee SG
    57. Lehmann M
    58. Lemaitre JF
    59. Levine AJ
    60. Li C
    61. Li X
    62. Lin DTS
    63. Lindemann DM
    64. Macoretta N
    65. Maddox D
    66. Matkin CO
    67. Mattison JA
    68. Mergl J
    69. Meudt JJ
    70. Montano GA
    71. Mozhui K
    72. Naderi A
    73. Nagy M
    74. Narayan P
    75. Nathanielsz PW
    76. Nguyen NB
    77. Niehrs C
    78. Odom DT
    79. Ophir AG
    80. Ostrander EA
    81. O’Tierney Ginn P
    82. Parsons KM
    83. Paul KC
    84. Pellegrini M
    85. Pinho GM
    86. Plassais J
    87. Prado NA
    88. Rey B
    89. Ritz BR
    90. Robbins J
    91. Rodriguez M
    92. Russell J
    93. Rydkina E
    94. Sailer LL
    95. Salmon AB
    96. Sanghavi A
    97. Schachtschneider KM
    98. Schmitt D
    99. Schomacher L
    100. Schook LB
    101. Sears KE
    102. Seifert AW
    103. Shindyapina AV
    104. Singh K
    105. Sinha I
    106. Snell RG
    107. Soltanmohammadi E
    108. Spangler ML
    109. Spriggs M
    110. Steinman KJ
    111. Sugrue VJ
    112. Szladovits B
    113. Takasugi M
    114. Teeling EC
    115. Van Bonn B
    116. Vernes SC
    117. Vinters HV
    118. Wallingford MC
    119. Wang N
    120. Wilkinson GS
    121. Williams RW
    122. Yang XW
    123. Young BG
    124. Zhang B
    125. Zhang Z
    126. Zhao P
    127. Zhao Y
    128. Zhou W
    129. Zoller JA
    130. Ernst J
    131. Seluanov A
    132. Raj K
    133. Gorbunova V
    134. Horvath S
    (2021) Epigenetic Predictors of Maximum Lifespan and Other Life History Traits in Mammals
  2. Preprint
    1. Lu AT
    2. Fei Z
    3. Haghani A
    4. Robeck TR
    5. Zoller JA
    6. Li CZ
    7. Zhang J
    8. Ablaeva J
    9. Adams DM
    10. Almunia J
    11. Ardehali R
    12. Arneson A
    13. Baker CS
    14. Belov K
    15. Black P
    16. Blumstein DT
    17. Bors EK
    18. Breeze CE
    19. Brooke RT
    20. Brown JL
    21. Caulton A
    22. Cavin JM
    23. Chatzistamou I
    24. Chen H
    25. Chiavellini P
    26. Choi OW
    27. Clarke S
    28. DeYoung J
    29. Dold C
    30. Emmons CK
    31. Emmrich S
    32. Faulkes CG
    33. Ferguson SH
    34. Finno CJ
    35. Gaillard JM
    36. Garde E
    37. Gladyshev VN
    38. Gorbunova V
    39. Goya RG
    40. Grant MJ
    41. Hales EN
    42. Hanson MB
    43. Haulena M
    44. Hogan AN
    45. Hogg CJ
    46. Hore TA
    47. Jasinska AJ
    48. Jones G
    49. Jourdain E
    50. Kashpur O
    51. Katcher H
    52. Katsumata E
    53. Kaza V
    54. Kiaris H
    55. Kobor MS
    56. Kordowitzki P
    57. Koski WR
    58. Larison B
    59. Lee SG
    60. Lee YC
    61. Lehmann M
    62. Lemaitre JF
    63. Levine AJ
    64. Li C
    65. Li X
    66. Lin DT
    67. Macoretta N
    68. Maddox D
    69. Matkin CO
    70. Mattison JA
    71. Mergl J
    72. Meudt JJ
    73. Mozhui K
    74. Naderi A
    75. Nagy M
    76. Narayan P
    77. Nathanielsz PW
    78. Nguyen NB
    79. Niehrs C
    80. Ophir AG
    81. Ostrander EA
    82. O’Tierney Ginn P
    83. Parsons KM
    84. Paul KC
    85. Pellegrini M
    86. Pinho GM
    87. Plassais J
    88. Prado NA
    89. Rey B
    90. Ritz BR
    91. Robbins J
    92. Rodriguez M
    93. Russell J
    94. Rydkina E
    95. Sailer LL
    96. Salmon AB
    97. Sanghavi A
    98. Schachtschneider KM
    99. Schmitt D
    100. Schmitt T
    101. Schomacher L
    102. Schook LB
    103. Sears KE
    104. Seluanov A
    105. Shanmuganayagam D
    106. Shindyapina A
    107. Singh K
    108. Sinha I
    109. Snell RG
    110. Soltanmaohammadi E
    111. Spangler ML
    112. Staggs L
    113. Steinman KJ
    114. Sugrue VJ
    115. Szladovits B
    116. Takasugi M
    117. Teeling EC
    118. Thompson MJ
    119. Van Bonn B
    120. Vernes SC
    121. Villar D
    122. Vinters HV
    123. Wallingford MC
    124. Wang N
    125. Wayne RK
    126. Wilkinson GS
    127. Williams CK
    128. Williams RW
    129. Yang XW
    130. Young BG
    131. Zhang B
    132. Zhang Z
    133. Zhao P
    134. Zhao Y
    135. Zimmermann J
    136. Zhou W
    137. Ernst J
    138. Raj K
    139. Horvath S
    (2021) Universal DNA Methylation Age across Mammalian Tissues
    1. Michailidou K
    2. Lindström S
    3. Dennis J
    4. Beesley J
    5. Hui S
    6. Kar S
    7. Lemaçon A
    8. Soucy P
    9. Glubb D
    10. Rostamianfar A
    11. Bolla MK
    12. Wang Q
    13. Tyrer J
    14. Dicks E
    15. Lee A
    16. Wang Z
    17. Allen J
    18. Keeman R
    19. Eilber U
    20. French JD
    21. Qing Chen X
    22. Fachal L
    23. McCue K
    24. McCart Reed AE
    25. Ghoussaini M
    26. Carroll JS
    27. Jiang X
    28. Finucane H
    29. Adams M
    30. Adank MA
    31. Ahsan H
    32. Aittomäki K
    33. Anton-Culver H
    34. Antonenkova NN
    35. Arndt V
    36. Aronson KJ
    37. Arun B
    38. Auer PL
    39. Bacot F
    40. Barrdahl M
    41. Baynes C
    42. Beckmann MW
    43. Behrens S
    44. Benitez J
    45. Bermisheva M
    46. Bernstein L
    47. Blomqvist C
    48. Bogdanova NV
    49. Bojesen SE
    50. Bonanni B
    51. Børresen-Dale A-L
    52. Brand JS
    53. Brauch H
    54. Brennan P
    55. Brenner H
    56. Brinton L
    57. Broberg P
    58. Brock IW
    59. Broeks A
    60. Brooks-Wilson A
    61. Brucker SY
    62. Brüning T
    63. Burwinkel B
    64. Butterbach K
    65. Cai Q
    66. Cai H
    67. Caldés T
    68. Canzian F
    69. Carracedo A
    70. Carter BD
    71. Castelao JE
    72. Chan TL
    73. David Cheng T-Y
    74. Seng Chia K
    75. Choi J-Y
    76. Christiansen H
    77. Clarke CL
    78. NBCS Collaborators
    79. Collée M
    80. Conroy DM
    81. Cordina-Duverger E
    82. Cornelissen S
    83. Cox DG
    84. Cox A
    85. Cross SS
    86. Cunningham JM
    87. Czene K
    88. Daly MB
    89. Devilee P
    90. Doheny KF
    91. Dörk T
    92. Dos-Santos-Silva I
    93. Dumont M
    94. Durcan L
    95. Dwek M
    96. Eccles DM
    97. Ekici AB
    98. Eliassen AH
    99. Ellberg C
    100. Elvira M
    101. Engel C
    102. Eriksson M
    103. Fasching PA
    104. Figueroa J
    105. Flesch-Janys D
    106. Fletcher O
    107. Flyger H
    108. Fritschi L
    109. Gaborieau V
    110. Gabrielson M
    111. Gago-Dominguez M
    112. Gao Y-T
    113. Gapstur SM
    114. García-Sáenz JA
    115. Gaudet MM
    116. Georgoulias V
    117. Giles GG
    118. Glendon G
    119. Goldberg MS
    120. Goldgar DE
    121. González-Neira A
    122. Grenaker Alnæs GI
    123. Grip M
    124. Gronwald J
    125. Grundy A
    126. Guénel P
    127. Haeberle L
    128. Hahnen E
    129. Haiman CA
    130. Håkansson N
    131. Hamann U
    132. Hamel N
    133. Hankinson S
    134. Harrington P
    135. Hart SN
    136. Hartikainen JM
    137. Hartman M
    138. Hein A
    139. Heyworth J
    140. Hicks B
    141. Hillemanns P
    142. Ho DN
    143. Hollestelle A
    144. Hooning MJ
    145. Hoover RN
    146. Hopper JL
    147. Hou M-F
    148. Hsiung C-N
    149. Huang G
    150. Humphreys K
    151. Ishiguro J
    152. Ito H
    153. Iwasaki M
    154. Iwata H
    155. Jakubowska A
    156. Janni W
    157. John EM
    158. Johnson N
    159. Jones K
    160. Jones M
    161. Jukkola-Vuorinen A
    162. Kaaks R
    163. Kabisch M
    164. Kaczmarek K
    165. Kang D
    166. Kasuga Y
    167. Kerin MJ
    168. Khan S
    169. Khusnutdinova E
    170. Kiiski JI
    171. Kim S-W
    172. Knight JA
    173. Kosma V-M
    174. Kristensen VN
    175. Krüger U
    176. Kwong A
    177. Lambrechts D
    178. Le Marchand L
    179. Lee E
    180. Lee MH
    181. Lee JW
    182. Neng Lee C
    183. Lejbkowicz F
    184. Li J
    185. Lilyquist J
    186. Lindblom A
    187. Lissowska J
    188. Lo W-Y
    189. Loibl S
    190. Long J
    191. Lophatananon A
    192. Lubinski J
    193. Luccarini C
    194. Lux MP
    195. Ma ESK
    196. MacInnis RJ
    197. Maishman T
    198. Makalic E
    199. Malone KE
    200. Kostovska IM
    201. Mannermaa A
    202. Manoukian S
    203. Manson JE
    204. Margolin S
    205. Mariapun S
    206. Martinez ME
    207. Matsuo K
    208. Mavroudis D
    209. McKay J
    210. McLean C
    211. Meijers-Heijboer H
    212. Meindl A
    213. Menéndez P
    214. Menon U
    215. Meyer J
    216. Miao H
    217. Miller N
    218. Taib NAM
    219. Muir K
    220. Mulligan AM
    221. Mulot C
    222. Neuhausen SL
    223. Nevanlinna H
    224. Neven P
    225. Nielsen SF
    226. Noh D-Y
    227. Nordestgaard BG
    228. Norman A
    229. Olopade OI
    230. Olson JE
    231. Olsson H
    232. Olswold C
    233. Orr N
    234. Pankratz VS
    235. Park SK
    236. Park-Simon T-W
    237. Lloyd R
    238. Perez JIA
    239. Peterlongo P
    240. Peto J
    241. Phillips K-A
    242. Pinchev M
    243. Plaseska-Karanfilska D
    244. Prentice R
    245. Presneau N
    246. Prokofyeva D
    247. Pugh E
    248. Pylkäs K
    249. Rack B
    250. Radice P
    251. Rahman N
    252. Rennert G
    253. Rennert HS
    254. Rhenius V
    255. Romero A
    256. Romm J
    257. Ruddy KJ
    258. Rüdiger T
    259. Rudolph A
    260. Ruebner M
    261. Rutgers EJT
    262. Saloustros E
    263. Sandler DP
    264. Sangrajrang S
    265. Sawyer EJ
    266. Schmidt DF
    267. Schmutzler RK
    268. Schneeweiss A
    269. Schoemaker MJ
    270. Schumacher F
    271. Schürmann P
    272. Scott RJ
    273. Scott C
    274. Seal S
    275. Seynaeve C
    276. Shah M
    277. Sharma P
    278. Shen C-Y
    279. Sheng G
    280. Sherman ME
    281. Shrubsole MJ
    282. Shu X-O
    283. Smeets A
    284. Sohn C
    285. Southey MC
    286. Spinelli JJ
    287. Stegmaier C
    288. Stewart-Brown S
    289. Stone J
    290. Stram DO
    291. Surowy H
    292. Swerdlow A
    293. Tamimi R
    294. Taylor JA
    295. Tengström M
    296. Teo SH
    297. Beth Terry M
    298. Tessier DC
    299. Thanasitthichai S
    300. Thöne K
    301. Tollenaar RAEM
    302. Tomlinson I
    303. Tong L
    304. Torres D
    305. Truong T
    306. Tseng C-C
    307. Tsugane S
    308. Ulmer H-U
    309. Ursin G
    310. Untch M
    311. Vachon C
    312. van Asperen CJ
    313. Van Den Berg D
    314. van den Ouweland AMW
    315. van der Kolk L
    316. van der Luijt RB
    317. Vincent D
    318. Vollenweider J
    319. Waisfisz Q
    320. Wang-Gohrke S
    321. Weinberg CR
    322. Wendt C
    323. Whittemore AS
    324. Wildiers H
    325. Willett W
    326. Winqvist R
    327. Wolk A
    328. Wu AH
    329. Xia L
    330. Yamaji T
    331. Yang XR
    332. Har Yip C
    333. Yoo K-Y
    334. Yu J-C
    335. Zheng W
    336. Zheng Y
    337. Zhu B
    338. Ziogas A
    339. Ziv E
    340. ABCTB Investigators
    341. ConFab/AOCS Investigators
    342. Lakhani SR
    343. Antoniou AC
    344. Droit A
    345. Andrulis IL
    346. Amos CI
    347. Couch FJ
    348. Pharoah PDP
    349. Chang-Claude J
    350. Hall P
    351. Hunter DJ
    352. Milne RL
    353. García-Closas M
    354. Schmidt MK
    355. Chanock SJ
    356. Dunning AM
    357. Edwards SL
    358. Bader GD
    359. Chenevix-Trench G
    360. Simard J
    361. Kraft P
    362. Easton DF
    (2017) Association analysis identifies 65 new breast cancer risk loci
    Nature 551:92–94.
    1. Perry JR
    2. Day F
    3. Elks CE
    4. Sulem P
    5. Thompson DJ
    6. Ferreira T
    7. He C
    8. Chasman DI
    9. Esko T
    10. Thorleifsson G
    11. Albrecht E
    12. Ang WQ
    13. Corre T
    14. Cousminer DL
    15. Feenstra B
    16. Franceschini N
    17. Ganna A
    18. Johnson AD
    19. Kjellqvist S
    20. Lunetta KL
    21. McMahon G
    22. Nolte IM
    23. Paternoster L
    24. Porcu E
    25. Smith AV
    26. Stolk L
    27. Teumer A
    28. Tšernikova N
    29. Tikkanen E
    30. Ulivi S
    31. Wagner EK
    32. Amin N
    33. Bierut LJ
    34. Byrne EM
    35. Hottenga JJ
    36. Koller DL
    37. Mangino M
    38. Pers TH
    39. Yerges-Armstrong LM
    40. Zhao JH
    41. Andrulis IL
    42. Anton-Culver H
    43. Atsma F
    44. Bandinelli S
    45. Beckmann MW
    46. Benitez J
    47. Blomqvist C
    48. Bojesen SE
    49. Bolla MK
    50. Bonanni B
    51. Brauch H
    52. Brenner H
    53. Buring JE
    54. Chang-Claude J
    55. Chanock S
    56. Chen J
    57. Chenevix-Trench G
    58. Collée JM
    59. Couch FJ
    60. Couper D
    61. Coveillo AD
    62. Cox A
    63. Czene K
    64. D’adamo AP
    65. Smith GD
    66. De Vivo I
    67. Demerath EW
    68. Dennis J
    69. Devilee P
    70. Dieffenbach AK
    71. Dunning AM
    72. Eiriksdottir G
    73. Eriksson JG
    74. Fasching PA
    75. Ferrucci L
    76. Flesch-Janys D
    77. Flyger H
    78. Foroud T
    79. Franke L
    80. Garcia ME
    81. García-Closas M
    82. Geller F
    83. de Geus EE
    84. Giles GG
    85. Gudbjartsson DF
    86. Gudnason V
    87. Guénel P
    88. Guo S
    89. Hall P
    90. Hamann U
    91. Haring R
    92. Hartman CA
    93. Heath AC
    94. Hofman A
    95. Hooning MJ
    96. Hopper JL
    97. Hu FB
    98. Hunter DJ
    99. Karasik D
    100. Kiel DP
    101. Knight JA
    102. Kosma VM
    103. Kutalik Z
    104. Lai S
    105. Lambrechts D
    106. Lindblom A
    107. Mägi R
    108. Magnusson PK
    109. Mannermaa A
    110. Martin NG
    111. Masson G
    112. McArdle PF
    113. McArdle WL
    114. Melbye M
    115. Michailidou K
    116. Mihailov E
    117. Milani L
    118. Milne RL
    119. Nevanlinna H
    120. Neven P
    121. Nohr EA
    122. Oldehinkel AJ
    123. Oostra BA
    124. Palotie A
    125. Peacock M
    126. Pedersen NL
    127. Peterlongo P
    128. Peto J
    129. Pharoah PD
    130. Postma DS
    131. Pouta A
    132. Pylkäs K
    133. Radice P
    134. Ring S
    135. Rivadeneira F
    136. Robino A
    137. Rose LM
    138. Rudolph A
    139. Salomaa V
    140. Sanna S
    141. Schlessinger D
    142. Schmidt MK
    143. Southey MC
    144. Sovio U
    145. Stampfer MJ
    146. Stöckl D
    147. Storniolo AM
    148. Timpson NJ
    149. Tyrer J
    150. Visser JA
    151. Vollenweider P
    152. Völzke H
    153. Waeber G
    154. Waldenberger M
    155. Wallaschofski H
    156. Wang Q
    157. Willemsen G
    158. Winqvist R
    159. Wolffenbuttel BH
    160. Wright MJ
    161. Boomsma DI
    162. Econs MJ
    163. Khaw KT
    164. Loos RJ
    165. McCarthy MI
    166. Montgomery GW
    167. Rice JP
    168. Streeten EA
    169. Thorsteinsdottir U
    170. van Duijn CM
    171. Alizadeh BZ
    172. Bergmann S
    173. Boerwinkle E
    174. Boyd HA
    175. Crisponi L
    176. Gasparini P
    177. Gieger C
    178. Harris TB
    179. Ingelsson E
    180. Järvelin MR
    181. Kraft P
    182. Lawlor D
    183. Metspalu A
    184. Pennell CE
    185. Ridker PM
    186. Snieder H
    187. Sørensen TI
    188. Spector TD
    189. Strachan DP
    190. Uitterlinden AG
    191. Wareham NJ
    192. Widen E
    193. Zygmunt M
    194. Murray A
    195. Easton DF
    196. Stefansson K
    197. Murabito JM
    198. Ong KK
    199. Australian Ovarian Cancer Study
    200. GENICA Network
    201. kConFab
    202. LifeLines Cohort Study
    203. InterAct Consortium
    (2014) Parent-of-origin-specific allelic associations among 106 genomic loci for age at menarche
    Nature 514:92–97.
    1. Tanaka S
    2. Mori M
    3. Akiyoshi T
    4. Tanaka Y
    5. Mafune K
    6. Wands JR
    7. Sugimachi K
    Coexpression of Grb7 with epidermal growth factor receptor or Her2/erbB2 in human advanced esophageal carcinoma
    Cancer Research 57:28–31.
  3. Software
    1. Williams RW
    (2022) GeneNetwork 2

Article and author information

Author details

  1. Khyobeni Mozhui

    1. Department of Preventive Medicine, University of Tennessee Health Science Center, College of Medicine, Memphis, United States
    2. Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, College of Medicine, Memphis, United States
    Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Visualization, Writing - original draft, Writing – review and editing
    For correspondence
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-6623-4112
  2. Ake T Lu

    Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, United States
    Formal analysis, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  3. Caesar Z Li

    Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, United States
    Formal analysis, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  4. Amin Haghani

    Department of Biostatistics, Fielding School of Public Health, University of California Los Angeles, Los Angeles, United States
    Formal analysis, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-6052-8793
  5. Jose Vladimir Sandoval-Sierra

    Department of Preventive Medicine, University of Tennessee Health Science Center, College of Medicine, Memphis, United States
    Writing – review and editing, laboratory work, contributed to data generation
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-2007-6582
  6. Yibo Wu

    1. YCI Laboratory for Next-Generation Proteomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
    2. University of Geneva, Geneva, Switzerland
    Methodology, Resources, Writing – review and editing
    Competing interests
    No competing interests declared
  7. Robert W Williams

    Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, College of Medicine, Memphis, United States
    Conceptualization, Data curation, Resources, Writing – review and editing
    Competing interests
    No competing interests declared
  8. Steve Horvath

    1. Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, United States
    2. Department of Biostatistics, Fielding School of Public Health, University of California Los Angeles, Los Angeles, United States
    Conceptualization, Data curation, Formal analysis, Methodology, Writing – review and editing
    Competing interests
    is a founder of the non-profit Epigenetic Clock Development Foundation, which plans to license several of his patents from his employer, University of California Regents. The Regents of the University of California filed a patent application (publication number WO2020150705) related to the HorvathMammalMethylChip40 and clock computation for which he is named an inventor


National Institute on Aging (R21AG055841)

  • Khyobeni Mozhui

National Institute on Aging (R01AG043930)

  • Robert W Williams

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.


We thank the entire UTHSC BXD Aging Colony team, particularly Dr. Suheeta Roy, Casey J Chapman, Melinda S McCarty, Jesse Ingles, and everyone else who contributed to the tissue harvest. We thank Dr. Lu Lu, who leads the main BXD Colony effort. We thank Dr. Evan G Williams for making the gene expression data readily available, and Dr. David Ashbrook for providing the BXD genotypes. We thank the GeneNetwork team, especially Zach Sloan and Arthur Centeno, who have been extremely prompt and effective at assisting with the GeneNetwork interface, and Dr. Pjotr Prins for his role in implementing GEMMA. We thank Dr. Garrett Jenkinson for the invaluable guidance he provided for entropy estimation. This study was funded by the NIH NIA grants R21AG055841 and R01AG043930.


All animal procedures were in accordance to protocol approved by the Institutional Animal Care and Use Committee (IACUC) at the University of Tennessee Health Science Center. Protocol numbers 12-148.0 (2012-2015), 15-124.0 (2015-2018), and 18-094.0 (2018-present).

Version history

  1. Preprint posted: June 24, 2021 (view preprint)
  2. Received: November 3, 2021
  3. Accepted: April 1, 2022
  4. Accepted Manuscript published: April 7, 2022 (version 1)
  5. Version of Record published: April 28, 2022 (version 2)


© 2022, Mozhui et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.


  • 1,950
  • 280
  • 29

Views, downloads and citations are aggregated across all versions of this paper published by eLife.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Khyobeni Mozhui
  2. Ake T Lu
  3. Caesar Z Li
  4. Amin Haghani
  5. Jose Vladimir Sandoval-Sierra
  6. Yibo Wu
  7. Robert W Williams
  8. Steve Horvath
Genetic loci and metabolic states associated with murine epigenetic aging
eLife 11:e75244.

Share this article

Further reading

    1. Chromosomes and Gene Expression
    Rupam Choudhury, Anuroop Venkateswaran Venkatasubramani ... Axel Imhof
    Research Article

    Eukaryotic chromatin is organized into functional domains, that are characterized by distinct proteomic compositions and specific nuclear positions. In contrast to cellular organelles surrounded by lipid membranes, the composition of distinct chromatin domains is rather ill described and highly dynamic. To gain molecular insight into these domains and explore their composition, we developed an antibody-based proximity-biotinylation method targeting the RNA and proteins constituents. The method that we termed Antibody-Mediated-Proximity-Labelling-coupled to Mass Spectrometry (AMPL-MS) does not require the expression of fusion proteins and therefore constitutes a versatile and very sensitive method to characterize the composition of chromatin domains based on specific signature proteins or histone modifications. To demonstrate the utility of our approach we used AMPL-MS to characterize the molecular features of the chromocenter as well as the chromosome territory containing the hyperactive X-chromosome in Drosophila. This analysis identified a number of known RNA binding proteins in proximity of the hyperactive X and the centromere, supporting the accuracy of our method. In addition, it enabled us to characterize the role of RNA in the formation of these nuclear bodies. Furthermore, our method identified a new set of RNA molecules associated with the Drosophila centromere. Characterization of these novel molecules suggested the formation of R-loops in centromeres, which we validated using a novel probe for R-loops in Drosophila. Taken together, AMPL-MS improves the selectivity and specificity of proximity ligation allowing for novel discoveries of weak protein-RNA interactions in biologically diverse domains.

    1. Cancer Biology
    2. Chromosomes and Gene Expression
    Gregory Caleb Howard, Jing Wang ... William P Tansey
    Research Article

    The chromatin-associated protein WD Repeat Domain 5 (WDR5) is a promising target for cancer drug discovery, with most efforts blocking an arginine-binding cavity on the protein called the ‘WIN’ site that tethers WDR5 to chromatin. WIN site inhibitors (WINi) are active against multiple cancer cell types in vitro, the most notable of which are those derived from MLL-rearranged (MLLr) leukemias. Peptidomimetic WINi were originally proposed to inhibit MLLr cells via dysregulation of genes connected to hematopoietic stem cell expansion. Our discovery and interrogation of small-molecule WINi, however, revealed that they act in MLLr cell lines to suppress ribosome protein gene (RPG) transcription, induce nucleolar stress, and activate p53. Because there is no precedent for an anticancer strategy that specifically targets RPG expression, we took an integrated multi-omics approach to further interrogate the mechanism of action of WINi in human MLLr cancer cells. We show that WINi induce depletion of the stock of ribosomes, accompanied by a broad yet modest translational choke and changes in alternative mRNA splicing that inactivate the p53 antagonist MDM4. We also show that WINi are synergistic with agents including venetoclax and BET-bromodomain inhibitors. Together, these studies reinforce the concept that WINi are a novel type of ribosome-directed anticancer therapy and provide a resource to support their clinical implementation in MLLr leukemias and other malignancies.