A mathematical model that predicts human biological age from physiological traits identifies environmental and genetic factors that influence aging

Sergiy Libert; Alex Chekholko; Cynthia Kenyon

doi:10.7554/eLife.92092.2

eLife Assessment

This important study developed a mathematical model to predict biological age by leveraging physiological traits across multiple organ systems. The results presented are convincing, utilizing comprehensive data-driven approaches. However, additional external validation could further strengthen its generalizability. The model provides a way to identify environmental and genetic factors impacting aging and lifespan, revealing new factors potentially affecting aging. It also shows promise for evaluating therapeutics aimed at prolonging a healthy lifespan.

https://doi.org/10.7554/eLife.92092.2.sa2

Significance of findings

important: Findings that have theoretical or practical implications beyond a single subfield

Strength of evidence

convincing: Appropriate and validated methodology in line with current state-of-the-art

During the peer-review process the editor and reviewers write an eLife assessment that summarises the significance of the findings reported in the article (on a scale ranging from landmark to useful) and the strength of the evidence (on a scale ranging from exceptional to inadequate).

Abstract

Why people age at different rates is a fundamental, unsolved problem in biology. We created a model that predicts an individual’s age from physiological traits that change with age in the large UK Biobank dataset, such as blood pressure, lung function, strength and stimulus-reaction time. The model best predicted a person’s age when it heavily weighted traits that together query multiple organ systems, arguing that most or all physiological systems (lung, heart, brain, etc.) contribute to the global phenotype of chronological age. Differences between calculated “biological” age and chronological age (ΔAge) appear to reflect an individual’s relative youthfulness, as people predicted to be young for their age had a lower subsequent mortality rate and a higher parental age at death, even though no mortality data were used to calculate ΔAge. Remarkably, the effect of each year of physiological ΔAge on Gompertz mortality risk was equivalent to that of one chronological year. A Genome-Wide Association Study (GWAS) of ΔAge and an analysis of environmental factors associated with ΔAge identified known as well as new factors that may influence human aging, including genes involved in synapse biology and a tendency to play computer games. We identify a small number of readily measured physiological traits that together assess a person’s biological age and may be used clinically to evaluate therapeutics designed to slow aging and extend healthy life.

Introduction

The process of aging is universally similar yet deeply unique to each person. By observing a person for a moment, one can deduce their age with high accuracy, even though no two people age the same way. Some individuals might lose hair with age or develop chronic diseases, whereas others might not. Investigating both the universal aspects of aging as well as the basis of individual differences, and developing means of measuring physiological age and health, will provide opportunities to improve human lives.

The rate of aging, that is, the rate at which organisms lose physiological fitness and accumulate morbidity, has both genetic and environmental determinants. Humans age more slowly than, for example, dogs, so genes play a key role, but environmental factors like smoking and exercise influence aging as well. In this study, we have used publicly available data on human health parameters to systematically identify genetic and environmental variables that influence human aging.

To generate an inclusive, wholistic model of human aging, we queried a large, well-annotated human database (the United Kingdom BioBank or UKBB) comprising over 3000 phenotypes that together span the functions of multiple organs and physiological systems. The UKBB’s medical, environmental, and genetic data on ∼500,000 British volunteers is a unique resource to investigate the biology of aging. While participants of UKBB are not a random cross-section of society (Abdellaoui et al., 2019; Haworth et al., 2019), this rich database nonetheless likely provides generalizable insights into human aging and disease (Hanlon et al., 2022).

A number of published studies describe and employ methods to identify genes that might influence human aging. The majority of those studies (Graham Ruby et al., 2018; Pilling et al., 2017; Wright et al., 2019) focus on lifespan (Joshi et al., 2017; Timmers et al., 2022, 2019); for example, age at death or parents’ age at death, or analyze cohorts of people with exceptional lifespan (Bae et al., 2022; Shen et al., 2020); or the presence or absence of one or few age-associated diseases (Timmers et al., 2022, 2020; Zenin et al., 2019). Additionally, researchers have used molecular traits, such as blood proteins (Coenen et al., 2023), or blood DNA methylation patterns to build and analyze biological age prediction algorithms (clocks) to identify genes that influence aspects of human aging (Gibson et al., 2019; Lu et al., 2018; McCartney et al., 2021). Biological age clocks derived from one or few physiological measures have also been constructed, such as a biological clock built using 3D facial scans (Xia et al., 2020). Likewise, a biological clock built using the gut microbiome (Wilmanski et al., 2021) was used to identify individuals who might be aging slower or faster than average and suggest drugs that might influence gut health. Recently, aging of separate organs has been investigated and linked to age-associated diseases and mortality (Tian et al., 2023), and biological age has been estimated using AI methods (Qiu et al., 2022).

In our approach, we set out to measure human aging directly and wholistically, making sure that all systems relevant to health are represented. To do so, we sampled and analyzed traits reporting on multiple organ systems and physiological domains. We quantified the markers of aging that reflect overall physiological health, such as strength, stimulus-reaction time and blood pressure. This multi-systemic approach does not rely on the presence or absence of recognized diseases or a small number of binary events, such as death or stroke, and therefore reflects human aging more directly. Likewise, instead of concentrating on diseases, we aimed to evaluate a multitude of physiological parameters that change in “healthy” people, to allow us to identify factors missed by previous studies.

We developed a series of mathematical models that consider 121 age-related traits and predict a biological age for each individual. We show that the model that best predicts age incorporates data reflecting the activity of most if not all the organs and physiological systems. By comparing predicted biological age to actual age, we identified individuals who may be aging slower or faster than average. Using this model, we identified new environmental factors and genetic loci that may influence biological age. By building models lacking clusters of phenotypically correlated (typically organ-specific) traits, we further categorized these genetic loci and environmental factors as those likely to influence aging globally vs those that likely impact a single organ system. Likewise, by analyzing a smaller, healthier sub-cohort of UKBB participants, we identified factors likely to influence apparent age by conferring an age-related disease. Notably, our findings highlighted neural function as an important determinant of overall biological age. Finally, after analyzing the performance of different physiological clocks, we identified twelve key physiological traits that together could measure biological age in longitudinal clinical trials for interventions that increase human healthspan.

Results

Physiological traits that change with age

To identify age-dependent traits, we conducted linear regression analysis on every UKBB parameter relative to the age of the participants (see Supplementary Table 1a, b) and recorded the list of those with a non-zero slope and adjusted statistical significance better than 10^-3 (see Supplementary Table 2a, b). Examples include systolic blood pressure (shown in Fig. 1a), which increases with age, and hand-grip strength (shown in Fig. 1b), which decreases with age.

a) Systolic blood pressure (UKBB field ID# 4080) and b) hand-grip strength (UKBB field ID# 47) of a random set of 10,000 female UKBB participants are plotted against their age; c) Number of age-sensitive phenotypes plotted against the declining number of people in whom these phenotypes were measured; d, e) sex hormone binding globulin concentrations (UKBB field# 30830) of a random set of 10,000 males (d) and females (e); f) Average number of lifetime sexual partners is plotted against the age of UKBB participants (UKBB field ID# 2149). Grey area denotes 99% confidence interval. Color of dots on the plot represents relative density of dots in the area.

Most large human databases and datasets, including UKBB, have certain limitations, such as incomplete or missing data points. Therefore, before proceeding to modelling aging, we needed to address the following three issues:

Certain phenotypes, such as MRI brain scans, were only available for a subset of UKBB participants (in this case <50,000). Therefore, we could not use MRI data to estimate the age of the remaining participants. Thus, the inclusion of such incomplete phenotypes in the UKBB database required an optimization strategy. The objective was to identify individuals who appeared young for their age, and the more individuals in the study, the greater the likelihood of discovering them. Likewise, including more diverse phenotypes improves the robustness and global assessment of overall aging. However, as we increased the number of age-dependent phenotypes, the number of individuals evaluated decreased. From the curve’s shape (Fig. 1c), we estimated an optimal inclusion threshold to be ∼120 +/- 15 phenotypes.
Significant phenotypic differences exist between the sexes. For example, the parameters “age at which first facial hair appeared”, “age at menopause” and “degree of pattern balding” are gender specific. Additionally, shared phenotypes may have different dynamics in males versus females. For example, increasing plasma concentration of sex-hormone binding globulin (SHBG) is one of the best predictors of age in males (Fig. 1d); however, in females, its plasma concentration stays nearly constant or even tends to decrease (Fig. 1e). Thus, we analyzed male and female aging separately.
The dataset we used is largely cross-sectional, meaning that each data point represents a different person at a different age. Consequently, phenotypes that are used to predict age could be indicative of cultural and societal changes over time, rather than biological changes associated with aging. For instance, a good predictor of age (with a p-value < 10^-52) is the lifetime number of sexual partners (Fig. 1f). While sexual activity and fertility have been linked to human aging and longevity (Min et al., 2012), the correlation here is most likely driven by evolving social norms in Britain. Other examples of such traits include "how many siblings do you have" or "how long have you lived in your current house." Moreover, some biological measurements were derived using age as a parameter. For instance, BMR (Basal Metabolic Rate) is an outstanding age predictor (p-value < 10^-255). However, BMR was not measured directly; instead, it was computed using a formula that incorporates height, weight, gender, and age itself. Therefore, we examined each age-dependent parameter independently, aiming to satisfy three broad criteria: a) the trait should not reflect societal norms and structures; b) the trait should not be a function of elapsed time (e.g., how long have you been drinking green tea?); and c) the trait’s value should not depend on a person’s actual age. We endeavored to use purely biological and physiological parameters. Although it is possible that the selected phenotypes were still influenced to some degree by the birth cohort, these considerations should have reduced this effect. The complete list of age-related traits we selected, along with the reasons behind our choices, can be found in supplementary tables 2a and 2b.
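The trade-off described in the first of these issues (more phenotypes, fewer fully measured participants) can be made concrete with a small sketch. It is one possible way to trace a curve like the one in Fig. 1c, under the simplifying assumption that a participant counts only if every included phenotype was measured; the function name and the toy participant IDs are hypothetical.

```python
def coverage_curve(measured):
    """Trace how the fully measured cohort shrinks as phenotypes are added.

    `measured` maps a phenotype name to the set of participant IDs that have
    a value for it. Phenotypes are added from most to least widely measured.
    """
    order = sorted(measured, key=lambda name: len(measured[name]), reverse=True)
    complete = None
    curve = []
    for k, name in enumerate(order, start=1):
        ids = measured[name]
        # Keep only participants who have every phenotype included so far.
        complete = set(ids) if complete is None else complete & ids
        curve.append((k, name, len(complete)))
    return curve


# Hypothetical toy cohort of six participants.
toy_measured = {
    "grip_strength": {1, 2, 3, 4, 5, 6},
    "blood_pressure": {1, 2, 3, 4, 5},
    "brain_mri": {1, 2},
}
curve = coverage_curve(toy_measured)
```

Here adding the sparsely measured MRI phenotype cuts the usable cohort from five participants to two, which is the kind of drop that motivates choosing an inclusion threshold.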

Age-dependent physiological traits fall into clusters

The phenotypes we selected for our age-prediction model were often correlated to one another; for example, left-hand and right-hand grip strength. To assess the degree and pattern of correlations among the age-dependent traits (see Supp. Table S1), we first normalized each phenotype by its mean and standard deviation. For phenotypes represented as multiple-choice questions (e.g., do you take naps - often, sometimes, rarely, never?), we encoded each answer option as a binary vector (one or zero), and these vectors were also normalized. Correlations were computed for each pair of phenotypes and visualized as dendrograms (fig. 2a, b). As expected, highly correlated phenotypes grouped together, such as “BMI”- “Weight”- “Waist Circumference” or “Cholesterol”- “LDL”. Surprisingly, this analysis uncovered strong correlations that were not obvious, such as “I drive faster than the speed limit most of the time (id# 1100)” with “I like my drinks very hot (id# 1518)” (fig. 2a, b; marked with yellow shadows). Notably, most of the clusters appeared to be enriched for phenotypes associated with a specific organ or physiological system. For example, the cluster that contains “Creatinine”, “Urea”, “Cystatin-C” and “Phosphate” likely reflects kidney function; whereas the cluster that contains “Systolic blood pressure” and “Diastolic blood pressure” likely reflects cardiovascular function (fig. 2a, b). That said, upon close examination, it is not intuitively obvious why some physiological traits do or do not cluster with one another. Thus, this dendrogram might be a valuable data source for future hypothesis generation and exploration.

Age-dependent phenotypic clustering. Dendrogram plots of age-dependent female (a) and male (b) phenotypes selected for age prediction. Numbers in the name of “rays” represent UKBB ID numbers for multichoice questions (see supplementary table 1 or the UKBB website), followed by the answer. Major clusters were colored and subjectively assigned a name that reflects a possible biological function of the cluster; the number of principal components included in the PLS (Projection to Latent Structures) model to predict age vs root mean square error of the predictions is plotted for females (c) and males (d); the top phenotypes with the highest weights in the age-predicting PLS model are listed for females (e) and males (f). Phenotypes shaded in green are shared between sexes, red are specific to females, and blue are specific to males. All phenotypes were used for both sexes, and this shading reflects only the position in the list of top-13 traits; the phenotypes used to predict age of females (g) and males (h) are projected on 2D space using correlation as the distance measure. The degree of correlation is also depicted by grey lines, the darker the shade, the stronger the correlation. Note that the distortion in positioning is an inevitable consequence of projecting high-dimensional data into 2D space. As before, groups of related phenotypes were subjectively assigned a name that likely depicts their physiology, and phenotypes with the highest weight in the PLS model were depicted by red dots.

A mathematical model to predict age

To develop a model that predicts age, we experimented with several algorithms, including simple linear regression, Gradient Boosting Machine (GBM) and Partial Least Squares regression (PLS). Different approaches have different advantages and limitations; however, we chose a single approach rather than developing and analyzing several independent models in parallel, so as not to artificially inflate the False Discovery Rate (FDR). We ultimately selected PLS regression because it enabled us to determine the number and composition of components required to predict age optimally from the data, which provides additional insights into the biology of human aging. But before making this selection, we compared the performance of the three approaches. The outcomes of PLS and linear regression were almost identical (R-squared between ΔAge values derived by these two methods was 0.99, meaning that if one model were to predict an individual was 62 years old, the other model would make nearly the same prediction).

This similarity is likely due to the small number of predictors (121 phenotypes) and comparatively large number of participants (over 400,000). The correlation between GBM model outcomes and PLS (and linear regression) was slightly smaller (R-squared = 0.87). The reason for the lower correlation is likely the need for imputation in PLS and linear regression models. The GBM model tolerates missing data, whereas linear regression and PLS methods require imputation or removal of individuals with too many datapoints missing, an approach we describe in more detail below.

PLS modeling is not tolerant of missing values, and in the UKBB dataset we used, over 60,000 participants (∼15%) lacked at least one phenotypic measurement. To prevent excessive imputation, we excluded any individual missing more than 15 datapoints from the study, thereby decreasing the number of selected female participants from 222,111 to 215,949 (∼2.7% loss), and males from 188,609 to 183,715 (∼2.6% loss). We imputed and scaled the values of the remaining participants with missing data (Methods).
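A minimal sketch of this exclusion-then-imputation step is given below. The threshold of 15 missing datapoints comes from the text; the use of column-mean imputation is purely an assumption of the sketch (the actual procedure is described in the Methods), and the function name and toy rows are hypothetical.

```python
from statistics import mean

MAX_MISSING = 15  # exclusion threshold stated in the text


def filter_and_impute(rows, max_missing=MAX_MISSING):
    """Drop participants with too many gaps, then fill the remaining gaps.

    `rows` is a list of equal-length lists, with None marking a missing value.
    Gaps are filled with the column mean, which is an assumption of this sketch.
    """
    kept = [r for r in rows if sum(v is None for v in r) <= max_missing]
    n_cols = len(kept[0]) if kept else 0
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in kept if r[j] is not None]
        col_means.append(mean(observed))  # assumes each column has some data
    return [[col_means[j] if v is None else v for j, v in enumerate(r)] for r in kept]


# Toy example with a threshold of 1 so the effect is visible on three columns.
toy_rows = [
    [1.0, 2.0, 3.0],
    [None, 4.0, 5.0],   # one gap: kept and imputed
    [None, None, 7.0],  # two gaps: excluded
]
cleaned = filter_and_impute(toy_rows, max_missing=1)
```

Excluding heavily incomplete participants first keeps the share of imputed values small, which is the stated motivation for the threshold.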

Next, we determined how many PLS components (each derived from UKBB phenotypes) were required to predict chronological age. To do so, we constructed a series of age-prediction models using an increasing number of these components. The first model was built using only component #1, the second using components #1 and #2, and so on. At each step, we calculated the root-mean-square error of the age prediction and determined its decline using the R function "selectNcomp" (see Fig. 2c, d). Our analysis revealed that only 11 independent components were required to describe female aging dynamics, and 9 independent components were required for males. Including additional components did not further improve the model performance. Therefore, we used the R function "plsR" with 9 and 11 components for males and females, respectively, along with the Cross-Validation function (CV) to prevent overfitting when building models to predict age using UKBB phenotypes. Specifically, we performed 10 rounds of cross-validation, where 10% of data were held out and the remaining 90% used for training. Over 10 rounds, different 10% were held out for validation. In each case, the findings were validated in the test set. Full statistics and approach are described in supplementary computational methods.
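The cross-validation layout and the choice of component number can be illustrated with the sketch below. It is not the algorithm used by the R function "selectNcomp"; the selection rule here (the smallest number of components whose error is within a relative tolerance of the best) is a simplified stand-in, and the function names, the tolerance, and the example error values are hypothetical.

```python
import random


def kfold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists; each sample is held out exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def select_ncomp(cv_rmse, tol=0.01):
    """Pick the smallest component count whose error is close to the best.

    cv_rmse[i] is the cross-validated RMSE of the model using i + 1 components.
    """
    best = min(cv_rmse)
    for i, err in enumerate(cv_rmse):
        if err <= best * (1 + tol):
            return i + 1


# Made-up error curve that flattens out, loosely shaped like Fig. 2c, d.
toy_rmse = [6.0, 5.4, 5.0, 4.85, 4.81, 4.80, 4.80]
n_components = select_ncomp(toy_rmse)
splits = list(kfold_indices(100, k=10))
```

With k = 10, each round trains on 90% of participants and validates on the remaining 10%, matching the layout described above.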

We next asked which individual age-sensitive phenotypes were most useful for age prediction. Since many phenotypes contribute to multiple PLS components, we deconstructed each PLS component and calculated the sum of the absolute values for phenotype coefficients across all components. This provided a weight metric for each phenotype used to predict age. The top thirteen phenotypes with the highest weights are presented in figures 2e and 2f. Most were shared between males and females and were associated with different physiological systems; for example, systolic blood pressure (which likely correlates with cardiovascular health), forced expiratory volume (pulmonary and cartilage/bone health), urea and cystatin C levels (kidney health), and mean time to correctly identify matches (cognitive health). Moreover, if we deleted one of these selected traits, the model substituted a close correlate; specifically, it substituted 1-second FEV (forced expiratory volume) for FEV, systolic blood pressure for diastolic blood pressure, and hand-grip strength (right) for hand-grip strength (left). The fact that the model best predicted chronological age when it received input from a wide range of physiological systems underscores the global, systemic nature of the aging process. Similar conclusions were drawn from high-dimensional analysis of aging mice (Chen et al., 2022).
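The weight metric described above, the sum of absolute phenotype coefficients across components, is simple to express in code. The coefficient values and the function name below are made up for illustration; in practice the coefficients would come from the fitted PLS model.

```python
def phenotype_weights(component_coefs, names):
    """Rank phenotypes by the sum of absolute coefficients across components.

    component_coefs[c][j] is the coefficient of phenotype j in component c.
    """
    weights = {name: 0.0 for name in names}
    for coefs in component_coefs:
        for name, c in zip(names, coefs):
            weights[name] += abs(c)
    return sorted(weights.items(), key=lambda item: item[1], reverse=True)


# Hypothetical coefficients for three phenotypes in a two-component model.
toy_names = ["systolic_bp", "fev", "urea"]
toy_coefs = [
    [0.6, -0.5, 0.1],
    [0.2, 0.4, -0.1],
]
ranked = phenotype_weights(toy_coefs, toy_names)
```

Taking absolute values means a phenotype that contributes strongly but with opposite signs in different components still receives a high weight.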

Inferred (biological) age predicts all-cause mortality better than chronological age

We utilized the physiological phenotypes listed in tables S2a, b and the PLS modelling described above to predict female age with a root mean square error of 4.8 years, R²∼0.63, and predict male age with a root mean square error of 5.1 years, R²∼0.6. Several factors may contribute to discrepancies between predicted biological age and chronological age, including statistical noise, variations in life histories among UKBB participants, limited accuracy of certain measurements, and inadequate numbers of relevant measurements. However, some of this discrepancy may arise because certain individuals are aging more slowly or rapidly than the mean for that age. Consistent with this interpretation, we observed a significant correlation of residuals between two assessments for a small number of UKBB participants who were evaluated longitudinally (twice) with intervals of up to 12 years (R²∼0.56, p<10^-255).
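The two accuracy figures quoted here are linked by a simple relation if R² is taken as one minus the ratio of residual variance to total variance (an assumption; the text does not say which definition was used). Under that assumption, the reported values imply a standard deviation of chronological age of roughly 8 years in both sexes. The function names below are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean square error of the predictions."""
    errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(errors) / len(errors))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total."""
    m = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - m) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def implied_sd(rmse_value, r2):
    """Spread of the target implied by RMSE and R² under the definition above."""
    return rmse_value / math.sqrt(1.0 - r2)


sd_female = implied_sd(4.8, 0.63)  # about 7.9 years
sd_male = implied_sd(5.1, 0.60)    # about 8.1 years
```

That both sexes give a similar implied spread is a quick consistency check on the reported numbers, not an independent result.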

To estimate biological age from this cross-sectional data, we computed a value termed ΔAge for each participant. We define ΔAge as the individual’s chronological age subtracted from their predicted age and normalized such that the average ΔAge for the entire population at each age is zero. ΔAge is negative if an individual is predicted to be younger than they are and positive if an individual is predicted to be older. The ΔAge parameter carries no information about the person’s actual chronological age, as it is distributed evenly around zero at every age (fig. 3a). Comparable approaches have been employed previously, such as using DNA methylation patterns (Marioni et al., 2015), or facial images and computer vision (Chen et al., 2015) to predict age and identify potentially “fast agers” and “slow agers”.
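The definition of ΔAge given above translates directly into code: subtract chronological age from predicted age, then remove the mean residual within each age group. The sketch assumes ages are recorded in whole years so that participants can be grouped by age; the function name and toy values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def delta_age(chronological, predicted):
    """Predicted minus chronological age, centered to zero within each age."""
    raw = [p - c for c, p in zip(chronological, predicted)]
    by_age = defaultdict(list)
    for c, r in zip(chronological, raw):
        by_age[c].append(r)
    # Mean residual at each chronological age (assumes whole-year ages).
    offsets = {age: mean(values) for age, values in by_age.items()}
    return [r - offsets[c] for c, r in zip(chronological, raw)]


# Toy example: the model over-predicts at 50 and under-predicts at 60.
toy_chronological = [50, 50, 50, 60, 60]
toy_predicted = [52, 54, 56, 57, 59]
toy_delta = delta_age(toy_chronological, toy_predicted)
```

After centering, the 50-year-olds and the 60-year-olds each average to zero, so a positive or negative ΔAge reflects only how a person compares with others of the same age.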

ΔAge has biological meaning. a) delta-age (ΔAge, predicted biological age minus chronological age) is plotted against chronological age for a random subset of 10,000 UKBB participants. Note that there is no correlation between age and ΔAge; b) histogram of age distribution (blue), and death distribution (red, right y-axis) is presented for UKBB males; c) mortality of UKBB male participants vs their age is plotted, note the classical exponential (Gompertzian) shape. Blue dots are actual data, the red line is an exponential fit, and the black dashed line is 95% confidence interval; d) histogram of the ΔAge distribution (blue), and death distribution (red, right y-axis) is presented for UKBB males of 62 years of age only; e) mortality of 62-year-old males is plotted against their ΔAge. Blue dots are actual data, the red line is an exponential fit, and the black dashed line is 95% confidence interval. Once again, note the classical exponential (Gompertzian) shape with ΔAge, even though all the subjects are the same age chronologically; f) distribution of ΔAge for all the people in UKBB (all ages and all genders, green shape). The distribution of ΔAge for people who died within 5 years after enrolling in the UKBB (red line) is shown for comparison; note a shift of the deceased distribution to the right, towards larger ΔAge (predicted older on average). The mortality penalty due to ΔAge is plotted as blue dots (left y-axis), the exponential fit of these data is presented as a blue line, and the 99% confidence interval as a grey shade; g, h) average ΔAge is plotted for UKBB males (g) and females (h) against their highest education (qualification) level achieved; i) the fraction of people who play computer games “sometimes” (yellow dots), “never” (red dots), and “often” (green dots); j) average ΔAge of people at different ages separated by their computer gaming habits (see 3i). As a group, people who play computer games “often” are biologically younger than people who play computer games “sometimes”, or “never”.

One year of ΔAge carries approximately the same mortality risk as one year of chronological age

The classical paradigm of aging described by Gompertz stipulates that mortality rates increase exponentially with time, doubling roughly every 8 years (Kirkwood, 2015). In the UKBB dataset that we analyzed, a small number of participants (8,883 males and 5,668 females, fig. 3b) passed away within 5 years of their initial test-center attendance. The distribution of these deaths among UKBB participants has a typical "Gompertzian" shape, with mortality rates doubling every 7.7 years for both males and females (figure 3c). In Gompertz’ model, where mortality depends only on age, everyone of the same age has an equal likelihood of dying. However, by incorporating ΔAge, we were able to further forecast death among individuals of the same age. To illustrate this point, consider males who are 62 years old and group them based on their ΔAge (as shown in figure 3d). Individuals on the left side (with negative ΔAge values) were predicted by the model to be younger than 62, while those on the right were predicted to be older. In this UKBB sub-cohort, several hundred subjects died within five years following their enrollment. Plotting the average mortality for each ΔAge bin in this stratification of 62-year-olds resulted in a Gompertz-like mortality distribution (fig. 3e). Notably, the effect of one year of ΔAge on the mortality rate was almost identical to that of one year of chronological age. It is important to emphasize that death data were not considered during the development of the model of biological age or derivation of ΔAge, and that ΔAge does not exhibit any correlation with chronological age (as illustrated in fig. 3a). The capacity of ΔAge to predict mortality with a similar level of accuracy as chronological age is consistent across genders and ages and can even be observed when individuals of all ages are combined (fig. 3f).
We consider this progressive increase in mortality rates with progressively larger ΔAge to be a powerful validation of this modeling strategy for assessing biological age. The fact that combining chronological age with ΔAge leads to a more precise prediction of mortality risk than relying on chronological age alone might be of interest to actuaries.

ΔAge correlates with parental lifespan

Remarkably, we observed a robust correlation of ΔAge with the age at death of the participant’s father (p-value = 1.9*10^-43 for females, and p-value = 3.9*10^-31 for males), and mother (p-value = 1.1*10^-68 for females, and p-value = 1.3*10^-32 for males). Individuals predicted to be biologically younger had parents who lived longer. Previous studies have reported that the lifespans of parents and offspring are correlated (Graham Ruby et al., 2018; Milman and Barzilai, 2016). These findings, too, provide strong validation for the model, reinforcing the idea that ΔAge is not simply noise, but rather carries significant information about the aging process and its variability in the population.

Environmental factors that influence biological age

Previous studies have shown that personal wealth is positively associated with human lifespan (Chetty et al., 2016; Wang and Geng, 2019), whereas smoking and excessive drinking are negatively associated with lifespan. To investigate whether this measure of physiological ΔAge has similar associations, and possibly to identify new environmental factors that influence aging, we calculated the correlation of ΔAge with every parameter available from UKBB (supplemental tables S3a, b). Correlations with p-values lower than 10^-5 (calculated to correct for multiple testing) were considered statistically significant. Interestingly, we observed a strong association of ΔAge with age-dependent biological phenotypes that were not included in the model to predict ΔAge due to the low number of people who underwent the assessment.

For example, heel bone density (UKBB field #3148) and Thalamus volume (UKBB field #25011) both had strong associations with ΔAge (p-values were ∼10^-11 and 10^-10, respectively). These and other phenotypes with strong ΔAge correlations again help to validate the model and might be useful parameters to consider when building biological clocks in the future.

Tables S3c and S3d list the environmental factors we found to correlate with ΔAge. As predicted, wealth was positively correlated with a more youthful ΔAge. For instance, parameters such as "home location" (UKBB field id# 20075), "place of birth" (UKBB field id# 129), "Townsend deprivation index" (UKBB field id# 189), and "total income" have a strong and significant correlation with ΔAge (Tables S3). Additionally, smoking and exposure to smoke (UKBB field ids# 20161 and 20162) were positively correlated with an older ΔAge. The impact of moderate alcohol drinking on long-term health is still a subject of debate. In our data, the overall frequency of alcohol consumption (numerous UKBB fields, like 20414) did not have a significant correlation with ΔAge; however, the alcohol type did. Consuming beer and hard cider (UKBB field id# 1588) was positively correlated with ΔAge, whereas consuming Champagne and other white wines (UKBB field id# 4418) was negatively correlated. It is likely that drinking Champagne frequently is an indicator of higher socio-economic status.

The single most significant non-biological parameter that correlated with ΔAge in both males and females (p-value<10^-200) was "Qualifications" or the level of education achieved (UKBB field id# 6138). Each additional level of education was progressively associated with increased "youthfulness" (Fig. 3g, h). Interestingly, the effect size of education (-1.51) was much greater than that of wealth (-0.81) or place of birth (-0.13).

Certain leisure and social activities were also correlated with ΔAge. The amount of TV watching (UKBB field# 1070) was positively correlated with ΔAge in both males and females, whereas time spent outdoors (UKBB field# 1050) for males, and DIY projects (UKBB field# 2624) for females were correlated with younger ΔAge. Intriguingly, the second strongest behavioral trait that associated with ΔAge was the “frequency with which people play computer games”. This is a novel association, and one that is less likely to reflect socioeconomic status, as access to computer gaming is inexpensive and widely available. Playing computer games was associated with youthfulness (fig. 3i, j, Supp. Item #1), with an effect size of -2.2 and p-value of 4*10^-8. This association was equally strong if “age” was factored out from the regression, indicating that generational changes in leisure activities do not explain this association.

Genetic loci associated with biological age

To identify potential genetic determinants of physiological ΔAge, we carried out a genome-wide association study (GWAS), using linear models separately on males and females (Methods). Manhattan plots for male and female GWAS models are presented in figures 4a-d (for summary statistics, see supplemental tables S4). The inflation factor in our analysis was λ_gc=1.2005 for males and λ_gc =1.2531 for females. Linkage disequilibrium regression intercepts were 1.0213+/-0.0083 and 1.0285+/-0.0119 for males and females respectively.
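The genomic inflation factor λ_gc is conventionally defined as the median of the observed association chi-square statistics divided by the median expected under the null hypothesis (about 0.455 for one degree of freedom). The sketch below computes it from a list of p-values under that conventional definition; how the authors computed their values is not specified here, and the function name and synthetic p-values are hypothetical.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 degree of freedom (about 0.4549).
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Median observed 1-df chi-square statistic over its null expectation."""
    # A two-sided p-value p corresponds to |z| = -inv_cdf(p / 2), chi2 = z**2.
    chi2 = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Under the null, p-values are uniform and the factor is close to 1.
null_p = [(i + 0.5) / 1000 for i in range(1000)]
lambda_null = genomic_inflation(null_p)

# Systematically smaller p-values push the factor above 1.
inflated_p = [p ** 2 for p in null_p]
lambda_inflated = genomic_inflation(inflated_p)
```

A value above 1, such as those reported above, indicates test statistics that are larger than expected under the null across the genome.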

Genetic analysis of ΔAge. a, c) Quantile-Quantile plots for female and male -log10 p-values; b, d) Manhattan plots from genome-wide association analysis of female and male ΔAge; e) Correlation of ΔAge GWAS determination with other GWAS performed and reported by the UKBB consortium. Note the strong genetic relation between GWASs for ΔAge and parental age at death; f) effect of *APOE* alleles on average ΔAge plotted across different ages. Beneficial *APOE2* alleles are in green, and detrimental *APOE4* alleles are in red.

Using a stringent multiple testing correction for GWAS (Chen et al., 2021) with a threshold of 10^-9, we identified 9 loci associated with ΔAge in males and 25 loci in females (fig. 4a, b, table S4). Four of these loci were found in both sexes. Specifically, these include the HLA locus, located at chr6:32,600,000; chr10:64,900,000, a locus that contains NRBF2, JMJD1C, and TATDN1P1 genes; chr19:45,413,233, a locus that contains APOE, TOMM40, and APOC genes; and chr20:23,613,000, a locus that contains the CST3 gene. These genes are strong candidates to influence whether a person is biologically young or old for their age. Two of these loci, APOE (Schächter et al., 1994; Sebastiani et al., 2019) and HLA (Yang et al., 2017), have previously been associated with human longevity, which increases our confidence in the analysis. GWAS analysis of combined male and female ΔAge data identified 12 additional loci; these loci, and the candidate genes associated with them, are listed in table S4 and figure 5b.

Cluster Dropout Models. a) Correlation between ΔAge calculated using the full set of identified parameters and each of the ten dropout models. Note that ΔAge values remain robust between models, meaning that if a person is predicted to have a large ΔAge by the complete model, the “dropout” models will predict a large ΔAge as well; b) The list of genes nearest to GWAS loci that associate with female and male ΔAge in the full model. Each hit is presented as a bubble, colored according to the significance of association of the locus with ΔAge, with size representing the effect size of the top SNP in the locus. The full summary is reported in Supplementary table 4.

Additionally, we compared our ΔAge GWAS association results with similar GWAS studies that were performed for other biological clocks. For example, McCartney et al. (2021) used DNA methylation data on 40,000 individuals to compute a biological age called GrimAge. They then calculated an intrinsic epigenetic age acceleration (IEAA, a value similar to ΔAge, which measures the deviation of biological age from chronological age) and performed GWAS.

A healthy sub-cohort distinguishes genes that affect aging vs age-related disease

Some genes that associated with ΔAge in our analysis are known disease risk factors. For example, the HNF1A (hepatocyte nuclear factor 1 homeobox A) locus (top SNP – rs1169284, ΔAge association p-value = 3.0*10^-23) is associated with diabetes (Shepherd et al., 2009) and cancer (Abel et al., 2018). The APOE (apolipoprotein E) locus (top SNP – rs7412, ΔAge association p-value = 4.4*10^-33) is associated with Alzheimer’s disease and coronary heart disease (Xu et al., 2016).

It is possible that people who carry risk alleles for age-related disease have a higher ΔAge due to the disease itself, even though their aging may be unaffected otherwise. To investigate this, we calculated the association of top loci with ΔAge in a “healthy-only” cohort, excluding people who had been diagnosed with disease; specifically, diabetes, cancer, asthma, emphysema, bronchitis, chronic obstructive pulmonary disease (COPD), cystic fibrosis, sarcoidosis, pulmonary fibrosis, tuberculosis, any vascular or heart problems (such as high blood pressure, stroke, angina, or heart attack) or a history of allergic complications.

These exclusion criteria decreased the number of people in the study by almost 50%; however, the association with ΔAge remained for the top hits (supplementary table 4). These findings suggest that most of the genetic signal associated with ΔAge comes not from a few susceptibility alleles for specific diseases but rather from alleles that describe and possibly drive fundamental processes that change with age; that is, possibly with aging itself. Conversely, this analysis also identified genes that were specifically responsible for certain diseases that present similarly to accelerated aging. For instance, the GCKR (glucokinase regulatory protein) locus showed a strong association with ΔAge (p-value=8*10^-12); however, the association disappeared when we excluded individuals diagnosed with diabetes. This demonstrates that mutations in GCKR cause a disease that resembles aging but do not have a detectable effect on the overall aging of healthy individuals.

Nonetheless, caution should be exercised when interpreting the analysis of this smaller, "healthier" subpopulation. It is possible that certain hits disappeared not due to disease but because of decreased statistical power resulting in false negatives. Conversely, some individuals may have had undiagnosed or subclinical disease, leading to false positives. Additionally, some of the associations may be false positives due to collider bias. Thus, we favor the interpretation that among the GWAS hits that disappeared in the healthy sub-cohort were disease-susceptibility genes, while those that persisted likely influence the aging process more generally. Future longitudinal and other studies in humans and potentially animals could lend support to this interpretation.

Heritability of ΔAge

To estimate heritability, we performed Linkage Disequilibrium (LD) score regression analysis (Zheng et al., 2017). The analysis involved 1,293,150 unique SNPs with an allele frequency higher than 0.01. We found that total genetic heritability (H²) of ΔAge was ∼11% (0.108+/-0.009) for females and ∼10% (0.096+/-0.008) for males, which is similar to the genetic heritability estimated previously for human longevity (Graham Ruby et al., 2018; Melzer et al., 2020). This may be because the variation in genetic diversity is not substantial or because existing alleles of critical longevity genes do not have significant effect sizes in this human population.
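For orientation, the standard LD score regression model relates the expected association statistic of each SNP to its LD score. The symbols below are notation assumed here rather than taken from the text: χ²_j is the association statistic of SNP j, ℓ_j its LD score, N the sample size, M the number of SNPs, h² the SNP heritability (written H² above), and a the contribution of confounding such as population stratification.

```latex
\mathbb{E}\left[\chi^2_j \mid \ell_j\right] = \frac{N h^2}{M}\,\ell_j + N a + 1
```

Under this model the slope estimates heritability, while the intercept (Na + 1) captures inflation that does not scale with LD. Read this way, the reported intercepts of about 1.02 to 1.03, together with λ_gc of about 1.2, would be consistent with most of the inflation reflecting polygenic signal rather than confounding.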

GWAS signatures that correlate with the ΔAge GWAS

Another way to infer the biological meaning of ΔAge is to compare the GWAS signatures (Manhattan Plots) of ΔAge to GWAS signatures of other traits in public databases (Zheng et al., 2017). We found that the genetic signatures of some of the components used to calculate ΔAge were correlated to the genetic signature of ΔAge itself (Fig. 4e). For example, GWAS of Forced Vital Capacity (FVC) had a correlation with ΔAge GWAS of 0.49 +/- 0.02 (p-value=5*10^-65). In fact, remarkably, the most similar GWASs together spanned multiple organ systems (pulmonary, cardiovascular, musculature, cognition), arguing that this “aging” GWAS integrates the health of multiple organ systems.

In contrast, GWAS signatures of certain physiological parameters, such as blood creatinine levels, which were explicitly used in ΔAge derivation, had no genetic correlation with ΔAge (0.1+/-0.07, p-value=0.1). It is possible that traits whose GWAS signatures genetically correlate with the GWAS signature of ΔAge are drivers of aging, while traits with uncorrelated GWAS signatures are simply biomarkers. Certain metabolic parameters have been correlated to mortality in previous studies (Deelen et al., 2019), but it has been an open question whether those metabolites have a causal relationship to aging and mortality.

It is interesting to note that the genetic signature of ΔAge has a strong similarity to the genetic signature obtained through GWAS for "Father’s age at death" and "Mother’s age at death" (fig. 4e). This correlation was present even though the mortalities of subjects or parents were not part of the model and were not considered throughout the analysis. The genetic correlation of GWAS for parental age at death with GWAS for offspring’s ΔAge was 0.39±0.03, p-value=1*10^-7 for females and 0.2±0.05, p-value=3*10^-5 for males. These GWAS correlations further demonstrate that ΔAge carries information about aging and longevity, despite its values being derived from cross-sectional physiological data and being independent of lifespan.

Gene ontology highlights a neuronal influence on biological age

To investigate whether specific pathways or systems have an influence on biological age, we performed GeneOntology analysis of extended GWAS hits (combined male and female genetic loci identified by the closest ORF). Five enriched pathways were identified in this analysis (Supp Item #2). Unexpectedly, the top enriched category (GO:0098815) was modulation of excitatory postsynaptic potential, enriched ∼18 fold over the expected by-chance reference, with a multiple-testing-adjusted p-value of 0.046. This category was exceptional, as the second-best category was enriched only ∼3 fold (response to oxygen-containing compounds). This GO category comprised multiple genes influencing synaptic function (Supp Item #2), suggesting that the nervous system plays a particularly important role in aging systemically. Like the vasculature, the sympathetic nervous system impacts the function of many peripheral organs, and synapse function plays a critical role in the function and the maintenance of the CNS. Hints of such an association have come from genetics studies of worms (Apfeld and Kenyon, 1999; Li et al., 2016), flies (Libert et al., 2007) and laboratory rodents (Garratt et al., 2022; Zullo et al., 2019).
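Fold enrichment of a GO category is commonly assessed with a one-sided hypergeometric (Fisher-type) test; the text does not state which tool or test was used, so the following is only a sketch of that common approach. The function name `enrichment` and all gene counts are hypothetical, chosen to show how an enrichment of about 18-fold could arise, and the raw p-value would still need adjustment for the number of categories tested.

```python
from math import comb

def enrichment(hits_in_category, n_hits, category_size, n_background):
    """Fold enrichment and one-sided hypergeometric p-value for one GO category."""
    expected = n_hits * category_size / n_background
    fold = hits_in_category / expected
    upper = min(n_hits, category_size)
    total = comb(n_background, n_hits)
    # P(X >= hits_in_category) when n_hits genes are drawn without replacement.
    tail = sum(
        comb(category_size, k) * comb(n_background - category_size, n_hits - k)
        for k in range(hits_in_category, upper + 1)
    )
    return fold, tail / total

# Hypothetical counts (not the study's data): 4 of 100 hit genes fall in a
# 45-gene category, against a background of 20,000 genes.
fold, p_raw = enrichment(4, 100, 45, 20000)
```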

Cluster-dropout analysis enriches for GWAS hits that influence aging globally

If a GWAS hit influences aging itself, reflecting the function of all the organs and physiological systems, the association between the SNP and ΔAge should not disappear if any one measurement is omitted from the model. Thus, we investigated the robustness of the GWAS hits in a systematic way, using what we term “Cluster Dropout Models”. Specifically, we constructed a series of male and female models to predict ΔAge by systematically excluding small sets of highly correlated phenotypic clusters. We built 10 models, in which phenotypic clusters related to muscle (drop-out model 1), body composition (2), kidney health (3), cardio health (4), blood cell composition (5), blood biochemistry (6), neuro-psychiatric phenotypes (7), lipid metabolism (8), physical attributes (9), or general health (10) were excluded. The list of phenotypes belonging to each cluster is reported in Supplementary Table 2 and was guided by the clustering presented in Figures 2a, b, g, and h. As expected, the ΔAge values remained consistent among all the drop-out models (Figure 5a). This means that if a person was predicted to be ∼x years younger or older than their chronological age, this prediction was approximately the same regardless of the phenotypic clusters omitted.

A systematic evaluation of Cluster-Dropout models can suggest which of the genetic hits from our original full-model GWAS are likely to influence organismal aging and which are linked to a narrower phenotype. To perform this analysis, we took the best SNP from each candidate GWAS locus from the full male or female analysis (above) and tested its association with ΔAge computed using each of the 10 drop-out models. The bubble-plots in figure 5b represent the effect size of each of these SNPs (via bubble size), and the associated p-value (via color).

As predicted, some GWAS hits disappeared in certain drop-out models. A particularly informative gene was CST3. CST3 encodes cystatin-C, a metabolite whose concentration increases with age. Levels of cystatin-C are routinely used to evaluate kidney health, and it has been proposed as a marker in the human aging study “TAME” (Justice et al., 2018). Elevated levels of this metabolite have been linked to elevated risk of cardiovascular disease (CVD) (van der Laan et al., 2016), risk of cancer (Jones et al., 2017), and neurodegeneration (Kaur and Levy, 2012). However, a Mendelian Randomization Study (van der Laan et al., 2016) showed that while levels of cystatin-C predict CVD well, SNPs that robustly alter expression of cystatin-C do not associate with CVD.

In the full model, CST3 had the most significant association with ΔAge (effect size >0.4, p-value<10^-80) in both males and females, as represented by its large red bubble. This association remained significant in all the dropout models except dropout number 3 (kidney health clusters), which contains the concentration of the CST3 gene product, cystatin-C, one of the UKBB phenotypes used to generate the model. When the kidney clusters were omitted, the effect size of the CST3 association decreased to less than 0.1, with a p-value of ∼0.1, which is represented by the small black bubble. Likewise, if we calculated ΔAge using all the inputs in the full model except for “cystatin-C levels”, the CST3 locus was no longer associated with ΔAge. Combined, these data suggest that cystatin-C is a “marker” rather than a driver or determinant of aging. In contrast, some GWAS hits never dropped out, and these remained candidates for fundamental determinants of physiological ΔAge.
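The logic of reading the dropout results can be sketched as a simple rule: a locus whose association survives every dropout model is a candidate global determinant, whereas one that loses significance in a specific model depends on that cluster. The sketch below is illustrative only; the function name `classify_locus`, the reuse of the 10^-9 threshold for this follow-up test, and all p-values other than the CST3 pattern described above are assumptions.

```python
def classify_locus(full_model_p, dropout_p, threshold=1e-9):
    """Label a locus by whether its association survives every cluster-dropout model."""
    if full_model_p >= threshold:
        return "not significant in full model", []
    # Dropout models (keyed by model number) in which the association is lost.
    lost_in = [model for model, p in sorted(dropout_p.items()) if p >= threshold]
    label = "cluster-dependent" if lost_in else "robust across dropouts"
    return label, lost_in

# Illustrative p-values only. The CST3 pattern follows the text (p < 10^-80 in the
# full model, about 0.1 when kidney cluster 3 is dropped); other values are invented.
cst3 = {m: 1e-60 for m in range(1, 11)}
cst3[3] = 0.1
robust_locus = {m: 1e-20 for m in range(1, 11)}

cst3_result = classify_locus(1e-80, cst3)
robust_result = classify_locus(1e-30, robust_locus)
```

With these inputs, the CST3-like locus is flagged as dependent on dropout model 3, while the second locus is labeled robust.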

To definitively distinguish whether a gene is a driver or a marker of aging, an experiment would need to be performed. It is possible that certain gene activities are influenced by existing FDA-approved medications, and retrospective analyses of human cohorts who take certain medications can be performed. More likely, however, an animal model would need to be employed, where animals with candidate genes modified via genetic means are investigated for lifespan and onset and progression of age-associated conditions. For example, one can engineer a mouse with a conditional allele of Cystatin-C and evaluate how changes in dosage of this protein influence various phenotypes of aging.

In the same way, cluster-dropout models can be used to interrogate environmental factors. For example, as described above, computer gaming correlates with a youthful biological age (Figures 3i,j, Supp. Item 1). The natural question is whether specific physiological phenotypes, such as stimulus-reaction time or pattern recognition, drive this correlation, or whether it reflects a “whole-body” biological age. To answer this question specifically, as well as to investigate all the phenotypes systematically, we calculated the strength of the correlation between every UKBB phenotype and all the cluster-dropout models (Fig. 5a) in both males and females (presented in supplementary table 5). To account for multiple testing, the Bonferroni-corrected threshold of significance was 7*10^-7. The correlation between biological age and computer gaming remained significant across all the models tested in both males and females, suggesting that no single phenotype is responsible for this correlation. Such robustness of association was true for most phenotypes, but not all. For example, particulate air pollution (pm10) is associated with older biological age (p-value=1.6*10^-9 for females); however, if the model omits the cluster containing lung parameters, such as FEV, the correlation drops below Bonferroni-corrected statistical significance (p-value=5*10^-3 for females). This might suggest that particulate pollution mostly affects pulmonary health and, to a lesser extent, global organismal aging.

Another interesting observation is that the degree to which a given cluster contributes to the model does not necessarily correlate with how much that cluster contributes to the genetic signature of human aging. For example, while dropping out cluster 10 (General Health) resulted in quite significant changes in the ΔAge distribution (R²=0.88), the genetic signature as determined by GWAS did not change substantially; for example, it changed less than it did upon removal of cluster 1 (muscle-related). The most likely explanation is that many parameters in the General Health category are influenced by environment more strongly than by genetics.

One must keep in mind the caveats and complexity of comparing correlations of different phenotypes to each other, yet this dataset provides a good starting point for possible investigations of environmental factors influencing human aging.

Discussion

General Caveats

Our study has several caveats. We used a cross-sectional dataset, where different ages are represented by people born at different times. Therefore, there is likely a “cohort effect” in some or all predictors we use. Additionally, our model assumes that the rate of aging is constant for each individual, which is not always true. For example, a person’s aging rate may change if they stop smoking. Despite these modelling assumptions, we believe that the final results are valid and generalizable and allow us to suggest new methods to measure physiological aging in humans and identify new targets to slow down human aging. The robustness of our modelling can also be assessed by considering a small number of UKBB participants (∼13,000 out of ∼500,000), who have been assessed twice, with the follow-up intervals ranging from 4 to 12 years. We observed a significant correlation (R²∼0.6, p-value<10^-255) between the differences between biological and chronological age (ΔAge) measured for these individuals at their two assessments. This suggests that variation due to noise is not large. We also found that there is a significant correlation between longitudinally calculated rates of aging (change in biological age divided by assessment interval) and the values calculated using the cross-sectional approach.
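The longitudinal rate of aging mentioned above is simply the change in predicted biological age divided by the time between assessments. A minimal sketch follows; the function name and the example participant are hypothetical.

```python
def longitudinal_aging_rate(bio_age_first, bio_age_second, interval_years):
    """Change in predicted biological age per calendar year between two visits."""
    if interval_years <= 0:
        raise ValueError("interval_years must be positive")
    return (bio_age_second - bio_age_first) / interval_years

# Hypothetical participant: predicted biological age 52.0 at the first visit
# and 61.0 at a second visit 8 years later.
rate = longitudinal_aging_rate(52.0, 61.0, 8.0)  # 1.125 biological years per year
```

A rate above 1 would indicate faster-than-average aging over the interval, which is the quantity compared against the cross-sectional estimate.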

Furthermore, to minimize the cohort effect in our genetic analysis, we used the year of birth as a covariate. Together with the correlations we observed between ΔAge and mortality, parental mortality, previous GWAS longevity hits, and GWAS Manhattan plot comparisons, these findings suggest that the method we describe is a feasible approach to measure an individual’s rate of aging and to identify genetic and environmental factors that may influence it.

Broader implications of the model for physiological aging

How a general term like “aging” maps onto age-dependent physiological traits is a deep question that may never be answered with great precision. In general, biological clocks can be used to identify new genes and environmental factors that influence aging, as we did here using this physiological clock. In addition, one can “look into the clock” itself to gain additional insights. First, we found that this mathematical model could best predict chronological age when all the different organ and physiological systems were sampled, emphasizing the systemic nature of aging. If the phenotypes associated with chronological age resulted from the decline of only one or a few organs, this would not be the case. Second, the model showed how different physiological traits co-vary and cluster in the population. Some correlations, such as that between vitamin D and sleep duration, are not immediately intuitive; however, on post-hoc examination such an association can be explained in the light of previous medical research. We anticipate that exploring analogous non-intuitive clusters that cannot be explained currently may provide a new understanding of causal relationships. Third, the use of cluster-dropout models provided a powerful tool for distinguishing individual genes and environmental factors that impact a specific physiological function from those that might affect all aspects of aging.

Many of the genes we identified are consistent hits in longevity GWAS analysis. Intuitively, this would be expected since aging is a risk factor for death. However, our model allows one to dig deeper and ask whether a longevity GWAS locus might be identified only because alleles prevent people from reaching an extremely old age. One could imagine that this is the case for APOE, since APOE4 individuals generally die prematurely of Alzheimer’s disease (Olichney et al., 1997; Wright et al., 2019). However, we find that at all ages, even as young as 40 years, APOE genotype influences ΔAge (fig. 4f), perhaps due to its more general effects on lipid homeostasis (Abondio et al., 2019) or inflammation.

To gain a deeper understanding of the genetic signature of ΔAge, it might be prudent to consider genetic loci that have a strong association with ΔAge (say p-value<10^-6), even though they do not reach the threshold for genome-wide statistical significance. While some of the loci in this expanded list may be false positives, many of the potential genetic determinants identified this way are of potential interest. The full list of associations is available as a supplementary summary statistics table.

To further analyze the meaning of the genetic associations with ΔAge described above, we compared several published GWAS results obtained for human aging clocks using different health modalities. Specifically, we looked at GWAS for “Epigenetic Blood Age Acceleration” (Lu et al., 2018), an ML-imaging-based human retinal aging clock (Ahadi et al., 2023), PhenoAgeAcceleration and BioAgeAcceleration (Kuo et al., 2021), and the ΔAge GWAS presented in our manuscript. Surprisingly, we discovered that there is no overlap between GWAS results for any two of these clocks built via different modalities – retina, DNA methylation, or physiological functions. However, there is a significant genetic overlap between clocks built using human phenotypic measures and the ΔAge model we describe. For example, the Biological Age Clock Acceleration calculated using HbA1c, Albumin, Cholesterol, FEV, Urea nitrogen, SBP, and Creatinine (Levine, 2013) in a US cohort [from the National Health and Nutrition Examination Survey (NHANES)] yielded 16 significant hits in the GWAS analysis, five of which were also significant in our GWAS for UKBB-based ΔAge. These five common loci were close to the following genes: APOB, PIK3CG, TRIB1, SMARCA4, and APOE. The significance of this overlap is p < 10^-8, suggesting that the ΔAge model we propose might be translatable to other cohorts of people.

An interesting question to consider is why GWAS results from other clock modalities, such as DNA methylation and retinal imaging, do not yield any genetic similarities to each other or to physiological and biological clocks. It is possible that these modalities of age assessment depend on completely genetically independent biological processes. For example, in simplified terms, blood composition might be heavily weighted for DNA methylation, vascular structure for retinal scans, and muscle/bone/kidney health for physiological clocks.

Data from model organisms suggest that master regulators of aging exist, and APOE is the best-known genetic variant that influences human aging. Interestingly, only the biological and physiological clock models that we propose here pick it up as a hit. Alternatively, it is also possible that the true master regulators of aging rate are under stringent purifying selection (for example, due to an important role in development) and therefore do not have genetic variability in the human populations examined. As such, they could not be identified as hits in any GWAS.

When analyzing the many phenotypes that predict aging using PLS modelling, we discovered that only 9-11 axes are necessary to predict age. This suggests that there might be only ∼10 independent systems (physiological networks) driving human aging. Interestingly, although overall the traits that figure most prominently into the sum of the principal components tend to map onto individual phenotypic clusters (“Dropout Clusters”), together the “meaning” of the differentially weighted sets that comprise each principal component is not obvious. For example, the 10 Dropout Clusters we used are not representative of the 10 axes identified in PLS analysis. It would be interesting to understand the physiological significance of each axis to better understand the process of aging. That said, the two most valuable phenotypes used in our study (those that had, overall, the most weight in age prediction) were forced vital capacity and blood pressure. Moreover, the genetic signature of ΔAge was similar to the genetic signature for FVC and blood pressure (figure 4e). These phenotypes are integrated, multidimensional health measurements. Using genetic information to better understand age-related phenotypes through PLS axis decomposition might be a fruitful direction for future research.

It is interesting to note that the three approaches we used to generate the age prediction model (PLS, GBM, and linear regression) yielded very similar or identical performance. We chose to settle on one approach (PLS) so as not to artificially inflate the False Discovery Rate (FDR); however, we verified that the top genetic loci associations obtained via the PLS model were also obtained in the GBM and linear models. Specifically, the top candidates (CST3, APOE, HLA locus, CPS1, PIK3CG, IGF1) identified in the PLS approach had statistically significant associations in all the models of ΔAge. It is likely that, due to the small number of predictors (121) compared to a vastly larger number of individuals (over 400,000), the differences that these models introduce to final outcomes are quite small, which increases our confidence in the results.

Finally, from a practical perspective, we suggest that measuring human biological age using the 12 simple but diverse physiological measurements that together capture ∼87% of the full ΔAge model (systolic blood pressure, forced expiratory volume, and so forth; see Fig. 2), might have actuarial and clinical value. For example, this physiological-age index could be measured longitudinally to learn how aging trajectories might be affected by environmental factors or anti-aging therapeutics.

Author Contributions

S.L. and C.K. conceived the idea for the study and its design, S.L. performed modelling and calculations, S.L. and C.K. wrote the manuscript, A.C. maintained and troubleshot computing clusters and scripts necessary for data acquisition and GWAS.

Acknowledgements

Authors are grateful to the Calico community for support and discussions. Specifically, we are grateful to Eugene Melamud and his group for their support of the UKBB data framework, help with data processing and analysis, and numerous constructive discussions; we are grateful to Aarif Khakoo for his insights into disease-vs-aging paradigms; Kevin Wright and Graham Ruby for their discernment of human genetics of aging; and to David Botstein for his insights into statistical interpretation of our computational results and future applicability of such analysis. We are grateful to Madeleine Cule for supporting Calico’s UKBB data interface and GWAS cluster maintenance and for guidance with GWAS methodology. We are grateful to Amoolya Singh for providing guidance in regression modelling, statistical analysis and interpretation of the data. The authors are grateful to Jonathan K. Pritchard for constructive discussions about modelling, genetic analysis, and results interpretation. This study was carried out using UK Biobank Application number 44584, and we thank the participants in the UK Biobank study. This study was funded by Calico Life Sciences LLC.

Additional files

Supplementary data items and methods

Supplementary Tables 1-5

References

1. Abdellaoui A, Hugh-Jones D, Yengo L, Kemper KE, Nivard MG, Veul L, Holtz Y, Zietsch BP, Frayling TM, Wray NR, Yang J, Verweij KJH, Visscher PM. 2019. Genetic correlates of social stratification in Great Britain. Nature Human Behaviour. https://doi.org/10.1038/s41562-019-0757-5
2. Abel EV, Goto M, Magnuson B, Abraham S, Ramanathan N, Hotaling E, Alaniz AA, Kumar-Sinha C, Dziubinski ML, Urs S, Wang L, Shi J, Waghray M, Ljungman M, Crawford HC, Simeone DM. 2018. HNF1A is a novel oncogene that regulates human pancreatic cancer stem cell properties. eLife 7. https://doi.org/10.7554/eLife.33947
3. Abondio P, Sazzini M, Garagnani P, Boattini A, Monti D, Franceschi C, Luiselli D, Giuliani C. 2019. The genetic variability of APOE in different human populations and its implications for longevity. Genes (Basel). https://doi.org/10.3390/genes10030222
4. Ahadi S, Wilson KA, Babenko B, McLean CY, Bryant D, Pritchard O, Kumar A, Carrera EM, Lamy R, Stewart JM, Varadarajan A, Berndl M, Kapahi P, Bashir A. 2023. Longitudinal fundus imaging and its genome-wide association analysis provide evidence for a human retinal aging clock. eLife 12:1–28. https://doi.org/10.7554/ELIFE.82364
5. Apfeld J, Kenyon C. 1999. Regulation of lifespan by sensory perception in Caenorhabditis elegans. Nature 402:804–809. https://doi.org/10.1038/45544
6. Bae H, Gurinovich A, Karagiannis TT, Song Z, Leshchyk A, Li M, Andersen SL, Arbeev K, Yashin A, Zmuda J, An P, Feitosa M, Giuliani C, Franceschi C, Garagnani P, Mengel-From J, Atzmon G, Barzilai N, Puca A, Schork NJ, Perls TT, Sebastiani P. 2022. A Genome-Wide Association Study of 2304 Extreme Longevity Cases Identifies Novel Longevity Variants. Int J Mol Sci 24. https://doi.org/10.3390/IJMS24010116
7. Weiyang Chen, Qian W, Wu G, Weizhong Chen, Xian B, Chen X, Cao Y, Green CD, Zhao F, Tang K, Han JDJ. 2015. Three-dimensional human facial morphologies as robust aging markers. Cell Res 25:574–587. https://doi.org/10.1038/cr.2015.36
8. Chen Z, Boehnke M, Wen X, Mukherjee B. 2021. Revisiting the genome-wide significance threshold for common variant GWAS. G3 Genes|Genomes|Genetics 11. https://doi.org/10.1093/G3JOURNAL/JKAA056
9. Chen Z, Raj A, Prateek GV, Di Francesco A, Liu J, Keyes BE, Kolumam G, Jojic V, Freund A. 2022. Automated, high-dimensional evaluation of physiological aging and resilience in outbred mice. eLife 11. https://doi.org/10.7554/ELIFE.72664
10. Chetty R, Stepner M, Abraham S, Lin S, Scuderi B, Turner N, Bergeron A, Cutler D. 2016. The association between income and life expectancy in the United States, 2001-2014. JAMA - Journal of the American Medical Association 315:1750–1766. https://doi.org/10.1001/jama.2016.4226
11. Coenen L, Lehallier B, de Vries HE, Middeldorp J. 2023. Markers of aging: Unsupervised integrated analyses of the human plasma proteome. Frontiers in Aging 4. https://doi.org/10.3389/FRAGI.2023.1112109
12. Deelen J, Kettunen J, Fischer K, van der Spek A, Trompet S, Kastenmüller G, Boyd A, Zierer J, van den Akker EB, Ala-Korpela M, Amin N, Demirkan A, Ghanbari M, van Heemst D, Ikram MA, van Klinken JB, Mooijaart SP, Peters A, Salomaa V, Sattar N, Spector TD, Tiemeier H, Verhoeven A, Waldenberger M, Würtz P, Davey Smith G, Metspalu A, Perola M, Menni C, Geleijnse JM, Drenos F, Beekman M, Jukema JW, van Duijn CM, Slagboom PE. 2019. A metabolic profile of all-cause mortality risk identified in an observational study of 44,168 individuals. Nat Commun 10:1–8. https://doi.org/10.1038/s41467-019-11311-9
13. Garratt M, Erturk I, Alonzo R, Zufall F, Leinders-Zufall T, Pletcher SD, Miller RA. 2022. Lifespan extension in female mice by early, transient exposure to adult female olfactory cues. eLife 11. https://doi.org/10.7554/ELIFE.84060
14. Gibson J, Russ TC, Clarke TK, Howard DM, Hillary RF, Evans KL, Walker RM, Bermingham ML, Morris SW, Campbell A, Hayward C, Murray AD, Porteous DJ, Horvath S, Lu AT, McIntosh AM, Whalley HC, Marioni RE. 2019. A meta-analysis of genome-wide association studies of epigenetic age acceleration. PLoS Genet 15:e1008104. https://doi.org/10.1371/JOURNAL.PGEN.1008104
15. Ruby JG, Wright KM, Rand KA, Kermany A, Noto K, Curtis D, Varner N, Garrigan D, Slinkov D, Dorfman I, Granka JM, Byrnes J, Myres N, Ball C. 2018. Estimates of the heritability of human longevity are substantially inflated due to assortative mating. Genetics 210:1109–1124. https://doi.org/10.1534/genetics.118.301613
16.
1. Id P Hanlon
2. Jani BD
3. Nicholl Id B
4. Lewsey J
5. Mcallister Id DA
6. Mair Id FS
2022Associations between multimorbidity and adverse health outcomes in UK Biobank and the SAIL Databank: A comparison of longitudinal cohort studiesPLOS Medicine https://doi.org/10.1371/journal.pmed.1003931 Google Scholar
17.
1. Haworth S
2. Mitchell R
3. Corbin L
4. Wade KH
5. Dudding T
6. Budu-Aggrey A
7. Carslake D
8. Hemani G
9. Paternoster L
10. Smith GD
11. Davies N
12. Lawson DJ
13. Timpson N J.
2019Apparent latent structure within the UK Biobank sample has implications for epidemiological analysisNat Commun 10https://doi.org/10.1038/S41467-018-08219-1 Google Scholar
18.
1. Jones M
2. Denieffe S
3. Griffin C
4. Tinago W
5. Fitzgibbon MC
2017Evaluation of cystatin C in malignancy and comparability of estimates of GFR in oncology patientsPract Lab Med 8:95–104https://doi.org/10.1016/j.plabm.2017.05.005 Google Scholar
19.
1. Joshi PK.
2017Genome-wide meta-analysis associates HLA-DQA1/DRB1 and LPA and lifestyle factors with human longevityNature Communications https://doi.org/10.1038/s41467-017-00934-5 Google Scholar
20.
1. Justice JN
2. Ferrucci L
3. Newman AB
4. Aroda VR
5. Bahnson JL
6. Divers J
7. Espeland MA
8. Marcovina S
9. Pollak MN
10. Kritchevsky SB
11. Barzilai N
12. Kuchel GA
2018A framework for selection of blood- based biomarkers for geroscience-guided clinical trials: report from the TAME Biomarkers WorkgroupGeroscience 40:419–436https://doi.org/10.1007/S11357-018-0042-Y Google Scholar
21.
1. Kaur G
2. Levy E
2012Cystatin C in Alzheimer’s diseaseFront Mol Neurosci https://doi.org/10.3389/fnmol.2012.00079 Google Scholar
22.
1. Kirkwood TBL
2015Deciphering death: a commentary on Gompertz (1825) ‘On the nature of the function expressive of the law of human mortality, and on a new mode of determining the value of life contingenciesPhilosophical Transactions of the Royal Society B: Biological Sciences 370:20140379https://doi.org/10.1098/rstb.2014.0379 Google Scholar
23.
1. Kuo CL
2. Pilling LC
3. Liu Z
4. Atkins JL
5. Levine ME
2021Genetic associations for two biological age measures point to distinct aging phenotypesAging Cell 20:e13376https://doi.org/10.1111/ACEL.13376 Google Scholar
24.
1. Li LB
2. Lei H
3. Arey RN
4. Li P
5. Liu J
6. Murphy CT
7. Xu XZS
8. Shen K
2016The neuronal kinesin UNC- 104/KIF1A is a key regulator of synaptic aging and insulin signaling-regulated memoryCurr Biol 26:605https://doi.org/10.1016/J.CUB.2015.12.068 Google Scholar
25.
1. Libert S
2. Zwiener J
3. Chu X
4. VanVoorhies W
5. Roman G
6. Pletcher SD
2007Regulation of Drosophila life span by olfaction and food-derived odorsScience 315:1133–1137https://doi.org/10.1126/SCIENCE.1136610 Google Scholar
26.
1. Lu AT
2. Xue L
3. Salfati EL
4. Chen BH
5. Ferrucci L
6. Levy D
7. Joehanes R
8. Murabito JM
9. Kiel DP
10. Tsai PC
11. Yet I
12. Bell JT
13. Mangino M
14. Tanaka T
15. McRae AF
16. Marioni RE
17. Visscher PM
18. Wray NR
19. Deary IJ
20. Levine ME
21. Quach A
22. Assimes T
23. Tsao PS
24. Absher D
25. Stewart JD
26. Li Y
27. Reiner AP
28. Hou L
29. Baccarelli AA
30. Whitsel EA
31. Aviv A
32. Cardona A
33. Day FR
34. Wareham NJ
35. Perry JRB
36. Ong KK
37. Raj K
38. Lunetta KL
39. Horvath S
2018GWAS of epigenetic aging rates in blood reveals a critical role for TERTNature Communications 9:1–13https://doi.org/10.1038/s41467-017-02697-5 Google Scholar
27.
1. Marioni RE
2. Shah S
3. McRae AF
4. Chen BH
5. Colicino E
6. Harris SE
7. Gibson J
8. Henders AK
9. Redmond P
10. Cox SR
11. Pattie A
12. Corley J
13. Murphy L
14. Martin NG
15. Montgomery GW
16. Feinberg AP
17. Fallin MD
18. Multhaup ML
19. Jaffe AE
20. Joehanes R
21. Schwartz J
22. Just AC
23. Lunetta KL
24. Murabito JM
25. Starr JM
26. Horvath S
27. Baccarelli AA
28. Levy D
29. Visscher PM
30. Wray NR
31. Deary IJ
2015DNA methylation age of blood predicts all-cause mortality in later lifeGenome Biol 16:25https://doi.org/10.1186/s13059-015-0584-6 Google Scholar
28.
1. McCartney DL
2. Min JL
3. Richmond RC
4. Lu AT
5. Sobczyk MK
6. Davies G
7. Broer L
8. Guo X
9. Jeong A
10. Jung J
11. Kasela S
12. Katrinli S
13. Kuo PL
14. Matias-Garcia PR
15. Mishra PP
16. Nygaard M
17. Palviainen T
18. Patki A
19. Raffield LM
20. Ratliff SM
21. Richardson TG
22. Robinson O
23. Soerensen M
24. Sun D
25. Tsai PC
26. van der Zee MD
27. Walker RM
28. Wang X
29. Wang Y
30. Xia R
31. Xu Z
32. Yao J
33. Zhao W
34. Correa A
35. Boerwinkle E
36. Dugué PA
37. Durda P
38. Elliott HR
39. Gieger C
40. de Geus EJC
41. Harris SE
42. Hemani G
43. Imboden M
44. Kähönen M
45. Kardia SLR
46. Kresovich JK
47. Li S
48. Lunetta KL
49. Mangino M
50. Mason D
51. McIntosh AM
52. Mengel-From J
53. Moore AZ
54. Murabito JM
55. Ollikainen M
56. Pankow JS
57. Pedersen NL
58. Peters A
59. Polidoro S
60. Porteous DJ
61. Raitakari O
62. Rich SS
63. Sandler DP
64. Sillanpää E
65. Smith AK
66. Southey MC
67. Strauch K
68. Tiwari H
69. Tanaka T
70. Tillin T
71. Uitterlinden AG
72. Van Den Berg DJ
73. van Dongen J
74. Wilson JG
75. Wright J
76. Yet I
77. Arnett D
78. Bandinelli S
79. Bell JT
80. Binder AM
81. Boomsma DI
82. Chen W
83. Christensen K
84. Conneely KN
85. Elliott P
86. Ferrucci L
87. Fornage M
88. Hägg S
89. Hayward C
90. Irvin M
91. Kaprio J
92. Lawlor DA
93. Lehtimäki T
94. Lohoff FW
95. Milani L
96. Milne RL
97. Probst-Hensch N
98. Reiner AP
99. Ritz B
100. Rotter JI
101. Smith JA
102. Taylor JA
103. van Meurs JBJ
104. Vineis P
105. Waldenberger M
106. Deary IJ
107. Relton CL
108. Horvath S
109. Marioni RE
2021Genome-wide association studies identify 137 genetic loci for DNA methylation biomarkers of agingGenome Biol 22:1–25https://doi.org/10.1186/S13059-021-02398-9/FIGURES/4 Google Scholar
29.
1. Melzer D
2. Pilling LC
3. Ferrucci L
2020The genetics of human ageingNat Rev Genet https://doi.org/10.1038/s41576-019-0183-6 Google Scholar
30.
1. Milman S
2. Barzilai N
2016Dissecting the mechanisms underlying unusually successful human health span and life spanCold Spring Harb Perspect Med 6https://doi.org/10.1101/cshperspect.a025098 Google Scholar
31.
1. Min KJ
2. Lee CK
3. Park HN
2012The lifespan of Korean eunuchsCurrent Biology https://doi.org/10.1016/j.cub.2012.06.036 Google Scholar
32.
1. Olichney JM
2. Sabbagh MN
3. Hofstetter CR
4. Galasko D
5. Grundman M
6. Katzman R
7. Thal LJ
1997The impact of apolipoprotein E4 on cause of death in Alzheimer’s diseaseNeurology 49:76–81https://doi.org/10.1212/WNL.49.1.76 Google Scholar
33.
1. Pilling LC
2. Kuo CL
3. Sicinski K
4. Tamosauskaite J
5. Kuchel GA
6. Harries LW
7. Herd P
8. Wallace R
9. Ferrucci L
10. Melzer D
2017Human longevity: 25 genetic loci associated in 389,166 UK biobank participantsAging 9:2504–2520https://doi.org/10.18632/aging.101334 Google Scholar
34.
1. Qiu W
2. Chen H
3. Kaeberlein M
4. Lee S-I
5. Allen PG
2022An explainable AI framework for interpretable biological agemedRxiv https://doi.org/10.1101/2022.10.05.22280735 Google Scholar
35.
1. Schächter F
2. Faure-Delanef L
3. Guénot F
4. Rouger H
5. Froguel P
6. Lesueur-Ginot L
7. Cohen D
1994Genetic associations with human longevity at the APOE and ACE lociNature Genetics 6:29–32https://doi.org/10.1038/ng0194-29 Google Scholar
36.
1. Sebastiani P
2. Gurinovich A
3. Nygaard M
4. Sasaki T
5. Sweigart B
6. Bae H
7. Andersen SL
8. Villa F
9. Atzmon G
10. Christensen K
11. Arai Y
12. Barzilai N
13. Puca A
14. Christiansen L
15. Hirose N
16. Perls TT
2019APOE Alleles and Extreme Human LongevityThe Journals of Gerontology: Series A 74:44–51https://doi.org/10.1093/GERONA/GLY174 Google Scholar
37.
1. Shen S
2. Li C
3. Xiao L
4. Wang X
5. Lv H
6. Shi Y
7. Li Y
8. Huang Q
2020Whole-genome sequencing of Chinese centenarians reveals important genetic variants in aging WGS of centenarian for genetic analysis of agingHum Genomics 14https://doi.org/10.1186/S40246-020-00271-7 Google Scholar
38.
1. Shepherd M
2. Shields B
3. Ellard S
4. Rubio-Cabezas O
5. Hattersley AT
2009A genetic diagnosis of HNF1A diabetes alters treatment and improves glycaemic control in the majority of insulin- treated patientsDiabetic Medicine 26:437–441https://doi.org/10.1111/j.1464-5491.2009.02690.x Google Scholar
39.
1. Tian YE
2. Cropley V
3. Maier AB
4. Lautenschlager NT
5. Breakspear M
6. Zalesky A
2023Heterogeneous aging across multiple organ systems and prediction of chronic disease and mortalityNature Medicine 29:1221–1231https://doi.org/10.1038/s41591-023-02296-6 Google Scholar
40.
1. Timmers PRHJ
2. Mounier N
3. Lall K
4. Fischer K
5. Ning Z
6. Feng X
7. Bretherick AD
8. Clark DW
9. Shen X
10. Esko T
11. Kutalik Z
12. Wilson JF
13. Joshi PK
2019Genomics of 1 million parent lifespans implicates novel pathways and common diseases and distinguishes survival chanceseLife 8:1–40https://doi.org/10.7554/ELIFE.39856 Google Scholar
41.
1. Timmers PRHJ
2. Tiys ES
3. Sakaue S
4. Akiyama M
5. Kiiskinen TTJ
6. Zhou W
7. Hwang S-J
8. Yao C
9. Kamatani Y
10. Zhou W
11. Deelen J
12. Levy D
13. Ganna A
14. Kamatani Y
15. Okada Y
16. Joshi PK
17. Wilson JF
18. Tsepilov YA
2022Mendelian randomization of genetically independent aging phenotypes identifies LPA and VCAM1 as biological targets for human agingNature Aging 2:19–30https://doi.org/10.1038/s43587-021-00159-8 Google Scholar
42.
1. Timmers PRHJ
2. Wilson JF
3. Joshi PK
4. Deelen J
2020Multivariate genomic scan implicates novel loci and haem metabolism in human ageingNature Communications 11:1–10https://doi.org/10.1038/s41467-020-17312-3 Google Scholar
43.
1. van der Laan SW
2. Fall T
3. Soumaré A
4. Teumer A
5. Sedaghat S
6. Baumert J
7. Zabaneh D
8. van Setten J
9. Isgum I
10. Galesloot TE
11. Arpegård J
12. Amouyel P
13. Trompet S
14. Waldenberger M
15. Dörr M
16. Magnusson PK
17. Giedraitis V
18. Larsson A
19. Morris AP
20. Felix JF
21. Morrison AC
22. Franceschini N
23. Bis JC
24. Kavousi M
25. O’Donnell C
26. Drenos F
27. Tragante V
28. Munroe PB
29. Malik R
30. Dichgans M
31. Worrall BB
32. Erdmann J
33. Nelson CP
34. Samani NJ
35. Schunkert H
36. Marchini J
37. Patel RS
38. Hingorani AD
39. Lind L
40. Pedersen NL
41. de Graaf J
42. Kiemeney LALM
43. Baumeister SE
44. Franco OH
45. Hofman A
46. Uitterlinden AG
47. Koenig W
48. Meisinger C
49. Peters A
50. Thorand B
51. Jukema JW
52. Eriksen BO
53. Toft I
54. Wilsgaard T
55. Onland-Moret NC
56. van der Schouw YT
57. Debette S
58. Kumari M
59. Svensson P
60. van der Harst P
61. Kivimaki M
62. Keating BJ
63. Sattar N
64. Dehghan A
65. Reiner AP
66. Ingelsson E
67. den Ruijter HM
68. de Bakker PIW
69. Pasterkamp G
70. Ärnlöv J
71. Holmes M V.
72. Asselbergs FW.
2016Cystatin C and Cardiovascular Disease: A Mendelian Randomization StudyJ Am Coll Cardiol 68:934–945https://doi.org/10.1016/j.jacc.2016.05.092 Google Scholar
44.
1. Wang J
2. Geng L
2019Effects of socioeconomic status on physical and psychological health: Lifestyle as a mediatorInt J Environ Res Public Health 16https://doi.org/10.3390/ijerph16020281 Google Scholar
45.
1. Wilmanski T
2. Diener C
3. Rappaport N
4. Patwardhan S
5. Wiedrick J
6. Lapidus J
7. Earls JC
8. Zimmer A
9. Glusman G
10. Robinson M
11. Yurkovich JT
12. Kado DM
13. Cauley JA
14. Zmuda J
15. Lane NE
16. Magis AT
17. Lovejoy JC
18. Hood L
19. Gibbons SM
20. Orwoll ES
21. Price ND
2021Gut microbiome pattern reflects healthy aging and predicts survival in humansNat Metab 3:274https://doi.org/10.1038/S42255-021-00348-0 Google Scholar
46.
1. Wright KM
2. Rand KA
3. Kermany A
4. Noto K
5. Curtis D
6. Garrigan D
7. Slinkov D
8. Dorfman I
9. Granka JM
10. Byrnes J
11. Myres N
12. Ball CA
13. Ruby JG
2019A prospective analysis of genetic variants associated with human lifespanG3: Genes, Genomes, Genetics 9:2863–2878https://doi.org/10.1534/g3.119.400448 Google Scholar
47.
1. Xia X
2. Chen X
3. Wu G
4. Li F
5. Wang Y
6. Chen Y
7. Chen M
8. Wang X
9. Weiyang Chen
10. Xian B
11. Weizhong Chen
12. Cao Y
13. Xu C
14. Gong W
15. Chen G
16. Cai D
17. Wei W
18. Yan Y
19. Liu K
20. Qiao N
21. Zhao X
22. Jia J
23. Wang W
24. Kennedy BK
25. Zhang K
26. Cannistraci C V.
27. Zhou Y
28. Han JDJ
2020Three-dimensional facial-image analysis to predict heterogeneity of the human ageing rate and the impact of lifestyleNat Metab 2:946–957https://doi.org/10.1038/S42255-020-00270-X Google Scholar
48.
1. Xu M
2. Zhao J
3. Zhang Y
4. Ma X
5. Dai Q
6. Zhi H
7. Wang B
8. Wang L
2016Apolipoprotein E Gene Variants and Risk of Coronary Heart Disease: A Meta-AnalysisBiomed Res Int 2016https://doi.org/10.1155/2016/3912175 Google Scholar
49.
1. Yang F
2. Sun L
3. Zhu X
4. Han J
5. Zeng Y
6. Nie C
7. Yuan H
8. Li X
9. Shi X
10. Yang Y
11. Hu C
12. Lv Z
13. Huang Z
14. Zheng C
15. Liang S
16. Huang J
17. Wan G
18. Qi K
19. Qin B
20. Cao S
21. Zhao X
22. Zhang Y
23. Yang Z
2017Identification of new genetic variants of HLA-DQB1 associated with human longevity and lipid homeostasis—a cross-sectional study in a Chinese populationAging (Albany NY 9:2316https://doi.org/10.18632/AGING.101323 Google Scholar
50.
1. Zenin A
2. Tsepilov Y
3. Sharapov S
4. Getmantsev E
5. Menshikov LI
6. Fedichev PO
7. Aulchenko Y.
2019Identification of 12 genetic loci associated with human healthspanCommunications Biology 10https://doi.org/10.1038/s42003-019-0290-0 Google Scholar
51.
1. Zheng J
2. Mesut Erzurumluoglu A
3. Elsworth BL
4. Kemp JP
5. Howe L
6. Haycock PC
7. Hemani G
8. Tansey K
9. Laurin C
10. Warrington NM
11. Finucane HK
12. Price AL
13. Bulik-Sullivan BK
14. Anttila V
15. Paternoster L
16. Gaunt TR
17. Evans DM
18. Neale BM
2017Databases and ontologies LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysisBioinformatics 33:272–279https://doi.org/10.1093/bioinformatics/btw613 Google Scholar
52.
1. Zullo JM
2. Drake D
3. Aron L
4. O’Hern P
5. Dhamne SC
6. Davidsohn N
7. Mao CA
8. Klein WH
9. Rotenberg A
10. Bennett DA
11. Church GM
12. Colaiácovo MP
13. Yankner BA
2019Regulation of lifespan by neural excitation and RESTNature 574:359–364https://doi.org/10.1038/S41586-019-1647-8 Google Scholar

Article and author information

Author information

Sergiy Libert
Calico Life Sciences, South San Francisco, United States
- For correspondence: libert@calicolabs.com
Alex Chekholko
Calico Life Sciences, South San Francisco, United States
Cynthia Kenyon
Calico Life Sciences, South San Francisco, United States
- For correspondence: cynthia@calicolabs.com

Author Notes

Competing interests: All coauthors work for Calico Life Sciences LLC, a pharmaceutical company engaged in understanding the biology of aging and developing therapies that would ameliorate suffering from age-associated diseases.

Version history

Sent for peer review: October 4, 2023
Preprint posted: October 19, 2023
Reviewed Preprint version 1: January 19, 2024
Reviewed Preprint version 2: March 19, 2025
Version of Record published: June 11, 2025

Cite all versions

You can cite all versions using the DOI https://doi.org/10.7554/eLife.92092. This DOI represents all versions, and will always resolve to the latest one.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
