a) Systolic blood pressure (UKBB field ID# 4080) and b) hand-grip strength (UKBB field ID# 47) of a random set of 10,000 female UKBB participants are plotted against their age; c) number of age-sensitive phenotypes plotted against the declining number of people in whom these phenotypes were measured; d, e) sex hormone binding globulin concentrations (UKBB field ID# 30830) of a random set of 10,000 males (d) and females (e); f) average number of lifetime sexual partners (UKBB field ID# 2149) is plotted against the age of UKBB participants. The grey area denotes the 99% confidence interval. The color of the dots represents the relative density of dots in the area.

Age-dependent phenotypic clustering. Dendrogram plots of age-dependent female (a) and male (b) phenotypes selected for age prediction. Numbers in the names of the “rays” represent UKBB ID numbers for multichoice questions (see supplementary table 1 or the UKBB website), followed by the answer. Major clusters were colored and subjectively assigned a name that reflects a possible biological function of the cluster; c, d) the number of principal components included in the PLS (Projection to Latent Structures) model to predict age is plotted against the root mean square error of the predictions for females (c) and males (d); e, f) the phenotypes with the highest weights in the age-predicting PLS model are listed for females (e) and males (f). Phenotypes shaded in green are shared between sexes, red are specific to females, and blue are specific to males. All phenotypes were used for both sexes, and this shading reflects only the position in the list of top-13 traits; g, h) phenotypes used to predict the age of females (g) and males (h), projected onto 2D space using correlation as the distance measure. The degree of correlation is also depicted by grey lines: the darker the shade, the stronger the correlation. Note that the distortion in positioning is an inevitable consequence of projecting high-dimensional data into 2D space. As before, groups of related phenotypes were subjectively assigned a name that likely depicts their physiology, and phenotypes with the highest weight in the PLS model are depicted by red dots.

ΔAge has biological meaning. a) delta-age (ΔAge, predicted biological age minus chronological age) is plotted against chronological age for a random subset of 10,000 UKBB participants. Note that there is no correlation between age and ΔAge; b) histogram of the age distribution (blue) and the death distribution (red, right y-axis) for UKBB males; c) mortality of UKBB male participants is plotted against their age; note the classical exponential (Gompertzian) shape. Blue dots are actual data, the red line is an exponential fit, and the black dashed line is the 95% confidence interval; d) histogram of the ΔAge distribution (blue) and the death distribution (red, right y-axis) for UKBB males of 62 years of age only; e) mortality of 62-year-old males is plotted against their ΔAge. Blue dots are actual data, the red line is an exponential fit, and the black dashed line is the 95% confidence interval. Once again, note the classical exponential (Gompertzian) shape with ΔAge, even though all the subjects are the same age chronologically; f) distribution of ΔAge for all the people in UKBB (all ages and all genders, green shape). The distribution of ΔAge for people who died within 5 years after enrolling in the UKBB (red line) is shown for comparison; note the shift of the deceased distribution to the right, towards larger ΔAge (predicted older on average). The mortality penalty due to ΔAge is plotted as blue dots (left y-axis), the exponential fit of these data is presented as a blue line, and the 99% confidence interval as a grey shade; g, h) average ΔAge is plotted for UKBB males (g) and females (h) against the highest education (qualification) level achieved; i) the fraction of people who play computer games “sometimes” (yellow dots), “never” (red dots), and “often” (green dots); j) average ΔAge of people at different ages, separated by their computer gaming habits (see 3i).
As a group, people who play computer games “often” are biologically younger than people who play computer games “sometimes” or “never”.
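
The ΔAge definition and the exponential (Gompertzian) fits in panels c, e and f can be illustrated with a minimal sketch. It assumes the fit is an ordinary least-squares line through the logarithm of mortality, which the legend does not state; the function names and all numbers are hypothetical and synthetic, not UKBB values.

```python
import math


def delta_age(predicted_age, chronological_age):
    """ΔAge as defined in the legend: predicted biological age minus chronological age."""
    return predicted_age - chronological_age


def fit_exponential(xs, rates):
    """Fit rate = a * exp(b * x) by least squares on log(rate).

    Assumption: log-linear regression stands in for whatever fitting
    procedure was actually used. All rates must be strictly positive.
    Returns the tuple (a, b).
    """
    n = len(xs)
    logs = [math.log(r) for r in rates]
    mean_x = sum(xs) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, logs))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Synthetic data: mortality constructed to double every 8 units of ΔAge.
delta_ages = [-10, -5, 0, 5, 10]
true_a, true_b = 0.01, math.log(2) / 8
mortality = [true_a * math.exp(true_b * d) for d in delta_ages]

a, b = fit_exponential(delta_ages, mortality)
doubling = math.log(2) / b  # ΔAge increment over which the fitted mortality doubles
```

The same function applies unchanged when the x-axis is chronological age (panel c) rather than ΔAge (panels e and f).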

Genetic analysis of ΔAge. a, c) Quantile-Quantile plots of female and male -log10 p-values; b, d) Manhattan plots from genome-wide association analysis of female and male ΔAge; e) correlation of the ΔAge GWAS with other GWASs performed and reported by the UKBB consortium. Note the strong genetic relationship between the GWASs for ΔAge and parental age at death; f) effect of APOE alleles on average ΔAge, plotted across different ages. Beneficial APOE2 alleles are in green, and detrimental APOE4 alleles are in red.

Cluster Dropout Models. a) Correlation between ΔAge calculated using the full set of identified parameters and ΔAge from each of ten dropout models. Note that ΔAge values remain robust between models, meaning that if a person is predicted to have a large ΔAge by the complete model, the “dropout” models will predict a large ΔAge as well; b) the list of genes nearest to GWAS loci that associate with female and male ΔAge in the full model. Each hit is presented as a bubble, colored according to the significance of association of the locus with ΔAge, with size representing the effect size of the top SNP in the locus. The full summary is reported in Supplementary table 4.
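
The robustness check in panel a can be sketched as a per-model correlation against the full model. This assumes a Pearson correlation, which the legend does not specify; the model names and ΔAge values below are hypothetical and synthetic.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Synthetic ΔAge values for five participants (illustrative only).
full_model = [-4.0, -1.5, 0.0, 2.0, 5.5]

# Hypothetical ΔAge values from models refit with one phenotype cluster removed.
dropout_models = {
    "dropout_1": [-3.5, -1.0, 0.5, 1.5, 5.0],
    "dropout_2": [-4.5, -2.0, 0.2, 2.5, 6.0],
}

# One correlation per dropout model, each compared with the full model.
correlations = {
    name: pearson(full_model, values)
    for name, values in dropout_models.items()
}
```

A correlation close to 1 for every dropout model corresponds to the behavior described above: participants with a large ΔAge in the full model keep a large ΔAge when a cluster is removed.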