Longitudinal fundus imaging and its genome-wide association analysis provide evidence for a human retinal aging clock

  1. Sara Ahadi (corresponding author)
  2. Kenneth A Wilson
  3. Boris Babenko
  4. Cory Y McLean (corresponding author)
  5. Drew Bryant
  6. Orion Pritchard
  7. Ajay Kumar
  8. Enrique M Carrera
  9. Ricardo Lamy
  10. Jay M Stewart
  11. Avinash Varadarajan
  12. Marc Berndl
  13. Pankaj Kapahi (corresponding author)
  14. Ali Bashir
  1. Google Research, United States
  2. Buck Institute for Research on Aging, United States
  3. Google Health, United States
  4. Department of Biophysics, Post Graduate Institute of Medical Education and Research, India
  5. Department of Ophthalmology, Zuckerberg San Francisco General Hospital and Trauma Center, United States
  6. Department of Ophthalmology, University of California, San Francisco, United States

Abstract

Biological age, distinct from an individual’s chronological age, has been studied extensively through predictive aging clocks. However, these clocks have limited accuracy over short time-scales. Here we trained deep learning models on fundus images from the EyePACS dataset to predict individuals’ chronological age. Our retinal aging clock, ‘eyeAge’, predicted chronological age more accurately than other aging clocks (mean absolute error of 2.86 and 3.30 years on quality-filtered data from EyePACS and UK Biobank, respectively). Additionally, eyeAge was independent of blood marker-based measures of biological age, maintaining an all-cause mortality hazard ratio of 1.026 even when adjusted for phenotypic age. The individual-specific nature of eyeAge was reinforced via multiple GWAS hits in the UK Biobank cohort. The top GWAS locus was further validated via knockdown of the fly homolog, Alk, which slowed age-related decline in vision in flies. This study demonstrates the potential utility of a retinal aging clock for studying aging and age-related diseases and quantitatively measuring aging on very short time-scales, opening avenues for quick and actionable evaluation of gero-protective therapeutics.

Editor's evaluation

This paper is an important contribution to the biological aging field, using eye image data to create an aging clock of the retina in data from EyePACS with validation in the UK Biobank. The authors provide compelling evidence that the clock correlates with chronological and phenotypic age, predicting mortality independently of chronological age and showing longitudinal evidence. The work identifies novel genetic loci with a top site located in the ALKAL2 region, which is functionally validated in a Drosophila model.

https://doi.org/10.7554/eLife.82364.sa0

Introduction

Aging causes molecular and physiological changes throughout all tissues of the body, enhancing the risk of several diseases (López-Otín et al., 2013). Identifying specific markers of aging is a critical area of research, as each individual ages uniquely depending on both genetic and environmental factors (Ahadi et al., 2020). While a variety of aging clocks have recently been developed to track the aging process, including phenotypic age (Liu et al., 2018; a combination of chronological age and nine biomarkers predictive of mortality) and epigenetic clocks derived from DNA methylation (Horvath and Raj, 2018), many require a blood draw and multiplex assay of many analytes.

A growing body of evidence suggests that the microvasculature in the retina might be a reliable indicator of the overall health of the body’s circulatory system and the brain. Changes in the eyes accompany aging and many age-related diseases such as age-related macular degeneration (AMD) (Luu and Palczewski, 2018), diabetic retinopathy (Namperumalsamy et al., 2009), and neurodegenerative disorders like Parkinson’s (Luu and Palczewski, 2018; Archibald et al., 2009) and Alzheimer’s (Frost et al., 2013). Eyes are also ideal windows for early detection of systemic diseases by ophthalmologists, including AIDS (Sun et al., 2009; Cunningham and Margolis, 1998), chronic hypertension (Wong and McIntosh, 2005), and tumors (Kreusel et al., 2002). This broad utility is perhaps unsurprising, as any subtle changes in the vascular system first appear in the smallest blood vessels, and retinal capillaries are amongst the smallest in the body.

The subtle changes induced in these small vessels often go undetected by even the most sophisticated instruments, necessitating the use of better approaches involving deep learning. Fundus imaging has proven to be a powerful and non-invasive means for identifying specific markers of eye-related health. Deep learning was initially employed to predict diabetic retinopathy from retinal images at accuracies matching, or even exceeding, those of experts (Gulshan et al., 2016). Since then, retinal images have been employed to identify at least 39 fundus diseases including glaucoma, diabetic retinopathy, age-related macular degeneration (Wong and McIntosh, 2005; Cen et al., 2021), cardiovascular risk (Poplin et al., 2018), chronic kidney disease (Sabanayagam et al., 2020), and, most recently, to predict age (Zhu et al., 2023). Given its non-invasive, low-cost nature, retinal imaging provides an intriguing opportunity for longitudinal patient analysis to assess the rate of aging.

Here, we use deep learning models to predict chronological age from fundus retinal images, hereafter ‘eyeAge’, and use the deviation of this value from chronological age, hereafter ‘eyeAgeAccel’, for mortality and association analyses. We train this model on the well-studied EyePACS dataset and apply it on both the EyePACS and UK Biobank cohorts. Together, our results suggest that the trajectory of an individual’s biological age can be predicted in timelines under a year and that statistically significant genome-wide associations are possible. Enrichment analysis of top GWAS hits as well as experimental validation of the Drosophila homolog of ALKAL2, a gene in the top GWAS locus, indicates genetic markers of visual decline with age and demonstrates the potential predictive power of a retinal aging clock in assessing biological age.

Results

Prediction of age from fundus images

Figure 1 summarizes the analysis workflow for the study. Using the EyePACS dataset, we trained a fundus image model on 217,289 examples from 100,692 patients and tuned it on 54,292 images from 25,238 patients. These models were employed for longitudinal analysis of repeat patients and also applied to the UK Biobank dataset (119,532 images), which had a notably distinct demographic distribution (Table 1). For both studies, most visits generated two images, one each for the left and right eye; the EyePACS dataset had more repeat visits by patients, making the ratio of total images to total patients slightly larger (Table 1). In both analyses, we took the average of the predictions between the left and right eye from a single visit to infer age (see Materials and methods).

Figure 1. Schematic of analysis pipeline.

EyePACS images were split into train and tune sets based on the patient. The model was then trained with the final model step being selected via the tune set. Prediction results on the EyePACS tune set were used for longitudinal analysis of aging. After filtering for image quality, inference was performed with the same model on the UK Biobank dataset, and the resulting eyeAgeAccel was used for GWAS analysis. Enrichment analysis was performed on the GWAS hits, and a homolog of the top gene (ALKAL2) was validated experimentally in Drosophila.

Table 1
Characteristics of patients in the development and validation sets (before filtering).

| Characteristic | Development set (EyePACS), Train | Development set (EyePACS), Tune | Test set (UK Biobank) |
| --- | --- | --- | --- |
| Patients | 100,692 | 25,238 | 64,019 |
| Images | 217,289 | 54,292 | 119,532 |
| Ethnicity: Black | 11908 [7%] | 3040 [7%] | 1540 [1%] |
| Ethnicity: Asia Pacific Islander | 11842 [7%] | 2923 [7%] | 4183 [4%] |
| Ethnicity: White | 22539 [13%] | 5657 [13%] | 107967 [91%] |
| Ethnicity: Hispanic | 125595 [71%] | 31521 [71%] | 0 [0%] |
| Ethnicity: Native American | 1791 [1%] | 426 [1%] | 0 [0%] |
| Ethnicity: Other | 3809 [2%] | 918 [2%] | 5015 [4%] |
| Self-reported sex: Female | 127075 [59%] | 31743 [58%] | 65739 [55%] |
| Self-reported sex: Male | 90128 [41%] | 22531 [42%] | 53793 [45%] |
| Age (years): median | 55.13 | 55.19 | 57.94 |
| Age (years): mean | 54.21 | 54.20 | 56.85 |
| Age (years): std | 11.50 | 11.46 | 8.18 |

The model showed a strong correlation between chronological age and predicted age (eyeAge) in both the EyePACS (0.95) and UK Biobank (0.87) datasets (Figure 2—figure supplement 1). Using mean absolute error (MAE) to assess the fidelity of the aging clock showed that the model performed favorably on both datasets (2.86 and 3.30 years, respectively, after quality filtering) relative to previous studies (Zhu et al., 2023; Galkin et al., 2021; McEwen et al., 2020; Horvath, 2013). Next, we evaluated the efficacy of our predictions over one- to two-year time scales using longitudinal data. Using the EyePACS Tune dataset, we restricted ourselves to data from patients with exactly two visits (1719 subjects) and examined the model’s ability to order the two visits over multiple time scales. Note that no longitudinal information about patients was specifically used to train or tune the model to predict chronological age. While the observed and predicted age differences between the two visits (M=0.033, SD = 2.34, Figure 2—figure supplement 2) had low correlation (Pearson ⍴=0.17, p-value = 1.4e-12), Figure 2A shows that the model correctly ordered 71% of visits within a year with an MAE less than 2 years. In both metrics, the fidelity decreased in older groups and with smaller age gaps.
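
The two longitudinal metrics can be illustrated with a minimal sketch. The field names and toy values below are hypothetical, and the MAE is taken here between the predicted and observed age change across the two visits, which is one plausible reading of the text rather than a confirmed definition.

```python
from dataclasses import dataclass


@dataclass
class VisitPair:
    """Two visits by one patient; all field names are hypothetical."""
    age_first: float       # chronological age at visit 1 (years)
    age_second: float      # chronological age at visit 2 (years)
    eye_age_first: float   # model prediction at visit 1 (years)
    eye_age_second: float  # model prediction at visit 2 (years)


def positive_prediction_ratio(pairs):
    """Fraction of pairs whose predicted age increased from visit 1 to visit 2."""
    increased = sum(1 for p in pairs if p.eye_age_second > p.eye_age_first)
    return increased / len(pairs)


def gap_mae(pairs):
    """Mean absolute error between predicted and observed age change (assumed definition)."""
    errors = [
        abs((p.eye_age_second - p.eye_age_first) - (p.age_second - p.age_first))
        for p in pairs
    ]
    return sum(errors) / len(errors)


# Toy data, not taken from the study.
pairs = [
    VisitPair(50.0, 51.0, 52.0, 53.5),
    VisitPair(60.0, 60.5, 58.0, 57.0),
    VisitPair(45.0, 46.0, 44.0, 45.0),
    VisitPair(70.0, 71.0, 69.0, 71.0),
]
ppr = positive_prediction_ratio(pairs)
mae = gap_mae(pairs)
```

In this toy set, three of the four pairs are ordered correctly, so the ratio is 0.75.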

Figure 2. Longitudinal analysis of patients with exactly two visits in the EyePACS cohort.

(A) Changes of PPR (positive prediction ratio: the ratio of data whose eyeAge increased between subsequent visits) and MAE (mean absolute error) calculated on the same individual in relationship to chronological age at the first visit (left) and time between longitudinal visits (right). (B) Scatter plots representing correlation between eyeAge Gap (difference between predicted age and chronological age) of two consecutive visits from an individual (Same) or two consecutive visits from two different individuals (Random). (C) Correlation of eyeAge and chronological age between left and right and two consecutive visits of the same individual. (D) Scatter plots representing the correlation of left and right eyeAge Gap from the same or two random individuals.

Figure 2—source data 1

MAE and positive prediction ratio for time-matched and random individuals based on age at visit 1.

https://cdn.elifesciences.org/articles/82364/elife-82364-fig2-data1-v2.zip
Figure 2—source data 2

MAE and positive prediction ratio for time-matched and random individuals based on months between visits.

https://cdn.elifesciences.org/articles/82364/elife-82364-fig2-data2-v2.zip
Figure 2—source data 3

Age gap for random and time-matched individuals at visit 1 and 2.

https://cdn.elifesciences.org/articles/82364/elife-82364-fig2-data3-v2.zip
Figure 2—source data 4

Chronological and predicted age for left and right eye.

https://cdn.elifesciences.org/articles/82364/elife-82364-fig2-data4-v2.zip
Figure 2—source data 5

Age gap for random and time-matched individuals for left and right eyes.

https://cdn.elifesciences.org/articles/82364/elife-82364-fig2-data5-v2.zip
Figure 2—source data 6

Scatter plot of eyeAge with chronological age.

https://cdn.elifesciences.org/articles/82364/elife-82364-fig2-data6-v2.zip

To understand if this effect was simply a result of the noise of our innate age prediction, we performed an age-matched control experiment. We compared correlations between data points of one individual to data from a random pair of age-matched individuals (see Materials and methods). Comparisons were performed between each eye and timepoint. For all comparisons, the robust correlation observed within an individual’s data was lost in data between time-matched individuals (Figure 2B and D). Additionally, the positive prediction ratio and MAE were both worse for the matched random pairs (55% and 3.6 years, respectively; Figure 2—figure supplement 3), suggesting a reproducible, individual-specific eyeAge component. To further explore this individual-specific component, Figure 2C compares eyeAge and chronological age within an individual between eyes and timepoints, showing strong correlation in each quadrant.

Testing the model in UK Biobank cohort

We next applied our EyePACS-trained eyeAge model to the UK Biobank dataset. The UK Biobank cohort included retinal fundus images from 64,019 patients as well as extensive clinical labs and genomic data. These clinical markers enabled comparison of eyeAge with phenoAge, a clinical blood marker-based aging clock (Liu et al., 2018). The observed 0.87 correlation between eyeAge and chronological age in the UK Biobank cohort was consistent with (and slightly higher than) the observed correlation of phenoAge and chronological age (0.82) (Figure 3A and B). Notably, the correlation between phenoAge and eyeAge was substantially lower (0.72; Figure 3—figure supplement 1) and, in fact, roughly equivalent to the product of their respective correlations with chronological age, suggesting that they were largely independent. To explore this further, we computed the residuals from linear models that independently regressed chronological age on phenoAge and eyeAge, as described previously (Liu et al., 2018), yielding phenoAge acceleration (phenoAgeAccel) and eyeAge acceleration (eyeAgeAccel), and observed little correlation between the two age acceleration measures (Figure 3C). We then performed Cox proportional hazards regression analysis to assess mortality risk (Cox, 1972). The hazard ratio for eyeAge was statistically significant when adjusting for (self-reported) sex (1.09, CI=[1.08, 1.10], p-value = 1.6e-53), sex and age (1.04, CI=[1.02, 1.06], p-value = 1.8e-4), and sex, age, and phenoAge (1.03, CI=[1.01, 1.05], p-value = 2.8e-3) (Figure 3D). Stratifying the hazard ratio analysis showed a slight increase in the hazard ratio for women compared to men (1.035 vs 1.026); however, the confidence intervals overlapped heavily (Supplementary file 1). Hazard ratio results adjusted for visual acuity are presented in Figure 3—figure supplement 2 and Supplementary file 2.
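
The product-of-correlations argument can be checked with a few lines of arithmetic. The sketch below uses the rounded correlations reported in the text; the function name is illustrative, and the standard first-order partial correlation formula is used to show that a pairwise correlation equal to the product implies zero correlation once age is removed.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y after linearly removing z from both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


r_eye_age = 0.87    # eyeAge vs chronological age (reported)
r_pheno_age = 0.82  # phenoAge vs chronological age (reported)
r_eye_pheno = 0.72  # eyeAge vs phenoAge (reported)

# If the two clocks shared nothing beyond chronological age, their pairwise
# correlation would equal the product of their correlations with age.
expected_if_independent = r_eye_age * r_pheno_age  # about 0.71

# With exactly that pairwise correlation, the partial correlation is zero.
partial_at_product = partial_correlation(
    expected_if_independent, r_eye_age, r_pheno_age
)
```

The product is about 0.71, close to the reported 0.72. Because the inputs are rounded to two decimals, this is a consistency check only; the residual correlation estimated from individual-level data is the value reported in Figure 3C.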

Figure 3. Relationships between eyeAge, phenoAge, and chronological age in the UK Biobank cohort.

(A) Correlation between eyeAge and chronological age (Pearson ⍴=0.86). (B) Correlation between phenoAge and chronological age (Pearson ⍴=0.82). (C) Correlation between eyeAgeAcceleration and phenoAgeAcceleration (Pearson ⍴=0.12). (D) Forest plot of all-cause mortality hazard ratios (diamonds) and confidence intervals (lines) for the UK Biobank dataset. Purple lines are adjusted only for sex; orange lines are adjusted for sex and age; blue lines are adjusted for sex, age, and phenoAge.

Figure 3—source data 1

Age, eyeAge, phenoAge, eyeAge Acceleration and phenoAge Acceleration values for each individual.

https://cdn.elifesciences.org/articles/82364/elife-82364-fig3-data1-v2.zip

We also investigated the relationship between eyeAge and multiple additional measures of morbidity and disability available in the UK Biobank. We performed Cox proportional hazards regression on six additional chronic disease outcomes when adjusting for age and sex: chronic obstructive pulmonary disease (COPD), myocardial infarction, asthma, stroke, Parkinsonism, and dementia. Nominally significant associations between eyeAge and both COPD (p-value = 0.0048) and myocardial infarction (p-value = 0.049) were observed (Supplementary file 3). We performed linear regression on seven morbidity measurements reported at the time of imaging: fluid intelligence, systolic and diastolic blood pressure, the ‘Health score (England)’ index of multiple deprivation, pulse wave arterial stiffness, self-reported overall health rating, and self-reported presence of a longstanding illness. Increased eyeAgeAccel corresponded to significantly increased systolic blood pressure (p-value = 1.025e-7) and decreased levels of deprivation (p-value = 2.26e-5) as measured by the Health score (England) index of multiple deprivation (Supplementary file 4). Interestingly, increased eyeAgeAccel also corresponded with significantly increased performance in fluid intelligence scores (p-value = 5.34e-27).

As visual acuity has long been known to degrade with age (Gittings and Fozard, 1986), we examined the extent to which eyeAge explains the known correlation between chronological age and visual acuity. Although chronological age and eyeAge are highly correlated (Figure 3A), we observed a slightly higher correlation of eyeAge with visual acuity (⍴=0.221) compared to chronological age vs. visual acuity (⍴=0.218). Both measures of age appear relevant for visual acuity decline, as the influence of chronological age remained significant even after regressing out the influence of eyeAge on visual acuity (p-value = 1.6e-13, Supplementary file 5).

GWAS and experimental validation of ALK

Based on the patient-specific eyeAgeAccel effects and its independence from phenoAgeAccel, a GWAS was conducted to identify genetic factors associated with eyeAgeAccel. We subsetted the cohort to individuals of European ancestry, performed genotype quality control, and utilized a single eyeAgeAccel value per individual, resulting in a cohort of 45,444 individuals for GWAS analysis. GWAS was performed using BOLT-LMM (see Materials and methods) with chronological age, sex, genotyping array type, the top five principal components of genetic ancestry, and indicator variables for the six assessment centers used for the imaging as covariates. Full GWAS summary statistics are available in Supplementary file 6.

Genomic inflation was low (1.05; Figure 4—figure supplement 1). The stratified linkage disequilibrium (LD) score regression-based intercept was 1.02 (SEM = 0.01), indicating that polygenicity, rather than population structure, drove the test statistic inflation. The SNP-based heritability was 0.11 (SEM = 0.02), an appreciable fraction of the estimated broad-sense heritability of biological age (27–57% via twin and family studies). The GWAS identified 38 independent suggestive hits (R² ≤ 0.1, p ≤ 1 × 10⁻⁶) at 28 independent loci, 12 of which reached genome-wide significance (p ≤ 5 × 10⁻⁸) (Figure 4, Supplementary file 7).
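
For readers unfamiliar with the genomic inflation factor, the following is a minimal sketch of the usual median-based definition, assuming per-variant z-scores as input. It is not the pipeline used in the study, and the toy inputs are synthetic.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom, obtained as
# the square of the 75th percentile of a standard normal (about 0.4549).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation(z_scores):
    """Lambda_GC: median observed chi-square statistic over its null median."""
    chi2_stats = [z * z for z in z_scores]
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Synthetic null z-scores placed on evenly spaced normal quantiles.
n = 1001
null_z = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
lambda_null = genomic_inflation(null_z)

# Scaling every z-score by sqrt(1.05) scales lambda by 1.05.
inflated_z = [z * 1.05 ** 0.5 for z in null_z]
lambda_inflated = genomic_inflation(inflated_z)
```

A value near 1 indicates little inflation. The LD score regression intercept mentioned above is a separate calculation that is not reproduced here.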

Figure 4. GWAS analyses and experimental validation.

(A) Manhattan plot representing significant genes associated with eyeAgeAcceleration. (B) p-Values for enriched pathways: Macular thickness, ADHD (attention deficit hyperactivity disorder), AMD (age-related macular degeneration), spherical equivalent, and refractive error. (C) Assessment of visual performance of transgenic and control flies with age. p-Value is relative to control (*=p < 0.05). p-Value for ALK RNAi vs. control is 0.009; p-value for UAS-ALK-DN vs. control is 0.006. Error bars show standard deviation between 3 biological replicates. n = 100 flies per replicate.

Many of the hits were associated with eye function and age-related disease (truncated list of candidate hits summarized in Supplementary file 8). The most significant locus spanned 650 kb and included three genes in a highly significant LD block: SH3YL1, ACP1, and ALKAL2 (Figure 4—figure supplement 2). The SH3YL1 gene has recently been implicated as a biomarker for nephropathy in type 2 diabetes (Choi et al., 2021), whereas ALKAL2 enables protein tyrosine kinase activity (Woodling et al., 2020). In other significant gene candidates, we identified variants in the genes OCA2, POC5, and GJA3, which have all been implicated in eye development and function. OCA2 specifically is known to be important for eye pigmentation (Kamaraj and Purohit, 2014), whereas POC5 is linked to AMD (Yan et al., 2018). GJA3 has been implicated in age-related cataract development (Tang et al., 2019). MEF2C has reported roles in numerous age-related conditions, including Alzheimer’s disease (Xue et al., 2021) and muscle wasting in cancer (Judge et al., 2020) and GRM is associated with age-related hearing loss (Liu et al., 2021b). Additional candidates are reported to be involved in cancer prognosis and progression, including TSPAN11 (Liu et al., 2021a), NKX6-1 (Su et al., 2021), and SLC16A1 (Zhang et al., 2021).

Gene enrichment analysis (Xie et al., 2021) identified significant associations (adjusted p<0.05) between our gene candidates and macular thickness and degeneration, as seen in previous human GWAS studies (Buniello et al., 2019) and cataract formation (Elsevier pathway collection; Cheadle et al., 2017) as well as non-eye related diseases such as bone mineralization, tumor suppression, and Amyloid Precursor Protein pathways (Biocarta; Nishimura, 2001). Gene Ontology (GO) term analysis of our gene candidates revealed significant enrichment (adjusted p<0.05) for protein tyrosine kinase activator activity, gap junction channel activity, and wide pore channel activity (Figure 4B).

Sum of single effects regression (Wang et al., 2020) was used to identify putative causal variants for each locus (Supplementary file 9). In the most significant locus (Figure 4—figure supplement 2), we identified the deletion variant rs56350804 as the single variant with a posterior inclusion probability (PIP) above 0.45 (rs56350804 PIP = 0.9998). While rs56350804 is intronic to SH3YL1, expression quantitative trait locus (eQTL) analysis by the Genotype-Tissue Expression consortium identified significant eQTLs between rs56350804 and each of SH3YL1, ACP1, and ALKAL2 (GTEx Consortium, 2020). In particular, the ALKAL2 gene had its expression modulated by rs56350804 in cervical spinal cord tissue (p = 3.0 × 10⁻¹⁶), and inhibition of the Drosophila homolog of ALKAL2, Alk, has been shown to extend lifespan (Woodling et al., 2020), making it a good candidate for exploring a potential role in visual function.

Previously, D. melanogaster has been used to study the impact of aging interventions on retinal health by using the phototaxis index, a fly’s ability to be attracted toward light (Hodge et al., 2022). We used D. melanogaster to observe visual decline via phototaxis with transgenic ALK inhibition. We crossed the pan-neuronal RU486-inducible Gal4 driver elav-Gal4-GS with UAS-AlkRNAi flies or UAS-AlkDN to determine the effects of neuron-specific Alk inhibition. Both transgenic interventions resulted in significantly increased visual performance with age, whereas background controls showed no change in performance with RU486 treatment (Figure 4C). These results support the implication from the GWAS that ALK influences the aging of the visual system.

Discussion

Retinal health has long been an important factor for visual aging, manifested as glaucoma, AMD, and other age-related retinal diseases, but until recently it was not known whether it could be indicative of overall health and aging. In this study, we applied deep learning models for predicting an individual’s age from retinal fundus images and showed that these predictions may be informative for tracking aging patterns longitudinally. While other cellular and blood-related molecular markers of aging have recently been identified, these are at times invasive and, although accurate, take a long time to develop (Horvath, 2013). Other aging clocks from blood (Horvath, 2013; Peters et al., 2015), saliva (Bocklandt et al., 2011), skin (Bocklandt et al., 2011; Fleischer et al., 2018), muscle (Mamoshina et al., 2018), and liver (Wang et al., 2017) showed an MAE deviating 4–8 years from the actual age. More dynamic markers such as proteins and metabolites can track aging in shorter time intervals but are still limited to 2–4 years (Ahadi et al., 2020; Wang et al., 2017; Chen et al., 2012). In contrast, using deep learning models on retinal fundus images, we were able to predict changes in aging at a granularity of less than a year. These small time-scales, and the relatively low cost of imaging, make eyeAge promising for longitudinal studies.

Correlation and hazard ratio analyses from our study suggest that eyeAge and phenotypic age are conditionally independent given chronological age. Therefore, eyeAge is a potential biomarker that reflects a layer of biological aging not included in blood markers. This is supported by our GWAS findings; different genes were associated with eyeAgeAccel compared to phenoAgeAccel (Kuo et al., 2021). However, there are limitations with this approach. Similar to other aging clocks (such as DNA-methylome), eyeAge underperforms phenotypic age in mortality prediction. This is likely because the biomarkers used to calculate phenotypic age were explicitly selected based on their ability to predict mortality. New algorithms that incorporate blood markers and retinal clocks have the potential to be better predictors of morbidity and mortality. Additionally, it remains to be seen whether eyeAgeAccel would reflect interventions such as behavioral changes or medication.

Our GWAS identified candidate genes associated with several eye- and age-related functions, such as POC5 (Yan et al., 2018) and GJA3 (Tang et al., 2019). Additional significant candidates had previously identified functions that are not restricted to the eye but are still related to age, e.g. MEF2C being associated with Alzheimer’s disease (Xue et al., 2021) and multiple candidates (TSPAN11, NKX6-1, SLC16A1, RAET1G, SNTG1, ARRDC3, RASSF3, DIRC3, and GCNT3) associated with cancer (Supplementary file 8). These suggest that eyeAge may identify general signatures of aging rather than purely eye-related traits. Pathway analyses similarly were split between eye-related pathways and others that were not eye-specific. While we suspect many of the eye-related pathways to have an aging component, some pathways may be enriched artifactually. For example, though melanin biosynthesis has been associated with protection from photodamage (Hodge et al., 2022), the predicted quality of fundus images has also been shown to be influenced by eye color (Guenther et al., 2020). Notably, an independent group separately identified our top GWAS candidate locus as the most significant locus (Goallec et al., 2021). This combined with previous studies showing ALK to be important for lifespan extension in flies (Woodling et al., 2020) and our own experimental validation confirming improved ocular health in a fly homolog, Alk, is compelling evidence of a true biological signal in the GWAS.

Taken together, our work reinforces the utility of fundus imaging for evaluating overall health and opens up new opportunities for using it to predict longevity. eyeAge has substantial applications in aging and aging-related diseases, from biomarker application to tracking therapeutics. In particular, the retinal aging clock, because of its ease of use, low cost, and non-invasive sample collection, has the unique potential to additionally assess lifestyle and environmental factors implicated in aging. Retinal aging clocks can be immensely valuable to future clinical trials of rejuvenation/anti-aging therapies and for personalized medicine to measure improvements in aging over short periods, not only improving actionability but also enabling rapid iteration.

Materials and methods

Key resources table

| Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
| --- | --- | --- | --- | --- |
| Strain, wDah background (Drosophila melanogaster, females) | wDah control strain | Laboratory of Linda Partridge, Woodling et al., 2020 | | Maintained in Kapahi Lab |
| Strain, wDah background (Drosophila melanogaster, females) | UAS-ALKRNAi RNAi for ALK | Laboratory of Linda Partridge, Woodling et al., 2020 | VDRC GD 11446 | Maintained in Kapahi Lab |
| Strain, wDah background (Drosophila melanogaster, females) | UAS-ALKDN dominant negative ALK overexpression | Laboratory of Linda Partridge, Woodling et al., 2020 | | Maintained in Kapahi Lab |
| Strain, wDah background (Drosophila melanogaster, females) | elav-GS Ru486 inducible Gal4 driver | Bloomington Drosophila Stock Center, Woodling et al., 2020 | BDSC 43642 | Maintained in Kapahi Lab |
| Chemical compound, drug | RU486 (mifepristone) | United States Biological, Osterwalder et al., 2001 | 282888 | For inducing fly GeneSwitch expression system; 200 µM final concentration in food |

EyeAge model development

Model development was done on the EyePACS train dataset (Table 1). A deep learning model with an Inception-v3 architecture (Deng et al., 2009; Szegedy et al., 2015) was trained to take a color fundus photo as input and predict the chronological age (referred to as chronologicalAge below) using L1 loss. Age values were normalized to have zero mean and unit variance before training (and during inference this normalization is reversed to get back to year units). Model training was stopped after 363,200 steps by looking at performance on the EyePACS tune dataset. The hyperparameters of the model were as follows: the initial learning rate was 0.0001, which was warmed up to 0.001 over 40,751 steps; after the warm up phase, the learning rate was decayed by a factor of 0.99 every 13,584 steps; dropout was applied to the prelogits at a rate of 0.2; a weight decay of 4e-5 was used. The model backbone was pre-trained using the ImageNet dataset (Deng et al., 2009). As some of the color fundus images in the UK Biobank dataset were of very low quality, we also trained a separate deep learning model to predict image quality, similar to what was reported in our prior work (Mitani et al., 2020; Varadarajan et al., 2018).
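
The learning-rate schedule and age normalization described above can be sketched as follows. The numeric settings come from the text; the linear shape of the warm-up and the staircase form of the decay (counted from the end of warm-up) are assumptions, since the text does not specify them, and the function names are illustrative.

```python
def learning_rate(step, base_lr=1e-4, peak_lr=1e-3, warmup_steps=40_751,
                  decay_rate=0.99, decay_every=13_584):
    """Learning rate at a training step: warm-up, then stepwise exponential decay."""
    if step < warmup_steps:
        # Assumption: linear interpolation from base_lr to peak_lr.
        return base_lr + (peak_lr - base_lr) * step / warmup_steps
    # Assumption: staircase decay, counted from the end of warm-up.
    n_decays = (step - warmup_steps) // decay_every
    return peak_lr * decay_rate ** n_decays


def normalize_age(age_years, mean, std):
    """Map age in years to the zero-mean, unit-variance training target."""
    return (age_years - mean) / std


def denormalize_age(z, mean, std):
    """Map a normalized model output back to years at inference time."""
    return z * std + mean


lr_start = learning_rate(0)
lr_peak = learning_rate(40_751)
lr_final = learning_rate(363_200)  # step at which training was stopped

# Round trip using the EyePACS train-set age statistics from Table 1.
train_mean, train_std = 54.21, 11.50
recovered = denormalize_age(normalize_age(60.0, train_mean, train_std),
                            train_mean, train_std)
```

Under these assumptions the rate has been decayed 23 times by the stopping step, so it remains close to its peak for the whole run.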

EyeAge model evaluation

The model described above was applied to images to predict chronological age. The image quality model described above was used to discard low-quality images, reducing the initial 85,645 patient (174,049 image) dataset to 66,533 patients (120,362 images). Finally, we restricted the data to the first assessment visit to UK Biobank. This was done to reduce bias associated with image quality differences, as we observed quality differences between images captured in the later follow-up visits. Since these follow-up visits happened several years after the initial assessment, the time to event or censorship is much smaller, and a model could exploit this association. For participants that had images of both eyes passing the quality filter, we averaged the predictions across the two eyes. After these processing steps, we ended up with 55,267 data points total, one per remaining participant. Next, using the predicted eyeAge and the chronologicalAge of the participant at the time of imaging, an ‘eyeAgeAcceleration’ score was calculated for each participant as the residuals of the ordinary least squares regression model ‘chronologicalAge ~ eyeAge’ (Liu et al., 2018). In order to compare with another well-known biological marker of age, phenoAge (Liu et al., 2018) was also computed using the values of blood markers available for the participants. PhenoAgeAcceleration was then computed in an analogous manner to eyeAgeAcceleration.
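
The residual-based acceleration score can be sketched with a simple least-squares fit. The text gives the model formula as ‘chronologicalAge ~ eyeAge’, while the regression section writes eyeAgeAccel as eyeAge minus a linear function of age; the sketch below follows the latter form (eyeAge as the response) as an assumption, and the arguments can be swapped for the other reading. The data are toy values.

```python
def ols_residuals(response, predictor):
    """Residuals of the simple OLS fit response ~ predictor (with intercept)."""
    n = len(response)
    mean_x = sum(predictor) / n
    mean_y = sum(response) / n
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictor, response))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (slope * x + intercept) for x, y in zip(predictor, response)]


# Toy values, not taken from the study.
chronological_age = [45.0, 50.0, 55.0, 60.0, 65.0]
eye_age = [47.0, 49.0, 56.0, 62.0, 63.0]

# Assumed direction: eyeAge regressed on chronological age.
eye_age_accel = ols_residuals(response=eye_age, predictor=chronological_age)
```

By construction the residuals sum to zero and are uncorrelated with the predictor, so a positive value marks a participant whose retina looks older than expected for their age.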

Method on selection of random set

Figure 2 required identification of matched, random individuals to assess the potential person-specific component of eyeAge predictions. For Figure 2—figure supplement 3, we created matched sets of visit pairs for each patient’s first visit by identifying a randomly matching patient visit that was 0–2 years after a patient’s first visit. To eliminate artifacts due to sampling differences between first and second visits, once we identified a patient’s first visit to match, we constrained its set of potential randomly matched patient visits to only be from second visits. For the longitudinal analysis in Figure 2B (right), individuals were split both by age and by time between visits (using 2-month buckets) and, again, randomly matched. For Figure 2D, the individuals were split evenly in 2-year buckets. Individuals within the same bucket had their left and right predictions compared to one another.

Regression analyses in UK Biobank

Cox proportional hazards regression was performed using the lifelines package (https://github.com/CamDavidsonPilon/lifelines). Since retinal imaging was performed at the initial visit, individuals with events with an unknown date or a date prior to the initial visit were excluded. All UK Biobank algorithmically defined outcomes with at least 4000 events were analyzed: asthma (field 42014), COPD (field 42016), dementia (field 42018), myocardial infarction (field 42000), all-cause Parkinsonism (field 42030), and stroke (field 42006). We note that because eyeAgeAccel is defined as eyeAge - alpha * age - beta for constants alpha and beta identified through regression of eyeAge on age, and age is included as a covariate, hazard ratios for eyeAge are identical to those in which eyeAgeAccel is used in the model instead.
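The equivalence of the two hazard ratios follows from a short reparameterization argument, sketched here under the assumption that chronological age is a covariate in both models. The symbols gamma and theta (the log-hazard coefficients for age and eyeAgeAccel) are introduced only for this illustration.

```latex
\begin{aligned}
\text{eyeAgeAccel} &= \text{eyeAge} - \alpha\,\text{age} - \beta \\
\log\frac{h(t \mid \text{age}, \text{eyeAgeAccel})}{h_0(t)}
  &= \gamma\,\text{age} + \theta\,\text{eyeAgeAccel} \\
  &= (\gamma - \theta\alpha)\,\text{age} + \theta\,\text{eyeAge} - \theta\beta
\end{aligned}
```

The constant term −θβ is absorbed into the baseline hazard h_0(t), so the two models describe the same family of hazards. The coefficient on eyeAge equals the coefficient θ on eyeAgeAccel, giving the same hazard ratio e^θ, while the coefficient on age shifts from γ to γ − θα.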

Linear regression was performed on morbidity-related measurements taken at the same visit during which retinal imaging occurred, and was implemented using the statsmodels package with the model INT(outcome) ~ INT(age) + sex + INT(eyeAgeAccel), where INT(...) represents the rank-based inverse normal transformation. Individuals for whom any of the outcome, age, or eyeAgeAccel variables were in the top 1% of outlier values were excluded. Measurements analyzed were: Overall health rating (field 2178), Long-standing illness (field 2188), Systolic blood pressure (field 4080), Diastolic blood pressure (field 4079), Pulse wave arterial stiffness index (field 21021), Health score (England) (field 26413), and Fluid intelligence score (field 20016).
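For readers unfamiliar with the rank-based inverse normal transformation, a minimal sketch follows. It assumes the common Blom offset (c = 3/8) and average ranks for ties; the text above does not state which offset or tie rule was used, so both are assumptions of this example.

```python
from statistics import NormalDist


def rank_inverse_normal(values, c=3.0 / 8.0):
    """Rank-based inverse normal transformation (INT).

    Each value is replaced by the standard-normal quantile of
    (rank - c) / (n - 2c + 1). Tied values share their average rank.
    The offset c = 3/8 (Blom) is an assumption of this sketch.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank of the run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - c) / (n - 2.0 * c + 1.0)) for r in ranks]


# Hypothetical measurements with one extreme value.
transformed = rank_inverse_normal([10.0, 3.0, 7.0, 100.0, 5.0])
```

Because only ranks enter the transformation, the extreme value 100.0 is mapped to the same quantile it would receive if it were 11.0, which is why the transformation is often applied before linear regression on skewed measurements.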

GWAS

The eyeAgeAccel value defined above was used as the target for GWAS analysis. GWAS analysis was performed on the fundus-based phenotype as described previously (Alipanahi et al., 2021). Briefly, samples were restricted to individuals of European ancestry to avoid confounding effects due to population structure. European genetic ancestry was defined by computing the medoid of the 15-dimensional space of the top genetic principal components in individuals who self-identified as ‘British’ ancestry and defining all individuals within a distance of 40 from the medoid as ‘European’ (corresponding to the 99th percentile of distances of all individuals who self-identified as British or Irish). Samples were further restricted to those who also passed sample quality control measures computed by UK Biobank, that is, those not flagged as outliers for heterozygosity or missingness, as possessing a putative sex chromosome aneuploidy, or as having discordant self-reported and genetically inferred sex.
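The medoid-and-radius ancestry definition can be sketched as follows. This is a toy illustration, not the pipeline used in the study: it uses 2-dimensional points rather than 15 principal components, a brute-force medoid search, an inclusive distance threshold, and hypothetical sample identifiers, all of which are assumptions of the example.

```python
import math


def medoid(points):
    """Return the point with the smallest summed distance to all others."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))


def within_radius(samples, center, radius):
    """Ids of samples whose Euclidean distance to `center` is <= radius."""
    return [sid for sid, pc in samples.items() if math.dist(pc, center) <= radius]


# Hypothetical principal-component coordinates of the reference group
# (in the study: 15 components, self-identified 'British' individuals).
reference = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (1.0, 1.0), (9.0, 9.0)]
center = medoid(reference)

# Hypothetical samples to classify; the study used a distance cutoff of 40.
samples = {"a": (1.5, 1.0), "b": (4.0, 5.0), "c": (30.0, 30.0)}
european = within_radius(samples, center, radius=6.0)
```

Unlike a centroid, the medoid is always an observed data point, which makes it less sensitive to the outlying reference individual at (9.0, 9.0) in this toy data.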

BOLT-LMM v2.3.4 was used to examine associations between genotype and eyeAgeAcceleration in European individuals in the UK Biobank (n=45,444). All genotyped variants with minor allele frequency >0.001 were used to perform model fitting and heritability estimation. GWAS was performed in genotyped variants and imputed variants on autosomal chromosomes, with imputed variants filtered to exclude those with minor allele frequency (MAF) <0.001, imputation INFO score <0.8, or Hardy-Weinberg equilibrium (HWE) P < 1 × 10⁻¹⁰ in Europeans. In total, 13,297,147 variants passed all quality control measures. Covariates included in the association study were chronological age, sex, genotyping array type, the top five principal components of genetic ancestry, and indicator variables for the six assessment centers used for the imaging.

Genome-wide suggestive (p ≤ 1 × 10⁻⁶) lead SNPs, independent at R² ≤ 0.1, were identified using the --clump command in PLINK version v1.90b4. The LD reference panel contained 10,000 unrelated UK Biobank subjects of European ancestry (as defined above). To identify distinct non-overlapping loci of association, all variants with R² ≥ 0.1 with a lead SNP were grouped into a ‘cluster’ with that lead SNP, and subsequently clusters within 250 kilobases of each other were merged, with the lowest p-value lead SNP retained as the locus representative. Putative causal variants were identified using susieR version 0.9.0. At each locus containing at least 10 variants in LD, the susieR::susie_suff_stat function was used to estimate posterior inclusion probabilities for each variant in the locus, using the same LD reference panel as was used to generate loci and with a maximum of L=10 causal variants per locus and 200 iterations of coordinate ascent.
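The cluster-merging step can be illustrated with a short sketch. It assumes that each cluster is summarized by its chromosome, the span of its variants, and its lead SNP, and that "within 250 kilobases" refers to the gap between cluster boundaries; the `Cluster` type, SNP names, and coordinates are hypothetical and are not taken from the study.

```python
from typing import List, NamedTuple


class Cluster(NamedTuple):
    chrom: str      # chromosome label
    start: int      # position of the left-most variant in the cluster
    end: int        # position of the right-most variant in the cluster
    lead_snp: str   # identifier of the lead SNP
    lead_p: float   # p-value of the lead SNP


def merge_clusters(clusters: List[Cluster], max_gap: int = 250_000) -> List[Cluster]:
    """Merge clusters on the same chromosome separated by <= max_gap bases.

    The merged locus keeps the lead SNP with the lowest p-value.
    """
    merged: List[Cluster] = []
    for c in sorted(clusters, key=lambda c: (c.chrom, c.start)):
        prev = merged[-1] if merged else None
        if prev and prev.chrom == c.chrom and c.start - prev.end <= max_gap:
            best = prev if prev.lead_p <= c.lead_p else c
            merged[-1] = Cluster(
                prev.chrom, prev.start, max(prev.end, c.end),
                best.lead_snp, best.lead_p,
            )
        else:
            merged.append(c)
    return merged


# Hypothetical clusters: the first two lie 150 kb apart and are merged.
clusters = [
    Cluster("1", 100_000, 150_000, "rsA", 1e-7),
    Cluster("1", 300_000, 320_000, "rsB", 1e-9),
    Cluster("1", 1_000_000, 1_010_000, "rsC", 5e-7),
    Cluster("2", 100_000, 120_000, "rsD", 2e-8),
]
loci = merge_clusters(clusters)
```

Sorting by chromosome and start position lets a single left-to-right pass merge chains of neighbouring clusters, since each new cluster only needs to be compared with the most recently built locus.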

Validation of Alk in fly

Fly husbandry and phenotyping

For fly crosses, 15 virgin females were crossed with 3 males in bottles containing 1.55% live yeast, cornmeal, sugar, and agar (Wilson et al., 2020). Crosses were dumped 5 days following crossing, and female progeny were sorted into 4 replicate vials of 25 flies each, with food containing 200 μM RU486 to induce activation of the Gal-UAS system (Nicholson et al., 2008). Flies were maintained in 65% relative humidity at 25 °C in a 24 hr light/dark cycle throughout life. Two weeks post-induction, phototaxis was tested as previously described (Hodge et al., 2022) by placing flies in a clear, empty 30 cm-long vial horizontally in a dark room. Light was shone on one end, and the number of flies in the 10 cm closest to the light source after 1 min was scored as a measure of responsiveness to light signals. This was tested across each of the four vials per group in three biological replicates (total 100 flies per replicate). Strains used were 3xelav-GS (provided by the lab of Geetanjali Chawla; Parkhitko et al., 2020) for RU486-dependent pan-neuronal Gal4, the wDah control strain, UAS-AlkRNAi, and UAS-AlkDN (provided by the lab of Linda Partridge; Woodling et al., 2020).

Pathway analysis

All significant (p < 1.0 × 10⁻⁶) GWAS candidates were used to assess pathway enrichment via Enrichr (Xie et al., 2021).

Statistical analysis

For Drosophila phototaxis results, significance (p<0.05) was assessed using an unpaired t-test. For Figure 4C, error bars represent SD across at least three biological replicates. Significant differences between experimental groups and controls are indicated by * (p<0.05). Statistical analyses were calculated with GraphPad Prism 4.

Data and code availability

A subset of EyePACS data is freely available online (https://www.kaggle.com/competitions/diabetic-retinopathy-detection/data). To enquire about access to the full EyePACS dataset, researchers should contact Jorge Cuadros (jcuadros@eyepacs.com). Proposals and agreements are assessed internally at EyePACS and may be subject to ethics approvals. The UK Biobank data are available for approved projects (application process detailed at https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access) through the UK Biobank Access Management System (https://www.ukbiobank.ac.uk). We have deposited the derived data fields and model predictions following UK Biobank policy, which will be available through the UK Biobank Access Management System. Full GWAS summary statistics are available in the Supplementary File. To develop the eyeAge model we used the TensorFlow deep learning framework, available at https://www.tensorflow.org. Code and detailed instructions for both model training and prediction of chronological age from fundus images are open-source and freely available as a minor modification (https://gist.github.com/cmclean/a7e01b916f07955b2693112dcd3edb60; Ahadi, 2023; copy archived at swh:1:rev:ba002c0a6edddd13814ecc9e07ec14249b2375f4) of our previously published repository for fundus model training (https://zenodo.org/record/7154413) (Cosentino et al., 2021).

References

  1. Deng J, Dong W, Socher R, Li L-J (2009) ImageNet: A large-scale hierarchical image database. 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops. https://doi.org/10.1109/CVPR.2009.5206848
  2. Nishimura D (2001) BioCarta. Biotech Software & Internet Report 2:117–120. https://doi.org/10.1089/152791601750294344

Decision letter

  1. Sara Hägg
    Reviewing Editor; Karolinska Institutet, Sweden
  2. Carlos Isales
    Senior Editor; Augusta University, United States
  3. Sara Hägg
    Reviewer; Karolinska Institutet, Sweden

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Longitudinal fundus imaging and its genome-wide association analysis provide evidence for a human retinal aging clock" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Carlos Isales as the Senior Editor.

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

1) Additional investigations on the age acceleration residuals are suggested, to differentiate between chronological and biological aging, which is needed.

2) Additional follow-up analyses in UKB as suggested.

Reviewer #1 (Recommendations for the authors):

The clock is trained on chronological age and age acceleration measures are compared between this clock and phenotypic age acceleration. However, another way would have been to train the clock on phenotypic age instead of chronological age. The second-generation epigenetic clocks are done in this way and those clocks perform better than the first-generation clocks. Since the authors use two sets of data, this approach could have been employed if starting with the UK Biobank where the phenoage clock is available. Could the authors please comment on this approach and discuss whether you think that your results could have been different?

Gender is a social construct while sex is the biological variable that is intended in this paper. Please change accordingly.

Please comment on sex differences. Are there differences in retinal aging between men and women? Is the eye clock performing better if trained separately in men and women? Whenever possible, please provide supplementary figures for sex-stratified analyses (Table 1, Figure 3D, etc.).

Figure 3D – please provide the full scale on the x-axis, and add the 95% CIs in numbers.

How is the pathway enrichment analysis done? I cannot find any information on that. Likewise, no statistical analysis section is included in the methods. A description of the Cox models is needed. Is the proportional hazards assumption met? How is age adjusted for in the model? What is the underlying time scale? Many more lifestyle factors etc. are available in UK Biobank that could additionally be adjusted for in the model.

The GWAS result, how does it compare to other GWAS of biological aging? There have been GWAS publications on phenotypic age, epigenetic clocks, telomere length, mitochondrial DNA abundance, etc.

For all figures: The figure legend should stand alone. Please provide additional information about the data used in the figure, sample sizes, methods used to perform the analyses, etc.

Reviewer #2 (Recommendations for the authors):

Below I have a few specific comments that, being addressed, would improve the article:

The background in para 1 requires a bit of revision for accuracy:

– In para 1, the article states that the PhenoAge is an algorithm derived from blood biomarkers based on chronological age. It is not. In fact, the PhenoAge includes chronological age as a component in addition to biomarker information. The biomarkers included were selected on the basis of their age-independent associations with mortality risk. The parenthetical text should be revised.

– Also in para 1, the authors refer to "an epigenetic clock". But the reference is to a review article that summarizes many of these clocks. I would suggest that they should simply refer to "clocks" plural.

– Finally the suggestion that these clocks require invasive cell or tissue extraction is a bit of an overstatement. What the existing measures of biological age do require is a blood draw + multiplex assay. The authors should just say that.

The longitudinal analysis reported in para 2 is very helpful. However, given the claims about the accuracy of the assessment, the authors should report how close the estimated change in age from the retinal measures was to the calendar time elapsed between measurements, not just the probability that the measurements were correctly ordered. (perhaps this is what they are reporting as MAE in this analysis. But if so, it should be clarified) Also, given the very large size of the EyePACS data, it would be feasible to exclude the longitudinal samples from the training entirely. This would make for a cleaner version of the longitudinal analysis, although I don't think this is necessary.

The UKB analysis comparing EyeAge and PhenoAge is also very helpful. However, it would be enhanced by including a set of analyses in which both EyeAge and PhenoAge were regressed on chronological age (separately) and residual values predicted. The correlation among these residuals will inform whether the biological aging information in the two measures is similar or different. In addition, it would be helpful to see effect sizes for associations of the residuals with mortality. (separate models for each measure with the inclusion of covariates for chronological age and sex would be ideal) These additional analyses can clarify (a) how similar the age-independent information in the two measures is, and (b) how they compare as mortality predictors.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Longitudinal fundus imaging and its genome-wide association analysis provide evidence for a human retinal aging clock" for further consideration by eLife. Your revised article has been evaluated by Carlos Isales (Senior Editor) and a Reviewing Editor.

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below by reviewer 2.

Reviewer #1 (Recommendations for the authors):

The authors have addressed the reviewer comments.

Reviewer #2 (Recommendations for the authors):

The authors have mostly addressed my comments. In particular, Figure 3 panels C and D are very helpful, as is the additional analysis of longitudinal change. I note below a few areas where further clarification could improve the manuscript. I reiterate my enthusiasm for the manuscript, which I think makes an important contribution to the literature on biological aging and, to the extent the authors measure can be readily implemented in retinal imaging data beyond the datasets reported in this manuscript, has the potential to deliver a new tool to aging research.

The only change I would argue is essential is revision of the statement in the abstract that EyeAge was 71% accurate in measuring aging in longitudinal data. As noted below, based on data reported by the authors, the accuracy of prediction of time elapsed between repeated measures is <3%. The claim in the abstract should be revised for clarity.

Further comments:

original reviewer comment: The longitudinal analysis reported in para 2 is very helpful. However, given the claims about the accuracy of the assessment, the authors should report how close the estimated change in age from the retinal measures was to the calendar time elapsed between measurements, not just the probability that the measurements were correctly ordered. (perhaps this is what they are reporting as MAE in this analysis. But if so, it should be clarified).

Author response: Correlation calculated between estimated change in chronological age and predicted age, while significant, is low (pearson Rho = 0.17, p-value = 1.4e-12). These differences may be due to the distinction of measuring chronological age vs. biological age as well as the inherent noise associated in the measurement. However, we agree that this information may be valuable to the reader and we have included this correlation number and p-value in the manuscript (changes underlined below).

"Using the EyePACS Tune dataset, we restricted ourselves to data from patients with exactly two visits (1,719 subjects) and examined the models' ability to order the two visits over multiple time scales. Note that no longitudinal information about patients was specifically used to train or tune the model to predict chronological age. While the observed and predicted age differences between the two visits had low correlation (pearson ⍴ = 0.17, p-value = 1.4e-12), …".

R1 reviewer comment: That correlation is quite low. However, to interpret it, we need a bit more information about how much variation there is in time elapsed between the two visits included in analysis. I wonder if a scatterplot as a supplemental figure would help clarify. Specifically, I am thinking that if there is little variation in the time elapsed, the correlation is likely to be very low. Why not instead report the distribution of the prediction error? In other words, what was the mean and SD of the difference between time elapsed and change in EyeAge?

In addition, the authors should revise the statement in the abstract:

"Longitudinal studies showed that the resulting models were able to predict individuals' aging in time-scales less than a year with 71% accuracy".

In fact, it appears that the accuracy of prediction is just slightly under 3% (i.e. 0.17^2). What the algorithm could do with 71% accuracy was correctly order two repeated observations in time. The abstract should be revised.

https://doi.org/10.7554/eLife.82364.sa1

Author response

Reviewer #1 (Recommendations for the authors):

The clock is trained on chronological age and age acceleration measures are compared between this clock and phenotypic age acceleration. However, another way would have been to train the clock on phenotypic age instead of chronological age. The second-generation epigenetic clocks are done in this way and those clocks perform better than the first-generation clocks. Since the authors use two sets of data, this approach could have been employed if starting with the UK Biobank where the phenoage clock is available. Could the authors please comment on this approach and discuss whether you think that your results could have been different?

The primary issues here are that the EyePACS dataset not only lacks the measurements required for phenoAge but also lacks mortality and genomics data. It is unclear what type of validation we could then perform on EyePACS if we trained such a model. Similarly, splitting the UK Biobank data is problematic for two reasons: (1) the dataset was already fairly limited for GWAS, so further fragmentation would be inherently problematic, and (2) a key strength of the approach is cross-dataset generalization, and within a single dataset we would not be able to make that claim.

Gender is a social construct while sex is the biological variable that is intended in this paper. Please change accordingly.

We thank the reviewer for bringing this to our attention. We used the following field: [https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=31], which is self-reported sex. Throughout the updated manuscript, to make this clear, we have replaced references to this variable with "self-reported sex".

Please comment on sex differences. Are there differences in retinal aging between men and women? Is the eye clock performing better if trained separately in men and women? Whenever possible, please provide supplementary figures for sex-stratified analyses (Table 1, Figure 3D, etc.).

We appreciate the reviewer’s suggestion and have added the requested analysis comparing hazard ratios stratified by sex. We have added Supplementary File 1, which shows the age and phenoAge adjustments for males, females, and both combined.

Similarly, we have added the following text to the paper:

“Stratifying the hazard ratio analysis showed a slight increase in the hazard ratio for women compared to men (1.035 vs. 1.026), however the confidence intervals overlapped heavily (Supplementary File 1).”

Figure 3D – please provide the full scale on the x-axis, and add the 95% CIs in numbers.

We have added the full-scale axis and the 95% CIs to the text. See text below:

“The hazard ratio for eyeAgeAccel was statistically significant when adjusting for (self-reported) sex (1.09, CI=[1.08, 1.10], p-value=1.6e-53), sex and age (1.04, CI=[1.02, 1.06], p-value=1.8e-4), and sex, age, and phenoAge (1.03, CI=[1.01, 1.05], p-value=2.8e-3) (Figure 3D).”

How is the pathway enrichment analysis done? I cannot find any information on that. Likewise, no statistical analysis section is included in the methods.

Pathway enrichment was performed via Enrichr online platform using all significant gene candidates from the GWAS. The literature for this platform has been cited in the text, and additional details regarding this as well as the statistical analyses have been added to the Methods.

A description of the Cox models is needed. Is the proportional hazards assumption met? How is age adjusted for in the model? What is the underlying time scale? Many more lifestyle factors etc. are available in UK Biobank that could additionally be adjusted for in the model.

Age has been adjusted in the Cox model by including it as a covariate, same as self-reported sex and phenotypic age. The unit of eyeAge is 1 year. We used the lifelines library to perform a Grambsch-Therneau test of proportional hazards assumption. At a p-value cutoff of 0.05, it did not reject the null hypothesis that the coefficients are not time-varying, using two time transforms (km and rank):

Author response table 1
Covariate          Time transform   test_statistic   p      -log2(p)
chronological age  km               2.16             0.14   2.82
chronological age  rank             2.16             0.14   2.82
eyeAge             km               0.22             0.64   0.65
eyeAge             rank             0.22             0.64   0.65
sex                km               2.47             0.12   3.11
sex                rank             2.47             0.12   3.11
phenoAge           km               2.63             0.11   3.25
phenoAge           rank             2.63             0.10   3.25

The GWAS result, how does it compare to other GWAS of biological aging? There have been GWAS publications on phenotypic age, epigenetic clocks, telomere length, mitochondrial DNA abundance, etc.

We have compared our results to GWAS results on phenotypic age of UK Biobank participants and there’s no overlap between significant genes (Kuo et al. 2021). We suggest that is because eyeAge and phenotypic age each capture different aspects of biological aging and as our hazard ratio results indicate, they are independent of each other. We have added a citation to the Kuo et al. manuscript in the Discussion section.

For all figures: The figure legend should stand alone. Please provide additional information about the data used in the figure, sample sizes, methods used to perform the analyses, etc.

We have added the dataset used and analysis method to the figure legends as applicable.

Reviewer #2 (Recommendations for the authors):

Below I have a few specific comments that, being addressed, would improve the article:

The background in para 1 requires a bit of revision for accuracy:

– In para 1, the article states that the PhenoAge is an algorithm derived from blood biomarkers based on chronological age. It is not. In fact, the PhenoAge includes chronological age as a component in addition to biomarker information. The biomarkers included were selected on the basis of their age-independent associations with mortality risk. The parenthetical text should be revised.

We thank the reviewer for the correction. We have updated the text to read:

“a combination of chronological age and 9 biomarkers predictive of mortality”.

– Also in para 1, the authors refer to "an epigenetic clock". But the reference is to a review article that summarizes many of these clocks. I would suggest that they should simply refer to "clocks" plural.

Thank you for the suggestion. As suggested, we have used “clocks” instead.

– Finally the suggestion that these clocks require invasive cell or tissue extraction is a bit of an overstatement. What the existing measures of biological age do require is a blood draw + multiplex assay. The authors should just say that.

Thanks for the comment. We have changed the sentence to “many require a blood draw and multiplex assay of many analytes”.

The longitudinal analysis reported in para 2 is very helpful. However, given the claims about the accuracy of the assessment, the authors should report how close the estimated change in age from the retinal measures was to the calendar time elapsed between measurements, not just the probability that the measurements were correctly ordered. (perhaps this is what they are reporting as MAE in this analysis. But if so, it should be clarified)

Correlation calculated between estimated change in chronological age and predicted age, while significant, is low (pearson Rho = 0.17, p-value = 1.4e-12). These differences may be due to the distinction of measuring chronological age vs. biological age as well as the inherent noise associated in the measurement. However, we agree that this information may be valuable to the reader and we have included this correlation number and p-value in the manuscript.

“Using the EyePACS Tune dataset, we restricted ourselves to data from patients with exactly two visits (1,719 subjects) and examined the models’ ability to order the two visits over multiple time scales. Note that no longitudinal information about patients was specifically used to train or tune the model to predict chronological age. While the observed and predicted age differences between the two visits had low correlation (pearson ⍴ = 0.17, p-value = 1.4e-12), …”

Also, given the very large size of the EyePACS data, it would be feasible to exclude the longitudinal samples from the training entirely. This would make for a cleaner version of the longitudinal analysis, although I don't think this is necessary.

We are unclear if this is the reviewer’s suggestion, but as a point of clarification, we did separate the train and tune sets by patient. Therefore, the longitudinal data should be clean in that there is no contamination by patient samples from the EyePACS train set.

The UKB analysis comparing EyeAge and PhenoAge is also very helpful. However, it would be enhanced by including a set of analyses in which both EyeAge and PhenoAge were regressed on chronological age (separately) and residual values predicted. The correlation among these residuals will inform whether the biological aging information in the two measures is similar or different.

We believe that Figure 3 addresses this point.

In addition, it would be helpful to see effect sizes for associations of the residuals with mortality. (separate models for each measure with the inclusion of covariates for chronological age and sex would be ideal) These additional analyses can clarify (a) how similar the age-independent information in the two measures is, and (b) how they compare as mortality predictors.

We acknowledge the importance of adjusting for chronological age, and the nuances associated with doing this in the context of the residuals which already have been adjusted for age.

We have specifically adjusted for chronological age in our hazard analyses. Performing hazard analyses with the residual (or “acceleration”) variables in place of eyeAge and/or phenoAge produces the exact same hazard ratios for these variables, but changes the hazard ratio for chronological age. We believe this is because eyeAgeAccel = (eyeAge – W*chronologicalAge – bias), where W and bias are the coefficients that are fit to produce the residual (and phenoAgeAccel is similarly defined). This introduces a clear collinearity between the variables, since chronological age is included multiple times with different weights. For this reason, we feel that it is more appropriate to perform mortality hazard analysis with eyeAge and phenoAge, rather than their acceleration counterparts.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Reviewer #2 (Recommendations for the authors):

The authors have mostly addressed my comments. In particular, Figure 3 panels C and D are very helpful, as is the additional analysis of longitudinal change. I note below a few areas where further clarification could improve the manuscript. I reiterate my enthusiasm for the manuscript, which I think makes an important contribution to the literature on biological aging and, to the extent the authors measure can be readily implemented in retinal imaging data beyond the datasets reported in this manuscript, has the potential to deliver a new tool to aging research.

The only change I would argue is essential is revision of the statement in the abstract that EyeAge was 71% accurate in measuring aging in longitudinal data. As noted below, based on data reported by the authors, the accuracy of prediction of time elapsed between repeated measures is <3%. The claim in the abstract should be revised for clarity.

We thank the reviewer for their comment and have removed the following statement from the abstract to avoid any confusion:

“Longitudinal studies showed that the resulting models were able to correct individuals’ aging in time-scales less than a year with 71% accuracy.”

Further comments:

Original reviewer comment: The longitudinal analysis reported in para 2 is very helpful. However, given the claims about the accuracy of the assessment, the authors should report how close the estimated change in age from the retinal measures was to the calendar time elapsed between measurements, not just the probability that the measurements were correctly ordered. (perhaps this is what they are reporting as MAE in this analysis. But if so, it should be clarified).

Author response: Correlation calculated between estimated change in chronological age and predicted age, while significant, is low (pearson Rho = 0.17, p-value = 1.4e-12). These differences may be due to the distinction of measuring chronological age vs. biological age as well as the inherent noise associated in the measurement. However, we agree that this information may be valuable to the reader and we have included this correlation number and p-value in the manuscript (changes underlined below).

"Using the EyePACS Tune dataset, we restricted ourselves to data from patients with exactly two visits (1,719 subjects) and examined the models' ability to order the two visits over multiple time scales. Note that no longitudinal information about patients was specifically used to train or tune the model to predict chronological age. While the observed and predicted age differences between the two visits had low correlation (pearson ⍴ = 0.17, p-value = 1.4e-12), …".

R1 reviewer comment: That correlation is quite low. However, to interpret it, we need a bit more information about how much variation there is in time elapsed between the two visits included in analysis. I wonder if a scatterplot as a supplemental figure would help clarify. Specifically, I am thinking that if there is little variation in the time elapsed, the correlation is likely to be very low. Why not instead report the distribution of the prediction error? In other words, what was the mean and SD of the difference between time elapsed and change in EyeAge?

We’ve plotted the requested scatterplot showing the time elapsed (x-axis) vs. the difference between time elapsed and change in eyeAge (y-axis). We also calculated the distribution of the prediction error with a mean of 0.033 and standard deviation of 2.34. This plot was added to the manuscript as Figure 2—figure supplement 2.

With the text:

“While the observed and predicted age differences between the two visits (M = 0.033, SD = 2.34) had low correlation (Pearson ρ = 0.17, p-value = 1.4e-12, Figure 2—figure supplement 2), Figure 2A shows that the model correctly ordered 71% of visits within a year with an MAE less than 2 years. In both metrics the fidelity decreased in older groups and with smaller age gaps.”
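For readers who want to see how the two kinds of summary discussed in this exchange differ, here is a minimal Python sketch. It assumes each subject contributes one record of chronological and predicted ages at two visits. The function name, the record layout, the sign convention for the error (predicted change minus elapsed time), the treatment of ties, and the toy numbers are all illustrative assumptions; this is not the authors' code or data.

```python
from statistics import mean, stdev


def summarize_visit_pairs(pairs):
    """Summarize two-visit records.

    Each record is (age_1, age_2, pred_1, pred_2) in years, where visit 2
    is the later visit. The record layout is an illustrative assumption.
    """
    if len(pairs) < 2:
        raise ValueError("need at least two subjects to compute an SD")
    # Error = change in predicted age minus elapsed calendar time.
    errors = [(p2 - p1) - (a2 - a1) for a1, a2, p1, p2 in pairs]
    # A pair counts as correctly ordered when the later visit receives the
    # strictly higher predicted age (ties count as incorrect in this sketch).
    n_ordered = sum(1 for _, _, p1, p2 in pairs if p2 > p1)
    return {
        "mean_error": mean(errors),
        "sd_error": stdev(errors),  # sample standard deviation
        "ordering_accuracy": n_ordered / len(pairs),
    }


# Toy records only (hypothetical values, not EyePACS data).
toy_pairs = [
    (50.0, 50.5, 49.0, 50.5),
    (61.0, 62.0, 60.5, 60.5),
    (44.0, 45.5, 45.0, 47.5),
    (70.0, 72.0, 69.0, 71.0),
]
summary = summarize_visit_pairs(toy_pairs)
print(summary)
```

The sketch shows that ordering accuracy uses only the sign of the predicted change, whereas the error mean and SD depend on its magnitude, so a model can order most visit pairs correctly while still estimating the elapsed time imprecisely.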

In addition, the authors should revise the statement in the abstract:

"Longitudinal studies showed that the resulting models were able to predict individuals' aging in time-scales less than a year with 71% accuracy".

In fact, it appears that the accuracy of prediction is just slightly under 3% (i.e. 0.17^2). What the algorithm could do with 71% accuracy was correctly order two repeated observations in time. The abstract should be revised.
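The reviewer's figure comes from squaring the reported Pearson correlation, which gives the proportion of variance in one quantity that is linearly accounted for by the other. A short Python check of that arithmetic, using only the correlation value quoted above (the variable names are illustrative):

```python
# Pearson correlation reported in the author response.
r = 0.17

# Coefficient of determination: share of variance explained linearly.
r_squared = r ** 2

print(f"r^2 = {r_squared:.4f} ({r_squared:.1%} of variance)")
```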

We have removed the following statement from the abstract to avoid any confusion:

“Longitudinal studies showed that the resulting models were able to predict individuals’ aging in time-scales less than a year with 71% accuracy.”

https://doi.org/10.7554/eLife.82364.sa2

Article and author information

Author details

  1. Sara Ahadi

    Google Research, Mountain View, United States
    Contribution
    Conceptualization, Data curation, Formal analysis, Supervision, Investigation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing
    For correspondence
    saraahadi@gmail.com
    Competing interests
    is not currently affiliated with Google Research, however work for this manuscript was conducted while affiliated with Google Research. The author has no other competing interests to declare
    ORCID: 0000-0002-7849-2135
  2. Kenneth A Wilson

    Buck Institute for Research on Aging, Novato, United States
    Contribution
    Formal analysis, Validation, Visualization, Writing – original draft, Writing – review and editing, Investigation
    Contributed equally with
    Boris Babenko and Cory Y McLean
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-3227-9977
  3. Boris Babenko

    Google Health, Palo Alto, United States
    Contribution
    Data curation, Software, Formal analysis, Visualization, Methodology, Writing – original draft, Writing – review and editing
    Contributed equally with
    Kenneth A Wilson and Cory Y McLean
    Competing interests
    is affiliated with Google Health. The author has no other competing interests to declare
  4. Cory Y McLean

    Google Health, Cambridge, United States
    Contribution
    Data curation, Software, Formal analysis, Visualization, Methodology, Writing – original draft, Writing – review and editing
    Contributed equally with
    Kenneth A Wilson and Boris Babenko
    For correspondence
    cym@google.com
    Competing interests
    is affiliated with Google Health. The author has no other competing interests to declare
    ORCID: 0000-0001-9928-8216
  5. Drew Bryant

    Google Research, Mountain View, United States
    Contribution
    Formal analysis
    Competing interests
    is affiliated with Google Research. The author has no other competing interests to declare
  6. Orion Pritchard

    Google Research, Mountain View, United States
    Contribution
    Formal analysis
    Competing interests
    is affiliated with Google Research. The author has no other competing interests to declare
  7. Ajay Kumar

    Department of Biophysics, Post Graduate Institute of Medical Education and Research, Chandigarh, India
    Contribution
    Writing – original draft
    Competing interests
    No competing interests declared
  8. Enrique M Carrera

    Buck Institute for Research on Aging, Novato, United States
    Contribution
    Validation
    Competing interests
    No competing interests declared
  9. Ricardo Lamy

    Department of Ophthalmology, Zuckerberg San Francisco General Hospital and Trauma Center, San Francisco, United States
    Contribution
    Interpretation of results
    Competing interests
    No competing interests declared
  10. Jay M Stewart

    Department of Ophthalmology, University of California, San Francisco, San Francisco, United States
    Contribution
    Interpretation of results
    Competing interests
    No competing interests declared
  11. Avinash Varadarajan

    Google Health, Palo Alto, United States
    Contribution
    Conceptualization, Resources, Data curation, Software, Formal analysis
    Competing interests
    is affiliated with Google Health. The author has no other competing interests to declare
  12. Marc Berndl

    Google Research, Mountain View, United States
    Contribution
    Conceptualization, Formal analysis, Supervision, Visualization, Methodology
    Competing interests
    is affiliated with Google Research. The author has no other competing interests to declare
  13. Pankaj Kapahi

    Buck Institute for Research on Aging, Novato, United States
    Contribution
    Conceptualization, Supervision, Funding acquisition, Validation, Methodology, Writing – original draft, Writing – review and editing
    Contributed equally with
    Ali Bashir
    For correspondence
    Pkapahi@buckinstitute.org
    Competing interests
    Reviewing editor, eLife
    ORCID: 0000-0002-5629-4947
  14. Ali Bashir

    Google Research, Mountain View, United States
    Contribution
    Conceptualization, Data curation, Formal analysis, Supervision, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing
    Contributed equally with
    Pankaj Kapahi
    Competing interests
    is not currently affiliated with Google Research, however work for this manuscript was conducted while affiliated with Google Research. The author has no other competing interests to declare

Funding

NIH (T32AG000266-23)

  • Kenneth A Wilson

NIH (R01AG038688)

  • Pankaj Kapahi

NIH (AG045835)

  • Pankaj Kapahi

Larry L. Hillblom Foundation

  • Pankaj Kapahi

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This research has been conducted with the UK Biobank resource application 17643. We thank Jorge Cuadros from EyePACS for data access and helpful conversations. KAW is supported by NIH T32AG000266-23. We thank the Bloomington Drosophila Stock Center for providing flies used in this study. This work is funded by grants awarded to PK from the Reta Haynes Foundation, American Federation of Aging Research, NIH grants R01AG038688 and AG045835, and the Larry L. Hillblom Foundation.

Ethics

The UK Biobank study was reviewed and approved by the North West Multi-Centre Research Ethics Committee. For the EyePACS study, ethics review and IRB exemption were obtained using Quorum Review IRB (Seattle, WA).

Senior Editor

  1. Carlos Isales, Augusta University, United States

Reviewing Editor

  1. Sara Hägg, Karolinska Institutet, Sweden

Reviewer

  1. Sara Hägg, Karolinska Institutet, Sweden

Version history

  1. Preprint posted: July 27, 2022
  2. Received: August 2, 2022
  3. Accepted: March 22, 2023
  4. Accepted Manuscript published: March 28, 2023 (version 1)
  5. Version of Record published: April 17, 2023 (version 2)

Copyright

© 2023, Ahadi et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

Sara Ahadi, Kenneth A Wilson, Boris Babenko, Cory Y McLean, Drew Bryant, Orion Pritchard, Ajay Kumar, Enrique M Carrera, Ricardo Lamy, Jay M Stewart, Avinash Varadarajan, Marc Berndl, Pankaj Kapahi, Ali Bashir (2023)
Longitudinal fundus imaging and its genome-wide association analysis provide evidence for a human retinal aging clock
eLife 12:e82364.
https://doi.org/10.7554/eLife.82364
