Human genetic analyses of organelles highlight the nucleus in age-related trait heritability

Abstract
eLife digest
Introduction
Results
Discussion
Materials and methods
Appendix 1
Data availability
References
Article and author information
Metrics

Abstract

Most age-related human diseases are accompanied by a decline in cellular organelle integrity, including impaired lysosomal proteostasis and defective mitochondrial oxidative phosphorylation. An open question, however, is the degree to which inherited variation in or near genes encoding each organelle contributes to age-related disease pathogenesis. Here, we evaluate if genetic loci encoding organelle proteomes confer greater-than-expected age-related disease risk. As mitochondrial dysfunction is a ‘hallmark’ of aging, we begin by assessing nuclear and mitochondrial DNA loci near genes encoding the mitochondrial proteome and surprisingly observe a lack of enrichment across 24 age-related traits. Within nine other organelles, we find no enrichment with one exception: the nucleus, where enrichment emanates from nuclear transcription factors. In agreement, we find that genes encoding several organelles tend to be ‘haplosufficient,’ while we observe strong purifying selection against heterozygous protein-truncating variants impacting the nucleus. Our work identifies common variation near transcription factors as having outsize influence on age-related trait risk, motivating future efforts to determine if and how this inherited variation then contributes to observed age-related organelle deterioration.

eLife digest

Getting older increases our risk of experiencing a wide range of diseases, such as diabetes, heart disease and neurodegenerative disease. The genetic variations that we inherit from our parents play a major role in predicting this risk. However, the biological networks involved in this process are extremely complex and remain challenging to decipher.

Prior studies have suggested that specialised structures inside our body’s cells, called organelles, may have an important role to play in aging. Organelles represent self-contained biological factories inside each cell, designed to perform specific tasks. Examples include the nucleus, which harbours most of the cell’s genetic material, and mitochondria, which help provide cells with energy.

Organelles tend to deteriorate and become dysfunctional with age, and mitochondria in particular are badly affected by the ageing process. A decline in organelle activity has been thought to explain ageing and the development of age-related diseases. However, this has never been systematically tested on a large scale at the inherited genetic level.

Gupta et al. assessed whether common inherited genetic variation in genes associated with ten different organelles could affect the risk of age-related disease, using a database of DNA samples from more than 300,000 individuals. They considered 24 diseases and traits that become more common with advanced age.

Gupta et al. discovered that inherited variants in or near genes associated with the nucleus were consistently linked to age-related disease risks. Most of this signal arose from genes encoding the nuclear transcription factors, proteins that help to control the rate at which genes are expressed. However, variants in genes associated with other organelles, including mitochondria, did not appear to be linked to age-related diseases.

This research suggests that inherited variation in transcription factors in the nucleus could act as genetic levers that increase the risk of common, age-related diseases. It also suggests that common genetic variation in other cellular organelles may not be as heavily involved in the development of such diseases. Such insights into the cellular structures and biological pathways involved in ageing and age-related disease also establish new targets for drugs to prevent or treat disease.

Introduction

The global burden of age-related diseases such as type 2 diabetes (T2D), Parkinson’s disease (PD), and cardiovascular disease (CVD) has been steadily rising due in part to a progressively aging population. These diseases are often highly heritable: for example, narrow-sense heritabilities were recently estimated as 56% for T2D, 46% for general hypertension, and 41% for atherosclerosis (Wang et al., 2017). Genome-wide association studies (GWAS) have led to the discovery of thousands of robust associations with common genetic variants (Claussnitzer et al., 2020), implicating a complex genetic architecture as underlying much of the heritable risk. These loci hold the potential to reveal underlying mechanisms of disease and spotlight targetable pathways.

Aging has been associated with dysfunction in many cellular organelles (López-Otín et al., 2013). Dysregulation of autophagic proteostasis, for which the lysosome is central, has been implicated in myriad age-related disorders including neurodegeneration, heart disease, and aging itself (Mizushima et al., 2008), and mouse models deficient for autophagy in the central nervous system show neurodegeneration (Hara et al., 2006; Komatsu et al., 2006). Endoplasmic reticular (ER) stress has been invoked as central to metabolic syndrome and insulin resistance in T2D (Ozcan et al., 2004). Disruption in the nucleus through increased gene regulatory noise from epigenetic alterations (López-Otín et al., 2013) and elevated nuclear envelope 'leakiness' (D'Angelo et al., 2009) has been implicated in aging. Dysfunction in the mitochondria has even been invoked as a ‘hallmark’ of aging (López-Otín et al., 2013) and has been observed in many common age-associated diseases (Lane et al., 2015; Petersen et al., 2004; Mootha et al., 2003; Schapira et al., 1990; Bender et al., 2006; Wanagat et al., 2001; Ashar et al., 2017). In particular, deficits in mitochondrial oxidative phosphorylation (OXPHOS) have been documented in aging and age-related diseases as evidenced by in vivo (Estrada et al., 2012) P-NMR measures (Petersen et al., 2004; Fleischman et al., 2010), enzymatic activity (Mootha et al., 2003; Schapira et al., 1990; Fannin et al., 1999; Trounce et al., 1989; Kelley et al., 2002; Patti et al., 2003; Stump et al., 2003) in biopsy material, accumulation of somatic mitochondrial DNA (mtDNA) mutations (Bender et al., 2006; Wanagat et al., 2001; Taylor et al., 2003), and a decline in mtDNA copy number (mtCN) (Ashar et al., 2017).

Given that a decline in organelle function is observed in age-related disease, a natural question is whether inherited variation in loci encoding organelles is enriched for age-related disease risk. Although it has long been known that recessive mutations leading to defects within many cellular organelles can lead to inherited syndromes (e.g. mutations in >300 nuclear DNA (nucDNA)-encoded mitochondrial genes lead to inborn mitochondrial disease; Frazier et al., 2019), it is unknown how this extends to common disease. In the present study, we use a human genetics approach to assess common variation in loci relevant to the function of ten cellular organelles. We begin with a deliberate focus on mitochondria given the depth of literature linking it to age-related disease, interrogating both nucDNA and mtDNA loci that contribute to the organelle’s proteome. This genetic approach is supported by the observation that heritability estimates of measures of mitochondrial function are substantial (33–65%; Curran et al., 2007; Xing et al., 2008). We then extend our analyses to nine additional organelles.

To our surprise, we find no evidence of enrichment for genome-wide association signal in or near mitochondrial genes across any of our analyses. Further, of 10 tested organelles, only the nucleus shows enrichment among many age-associated traits, with the signal emanating primarily from the transcription factors (TFs). Further analysis shows that genes encoding the mitochondrial proteome tend to be tolerant to heterozygous predicted loss-of-function (pLoF) variation and thus are surprisingly ‘haplosufficient’ – that is, show little fitness cost with heterozygous pLoF. In contrast, nuclear TFs are especially sensitive to gene dosage and are often ‘haploinsufficient,’ showing substantial purifying selection against heterozygous pLoF. Thus, our work highlights inherited variation influencing gene-regulatory pathways, rather than organelle physiology, in the inherited risk of common age-associated diseases.

Results

Age-related diseases and traits show diverse genetic architectures

To systematically define age-related diseases, we turned to recently published epidemiological data from the United Kingdom (U.K.) (Kuan et al., 2019) in order to match U.K. Biobank (UKB) (Sudlow et al., 2015) cohort. We prioritized traits whose prevalence increased as a function of age (Materials and methods) and were represented in UKB (https://github.com/Nealelab/UK_Biobank_GWAS) and/or had available published GWAS meta-analyses (Teslovich et al., 2010; Ehret et al., 2011; Manning et al., 2012; Morris et al., 2012; Schunkert et al., 2011; Estrada et al., 2012; Christophersen et al., 2017; Pattaro et al., 2016; Nalls et al., 2019; Lambert et al., 2013; Figure 1A, Appendix 1). We used SNP-heritability estimates from stratified linkage disequilibrium score regression (S-LDSC, https://github.com/bulik/ldsc) (Finucane et al., 2015) to ensure that our selected traits were sufficiently heritable (Supplementary file 1, Materials and methods, Appendix 1), observing heritabilities across UKB and meta-analysis traits as high as 0.28 (bone mineral density), all with heritability Z-score > 4. We then computed pairwise genetic and phenotypic correlations between the age-associated traits to compare their respective genetic architectures and phenotypic relationships (Figure 1B, Materials and methods). In general, genetic correlations were greater in magnitude than respective phenotypic correlations, potentially as GWAS are less sensitive to purely non-genetic factors that may influence phenotypes (e.g. measurement error). As expected we find a highly correlated module of primarily cardiometabolic traits with high density lipoprotein (HDL) showing anti-correlation (Bulik-Sullivan et al., 2015). Interestingly, several other traits (gastroesophageal reflux disease (GERD), osteoarthritis) showed moderate genetic correlation to the cardiometabolic trait cluster while atrial fibrillation, for which T2D and CVD are risk factors (Wasmer et al., 2017), showed phenotypic, but not genetic, correlation. Our final set of prioritized, age-associated traits included 24 genetically diverse, heritable phenotypes (Supplementary file 1). Of these, 11 traits were sufficiently heritable only in UKB, three were sufficiently heritable only among non-UKB meta-analyses, and 10 were well-powered in both UKB and an independent cohort.

Figure 1

Download asset Open asset

Selection of genetically diverse age-related diseases and traits using epidemiological data.

(A) Period prevalence of age-associated diseases systematically selected for this study (Materials and methods). Epidemiological data obtained from Kuan et al., 2019. (B) Genetic (lower half) and phenotypic (upper half) correlation between the selected age-related traits. All correlations were assessed between UK Biobank phenotypes with the exception of eGFR, Alzheimer’s Disease, and Parkinson’s Disease, for which the respective meta-analyses were used (Materials and methods). Grey ‘o’ in phenotypic correlations indicate phenotypes not tested within UKB for which individual-level data was not available. All data displayed in this panel are available in Figure 1—source data 1. * represents correlations that are significantly different from 0 at a Bonferroni-corrected threshold for p = 0.05 across all tested traits.

Figure 1—source data 1 Genetic and phenotypic correlation point estimates and standard errors.: https://cdn.elifesciences.org/articles/68610/elife-68610-fig1-data1-v3.xlsx
Download elife-68610-fig1-data1-v3.xlsx

Mitochondrial genes are not enriched among age-related trait GWAS

To test if age-related trait heritability was enriched among mitochondria-relevant loci, we began by simply asking if ~1100 nucDNA genes encoding the mitochondrial proteome from the MitoCarta2.0 inventory (Calvo et al., 2016) were found near lead SNPs for our selected traits represented in the NHGRI-EBI GWAS Catalog (https://www.ebi.ac.uk/gwas/) (MacArthur et al., 2017) more frequently than expectation (Materials and methods, Appendix 1). To our surprise, no traits showed a statistically significant enrichment of mitochondrial genes (Figure 2—figure supplement 1A); in fact, six traits showed a statistically significant depletion. Even more strikingly, MitoCarta genes tended to be nominally enriched in fewer traits than the average randomly selected sample of protein-coding genes (Figure 2—figure supplement 1B, empirical p = 0.014). This lack of enrichment was observed more broadly across virtually all traits represented in the GWAS Catalog (Figure 2—figure supplement 1C). We also examined specific transcriptional regulators of mitochondrial biogenesis (TFAM, GABPA, GABPB1, ESRRA, YY1, NRF1, PPARGC1A, PPARGC1B) and found very little evidence supporting a role for these genes in modifying risk for the age-related GWAS Catalog phenotypes (Appendix 1).

To investigate further, we turned to U.K. Biobank (UKB). We compiled and tested loci encoding the mitochondrial proteome (Figure 2A) with which we interrogated the association between common mitochondrial variation and common disease. First, we considered all common variants in or near nucDNA MitoCarta genes, as well as two subsets of MitoCarta: mitochondrial Mendelian disease genes (Frazier et al., 2019) and nucDNA-encoded OXPHOS genes. Second, we obtained and tested mtDNA genotypes at up to 213 loci after quality control (Materials and methods) from 360,662 individuals for associations with age-related traits.

Figure 2 with 9 supplements see all

Download asset Open asset

Assessment of the association of nucDNA and mtDNA loci contributing to the mitochondrial proteome with age-related traits.

(A) Scheme outlining the aspects of mitochondrial function assessed in this study. nucDNA loci contributing to the mitochondrial proteome are shown in teal, while mtDNA loci are shown in pink. (B) S-LDSC enrichment p-values on top of the baseline model in UKB. Inset labels represent gene-set size; dotted line represents BH FDR 0.1 threshold. (C) Visualization of mtDNA variants and associations with age-related diseases. The outer-most track represents the genetic architecture of the circular mtDNA. The heatmap track represents the log-scaled number of individuals with an alternate genotype at each site. The inner track represents mitochondrial genome-wide association p-values, with radial angle corresponding to position on the mtDNA and magnitude representing –log₁₀ p-value. Dotted line represents Bonferroni cutoff for all tested trait-variant pairs. (D) Replication of S-LDSC enrichment results in meta-analyses. Dotted line represents BH FDR 0.1 threshold. * represents traits for which sufficiently well-powered cohorts from both UKB and meta-analyses were available. The trait color legend to the right of panel (C) applies to panels (B) and (C), representing UKB traits. S-LDSC enrichment p-values plotted in (B) and (D) are available in Source data 1; mtDNA-GWAS summary statistics are available in Source data 2.

First, we used S-LDSC (Finucane et al., 2015; Finucane et al., 2018) and MAGMA (https://ctg.cncr.nl/software/magma) (de Leeuw et al., 2015), two robust methods that can be used to assess gene-based heritability enrichment accounting for LD and several confounders, to test if there was any evidence of heritability enrichment among MitoCarta genes (Materials and methods). We found no evidence of enrichment near nucDNA MitoCarta genes for any trait tested in UKB using S-LDSC (Figure 2B, Figure 2—figure supplement 2A), consistent with our results from the GWAS Catalog. We replicated this lack of enrichment using MAGMA at two different window sizes (Figure 2—figure supplement 2C, Figure 2—figure supplement 2E; all q > 0.1).

Given the lack of enrichment among the MitoCarta genes, we wanted to (1) verify that our selected methods could detect previously reported enrichments and (2) confirm that common variation in or near MitoCarta genes could lead to expression-level perturbations. We first successfully replicated previously reported enrichment among tissue-specific genes for key traits using both S-LDSC (Figure 2—figure supplement 3, Figure 2—figure supplement 4) and MAGMA (Figure 2—figure supplement 5, Figure 2—figure supplement 6, Appendix 1, Materials and methods). We next confirmed that we had sufficient power using both S-LDSC and MAGMA to detect physiologically relevant enrichment effect sizes among MitoCarta genes (Figure 2—figure supplement 7, Materials and methods, Appendix 1). We finally examined the landscape of cis-expression QTLs (eQTLs) for these genes and found that almost all MitoCarta genes have cis-eQTLs in at least one tissue and often have cis-eQTLs in more tissues than most protein-coding genes (Figure 2—figure supplement 8, Materials and methods, Appendix 1). Hence, our selected methods could detect physiologically relevant heritability enrichments among our selected traits at gene-set sizes comparable to that of MitoCarta, and common variants in or near MitoCarta genes exerted cis-control on gene expression.

Next, we considered mtDNA loci genotyped in UKB, obtaining calls for up to 213 common variants passing quality control across 360,662 individuals (Materials and methods, Appendix 1). We found no significant associations on the mtDNA for any of the 21 age-related traits available in UKB using linear or logistic regression (Materials and methods, Figure 2C, Figure 2—figure supplement 9; Source data 2).

As a control and to validate our approach, we also performed mtDNA-GWAS for specific traits with previously reported associations. A recent analysis of ~147,437 individuals in BioBank Japan revealed four distinct traits with significant mtDNA associations (Yamamoto et al., 2020). Of these, creatinine and aspartate aminotransferase (AST) had sufficiently large sample sizes in UKB. We observed a large number of associations throughout the mtDNA for both traits (p < 1.15 * 10^-5, Figure 2—figure supplement 9E). Thus, our mtDNA association method was able to replicate robust mtDNA associations among well-powered traits.

We sought to replicate our negative results in an independent cohort. We turned to published GWAS meta-analyses (Teslovich et al., 2010; Ehret et al., 2011; Manning et al., 2012; Morris et al., 2012; Schunkert et al., 2011; Estrada et al., 2012; Christophersen et al., 2017; Pattaro et al., 2016; Nalls et al., 2019; Lambert et al., 2013; Supplementary file 1) and successfully replicated the lack of enrichment for MitoCarta genes across all 10 traits with an available independent cohort GWAS using S-LDSC (Figure 2D, Figure 2—figure supplement 2B) and MAGMA (Figure 2—figure supplement 2D, Appendix 1; all q > 0.1). Importantly, while we were unable to pursue analyses for PD and Alzheimer’s disease in UKB due to limited case counts, we tested MitoCarta genes among well-powered meta-analyses for these disorders (Appendix 1) and observed no enrichment (Figure 2D; all q > 0.1).

In summary, we tested (1) nucDNA loci near genes that encode the mitochondrial proteome in the GWAS Catalog, UKB, and GWAS meta-analyses, (2) transcriptional regulators of mitochondrial biogenesis in the GWAS Catalog, and (3) mtDNA variants in UKB. We found no convincing evidence of heritability enrichment for common age-associated diseases near these mitochondrial loci.

Of all tested organelles, only the nucleus shows enrichment for age-related trait heritability

We next asked whether heritability for age-related diseases and traits clusters among loci associated with any cellular organelle. We used the COMPARTMENTS database (https://compartments.jensenlab.org) to define gene-sets corresponding to the proteomes of nine additional organelles (Binder et al., 2014) besides mitochondria (Materials and methods). We used S-LDSC to produce heritability estimates for these categories in the UKB age-related disease traits, finding evidence of heritability enrichment in many traits for genes comprising the nuclear proteome (Figure 3A, Materials and methods). No other tested organelles showed evidence of heritability enrichment. Variation in or near genes comprising the nuclear proteome explained over 50% of disease heritability on average despite representing only ~35% of tested SNPs (Figure 3—figure supplement 1, Appendix 1). We successfully replicated this pattern of heritability enrichment among organelles using MAGMA in UKB at two window sizes (Figure 3—figure supplement 2A, Figure 3—figure supplement 2B), again finding enrichment only among genes related to the nucleus.

Figure 3 with 8 supplements see all

Download asset Open asset

Heritability enrichment of organellar proteomes across age-related disease in UK Biobank.

(A) Quantile-quantile plot of heritability enrichment p-values atop the baseline model for gene-sets representing organellar proteomes, with black line representing expected null p-values following the uniform distribution and shaded ribbon representing 95% CI. (B) Scheme of spatially distinct disjoint subsets of the nuclear proteome as a strategy to characterize observed enrichment of the nuclear proteome. Numbers represent gene-set size. (C) S-LDSC enrichment p-values for spatial subsets of the nuclear proteome computed atop the baseline model. (D) S-LDSC enrichment p-values for TFs and all other nucleus-localizing proteins. Inset numbers represent gene-set sizes, black lines represent cutoff at BH FDR < 10%. * represents traits for which sufficiently well-powered cohorts from both UKB and meta-analyses were available. Enrichment p-values and coefficients are available in Source data 1.

Much of the nuclear enrichment signal emanates from transcription factors

With over 6000 genes comprising the nuclear proteome, we considered largely disjoint subsets of the organelle’s proteome to trace the source of the enrichment signal (The Gene Ontology Consortium et al., 2019; Ashburner et al., 2000; Lambert et al., 2018; Figure 3B, Materials and methods, Appendix 1). We found significant heritability enrichment within the set of 1804 genes whose protein products are annotated to localize to the chromosome itself (q < 0.1 for nine traits, Figure 3C, Figure 3—figure supplement 3A). Further partitioning revealed that much of this signal is attributable to the subset classified as TFs (Lambert et al., 2018) (1523 genes, q < 0.1 for 10 traits, Figure 3D, Figure 3—figure supplement 3B). We replicated these results using MAGMA in UKB at two window sizes (Figure 3—figure supplement 2), and also replicated enrichments among TFs in several (but not all) corresponding meta-analyses (Figure 3—figure supplement 4) despite reduced power (Figure 2—figure supplement 7H). We generated functional subdivisions of the TFs (Materials and methods, Appendix 1), finding that the non-zinc finger TFs showed enrichment for a highly similar set of traits to those enriched for the whole set of TFs (Figure 3—figure supplement 5D, Figure 3—figure supplement 6B, Figure 3—figure supplement 7B, Figure 3—figure supplement 8B). Interestingly, the KRAB domain-containing zinc fingers (KRAB ZFs) (Kapopoulou et al., 2016), which are recently evolved (Figure 3—figure supplement 5H), were largely devoid of enrichment even compared to non-KRAB ZFs (Figure 3—figure supplement 5E, Figure 3—figure supplement 6C, Figure 3—figure supplement 7C, Figure 3—figure supplement 8C). Thus, we find that variation within or near non-KRAB domain-containing TF genes has an outsize influence on age-associated disease heritability.

We next turned to recently published GWAS assessing parental lifespan (Timmers et al., 2019) and ‘healthspan’ via first morbidity hazard (Zenin et al., 2019). Both traits showed highly significant heritability via S-LDSC ( $h^{2} (s . e .) =$ 0.0265 (0.0019) and 0.0348 (0.003) respectively, Materials and methods). Enrichment analysis of organelles among these traits revealed a significant enrichment for the nucleus for parental lifespan (p = 0.0003) using MAGMA (Figure 4). While we observed only a nominally ‘suggestive’ enrichment for the nucleus for healthspan (p = 0.058), S-LDSC showed significant nuclear heritability enrichment (p = 0.0016, Figure 4—figure supplement 1). Analysis of spatial subsets of the nuclear proteome showed significant enrichment for TFs and proteins localizing to the chromosome in both aging phenotypes using MAGMA (Figure 4) and for healthspan using S-LDSC (Figure 4—figure supplement 1).

Figure 4 with 1 supplement see all

Download asset Open asset

Enrichment of organellar proteomes within parental lifespan and healthspan as proxies for aging.

Upper panels represent organelle proteomes; lower panels represent spatial subsets of the nuclear proteome. Numbers atop each bar represent gene-set sizes. Dashed lines represent cutoff at BH FDR < 10%, dotted lines represent nominal p = 0.05. p-Values and coefficients available in Source data 3.

Mitochondrial genes tend to be more ‘haplosufficient’ than genes encoding other organelles

In light of observing heritability enrichment only among nuclear transcription factors, we wanted to determine if the fitness cost of pLoF variation in genes across cellular organelles mirrored our results. Mitochondria-localizing genes and TFs play a central role in numerous Mendelian diseases (Frazier et al., 2019; Jimenez-Sanchez et al., 2001; Worman and Courvalin, 2002; Cleaver, 1994), so we initially hypothesized that genes belonging to either category would be under significant purifying selection (i.e., constraint). We obtained constraint metrics from gnomAD (https://gnomad.broadinstitute.org) (Karczewski et al., 2020) as the LoF observed/expected fraction (LOEUF). In agreement with our GWAS enrichment results, we observed that the mitochondrion on average is one of the least constrained organelles we tested, in stark contrast to the nucleus (Figure 5A). In fact, the nucleus was second only to the set of 'haploinsufficient' genes (defined based on curated human clinical genetic data; Karczewski et al., 2020, Materials and methods) in the proportion of its genes in the most constrained decile, while the mitochondrion lay on the opposite end of the spectrum (Figure 5B). Interestingly, even the Mendelian mitochondrial disease genes had a high tolerance to pLoF variation on average in comparison to TFs (Figure 5C). Even across different categories of TFs, we observed that highly constrained TF subsets tend to show GWAS enrichment (Figure 5-Figure supplement 1, Figure 3-Figure supplement 5E) relative to unconstrained subsets for our tested traits. Indeed, explicit inclusion of LOEUF as a covariate in the enrichment analysis model (Materials and methods) reduced the significance of (but did not eliminate) the enrichment seen for the TFs (Figure 5-Figure supplement 2B, Figure 5-Figure supplement 2E, Figure 5-Figure supplement 2F). Thus, while disruption in both mitochondrial genes and TFs can produce rare disease, the fitness cost of heterozygous variation in mitochondrial genes appears to be far lower than that among TFs. This dichotomy reflects the contrasting enrichment results between mitochondrial genes and TFs and supports the importance of gene regulation as it relates to evolutionary conservation.

Figure 5 with 3 supplements see all

Download asset Open asset

Differences in constraint distribution across organelles.

(A) Constraint as measured by LOEUF from gnomAD v2.1.1 for genes comprising organellar proteomes, book-ended by distributions for known haploinsufficient genes as well as olfactory receptors. Lower values indicate genes exacting a greater organismal fitness cost from a heterozygous LoF variant (greater constraint). (B) Proportion of each gene-set found in the lowest LOEUF decile. Higher values indicate gene-sets containing more highly constrained genes. (C) Constraint distributions for subsets of the nuclear-encoded mitochondrial proteome (red) and subsets of the nucleus (teal). Black points represent the mean with 95% CI. Inset numbers represent gene-set size.

Discussion

Pathology in cellular organelles has been widely documented in age-related diseases (López-Otín et al., 2013; Ozcan et al., 2004; Colacurcio and Nixon, 2016; Kanfi et al., 2010; Blasco, 2007; Bhattarai et al., 2020). Using a human genetics approach, here we report the unexpected discovery that except for the nucleus, cellular organelles tend not to be enriched in genetic associations for common, age-related diseases. We started with a focus on the mitochondria as a decline in mitochondrial abundance and activity has long been reported as one of the most consistent correlates of aging (Wanagat et al., 2001; Fleischman et al., 2010; Trounce et al., 1989; Taylor et al., 2003) and age-associated diseases (Petersen et al., 2004; Mootha et al., 2003; Schapira et al., 1990; Bender et al., 2006; Ashar et al., 2017; Fannin et al., 1999; Kelley et al., 2002; Patti et al., 2003; Stump et al., 2003). We tested common variants contributing to the mitochondrial proteome on the nucDNA and mtDNA and found no convincing evidence of heritability enrichment in any tested trait, cohort, or method. We systematically expanded our analysis to survey 10 organelles and found that only the nucleus showed enrichment, with much of this signal originating from nuclear TFs. Constraint analysis showed a substantial fitness cost to heterozygous loss-of-function mutations in genes encoding the nuclear proteome, whereas genes encoding the mitochondrial proteome were ‘haplosufficient’.

Here, we focus on enrichment to place the complex genetic architectures of age-related traits in a broader biological context and prioritize pathways for follow-up. For these highly polygenic traits, any large fraction of the genome may explain a statistically significant amount of disease heritability (de Leeuw et al., 2016; Loh et al., 2015), and indeed associations between individual organelle-relevant loci and certain common diseases have been identified previously (Billingsley et al., 2019; Kraja et al., 2019). For example, variants in the endoplasmic reticular genes WFS1 and ATF6B and the mitochondrial gene ATP5G1 have been associated with common T2D (Xue et al., 2018). These genes are present in the respective organelle gene-sets, however unlike TFs, neither the endoplasmic reticulum nor the mitochondrion showed enrichment for T2D. Importantly, both MAGMA and S-LDSC are capable of detecting an enrichment even in a highly polygenic background. Both methods have been used in the past to identify biologically plausible disease-relevant tissues (Finucane et al., 2015; Finucane et al., 2018) and pathway enrichments (Jansen et al., 2019; Pardiñas et al., 2018) in traits across the spectrum of polygenicity, and we identify enrichments among disease-relevant tissues using both methods in several highly polygenic traits.

While previous work has shown that common disease GWAS can be enriched for expression in specific disease-relevant organs (Finucane et al., 2018; Maurano et al., 2012), our data suggest that this framework does not generally extend from organs to organelles. This finding contrasts with our classical nosology of inborn errors of metabolism that tend to be mapped to ‘causal’ organelles, for example, lysosomal storage diseases, disorders of peroxisomal biogenesis, and mitochondrial OXPHOS disorders. The observed enrichment for TFs within the nucleus indicates that common variation influencing genome regulation impacts common disease risk more than variation influencing individual organelles.

Our analysis of common inherited mitochondrial variation represents, to our knowledge, the most comprehensive joint assessment of mitochondria-relevant nucDNA and mtDNA variation in age-related diseases. We replicated mtDNA associations with creatinine and AST observed previously in BioBank Japan (Yamamoto et al., 2020), further supporting our approach. While individual mtDNA variants have been previously associated with certain traits (Raule et al., 2007; Yu et al., 2008; Hudson et al., 2013a), these associations appear to be conflicting in the literature, perhaps because of limited power and/or uncontrolled confounding biases such as population stratification (Samuels et al., 2006; Biffi et al., 2010). Our negative results are surprising, but they are compatible with a prior enrichment analysis focused on T2D (Segrè et al., 2010) as well as a small number of isolated reports interrogating either mitochondria-relevant nucDNA (Segrè et al., 2010) or mtDNA (Yamamoto et al., 2020; Saxena et al., 2006; Hudson et al., 2014; Hudson et al., 2013b) loci in select diseases.

To our knowledge, we are the first to systematically document heterogeneity in average pLoF across cellular organelles. That MitoCarta genes are ‘haplosufficient’ and pLoF tolerant (Figure 5A) is consistent with the observation that most of the ~300 inborn mitochondrial disease genes produce disease with recessive inheritance (Frazier et al., 2019) and healthy parents. The few mitochondrial disorders that show autosomal dominant inheritance are nearly always due to dominant negativity rather than haploinsufficiency. The intolerance of TFs to pLoF variation (Figure 5C) provide a stark contrast to the results from the mitochondria that is borne out in their associated Mendelian disease syndromes: TFs are known to be haploinsufficient (Seidman and Seidman, 2002) and even regulatory variants modulating their expression can produce severe Mendelian disease (van der Lee et al., 2020). We observe enrichment among TFs for 10 different diseases as well as parental lifespan and healthspan, consistent with observed elevated purifying selection against pLoF variants in these genes. Our enrichment results combined with pLoF intolerance suggest that variation among TFs may produce disease-associated variants with larger effect sizes than expectation, underscoring their importance as genetic ‘levers’ for common disease heritability.

Why are mitochondria so robust to variation in gene dosage (Figure 5) and hence ‘haplosufficient?’ We propose two possibilities. First, mitochondrial pathways tend to be highly interconnected, and it was already proposed by Wright, 1934 and later by Kacser and Burns, 1981 that haplosufficiency arises as a consequence of physiology, that is, system output is inherently buffered against the partial loss of a single gene due to the network organization of metabolic reactions. Kacser and Burns in fact explicitly mention that noncatalytic gene products fall outside their framework, and we believe that our finding that nucleus-localizing and cytoskeletal genes are the two most pLoF-intolerant compartments is consistent with their assessment. Second, mitochondria were formerly autonomous microbes and hence may have retained vestigial layers of ‘intra-organelle buffering’ against genetic variation. Numerous feedback control mechanisms, including respiratory control (Chance and Williams, 1955), help to ensure organelle robustness across physiological extremes (Vafai and Mootha, 2012; Balaban et al., 1986). In fact, a recent CRISPR screen showed that of the genes for which knock-out modified survival under a mitochondrial poison, there is a striking over-representation of genes that themselves encode mitochondrial proteins (To et al., 2019).

Throughout this study, we have tested for enrichment among inherited common variant associations near genes via an additive genetic model. We acknowledge the limitations of focusing on a specific genetic model and variant frequency regime, though note that common variation is the largest documented source of narrow-sense heritability, which typically accounts for a majority of disease heritability (Golan et al., 2014; Polderman et al., 2015). First, we consider only common variants. While rare variants may prove to be instructive, it is notable that a previous rare variant analysis in T2D (Fuchsberger et al., 2016) failed to show enrichment among OXPHOS genes. Second, we consider only additive genetic models. A recessive model may be particularly fruitful for mitochondrial genes given their tolerance to pLoF variation, however these models are frequently power-limited and may not explain much more phenotypic variance than additive models (Hill et al., 2008; Zhu et al., 2015). Third, we have not considered epistasis. The effects of mtDNA-nucDNA interactions (Rand and Mossman, 2020) in common diseases have yet to be assessed. While there is debate about whether biologically-relevant epistasis can be simply captured by main effects (Polderman et al., 2015; Hill et al., 2008; Sackton and Hartl, 2016; Hemani et al., 2014) at individual loci, it is possible that modeling mtDNA-nucDNA interactions will reveal new contributions. Fourth, to systematically assess all organelles, we restrict our analyses to variants near genes comprising each organelle’s proteome. It remains possible that future work will systematically identify novel organelle-relevant loci elsewhere in the genome which contribute disproportionately to age-related trait heritability. Fifth, while we are well-powered to detect physiologically relevant enrichments among most tested organelles (including the mitochondrion), our power may be more limited for particularly small compartments (e.g. lysosome). Finally, it is crucial not to confuse our mtDNA-GWAS results with previously reported associations between somatic mtDNA mutations and age-associated disease (Bender et al., 2006; Wanagat et al., 2001; Taylor et al., 2003) – the present work is focused on germline variation.

We have not formally addressed the causality of mitochondrial dysfunction in common age-related disease and the observed lack of heritability enrichment does not preclude the possibility of a therapeutic benefit in targeting the mitochondrion for age-related disease. For example, mitochondrial dysfunction is documented in brain or heart infarcts following blood vessel occlusion in laboratory-based models (Solenski et al., 2002; Flameng et al., 1991). Clearly, mitochondrial genetic variants do not influence infarct risk in this laboratory model, but pharmacological blockade of the mitochondrial permeability transition pore can mitigate reperfusion injury and infarct size (Weinbrenner et al., 1998). Future studies will be required to determine if and how the mitochondrial dysfunction associated with common age-associated diseases can be targeted for therapeutic benefit. Efforts to develop reliable measures of mitochondrial function and dysfunction have the potential to unbiasedly discover genetic instruments that influence the mitochondrion, and causal inference techniques such as Mendelian Randomization may shed light on this important causal question.

Our finding that the nucleus is the only organelle that shows enrichment for common age-associated trait heritability builds on prior work implicating nuclear processes in aging. Most human progeroid syndromes result from monogenic defects in nuclear components (Kubben and Misteli, 2017) (e.g. LMNA in Hutchinson-Gilford progeria syndrome, TERC in dyskeratosis congenita), and telomere length has long been observed as a marker of aging (Garcia et al., 2007). Heritability enrichment of age-related traits among gene regulators is consistent with the epigenetic dysregulation (Han and Brunet, 2012) and elevated transcriptional noise (López-Otín et al., 2013; Bahar et al., 2006) observed in aging (e.g. SIRT6 modulation influences mouse longevity and metabolic syndrome; Kanfi et al., 2012; Kanfi et al., 2010). An important role for gene regulation in common age-related disease is in agreement with both the observation that a very large fraction of common disease-associated loci corresponds to the non-coding genome and the enrichment of disease heritability in histone marks and TF binding sites (Finucane et al., 2015; Karczewski et al., 2013). Given that a deterioration in several other cellular organelles has been so frequently documented in age-related traits, a future challenge lies in elucidating how inherited variation in or near TFs ultimately leads to the observed organelle dysfunction in age-related disease.

Data availability

Heritability point estimates and standard errors for age-related traits are listed in Supplementary file 1. Genetic and phenotypic correlation point estimates and standard errors/p-values plotted in Figure 1B are available in Figure 1—source data 1. Summary statistics from mtDNA-GWAS (plotted in Figure 2 and Figure 2—figure supplement 9) are available in Source data 2. All gene-based enrichment analysis p-values and point estimates are available in Source data 1 and Source data 3. Period prevalence data for diseases in the UK can be obtained from Kuan et al., 2019. Gene-sets can be found using COMPARTMENTS (https://compartments.jensenlab.org), MitoCarta 2.0 (https://www.broadinstitute.org/files/shared/metabolism/mitocarta/human.mitocarta2.0.html), Lambert et al., 2018 (DOI: 10.1016/j.cell.2018.01.029), Frazier et al., 2019 (DOI: 10.1074/jbc.R117.809194), Finucane et al., 2018 (https://alkesgroup.broadinstitute.org/LDSCORE/), Kapopoulou et al., 2016 (DOI: 10.1111/evo.12819), and the MacArthur laboratory (https://github.com/macarthur-lab/gene_lists, copy archived at swh:1:rev:fcc849637bd71e683bffc618e1a48081a8df08f8), Minikel, 2021. Gene age estimates were obtained from Litman and Stein, 2019 (DOI: 10.1053/j.seminoncol.2018.11.002). GWAS catalog annotations can be obtained from: https://www.ebi.ac.uk/gwas. Heritability estimates across UKB can be obtained at: https://nealelab.github.io/UKBB_ldsc/. UKB summary statistics can be obtained from Neale lab GWAS round 2: https://github.com/Nealelab/UK_Biobank_GWAS, (copy archived at swh:1:rev:dc7b7b590413ec96a45a64f7213f50a3a0606198), Howrigan, 2021. Annotations for the Baseline v1.1 and BaselineLD v2.2 models as well as other relevant reference data, including the 1000G EUR reference panel, can be obtained from https://alkesgroup.broadinstitute.org/LDSCORE/. eQTL and expression data in human tissues can be obtained from GTEx: https://www.gtexportal.org. Constraint estimates can be found via gnomAD: https://gnomad.broadinstitute.org. See citations for publicly available GWAS meta-analysis summary statistics (Teslovich et al., 2010; Ehret et al., 2011; Timmers et al., 2019; Zenin et al., 2019; Manning et al., 2012; Morris et al., 2012; Schunkert et al., 2011; Estrada et al., 2012; Christophersen et al., 2017; Pattaro et al., 2016; Nalls et al., 2019; Lambert et al., 2013).

Code availability

Our analysis leverages publicly available tools including LDSC for heritability enrichment and genetic correlation (https://github.com/bulik/ldsc, copy archived at swh:1:rev:aa33296abac9569a6422ee6ba7eb4b902422cc74); Schorsch, 2021, MAGMA v1.07b for gene-set enrichment analysis (https://ctg.cncr.nl/software/magma), Hail v0.2.51 for distributed computing and mtDNA GWAS (https://hail.is), the R circlize package (Gu et al., 2014) for visualization of mtDNA-GWAS, and the R polycor package for phenotypic correlations with binary traits.

Share this article

Cite this article

Selection of genetically diverse age-related diseases and traits using epidemiological data.

Figure 1—source data 1

Assessment of the association of nucDNA and mtDNA loci contributing to the mitochondrial proteome with age-related traits.

Heritability enrichment of organellar proteomes across age-related disease in UK Biobank.

Enrichment of organellar proteomes within parental lifespan and healthspan as proxies for aging.

Differences in constraint distribution across organelles.

Author details

Rahul Gupta

Contribution

For correspondence

Competing interests

Konrad J Karczewski

Contribution

Competing interests

Daniel Howrigan

Contribution

Competing interests

Benjamin M Neale

Contribution

For correspondence

Competing interests

Vamsi K Mootha

Contribution

For correspondence

Competing interests

Citations by DOI

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

Categories and tags

Research organism