
Human genetic analyses of organelles highlight the nucleus in age-related trait heritability

  1. Rahul Gupta  Is a corresponding author
  2. Konrad J Karczewski
  3. Daniel Howrigan
  4. Benjamin M Neale  Is a corresponding author
  5. Vamsi K Mootha  Is a corresponding author
  1. Howard Hughes Medical Institute and Department of Molecular Biology, Massachusetts General Hospital, United States
  2. Broad Institute of MIT and Harvard, United States
  3. Analytic and Translational Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, United States
Research Article
Cite this article as: eLife 2021;10:e68610 doi: 10.7554/eLife.68610


Most age-related human diseases are accompanied by a decline in cellular organelle integrity, including impaired lysosomal proteostasis and defective mitochondrial oxidative phosphorylation. An open question, however, is the degree to which inherited variation in or near genes encoding each organelle contributes to age-related disease pathogenesis. Here, we evaluate if genetic loci encoding organelle proteomes confer greater-than-expected age-related disease risk. As mitochondrial dysfunction is a ‘hallmark’ of aging, we begin by assessing nuclear and mitochondrial DNA loci near genes encoding the mitochondrial proteome and surprisingly observe a lack of enrichment across 24 age-related traits. Within nine other organelles, we find no enrichment with one exception: the nucleus, where enrichment emanates from nuclear transcription factors. In agreement, we find that genes encoding several organelles tend to be ‘haplosufficient,’ while we observe strong purifying selection against heterozygous protein-truncating variants impacting the nucleus. Our work identifies common variation near transcription factors as having outsize influence on age-related trait risk, motivating future efforts to determine if and how this inherited variation then contributes to observed age-related organelle deterioration.

eLife digest

Getting older increases our risk of experiencing a wide range of diseases, such as diabetes, heart disease and neurodegenerative disease. The genetic variations that we inherit from our parents play a major role in predicting this risk. However, the biological networks involved in this process are extremely complex and remain challenging to decipher.

Prior studies have suggested that specialised structures inside our body’s cells, called organelles, may have an important role to play in aging. Organelles represent self-contained biological factories inside each cell, designed to perform specific tasks. Examples include the nucleus, which harbours most of the cell’s genetic material, and mitochondria, which help provide cells with energy.

Organelles tend to deteriorate and become dysfunctional with age, and mitochondria in particular are badly affected by the ageing process. A decline in organelle activity has been thought to explain ageing and the development of age-related diseases. However, this has never been systematically tested on a large scale at the inherited genetic level.

Gupta et al. assessed whether common inherited genetic variation in genes associated with ten different organelles could affect the risk of age-related disease, using a database of DNA samples from more than 300,000 individuals. They considered 24 diseases and traits that become more common with advanced age.

Gupta et al. discovered that inherited variants in or near genes associated with the nucleus were consistently linked to age-related disease risks. Most of this signal arose from genes encoding the nuclear transcription factors, proteins that help to control the rate at which genes are expressed. However, variants in genes associated with other organelles, including mitochondria, did not appear to be linked to age-related diseases.

This research suggests that inherited variation in transcription factors in the nucleus could act as genetic levers that increase the risk of common, age-related diseases. It also suggests that common genetic variation in other cellular organelles may not be as heavily involved in the development of such diseases. Such insights into the cellular structures and biological pathways involved in ageing and age-related disease also establish new targets for drugs to prevent or treat disease.


The global burden of age-related diseases such as type 2 diabetes (T2D), Parkinson’s disease (PD), and cardiovascular disease (CVD) has been steadily rising due in part to a progressively aging population. These diseases are often highly heritable: for example, narrow-sense heritabilities were recently estimated as 56% for T2D, 46% for general hypertension, and 41% for atherosclerosis (Wang et al., 2017). Genome-wide association studies (GWAS) have led to the discovery of thousands of robust associations with common genetic variants (Claussnitzer et al., 2020), implicating a complex genetic architecture as underlying much of the heritable risk. These loci hold the potential to reveal underlying mechanisms of disease and spotlight targetable pathways.

Aging has been associated with dysfunction in many cellular organelles (López-Otín et al., 2013). Dysregulation of autophagic proteostasis, for which the lysosome is central, has been implicated in myriad age-related disorders including neurodegeneration, heart disease, and aging itself (Mizushima et al., 2008), and mouse models deficient for autophagy in the central nervous system show neurodegeneration (Hara et al., 2006; Komatsu et al., 2006). Endoplasmic reticular (ER) stress has been invoked as central to metabolic syndrome and insulin resistance in T2D (Ozcan et al., 2004). Disruption in the nucleus through increased gene regulatory noise from epigenetic alterations (López-Otín et al., 2013) and elevated nuclear envelope ‘leakiness’ (D'Angelo et al., 2009) has been implicated in aging. Dysfunction in the mitochondria has even been invoked as a ‘hallmark’ of aging (López-Otín et al., 2013) and has been observed in many common age-associated diseases (Lane et al., 2015; Petersen et al., 2004; Mootha et al., 2003; Schapira et al., 1990; Bender et al., 2006; Wanagat et al., 2001; Ashar et al., 2017). In particular, deficits in mitochondrial oxidative phosphorylation (OXPHOS) have been documented in aging and age-related diseases as evidenced by in vivo ³¹P-NMR measures (Petersen et al., 2004; Fleischman et al., 2010), enzymatic activity in biopsy material (Mootha et al., 2003; Schapira et al., 1990; Fannin et al., 1999; Trounce et al., 1989; Kelley et al., 2002; Patti et al., 2003; Stump et al., 2003), accumulation of somatic mitochondrial DNA (mtDNA) mutations (Bender et al., 2006; Wanagat et al., 2001; Taylor et al., 2003), and a decline in mtDNA copy number (mtCN) (Ashar et al., 2017).

Given that a decline in organelle function is observed in age-related disease, a natural question is whether inherited variation in loci encoding organelles is enriched for age-related disease risk. Although it has long been known that recessive mutations leading to defects within many cellular organelles can lead to inherited syndromes (e.g. mutations in >300 nuclear DNA (nucDNA)-encoded mitochondrial genes lead to inborn mitochondrial disease; Frazier et al., 2019), it is unknown how this extends to common disease. In the present study, we use a human genetics approach to assess common variation in loci relevant to the function of ten cellular organelles. We begin with a deliberate focus on mitochondria given the depth of literature linking them to age-related disease, interrogating both nucDNA and mtDNA loci that contribute to the organelle’s proteome. This genetic approach is supported by the observation that heritability estimates of measures of mitochondrial function are substantial (33–65%; Curran et al., 2007; Xing et al., 2008). We then extend our analyses to nine additional organelles.

To our surprise, we find no evidence of enrichment for genome-wide association signal in or near mitochondrial genes across any of our analyses. Further, of 10 tested organelles, only the nucleus shows enrichment among many age-associated traits, with the signal emanating primarily from the transcription factors (TFs). Further analysis shows that genes encoding the mitochondrial proteome tend to be tolerant to heterozygous predicted loss-of-function (pLoF) variation and thus are surprisingly ‘haplosufficient’ – that is, show little fitness cost with heterozygous pLoF. In contrast, nuclear TFs are especially sensitive to gene dosage and are often ‘haploinsufficient,’ showing substantial purifying selection against heterozygous pLoF. Thus, our work highlights inherited variation influencing gene-regulatory pathways, rather than organelle physiology, in the inherited risk of common age-associated diseases.


Age-related diseases and traits show diverse genetic architectures

To systematically define age-related diseases, we turned to recently published epidemiological data from the United Kingdom (U.K.) (Kuan et al., 2019) in order to match the U.K. Biobank (UKB) cohort (Sudlow et al., 2015). We prioritized traits whose prevalence increased as a function of age (Materials and methods) and were represented in UKB (https://github.com/Nealelab/UK_Biobank_GWAS) and/or had available published GWAS meta-analyses (Teslovich et al., 2010; Ehret et al., 2011; Manning et al., 2012; Morris et al., 2012; Schunkert et al., 2011; Estrada et al., 2012; Christophersen et al., 2017; Pattaro et al., 2016; Nalls et al., 2019; Lambert et al., 2013; Figure 1A, Appendix 1). We used SNP-heritability estimates from stratified linkage disequilibrium score regression (S-LDSC, https://github.com/bulik/ldsc) (Finucane et al., 2015) to ensure that our selected traits were sufficiently heritable (Supplementary file 1, Materials and methods, Appendix 1), observing heritabilities across UKB and meta-analysis traits as high as 0.28 (bone mineral density), all with heritability Z-score > 4. We then computed pairwise genetic and phenotypic correlations between the age-associated traits to compare their respective genetic architectures and phenotypic relationships (Figure 1B, Materials and methods). In general, genetic correlations were greater in magnitude than respective phenotypic correlations, potentially because GWAS are less sensitive to purely non-genetic factors that may influence phenotypes (e.g. measurement error). As expected, we find a highly correlated module of primarily cardiometabolic traits with high density lipoprotein (HDL) showing anti-correlation (Bulik-Sullivan et al., 2015). Interestingly, several other traits (gastroesophageal reflux disease (GERD), osteoarthritis) showed moderate genetic correlation to the cardiometabolic trait cluster while atrial fibrillation, for which T2D and CVD are risk factors (Wasmer et al., 2017), showed phenotypic, but not genetic, correlation. Our final set of prioritized, age-associated traits included 24 genetically diverse, heritable phenotypes (Supplementary file 1). Of these, 11 traits were sufficiently heritable only in UKB, three were sufficiently heritable only among non-UKB meta-analyses, and 10 were well-powered in both UKB and an independent cohort.

Selection of genetically diverse age-related diseases and traits using epidemiological data.

(A) Period prevalence of age-associated diseases systematically selected for this study (Materials and methods). Epidemiological data obtained from Kuan et al., 2019. (B) Genetic (lower half) and phenotypic (upper half) correlation between the selected age-related traits. All correlations were assessed between UK Biobank phenotypes with the exception of eGFR, Alzheimer’s Disease, and Parkinson’s Disease, for which the respective meta-analyses were used (Materials and methods). Grey ‘o’ in phenotypic correlations indicate phenotypes not tested within UKB for which individual-level data was not available. All data displayed in this panel are available in Figure 1—source data 1. * represents correlations that are significantly different from 0 at a Bonferroni-corrected threshold for p = 0.05 across all tested traits.

Figure 1—source data 1

Genetic and phenotypic correlation point estimates and standard errors.


Mitochondrial genes are not enriched among age-related trait GWAS

To test if age-related trait heritability was enriched among mitochondria-relevant loci, we began by simply asking if ~1100 nucDNA genes encoding the mitochondrial proteome from the MitoCarta2.0 inventory (Calvo et al., 2016) were found near lead SNPs for our selected traits represented in the NHGRI-EBI GWAS Catalog (https://www.ebi.ac.uk/gwas/) (MacArthur et al., 2017) more frequently than expected by chance (Materials and methods, Appendix 1). To our surprise, no traits showed a statistically significant enrichment of mitochondrial genes (Figure 2—figure supplement 1A); in fact, six traits showed a statistically significant depletion. Even more strikingly, MitoCarta genes tended to be nominally enriched in fewer traits than the average randomly selected sample of protein-coding genes (Figure 2—figure supplement 1B, empirical p = 0.014). This lack of enrichment was observed more broadly across virtually all traits represented in the GWAS Catalog (Figure 2—figure supplement 1C). We also examined specific transcriptional regulators of mitochondrial biogenesis (TFAM, GABPA, GABPB1, ESRRA, YY1, NRF1, PPARGC1A, PPARGC1B) and found very little evidence supporting a role for these genes in modifying risk for the age-related GWAS Catalog phenotypes (Appendix 1).
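The comparison against randomly selected protein-coding genes can be illustrated with a small resampling sketch. This is a toy illustration under stated assumptions rather than the analysis pipeline used in the study: the gene identifiers, the set sizes, and the helper names `empirical_p` and `overlap_null` are hypothetical, and the statistic is simplified to the number of genes in a set that lie near a lead SNP.

```python
import random


def empirical_p(observed, null_stats, alternative="greater"):
    """One-sided empirical p-value; the +1 terms keep it strictly above zero."""
    if alternative == "greater":
        n_extreme = sum(s >= observed for s in null_stats)
    elif alternative == "less":
        n_extreme = sum(s <= observed for s in null_stats)
    else:
        raise ValueError("alternative must be 'greater' or 'less'")
    return (n_extreme + 1) / (len(null_stats) + 1)


def overlap_null(background, near_lead_snp, set_size, n_draws, seed=0):
    """Overlap counts for random gene sets drawn from a background gene list."""
    rng = random.Random(seed)
    return [
        len(near_lead_snp.intersection(rng.sample(background, set_size)))
        for _ in range(n_draws)
    ]


# Hypothetical toy data: 1,000 background genes, 100 of them near a lead SNP.
background = [f"GENE{i}" for i in range(1000)]
near_lead_snp = set(background[:100])
organelle_set = background[900:950]  # toy gene set with no genes near a lead SNP

observed = len(near_lead_snp.intersection(organelle_set))
null_stats = overlap_null(background, near_lead_snp, len(organelle_set), n_draws=2000)

p_enrichment = empirical_p(observed, null_stats, "greater")
p_depletion = empirical_p(observed, null_stats, "less")
print(observed, p_enrichment, p_depletion)
```

The toy gene set is depleted by construction, so the one-sided enrichment p-value is 1 and the depletion p-value is small. A real analysis would also need to account for gene length and local gene density, which this sketch ignores.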

To investigate further, we turned to U.K. Biobank (UKB). We compiled loci encoding the mitochondrial proteome (Figure 2A) and used them to interrogate the association between common mitochondrial variation and common disease. First, we considered all common variants in or near nucDNA MitoCarta genes, as well as two subsets of MitoCarta: mitochondrial Mendelian disease genes (Frazier et al., 2019) and nucDNA-encoded OXPHOS genes. Second, we obtained and tested mtDNA genotypes at up to 213 loci after quality control (Materials and methods) from 360,662 individuals for associations with age-related traits.

Figure 2 (with 9 supplements)
Assessment of the association of nucDNA and mtDNA loci contributing to the mitochondrial proteome with age-related traits.

(A) Scheme outlining the aspects of mitochondrial function assessed in this study. nucDNA loci contributing to the mitochondrial proteome are shown in teal, while mtDNA loci are shown in pink. (B) S-LDSC enrichment p-values on top of the baseline model in UKB. Inset labels represent gene-set size; dotted line represents BH FDR 0.1 threshold. (C) Visualization of mtDNA variants and associations with age-related diseases. The outer-most track represents the genetic architecture of the circular mtDNA. The heatmap track represents the log-scaled number of individuals with an alternate genotype at each site. The inner track represents mitochondrial genome-wide association p-values, with radial angle corresponding to position on the mtDNA and magnitude representing –log10 p-value. Dotted line represents Bonferroni cutoff for all tested trait-variant pairs. (D) Replication of S-LDSC enrichment results in meta-analyses. Dotted line represents BH FDR 0.1 threshold. * represents traits for which sufficiently well-powered cohorts from both UKB and meta-analyses were available. The trait color legend to the right of panel (C) applies to panels (B) and (C), representing UKB traits. S-LDSC enrichment p-values plotted in (B) and (D) are available in Source data 1; mtDNA-GWAS summary statistics are available in Source data 2.

First, we used S-LDSC (Finucane et al., 2015; Finucane et al., 2018) and MAGMA (https://ctg.cncr.nl/software/magma) (de Leeuw et al., 2015), two robust methods that can be used to assess gene-based heritability enrichment accounting for LD and several confounders, to test if there was any evidence of heritability enrichment among MitoCarta genes (Materials and methods). We found no evidence of enrichment near nucDNA MitoCarta genes for any trait tested in UKB using S-LDSC (Figure 2B, Figure 2—figure supplement 2A), consistent with our results from the GWAS Catalog. We replicated this lack of enrichment using MAGMA at two different window sizes (Figure 2—figure supplement 2C, Figure 2—figure supplement 2E; all q > 0.1).
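The q-values and the BH FDR 0.1 threshold quoted for these enrichment tests refer to the Benjamini-Hochberg procedure. A minimal sketch of that adjustment follows; the function name `bh_qvalues` and the example p-values are hypothetical and are not taken from the study's results.

```python
def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted p-values (q-values), returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        qvals[idx] = running_min
    return qvals


# Hypothetical enrichment p-values for four traits.
example_pvals = [0.01, 0.04, 0.03, 0.20]
example_qvals = bh_qvalues(example_pvals)
significant = [q < 0.1 for q in example_qvals]
print(example_qvals, significant)
```

A trait would be reported as enriched at FDR 0.1 only if its q-value falls below 0.1, which is the criterion behind statements such as "all q > 0.1".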

Given the lack of enrichment among the MitoCarta genes, we wanted to (1) verify that our selected methods could detect previously reported enrichments and (2) confirm that common variation in or near MitoCarta genes could lead to expression-level perturbations. We first successfully replicated previously reported enrichment among tissue-specific genes for key traits using both S-LDSC (Figure 2—figure supplement 3, Figure 2—figure supplement 4) and MAGMA (Figure 2—figure supplement 5, Figure 2—figure supplement 6, Appendix 1, Materials and methods). We next confirmed that we had sufficient power using both S-LDSC and MAGMA to detect physiologically relevant enrichment effect sizes among MitoCarta genes (Figure 2—figure supplement 7, Materials and methods, Appendix 1). We finally examined the landscape of cis-expression QTLs (eQTLs) for these genes and found that almost all MitoCarta genes have cis-eQTLs in at least one tissue and often have cis-eQTLs in more tissues than most protein-coding genes (Figure 2—figure supplement 8, Materials and methods, Appendix 1). Hence, our selected methods could detect physiologically relevant heritability enrichments among our selected traits at gene-set sizes comparable to that of MitoCarta, and common variants in or near MitoCarta genes exerted cis-control on gene expression.

Next, we considered mtDNA loci genotyped in UKB, obtaining calls for up to 213 common variants passing quality control across 360,662 individuals (Materials and methods, Appendix 1). We found no significant associations on the mtDNA for any of the 21 age-related traits available in UKB using linear or logistic regression (Materials and methods, Figure 2C, Figure 2—figure supplement 9; Source data 2).
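The Bonferroni cutoff across trait-variant pairs can be sketched as a simple division. The calculation below assumes, as an upper bound, that each of the 21 UKB traits was tested against all 213 variants; the section does not state the exact number of tested pairs, so the result is only indicative, and the helper name `bonferroni_threshold` is hypothetical.

```python
def bonferroni_threshold(alpha, n_traits, n_variants):
    """Per-test significance threshold when every trait is tested at every variant."""
    n_tests = n_traits * n_variants
    if n_tests <= 0:
        raise ValueError("at least one test is required")
    return alpha / n_tests


# Assumed counts: 21 traits and up to 213 mtDNA variants (an upper bound on tests).
threshold = bonferroni_threshold(0.05, n_traits=21, n_variants=213)
print(f"{threshold:.3e}")
```

Because some variants may not pass quality control for every trait, the number of tests actually performed could be smaller and the cutoff correspondingly less stringent.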

As a control and to validate our approach, we also performed mtDNA-GWAS for specific traits with previously reported associations. A recent analysis of ~147,437 individuals in BioBank Japan revealed four distinct traits with significant mtDNA associations (Yamamoto et al., 2020). Of these, creatinine and aspartate aminotransferase (AST) had sufficiently large sample sizes in UKB. We observed a large number of associations throughout the mtDNA for both traits (p < 1.15 × 10⁻⁵, Figure 2—figure supplement 9E). Thus, our mtDNA association method was able to replicate robust mtDNA associations among well-powered traits.

We sought to replicate our negative results in an independent cohort. We turned to published GWAS meta-analyses (Teslovich et al., 2010; Ehret et al., 2011; Manning et al., 2012; Morris et al., 2012; Schunkert et al., 2011; Estrada et al., 2012; Christophersen et al., 2017; Pattaro et al., 2016; Nalls et al., 2019; Lambert et al., 2013; Supplementary file 1) and successfully replicated the lack of enrichment for MitoCarta genes across all 10 traits with an available independent cohort GWAS using S-LDSC (Figure 2D, Figure 2—figure supplement 2B) and MAGMA (Figure 2—figure supplement 2D, Appendix 1; all q > 0.1). Importantly, while we were unable to pursue analyses for PD and Alzheimer’s disease in UKB due to limited case counts, we tested MitoCarta genes among well-powered meta-analyses for these disorders (Appendix 1) and observed no enrichment (Figure 2D; all q > 0.1).

In summary, we tested (1) nucDNA loci near genes that encode the mitochondrial proteome in the GWAS Catalog, UKB, and GWAS meta-analyses, (2) transcriptional regulators of mitochondrial biogenesis in the GWAS Catalog, and (3) mtDNA variants in UKB. We found no convincing evidence of heritability enrichment for common age-associated diseases near these mitochondrial loci.

Of all tested organelles, only the nucleus shows enrichment for age-related trait heritability

We next asked whether heritability for age-related diseases and traits clusters among loci associated with any cellular organelle. We used the COMPARTMENTS database (https://compartments.jensenlab.org) to define gene-sets corresponding to the proteomes of nine additional organelles (Binder et al., 2014) besides mitochondria (Materials and methods). We used S-LDSC to produce heritability estimates for these categories in the UKB age-related disease traits, finding evidence of heritability enrichment in many traits for genes comprising the nuclear proteome (Figure 3A, Materials and methods). No other tested organelles showed evidence of heritability enrichment. Variation in or near genes comprising the nuclear proteome explained over 50% of disease heritability on average despite representing only ~35% of tested SNPs (Figure 3—figure supplement 1, Appendix 1). We successfully replicated this pattern of heritability enrichment among organelles using MAGMA in UKB at two window sizes (Figure 3—figure supplement 2A, Figure 3—figure supplement 2B), again finding enrichment only among genes related to the nucleus.
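In S-LDSC, enrichment for an annotation is the proportion of SNP-heritability it explains divided by the proportion of SNPs it contains. The sketch below plugs in the rounded figures quoted above (over 50% of heritability from roughly 35% of SNPs), so the result is an approximate lower bound rather than a reported estimate; the function name is hypothetical.

```python
def heritability_enrichment(prop_h2, prop_snps):
    """Enrichment = share of SNP-heritability / share of SNPs in the annotation."""
    if not 0 < prop_snps <= 1:
        raise ValueError("prop_snps must be in (0, 1]")
    return prop_h2 / prop_snps


# Rounded values from the text: >50% of heritability, ~35% of tested SNPs.
nuclear_enrichment = heritability_enrichment(prop_h2=0.50, prop_snps=0.35)
print(round(nuclear_enrichment, 2))
```

A value of 1 would mean the annotation explains exactly its proportional share of heritability, so values above 1 indicate enrichment.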

Figure 3 (with 8 supplements)
Heritability enrichment of organellar proteomes across age-related disease in UK Biobank.

(A) Quantile-quantile plot of heritability enrichment p-values atop the baseline model for gene-sets representing organellar proteomes, with black line representing expected null p-values following the uniform distribution and shaded ribbon representing 95% CI. (B) Scheme of spatially distinct disjoint subsets of the nuclear proteome as a strategy to characterize observed enrichment of the nuclear proteome. Numbers represent gene-set size. (C) S-LDSC enrichment p-values for spatial subsets of the nuclear proteome computed atop the baseline model. (D) S-LDSC enrichment p-values for TFs and all other nucleus-localizing proteins. Inset numbers represent gene-set sizes, black lines represent cutoff at BH FDR < 10%. * represents traits for which sufficiently well-powered cohorts from both UKB and meta-analyses were available. Enrichment p-values and coefficients are available in Source data 1.

Much of the nuclear enrichment signal emanates from transcription factors

With over 6000 genes comprising the nuclear proteome, we considered largely disjoint subsets of the organelle’s proteome to trace the source of the enrichment signal (The Gene Ontology Consortium et al., 2019; Ashburner et al., 2000; Lambert et al., 2018; Figure 3B, Materials and methods, Appendix 1). We found significant heritability enrichment within the set of 1804 genes whose protein products are annotated to localize to the chromosome itself (q < 0.1 for nine traits, Figure 3C, Figure 3—figure supplement 3A). Further partitioning revealed that much of this signal is attributable to the subset classified as TFs (Lambert et al., 2018) (1523 genes, q < 0.1 for 10 traits, Figure 3D, Figure 3—figure supplement 3B). We replicated these results using MAGMA in UKB at two window sizes (Figure 3—figure supplement 2), and also replicated enrichments among TFs in several (but not all) corresponding meta-analyses (Figure 3—figure supplement 4) despite reduced power (Figure 2—figure supplement 7H). We generated functional subdivisions of the TFs (Materials and methods, Appendix 1), finding that the non-zinc finger TFs showed enrichment for a highly similar set of traits to those enriched for the whole set of TFs (Figure 3—figure supplement 5D, Figure 3—figure supplement 6B, Figure 3—figure supplement 7B, Figure 3—figure supplement 8B). Interestingly, the KRAB domain-containing zinc fingers (KRAB ZFs) (Kapopoulou et al., 2016), which are recently evolved (Figure 3—figure supplement 5H), were largely devoid of enrichment even compared to non-KRAB ZFs (Figure 3—figure supplement 5E, Figure 3—figure supplement 6C, Figure 3—figure supplement 7C, Figure 3—figure supplement 8C). Thus, we find that variation within or near non-KRAB domain-containing TF genes has an outsize influence on age-associated disease heritability.

We next turned to recently published GWAS assessing parental lifespan (Timmers et al., 2019) and ‘healthspan’ via first morbidity hazard (Zenin et al., 2019). Both traits showed highly significant heritability via S-LDSC (h² (s.e.) = 0.0265 (0.0019) and 0.0348 (0.003), respectively; Materials and methods). Enrichment analysis of organelles among these traits revealed a significant enrichment for the nucleus for parental lifespan (p = 0.0003) using MAGMA (Figure 4). While we observed only a nominally ‘suggestive’ enrichment for the nucleus for healthspan with MAGMA (p = 0.058), S-LDSC showed significant nuclear heritability enrichment (p = 0.0016, Figure 4—figure supplement 1). Analysis of spatial subsets of the nuclear proteome showed significant enrichment for TFs and proteins localizing to the chromosome in both aging phenotypes using MAGMA (Figure 4) and for healthspan using S-LDSC (Figure 4—figure supplement 1).
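The significance of a heritability estimate is commonly summarized as a Z-score, the estimate divided by its standard error. The sketch below applies this to the two estimates quoted above and compares them with the Z > 4 cutoff used earlier for trait selection. Applying that cutoff to these two traits is an illustrative assumption, and the function name is hypothetical.

```python
def heritability_z(h2, se):
    """Z-score of a SNP-heritability estimate: point estimate over standard error."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    return h2 / se


# Estimates (h2, s.e.) as quoted in the text, in the order the traits are named.
estimates = {
    "parental lifespan": (0.0265, 0.0019),
    "healthspan": (0.0348, 0.003),
}
z_scores = {trait: heritability_z(h2, se) for trait, (h2, se) in estimates.items()}
passes_cutoff = {trait: z > 4 for trait, z in z_scores.items()}
print(z_scores, passes_cutoff)
```

Both Z-scores are well above 4 even though the heritability point estimates themselves are small.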

Figure 4 (with 1 supplement)
Enrichment of organellar proteomes within parental lifespan and healthspan as proxies for aging.

Upper panels represent organelle proteomes; lower panels represent spatial subsets of the nuclear proteome. Numbers atop each bar represent gene-set sizes. Dashed lines represent cutoff at BH FDR < 10%, dotted lines represent nominal p = 0.05. p-Values and coefficients available in Source data 3.

Mitochondrial genes tend to be more ‘haplosufficient’ than genes encoding other organelles

In light of observing heritability enrichment only among nuclear transcription factors, we wanted to determine if the fitness cost of pLoF variation in genes across cellular organelles mirrored our results. Mitochondria-localizing genes and TFs play a central role in numerous Mendelian diseases (Frazier et al., 2019; Jimenez-Sanchez et al., 2001; Worman and Courvalin, 2002; Cleaver, 1994), so we initially hypothesized that genes belonging to either category would be under significant purifying selection (i.e., constraint). We obtained constraint metrics from gnomAD (https://gnomad.broadinstitute.org) (Karczewski et al., 2020) as the LoF observed/expected upper bound fraction (LOEUF). In agreement with our GWAS enrichment results, we observed that the mitochondrion on average is one of the least constrained organelles we tested, in stark contrast to the nucleus (Figure 5A). In fact, the nucleus was second only to the set of ‘haploinsufficient’ genes (defined based on curated human clinical genetic data; Karczewski et al., 2020, Materials and methods) in the proportion of its genes in the most constrained decile, while the mitochondrion lay on the opposite end of the spectrum (Figure 5B). Interestingly, even the Mendelian mitochondrial disease genes had a high tolerance to pLoF variation on average in comparison to TFs (Figure 5C). Even across different categories of TFs, we observed that highly constrained TF subsets tend to show GWAS enrichment (Figure 5—figure supplement 1, Figure 3—figure supplement 5E) relative to unconstrained subsets for our tested traits. Indeed, explicit inclusion of LOEUF as a covariate in the enrichment analysis model (Materials and methods) reduced the significance of (but did not eliminate) the enrichment seen for the TFs (Figure 5—figure supplement 2B, Figure 5—figure supplement 2E, Figure 5—figure supplement 2F). Thus, while disruption in both mitochondrial genes and TFs can produce rare disease, the fitness cost of heterozygous variation in mitochondrial genes appears to be far lower than that among TFs. This dichotomy reflects the contrasting enrichment results between mitochondrial genes and TFs and supports the importance of gene regulation as it relates to evolutionary conservation.

Figure 5 (with 3 supplements)
Differences in constraint distribution across organelles.

(A) Constraint as measured by LOEUF from gnomAD v2.1.1 for genes comprising organellar proteomes, book-ended by distributions for known haploinsufficient genes as well as olfactory receptors. Lower values indicate genes exacting a greater organismal fitness cost from a heterozygous LoF variant (greater constraint). (B) Proportion of each gene-set found in the lowest LOEUF decile. Higher values indicate gene-sets containing more highly constrained genes. (C) Constraint distributions for subsets of the nuclear-encoded mitochondrial proteome (red) and subsets of the nucleus (teal). Black points represent the mean with 95% CI. Inset numbers represent gene-set size.
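The quantity in panel (B), the proportion of a gene-set that falls in the lowest LOEUF decile, can be sketched as follows. This is a toy version under stated assumptions: deciles are computed from whatever LOEUF values are supplied rather than taken from gnomAD's own binning, the gene names and values are made up, and `lowest_decile_fraction` is a hypothetical helper.

```python
def lowest_decile_fraction(loeuf_by_gene, gene_set):
    """Fraction of a gene set that falls in the lowest (most constrained) LOEUF decile."""
    ranked = sorted(loeuf_by_gene, key=loeuf_by_gene.get)  # ascending LOEUF
    decile_size = max(1, len(ranked) // 10)
    most_constrained = set(ranked[:decile_size])
    scored = [g for g in gene_set if g in loeuf_by_gene]  # ignore genes without a score
    if not scored:
        raise ValueError("no genes in the set have a LOEUF value")
    return sum(g in most_constrained for g in scored) / len(scored)


# Hypothetical LOEUF values for 20 toy genes (G1 is the most constrained).
loeuf_by_gene = {f"G{i}": 0.1 * i for i in range(1, 21)}
constrained_set = {"G1", "G2", "G3", "G10"}
tolerant_set = {"G15", "G16", "G17", "G18"}

frac_constrained = lowest_decile_fraction(loeuf_by_gene, constrained_set)
frac_tolerant = lowest_decile_fraction(loeuf_by_gene, tolerant_set)
print(frac_constrained, frac_tolerant)
```

Under a null of no relationship between set membership and constraint, about 10% of any gene-set would fall in the lowest decile, which gives a natural reference point for reading the panel.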


Pathology in cellular organelles has been widely documented in age-related diseases (López-Otín et al., 2013; Ozcan et al., 2004; Colacurcio and Nixon, 2016; Kanfi et al., 2010; Blasco, 2007; Bhattarai et al., 2020). Using a human genetics approach, here we report the unexpected discovery that except for the nucleus, cellular organelles tend not to be enriched in genetic associations for common, age-related diseases. We started with a focus on the mitochondria as a decline in mitochondrial abundance and activity has long been reported as one of the most consistent correlates of aging (Wanagat et al., 2001; Fleischman et al., 2010; Trounce et al., 1989; Taylor et al., 2003) and age-associated diseases (Petersen et al., 2004; Mootha et al., 2003; Schapira et al., 1990; Bender et al., 2006; Ashar et al., 2017; Fannin et al., 1999; Kelley et al., 2002; Patti et al., 2003; Stump et al., 2003). We tested common variants contributing to the mitochondrial proteome on the nucDNA and mtDNA and found no convincing evidence of heritability enrichment in any tested trait, cohort, or method. We systematically expanded our analysis to survey 10 organelles and found that only the nucleus showed enrichment, with much of this signal originating from nuclear TFs. Constraint analysis showed a substantial fitness cost to heterozygous loss-of-function mutations in genes encoding the nuclear proteome, whereas genes encoding the mitochondrial proteome were ‘haplosufficient’.

Here, we focus on enrichment to place the complex genetic architectures of age-related traits in a broader biological context and prioritize pathways for follow-up. For these highly polygenic traits, any large fraction of the genome may explain a statistically significant amount of disease heritability (de Leeuw et al., 2016; Loh et al., 2015), and indeed associations between individual organelle-relevant loci and certain common diseases have been identified previously (Billingsley et al., 2019; Kraja et al., 2019). For example, variants in the endoplasmic reticular genes WFS1 and ATF6B and the mitochondrial gene ATP5G1 have been associated with common T2D (Xue et al., 2018). These genes are present in the respective organelle gene-sets; however, unlike TFs, neither the endoplasmic reticulum nor the mitochondrion showed enrichment for T2D. Importantly, both MAGMA and S-LDSC are capable of detecting an enrichment even in a highly polygenic background. Both methods have been used in the past to identify biologically plausible disease-relevant tissues (Finucane et al., 2015; Finucane et al., 2018) and pathway enrichments (Jansen et al., 2019; Pardiñas et al., 2018) in traits across the spectrum of polygenicity, and we identify enrichments among disease-relevant tissues using both methods in several highly polygenic traits.

While previous work has shown that common disease GWAS can be enriched for expression in specific disease-relevant organs (Finucane et al., 2018; Maurano et al., 2012), our data suggest that this framework does not generally extend from organs to organelles. This finding contrasts with our classical nosology of inborn errors of metabolism that tend to be mapped to ‘causal’ organelles, for example, lysosomal storage diseases, disorders of peroxisomal biogenesis, and mitochondrial OXPHOS disorders. The observed enrichment for TFs within the nucleus indicates that common variation influencing genome regulation impacts common disease risk more than variation influencing individual organelles.

Our analysis of common inherited mitochondrial variation represents, to our knowledge, the most comprehensive joint assessment of mitochondria-relevant nucDNA and mtDNA variation in age-related diseases. We replicated mtDNA associations with creatinine and AST observed previously in BioBank Japan (Yamamoto et al., 2020), further supporting our approach. While individual mtDNA variants have been previously associated with certain traits (Raule et al., 2007; Yu et al., 2008; Hudson et al., 2013a), these associations appear to be conflicting in the literature, perhaps because of limited power and/or uncontrolled confounding biases such as population stratification (Samuels et al., 2006; Biffi et al., 2010). Our negative results are surprising, but they are compatible with a prior enrichment analysis focused on T2D (Segrè et al., 2010) as well as a small number of isolated reports interrogating either mitochondria-relevant nucDNA (Segrè et al., 2010) or mtDNA (Yamamoto et al., 2020; Saxena et al., 2006; Hudson et al., 2014; Hudson et al., 2013b) loci in select diseases.

To our knowledge, we are the first to systematically document heterogeneity in average pLoF across cellular organelles. That MitoCarta genes are ‘haplosufficient’ and pLoF tolerant (Figure 5A) is consistent with the observation that most of the ~300 inborn mitochondrial disease genes produce disease with recessive inheritance (Frazier et al., 2019) and healthy parents. The few mitochondrial disorders that show autosomal dominant inheritance are nearly always due to dominant negativity rather than haploinsufficiency. The intolerance of TFs to pLoF variation (Figure 5C) provides a stark contrast to the results from the mitochondria that is borne out in their associated Mendelian disease syndromes: TFs are known to be haploinsufficient (Seidman and Seidman, 2002) and even regulatory variants modulating their expression can produce severe Mendelian disease (van der Lee et al., 2020). We observe enrichment among TFs for 10 different diseases as well as parental lifespan and healthspan, consistent with observed elevated purifying selection against pLoF variants in these genes. Our enrichment results combined with pLoF intolerance suggest that variation among TFs may produce disease-associated variants with larger effect sizes than expected, underscoring their importance as genetic ‘levers’ for common disease heritability.

Why are mitochondria so robust to variation in gene dosage (Figure 5) and hence ‘haplosufficient?’ We propose two possibilities. First, mitochondrial pathways tend to be highly interconnected, and it was already proposed by Wright, 1934 and later by Kacser and Burns, 1981 that haplosufficiency arises as a consequence of physiology, that is, system output is inherently buffered against the partial loss of a single gene due to the network organization of metabolic reactions. Kacser and Burns in fact explicitly mention that noncatalytic gene products fall outside their framework, and we believe that our finding that nucleus-localizing and cytoskeletal genes are the two most pLoF-intolerant compartments is consistent with their assessment. Second, mitochondria were formerly autonomous microbes and hence may have retained vestigial layers of ‘intra-organelle buffering’ against genetic variation. Numerous feedback control mechanisms, including respiratory control (Chance and Williams, 1955), help to ensure organelle robustness across physiological extremes (Vafai and Mootha, 2012; Balaban et al., 1986). In fact, a recent CRISPR screen showed that of the genes for which knock-out modified survival under a mitochondrial poison, there is a striking over-representation of genes that themselves encode mitochondrial proteins (To et al., 2019).

Throughout this study, we have tested for enrichment among inherited common variant associations near genes via an additive genetic model. We acknowledge the limitations of focusing on a specific genetic model and variant frequency regime, though note that common variation is the largest documented source of narrow-sense heritability, which typically accounts for a majority of disease heritability (Golan et al., 2014; Polderman et al., 2015). First, we consider only common variants. While rare variants may prove to be instructive, it is notable that a previous rare variant analysis in T2D (Fuchsberger et al., 2016) failed to show enrichment among OXPHOS genes. Second, we consider only additive genetic models. A recessive model may be particularly fruitful for mitochondrial genes given their tolerance to pLoF variation; however, these models are frequently power-limited and may not explain much more phenotypic variance than additive models (Hill et al., 2008; Zhu et al., 2015). Third, we have not considered epistasis. The effects of mtDNA-nucDNA interactions (Rand and Mossman, 2020) in common diseases have yet to be assessed. While there is debate about whether biologically-relevant epistasis can be simply captured by main effects (Polderman et al., 2015; Hill et al., 2008; Sackton and Hartl, 2016; Hemani et al., 2014) at individual loci, it is possible that modeling mtDNA-nucDNA interactions will reveal new contributions. Fourth, to systematically assess all organelles, we restrict our analyses to variants near genes comprising each organelle’s proteome. It remains possible that future work will systematically identify novel organelle-relevant loci elsewhere in the genome which contribute disproportionately to age-related trait heritability. Fifth, while we are well-powered to detect physiologically relevant enrichments among most tested organelles (including the mitochondrion), our power may be more limited for particularly small compartments (e.g. lysosome). Finally, it is crucial not to confuse our mtDNA-GWAS results with previously reported associations between somatic mtDNA mutations and age-associated disease (Bender et al., 2006; Wanagat et al., 2001; Taylor et al., 2003) – the present work is focused on germline variation.

We have not formally addressed the causality of mitochondrial dysfunction in common age-related disease and the observed lack of heritability enrichment does not preclude the possibility of a therapeutic benefit in targeting the mitochondrion for age-related disease. For example, mitochondrial dysfunction is documented in brain or heart infarcts following blood vessel occlusion in laboratory-based models (Solenski et al., 2002; Flameng et al., 1991). Clearly, mitochondrial genetic variants do not influence infarct risk in this laboratory model, but pharmacological blockade of the mitochondrial permeability transition pore can mitigate reperfusion injury and infarct size (Weinbrenner et al., 1998). Future studies will be required to determine if and how the mitochondrial dysfunction associated with common age-associated diseases can be targeted for therapeutic benefit. Efforts to develop reliable measures of mitochondrial function and dysfunction have the potential to unbiasedly discover genetic instruments that influence the mitochondrion, and causal inference techniques such as Mendelian Randomization may shed light on this important causal question.

Our finding that the nucleus is the only organelle that shows enrichment for common age-associated trait heritability builds on prior work implicating nuclear processes in aging. Most human progeroid syndromes result from monogenic defects in nuclear components (Kubben and Misteli, 2017) (e.g. LMNA in Hutchinson-Gilford progeria syndrome, TERC in dyskeratosis congenita), and telomere length has long been observed as a marker of aging (Garcia et al., 2007). Heritability enrichment of age-related traits among gene regulators is consistent with the epigenetic dysregulation (Han and Brunet, 2012) and elevated transcriptional noise (López-Otín et al., 2013; Bahar et al., 2006) observed in aging (e.g. SIRT6 modulation influences mouse longevity and metabolic syndrome; Kanfi et al., 2012; Kanfi et al., 2010). An important role for gene regulation in common age-related disease is in agreement with both the observation that a very large fraction of common disease-associated loci corresponds to the non-coding genome and the enrichment of disease heritability in histone marks and TF binding sites (Finucane et al., 2015; Karczewski et al., 2013). Given that a deterioration in several other cellular organelles has been so frequently documented in age-related traits, a future challenge lies in elucidating how inherited variation in or near TFs ultimately leads to the observed organelle dysfunction in age-related disease.

Data availability

Heritability point estimates and standard errors for age-related traits are listed in Supplementary file 1. Genetic and phenotypic correlation point estimates and standard errors/p-values plotted in Figure 1B are available in Figure 1—source data 1. Summary statistics from mtDNA-GWAS (plotted in Figure 2 and Figure 2—figure supplement 9) are available in Source data 2. All gene-based enrichment analysis p-values and point estimates are available in Source data 1 and Source data 3. Period prevalence data for diseases in the UK can be obtained from Kuan et al., 2019. Gene-sets can be found using COMPARTMENTS (https://compartments.jensenlab.org), MitoCarta 2.0 (https://www.broadinstitute.org/files/shared/metabolism/mitocarta/human.mitocarta2.0.html), Lambert et al., 2018 (DOI: 10.1016/j.cell.2018.01.029), Frazier et al., 2019 (DOI: 10.1074/jbc.R117.809194), Finucane et al., 2018 (https://alkesgroup.broadinstitute.org/LDSCORE/), Kapopoulou et al., 2016 (DOI: 10.1111/evo.12819), and the MacArthur laboratory (https://github.com/macarthur-lab/gene_lists, copy archived at swh:1:rev:fcc849637bd71e683bffc618e1a48081a8df08f8), Minikel, 2021. Gene age estimates were obtained from Litman and Stein, 2019 (DOI: 10.1053/j.seminoncol.2018.11.002). GWAS catalog annotations can be obtained from: https://www.ebi.ac.uk/gwas. Heritability estimates across UKB can be obtained at: https://nealelab.github.io/UKBB_ldsc/. UKB summary statistics can be obtained from Neale lab GWAS round 2: https://github.com/Nealelab/UK_Biobank_GWAS, (copy archived at swh:1:rev:dc7b7b590413ec96a45a64f7213f50a3a0606198), Howrigan, 2021. Annotations for the Baseline v1.1 and BaselineLD v2.2 models as well as other relevant reference data, including the 1000G EUR reference panel, can be obtained from https://alkesgroup.broadinstitute.org/LDSCORE/. eQTL and expression data in human tissues can be obtained from GTEx: https://www.gtexportal.org. 
Constraint estimates can be found via gnomAD: https://gnomad.broadinstitute.org. See citations for publicly available GWAS meta-analysis summary statistics (Teslovich et al., 2010; Ehret et al., 2011; Timmers et al., 2019; Zenin et al., 2019; Manning et al., 2012; Morris et al., 2012; Schunkert et al., 2011; Estrada et al., 2012; Christophersen et al., 2017; Pattaro et al., 2016; Nalls et al., 2019; Lambert et al., 2013).

Code availability

Our analysis leverages publicly available tools including LDSC for heritability enrichment and genetic correlation (https://github.com/bulik/ldsc, copy archived at swh:1:rev:aa33296abac9569a6422ee6ba7eb4b902422cc74); Schorsch, 2021, MAGMA v1.07b for gene-set enrichment analysis (https://ctg.cncr.nl/software/magma), Hail v0.2.51 for distributed computing and mtDNA GWAS (https://hail.is), the R circlize package (Gu et al., 2014) for visualization of mtDNA-GWAS, and the R polycor package for phenotypic correlations with binary traits.

Materials and methods

Trait selection

Sex-standardized period prevalence of over 300 diseases was obtained from an extensive survey of the National Health Service in the UK as reported previously (Kuan et al., 2019). To select high prevalence late-onset diseases, we ranked diseases with a median onset over 50 years of age by the sum of the period prevalence of all age categories above 50. We selected the top 30 diseases using this metric and manually mapped these traits to similar or equivalent phenotypes with publicly available summary statistics from UKB and/or well-powered meta-analyses (e.g. Parkinson’s Disease and Alzheimer’s Disease for dementia) resulting in 24 traits with data available in UKB (RRID:SCR_012815), meta-analyses, or both (Supplementary file 1).
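
The ranking step described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the `Disease` record, the keying of age categories by their lower bound, and the toy prevalence values are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Disease:
    name: str
    median_onset: float  # median age at onset, in years
    # Hypothetical layout: lower bound of each age category -> period prevalence
    prevalence_by_age: dict = field(default_factory=dict)


def late_onset_burden(disease, age_cutoff=50):
    """Sum period prevalence over age categories starting at or above the cutoff."""
    return sum(p for age, p in disease.prevalence_by_age.items() if age >= age_cutoff)


def select_late_onset(diseases, age_cutoff=50, top_n=30):
    """Keep diseases with median onset above the cutoff, ranked by summed prevalence."""
    late = [d for d in diseases if d.median_onset > age_cutoff]
    ranked = sorted(late, key=lambda d: late_onset_burden(d, age_cutoff), reverse=True)
    return ranked[:top_n]


# Toy values for illustration only (not taken from Kuan et al., 2019)
toy = [
    Disease("disease_a", 62, {40: 0.01, 50: 0.05, 60: 0.10}),
    Disease("disease_b", 45, {40: 0.20, 50: 0.30, 60: 0.40}),
    Disease("disease_c", 70, {40: 0.00, 50: 0.02, 60: 0.20}),
]
print([d.name for d in select_late_onset(toy)])
```

In this toy input, `disease_b` is dropped because its median onset is below 50, even though its summed prevalence is the largest.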

Criteria for inclusion of summary statistics

We manually mapped selected age-related diseases and traits to corresponding phenotypes in UKB. In parallel, we searched the literature to identify well-powered EUR-predominant GWAS (referred to as meta-analyses) that (1) used primarily non-targeted arrays, (2) had publicly available full summary statistics, and (3) did not enroll individuals from UKB to serve as independent replication (Appendix 1). We produced heritability estimates using stratified linkage-disequilibrium score regression (S-LDSC, https://github.com/bulik/ldsc) (Finucane et al., 2015) atop the BaselineLD v2.2 model using reference LD scores computed from 1000G EUR (https://alkesgroup.broadinstitute.org/LDSCORE/). We computed the heritability Z-score, a statistic that captures sample size, polygenicity, and heritability (Finucane et al., 2015), and included only traits with heritability Z-score > 4 (Appendix 1) for further analysis.
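
As a small sketch of the inclusion filter, the heritability Z-score can be taken as the heritability point estimate divided by its standard error, with traits retained when it exceeds 4. The function names and the numeric values below are hypothetical and serve only to illustrate the filter.

```python
def heritability_z(h2, se):
    """Heritability Z-score: point estimate divided by its standard error."""
    return h2 / se


def select_traits(estimates, z_min=4.0):
    """Return the traits whose heritability Z-score exceeds z_min.

    estimates: dict mapping trait name -> (h2, standard error)
    """
    return [trait for trait, (h2, se) in estimates.items()
            if heritability_z(h2, se) > z_min]


# Illustrative values only, not estimates from Supplementary file 1
toy_estimates = {"trait_a": (0.20, 0.02), "trait_b": (0.03, 0.01)}
print(select_traits(toy_estimates))
```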

Genetic correlations among age-related traits

Pairwise genetic correlations, rg, were computed using linkage-disequilibrium score correlation (Bulik-Sullivan et al., 2015) on all selected age-related traits with heritability Z-score > 4. We used UKB summary statistics (https://github.com/Nealelab/UK_Biobank_GWAS) for all sufficiently powered traits; summary statistics from meta-analyses were used for eGFR (Pattaro et al., 2016), Alzheimer’s Disease (Lambert et al., 2013), and Parkinson’s Disease (Nalls et al., 2019) as these traits showed heritability Z-score > 4 within meta-analyses but not in UKB (Supplementary file 1). p-Values for genetic correlation represented deviation from the null hypothesis rg=0. Traits were ordered by their contribution to the first eigenvector of the absolute value of the correlation matrix, with point estimates and standard errors available in Source data 1. Bonferroni correction was applied, producing a p-value cutoff of $0.05/\left[\binom{24}{2}+\binom{21}{2}\right] = 1.03 \times 10^{-4}$, accounting for both genotypic and phenotypic correlation hypothesis tests.
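
The Bonferroni cutoff can be checked with a few lines of arithmetic, assuming that the two terms in the denominator count all unordered trait pairs among the 24 traits used for genetic correlation and the 21 traits used for phenotypic correlation.

```python
from math import comb

n_genetic_tests = comb(24, 2)     # 276 trait pairs for genetic correlation
n_phenotypic_tests = comb(21, 2)  # 210 trait pairs for phenotypic correlation

# Bonferroni cutoff across both families of tests
cutoff = 0.05 / (n_genetic_tests + n_phenotypic_tests)
print(f"{cutoff:.3g}")  # 0.000103
```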

Phenotypic correlations in UKB

Pairwise phenotypic correlations, rp, were computed for all 21 traits with well-powered individual level data available in UKB (Supplementary file 1). Pearson correlation was computed between continuous traits via cor.test in R (RRID:SCR_001905) with a two-sided alternative. Tetrachoric correlation was used to compute correlations between binary traits and biserial correlation was used for correlations between binary and continuous traits, using the polychor and polyserial functions of the polycor package in R using the two-step approximation, respectively. These approaches model a latent normally distributed variable underlying binary traits. p-Values were computed using a normal approximation using standard error estimates from polycor. Point estimates and standard errors are available in Figure 1—source data 1.
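
For the normal approximation, a p-value can be obtained from a point estimate and its standard error as sketched below. This assumes a two-sided test of zero correlation, which matches the alternative stated for the Pearson correlations but is an assumption for the polycor-based estimates; the function name is hypothetical.

```python
from math import erfc, sqrt


def normal_approx_p(estimate, se):
    """Two-sided p-value for H0: estimate = 0 under a normal approximation."""
    z = estimate / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi
    return erfc(abs(z) / sqrt(2))


print(normal_approx_p(0.12, 0.03))
```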

Assessment of mitochondria-localizing genes in the GWAS catalog

We mapped variants in the GWAS Catalog (RRID:SCR_012745) (obtained on September 5th, 2019, https://www.ebi.ac.uk/gwas/) meeting genome-wide significance (p < 5e-8) to genes using provided annotations, producing a set of trait-associated genes for each trait. We manually selected phenotypes represented in the GWAS Catalog matching our set of age-associated traits with > 30 trait-associated genes. For each trait, we computed the proportion of trait-associated genes that were mitochondria-localizing (defined via MitoCarta2.0; Calvo et al., 2016, RRID:SCR_018165) and tested for enrichment or depletion relative to overall genome background using two-sided Fisher’s exact tests. We corrected for multiple hypothesis tests with the Benjamini-Hochberg (BH) procedure at FDR q-value < 0.1.
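
A two-sided Fisher's exact test on a 2×2 table (mitochondria-localizing or not, trait-associated or not) can be sketched in pure Python as below. The two-sided p-value here sums the probabilities of all tables that are no more likely than the observed one, a common convention that we assume for illustration; the study itself may have used a different implementation.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    For example: a = trait-associated genes in the gene-set, b = other genes in
    the gene-set, c = trait-associated genes outside it, d = remaining genes.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of observing k in the top-left cell
        return comb(row1, k) * comb(row2, col1 - k) / denom

    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over tables at least as extreme (no more probable) than the observed one
    total = sum(p for p in (prob(k) for k in range(k_min, k_max + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


print(round(fisher_exact_two_sided(1, 9, 11, 3), 5))  # 0.00276
```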

We also computed the test statistic $N^{g}_{\mathrm{enrich}}$, defined as the number of age-associated traits showing a nominal (not necessarily statistically significant) enrichment for a given gene-set $g$, for the MitoCarta genes. We then generated an empirical null distribution for $N^{g}_{\mathrm{enrich}}$: we drew 1000 random samples of protein-coding genes, where each sample contained the same number of genes as the set of mitochondria-localizing genes, and computed $N^{g}_{\mathrm{enrich}}$ for each of these gene-sets (Figure 2—figure supplement 1B). The one-sided p-value, defined as $\Pr(N^{g}_{\mathrm{enrich}} \geq x)$ under the null, where $x$ is the value observed for the MitoCarta genes, was subsequently obtained.
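
The sampling procedure can be sketched as below. We assume here that a nominal enrichment means that the fraction of trait-associated genes falling in the gene-set exceeds the fraction of all background genes in the gene-set, and that the one-sided p-value is the share of random gene-sets with a count at least as large as the observed one. Function names and the toy inputs are hypothetical.

```python
import random


def n_enrich(gene_set, trait_genes_by_trait, n_background):
    """Count traits whose associated genes are nominally enriched for gene_set."""
    background_fraction = len(gene_set) / n_background
    count = 0
    for trait_genes in trait_genes_by_trait.values():
        fraction = len(trait_genes & gene_set) / len(trait_genes)
        if fraction > background_fraction:
            count += 1
    return count


def empirical_p(gene_set, trait_genes_by_trait, all_genes, n_draws=1000, seed=0):
    """Return the observed count and its one-sided empirical p-value."""
    rng = random.Random(seed)
    pool = sorted(all_genes)
    observed = n_enrich(gene_set, trait_genes_by_trait, len(pool))
    null = []
    for _ in range(n_draws):
        # Random gene-set matched in size to the gene-set of interest
        sample = set(rng.sample(pool, len(gene_set)))
        null.append(n_enrich(sample, trait_genes_by_trait, len(pool)))
    p_value = sum(n >= observed for n in null) / n_draws
    return observed, p_value


# Toy universe of 100 genes and two traits, for illustration only
toy_genes = {f"g{i}" for i in range(100)}
toy_set = {f"g{i}" for i in range(10)}
toy_traits = {"t1": {f"g{i}" for i in range(5)}, "t2": {f"g{i}" for i in range(5, 10)}}
print(empirical_p(toy_set, toy_traits, toy_genes))
```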

We expanded our enrichment/depletion analysis to all 332 traits in the GWAS Catalog with over 30 trait-associated genes; for enrichment or depletion testing, we used two-sided Fisher’s exact tests and corrected for multiple hypothesis testing with the BH procedure at FDR q-value < 0.1.

Harmonization and filtering of summary statistics for LDSC and MAGMA

UKB summary statistics previously formatted for use with LDSC and filtered to HapMap3 (HM3) (RRID:SCR_004563) SNPs (https://github.com/Nealelab/UKBB_ldsc) were used for analysis with S-LDSC. For analysis with MAGMA v1.07b (de Leeuw et al., 2015), we included variants from the full Neale Lab UKB Round 2 GWAS summary statistics (https://github.com/Nealelab/UK_Biobank_GWAS) with INFO > 0.8 and MAF > 0.01, and excluded any variants flagged as low confidence (a heuristic defined by MAF < 0.001 or expected case MAC < 25).

Summary statistics obtained from publicly available GWAS meta-analyses (Teslovich et al., 2010; Ehret et al., 2011; Manning et al., 2012; Morris et al., 2012; Schunkert et al., 2011; Estrada et al., 2012; Christophersen et al., 2017; Pattaro et al., 2016; Nalls et al., 2019; Lambert et al., 2013) were reported in varied formats. We manually verified the genome build upon which each meta-analysis reported results and ensured that all sets of summary statistics contained columns listing p-value, variant rsID, genome-build specific coordinates, and if available, variant-specific sample size (Supplementary file 1). If variant coordinates or rsID were not provided, the relevant columns were obtained from dbSNP (RRID:SCR_002338) database version 130 (for hg18) or 146 (for hg19). We used the summary statistic munging script provided with S-LDSC (https://github.com/bulik/ldsc) to generate summary statistics compatible with S-LDSC, restricting to HM3 SNPs as these tend to be best behaved for analysis with LDSC. For use of meta-analyses with MAGMA (de Leeuw et al., 2015), we restricted analysis to variants with INFO > 0.8 and MAF > 0.01 if such information was provided.

Multiple testing correction for gene-set enrichment analysis

To account for the multiple hypothesis tests performed throughout this study for age-related traits, we obtained p-value thresholds via the BH procedure at FDR < 0.1 for all gene-sets assessed for a given method and cohort type (where the two cohort types were UKB and meta-analysis). The BH procedure at FDR < 0.1 was also applied to our analyses of parental lifespan and healthspan.
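
For reference, the BH step-up procedure at a given FDR can be sketched as follows; this is a generic implementation with a hypothetical function name, not code from the study.

```python
def benjamini_hochberg(pvalues, fdr=0.1):
    """Return a list of booleans marking the hypotheses rejected by BH at the given FDR."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])

    # Largest rank k such that the k-th smallest p-value is <= (k / m) * FDR
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * fdr:
            k_max = rank

    # Reject every hypothesis ranked at or below k_max
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


print(benjamini_hochberg([0.001, 0.04, 0.03, 0.5]))  # [True, True, True, False]
```

In the workflow described above, the procedure would be applied separately to the p-values from each method and cohort type.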

Gene-set-based enrichment analysis

We extensively use S-LDSC and MAGMA to perform gene-set enrichment analyses among GWAS summary statistics. To test enrichment with S-LDSC, SNPs were mapped to each gene with a 100 kb symmetric window as recommended (Finucane et al., 2018) and LD scores were computed using the 1000G EUR reference panel (RRID:SCR_006828) (https://alkesgroup.broadinstitute.org/LDSCORE/) and subsequently restricted to the HM3 SNPs. We used S-LDSC to test for heritability enrichment controlling for 53 annotations including coding regions, enhancer regions, 5’ and 3’ UTRs, and others as previously described (Finucane et al., 2015) (baseline v1.1, referred to as baseline model hereafter). We also used MAGMA with both 5 kb up, 1.5 kb down and 100 kb symmetric windows to test for enrichment. MAGMA gene-level analysis was performed with the 1000G EUR LD reference panel to account for LD structure, and gene-set analysis was performed including covariates for gene length, variant density, inverse minor allele count (MAC), as well as log-transformed versions of these covariates. Statistical tests for both S-LDSC and MAGMA were one-sided, considering enrichment only. For both methods, we included the relevant superset of genes as a control to ensure that our analysis was competitive (Appendix 1). We refer to this approach as the ‘usual approach.’ All enrichment effect size estimates and p-values are available in Source data 1 and Source data 3.
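
The two window definitions can be illustrated with a short sketch. We assume that "up" and "down" are interpreted relative to the gene's strand, and the `Gene` record and SNP tuples are hypothetical structures chosen for the example; in practice the annotation is performed by the tools themselves.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    chrom: str
    start: int   # leftmost coordinate
    end: int     # rightmost coordinate
    strand: str  # "+" or "-"


def gene_window(gene, upstream_bp, downstream_bp):
    """Return (low, high) coordinates of the gene extended by a strand-aware window."""
    if gene.strand == "+":
        low, high = gene.start - upstream_bp, gene.end + downstream_bp
    else:
        # On the minus strand, upstream lies to the right of the gene
        low, high = gene.start - downstream_bp, gene.end + upstream_bp
    return max(0, low), high


def snps_in_window(gene, snps, upstream_bp, downstream_bp):
    """snps: iterable of (rsid, chrom, position) tuples."""
    low, high = gene_window(gene, upstream_bp, downstream_bp)
    return [rsid for rsid, chrom, pos in snps
            if chrom == gene.chrom and low <= pos <= high]


toy_gene = Gene("GENE1", "1", 200_000, 210_000, "+")
print(gene_window(toy_gene, 100_000, 100_000))  # symmetric 100 kb window
print(gene_window(toy_gene, 5_000, 1_500))      # 5 kb up, 1.5 kb down
```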

Enrichment analysis of genes comprising the mitochondrial proteome

We obtained the set of nuclear-encoded mitochondria-localizing genes using MitoCarta2.0 (Calvo et al., 2016) and used the literature to obtain the subset of MitoCarta genes involved in inherited mitochondrial disease (Frazier et al., 2019) as well as those producing components of oxidative phosphorylation (OXPHOS) complexes. We used both S-LDSC and MAGMA to test for enrichment in the usual way (Materials and methods) controlling for the set of protein-coding genes to ensure a competitive analysis (Appendix 1). We also tested mitochondria-localizing genes for enrichment in meta-analyses using S-LDSC and MAGMA with the same parameters as for UKB traits (Appendix 1).

Tissue-expressed gene-set enrichment analysis

To obtain the set of genes most expressed in a given tissue versus others, we obtained t-statistics computed from GTEx (RRID:SCR_013042) v6 gene-level transcript-per-million (TPM) data corrected for age and sex as published previously (Finucane et al., 2018). For each tissue, we selected the top 2485 genes (10%) with the highest t-statistics for tissue-specific expression, producing tissue-expressed gene-sets. We selected nine tissues based on expectation of enrichment for our tested traits in UKB (e.g. liver for LDL levels, esophageal mucosa for GERD). We used both S-LDSC and MAGMA to test for enrichment in the usual way (Materials and methods) controlling for the set of tissue-expressed genes to ensure a competitive analysis (Appendix 1). Tissue-expressed gene-set analyses were performed on meta-analyses with S-LDSC and MAGMA on the same tissues using the same parameters as used in UKB.

Power analysis

To test for the effects of gene-set size on power, we selected 10 positive control tissue-trait pairs based on (1) the presence of tissue enrichment in UKB with S-LDSC and MAGMA and (2) the biological plausibility of the observed enrichment. The pairs tested were liver-HDL, liver-LDL, liver-TG, liver-cholesterol, pancreas-glucose, pancreas-T2D, atrial appendage-atrial fibrillation, sigmoid colon-diverticular disease, coronary artery-myocardial infarction, and visceral adipose-HDL. We then, in brief, used an empirical sampling-based approach, generating random subsamples of a selected set of tissue-expressed gene-sets at four different gene-set sizes (1523, 1105, 800, and 350 genes), defining power as the proportion of trials showing a significant enrichment (Appendix 1). We used the same sub-sampled gene-sets for enrichment analysis using both S-LDSC and MAGMA in the usual way (Materials and methods) controlling for the set of tissue-expressed genes to ensure a competitive analysis (Appendix 1). We used the same gene-sets among the subset of the positive control traits that showed enrichment in the corresponding meta-analysis to verify power for the meta-analyses (Appendix 1).
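
The sampling-based definition of power can be sketched as follows. The `enrichment_pvalue` callable stands in for a full S-LDSC or MAGMA run and is hypothetical, as are the number of trials and the significance threshold, which are not specified in this section.

```python
import random


def estimate_power(gene_set, subsample_size, enrichment_pvalue, alpha,
                   n_trials=100, seed=0):
    """Proportion of random subsamples of gene_set that show significant enrichment.

    enrichment_pvalue: callable taking a set of genes and returning a p-value
    (a placeholder for an external enrichment analysis).
    """
    rng = random.Random(seed)
    pool = sorted(gene_set)
    significant = 0
    for _ in range(n_trials):
        subsample = set(rng.sample(pool, subsample_size))
        if enrichment_pvalue(subsample) < alpha:
            significant += 1
    return significant / n_trials


# Toy stand-in: pretend enrichment is always detected
toy_tissue_genes = {f"g{i}" for i in range(2485)}
print(estimate_power(toy_tissue_genes, 350, lambda genes: 0.001, alpha=0.05))
```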

Cross-tissue eQTL analysis

We obtained the set of eGenes from GTEx (RRID:SCR_013042) v8 across 49 tissues (https://www.gtexportal.org), filtering to only include cis-eQTLs with q-value < 0.05. To determine how the landscape of cis-eQTLs for MitoCarta genes compared to other protein-coding genes, we regressed the number of tissues with a detected cis-eQTL for a given gene $x$, $N_x^{\mathrm{eQTL}}$, onto an indicator for membership in a given organellar proteome ($I_x^{\mathrm{organelle}}$), controlling for gene length, log gene length, breadth of expression ($\tau_x$), and the number of tissues with detected expression > 5 TPM ($N_x^{\mathrm{express}}$, Appendix 1). To quantify breadth of expression, we obtained median-per-tissue GTEx v8 TPM expression values and, after removing lowly expressed genes with maximal cross-tissue TPM < 1, computed τ (Yanai et al., 2005), defined as:

$$\tau = \frac{\sum_{i=1}^{n} \left(1 - \hat{x}_i\right)}{n - 1}, \qquad \hat{x}_i = \frac{x_i}{\max_{1 \le j \le n} x_j}$$

where $x_i$ is the expression of gene $x$ in tissue $i$ with $n$ tissues. τ ranges from 0 to 1, with lower τ indicating broadly expressed genes and higher τ indicating more tissue-specific expression patterns. Because GTEx sampled multiple tissue subtypes (e.g. brain sub-regions) that show correlated expression profiles (Melé et al., 2015) which bias $\tau_x$, $N_x^{\mathrm{eQTL}}$, and $N_x^{\mathrm{express}}$ upward, for each broader tissue class (brain, heart, artery, esophagus, skin, cervix, colon, adipose), we selected a single representative tissue when computing these quantities (Figure 3—figure supplement 5B, Appendix 1). We used LD scores computed from the 1000G EUR reference panel. The model, fit via ordinary least squares for each tested organelle, was:

$$N_x^{\mathrm{eQTL}} = \beta_0 + \beta_1 I_x^{\mathrm{organelle}} + \beta_2 L_x + \beta_3 \log L_x + \beta_4 \tau_x + \beta_5 N_x^{\mathrm{express}} + \epsilon_x$$

where $L_x$ denotes the length of gene $x$.

mtDNA-wide association study

We obtained mtDNA genotype data on 265 variants as obtained on the UK Biobank Axiom array and the UK BiLEVE array from the full UKB release (RRID:SCR_012815) (Sudlow et al., 2015). To perform variant QC, we used evoker-lite (RRID:SCR_009145) (Morris et al., 2010) to generate fluorescence cluster plots per-variant and per-batch and manually inspected the results, removing 19 variants due to cluster plot abnormalities (Supplementary file 2a, Appendix 1). We additionally removed any variants with heterozygous calls, within-array-type call rate < 0.95, or fewer than 20 individuals with an alternate genotype. For case-control traits, we removed any phenotype-variant pair with an expected case count of alternate genotype individuals of less than 20, resulting in a maximum of 213 variants tested per trait (Appendix 1). To perform sample QC, we restricted samples to the same samples from which UKB summary statistics were generated (https://github.com/Nealelab/UK_Biobank_GWAS), namely unrelated individuals within seven standard deviations of the first six European sample selection PCs with self-reported white-British, Irish, or White ethnicity and no evidence of sex chromosome aneuploidy. We additionally removed any samples with within-array-type mitochondrial variant call rate < 0.95, resulting in 360,662 unrelated samples of EUR ancestry. We generated the LD matrix for mitochondrial DNA variants using Hail v0.2.51 (https://hail.is) pairwise for all 213 variants tested across all post-QC samples.

We ran mtDNA-GWAS for all 21 UKB age-related phenotypes as well as creatinine and AST using Hail v0.2.51 via linear regression controlling for the first 20 PCs of the nuclear genotype matrix, sex, age, age², sex*age, and sex*age² as performed for the UKB GWAS (https://github.com/Nealelab/UK_Biobank_GWAS). We also used Hail to run Firth logistic regression with the same covariates for case/control traits. As we observed that some mitochondrial DNA variants were specific to array type, we also ran linear regression including array type as a covariate; we did not perform logistic regression with array type as a covariate due to convergence issues from complete separation of variants assessed only on a single array type. We defined mtDNA-wide significance using a Bonferroni correction as $p = 0.05/4337 \approx 1.15 \times 10^{-5}$.

Enrichment analysis of components of organellar proteomes

COMPARTMENTS (RRID:SCR_015561) (https://compartments.jensenlab.org) (Binder et al., 2014) is a resource integrating several lines of evidence for protein localization predictions including annotations, text-mining, sequence predictions, and experimental data from the Human Protein Atlas. We used this resource to obtain the degree of evidence (a number ranging from 0 to 5) linking each gene to localization to one of 12 organelles: nucleus, cytosol, cytoskeleton, peroxisome, lysosome, endoplasmic reticulum, Golgi apparatus, plasma membrane, endosome, extracellular space, mitochondrion, and proteasome. To avoid noisy localization assignments due to weak text mining and prediction evidence, we only considered localization assignments with a score > 2 as described previously (Binder et al., 2014). We subsequently assigned compartment(s) to each gene by selecting the compartment(s) with the maximal score within each gene. We only included compartments containing over 240 genes due to limited power at smaller gene-set sizes and used MitoCarta2.0 (Calvo et al., 2016) to obtain a higher confidence set of genes localizing to the mitochondrion, resulting in gene-sets representing the proteomes of 10 organelles. S-LDSC and MAGMA were used to test for enrichment across the UKB age-related traits for these gene-sets in the usual way, controlling for the set of protein-coding genes. S-LDSC was also used to obtain estimates of the percentage of heritability explained by each organelle gene-set.
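
The assignment rule described above (discard scores of 2 or below, keep the top-scoring compartment or compartments per gene, then drop small compartments) can be sketched as follows. The nested-dictionary input format and the function name are assumptions made for the example and do not reflect the COMPARTMENTS download format.

```python
from collections import defaultdict


def assign_compartments(scores, min_score=2, min_genes=240):
    """Assign each gene to its top-scoring compartment(s).

    scores: dict mapping gene -> {compartment: confidence score from 0 to 5}
    Returns a dict mapping compartment -> set of genes, keeping only
    compartments with more than min_genes genes.
    """
    members = defaultdict(set)
    for gene, by_compartment in scores.items():
        # Keep only localization evidence with score strictly above the threshold
        confident = {c: s for c, s in by_compartment.items() if s > min_score}
        if not confident:
            continue
        best = max(confident.values())
        # Ties are kept, so a gene may be assigned to several compartments
        for compartment, score in confident.items():
            if score == best:
                members[compartment].add(gene)
    return {c: genes for c, genes in members.items() if len(genes) > min_genes}


toy_scores = {
    "G1": {"nucleus": 5, "cytosol": 3},
    "G2": {"nucleus": 4, "cytosol": 4},
    "G3": {"cytosol": 2},
    "G4": {"nucleus": 3},
}
# min_genes lowered so the toy example is not filtered out entirely
print(assign_compartments(toy_scores, min_genes=0))
```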

Enrichment analysis of spatial components of the nucleus

To produce interpretable sub-divisions of the nucleus, we used Gene Ontology (GO) (RRID:SCR_017505) (The Gene Ontology Consortium et al., 2019; Ashburner et al., 2000) to identify terms listed as children of the nucleus cellular component (GO:0005634). We used Ensembl (RRID:SCR_002344) version 99 (Yates et al., 2020) to obtain a first pass set of genes annotated to each sub-compartment of the nucleus (or its children). After manual review of sub-compartments with > 90 genes, we selected nucleoplasm (GO:0005654), nuclear chromosome (GO:0000228), nucleolus (GO:0005730), nuclear envelope (GO:0005635), spliceosomal complex (GO:0005681), nuclear DNA-directed RNA polymerase complex (GO:0055029), and nuclear pore (GO:0005643). We excluded terms listed as ‘part’ due to poor interpretability and manually excluded similar terms (e.g. nuclear lumen vs nucleoplasm). To generate a high confidence set of genes localizing to each of these selected sub-compartments, we then turned to the COMPARTMENTS resource which assigns localization confidence scores for each protein to GO cellular component terms. We assigned members of the nuclear proteome to these selected nuclear sub-compartments using the same approach outlined for the organelle analysis (Materials and methods). After filtering our selected sub-compartments to those containing > 240 genes, we obtained four categories: nucleoplasm, nuclear chromosome, nucleolus, and nuclear envelope. The nuclear chromosome annotation largely overlapped with a manually curated high-quality list of TFs (Lambert et al., 2018) but was not exhaustive; as such, we merged these lists to generate the chromosome and TF category. To improve interpretability, we removed genes from nucleoplasm that were also assigned to another nuclear sub-compartment, constructed a list of other nucleus-localizing proteins not captured in these four sub-compartments, and included only genes annotated as localizing to the nucleus (Materials and methods). S-LDSC and MAGMA were used to test for enrichment across the UKB age-related traits for these gene-sets in the usual way while controlling for the set of protein-coding genes (Materials and methods).

Enrichment analysis of functionally distinct TF subsets

We used a published, curated, high-quality list of TFs (Lambert et al., 2018) to partition the Chromosome and TF category into TFs and other chromosomal proteins. To determine which TFs are broadly expressed versus tissue specific, we computed τ per TF across all selected tissues after removing lowly expressed genes with maximal cross-tissue TPM < 1 (Materials and methods, Appendix 1). The threshold for tissue-specific genes was set at τ ≥ 0.76 based on the location of the central nadir of the resultant bimodal distribution (Figure 3—figure supplement 5A). To identify terciles of TFs by age, we obtained relative gene age assignments for each gene previously generated by obtaining the modal earliest ortholog level across several databases mapped to 19 ordered phylostrata (Litman and Stein, 2019). DNA-binding domain (DBD) annotations for the TFs were obtained from previous manual curation efforts (Lambert et al., 2018). S-LDSC and MAGMA were used to test for enrichment across the UKB age-related traits for these gene-sets in the usual way while controlling for the set of protein-coding genes (Materials and methods). We also tested TFs for enrichment in meta-analyses using S-LDSC and MAGMA with the same parameters as for UKB traits (Appendix 1).
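
The tissue-specificity index τ of Yanai et al., 2005 and the classification of TFs can be sketched as below. The sketch assumes that a gene is called tissue specific when τ is at or above 0.76 and that expression is supplied as one median TPM value per selected tissue; the function names and toy expression vectors are hypothetical.

```python
def tau(expression):
    """Tissue-specificity index: 0 for uniform expression, 1 for a single tissue.

    expression: list of per-tissue expression values (e.g. median TPM) for one gene.
    """
    n = len(expression)
    x_max = max(expression)
    if n < 2 or x_max <= 0:
        raise ValueError("need at least two tissues and non-zero maximal expression")
    # Normalize each tissue by the maximum, then average (1 - normalized value)
    return sum(1 - x / x_max for x in expression) / (n - 1)


def classify_tf(expression, min_max_tpm=1.0, tau_threshold=0.76):
    """Return None for lowly expressed genes, else 'tissue-specific' or 'broad'."""
    if max(expression) < min_max_tpm:
        return None  # removed before computing tau
    return "tissue-specific" if tau(expression) >= tau_threshold else "broad"


print(tau([4.0, 2.0, 2.0]))                 # 0.5
print(classify_tf([0.0, 0.0, 0.0, 8.0]))    # tissue-specific
print(classify_tf([10.0, 9.0, 11.0, 10.0])) # broad
```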

Analysis of constraint across organelles and sub-organellar gene-sets

We obtained gene-level gnomAD (RRID:SCR_014964) v2.1.1 constraint tables (https://gnomad.broadinstitute.org), haploinsufficient genes, and olfactory receptors (Karczewski et al., 2020) (https://github.com/macarthur-lab/gene_lists). Constraint values as loss-of-function observed/expected fraction (LOEUF) were mapped to genes within organelle, sub-mitochondrial, sub-nuclear, and TF binding domain gene-sets.

Enrichment analysis across age-related disease holding constraint as a covariate

To test for enrichment with constraint as a covariate, we used MAGMA with UKB age-related traits. We mapped variants to genes and performed the gene-level analysis as done previously for the mitochondria-localizing gene and organelle analysis. We included LOEUF and log LOEUF as covariates for the gene-set analysis in addition to the default covariates (gene length, SNP density, inverse MAC, as well as the respective log-transformed versions) via the –condition-residualize flag.

Appendix 1

Choice of traits with meta-analyses with cohorts separate from UKB

For 10 traits with well-powered UKB GWAS and meta-analyses, we ensured that the meta-analyses used did not incorporate data from UKB, thus allowing their use as replication cohorts. Parkinson’s disease (Nalls et al., 2019) and Alzheimer’s disease (Lambert et al., 2013) were analyzed in meta-analyses but not in UKB because of power limitations in UKB, and eGFR was assessed only in the tested meta-analysis. In the case of Parkinson’s disease, a well-powered GWAS was recently performed and included UKB individuals (Nalls et al., 2019). Given that this trait was not sufficiently powered for analysis in UKB alone, we chose to proceed with summary statistics from this study. Because mtDNA-GWAS could only be performed in UKB (where we had access to individual-level data), we were unable to explicitly test for mtDNA associations with Parkinson’s disease, Alzheimer’s disease, and eGFR.

Heritability Z-score threshold selection

Total SNP heritability Z-score encapsulates variables such as polygenicity, sample size, and underlying disease heritability, all of which influence S-LDSC power (Finucane et al., 2015). Previous work has indicated that genetic correlation estimates from LD score regression are noisy for total SNP heritability Z-score < 4 (Bulik-Sullivan et al., 2015), and total SNP heritability Z-score > 7 has been used as a condition for trait inclusion for S-LDSC (Finucane et al., 2015). We decided to use a more relaxed cutoff of total SNP heritability Z-score > 4 for two major reasons: First, we used a distinct enrichment methodology, MAGMA, to validate enrichment signatures. To our knowledge, MAGMA does not produce unstable enrichment estimates for traits with moderate heritability Z-score. Second, we also used GWAS data from non-overlapping cohorts, when available, as independent validation for traits tested in UKB. The lower cutoff was sufficient to produce results that largely replicated across methodology and cohort, while allowing for the inclusion of several traits of interest. Further, several traits with heritability Z-score between 4 and 7 show positive control tissue enrichments and substantial enrichment detection power (for example, LDL levels).

Choice of traits to test in the GWAS Catalog

We searched the GWAS Catalog phenotypes to identify age-related traits. We manually identified 30 phenotypes that matched our 24 age-related traits (Figure 2—figure supplement 1A). This list differs from our full list of age-related traits for two reasons: (1) not all 24 age-related traits had a sufficient number of associated genes for analysis, and (2) in several cases, multiple phenotypes listed in the GWAS catalog matched our age-related traits (e.g. ‘Cholesterol, total’ and ‘Total cholesterol levels’); we tested these separately.

Investigation of mitochondria-relevant transcription factors in the GWAS Catalog

We tested if any of eight TFs known to regulate mitochondrial function – TFAM, GABPA, GABPB1, ESRRA, YY1, NRF1, PPARGC1A, and PPARGC1B – were the nearest gene to any genome-wide significant variants listed for age-related traits in the GWAS Catalog. We tested the same traits we used for enrichment analysis of the MitoCarta genes in the GWAS Catalog (Figure 2—figure supplement 1) and did not find any signal for 29/30 tested phenotypes. We did find that TFAM was one of the nearest genes for heel bone mineral density; however, we note that there are a total of 1496 unique mapped nearest genes for this trait. Further, we tested mitochondria-localizing genes for enrichment in GWAS for heel bone mineral density (3148_irnt) in UKB and found no evidence of enrichment (Figure 2).

Choice of enrichment method

In this study, we leveraged several enrichment methods to ensure robustness to methodology. We used Fisher’s exact test in a first-pass analysis of enrichment of GWAS signal in the GWAS Catalog. While this provides a useful preview of the enrichment landscape across published GWAS, this approach suffers from numerous limitations, including the usage of only genome-wide significant SNPs, the treatment of each variant as equally likely to contribute to GWAS signal under the null, and an inability to easily control for covariates such as gene length, among others. As such, we used two different methods, MAGMA and S-LDSC, to test for GWAS enrichment among our gene-sets while resolving these confounders and reducing the likelihood of model misspecification. We used S-LDSC to test for heritability enrichment within specified variants controlling for 53 functional categories including DNase hypersensitivity sites, H3K4Me sites, and coding regions. MAGMA uses a variation of Fisher’s method to obtain gene-level test statistics and test for gene-set enrichment controlling for LD structure, and when performing the gene-set enrichment testing we controlled for gene length, inverse MAC, and SNP density. We also used tissue-specific enrichments as positive controls to ensure that the methods we used were working properly.
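As a rough illustration of the first-pass Fisher’s exact test mentioned above, the one-sided (enrichment) p-value for a 2×2 gene overlap table equals a hypergeometric upper tail. The sketch below assumes a one-sided test and a user-chosen background gene count; neither choice is specified in this section, and the function and argument names are hypothetical.

```python
from math import comb


def fisher_enrichment_p(n_background, n_gene_set, n_trait_genes, n_overlap):
    """One-sided Fisher's exact p-value for gene-set over-representation.

    n_background:  number of genes considered in total (assumed background)
    n_gene_set:    number of background genes in the gene-set of interest
    n_trait_genes: number of background genes mapped to the trait
    n_overlap:     number of trait genes that are also in the gene-set
    """
    total = comb(n_background, n_trait_genes)
    upper = min(n_gene_set, n_trait_genes)
    # Sum the probabilities of every overlap at least as large as observed.
    tail = sum(
        comb(n_gene_set, i) * comb(n_background - n_gene_set, n_trait_genes - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total
```

Because the test only counts genes, it treats every gene as equally likely to be hit under the null, which is one of the limitations noted above.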

Notably, when running MAGMA on age-associated trait meta-analyses with a 100 kb window, we were unable to find tissue-specific enrichments (Figure 2—figure supplement 6B). Given that S-LDSC, and MAGMA with a 5 kb up and 1.5 kb down window, identified these enrichments in the selected meta-analyses (Figure 2—figure supplement 3B, Figure 2—figure supplement 6A) and that we observe reduced power for enrichments among meta-analyses relative to UKB (Figure 2—figure supplement 7H), we attributed the lack of tissue enrichments using MAGMA at 100 kb in meta-analyses to a lack of power. Indeed, MAGMA with a 100 kb symmetric window was able to identify enrichments in UKB (Figure 2—figure supplement 5B). Thus, we did not test any other gene-sets among meta-analyses using MAGMA with a 100 kb window.

Choice of control genes for gene-based tests

For all gene-based analyses we aimed to perform a competitive analysis, testing if our genes of interest explained more trait heritability than comparable loci elsewhere in the genome. For our positive control tests and power analyses leveraging the set of highest expressed tissue-specific genes, we controlled for the set of genes across which t-statistics were computed (~25,000 genes); namely all genes that had at least four samples in GTEx with one or more counts-per-million (Finucane et al., 2018). All of our non-tissue gene-sets (e.g. MitoCarta genes, organelle-localizing genes) were subsets of the set of protein-coding genes, so we controlled for the set of protein-coding genes for these analyses (~19,000 genes). For S-LDSC, this involved including the respective control gene-set annotation atop the baseline model; for MAGMA, this involved defining the gene location file based on the control gene-set such that the space of genes considered was restricted to the genes to be controlled for.

Power analysis of gene-based tests

To verify the power of S-LDSC and MAGMA in our selected traits, we sub-sampled each of ten positive control tissue-trait pairs. We subsampled the set of tissue-expressed genes for each of the six selected tissues at various gene-set sizes and empirically assessed the number of trials in which significant enrichment was detected, giving us an estimate of power, or Pr(reject|alternative). All tissue enrichments were originally performed with 2485 genes (Materials and methods); as such we conducted subsampling trials with 1523, 1105, 800, and 350 genes to assess power throughout our study. Because LD score computations are very computationally intensive, we generated 50 random subsamples per gene-set size-tissue pair ensuring that each sample contained a proportional number of genes per chromosome to the original tissue expressed gene-set. We mapped variants to genes and computed LD scores per-chromosome for each annotation (Materials and methods). For each gene-set size and tissue (24 gene-set size-tissue pairs), we generated 1000 sets of LD scores by shuffling LD scores computed per chromosome, effectively generating 1000 random tissue gene-set subsamples for each gene-set size-tissue pair. We subsequently used S-LDSC to test for enrichment for each of the 1000 tissue gene-set subsamples in the aforementioned selected traits for each gene-set size, resulting in 240,000 regressions atop the baseline model as performed for the tissue enrichments in the usual way (Materials and methods). The gene-sets generated for use with S-LDSC (1000 per gene-set size-tissue pair) were also exported for analysis using MAGMA with the same competitive analysis performed for the tissue-enrichment analysis (Materials and methods).
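The chromosome-proportional subsampling and the empirical power estimate described above can be sketched as follows. This is a simplified illustration, not the pipeline used in the study: the largest-remainder rounding rule, the function names, and the significance level `alpha` are all assumptions, and the LD score shuffling step is not shown.

```python
import random
from collections import defaultdict


def subsample_by_chromosome(gene_to_chrom, target_size, seed=0):
    """Draw target_size genes, keeping per-chromosome proportions.

    gene_to_chrom maps gene ID -> chromosome label. Rounding uses a
    largest-remainder rule (an assumption; the study may round differently).
    """
    rng = random.Random(seed)
    by_chrom = defaultdict(list)
    for gene, chrom in gene_to_chrom.items():
        by_chrom[chrom].append(gene)
    total = len(gene_to_chrom)

    quotas = {c: len(g) * target_size / total for c, g in by_chrom.items()}
    counts = {c: int(q) for c, q in quotas.items()}
    leftover = target_size - sum(counts.values())
    # Hand the remaining slots to the chromosomes with the largest remainders.
    ranked = sorted(quotas, key=lambda c: quotas[c] - counts[c], reverse=True)
    for chrom in ranked[:leftover]:
        counts[chrom] += 1

    sample = []
    for chrom in sorted(by_chrom):
        sample.extend(rng.sample(sorted(by_chrom[chrom]), counts[chrom]))
    return sample


def empirical_power(p_values, alpha=0.05):
    """Fraction of subsampling trials in which enrichment was detected."""
    return sum(p < alpha for p in p_values) / len(p_values)
```

In this sketch, each call with a different `seed` yields one random gene-set of the requested size, and `empirical_power` summarizes the enrichment p-values obtained across those gene-sets for one tissue-trait pair.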

To characterize the power differential between UKB and meta-analyses, we tested the subset of the tissue-trait pairs tested in UKB that showed enrichment in the corresponding meta-analysis with either S-LDSC (Figure 2—figure supplement 3B) or MAGMA (Figure 2—figure supplement 6). This resulted in an assessment of power among meta-analyses for liver-TG, liver-LDL, liver-HDL, visceral adipose-HDL, atrial appendage-atrial fibrillation, pancreas-T2D, and pancreas-glucose. We tested the same gene-sets tested with S-LDSC and MAGMA (1000 per gene-set size-tissue pair) in UKB using both S-LDSC and MAGMA in the usual way (Materials and methods).

As expected, we noted that power was a function of both enrichment effect size and gene-set size for S-LDSC and MAGMA (Figure 2—figure supplement 7A–F). While we observed lower power across most tested traits among meta-analyses when compared to UKB, power was acceptable among the meta-analyses for high effect size enrichments for gene-sets with 1105 genes (Figure 2—figure supplement 7G, Figure 2—figure supplement 7H).

Choice of gene-sets to test for replication among meta-analyses

Because our power analyses showed a substantial reduction of power for tested meta-analyses relative to UKB (Figure 2—figure supplement 7I, Figure 2—figure supplement 7J), we tested only a subset of all tested gene-sets for replication among meta-analyses. Namely, we sought to test replication of the two major organelle-based results in this study: (1) the lack of enrichment of mitochondria-localizing genes across age-related disease and (2) the enrichment of chromosome and TF genes, with subsequent enrichment of the TFs alone.

Choice of tissues to include for multi-tissue analyses

An assumption key to several statistics used for the eQTL and TF breadth of expression analyses ($\tau_x$, $N_x^{\mathrm{eQTL}}$, $N_x^{\mathrm{express}}$) is that different tissues are not overrepresented in the set of tissues assessed. This assumption breaks down in GTEx, where the brain, artery, and esophagus were sampled in multiple sub-regions and the skin, cervix, colon, and adipose tissue were sampled in two sub-regions. We manually selected specific sub-regions as shown in Figure 3—figure supplement 5B, as expression profiles within sub-regions tend to be far more similar than profiles between sub-regions. To test robustness, we selected an alternate set of tissues within each class (brain frontal cortex (ba9), artery tibial, esophagus muscularis, skin sun exposed (lower leg), colon transverse, and adipose subcutaneous) and repeated our analyses. For our eQTL analysis, as expected, we find very similar results using this alternate set of tissues (Figure 2—figure supplement 8C). Further, we found that with the cutoff of τ = 0.76 for a tissue-specific gene, only 32 of the 1463 tested TFs would be classified differently (Supplementary file 2b). Using our original choice of tissues, we find that 605 TFs are tissue specific (Lambert et al., 2018 report 542 tissue-specific TFs), that 75% of homeodomain containing TFs are tissue specific (Lambert et al., 2018 report 82%), and that 18.6% of KRAB ZF TFs are tissue specific (Lambert et al., 2018 report 12%). Thus, our results using our choice of tissues are robust to the specific choice of tissue sub-region within a tissue region and are in good agreement with previously reported tissue-specific expression annotations.

Model selection for eQTL analyses

To understand if genetic variation near genes localizing to a given organelle was abnormally unlikely to produce downstream biological consequences, we turned to cis-eQTLs. Because most genes have a measured cis-eQTL in at least one tissue (Figure 2—figure supplement 8A), we constructed a model to test if genes localizing to a given organelle had significant cis-eQTLs in more or fewer tissues than other protein-coding genes. We included several covariates to minimize the risk of confounding from first principles (Materials and methods). We corrected for gene length and $\log_{10}$(gene length), as we expected that a higher number of SNPs in longer genes would increase the probability of eQTL detection; $N_x^{\mathrm{express}}$, as we suspected that genes would have detectable eQTLs at most in tissues where they were expressed; and $\tau_x$, as we expected that broadly expressed genes would be more likely to have cis-eQTLs detected in more tissues. Upon model fitting, we observed that all coefficients were significantly different from 0.

Manual variant QC for mtDNA-GWAS

We used two strategies to manually review the variants that made it through automated variant QC filters (Materials and methods). First, we visually reviewed fluorescence cluster plots for each mtDNA variant to ensure that our variant calls were accurate (Materials and methods). We visually categorized each variant into five categories: clear pass, batch concern, off target variant (OTV) concern, resolution concern, and misclustering (Supplementary file 2a), removing 19 variants from further analysis due to cluster plot abnormalities. Second, we computed the mtDNA LD matrix finding no evidence of distance-dependent LD on the mtDNA (Figure 2—figure supplement 9A) as observed previously (Yamamoto et al., 2020).

Minor allele frequency filters for mtDNA-GWAS

We used two variant frequency filters to ensure that our regression test statistics were well-behaved (Materials and methods). For continuous traits, we included only variants that had at least 20 individuals with an alternate genotype. For binary traits, we implemented a per-trait and per-variant filter by computing the proportion of individuals with an alternate genotype required such that, under null expectation, there would be at least 20 cases with an alternate genotype. This filter has been shown to eliminate false positive associations by eliminating low MAC variants for rare traits, in which highly imprecise allele frequency estimates can exert high leverage on test statistics (Howrigan et al., 2017). This was operationalized as an MAF cutoff as there are by definition no heterozygotes on the mitochondrial DNA, such that for each trait we included only variants that satisfied $\mathrm{MAF} \geq 20 / \min(\mathrm{CaseSampleSize}, \mathrm{ControlSampleSize})$. The sample size estimates were dependent on the variant being assessed as certain variants had distinct missingness patterns due to measurement on a particular genotype array used for only a subset of the cohort (Methods). In total, we tested up to 213 variants per phenotype, assessing a total of 4337 variant-phenotype pairs.
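The two frequency filters can be sketched in a few lines. This is a minimal illustration under stated assumptions: the comparison is taken to be inclusive (at least 20), the per-variant sample sizes are passed in by the caller to reflect array-specific missingness, and the function names are hypothetical.

```python
MIN_ALT_COUNT = 20  # minimum expected carriers, as stated in the text


def min_maf_for_binary_trait(n_cases, n_controls, min_alt_count=MIN_ALT_COUNT):
    """MAF cutoff so that the smaller group is expected to hold >= 20 carriers."""
    return min_alt_count / min(n_cases, n_controls)


def passes_binary_filter(maf, n_cases, n_controls, min_alt_count=MIN_ALT_COUNT):
    """Per-trait, per-variant filter for binary traits (inclusive, by assumption)."""
    return maf >= min_maf_for_binary_trait(n_cases, n_controls, min_alt_count)


def passes_continuous_filter(n_alt_individuals, min_alt_count=MIN_ALT_COUNT):
    """Continuous traits: at least 20 individuals carry the alternate genotype."""
    return n_alt_individuals >= min_alt_count
```

For a hypothetical trait with 2,000 cases and 300,000 controls among the individuals genotyped for a variant, the cutoff works out to 20 / 2,000 = 0.01, so rarer traits impose a stricter MAF requirement.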

Enrichment analysis of Parkinson’s disease

There is considerable interest in characterizing the involvement of mitochondrial dysfunction in PD (Nguyen et al., 2019; Grünewald et al., 2019; Abou-Sleiman et al., 2006; Ge et al., 2020). We find no evidence of heritability enrichment among MitoCarta genes in a recent PD GWAS (Nalls et al., 2019; Figure 2D). Due to power limitations, we were unable to assess mtDNA associations with PD (Appendix 1), though to our knowledge, broadly reproducible associations between inherited mtDNA variants and PD have yet to be reported (Bose and Beal, 2016; Müller-Nedebock et al., 2019).

Interpretation of heritability explained by organellar gene-sets

For the sets of genes corresponding to organellar proteomes, we highlight the substantial amount of SNP-heritability explained by variants in or near genes contributing to the nuclear proteome. It is notable that all organelles show $\mathrm{prop}\,h^2_{\mathrm{SNP}} / \mathrm{prop\,SNP} > 1$ (Figure 3—figure supplement 1). We believe that this is because of other properties of the SNPs near organelle-localizing genes, namely that all selected SNPs are near protein coding genes. SNPs in protein coding regions are known to be enriched for heritability (Finucane et al., 2015), and indeed when we explicitly model these potentially confounding functional SNP annotations (DNase hypersensitivity sites, H3K4Me sites, coding regions; Materials and methods) only the enrichment among variants near nucleus-localizing genes persists.
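The ratio discussed above is the proportion of SNP-heritability attributed to an annotation divided by the proportion of SNPs it contains. The arithmetic can be sketched as follows; the function name and the example numbers are hypothetical and are not taken from the study’s results.

```python
def heritability_enrichment(h2_annotation, h2_total, n_snps_annotation, n_snps_total):
    """Return (proportion of h2) / (proportion of SNPs) for one annotation."""
    prop_h2 = h2_annotation / h2_total
    prop_snps = n_snps_annotation / n_snps_total
    # A value above 1 means the annotation explains more heritability
    # than expected from its share of SNPs alone.
    return prop_h2 / prop_snps


# Hypothetical example: an annotation holding 10% of SNPs and 30% of h2.
example = heritability_enrichment(0.06, 0.20, 100_000, 1_000_000)
```

A ratio above 1 on its own does not separate the gene-set from the general enrichment of SNPs near protein-coding genes, which is why the conditional analysis described above is the more informative test.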

Overlap analysis of subsets of the nuclear proteome

We performed pairwise overlap analysis for our five final sub-nuclear compartments (Nucleoplasm, Chromosome and TF, Nucleolus, Nuclear Envelope, Other Nuclear Proteins), finding that virtually all pairs showed an overlap of less than 5% (with an exception for the nucleolus, ~13% of which was also represented in chromosome and TF). S-LDSC and MAGMA were used to test for enrichment across the UKB age-related traits for these gene-sets as performed previously for the organelle analysis.

GWAS enrichments of functional subdivisions of the class of TFs

We further subdivided the TFs based on breadth of expression in human tissues, DNA-binding domain (DBD), and gene age (Materials and methods). We found a similar pattern of enrichment for tissue-specific TFs and broadly expressed TFs (Figure 3—figure supplement 5C, Figure 3—figure supplement 6A, Figure 3—figure supplement 7A). However, upon stratification by the three largest categories of TF DBD (Lambert et al., 2018), we found that non-zinc finger TFs showed enrichment for many age-related traits (Figure 3—figure supplement 5D, Figure 3—figure supplement 6B, Figure 3—figure supplement 7B, Figure 3—figure supplement 8B), while the KRAB domain-containing zinc fingers (KRAB ZFs), were largely devoid of enrichment even compared to non-KRAB ZFs (Figure 3—figure supplement 5E, Figure 3—figure supplement 6C, Figure 3—figure supplement 7C, Figure 3—figure supplement 8C). While our power analysis suggests sufficient power only for high effect sizes at ~350 genes, we note that (1) the KRAB ZFs and non-KRAB ZFs have similar gene-set sizes and (2) S-LDSC coefficient point estimates are systematically much higher for non-KRAB ZFs than for KRAB ZFs (Figure 3—figure supplement 7C). Notably, while we initially observed enrichment only for ancient and intermediate-age TFs but not recently evolved TFs (Figure 3—figure supplement 5G, Figure 3—figure supplement 6D, Figure 3—figure supplement 7D, Figure 3—figure supplement 8D), we find that old and recent non-KRAB TFs showed similar enrichment profiles (Figure 3—figure supplement 5I, Figure 3—figure supplement 6E, Figure 3—figure supplement 7E, Figure 3—figure supplement 8E), suggesting that the lack of signal among recent TFs was likely attributable to the KRAB domain containing ZFs which are predominantly recently-evolved (Figure 3—figure supplement 5H).

Age-related disease GWAS enrichment with constraint as a covariate

We wanted to assess if our observed enrichment results persist after explicitly accounting for any variance explained by the degree of constraint. We used MAGMA and included LOEUF as a covariate in the gene-set enrichment analysis model (Materials and methods), finding that the LOEUF correction did not substantially impact MitoCarta gene enrichment (Figure 5—figure supplement 2A, Figure 5—figure supplement 3A) but did reduce the degree of enrichment seen for nucleus-localizing genes (Figure 5—figure supplement 2B, Figure 5—figure supplement 3B). We continue observing enrichment for the TFs across several age-related diseases (Figure 5—figure supplement 2E, Figure 5—figure supplement 2F) with a similar pattern of enrichment in non-ZF TFs and non-KRAB ZFs (Figure 5—figure supplement 2G) to that seen with the original model (Figure 3—figure supplement 5D, Figure 3—figure supplement 5E). Thus, while constraint explains a substantial component of the enrichment observed for the TFs among age-related diseases, an enrichment signal persists after accounting for LOEUF.

Data availability

Heritability point estimates and standard errors for age-related traits are listed in Supplementary File 1. Genetic and phenotypic correlation point estimates and standard errors/p-values plotted in Figure 1B are available in Figure 1-Source data 1. Summary statistics from mtDNA-GWAS (plotted in Figure 2 and Figure 2—figure supplement 9) are available in Source data 2. All gene-based enrichment analysis p-values and point estimates are available in Source data 1 and Source data 3. Period prevalence data for diseases in the UK can be obtained from Kuan et al. 2019. Gene-sets can be found using COMPARTMENTS (https://compartments.jensenlab.org), MitoCarta 2.0 (https://www.broadinstitute.org/files/shared/metabolism/mitocarta/human.mitocarta2.0.html), Lambert et al. 2018 (DOI: 10.1016/j.cell.2018.01.029), Frazier et al. 2019 (DOI: 10.1074/jbc.R117.809194), Finucane et al. 2018 (https://alkesgroup.broadinstitute.org/LDSCORE/), Kapopoulou et al. 2015 (DOI: 10.1111/evo.12819), and the MacArthur laboratory (https://github.com/macarthur-lab/gene_lists, copy archived at https://archive.softwareheritage.org/swh:1:rev:fcc849637bd71e683bffc618e1a48081a8df08f8). Gene age estimates were obtained from Litman, Stein 2019 (DOI: 10.1053/j.seminoncol.2018.11.002). GWAS catalog annotations can be obtained from: https://www.ebi.ac.uk/gwas. Heritability estimates across UKB can be obtained at: https://nealelab.github.io/UKBB_ldsc/. UKB summary statistics can be obtained from Neale lab GWAS round 2: https://github.com/Nealelab/UK_Biobank_GWAS (copy archived at https://archive.softwareheritage.org/swh:1:rev:dc7b7b590413ec96a45a64f7213f50a3a0606198). Annotations for the Baseline v1.1 and BaselineLD v2.2 models as well as other relevant reference data, including the 1000G EUR reference panel, can be obtained from https://alkesgroup.broadinstitute.org/LDSCORE/. eQTL and expression data in human tissues can be obtained from GTEx: https://www.gtexportal.org. 
Constraint estimates can be found via gnomAD: https://gnomad.broadinstitute.org. See citations for publicly available GWAS meta-analysis summary statistics (Teslovich et al., 2010; Ehret et al., 2011; Timmers et al., 2019; Zenin et al., 2019; Manning et al., 2012; Morris et al., 2012; Schunkert et al., 2011; Estrada et al., 2012; Christophersen et al., 2017; Pattaro et al., 2016; Nalls et al., 2019; Lambert et al., 2013).

The following previously published data sets were used
    1. Teslovich TM
    (2010) University of Michigan
    ID lipids2010. Biological, clinical and population relevance of 95 loci for blood lipids.
    1. DIAGRAM Consortium
    (2012) DIAGRAM T2D Stage 1 GWAS
    ID 1 GWAS. Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes, stage 1 GWAS.
    1. CARDIoGRAM plus C4D Consortium
    (2011) CARDIoGRAM plus C4D meta-analysis
    ID meta-analysis. Large-scale association analysis identifies 13 new susceptibility loci for coronary artery disease.
    1. GEnetic Factors for OSteoporosis Consortium
    (2012) GEFOS Pooled Femoral Neck Summary Statistics
    ID GEFOS2_FNBMD_POOLED_GC. Genome-wide meta-analysis identifies 56 bone mineral density loci and reveals 14 loci associated with risk of fracture.
    1. AFGen
    (2017) Human Genetics Amplifier
    ID 28416818.2017. Large-scale analyses of common and rare variants identify 12 new loci associated with atrial fibrillation.
    1. AFGen
    (2016) CKDGen Data at Medical Center - University of Freiburg
    ID Pattaro2016data. Genetic associations at 53 loci highlight cell types and biological pathways relevant for kidney function; eGFRcrea and CKD.
    1. Brainstorm
    2. IPDGC
    (2019) IPDGC GWAS META5 summary stats (excluding 23andMe)
    ID 1FZ9UL99LAqyWnyNBxxlx6qOUlfAnublN. Identification of novel risk loci, causal insights, and heritable risk for Parkinson's disease: a meta-analysis of genome-wide association studies.
    1. International Genomics of Alzheimer's Project (IGAP)
    (2013) IGAP Stage 1
    ID ng00036. Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer's disease.
    1. Timmers PRHJ
    2. Mounier N
    3. Lall K
    4. Fischer K
    5. Ning Z
    6. Feng X
    7. Bretherick AD
    8. Clark DW
    9. eQTLGen Consortium
    10. Shen X
    11. Esko T
    12. Kutalik Z
    13. Wilson JF
    14. Joshi PK
    (2019) Edinburgh DataShare
    Genomics of 1 million parent lifespans implicates novel pathways and common diseases and distinguishes survival chances.
    1. Zenin A
    2. Tsepilov Y
    3. Sharapov S
    4. Getmantsev E
    5. Menshikov LI
    6. Fedichev PO
    7. Aulchenko Y
    (2019) Zenodo
    Identification of 12 genetic loci associated with human healthspan.
    1. GTEx Consortium
    (2019) GTEx portal
    ID GTEx_Analysis_2017-06-05_v8_RNASeQCv1.1.9_gene_median_tpm. GTEx v8 median expression TPM per tissue.


    1. Chance B
    2. Williams GR
    Respiratory enzymes in oxidative phosphorylation. III. the steady state
    The Journal of Biological Chemistry 217:409–428.
    1. Christophersen IE
    2. Rienstra M
    3. Roselli C
    4. Yin X
    5. Geelhoed B
    6. Barnard J
    7. Lin H
    8. Arking DE
    9. Smith AV
    10. Albert CM
    11. Chaffin M
    12. Tucker NR
    13. Li M
    14. Klarin D
    15. Bihlmeyer NA
    16. Low SK
    17. Weeke PE
    18. Müller-Nurasyid M
    19. Smith JG
    20. Brody JA
    21. Niemeijer MN
    22. Dörr M
    23. Trompet S
    24. Huffman J
    25. Gustafsson S
    26. Schurmann C
    27. Kleber ME
    28. Lyytikäinen LP
    29. Seppälä I
    30. Malik R
    31. Horimoto A
    32. Perez M
    33. Sinisalo J
    34. Aeschbacher S
    35. Thériault S
    36. Yao J
    37. Radmanesh F
    38. Weiss S
    39. Teumer A
    40. Choi SH
    41. Weng LC
    42. Clauss S
    43. Deo R
    44. Rader DJ
    45. Shah SH
    46. Sun A
    47. Hopewell JC
    48. Debette S
    49. Chauhan G
    50. Yang Q
    51. Worrall BB
    52. Paré G
    53. Kamatani Y
    54. Hagemeijer YP
    55. Verweij N
    56. Siland JE
    57. Kubo M
    58. Smith JD
    59. Van Wagoner DR
    60. Bis JC
    61. Perz S
    62. Psaty BM
    63. Ridker PM
    64. Magnani JW
    65. Harris TB
    66. Launer LJ
    67. Shoemaker MB
    68. Padmanabhan S
    69. Haessler J
    70. Bartz TM
    71. Waldenberger M
    72. Lichtner P
    73. Arendt M
    74. Krieger JE
    75. Kähönen M
    76. Risch L
    77. Mansur AJ
    78. Peters A
    79. Smith BH
    80. Lind L
    81. Scott SA
    82. Lu Y
    83. Bottinger EB
    84. Hernesniemi J
    85. Lindgren CM
    86. Wong JA
    87. Huang J
    88. Eskola M
    89. Morris AP
    90. Ford I
    91. Reiner AP
    92. Delgado G
    93. Chen LY
    94. Chen YI
    95. Sandhu RK
    96. Li M
    97. Boerwinkle E
    98. Eisele L
    99. Lannfelt L
    100. Rost N
    101. Anderson CD
    102. Taylor KD
    103. Campbell A
    104. Magnusson PK
    105. Porteous D
    106. Hocking LJ
    107. Vlachopoulou E
    108. Pedersen NL
    109. Nikus K
    110. Orho-Melander M
    111. Hamsten A
    112. Heeringa J
    113. Denny JC
    114. Kriebel J
    115. Darbar D
    116. Newton-Cheh C
    117. Shaffer C
    118. Macfarlane PW
    119. Heilmann-Heimbach S
    120. Almgren P
    121. Huang PL
    122. Sotoodehnia N
    123. Soliman EZ
    124. Uitterlinden AG
    125. Hofman A
    126. Franco OH
    127. Völker U
    128. Jöckel KH
    129. Sinner MF
    130. Lin HJ
    131. Guo X
    132. Dichgans M
    133. Ingelsson E
    134. Kooperberg C
    135. Melander O
    136. Loos RJF
    137. Laurikka J
    138. Conen D
    139. Rosand J
    140. van der Harst P
    141. Lokki ML
    142. Kathiresan S
    143. Pereira A
    144. Jukema JW
    145. Hayward C
    146. Rotter JI
    147. März W
    148. Lehtimäki T
    149. Stricker BH
    150. Chung MK
    151. Felix SB
    152. Gudnason V
    153. Alonso A
    154. Roden DM
    155. Kääb S
    156. Chasman DI
    157. Heckbert SR
    158. Benjamin EJ
    159. Tanaka T
    160. Lunetta KL
    161. Lubitz SA
    162. Ellinor PT
    163. METASTROKE Consortium of the ISGC
    164. Neurology Working Group of the CHARGE Consortium
    165. AFGen Consortium
    (2017) Large-scale analyses of common and rare variants identify 12 new loci associated with atrial fibrillation
    Nature Genetics 49:946–952.
    1. Ehret GB
    2. Munroe PB
    3. Rice KM
    4. Bochud M
    5. Johnson AD
    6. Chasman DI
    7. Smith AV
    8. Tobin MD
    9. Verwoert GC
    10. Hwang SJ
    11. Pihur V
    12. Vollenweider P
    13. O'Reilly PF
    14. Amin N
    15. Bragg-Gresham JL
    16. Teumer A
    17. Glazer NL
    18. Launer L
    19. Zhao JH
    20. Aulchenko Y
    21. Heath S
    22. Sõber S
    23. Parsa A
    24. Luan J
    25. Arora P
    26. Dehghan A
    27. Zhang F
    28. Lucas G
    29. Hicks AA
    30. Jackson AU
    31. Peden JF
    32. Tanaka T
    33. Wild SH
    34. Rudan I
    35. Igl W
    36. Milaneschi Y
    37. Parker AN
    38. Fava C
    39. Chambers JC
    40. Fox ER
    41. Kumari M
    42. Go MJ
    43. van der Harst P
    44. Kao WH
    45. Sjögren M
    46. Vinay DG
    47. Alexander M
    48. Tabara Y
    49. Shaw-Hawkins S
    50. Whincup PH
    51. Liu Y
    52. Shi G
    53. Kuusisto J
    54. Tayo B
    55. Seielstad M
    56. Sim X
    57. Nguyen KD
    58. Lehtimäki T
    59. Matullo G
    60. Wu Y
    61. Gaunt TR
    62. Onland-Moret NC
    63. Cooper MN
    64. Platou CG
    65. Org E
    66. Hardy R
    67. Dahgam S
    68. Palmen J
    69. Vitart V
    70. Braund PS
    71. Kuznetsova T
    72. Uiterwaal CS
    73. Adeyemo A
    74. Palmas W
    75. Campbell H
    76. Ludwig B
    77. Tomaszewski M
    78. Tzoulaki I
    79. Palmer ND
    80. Aspelund T
    81. Garcia M
    82. Chang YP
    83. O'Connell JR
    84. Steinle NI
    85. Grobbee DE
    86. Arking DE
    87. Kardia SL
    88. Morrison AC
    89. Hernandez D
    90. Najjar S
    91. McArdle WL
    92. Hadley D
    93. Brown MJ
    94. Connell JM
    95. Hingorani AD
    96. Day IN
    97. Lawlor DA
    98. Beilby JP
    99. Lawrence RW
    100. Clarke R
    101. Hopewell JC
    102. Ongen H
    103. Dreisbach AW
    104. Li Y
    105. Young JH
    106. Bis JC
    107. Kähönen M
    108. Viikari J
    109. Adair LS
    110. Lee NR
    111. Chen MH
    112. Olden M
    113. Pattaro C
    114. Bolton JA
    115. Köttgen A
    116. Bergmann S
    117. Mooser V
    118. Chaturvedi N
    119. Frayling TM
    120. Islam M
    121. Jafar TH
    122. Erdmann J
    123. Kulkarni SR
    124. Bornstein SR
    125. Grässler J
    126. Groop L
    127. Voight BF
    128. Kettunen J
    129. Howard P
    130. Taylor A
    131. Guarrera S
    132. Ricceri F
    133. Emilsson V
    134. Plump A
    135. Barroso I
    136. Khaw KT
    137. Weder AB
    138. Hunt SC
    139. Sun YV
    140. Bergman RN
    141. Collins FS
    142. Bonnycastle LL
    143. Scott LJ
    144. Stringham HM
    145. Peltonen L
    146. Perola M
    147. Vartiainen E
    148. Brand SM
    149. Staessen JA
    150. Wang TJ
    151. Burton PR
    152. Soler Artigas M
    153. Dong Y
    154. Snieder H
    155. Wang X
    156. Zhu H
    157. Lohman KK
    158. Rudock ME
    159. Heckbert SR
    160. Smith NL
    161. Wiggins KL
    162. Doumatey A
    163. Shriner D
    164. Veldre G
    165. Viigimaa M
    166. Kinra S
    167. Prabhakaran D
    168. Tripathy V
    169. Langefeld CD
    170. Rosengren A
    171. Thelle DS
    172. Corsi AM
    173. Singleton A
    174. Forrester T
    175. Hilton G
    176. McKenzie CA
    177. Salako T
    178. Iwai N
    179. Kita Y
    180. Ogihara T
    181. Ohkubo T
    182. Okamura T
    183. Ueshima H
    184. Umemura S
    185. Eyheramendy S
    186. Meitinger T
    187. Wichmann HE
    188. Cho YS
    189. Kim HL
    190. Lee JY
    191. Scott J
    192. Sehmi JS
    193. Zhang W
    194. Hedblad B
    195. Nilsson P
    196. Smith GD
    197. Wong A
    198. Narisu N
    199. Stančáková A
    200. Raffel LJ
    201. Yao J
    202. Kathiresan S
    203. O'Donnell CJ
    204. Schwartz SM
    205. Ikram MA
    206. Longstreth WT
    207. Mosley TH
    208. Seshadri S
    209. Shrine NR
    210. Wain LV
    211. Morken MA
    212. Swift AJ
    213. Laitinen J
    214. Prokopenko I
    215. Zitting P
    216. Cooper JA
    217. Humphries SE
    218. Danesh J
    219. Rasheed A
    220. Goel A
    221. Hamsten A
    222. Watkins H
    223. Bakker SJ
    224. van Gilst WH
    225. Janipalli CS
    226. Mani KR
    227. Yajnik CS
    228. Hofman A
    229. Mattace-Raso FU
    230. Oostra BA
    231. Demirkan A
    232. Isaacs A
    233. Rivadeneira F
    234. Lakatta EG
    235. Orru M
    236. Scuteri A
    237. Ala-Korpela M
    238. Kangas AJ
    239. Lyytikäinen LP
    240. Soininen P
    241. Tukiainen T
    242. Würtz P
    243. Ong RT
    244. Dörr M
    245. Kroemer HK
    246. Völker U
    247. Völzke H
    248. Galan P
    249. Hercberg S
    250. Lathrop M
    251. Zelenika D
    252. Deloukas P
    253. Mangino M
    254. Spector TD
    255. Zhai G
    256. Meschia JF
    257. Nalls MA
    258. Sharma P
    259. Terzic J
    260. Kumar MV
    261. Denniff M
    262. Zukowska-Szczechowska E
    263. Wagenknecht LE
    264. Fowkes FG
    265. Charchar FJ
    266. Schwarz PE
    267. Hayward C
    268. Guo X
    269. Rotimi C
    270. Bots ML
    271. Brand E
    272. Samani NJ
    273. Polasek O
    274. Talmud PJ
    275. Nyberg F
    276. Kuh D
    277. Laan M
    278. Hveem K
    279. Palmer LJ
    280. van der Schouw YT
    281. Casas JP
    282. Mohlke KL
    283. Vineis P
    284. Raitakari O
    285. Ganesh SK
    286. Wong TY
    287. Tai ES
    288. Cooper RS
    289. Laakso M
    290. Rao DC
    291. Harris TB
    292. Morris RW
    293. Dominiczak AF
    294. Kivimaki M
    295. Marmot MG
    296. Miki T
    297. Saleheen D
    298. Chandak GR
    299. Coresh J
    300. Navis G
    301. Salomaa V
    302. Han BG
    303. Zhu X
    304. Kooner JS
    305. Melander O
    306. Ridker PM
    307. Bandinelli S
    308. Gyllensten UB
    309. Wright AF
    310. Wilson JF
    311. Ferrucci L
    312. Farrall M
    313. Tuomilehto J
    314. Pramstaller PP
    315. Elosua R
    316. Soranzo N
    317. Sijbrands EJ
    318. Altshuler D
    319. Loos RJ
    320. Shuldiner AR
    321. Gieger C
    322. Meneton P
    323. Uitterlinden AG
    324. Wareham NJ
    325. Gudnason V
    326. Rotter JI
    327. Rettig R
    328. Uda M
    329. Strachan DP
    330. Witteman JC
    331. Hartikainen AL
    332. Beckmann JS
    333. Boerwinkle E
    334. Vasan RS
    335. Boehnke M
    336. Larson MG
    337. Järvelin MR
    338. Psaty BM
    339. Abecasis GR
    340. Chakravarti A
    341. Elliott P
    342. van Duijn CM
    343. Newton-Cheh C
    344. Levy D
    345. Caulfield MJ
    346. Johnson T
    347. International Consortium for Blood Pressure Genome-Wide Association Studies
    348. CARDIoGRAM consortium
    349. CKDGen Consortium
    350. KidneyGen Consortium
    351. EchoGen consortium
    352. CHARGE-HF consortium
    (2011) Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk
    Nature 478:103–109.
    1. Estrada K
    2. Styrkarsdottir U
    3. Evangelou E
    4. Hsu YH
    5. Duncan EL
    6. Ntzani EE
    7. Oei L
    8. Albagha OM
    9. Amin N
    10. Kemp JP
    11. Koller DL
    12. Li G
    13. Liu CT
    14. Minster RL
    15. Moayyeri A
    16. Vandenput L
    17. Willner D
    18. Xiao SM
    19. Yerges-Armstrong LM
    20. Zheng HF
    21. Alonso N
    22. Eriksson J
    23. Kammerer CM
    24. Kaptoge SK
    25. Leo PJ
    26. Thorleifsson G
    27. Wilson SG
    28. Wilson JF
    29. Aalto V
    30. Alen M
    31. Aragaki AK
    32. Aspelund T
    33. Center JR
    34. Dailiana Z
    35. Duggan DJ
    36. Garcia M
    37. Garcia-Giralt N
    38. Giroux S
    39. Hallmans G
    40. Hocking LJ
    41. Husted LB
    42. Jameson KA
    43. Khusainova R
    44. Kim GS
    45. Kooperberg C
    46. Koromila T
    47. Kruk M
    48. Laaksonen M
    49. Lacroix AZ
    50. Lee SH
    51. Leung PC
    52. Lewis JR
    53. Masi L
    54. Mencej-Bedrac S
    55. Nguyen TV
    56. Nogues X
    57. Patel MS
    58. Prezelj J
    59. Rose LM
    60. Scollen S
    61. Siggeirsdottir K
    62. Smith AV
    63. Svensson O
    64. Trompet S
    65. Trummer O
    66. van Schoor NM
    67. Woo J
    68. Zhu K
    69. Balcells S
    70. Brandi ML
    71. Buckley BM
    72. Cheng S
    73. Christiansen C
    74. Cooper C
    75. Dedoussis G
    76. Ford I
    77. Frost M
    78. Goltzman D
    79. González-Macías J
    80. Kähönen M
    81. Karlsson M
    82. Khusnutdinova E
    83. Koh JM
    84. Kollia P
    85. Langdahl BL
    86. Leslie WD
    87. Lips P
    88. Ljunggren Ö
    89. Lorenc RS
    90. Marc J
    91. Mellström D
    92. Obermayer-Pietsch B
    93. Olmos JM
    94. Pettersson-Kymmer U
    95. Reid DM
    96. Riancho JA
    97. Ridker PM
    98. Rousseau F
    99. Slagboom PE
    100. Tang NL
    101. Urreizti R
    102. Van Hul W
    103. Viikari J
    104. Zarrabeitia MT
    105. Aulchenko YS
    106. Castano-Betancourt M
    107. Grundberg E
    108. Herrera L
    109. Ingvarsson T
    110. Johannsdottir H
    111. Kwan T
    112. Li R
    113. Luben R
    114. Medina-Gómez C
    115. Palsson ST
    116. Reppe S
    117. Rotter JI
    118. Sigurdsson G
    119. van Meurs JB
    120. Verlaan D
    121. Williams FM
    122. Wood AR
    123. Zhou Y
    124. Gautvik KM
    125. Pastinen T
    126. Raychaudhuri S
    127. Cauley JA
    128. Chasman DI
    129. Clark GR
    130. Cummings SR
    131. Danoy P
    132. Dennison EM
    133. Eastell R
    134. Eisman JA
    135. Gudnason V
    136. Hofman A
    137. Jackson RD
    138. Jones G
    139. Jukema JW
    140. Khaw KT
    141. Lehtimäki T
    142. Liu Y
    143. Lorentzon M
    144. McCloskey E
    145. Mitchell BD
    146. Nandakumar K
    147. Nicholson GC
    148. Oostra BA
    149. Peacock M
    150. Pols HA
    151. Prince RL
    152. Raitakari O
    153. Reid IR
    154. Robbins J
    155. Sambrook PN
    156. Sham PC
    157. Shuldiner AR
    158. Tylavsky FA
    159. van Duijn CM
    160. Wareham NJ
    161. Cupples LA
    162. Econs MJ
    163. Evans DM
    164. Harris TB
    165. Kung AW
    166. Psaty BM
    167. Reeve J
    168. Spector TD
    169. Streeten EA
    170. Zillikens MC
    171. Thorsteinsdottir U
    172. Ohlsson C
    173. Karasik D
    174. Richards JB
    175. Brown MA
    176. Stefansson K
    177. Uitterlinden AG
    178. Ralston SH
    179. Ioannidis JP
    180. Kiel DP
    181. Rivadeneira F
    (2012) Genome-wide meta-analysis identifies 56 bone mineral density loci and reveals 14 loci associated with risk of fracture
    Nature Genetics 44:491–501.
    1. Fuchsberger C
    2. Flannick J
    3. Teslovich TM
    4. Mahajan A
    5. Agarwala V
    6. Gaulton KJ
    7. Ma C
    8. Fontanillas P
    9. Moutsianas L
    10. McCarthy DJ
    11. Rivas MA
    12. Perry JRB
    13. Sim X
    14. Blackwell TW
    15. Robertson NR
    16. Rayner NW
    17. Cingolani P
    18. Locke AE
    19. Tajes JF
    20. Highland HM
    21. Dupuis J
    22. Chines PS
    23. Lindgren CM
    24. Hartl C
    25. Jackson AU
    26. Chen H
    27. Huyghe JR
    28. van de Bunt M
    29. Pearson RD
    30. Kumar A
    31. Müller-Nurasyid M
    32. Grarup N
    33. Stringham HM
    34. Gamazon ER
    35. Lee J
    36. Chen Y
    37. Scott RA
    38. Below JE
    39. Chen P
    40. Huang J
    41. Go MJ
    42. Stitzel ML
    43. Pasko D
    44. Parker SCJ
    45. Varga TV
    46. Green T
    47. Beer NL
    48. Day-Williams AG
    49. Ferreira T
    50. Fingerlin T
    51. Horikoshi M
    52. Hu C
    53. Huh I
    54. Ikram MK
    55. Kim BJ
    56. Kim Y
    57. Kim YJ
    58. Kwon MS
    59. Lee J
    60. Lee S
    61. Lin KH
    62. Maxwell TJ
    63. Nagai Y
    64. Wang X
    65. Welch RP
    66. Yoon J
    67. Zhang W
    68. Barzilai N
    69. Voight BF
    70. Han BG
    71. Jenkinson CP
    72. Kuulasmaa T
    73. Kuusisto J
    74. Manning A
    75. Ng MCY
    76. Palmer ND
    77. Balkau B
    78. Stančáková A
    79. Abboud HE
    80. Boeing H
    81. Giedraitis V
    82. Prabhakaran D
    83. Gottesman O
    84. Scott J
    85. Carey J
    86. Kwan P
    87. Grant G
    88. Smith JD
    89. Neale BM
    90. Purcell S
    91. Butterworth AS
    92. Howson JMM
    93. Lee HM
    94. Lu Y
    95. Kwak SH
    96. Zhao W
    97. Danesh J
    98. Lam VKL
    99. Park KS
    100. Saleheen D
    101. So WY
    102. Tam CHT
    103. Afzal U
    104. Aguilar D
    105. Arya R
    106. Aung T
    107. Chan E
    108. Navarro C
    109. Cheng CY
    110. Palli D
    111. Correa A
    112. Curran JE
    113. Rybin D
    114. Farook VS
    115. Fowler SP
    116. Freedman BI
    117. Griswold M
    118. Hale DE
    119. Hicks PJ
    120. Khor CC
    121. Kumar S
    122. Lehne B
    123. Thuillier D
    124. Lim WY
    125. Liu J
    126. van der Schouw YT
    127. Loh M
    128. Musani SK
    129. Puppala S
    130. Scott WR
    131. Yengo L
    132. Tan ST
    133. Taylor HA
    134. Thameem F
    135. Wilson G
    136. Wong TY
    137. Njølstad PR
    138. Levy JC
    139. Mangino M
    140. Bonnycastle LL
    141. Schwarzmayr T
    142. Fadista J
    143. Surdulescu GL
    144. Herder C
    145. Groves CJ
    146. Wieland T
    147. Bork-Jensen J
    148. Brandslund I
    149. Christensen C
    150. Koistinen HA
    151. Doney ASF
    152. Kinnunen L
    153. Esko T
    154. Farmer AJ
    155. Hakaste L
    156. Hodgkiss D
    157. Kravic J
    158. Lyssenko V
    159. Hollensted M
    160. Jørgensen ME
    161. Jørgensen T
    162. Ladenvall C
    163. Justesen JM
    164. Käräjämäki A
    165. Kriebel J
    166. Rathmann W
    167. Lannfelt L
    168. Lauritzen T
    169. Narisu N
    170. Linneberg A
    171. Melander O
    172. Milani L
    173. Neville M
    174. Orho-Melander M
    175. Qi L
    176. Qi Q
    177. Roden M
    178. Rolandsson O
    179. Swift A
    180. Rosengren AH
    181. Stirrups K
    182. Wood AR
    183. Mihailov E
    184. Blancher C
    185. Carneiro MO
    186. Maguire J
    187. Poplin R
    188. Shakir K
    189. Fennell T
    190. DePristo M
    191. de Angelis MH
    192. Deloukas P
    193. Gjesing AP
    194. Jun G
    195. Nilsson P
    196. Murphy J
    197. Onofrio R
    198. Thorand B
    199. Hansen T
    200. Meisinger C
    201. Hu FB
    202. Isomaa B
    203. Karpe F
    204. Liang L
    205. Peters A
    206. Huth C
    207. O'Rahilly SP
    208. Palmer CNA
    209. Pedersen O
    210. Rauramaa R
    211. Tuomilehto J
    212. Salomaa V
    213. Watanabe RM
    214. Syvänen AC
    215. Bergman RN
    216. Bharadwaj D
    217. Bottinger EP
    218. Cho YS
    219. Chandak GR
    220. Chan JCN
    221. Chia KS
    222. Daly MJ
    223. Ebrahim SB
    224. Langenberg C
    225. Elliott P
    226. Jablonski KA
    227. Lehman DM
    228. Jia W
    229. Ma RCW
    230. Pollin TI
    231. Sandhu M
    232. Tandon N
    233. Froguel P
    234. Barroso I
    235. Teo YY
    236. Zeggini E
    237. Loos RJF
    238. Small KS
    239. Ried JS
    240. DeFronzo RA
    241. Grallert H
    242. Glaser B
    243. Metspalu A
    244. Wareham NJ
    245. Walker M
    246. Banks E
    247. Gieger C
    248. Ingelsson E
    249. Im HK
    250. Illig T
    251. Franks PW
    252. Buck G
    253. Trakalo J
    254. Buck D
    255. Prokopenko I
    256. Mägi R
    257. Lind L
    258. Farjoun Y
    259. Owen KR
    260. Gloyn AL
    261. Strauch K
    262. Tuomi T
    263. Kooner JS
    264. Lee JY
    265. Park T
    266. Donnelly P
    267. Morris AD
    268. Hattersley AT
    269. Bowden DW
    270. Collins FS
    271. Atzmon G
    272. Chambers JC
    273. Spector TD
    274. Laakso M
    275. Strom TM
    276. Bell GI
    277. Blangero J
    278. Duggirala R
    279. Tai ES
    280. McVean G
    281. Hanis CL
    282. Wilson JG
    283. Seielstad M
    284. Frayling TM
    285. Meigs JB
    286. Cox NJ
    287. Sladek R
    288. Lander ES
    289. Gabriel S
    290. Burtt NP
    291. Mohlke KL
    292. Meitinger T
    293. Groop L
    294. Abecasis G
    295. Florez JC
    296. Scott LJ
    297. Morris AP
    298. Kang HM
    299. Boehnke M
    300. Altshuler D
    301. McCarthy MI
    (2016) The genetic architecture of type 2 diabetes
    Nature 536:41–47.
    1. Kraja AT
    2. Liu C
    3. Fetterman JL
    4. Graff M
    5. Have CT
    6. Gu C
    7. Yanek LR
    8. Feitosa MF
    9. Arking DE
    10. Chasman DI
    11. Young K
    12. Ligthart S
    13. Hill WD
    14. Weiss S
    15. Luan J
    16. Giulianini F
    17. Li-Gao R
    18. Hartwig FP
    19. Lin SJ
    20. Wang L
    21. Richardson TG
    22. Yao J
    23. Fernandez EP
    24. Ghanbari M
    25. Wojczynski MK
    26. Lee WJ
    27. Argos M
    28. Armasu SM
    29. Barve RA
    30. Ryan KA
    31. An P
    32. Baranski TJ
    33. Bielinski SJ
    34. Bowden DW
    35. Broeckel U
    36. Christensen K
    37. Chu AY
    38. Corley J
    39. Cox SR
    40. Uitterlinden AG
    41. Rivadeneira F
    42. Cropp CD
    43. Daw EW
    44. van Heemst D
    45. de Las Fuentes L
    46. Gao H
    47. Tzoulaki I
    48. Ahluwalia TS
    49. de Mutsert R
    50. Emery LS
    51. Erzurumluoglu AM
    52. Perry JA
    53. Fu M
    54. Forouhi NG
    55. Gu Z
    56. Hai Y
    57. Harris SE
    58. Hemani G
    59. Hunt SC
    60. Irvin MR
    61. Jonsson AE
    62. Justice AE
    63. Kerrison ND
    64. Larson NB
    65. Lin KH
    66. Love-Gregory LD
    67. Mathias RA
    68. Lee JH
    69. Nauck M
    70. Noordam R
    71. Ong KK
    72. Pankow J
    73. Patki A
    74. Pattie A
    75. Petersmann A
    76. Qi Q
    77. Ribel-Madsen R
    78. Rohde R
    79. Sandow K
    80. Schnurr TM
    81. Sofer T
    82. Starr JM
    83. Taylor AM
    84. Teumer A
    85. Timpson NJ
    86. de Haan HG
    87. Wang Y
    88. Weeke PE
    89. Williams C
    90. Wu H
    91. Yang W
    92. Zeng D
    93. Witte DR
    94. Weir BS
    95. Wareham NJ
    96. Vestergaard H
    97. Turner ST
    98. Torp-Pedersen C
    99. Stergiakouli E
    100. Sheu WH
    101. Rosendaal FR
    102. Ikram MA
    103. Franco OH
    104. Ridker PM
    105. Perls TT
    106. Pedersen O
    107. Nohr EA
    108. Newman AB
    109. Linneberg A
    110. Langenberg C
    111. Kilpeläinen TO
    112. Kardia SLR
    113. Jørgensen ME
    114. Jørgensen T
    115. Sørensen TIA
    116. Homuth G
    117. Hansen T
    118. Goodarzi MO
    119. Deary IJ
    120. Christensen C
    121. Chen YI
    122. Chakravarti A
    123. Brandslund I
    124. Bonnelykke K
    125. Taylor KD
    126. Wilson JG
    127. Rodriguez S
    128. Davies G
    129. Horta BL
    130. Thyagarajan B
    131. Rao DC
    132. Grarup N
    133. Davila-Roman VG
    134. Hudson G
    135. Guo X
    136. Arnett DK
    137. Hayward C
    138. Vaidya D
    139. Mook-Kanamori DO
    140. Tiwari HK
    141. Levy D
    142. Loos RJF
    143. Dehghan A
    144. Elliott P
    145. Malik AN
    146. Scott RA
    147. Becker DM
    148. de Andrade M
    149. Province MA
    150. Meigs JB
    151. Rotter JI
    152. North KE
    (2019) Associations of mitochondrial and nuclear mitochondrial variants and genes with seven metabolic traits
    The American Journal of Human Genetics 104:112–138.
    1. Lambert JC
    2. Ibrahim-Verbaas CA
    3. Harold D
    4. Naj AC
    5. Sims R
    6. Bellenguez C
    7. DeStafano AL
    8. Bis JC
    9. Beecham GW
    10. Grenier-Boley B
    11. Russo G
    12. Thorton-Wells TA
    13. Jones N
    14. Smith AV
    15. Chouraki V
    16. Thomas C
    17. Ikram MA
    18. Zelenika D
    19. Vardarajan BN
    20. Kamatani Y
    21. Lin CF
    22. Gerrish A
    23. Schmidt H
    24. Kunkle B
    25. Dunstan ML
    26. Ruiz A
    27. Bihoreau MT
    28. Choi SH
    29. Reitz C
    30. Pasquier F
    31. Cruchaga C
    32. Craig D
    33. Amin N
    34. Berr C
    35. Lopez OL
    36. De Jager PL
    37. Deramecourt V
    38. Johnston JA
    39. Evans D
    40. Lovestone S
    41. Letenneur L
    42. Morón FJ
    43. Rubinsztein DC
    44. Eiriksdottir G
    45. Sleegers K
    46. Goate AM
    47. Fiévet N
    48. Huentelman MW
    49. Gill M
    50. Brown K
    51. Kamboh MI
    52. Keller L
    53. Barberger-Gateau P
    54. McGuiness B
    55. Larson EB
    56. Green R
    57. Myers AJ
    58. Dufouil C
    59. Todd S
    60. Wallon D
    61. Love S
    62. Rogaeva E
    63. Gallacher J
    64. St George-Hyslop P
    65. Clarimon J
    66. Lleo A
    67. Bayer A
    68. Tsuang DW
    69. Yu L
    70. Tsolaki M
    71. Bossù P
    72. Spalletta G
    73. Proitsi P
    74. Collinge J
    75. Sorbi S
    76. Sanchez-Garcia F
    77. Fox NC
    78. Hardy J
    79. Deniz Naranjo MC
    80. Bosco P
    81. Clarke R
    82. Brayne C
    83. Galimberti D
    84. Mancuso M
    85. Matthews F
    86. Moebus S
    87. Mecocci P
    88. Del Zompo M
    89. Maier W
    90. Hampel H
    91. Pilotto A
    92. Bullido M
    93. Panza F
    94. Caffarra P
    95. Nacmias B
    96. Gilbert JR
    97. Mayhaus M
    98. Lannefelt L
    99. Hakonarson H
    100. Pichler S
    101. Carrasquillo MM
    102. Ingelsson M
    103. Beekly D
    104. Alvarez V
    105. Zou F
    106. Valladares O
    107. Younkin SG
    108. Coto E
    109. Hamilton-Nelson KL
    110. Gu W
    111. Razquin C
    112. Pastor P
    113. Mateo I
    114. Owen MJ
    115. Faber KM
    116. Jonsson PV
    117. Combarros O
    118. O'Donovan MC
    119. Cantwell LB
    120. Soininen H
    121. Blacker D
    122. Mead S
    123. Mosley TH
    124. Bennett DA
    125. Harris TB
    126. Fratiglioni L
    127. Holmes C
    128. de Bruijn RF
    129. Passmore P
    130. Montine TJ
    131. Bettens K
    132. Rotter JI
    133. Brice A
    134. Morgan K
    135. Foroud TM
    136. Kukull WA
    137. Hannequin D
    138. Powell JF
    139. Nalls MA
    140. Ritchie K
    141. Lunetta KL
    142. Kauwe JS
    143. Boerwinkle E
    144. Riemenschneider M
    145. Boada M
    146. Hiltuenen M
    147. Martin ER
    148. Schmidt R
    149. Rujescu D
    150. Wang LS
    151. Dartigues JF
    152. Mayeux R
    153. Tzourio C
    154. Hofman A
    155. Nöthen MM
    156. Graff C
    157. Psaty BM
    158. Jones L
    159. Haines JL
    160. Holmans PA
    161. Lathrop M
    162. Pericak-Vance MA
    163. Launer LJ
    164. Farrer LA
    165. van Duijn CM
    166. Van Broeckhoven C
    167. Moskvina V
    168. Seshadri S
    169. Williams J
    170. Schellenberg GD
    171. Amouyel P
    172. European Alzheimer's Disease Initiative (EADI)
    173. Genetic and Environmental Risk in Alzheimer's Disease
    174. Alzheimer's Disease Genetic Consortium
    175. Cohorts for Heart and Aging Research in Genomic Epidemiology
    (2013) Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer's disease
    Nature Genetics 45:1452–1458.
    1. Manning AK
    2. Hivert MF
    3. Scott RA
    4. Grimsby JL
    5. Bouatia-Naji N
    6. Chen H
    7. Rybin D
    8. Liu CT
    9. Bielak LF
    10. Prokopenko I
    11. Amin N
    12. Barnes D
    13. Cadby G
    14. Hottenga JJ
    15. Ingelsson E
    16. Jackson AU
    17. Johnson T
    18. Kanoni S
    19. Ladenvall C
    20. Lagou V
    21. Lahti J
    22. Lecoeur C
    23. Liu Y
    24. Martinez-Larrad MT
    25. Montasser ME
    26. Navarro P
    27. Perry JR
    28. Rasmussen-Torvik LJ
    29. Salo P
    30. Sattar N
    31. Shungin D
    32. Strawbridge RJ
    33. Tanaka T
    34. van Duijn CM
    35. An P
    36. de Andrade M
    37. Andrews JS
    38. Aspelund T
    39. Atalay M
    40. Aulchenko Y
    41. Balkau B
    42. Bandinelli S
    43. Beckmann JS
    44. Beilby JP
    45. Bellis C
    46. Bergman RN
    47. Blangero J
    48. Boban M
    49. Boehnke M
    50. Boerwinkle E
    51. Bonnycastle LL
    52. Boomsma DI
    53. Borecki IB
    54. Böttcher Y
    55. Bouchard C
    56. Brunner E
    57. Budimir D
    58. Campbell H
    59. Carlson O
    60. Chines PS
    61. Clarke R
    62. Collins FS
    63. Corbatón-Anchuelo A
    64. Couper D
    65. de Faire U
    66. Dedoussis GV
    67. Deloukas P
    68. Dimitriou M
    69. Egan JM
    70. Eiriksdottir G
    71. Erdos MR
    72. Eriksson JG
    73. Eury E
    74. Ferrucci L
    75. Ford I
    76. Forouhi NG
    77. Fox CS
    78. Franzosi MG
    79. Franks PW
    80. Frayling TM
    81. Froguel P
    82. Galan P
    83. de Geus E
    84. Gigante B
    85. Glazer NL
    86. Goel A
    87. Groop L
    88. Gudnason V
    89. Hallmans G
    90. Hamsten A
    91. Hansson O
    92. Harris TB
    93. Hayward C
    94. Heath S
    95. Hercberg S
    96. Hicks AA
    97. Hingorani A
    98. Hofman A
    99. Hui J
    100. Hung J
    101. Jarvelin MR
    102. Jhun MA
    103. Johnson PC
    104. Jukema JW
    105. Jula A
    106. Kao WH
    107. Kaprio J
    108. Kardia SL
    109. Keinanen-Kiukaanniemi S
    110. Kivimaki M
    111. Kolcic I
    112. Kovacs P
    113. Kumari M
    114. Kuusisto J
    115. Kyvik KO
    116. Laakso M
    117. Lakka T
    118. Lannfelt L
    119. Lathrop GM
    120. Launer LJ
    121. Leander K
    122. Li G
    123. Lind L
    124. Lindstrom J
    125. Lobbens S
    126. Loos RJ
    127. Luan J
    128. Lyssenko V
    129. Mägi R
    130. Magnusson PK
    131. Marmot M
    132. Meneton P
    133. Mohlke KL
    134. Mooser V
    135. Morken MA
    136. Miljkovic I
    137. Narisu N
    138. O'Connell J
    139. Ong KK
    140. Oostra BA
    141. Palmer LJ
    142. Palotie A
    143. Pankow JS
    144. Peden JF
    145. Pedersen NL
    146. Pehlic M
    147. Peltonen L
    148. Penninx B
    149. Pericic M
    150. Perola M
    151. Perusse L
    152. Peyser PA
    153. Polasek O
    154. Pramstaller PP
    155. Province MA
    156. Räikkönen K
    157. Rauramaa R
    158. Rehnberg E
    159. Rice K
    160. Rotter JI
    161. Rudan I
    162. Ruokonen A
    163. Saaristo T
    164. Sabater-Lleal M
    165. Salomaa V
    166. Savage DB
    167. Saxena R
    168. Schwarz P
    169. Seedorf U
    170. Sennblad B
    171. Serrano-Rios M
    172. Shuldiner AR
    173. Sijbrands EJ
    174. Siscovick DS
    175. Smit JH
    176. Small KS
    177. Smith NL
    178. Smith AV
    179. Stančáková A
    180. Stirrups K
    181. Stumvoll M
    182. Sun YV
    183. Swift AJ
    184. Tönjes A
    185. Tuomilehto J
    186. Trompet S
    187. Uitterlinden AG
    188. Uusitupa M
    189. Vikström M
    190. Vitart V
    191. Vohl MC
    192. Voight BF
    193. Vollenweider P
    194. Waeber G
    195. Waterworth DM
    196. Watkins H
    197. Wheeler E
    198. Widen E
    199. Wild SH
    200. Willems SM
    201. Willemsen G
    202. Wilson JF
    203. Witteman JC
    204. Wright AF
    205. Yaghootkar H
    206. Zelenika D
    207. Zemunik T
    208. Zgaga L
    209. Wareham NJ
    210. McCarthy MI
    211. Barroso I
    212. Watanabe RM
    213. Florez JC
    214. Dupuis J
    215. Meigs JB
    216. Langenberg C
    217. DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) Consortium
    218. Multiple Tissue Human Expression Resource (MUTHER) Consortium
    (2012) A genome-wide approach accounting for body mass index identifies genetic variants influencing fasting glycemic traits and insulin resistance
    Nature Genetics 44:659–669.
    1. Morris AP
    2. Voight BF
    3. Teslovich TM
    4. Ferreira T
    5. Segrè AV
    6. Steinthorsdottir V
    7. Strawbridge RJ
    8. Khan H
    9. Grallert H
    10. Mahajan A
    11. Prokopenko I
    12. Kang HM
    13. Dina C
    14. Esko T
    15. Fraser RM
    16. Kanoni S
    17. Kumar A
    18. Lagou V
    19. Langenberg C
    20. Luan J
    21. Lindgren CM
    22. Müller-Nurasyid M
    23. Pechlivanis S
    24. Rayner NW
    25. Scott LJ
    26. Wiltshire S
    27. Yengo L
    28. Kinnunen L
    29. Rossin EJ
    30. Raychaudhuri S
    31. Johnson AD
    32. Dimas AS
    33. Loos RJ
    34. Vedantam S
    35. Chen H
    36. Florez JC
    37. Fox C
    38. Liu CT
    39. Rybin D
    40. Couper DJ
    41. Kao WH
    42. Li M
    43. Cornelis MC
    44. Kraft P
    45. Sun Q
    46. van Dam RM
    47. Stringham HM
    48. Chines PS
    49. Fischer K
    50. Fontanillas P
    51. Holmen OL
    52. Hunt SE
    53. Jackson AU
    54. Kong A
    55. Lawrence R
    56. Meyer J
    57. Perry JR
    58. Platou CG
    59. Potter S
    60. Rehnberg E
    61. Robertson N
    62. Sivapalaratnam S
    63. Stančáková A
    64. Stirrups K
    65. Thorleifsson G
    66. Tikkanen E
    67. Wood AR
    68. Almgren P
    69. Atalay M
    70. Benediktsson R
    71. Bonnycastle LL
    72. Burtt N
    73. Carey J
    74. Charpentier G
    75. Crenshaw AT
    76. Doney AS
    77. Dorkhan M
    78. Edkins S
    79. Emilsson V
    80. Eury E
    81. Forsen T
    82. Gertow K
    83. Gigante B
    84. Grant GB
    85. Groves CJ
    86. Guiducci C
    87. Herder C
    88. Hreidarsson AB
    89. Hui J
    90. James A
    91. Jonsson A
    92. Rathmann W
    93. Klopp N
    94. Kravic J
    95. Krjutškov K
    96. Langford C
    97. Leander K
    98. Lindholm E
    99. Lobbens S
    100. Männistö S
    101. Mirza G
    102. Mühleisen TW
    103. Musk B
    104. Parkin M
    105. Rallidis L
    106. Saramies J
    107. Sennblad B
    108. Shah S
    109. Sigurðsson G
    110. Silveira A
    111. Steinbach G
    112. Thorand B
    113. Trakalo J
    114. Veglia F
    115. Wennauer R
    116. Winckler W
    117. Zabaneh D
    118. Campbell H
    119. van Duijn C
    120. Uitterlinden AG
    121. Hofman A
    122. Sijbrands E
    123. Abecasis GR
    124. Owen KR
    125. Zeggini E
    126. Trip MD
    127. Forouhi NG
    128. Syvänen AC
    129. Eriksson JG
    130. Peltonen L
    131. Nöthen MM
    132. Balkau B
    133. Palmer CN
    134. Lyssenko V
    135. Tuomi T
    136. Isomaa B
    137. Hunter DJ
    138. Qi L
    139. Shuldiner AR
    140. Roden M
    141. Barroso I
    142. Wilsgaard T
    143. Beilby J
    144. Hovingh K
    145. Price JF
    146. Wilson JF
    147. Rauramaa R
    148. Lakka TA
    149. Lind L
    150. Dedoussis G
    151. Njølstad I
    152. Pedersen NL
    153. Khaw KT
    154. Wareham NJ
    155. Keinanen-Kiukaanniemi SM
    156. Saaristo TE
    157. Korpi-Hyövälti E
    158. Saltevo J
    159. Laakso M
    160. Kuusisto J
    161. Metspalu A
    162. Collins FS
    163. Mohlke KL
    164. Bergman RN
    165. Tuomilehto J
    166. Boehm BO
    167. Gieger C
    168. Hveem K
    169. Cauchi S
    170. Froguel P
    171. Baldassarre D
    172. Tremoli E
    173. Humphries SE
    174. Saleheen D
    175. Danesh J
    176. Ingelsson E
    177. Ripatti S
    178. Salomaa V
    179. Erbel R
    180. Jöckel KH
    181. Moebus S
    182. Peters A
    183. Illig T
    184. de Faire U
    185. Hamsten A
    186. Morris AD
    187. Donnelly PJ
    188. Frayling TM
    189. Hattersley AT
    190. Boerwinkle E
    191. Melander O
    192. Kathiresan S
    193. Nilsson PM
    194. Deloukas P
    195. Thorsteinsdottir U
    196. Groop LC
    197. Stefansson K
    198. Hu F
    199. Pankow JS
    200. Dupuis J
    201. Meigs JB
    202. Altshuler D
    203. Boehnke M
    204. McCarthy MI
    205. Wellcome Trust Case Control Consortium
    206. Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC) Investigators
    207. Genetic Investigation of ANthropometric Traits (GIANT) Consortium
    208. Asian Genetic Epidemiology Network–Type 2 Diabetes (AGEN-T2D) Consortium
    209. South Asian Type 2 Diabetes (SAT2D) Consortium
    210. DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) Consortium
    (2012) Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes
    Nature Genetics 44:981–990.
    1. Pattaro C
    2. Teumer A
    3. Gorski M
    4. Chu AY
    5. Li M
    6. Mijatovic V
    7. Garnaas M
    8. Tin A
    9. Sorice R
    10. Li Y
    11. Taliun D
    12. Olden M
    13. Foster M
    14. Yang Q
    15. Chen MH
    16. Pers TH
    17. Johnson AD
    18. Ko YA
    19. Fuchsberger C
    20. Tayo B
    21. Nalls M
    22. Feitosa MF
    23. Isaacs A
    24. Dehghan A
    25. d'Adamo P
    26. Adeyemo A
    27. Dieffenbach AK
    28. Zonderman AB
    29. Nolte IM
    30. van der Most PJ
    31. Wright AF
    32. Shuldiner AR
    33. Morrison AC
    34. Hofman A
    35. Smith AV
    36. Dreisbach AW
    37. Franke A
    38. Uitterlinden AG
    39. Metspalu A
    40. Tonjes A
    41. Lupo A
    42. Robino A
    43. Johansson Å
    44. Demirkan A
    45. Kollerits B
    46. Freedman BI
    47. Ponte B
    48. Oostra BA
    49. Paulweber B
    50. Krämer BK
    51. Mitchell BD
    52. Buckley BM
    53. Peralta CA
    54. Hayward C
    55. Helmer C
    56. Rotimi CN
    57. Shaffer CM
    58. Müller C
    59. Sala C
    60. van Duijn CM
    61. Saint-Pierre A
    62. Ackermann D
    63. Shriner D
    64. Ruggiero D
    65. Toniolo D
    66. Lu Y
    67. Cusi D
    68. Czamara D
    69. Ellinghaus D
    70. Siscovick DS
    71. Ruderfer D
    72. Gieger C
    73. Grallert H
    74. Rochtchina E
    75. Atkinson EJ
    76. Holliday EG
    77. Boerwinkle E
    78. Salvi E
    79. Bottinger EP
    80. Murgia F
    81. Rivadeneira F
    82. Ernst F
    83. Kronenberg F
    84. Hu FB
    85. Navis GJ
    86. Curhan GC
    87. Ehret GB
    88. Homuth G
    89. Coassin S
    90. Thun GA
    91. Pistis G
    92. Gambaro G
    93. Malerba G
    94. Montgomery GW
    95. Eiriksdottir G
    96. Jacobs G
    97. Li G
    98. Wichmann HE
    99. Campbell H
    100. Schmidt H
    101. Wallaschofski H
    102. Völzke H
    103. Brenner H
    104. Kroemer HK
    105. Kramer H
    106. Lin H
    107. Leach IM
    108. Ford I
    109. Guessous I
    110. Rudan I
    111. Prokopenko I
    112. Borecki I
    113. Heid IM
    114. Kolcic I
    115. Persico I
    116. Jukema JW
    117. Wilson JF
    118. Felix JF
    119. Divers J
    120. Lambert JC
    121. Stafford JM
    122. Gaspoz JM
    123. Smith JA
    124. Faul JD
    125. Wang JJ
    126. Ding J
    127. Hirschhorn JN
    128. Attia J
    129. Whitfield JB
    130. Chalmers J
    131. Viikari J
    132. Coresh J
    133. Denny JC
    134. Karjalainen J
    135. Fernandes JK
    136. Endlich K
    137. Butterbach K
    138. Keene KL
    139. Lohman K
    140. Portas L
    141. Launer LJ
    142. Lyytikäinen LP
    143. Yengo L
    144. Franke L
    145. Ferrucci L
    146. Rose LM
    147. Kedenko L
    148. Rao M
    149. Struchalin M
    150. Kleber ME
    151. Cavalieri M
    152. Haun M
    153. Cornelis MC
    154. Ciullo M
    155. Pirastu M
    156. de Andrade M
    157. McEvoy MA
    158. Woodward M
    159. Adam M
    160. Cocca M
    161. Nauck M
    162. Imboden M
    163. Waldenberger M
    164. Pruijm M
    165. Metzger M
    166. Stumvoll M
    167. Evans MK
    168. Sale MM
    169. Kähönen M
    170. Boban M
    171. Bochud M
    172. Rheinberger M
    173. Verweij N
    174. Bouatia-Naji N
    175. Martin NG
    176. Hastie N
    177. Probst-Hensch N
    178. Soranzo N
    179. Devuyst O
    180. Raitakari O
    181. Gottesman O
    182. Franco OH
    183. Polasek O
    184. Gasparini P
    185. Munroe PB
    186. Ridker PM
    187. Mitchell P
    188. Muntner P
    189. Meisinger C
    190. Smit JH
    191. Kovacs P
    192. Wild PS
    193. Froguel P
    194. Rettig R
    195. Mägi R
    196. Biffar R
    197. Schmidt R
    198. Middelberg RP
    199. Carroll RJ
    200. Penninx BW
    201. Scott RJ
    202. Katz R
    203. Sedaghat S
    204. Wild SH
    205. Kardia SL
    206. Ulivi S
    207. Hwang SJ
    208. Enroth S
    209. Kloiber S
    210. Trompet S
    211. Stengel B
    212. Hancock SJ
    213. Turner ST
    214. Rosas SE
    215. Stracke S
    216. Harris TB
    217. Zeller T
    218. Zemunik T
    219. Lehtimäki T
    220. Illig T
    221. Aspelund T
    222. Nikopensius T
    223. Esko T
    224. Tanaka T
    225. Gyllensten U
    226. Völker U
    227. Emilsson V
    228. Vitart V
    229. Aalto V
    230. Gudnason V
    231. Chouraki V
    232. Chen WM
    233. Igl W
    234. März W
    235. Koenig W
    236. Lieb W
    237. Loos RJ
    238. Liu Y
    239. Snieder H
    240. Pramstaller PP
    241. Parsa A
    242. O'Connell JR
    243. Susztak K
    244. Hamet P
    245. Tremblay J
    246. de Boer IH
    247. Böger CA
    248. Goessling W
    249. Chasman DI
    250. Köttgen A
    251. Kao WH
    252. Fox CS
    253. ICBP Consortium
    254. AGEN Consortium
    256. CHARGe-Heart Failure Group
    257. ECHOGen Consortium
    (2016) Genetic associations at 53 loci highlight cell types and biological pathways relevant for kidney function
    Nature Communications 7:1–19.