Integrative analysis of metabolite GWAS illuminates the molecular basis of pleiotropy and genetic correlation
Abstract
Pleiotropy and genetic correlation are widespread features in GWAS, but they are often difficult to interpret at the molecular level. Here, we perform GWAS of 16 metabolites clustered at the intersection of amino acid catabolism, glycolysis, and ketone body metabolism in a subset of UK Biobank. We utilize the well-documented biochemistry jointly impacting these metabolites to analyze pleiotropic effects in the context of their pathways. Among the 213 lead GWAS hits, we find a strong enrichment for genes encoding pathway-relevant enzymes and transporters. We demonstrate that the effect directions of variants acting on biology between metabolite pairs often contrast with those of upstream or downstream variants as well as the polygenic background. Thus, we find that these outlier variants often reflect biology local to the traits. Finally, we explore the implications for interpreting disease GWAS, underscoring the potential of unifying biochemistry with dense metabolomics data to understand the molecular basis of pleiotropy in complex traits and diseases.
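The contrast in effect directions described above can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the variant names, effect sizes, and the `discordant_variants` helper are hypothetical, and the background direction is assumed to be supplied as the sign of a genetic correlation estimated elsewhere. The sketch flags variants whose joint effect on a metabolite pair runs opposite to the polygenic background.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Variant:
    name: str      # hypothetical variant label
    beta_a: float  # estimated effect on metabolite A
    beta_b: float  # estimated effect on metabolite B


def sign(x: float) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def discordant_variants(variants: List[Variant], background_sign: int) -> List[Variant]:
    """Return variants whose joint effect direction opposes the background.

    background_sign is assumed to be +1 or -1, e.g. the sign of a
    genetic correlation between the two metabolites estimated elsewhere.
    """
    if background_sign not in (-1, 1):
        raise ValueError("background_sign must be +1 or -1")
    return [
        v for v in variants
        if sign(v.beta_a * v.beta_b) == -background_sign
    ]


# Made-up effect sizes, for illustration only.
example = [
    Variant("v1", 0.10, 0.20),
    Variant("v2", 0.05, 0.04),
    Variant("v3", 0.10, -0.08),   # opposite directions on A and B
    Variant("v4", -0.02, -0.03),
]

# Assume a positive genetic correlation between the two metabolites.
flagged = discordant_variants(example, background_sign=1)
print([v.name for v in flagged])
```

In this toy setting, a variant with opposite-signed effects on two positively correlated metabolites is the kind of outlier the abstract describes; whether it reflects biology local to the pair would still need to be judged against the pathway.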
Data availability
The source data and analyzed data have been deposited in Dryad. Code is available on GitHub (https://github.com/courtrun/Pleiotropy-of-UKB-Metabolites). The raw individual-level data are available through application to UK Biobank.
- Pleiotropy of UK Biobank Metabolites [preliminary]. Dryad Digital Repository, doi:10.5061/dryad.79cnp5hxs.
- The UK Biobank resource with deep phenotyping and genomic data. UK Biobank, http://www.ukbiobank.ac.uk/register-apply/.
Article and author information
Funding
Stanford Knight-Hennessy Scholars Program (Graduate Student Fellowship)
- Courtney J Smith
National Science Foundation (Graduate Student Fellowship)
- Courtney J Smith
National Institutes of Health (5R01HG011432 and 5R01AG066490)
- Jonathan K Pritchard
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Ethics
Human subjects: All participants provided written informed consent, and ethical approval was obtained from the North West Multi-Center Research Ethics Committee (11/NW/0382). The current analysis was approved under UK Biobank Projects 24983 and 30418.
Copyright
© 2022, Smith et al.
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- 4,306 views
- 734 downloads
- 28 citations
Views, downloads and citations are aggregated across all versions of this paper published by eLife.