Exploiting the mediating role of the metabolome to unravel transcript-to-phenotype associations

  1. Chiara Auwerx
  2. Marie C Sadler
  3. Tristan Woh
  4. Alexandre Reymond
  5. Zoltán Kutalik (corresponding author)
  6. Eleonora Porcu (corresponding author)
  1. Center for Integrative Genomics, University of Lausanne, Switzerland
  2. Swiss Institute of Bioinformatics, Switzerland
  3. University Center for Primary Care and Public Health, Switzerland
  4. Department of Computational Biology, University of Lausanne, Switzerland
4 figures and 2 additional files

Figures

Figure 1 with 1 supplement
Workflow overview.

(A) Estimation of the causal transcript-to-metabolite and metabolite-to-phenotype effects through univariable Mendelian randomization (MR). (B) Estimation of the causal transcript-to-phenotype …

Figure 1—figure supplement 1
Number of instrumental variables (IVs) used for causal effect estimation through Mendelian randomization (MR).

Distribution of the number of IVs used by the univariable MR aiming at identifying (A) transcript-to-metabolite, (B) metabolite-to-phenotype, and (C) transcript-to-phenotype causal relations. Data …

Figure 1—figure supplement 1—source data 1

Number of instrumental variables (IVs) used for causal effect estimation through Mendelian randomization (MR).

Number of IVs used by the univariable MR aiming at identifying transcript-to-metabolite, metabolite-to-phenotype, and transcript-to-phenotype causal relations. This file relates to Figure 1—figure supplement 1.

https://cdn.elifesciences.org/articles/81097/elife-81097-fig1-figsupp1-data1-v1.xlsx

Figure 2
Direct and mediated effects.

(A) Graphical representation of the multivariable Mendelian randomization (MVMR) framework for mediation analysis: DNA represents genetic instrumental variables (IVs) chosen to be directly …

Figure 2—source data 1

Direct and mediated effects.

Total (αTP ; transcript-to-phenotype effect) and direct (αd ; direct_effect) effects for the 216 transcript-metabolite-trait causal triplets involving the listed transcript (Gene_ID), metabolite (Shin_ID), and complex phenotype. This file relates to Figure 2B.

https://cdn.elifesciences.org/articles/81097/elife-81097-fig2-data1-v1.xlsx
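
The relation between the total and direct effects reported in this file can be illustrated with a minimal Python sketch. It assumes the standard linear mediation decomposition (total effect = direct effect + mediated effect), which is not spelled out in the caption above, and it uses the proportion ρ of direct to total effect as defined in the caption of Figure 4—figure supplement 1. The `Triplet` container, the function names, and the numeric values are hypothetical and are not taken from the source data.

```python
from dataclasses import dataclass


@dataclass
class Triplet:
    """One transcript-metabolite-phenotype triplet (hypothetical container)."""
    total_effect: float   # alpha_TP: transcript-to-phenotype effect
    direct_effect: float  # alpha_d: effect not mediated by the metabolite


def mediated_effect(t: Triplet) -> float:
    # Assumption: linear decomposition, total = direct + mediated.
    return t.total_effect - t.direct_effect


def direct_proportion(t: Triplet) -> float:
    # rho = alpha_d / alpha_TP; undefined when the total effect is zero.
    if t.total_effect == 0:
        raise ValueError("total effect is zero; rho is undefined")
    return t.direct_effect / t.total_effect


# Made-up numbers for illustration only.
example = Triplet(total_effect=0.10, direct_effect=0.04)
print(mediated_effect(example), direct_proportion(example))
```

Under this decomposition, a ρ below 0 or above 1 arises when the direct and mediated components have opposite signs, which is consistent with the –2 to 2 range of ρ covered by the power comparison.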

Figure 3
Molecular pleiotropy at the FADS locus.

(A) Genome browser (GRCh37/hg19) view of the genomic region on chromosome 11 encompassing TMEM258, FADS1, and FADS2 (red). (B) Diagram of the mediation signals detected for TMEM258, FADS1, and FADS2.…

Figure 4 with 2 supplements
Power comparison between transcriptome-wide Mendelian randomization (TWMR) and multivariable Mendelian randomization (MVMR).

Heatmap showing the difference in statistical power between TWMR and mediation analysis through MVMR at current (A; N=8000) and realistic future (B; N=90,000) metabolic quantitative trait loci (mQTL) dataset …

Figure 4—source data 1

Difference in statistical power between transcriptome-wide Mendelian randomization (TWMR) and mediation analysis at N = 8000 metabolic quantitative trait locus (mQTL) dataset sample size.

Each cell represents the mean difference in power between TWMR and mediation analysis for a given scenario across 10 simulations. Rows reflect decreasing ratio between transcript-to-metabolite and metabolite-to-phenotype effects from 10 to 0.1 (sigma). Columns reflect increasing proportion of direct to total effect from –2 to 2 (rho). This file relates to Figure 4A.

https://cdn.elifesciences.org/articles/81097/elife-81097-fig4-data1-v1.xlsx
Figure 4—source data 2

Difference in statistical power between transcriptome-wide Mendelian randomization (TWMR) and mediation analysis at N = 90,000 metabolic quantitative trait locus (mQTL) dataset sample size.

Each cell represents the mean difference in power between TWMR and mediation analysis for a given scenario across 10 simulations. Rows reflect decreasing ratio between transcript-to-metabolite and metabolite-to-phenotype effects from 10 to 0.1 (sigma). Columns reflect increasing proportion of direct to total effect from –2 to 2 (rho). This file relates to Figure 4B.

https://cdn.elifesciences.org/articles/81097/elife-81097-fig4-data2-v1.xlsx
Figure 4—figure supplement 1
Distribution of empirical causal triplets along tested regime parameters.

Distribution of the proportion (ρ) of direct (αd) to total (αTP) effect (i.e., effect not mediated by the metabolite with arrows indicating increasing proportion of direct effect; x-axis) and ratio …

Figure 4—figure supplement 1—source data 1

Distribution of empirical causal triplets along tested regime parameters.

Each cell counts the number of empirical causal triplets that fall within a given parameter regime. Rows reflect decreasing ratio between transcript-to-metabolite and metabolite-to-phenotype effects (σ) from 10 to 0.1 (top row indicating σ>10). Columns reflect increasing proportion of direct to total effect (ρ) from –2 to 2. This file relates to Figure 4—figure supplement 1.

https://cdn.elifesciences.org/articles/81097/elife-81097-fig4-figsupp1-data1-v1.xlsx
Figure 4—figure supplement 2
Power comparison between transcriptome-wide Mendelian randomization (TWMR) and multivariable Mendelian randomization (MVMR) at smaller sample sizes.

Heatmap showing the difference in statistical power between TWMR and mediation analysis through MVMR at metabolic quantitative trait locus (mQTL) dataset sample sizes smaller than the one in the …

Figure 4—figure supplement 2—source data 1

Difference in statistical power between transcriptome-wide Mendelian randomization (TWMR) and mediation analysis at N = 1000 metabolic quantitative trait locus (mQTL) dataset sample size.

Each cell represents the mean difference in power between TWMR and mediation analysis for a given scenario across 10 simulations. Rows reflect decreasing ratio between transcript-to-metabolite and metabolite-to-phenotype effects from 10 to 0.1 (sigma). Columns reflect increasing proportion of direct to total effect from –2 to 2 (rho). This file relates to Figure 4—figure supplement 2A.

https://cdn.elifesciences.org/articles/81097/elife-81097-fig4-figsupp2-data1-v1.xlsx
Figure 4—figure supplement 2—source data 2

Difference in statistical power between transcriptome-wide Mendelian randomization (TWMR) and mediation analysis at N = 2000 metabolic quantitative trait locus (mQTL) dataset sample size.

Each cell represents the mean difference in power between TWMR and mediation analysis for a given scenario across 10 simulations. Rows reflect decreasing ratio between transcript-to-metabolite and metabolite-to-phenotype effects from 10 to 0.1 (sigma). Columns reflect increasing proportion of direct to total effect from –2 to 2 (rho). This file relates to Figure 4—figure supplement 2B.

https://cdn.elifesciences.org/articles/81097/elife-81097-fig4-figsupp2-data2-v1.xlsx
Figure 4—figure supplement 2—source data 3

Difference in statistical power between transcriptome-wide Mendelian randomization (TWMR) and mediation analysis at N = 4000 metabolic quantitative trait locus (mQTL) dataset sample size.

Each cell represents the mean difference in power between TWMR and mediation analysis for a given scenario across 10 simulations. Rows reflect decreasing ratio between transcript-to-metabolite and metabolite-to-phenotype effects from 10 to 0.1 (sigma). Columns reflect increasing proportion of direct to total effect from –2 to 2 (rho). This file relates to Figure 4—figure supplement 2C.

https://cdn.elifesciences.org/articles/81097/elife-81097-fig4-figsupp2-data3-v1.xlsx

Additional files

MDAR checklist
https://cdn.elifesciences.org/articles/81097/elife-81097-mdarchecklist1-v1.docx
Supplementary file 1

Supplementary tables.

a. Annotation of metabolites measured in Shin et al., 2014 with Human Metabolome Database (HMDB) identifiers.
b. Significant transcript-to-metabolite causal effects (FDR 5%) identified through univariable Mendelian randomization. Both original effects (ORIGINAL) and those after excluding outliers (N_outlier) are reported. The FDR column reports the adjusted p-value used to select significant associations (FDR ≤ 0.05). The HMDB and PubMed columns indicate the PMID of publications reporting a link between the tested transcript and metabolite, as identified per automated literature review, with ‘1*’ indicating associations reported without referencing a specific publication.
c. List of the 28 medically relevant phenotypes assessed in this study.
d. Significant metabolite-to-phenotype causal effects (FDR 5%) identified through univariable metabolome-wide Mendelian randomization (MWMR). Both original effects (ORIGINAL) and those after excluding outliers (N_outlier) are reported. The FDR column reports the adjusted p-value used to select significant associations (FDR ≤ 0.05).
e. Significant transcript-to-phenotype causal effects (FDR 5%) identified through univariable transcriptome-wide Mendelian randomization (TWMR). Both original effects (ORIGINAL) and those after excluding outliers (N_outlier) are reported. The FDR column reports the adjusted p-value used to select significant associations (FDR ≤ 0.05).
f. Identified causal transcript-metabolite-phenotype triplets. Effect size and p-value for the transcript-to-metabolite, metabolite-to-phenotype, and transcript-to-phenotype relations among the 216 identified causal triplets, along with estimated direct and indirect effects. Rows colored in beige were identified by our automated literature review of transcript-to-metabolite pairs and were subjected to an automated literature review of the transcript-phenotype relation. The PubMed column reports the PMID of publications identified per automated literature review for the involved gene and phenotype (using the synonyms in PubMed_PHENO) after manual curation of abstracts to exclude findings in which search terms were used in an erroneous context.
g. Metabolites integrating the effect of multiple transcripts. Twelve metabolites integrate the effect of multiple transcripts to in turn influence one or several phenotypes. Transcripts in bold in the same color are encoded by genes in close genomic proximity.

https://cdn.elifesciences.org/articles/81097/elife-81097-supp1-v1.xlsx
