Pan-tissue transcriptome analysis reveals sex-dimorphic human aging

  1. Siqi Wang  Is a corresponding author
  2. Danyue Dong
  3. Xin Li
  4. Zefeng Wang  Is a corresponding author
  1. Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, China
  2. University of California, Los Angeles, United States
  3. School of Life Science, Southern University of Science and Technology, China
15 figures and 7 additional files

Figures

Figure 1 with 1 supplement
Schematic overview of our study.

RNA-seq datasets from 54 human tissues (including 13 brain regions) were downloaded from GTEx Portal. Gene expression (GE) and alternative splicing (AS) quantifications were performed by Paean. The individual sex or age and age-by-sex effects on global transcriptomic variation were evaluated by designing a principal component-based signal-to-variation ratio (pcSVR) value, and the age-differential genes/AS events were subsequently identified in each sex. Sex-biased age-associated AS events were used to find the sex-specific associations with certain diseases. We also constructed AS regulatory networks to find the regulatory age-associated splicing factors (SFs) and explored the transcriptional regulation of sex hormones via nuclear receptors. Moreover, we focused on age-associated GE and then evaluated the aging rate and their modulated genes in different sexes.

Figure 1—figure supplement 1
Principal component analysis (PCA) on gene expression (GE) and alternative splicing (AS) profiles.

PCA is performed using GE and AS profiles, respectively. The X- and Y-axis indicate scores of the top 2 PCs. Female and male samples are labeled with the red and blue dots, while samples with continuous ages are labeled in a gradient palette. The percentage of explained variation by each PC is labeled nearby each axis.

Figure 2 with 1 supplement
Individual sex or age effects and combined age-by-sex effects on global transcriptome variation.

Principal component-based signal-to-variation ratio (pcSVR) between different sex or age groups calculated by gene expression (GE) (A) and alternative splicing (AS) (B). The yellow triangles show the significant sex-pcSVR calculated between females and males, while black triangles show the significant age-pcSVR calculated between young (i.e., 20–40 years old) and old (i.e.,>60 years old) using the empirical p-value cutoff 0.1. Insignificant data points are labeled in gray. (C) The age effect on transcriptome variation as calculated by GE vs. AS. The X-axis of the scatter plot shows the log2 transformed age-pcSVR between young and old calculated by GE, while the Y-axis shows the values calculated by AS. Blue dots indicate the tissues with significant pcSVR specific in AS, while green dots indicate the tissues with significant pcSVR specific in GE. Tissue names in each class are labeled under the scatter plot. (D) The sex effect on transcriptome variation as calculated by GE vs. AS. The analyses and labels are similar to panel C except the sex-pcSVR is used in all calculations. Scatter plots of the age effect between females vs. males calculated by GE (E) and AS (F). The X-axis shows log2 transformed age-pcSVR calculated in females, while the Y-axis shows the age effect calculated in males. Dashed lines showed equal age-pcSVR in females and males. Dark red and blue dots indicate the tissues with female- and male-specific age effects.

Figure 2—figure supplement 1
Principal component-based signal-to-variation ratio (pcSVR) calculation and robustness.

(A) Schematic diagram of pcSVR calculation. pcSVR higher than 1 indicates that the differences between groups are higher than the noise within groups (left). pcSVR no more than 1 indicates that the differences between groups are smaller than the noise within groups (right). Two spheres represent two groups (females vs. males or young vs. old). Only three dimensions are shown. pcSVR between different sex or age groups calculated by gene expression (GE) (B) and alternative splicing (AS) (C) without genes and AS events on sex chromosomes. The yellow triangles show significant pcSVR calculated between females and males, while the black triangles show significant pcSVR calculated between young (i.e., 20–40 years old) and old (i.e., >60 years old) groups using permutation p-value <0.1. Insignificant values are labeled in gray. Bar plot of the age- and sex-pcSVR values (D) and corresponding empirical p-values (E) calculated using different numbers of PCs capturing 50–90% of the variance. The dashed lines indicate the thresholds for significant sex/age effects judged by the pcSVR analysis, and the mean and s.d. (standard deviation) are plotted here. s. D. are plotted here. (F) Correlations of pcSVRs calculated by the bootstrap and sub-sampling approaches as shown in scatter plots of sex-pcSVR and age-pcSVR values. The X-axis shows the results from the bootstrap approach, while the Y-axis shows the results from permutation analysis using 50% sub-sampling. Each point indicates a tissue. (G) Correlations of empirical p-values calculated by the bootstrap and sub-sampling approaches as shown in scatter plots of −log10 transformed empirical p-values. The label of the X- and Y-axis is similar to panel F. The dots at the right margin indicate the empirical p-values equal to 0, with infinite −log10 transformed values.

Figure 3 with 4 supplements
Sex-stratified age-associated genes and alternative splicing (AS) events across tissues.

The numbers of age-associated genes (A, left) and AS events (B, left) in each sex. The red bars indicate the numbers of age-associated genes or AS events in females, while blue bars indicate those in males. The numbers of age-associated genes and AS events common in both sexes are shown in bars with relatively lighter colors. The significance of the overlapped age-associated genes (A, right) or AS events (B, right) between sexes is estimated using the hypergeometric test. Tissues with significant overlapped p-values are labeled by dark red dots. (C) Correlation between the age effects on genes (green) or AS events (blue) in females vs. males. The Y-axis represents the correlations of the effect sizes of age (βF, βM) between two sexes across all genes/AS events (Spearman’s correlation). Gray lines link a pair of tissue in gene expression (GE) and AS, and the median ±1.5xIQR are plotted in box plot. The p-values are estimated using the Wilcoxon signed-rank test.

Figure 3—figure supplement 1
Differential gene expression (GE) and alternative splicing (AS) analysis to identify sex/age-differential genes and AS events.

The numbers of age-differential (A) and sex-differential (B) genes (left) and AS events (right). The color bar between the two bar plots represents the overlapped significance between sex/age-differential genes and spliced genes calculated using hypergeometric test. The dashed lines separate the tissues with significant or non-significant overlaps using the p-value <0.01. The sex/age-differential genes and AS events are defined with coefficient p-value <0.05 and fold change >1.5 for GE or ΔPSI >0.05 for AS. Correlation between the sample sizes and the numbers of age/sex-differential genes (C) or AS events (D). The X-axis indicates the total sample sizes. The Y-axis shows the numbers of age-differential genes/AS events (left) and sex-differential genes/AS events (right). The p-values of the coefficient of the sample sizes are calculated by linear regression as the formula on the bottom. (E) The numbers of genes (left) and AS events (right) with significant age-by-sex interactions. The color bar between two bar plots represents the overlapped significance between genes and spliced genes calculated using the hypergeometric test. The dashed lines separate the tissues with significant or non-significant overlaps.

Figure 3—figure supplement 2
The evaluation of the confounding factors in the linear regression model.

Correlation between the surrogate variables and human genetic background, including the donor ethnicity (A) and the principal components from whole-genome sequencing (WGS) data (B). The highest correlation r between surrogate variables vs. post-mortem interval (PMI) (C), season of death (D), and time of death (E). Each bar represents those correlations in each human tissue. The bars are color-coded according to a −log10(p-value) scale shown on the right.

Figure 3—figure supplement 3
Permutation analysis of the linear regression model in differential analysis.

Distribution of the p-values across sex/age-differential genes (A) or alternative splicing (AS) events (D) identified from original groups or shuffled groups in 1000 permutated iterations. Fractions of original sex/age-differential genes (B) or AS events (E) that are detected in each permutation. Each data point indicates the result from one shuffled permutation, and the median ±1.5xIQR are plotted as box plot. False discovery rate (FDR) of each age/sex-differential gene (C) and AS event (F) calculated across 1000 permutations. Each data point indicates the result of one age/sex-differential gene (C) and AS event (F), and the median ±1.5xIQR are plotted as box plot.

Figure 3—figure supplement 4
Robustness of gene expression (GE) and alternative splicing (AS) cutoffs.

The distribution of junction reads counts (A) and TPMs (B) of age/sex-differential genes/AS events in multiple tissues. The X-axis indicates the average TPMs or JCs across the samples in certain tissue. The X-axis indicates the average TPMs or JCs across the samples in certain tissue. (C) Overlap between the age/sex-differential genes and AS genes in decision-related brain region. The overlapped p-values calculated by hypergeometric test are shown with the color bar. (D) Correlation of the age effects between females and males on GE (green) or AS (blue) using decision-related brain regions as example. Y-axis represents the correlations of the effect sizes of age (βF, βM) between two sexes across all genes/AS events (Spearman’s correlation). Each jitter indicates the correlation under each single cutoff, with the median ±1.5xIQR plotted. The p-values are estimated using the Wilcoxon signed-rank test.

Figure 4 with 2 supplements
Male-biased associations between the alternative splicing (AS) changes during aging and Alzheimer’s disease (AD).

(A) GO analysis of the sex-biased age-associated AS events (sBASEs) in each sex and brain region. The heatmap shows the sex-specific pathways that are significantly enriched in more than three brain regions. Clustering is conducted by default parameters in pheatmap functions. The −log10 transformed enrichment p-values are shown in the color scale. (B) Venn diagram between the sBASEs in females and males with AD-related AS events. The p-values are calculated using the hypergeometric test. (C) Correlation between AD- and age-associated AS changes in females and males. The X-axis indicates the PSIOld − PSIYoung, while the Y-axis indicates PSIAD − PSIControl. sBASEs in females are labeled in red, while sBASEs in males are in deep blue. The estimated Rho and p-value by Spearman’s correlation test in each sex are labeled on the top. (D) Model for AD prediction and feature importance evaluation. sBASEs are used for predicting AD in females and males, respectively. 90% of samples are randomly selected as training sets for 100 iterations. The recursive feature elimination approach is used for feature selection. Feature importance is evaluated by the averaged mean decrease accuracy (MDA) across 100 iterations. (E) AS levels of the skipped exon 10 on SLC43A2 during plaque stages in males and females. (F) AS levels of the skipped exons 3 and 4 on FAM107A during tangles stages in males and females. Mean ± standard error is shown in the error bar. (G) Performances of sex-stratified and merge-sexes models predicted by sBASEs or randomly selected AS events for 100 iterations. The control models (i.e., the sex-stratified model trained by randomly selected AS events and the merge-sexes model trained by sBASEs in each sex) are highlighted with dashed lines. The median ± 1.5xIQR of area under the curve (AUC) across 100 iterations is shown in the boxplot.

Figure 4—figure supplement 1
Functional enrichment of sex-biased age-associated alternative splicing events (sBASEs) across tissues.

GO enrichment of the sBASEs in females or males across all tissues. Sex-specific GO terms significantly in at least 15 tissues were shown.

Figure 4—figure supplement 2
Sex-biased associations between sex-biased age-associated alternative splicing events (sBASEs) and diseases in decision-related brain region.

(A) DO analysis of the sBASEs in females or males in decision-related brain region. Sex-specific DO terms with top 10 −log10 transformed enrichment p-values in each sex were shown. (B) Venn diagram between sex-biased age-associated genes and Alzheimer’s disease (AD)-related genes. The p-values are calculated using the hypergeometric test. (C) Correlation between AD- and age-associated AS changes across the sBASEs in females and males from the ROSMAP dataset. The X-axis indicates the age-associated AS changes (PSIOld − PSIYoung), while Y-axis indicates AD-associated AS changes (PSIAD − PSIControl). (D) Performances of sex-stratified and merge-sexes models predicted by sBASEs and randomly selected AS events for 100 iterations using support vector machine (SVM) classifier. The control models (i.e., the sex-stratified model trained by randomly selected AS events and the merge-sexes model trained by sBASEs in each sex) are highlighted with dashed lines. The m median ± 1.5xIQR of area under the curve (AUC) in 100 iterations is shown in the boxplot.

Figure 5 with 1 supplement
Sex-dimorphic alternative splicing (AS) regulation during aging in decision-related brain region.

(A) Schematic diagram for constructing AS regulatory networks (left) and the networks of the decision-related brain region (right). The blue and red lines indicate the regulations in males and females. Hub genes in octagons are splicing factors (SFs). The red ones are female-specific age-associated SFs, while the age-associated SFs common in both sexes are labeled in yellow. The color of the ellipse indicates the significance of sex-biased age-associated AS events (sBASEs) during aging, and the thickness of the line shows −log10 Spearman’s correlation p-values between the TPMs of SFs and PSIs of AS events during aging. (B) Examples of female-specific age-associated SFs (left) and age-associated SFs common in both sexes (right) during aging. Median ±1.5xIQR is shown in the boxplot. The grey triangles indicate averaged expression level. The p-value and fold change between old and young are labeled at the bottom. (C) Percentages of the sBASEs regulated by the sex-specific age-associated SFs in four brain regions. (D) Functional enrichment of sBASEs regulated by female-specific age-associated SFs based on MsigDB. (E) Schematic diagram for the transcriptional regulation on age-associated SFs via nuclear receptors (left). The boxplot shows the ESR1-binding scores on age-associated SFs (right). The Y-axis indicates the ratio of ESR1-binding scores of each SF divided by the median binding score of age-associated SFs common in both sexes. (F) Expression levels of ESR1 during aging. The legends of each group are the same as those in B. (G) Expression levels of SRSF1 and SRSF7 treated by estrogen receptor agonist (PPT) vs. DMSO in MCF-7 cell line (N = 3). (H) Expression levels of SRSF1 and SRSF7 treated by 1 nM estradiol (E2) vs. vehicle control (veh) in ESR1 wild-type, Y537S, and D538G mutant MCF-7 cells (N = 4). The error bars indicate the mean ± s.d. (I) Expression levels of SRSF1 and SRSF7 during aging. Y-axis indicates the normalized TPM values. The p-value and fold change between old and young groups in each sex are labeled at the bottom. The legends of each group are the same as those in B.

Figure 5—figure supplement 1
Gene expression regulation of sex-biased age-associated splicing factors by estrogen via ESR1 in multiple datasets.

(A) Expression levels of HNRNPA2B1 and HNRNPC treated by 1 nM estradiol (E2) vs. vehicle control (veh) in ESR1 wild-type, Y537S, and D538G mutant MCF-7 cell lines. Y-axis indicates the TPM values (N=3); the p-value is labeled on the top. (B) Expression levels of Hnrnpa2b1, Hnrnpc, Srsf1, and Srsf7 treated by estradiol (E2) vs. vehicle control (Oil) of ERα wild-type and knockout ARCs in female mice. Y-axis indicates the FPKM values. (C) ESR1 ChIP-seq reads coverages around Srsf1 and Srsf7 treated by estradiol (E2) vs. vehicle control (Oil) of ERα wild-type (WT, top panel) and knock-out (KO, bottom panel) in female mice. Y-axis of each track indicates the read counts at each position from ChIP-seq. mean ± s.d. is plotted.

Figure 6 with 2 supplements
Sex-dimorphic aging rate of gene expression (GE) during aging process.

(A) Workflow of time series and breakpoint analysis of GE in each sex. (B) Aging rates and breakpoints in the aorta (left) and transverse colon (right). The Y-axis shows the estimated aging rate (see Materials and methods). The p-values are calculated using the Wilcoxon signed-rank test. (C) Characteristics of the major breakpoints across tissues in each sex. The left panel shows the distribution of the major breakpoints across multiple tissues. The right panel shows the aging rate at the major breakpoints. The p-values are calculated using the Wilcoxon signed-rank test. (D) Pipelines for identifying aging-modulated genes (AMGs) in each sex and tissue (see Method). (E) AMGs in different sexes and tissues. The numbers of AMGs in females (red) and males (blue) are shown in the bar plot (left). Tissues are ordered by the number of AMGs common in both sexes, which are shown in lighter colors. Significances of the overlapped AMGs are calculated using hypergeometric test and colored in a gradient palette (right). GO analysis of male-biased AMGs (F) and female-biased AMGs (G). Sex-specific GO terms with top 10 −log10 transformed enrichment p-values are plotted. GO terms are labeled by purple dots, the size of which indicates the significance of the enrichment analysis. The larger the dots, the more significant enrichment scores of the biological pathways. AMGs are labeled by blue dots, the color of which indicates the number of GO terms associated with those genes. (H) The binding scores of AR (top) and ESR1 (bottom) on male-specific AMGs, female-specific AMGs, and randomly selected AR/ESR1 target genes. The p-values are calculated by the Wilcoxon rank-sum test. The error bars indicate median ±1.5xIQR.

Figure 6—figure supplement 1
Breakpoint analysis across multiple tissues.

Tissues with faster aging rates in males (A) and females (B). The p-values are calculated using the Wilcoxon signed-rank test and are labeled at the bottom. (C) Characteristics of the major breakpoints calculated by all chronological genes in females and males. The left panel shows the distribution of the major breakpoints across multiple tissues. The right panel shows the aging rate at the major breakpoints. The p-values are both calculated using the Wilcoxon signed-rank test. Characteristics of the major breakpoints calculated by chronological sex-biased age-associated alternative splicing events (sBASEs) (D) and all chronological alternative splicing (AS) events (E) in females and males. The left panel shows the distribution of the major breakpoints across multiple tissues. The right panel shows the aging rate at the major breakpoints. p-values are both calculated using the Wilcoxon signed-rank test. The error bars indicate median ±1.5xIQR.

Figure 6—figure supplement 2
Gene expression (GE) patterns of the aging-modulated genes (AMGs) in whole blood tissue.

(A) GE patterns of AMGs common in both sexes (counts = 82) during aging. The values represent the z-score of fitted TPM from Autoregressive Integrated Moving Average (ARIMA) models. (B) Examples of AMGs common in both sexes with specific GE patterns. The dots indicate the ARIMA-fitted expression levels at each age point. The solid lines indicate the fitted values from LOESS regression for displaying the GE trends during aging. (C) GO analysis of the AMGs common in both sexes with specific GE patterns during aging. GE patterns of male-biased (D, counts = 83) and female-biased (E, counts = 61) AMGs during aging. AMGs in dotted circles indicate the cluster of AMGs with increasing expression levels during aging in males.

Author response image 1
The results of differential gene expression analysis with vs without the inclusion of PMI correction as a known covariate.

The scatter plots show the correlations of significance levels (pvalues, left panel) and effect sizes (coefficients, right panel) of sex (A) and age (B). Whole-blood tissue is used as an example.

Author response image 2
The comparison between the gene expression of whole blood tissue from GTEx and PBMCs from Shen et al.

(A) The bar plot shows the number of age (left panel) or sex-associated (right panel) genes in the two datasets. The grey bars highlight the proportion of overlapped genes in both datasets. (B) The top 10 significantly enriched biological processes in the GTEx-specific age-associated genes. The color bar shows the number of age-associated genes in specific pathways.

Author response image 3
The distribution and functional relevance of sBASEs with coding effects.

(A) The number of sBASEs and CDS-altering sBASEs across multiple tissues. The deeper bars show the number of sBASEs whose alternative splice sites are located at protein-coding regions. (B) GO biological pathways in each sex and brain region. Heatmap shows the sex-specific pathways that are significantly enriched by CDS-altering sBASEs in more than 2 brain regions and sex. (C) Correlation between ADassociated and age-associated AS changes across the CDS-altering sBASEs that alter protein-coding sequences in females and males. (D) Performances of sex-stratified models predicted by CDS-altering sBASEs in 100 iterations using the random forest approach

Author response image 4
SRSF1 regulations on specific sBASEs using SRSF1 knock-down RNA-seq data in GBM cells.

Three examples are shown to be regulated during aging with significant changes between SRSF1 KD vs control in U251 and U87MG cell lines. The splicing diagrams are shown below.

Author response image 5

(A) The correlation between gene length and age-associated changes across GTEx tissues in human samples. The correlation tests are evaluated using Spearman’s approach. The color bar indicates the -log10 transformed p-values in the correlation test. (B) The results of GO enrichment analysis using the genes with Foldchange > 2 and length > 105 bp. The parent terms calculated by ‘rrvgo’ with a similarity threshold of 0.9 are shown.

Author response image 6
The correlation between gene length and age-associated changes across tissues in females and males, respectively.

The correlation tests are evaluated using the Spearman’s approach. The red dots indicate the significant correlations in females, while the navy dots show those in males.

Author response image 7
Age (left panel), BMI (Body Mass Index) (middle panel), and PMI (Post-Mortem Interval) (right panel) distribution in GTEx v8 cohort.
Author response image 8
Sex (left panel), ethnicity (middle panel), and manner of death (DTHMNNR) (right panel) distribution in GTEx v8 cohort.
Author response image 9
Protein levels of some male-specific splicing factors in human hippocampus quantified using MS data.

The Y-axis shows the protein intensity. Different facets mean different sample batch sets. The yellow boxes indicate the protein levels in the young group, while the brown boxes indicate those in the old group.

Additional files

MDAR checklist
https://cdn.elifesciences.org/articles/102449/elife-102449-mdarchecklist1-v1.pdf
Supplementary file 1

Summary for sample sizes of each group in multiple tissues.

https://cdn.elifesciences.org/articles/102449/elife-102449-supp1-v1.xlsx
Supplementary file 2

Principal component-based signal-to-variation ratio (pcSVR) values and corresponding permutation p-values between different age or sex groups as judged by gene expression (GE) and alternative splicing (AS).

https://cdn.elifesciences.org/articles/102449/elife-102449-supp2-v1.xlsx
Supplementary file 3

Lists of sex-stratified age-associated genes and alternative splicing (AS) events in 35 human tissues and brain regions.

https://cdn.elifesciences.org/articles/102449/elife-102449-supp3-v1.xlsx
Supplementary file 4

p-values of GO analysis for sex-biased age-associated alternative splicing events (sBASEs) across multiple tissues.

https://cdn.elifesciences.org/articles/102449/elife-102449-supp4-v1.xlsx
Supplementary file 5

AS regulatory networks between sex-biased age-associated alternative splicing events (sBASEs) and age-associated splicing factors in four functional brain regions.

https://cdn.elifesciences.org/articles/102449/elife-102449-supp5-v1.xlsx
Supplementary file 6

Aging-modulated genes (AMGs) and enriched p-values across multiple tissues.

https://cdn.elifesciences.org/articles/102449/elife-102449-supp6-v1.xlsx

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Siqi Wang
  2. Danyue Dong
  3. Xin Li
  4. Zefeng Wang
(2025)
Pan-tissue transcriptome analysis reveals sex-dimorphic human aging
eLife 13:RP102449.
https://doi.org/10.7554/eLife.102449.3