Alternative polyadenylation mediates genetic regulation of gene expression

  1. Briana E Mittleman
  2. Sebastian Pott
  3. Shane Warland
  4. Tony Zeng
  5. Zepeng Mu
  6. Mayher Kaur
  7. Yoav Gilad
  8. Yang Li (corresponding author)
  1. Genetics, Genomics, and Systems Biology, University of Chicago, United States
  2. Department of Human Genetics, University of Chicago, United States
  3. Section of Genetic Medicine, Department of Medicine, University of Chicago, United States
8 figures and 4 additional files

Figures

Figure 1 with 11 supplements
3' sequencing of mRNA from the nuclear fraction to study inter-individual variation in APA.

(A) Schematic of how genetic variants affect phenotypes by percolating through gene regulatory layers (black arrows). We aimed to understand how genetic variation can mediate gene regulation through …

Figure 1—figure supplement 1
Relationship between the number of PAS identified in our study and gene expression levels (TPM) as measured from GEUVADIS YRI LCLs.

Genes with mean TPM <1 across individuals were considered not expressed and thus were removed for this analysis.
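
The expression filter described in this caption can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `filter_expressed_genes`, the input layout (a dict mapping each gene to its per-individual TPM values), and the toy values are all assumptions made for the example.

```python
from statistics import mean


def filter_expressed_genes(tpm_by_gene, min_mean_tpm=1.0):
    """Keep genes whose mean TPM across individuals is at least min_mean_tpm.

    Genes with mean TPM below the threshold are treated as not expressed
    and dropped, mirroring the rule stated in the caption.
    """
    return {
        gene: values
        for gene, values in tpm_by_gene.items()
        if values and mean(values) >= min_mean_tpm
    }


# Hypothetical toy data: three genes measured in three individuals.
toy_tpm = {
    "GENE_A": [0.2, 0.5, 0.9],  # mean below 1, removed
    "GENE_B": [3.0, 1.5, 2.5],  # mean above 1, kept
    "GENE_C": [0.0, 2.0, 4.0],  # mean of 2, kept despite one zero
}
expressed = filter_expressed_genes(toy_tpm)
```

Note that the filter acts on the mean across individuals, so a gene with zero TPM in some individuals can still pass.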

Figure 1—figure supplement 2
Intronic and 3' UTR PAS share signal site motif density distributions.

(A) Stacked density plot showing the signal site distribution for PAS in 3’ UTR. (B) Stacked density plot showing the signal site distribution for PAS in introns. Other signal sites are in both …

Figure 1—figure supplement 3
Number of PAS identified with usage larger than the usage cutoff (x-axis) in the total mRNA fraction (purple).

Proportion of PAS in introns when PAS are filtered by total usage (green). Proportion of PAS in 3’ UTRs when PAS are filtered by total usage (orange).
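
The quantities plotted in this figure (the number of PAS above each usage cutoff and the share of those PAS in introns or 3’ UTRs) can be computed along the following lines. This is a hedged sketch, not the authors' pipeline: the record layout `(location, usage)`, the location labels `"intron"` and `"utr3"`, the function name, and the toy values are assumptions.

```python
def summarize_by_cutoff(pas_records, cutoffs):
    """For each cutoff, count PAS with usage strictly above it and report
    the proportion of those PAS located in introns and in 3' UTRs.

    pas_records: iterable of (location, usage) tuples (hypothetical layout).
    Returns a list of (cutoff, n_pas, prop_intron, prop_utr3) tuples.
    """
    rows = []
    for cutoff in cutoffs:
        kept = [loc for loc, usage in pas_records if usage > cutoff]
        n = len(kept)
        prop_intron = kept.count("intron") / n if n else 0.0
        prop_utr3 = kept.count("utr3") / n if n else 0.0
        rows.append((cutoff, n, prop_intron, prop_utr3))
    return rows


# Hypothetical toy PAS set with usage expressed as a fraction.
toy_pas = [
    ("intron", 0.06),
    ("intron", 0.12),
    ("utr3", 0.30),
    ("utr3", 0.55),
    ("other", 0.08),
]
summary = summarize_by_cutoff(toy_pas, [0.05, 0.10, 0.25])
```

In this toy set, raising the cutoff removes the low-usage intronic sites first, so the intronic proportion falls while the 3’ UTR proportion rises.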

Figure 1—figure supplement 4
Intronic PAS are enriched in introns with the weakest 5’ splice sites.

Splice site strengths for all introns were calculated using MaxEntScore.

Figure 1—figure supplement 5
We adapted LeafCutter to identify genes with significant differential usage of PAS between the total and nuclear fraction (Li et al., 2018).

The majority of PAS preferentially used in the nuclear fraction are intronic, whereas the majority of PAS preferentially used in the total fraction lie in the 3’ UTR.

Figure 1—figure supplement 6
Our identified PAS include both previously annotated and novel sites.

(A) Distribution of distance between PAS and the closest annotated site in the annotation database (PolyA_DB release 3.2). (B) Scatter plot showing the number of PAS we identified in our study …

Figure 1—figure supplement 7
Validation of cellular fractionation with western blots.

(A) Western blot against the carboxyl-terminal domain of RNA Polymerase II, photo captured at 10 s exposure. The blot is not used for quantification, only to validate cell fractionation. (B) Western blot …

Figure 1—figure supplement 8
Proportion of reads that map to the genome (mapped) and the proportion of final reads used for analysis that are cleanly mapped (Clean Mapped) by nuclear mRNA library.

Cleanly mapped reads are reads that mapped successfully and passed the filtering for mispriming (MP) as described in the Materials and methods.

Figure 1—figure supplement 9
Proportion of reads that map to the genome (mapped) and the proportion of final reads used for analysis that are cleanly mapped (Clean Mapped) by total mRNA library.

Cleanly mapped reads are reads that mapped successfully and passed the filtering for mispriming (MP) as described in the Materials and methods.

Figure 1—figure supplement 10
Total number of reads that map to the genome (mapped) and the number of final reads used for analysis that are cleanly mapped (Clean Mapped) by nuclear mRNA library.

Cleanly mapped reads are reads that mapped successfully and passed the filtering for mispriming (MP) as described in the Materials and methods.

Figure 1—figure supplement 11
Total number of reads that map to the genome (mapped) and the number of final reads used for analysis that are cleanly mapped (Clean Mapped) by total mRNA library.

Cleanly mapped reads are reads that mapped successfully and passed the filtering for mispriming (MP) as described in the Materials and methods.

Figure 2 with 8 supplements
Impact of genetic variation on PAS choice.

(A) An apaQTL in the ABTB2 gene impacts usage of an intronic PAS. (Top) Gene track and identified PAS. Each bar represents a PAS. The red bar corresponds to the PAS most strongly associated with the …

Figure 2—figure supplement 1
Q-Q plots for total and nuclear apaQTL linear regression tests.

(A) Q-Q plot for nuclear apaQTLs, plotting adjusted p-values of the top SNP-PAS associations. (B) Q-Q plot for total apaQTLs, plotting adjusted p-values of the top SNP-PAS associations.

Figure 2—figure supplement 2
Proportion of PAS in different genomic locations with a significant apaQTL.

The numbers above each bar represent the number of identified apaQTLs for each location.

Figure 2—figure supplement 3
Top 4 PCs included in our apaQTL linear models to account for technical variation.

(A) Proportion of variance explained in the first 10 PCs by experimental variables in nuclear APA usage. We used a linear model to assess the correlation between each PC and each covariate. (B) Proportion …

Figure 2—figure supplement 4
Expansion of Figure 4B that includes both fractions.

(A) Histogram showing the distribution of the distance between lead apaQTL SNP and the PAS, separated by mRNA fraction. (B) Histogram showing the distribution of the distance between lead apaQTL SNP …

Figure 2—figure supplement 5
Signal site disruption cannot explain the majority of apaQTLs.

(A) Q-Q plot showing the nuclear apaQTL p-values for SNPs in signal sites upstream of 3’ UTR PAS compared to matched SNPs (equal distance) upstream of a set of 3’ UTR PAS without identified signal …

Figure 2—figure supplement 6
Total-specific apaQTLs have smaller effect sizes than shared apaQTLs.

(A) Boxplot showing the -log10(p-value) of the nominal total apaQTL associations separated by whether the association is also identified in the nuclear mRNA fraction. ApaQTLs that are total-specific …

Figure 2—figure supplement 7
Storey's Pi statistics suggest most apaQTLs are shared between fractions.

(A) Histogram showing the p-value distribution of the apaQTL associations between the lead total apaQTL SNP and the corresponding PAS ascertained using our 3’-Seq data from the nuclear mRNA …

Figure 2—figure supplement 8
Sharing of genetic effects on APA between fractions.

(A) Normalized effect sizes ascertained in total mRNA and nuclear fraction of total apaQTLs tested in both fractions. (B) Normalized effect sizes ascertained in total mRNA and nuclear fraction for …

Figure 3 with 4 supplements
APA can mediate genetic effects on mRNA expression.

(A) Scatter plot of intronic apaQTL effect sizes plotted against their eQTL effect sizes shows negative correlation. (B) Quantile-quantile (Q–Q) plot for apaQTLs shows that apaQTLs are more highly …

Figure 3—figure supplement 1
Scatter plot showing the relationship between intronic nuclear apaQTL effect size and eQTL effect size after removing outlier SNPs (filtered for SNPs with eQTL effect size < −2.0).

Figure 3—figure supplement 2
Q-Q plot showing the total apaQTL (adjusted) p-values separated by whether the gene harbors an explained (red) or unexplained (blue) eQTL.

We observe an enrichment for low apaQTL association p-values in genes with eQTLs compared to all tested genes (black).

Figure 3—figure supplement 3
Bar plot showing the proportion of apaQTLs located in each of the 12 chromatin states from chromHMM.

We find that the location profile of apaQTLs is more similar to that of unexplained eQTLs than that of explained eQTLs. Error bars represent the 95% confidence interval for each point estimate from …

Figure 3—figure supplement 4
Proportion of eQTLs putatively explained by apaQTLs separated by fraction.

Expression QTLs could be explained by apaQTLs identified from both fractions. This observation is robust to apaQTL association p-value cutoffs. We observed that apaQTLs explain a slightly higher …

Figure 4 with 3 supplements
apaQTLs can regulate gene expression without affecting mRNA expression levels.

(A) Quantile-quantile (Q–Q) plot for apaQTLs separated by genes in previously detected rQTLs (red) and pQTLs (purple) that are not eQTLs. Black points are apaQTL genes with no pQTL, rQTL, or eQTL. (B) …

Figure 4—figure supplement 1
Stronger ribo QTLs and protein QTLs than expression QTLs in total fraction.

(A) Q-Q plot showing the total apaQTL (adjusted) p-values separated by whether the corresponding gene has a ribosome occupancy QTL (red) or an eQTL (red). We see an enrichment for low apaQTL …

Figure 4—figure supplement 2
LocusZoom plots for EIF2A apaQTL in Figure 4B along with associations with RNA expression, ribosome occupancy (ribo-seq), and protein expression as determined using normalized data from Li et al., 2016.

LD patterns were colored according to the HapMap YRI lines.

Figure 4—figure supplement 3
Genetic variation around PAS contributes to trait heritability.

(A) Percent of heritability explained by SNPs within 1 kb around each PAS. Error bars represent ±1 standard deviation. We analyzed the following phenotypes: Baso – Basophil count, Baso_p – …

Author response image 1
Author response image 2
Author response image 3
Author response image 4

Additional files

Supplementary file 1

Supplementary Text.

https://cdn.elifesciences.org/articles/57492/elife-57492-supp1-v2.pdf
Supplementary file 2

ApaQTLs whose lead SNP is nominally associated with protein expression levels but not gene expression levels.

Table includes the p-value and slope for the association between the lead SNP and nuclear APA usage, gene expression levels, protein expression levels, and ribosome occupancy (as measured using ribo-seq).

https://cdn.elifesciences.org/articles/57492/elife-57492-supp2-v2.txt
Supplementary file 3

Library information for each Yoruba lymphoblastoid cell line, including sample, collection, and read information.

https://cdn.elifesciences.org/articles/57492/elife-57492-supp3-v2.txt
Transparent reporting form
https://cdn.elifesciences.org/articles/57492/elife-57492-transrepform-v2.docx
