Divergence in alternative polyadenylation contributes to gene regulatory differences between humans and chimpanzees
Abstract
While comparative functional genomic studies have shown that inter-species differences in gene expression can be explained by corresponding inter-species differences in genetic and epigenetic regulatory mechanisms, co-transcriptional mechanisms, such as alternative polyadenylation (APA), have received little attention. We characterized APA in lymphoblastoid cell lines from six humans and six chimpanzees by identifying and estimating usage for 44,432 polyadenylation sites (PAS) in 9,518 genes. Although APA is largely conserved, 1,705 genes showed significantly different PAS usage between species (FDR 0.05). Genes with divergent APA also tend to be differentially expressed, are enriched among genes showing differences in protein translation, and can explain a subset of the inter-species differences in protein expression that are not accompanied by differences at the transcript level. Finally, we found that genes with a dominant PAS, that is, one used more often than the gene's other PAS, are particularly enriched for differentially expressed genes.
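The two quantities the abstract relies on, PAS usage and a dominant PAS, can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes usage is the fraction of a gene's PAS-mapped reads assigned to each site, and that a dominant PAS is simply the site with the strictly highest usage. The function names, gene labels and read counts below are hypothetical.

```python
from collections import defaultdict


def pas_usage(read_counts):
    """Convert per-site read counts into per-gene usage fractions.

    read_counts maps (gene, pas_id) -> number of reads assigned to that site.
    Returns a dict mapping gene -> {pas_id: fraction of the gene's reads}.
    Assumption: usage is a simple within-gene read fraction.
    """
    per_gene = defaultdict(dict)
    for (gene, pas), count in read_counts.items():
        per_gene[gene][pas] = count

    usage = {}
    for gene, counts in per_gene.items():
        total = sum(counts.values())
        if total == 0:
            continue  # usage is undefined for a gene with no reads
        usage[gene] = {pas: c / total for pas, c in counts.items()}
    return usage


def dominant_pas(gene_usage):
    """Return the PAS with the strictly highest usage, or None on a tie.

    Assumption: "dominant" means most used; the study may apply a
    stricter criterion that is not stated in the abstract.
    """
    ranked = sorted(gene_usage.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Hypothetical counts for two genes with two PAS each.
counts = {
    ("GENE_A", "pas1"): 30,
    ("GENE_A", "pas2"): 10,
    ("GENE_B", "pas1"): 5,
    ("GENE_B", "pas2"): 5,
}
usage = pas_usage(counts)
```

Here `dominant_pas(usage["GENE_A"])` picks `pas1`, which carries three quarters of the reads, while `GENE_B` has no dominant site because its two PAS are used equally. Comparing such usage fractions between species, site by site, is the kind of contrast the differential-usage test summarizes.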
Data availability
Sequencing data are available on GEO under accession GSE155245.
Article and author information
Funding
National Institutes of Health (T32GM09197)
- Briana E Mittleman
National Institutes of Health (F31HL149259)
- Briana E Mittleman
National Institutes of Health (R01HG010772)
- Yoav Gilad
National Institutes of Health (R35GM13172)
- Yoav Gilad
National Institutes of Health (K12HL119995)
- Sebastian Pott
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Reviewing Editor
- Graham Coop, University of California, Davis, United States
Version history
- Received: August 28, 2020
- Accepted: February 12, 2021
- Accepted Manuscript published: February 17, 2021 (version 1)
- Version of Record published: March 12, 2021 (version 2)
Copyright
© 2021, Mittleman et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- 1,209 views
- 164 downloads
- 11 citations
Views, downloads and citations are aggregated across all versions of this paper published by eLife.