Single cell RNA sequencing provides powerful insight into the factors that determine each cell's unique identity. Previous studies led to the surprising observation that alternative splicing among single cells is highly variable and follows a bimodal pattern: a given cell consistently produces either one or the other isoform for a particular splicing choice, with few cells producing both isoforms. Here we show that this pattern arises almost entirely from technical limitations. We analyze alternative splicing in human and mouse single cell RNA-seq datasets, and model them with a probabilistic simulator. Our simulations show that low gene expression and low capture efficiency distort the observed distribution of isoforms. This gives the appearance of binary splicing outcomes, even when the underlying reality is consistent with more than one isoform per cell. We show that accounting for the true amount of information recovered can produce biologically meaningful measurements of splicing in single cells.
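The sampling argument above can be illustrated with a minimal sketch. This is not the authors' simulator; it is a toy model under stated assumptions: every cell has the same true inclusion rate (`true_psi`), a fixed number of mRNA molecules per gene (`n_molecules`), and each molecule is captured independently with probability `capture_rate`. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import random


def simulate_observed_psi(true_psi, n_molecules, capture_rate, n_cells, seed=0):
    """Toy model: return the observed inclusion ratio for every cell in which
    at least one molecule of the gene was captured.

    Assumptions (illustrative only): each molecule carries the included
    isoform with probability true_psi, and each molecule is captured
    independently with probability capture_rate.
    """
    rng = random.Random(seed)
    observed = []
    for _ in range(n_cells):
        included = 0
        excluded = 0
        for _ in range(n_molecules):
            is_included = rng.random() < true_psi
            if rng.random() < capture_rate:  # molecule survives capture
                if is_included:
                    included += 1
                else:
                    excluded += 1
        total = included + excluded
        if total > 0:  # cells with no captured molecule yield no measurement
            observed.append(included / total)
    return observed


def binary_fraction(psi_values):
    """Fraction of cells whose observed ratio is exactly 0 or 1."""
    return sum(1 for p in psi_values if p in (0.0, 1.0)) / len(psi_values)


# Same true splicing ratio (0.5) in both scenarios; only sampling depth differs.
low_depth = simulate_observed_psi(0.5, n_molecules=20, capture_rate=0.05, n_cells=2000)
high_depth = simulate_observed_psi(0.5, n_molecules=200, capture_rate=0.5, n_cells=2000)

print("binary fraction, low expression and capture:", binary_fraction(low_depth))
print("binary fraction, high expression and capture:", binary_fraction(high_depth))
```

In this toy setting, most cells in the low-depth scenario appear to produce a single isoform even though every cell was simulated with both isoforms at equal rates, while the high-depth scenario recovers intermediate ratios.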
All sequencing data reanalyzed in this study were acquired from GEO.
Single-cell analysis of allelic gene expression in pluripotency, differentiation and X-chromosome inactivation. NCBI Gene Expression Omnibus, GSE74155.
Defining the early steps of cardiovascular lineage segregation by single cell RNA-seq. NCBI Gene Expression Omnibus, GSE100471.
Pseudo-temporal ordering of individual cells reveals regulators of differentiation. NCBI Gene Expression Omnibus, GSE52529.
Single-cell alternative splicing analysis with Expedition reveals splicing dynamics during neuron differentiation. NCBI Gene Expression Omnibus, GSE85908.
Olfactory stem cell differentiation: horizontal basal cell (HBC) lineage. NCBI Gene Expression Omnibus, GSE95601.
- Carlos F Buen Abad Najar
- Nir Yosef
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
- L Stirling Churchman, Harvard Medical School, United States
© 2020, Buen Abad Najar et al.
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.