A combination of single-cell techniques and computational analysis enables the simultaneous discovery of cell states, lineage relationships and the genes that control developmental decisions.
The identity or state of a cell depends on numerous factors. Some of these factors are transient in nature (such as the stage the cell is at in the cell cycle), while others reflect long-lasting commitments, such as those that occur during the development of stem cells (Novershtern et al., 2011, Graf and Enver, 2009). By making the entire transcriptome available, single-cell RNA sequencing is now allowing researchers to systematically investigate these factors (Wagner et al., 2016; Tanay and Regev, 2017). Specifically, single-cell technology opens the way for developmental biologists who work on the transitions between different cell states to explore three outstanding questions: (1) What are the cell states (both transitional and long lasting or terminal) that comprise a developmental process of interest? (2) What transitions take place between these states? (3) How are these transitions regulated?
Now, in a pair of papers in eLife, researchers at Harvard University and the Allen Institute for Brain Science report a framework that uses whole-genome mRNA expression profiling to address these questions, which they then apply to stem cell differentiation in mouse embryos (Furchtgott et al., 2017; Jang et al., 2017). The basic concept that underlies these two papers concerns the second question, which is about transitions between cell states that have already been defined in advance. Previous attempts to address this question mostly relied on the notion that two cell states are 'close' to each other in their lineage tree if their gene expression profiles are similar (Qiu et al., 2011; Shin et al., 2015). In the first of the papers Leon Furchtgott, Samuel Melton, Vilas Menon and Sharad Ramanathan present an alternative strategy, which was motivated by an investigation of gene expression in B- and T-cells as they developed (Furchtgott et al., 2017).
Combining this gene expression data with what was already known about the lineage relationship between the different states of the B- and T-cells, Furchtgott et al. identified triplets of cell states that exhibited a consistent pattern. Each triplet contained a precursor state and two descendant states, and for many transcription factor genes, the expression in one member of the triplet was much less than in the other two members. Furthermore, the member of the triplet with low levels of gene expression was rarely the 'central' state, which can represent either a common precursor for the two other states, or a transitional state between them (see Figure 1). This finding is consistent with previous work which showed that cell differentiation involves the selective silencing of certain transcription factors (Graf and Enver, 2009; Novershtern et al., 2011), or that transcriptional profiles often exhibit a 'single-pulse' pattern during development (Yosef and Regev, 2011).
Furchtgott et al. then developed a statistical method to test whether a given triplet of states reflects a true developmental progression and, if so, in what order. The method is based on identifying 'transitional' genes that are clearly expressed at low levels in one of the states, and testing whether the overall pattern of transitions (while looking at the entire set of transcription factor genes) is likely due to the presence of a lineage relationship. All the triplets that 'pass' this test are then merged into one global lineage tree. This new strategy opened up the possibility of categorizing all the cells in a single-cell RNA sequencing dataset into developmental states, without knowing in advance what these states were. To achieve this, which essentially involves answering the first of our three questions, Furchtgott et al. developed an iterative algorithm that seeks to partition the cells into clusters (each representing a separate state), such that the overall likelihood of lineage relationships between these clusters is high.
Ramanathan, Sumin Jang and co-workers – including Jang and Sandeep Choubey as joint first authors – then used this approach to do two things: first, they identified nine cell states that occur as embryonic stem cells undergo differentiation and eventually become progenitor cells for the different germ layers in a mouse embryo; second, they organized these nine states into a lineage tree (Jang et al., 2017). The next challenge was to find out how transcriptional regulation controlled cell development in this system.
In the past researchers have focused primarily on the transitions between states (see, for example, Novershtern et al., 2011; Shin et al., 2015). However, when Jang et al. used flow cytometry and live-cell microscopy to monitor various biomarkers for the cell state, they found that the different cell states were relatively stable, while the transitions between them occurred more rapidly. This prompted them to focus on the states themselves, rather than the transitions between them. In particular they looked for a network of interactions between sets of transcription factors that could converge to a number of different steady states (Huang et al., 2009), with each of these steady states representing a particular cell state.
Since the number of networks that exhibit this property is extremely high, Jang et al. were not able to explore all of them. Rather, they explored a sample, looking for relationships between transcription factors that were consistent across many networks. This allowed them to make predictions about the relationships between various transcription factors, and how these relationships depended on the cell state. For example, they predicted (and then experimentally verified) that the expression of Oct4 is more sensitive to the over-expression of Sox2 when a cell is in an epiblast-like state than when it is in a state that is like an embryonic stem cell.
While new tools, particularly single-cell RNA-sequencing, are proving to be highly productive, we could learn much more by measuring other molecular profiles within the cells. For instance, knowing more about the chromatin state of single cells could help a lot when categorizing them into cell states and trying to identify the most active transcription factors (Wagner et al., 2016; Tanay and Regev, 2017). Technologies for lineage tracking can further provide direct observations of cell state transitions (Woodworth et al., 2017). Looking to the future, the ability to collect multiple types of data from single cells, combined with the ability to integrate and interpret all these data in an informative manner, is sure to lead to new insights into how cells change and fate decisions are made during development.
Cancer attractors: a systems view of tumors from a gene network dynamics and developmental perspectiveSeminars in Cell & Developmental Biology 20:869–876.https://doi.org/10.1016/j.semcdb.2009.07.003
Discovering biological progression underlying microarray samplesPLoS Computational Biology 7:e1001123.https://doi.org/10.1371/journal.pcbi.1001123
Revealing the vectors of cellular identity with single-cell genomicsNature Biotechnology 34:1145–1160.https://doi.org/10.1038/nbt.3711
Building a lineage from single cells: Genetic techniques for cell lineage trackingNature Reviews Genetics 18:230–244.https://doi.org/10.1038/nrg.2016.159
Downloads (link to download the article as PDF)
Download citations (links to download the citations from this article in formats compatible with various reference manager tools)
Open citations (links to open the citations from this article in various online reference manager services)
Computational analysis of gene expression to determine both the sequence of lineage choices made by multipotent cells and to identify the genes influencing these decisions is challenging. Here we discover a pattern in the expression levels of a sparse subset of genes among cell types in B- and T-cell developmental lineages that correlates with developmental topologies. We develop a statistical framework using this pattern to simultaneously infer lineage transitions and the genes that determine these relationships. We use this technique to reconstruct the early hematopoietic and intestinal developmental trees. We extend this framework to analyze single-cell RNA-seq data from early human cortical development, inferring a neocortical-hindbrain split in early progenitor cells and the key genes that could control this lineage decision. Our work allows us to simultaneously infer both the identity and lineage of cell types as well as a small set of key genes whose expression patterns reflect these relationships.