Single-cell transcriptomics in the early C. elegans embryo.

a, The embryonic C. elegans cell lineage as deciphered by Sulston and colleagues (Sulston et al., 1983). The number of cells present at each stage is indicated on the left, and schematic images of the embryo are shown for several stages to the right, with cells colored by founder cell lineage (AB, MS, E, C, D, and P4). The fate of the cells at the last stage is also indicated according to the legend at the foot of the lineage. The number of embryos examined in this study is indicated on the right. Colored bars correspond to the windows in which embryos were considered to be at the same stage. b, A tSNE map of the cells isolated in this study. Circles indicate the individual cells, colored by the stage of development of the embryo from which they were manually collected. c-d, Gene expression for the indicated lineage and fate marker genes on the tSNE map shown in b.

Inferring the transcriptomes of cell states throughout specification in the C. elegans embryo.

a, tSNE clusters of cells from 15-cell stage embryos, shown in terms of their embryo of origin (top), cell clustering (middle), and inferred cell identity (bottom). b, Gene expression for the indicated genes across the assigned cell states of the 28-cell stage. Rows correspond to genes, and each bar within a cell state indicates the expression level of a sample. Black boxes indicate differential gene expression. Colors in the bottom row indicate the embryo of origin. c, Double single-molecule RNA in situ hybridization showing the spatial identity of F19F10.1-expressing nuclei (yellow). The positions of all nuclei are shown with DAPI staining (blue), and the embryos were oriented with the posterior landmark gene pal-1 (purple) (Hunter and Kenyon, 1996). Cell identities were manually annotated based on previous descriptions (Table S4). d, Average expression of cluster identity genes through the first 8 cell divisions of embryonic development. Each bar represents the inferred transcriptome of a cell state and shows the standardized expression for the indicated genes, using the same color bar as in b. Equivalent transcriptomes derived from sister cells are indicated by blue circles. Lineage symmetry gives rise to left/right equivalence groups, indicated with black boxes. e, Expression of ceh-43 according to lineage of the inferred transcriptomes (top) and a transcriptional reporter strain (bottom), where highest expression is shown in red and no detected expression is black (see also Fig. S4).

TF gene families have spatiotemporal specificities.

a, Detecting enrichment and depletion of genes of the same gene family among the TFs expressed by a cell, using the hypergeometric distribution. b-c, For the homeobox domain (b) and GATA zinc finger domain (c) gene families, the enrichments and depletions are indicated for each cell of the lineage, plotted as in Fig. 2d. d, TF family stage and lineage enrichments. For each cell (column), the enrichment and depletion for the genes of a TF family (row) is indicated (similar to b, c). The top bars indicate the lineage and stage of the cell. e, For the indicated TF family, the significance of enrichments is shown using the t-test between the values shown in d for the noted stage and lineage.
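The hypergeometric test named in panel a can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names and the toy counts (20 TFs in total, 5 in the family, 4 expressed in the cell, 3 of those in the family) are assumptions made for the example.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k): k family members among n expressed TFs,
    drawn without replacement from N TFs of which K belong to the family."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def enrichment_pvalue(k, N, K, n):
    """Upper tail P(X >= k): probability of seeing at least k family members."""
    upper = min(K, n)
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k, upper + 1))

def depletion_pvalue(k, N, K, n):
    """Lower tail P(X <= k): probability of seeing at most k family members."""
    lower = max(0, n - (N - K))
    return sum(hypergeom_pmf(i, N, K, n) for i in range(lower, k + 1))

# Hypothetical counts, for illustration only (not taken from the study).
N_TOTAL_TFS = 20      # all TFs considered
K_FAMILY = 5          # TFs belonging to the family of interest
N_EXPRESSED = 4       # TFs expressed in the cell
K_OBSERVED = 3        # expressed TFs that belong to the family

p_enriched = enrichment_pvalue(K_OBSERVED, N_TOTAL_TFS, K_FAMILY, N_EXPRESSED)
p_depleted = depletion_pvalue(K_OBSERVED, N_TOTAL_TFS, K_FAMILY, N_EXPRESSED)
print(f"enrichment p = {p_enriched:.4f}, depletion p = {p_depleted:.4f}")
```

A small enrichment p-value indicates that the family is over-represented among the TFs expressed by the cell, and a small depletion p-value indicates under-representation.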

Homeodomain genes are expressed in lineage-specific stripes across the anterior-posterior axis.

a, Lineage expression patterns for the indicated genes in the 28-, 51-, and 102-cell stages. Thick and thin lines indicate binarized ‘on’ and ‘off’ expression, respectively. b, Three-dimensional expression patterns separated by lineage (AB/MS/C) for the genes indicated in a. Circles represent cells in the embryo at the particular stage examined. Expression of each gene is indicated by a circle of a different size and color. Light gray circles indicate cells in other lineages. c, Clustering of cells according to gene expression of the indicated lineage and stage. Colors distinguish K-means clusters.
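For readers unfamiliar with the K-means step in panel c, a minimal sketch follows. The legend does not state the number of clusters, the features, or the software used, so the value of k, the deterministic initialization, and the toy two-gene expression values below are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two expression vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=100):
    """Plain Lloyd's algorithm.
    Assumption: centroids are initialized from the first k points so the
    example is deterministic; real analyses usually use random restarts."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each cell goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids

# Hypothetical cells described by the expression of two genes.
cells = [
    (5.0, 0.1), (0.2, 4.8), (4.7, 0.3),
    (0.1, 5.2), (5.1, 0.0), (0.0, 5.0),
]
labels, centroids = kmeans(cells, k=2)
print(labels)
```

In this toy input, cells with high expression of the first gene fall into one cluster and cells with high expression of the second gene into the other.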

Expression of the orthologs of Drosophila specification genes in C. elegans.

Lineage expression patterns in the 28-cell stage for the indicated genes whose orthologs function as maternal, gap, pair-rule, and segment polarity genes in Drosophila. Thick and thin lines indicate binarized ‘on’ and ‘off’ differential expression, respectively.

Sample collection and initial analysis.

a, A schematic indicating how transcriptomes were collected. Individual embryos were opened and the cells were manually isolated, such that the embryo of origin was recorded for each cell. b, tSNE plots for the indicated cell stages. (left) Cells are colored by embryo of origin. (right) Cells are colored by inferred cell fate.

Cell state assignment and robustness analysis.

For the 2-cell (a), 4-cell (b), 8-cell (c), 15-cell (d), 28-cell (e), 51-cell (f), and 102-cell (g) stages, the two plots indicate (1) the expression of known cell state markers (Tables S2 and S3) and (2) the similarity (correlation coefficient) across the cells in terms of these genes. Cells are colored by embryo of origin. Expression levels are indicated as z-scores. A cell state whose name includes an ‘x’ denotes the multiple cell states consistent with that name (Sulston et al., 1983). Cell similarity scores are shown by the colormap. To the right, the embryo of origin is indicated.
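The two quantities plotted here, per-gene z-scores and cell-to-cell correlation over the marker genes, can be sketched as below. This is an illustrative sketch only: the marker names and expression values are hypothetical, and the legend does not specify which correlation coefficient was used, so Pearson correlation is an assumption.

```python
from math import sqrt

def zscore(values):
    """Standardize a list to mean 0 and (population) standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0:
        return [0.0] * n  # constant input carries no information
    return [(v - mean) / sd for v in values]

def pearson(x, y):
    """Pearson correlation, computed as the mean product of z-scores."""
    zx, zy = zscore(x), zscore(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)

# Hypothetical marker expression: one row per gene, one column per cell.
expression = {
    "marker_A": [10, 12, 0, 1],
    "marker_B": [0, 1, 9, 11],
    "marker_C": [5, 6, 0, 0],
}

# Plot 1: z-score each marker gene across the cells.
z_by_gene = {gene: zscore(vals) for gene, vals in expression.items()}

# Plot 2: correlate every pair of cells over their marker z-scores.
n_cells = 4
profiles = [[z_by_gene[g][c] for g in expression] for c in range(n_cells)]
similarity = [[pearson(profiles[i], profiles[j]) for j in range(n_cells)]
              for i in range(n_cells)]

for row in similarity:
    print(" ".join(f"{v:+.2f}" for v in row))
```

Cells that share a marker profile correlate positively, while cells with opposing profiles correlate negatively, which is what produces the block structure in a similarity heatmap.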

Number of cell states detected at each of the studied stages.

Columns and rows correspond to cell identities (both in the same order). Light and dark colors correspond to identity and distinctness. For the 51- and 102-cell stages, the cells are ordered as in Figures 1 and 2. The number above each plot indicates the number of cell states identified at the particular stage.

Inferred and validated gene expression.

As in Fig. 2e, expression of dmd-4 and unc-30 according to lineage of the inferred transcriptomes (top) and a transcriptional reporter strain (bottom), where highest expression is shown in red and no detected expression is black.

Expression of genes in C. elegans whose Drosophila orthologs have patterning functions.

Expression on the tSNE plot is indicated as in Figure 1b.

Characteristics of the dataset.

Each row corresponds to one studied embryo. The average numbers of transcripts and genes detected are shown.

Gene markers from the literature.

Cell state markers for the expression profiles used for assignment were curated from the literature (da Veiga Beltrame et al., 2022) and, in particular, one large study (Murray et al., 2012).

Inferred cell states.

Each row corresponds to one of the 119 inferred cell states. The third column indicates the number of samples (scRNA-Seq cells) annotated to each cell state.