Figures and data

whole-organism T-ChIC of zebrafish embryos quantifies of full length transcripts and histone modifications from the same single cell.
a. woT-ChIC experimental workflow: (1) Single cells from different time points are labelled with a combination of CellTracer dyes and permeabilized either to retain cytoplasmic RNA (whole-cell), or exclude it (nuclei) before antibody+pA-MNase incubation. (2) Cells are pooled and sorted in 384-well plates with position-indexed RNA and ChIC barcodes. (3) DNA and RNA fragments are ligated and repaired before re-pooling them for IVT and PCR amplification to produce sequencing libraries. b. UMAP projections of single cells using signals from H3K27me3 (left) and transcriptome (right) and colored by timepoints. The 6 sampled timepoints (right) are pooled into 3 groups (early/middle/late) based on the complexity of H3K27me3 signal. c. Single-cell trackplot showing signal on the 450-kb region around the hoxc gene cluster. The heatmaps show signals in single cells, while the coverage tracks on top show the pseudo-bulk signal (blue: RNA, pink: H3K27me3). Publicly available bulk H3K27me3 datasets are shown for comparison (grey) d. UMAP projections (based on transcriptome signal) showing gene-level normalized signal for RNA (blue) or H3K27me3 (pink) on 2 selected genes.

Spatio-temporal spreading of H3K27me3 correlates with gene silencing.
a. UMAP projection of cells based on H3K27me3 signal (same as Fig 1b, left) indicating the different cell types annotated using the transcriptome signal. b. An example genomic locus that demonstrates the cis-spreading of H3K27me3 signal (pink) around the zic3 gene with time during development. Note that apart from the zic3 gene, the spreading also correlates with a downregulation of the expression for the nearby gene (rbmx) with time. c. Single-cell heatmap each row is a single cell selected from ectoderm lineage (34.5%) showing the average H3K27me3 signal for the top 100 genes detected by the linear model, showing increase in spreading of signal at the 100-kb region surrounding the center bin with pseudotime. The center bin was identified as the bin with non-zero signal in pluripotent cells. (PT=Pseudotime, RT=Real time) d. Line plots comparing H3K27me3 spreading and gene expression with pseudotime. The bottom panel shows the average fraction of bins that show H3K27me3 signal on 2 sets of genes (spreading, non-spreading) with time, while the top panel shows the average gene expression of these genes-sets along pseudotime. e. Heatmap of H3K27me3 signals across cell types for genes with cell-type-specific demethylation (CT=Cell Type). The analysed cell types are indicated in the right legend, while top legend shows all cell types (colors same as 2a). f. Correlation of H3K27me3 fold-change with change in gene expression, for the selected cell types.Top 5 genes with H3K27me3 loss are labelled per cell type.

Integrative analysis of H3K27me3, H3K4me1 and transcriptome.
a. UMAP projection of the single cells based on the 3 modalities H3K27me3 (left), RNA (center) and H3K4me1 (right) after integration of the two batches and annotation of cells. b. Total UMI counts per cell on H3K4me1 enriched regions from the integrated dataset, divided into K4me1-unique regions and regions co-enriched for K27me3. H3K4me1 signal (top panels) on these regions remains unchanged with time, while H3K27me3 signal (bottom panels) increases. c. Correlation between histone modification and gene expression with latent time. Top plot shows metacells arranged by pseudotime (Y-axis) and the Pearson correlation coefficient between H3K4me1 and RNA (top) and H3K27me3 and RNA. d. Scatterplot showing correlation between H3K4me1 and unspliced RNA for two selected metacells indicated in (c) (early and late in latent time).

Prediction of TF activity using TF epigenetics and transcription.
a. Schematic and outputs of the prediction model. 1) (top) Normalized H3K4me1 and H3K27me3 signal on the TF locus, TF (spliced) RNA signal, and indicators of cell state (pseudotime and lineage) are, used to predict TF activity (bottom) 2) The lasso regression model is used to select the most useful predictors for each TF. The bottom right plot shows the TFs sorted by the R2 values on the independent test dataset. 3) Coefficients from the final models are ranked and compared to categorize the TFs. The number of TFs classified as activator/repressor, or regulated/independent are shown in the right plot. b. UMAPs showing the expression and motif activities of TFs classified as “activators” or “repressors” using the above model. TF expression (left) is based on (normalized) spliced RNA and TF activity (right) is based on H3K4me1 signal on TFBS. For activators, the motif activities on TFBS are gained in cells where the TFs are expressed, while for the repressors, the motif activities are lost in the cells expressing the TF. c. UMAPs show (normalized) H3K4me1 signal and H3K27me3 signal on TF gene body, for TFs classified as “regulated” or “independent”. The histone modifications on regulated TFs are correlated with their activities.

Quality control for the RNA and ChIC fraction
a. Unique Fragment Counts (UFIs) and genes detected based on the spliced (left) and unspliced (right) counts. Cells with T-ChIC signal (in blue) show a trend comparable to the cells with only transcriptome signal (T-noChIC) (in gray). b. UMAP (based on RNA), showing cells with multi-omic signal (blue) co-embed with cells containing only transcriptomic signal (gray). c. Examples of cell-type-specific miRNA expression detected by TChIC’s whole transcriptome approach. d. Unique ChIC counts (representing MN cuts) per bin against the number of detected bins, for 50-kb genomic bins. Cells passing the ChIC QC are coloured in black, while the rest (QC Fail) cells are in Grey. e. Genome track plot showing the effect of QC-based cell filtering on signal enrichment. Top track (black) shows the signal after filtering of bad quality cells, while bottom (grey) track shows the filtered (bad quality) cells. f. Unique RNA fragments (UFI) per cell, classified as “spliced” or “unspliced”, on pooled early (4-6 hpf), middle (8-12 hpf) and late (24 hpf) timepoints. g. Unique ChIC counts (UMI) per cell on pooled early (4-6 hpf), middle (8-12 hpf) and late (24 hpf) timepoints. Mb; megabases. Chr; chromosome. UMI; unique molecular identifier. UFI: unique fragment identifier.

Evaluation of the quantitative chromatin signal from tChIC
a. Pearson correlation of pseudo-bulk H3K27me3 signal for pooled timepoints from our study with comparable time points from the published datasets. b. Fraction of cumulative signal of H3K27me3 in bins, ranked lowest to highest. Pooled time points from our study (blue, orange and green) show a stronger enrichment of H3K27me3 compared to the corresponding time points in public bulk data (red, purple and brown). A uniformly distributed signal over the genome (no enrichment) would lead to a diagonal line. c. 2D density plot comparing log-scaled H3K27me3 signal with gene expression for 3 genes showing decreasing (hoxc3a) increasing (pcdh1b) and transient (rfx4) silencing during development (from Fig. 1c-d) in single cells. d. UMAP showing the transient silencing of rfx4 by H3K27me3 specific to the non-neural ectoderm.

Annotation of H3K27me3-RNA tChIC data (4-24 hpf)
a. Heatmaps showing prediction score (based on CCA-MNN) between cell types annotated by Wagner et al. (2018) on single cells from our study. Leiden clusters of our cells (based on RNA) are shown on the left. b. UMAP (based on RNA) with Leiden clusters for our data c. UMAP coloured by prediction confidence for the transferred label from Wagner et al. d. UMAPs showing meta cells (dark circles). Cells and metacells are coloured by pseudotime. e. UMAPs with final annotation of celltypes (labels are the same as Fig. 2a). f. Barplot showing proportion of cell types detected at 10 and 24 hpf within the 4 biological replicates of T-ChIC experiments (other timepoints were not profiled by all four replicates). Colors correspond to the cell type labels (from e).

H3K27me3 shows cis-spreading with time
a. Plot showing the number of detected bins compared to the average counts per bin in single-cells for the 3 pooled timepoints. With time the number of detected bins increases, while the average counts per bin are saturated. b. Scatterplot showing the H3K27me3 signal in selected bins (i.e. bins with non-zero signal in pluripotent cells) compared to the bins nearby in metacells. Bins that show spreading of H3K27me3 (top panel) are correlated with nearby bins, while others (bottom) are not. c. Single-cell heatmaps (similar to Fig 2c) showing the spread of signal with pseudotime for bins that were detected as spreading only in the “ectoderm” lineage, within cells selected from mesodermal lineage (15% of total). d. Pseudo-bulk track plots showing signals of pooled timepoints from our data (red) showing the spreading of certain peaks by 24 hpf, bottom bars show the domains (green) and sub-peaks (blue) detected for our data. Gray tracks show publicly available bulk datasets. e. Alternative to (d), showing the quantification of H3K27me3 signal density (number of cuts / kb) in H3K27me3 peaks vs 100-Kb region around peaks, in pseudo-bulks of annotated cell types (each dot is a region). The correlation improves with more differentiated cells, indicating signal spreading with time. Cluster 1 (red) which contains the hox genes are already broadly marked in pluripotent cells. f. Same as Fig 2d (top) showing the expression of genes near, but not at H3K27me3 enriched bins with pseudotime. g. Volcano plot showing results of the linear regression per peak with its nearest gene. Peaks with strongly correlated/anti-correlated signals between H3K27me3 and RNA are coloured in red in the volcano plot. h. UMAPs showing the H3K27me3 and RNA signal on top genes underlying the selected (red) peaks of (g). i. Heatmap showing signal localization around promoters (+- 50kb region) of mouse genes for H3K27me3 ChIP-seq from Xiang et al. (2019). Wider regions around promoters show a relative increase in H3K27me3 upon differentiation in ectoderm, compared to E5.5 epiblast.

Quality control and comparison of H3K27me and H3K4me1 signal in nuclei
a. Unique RNA fragments (UFIs) recovered per cell, for “spliced” and “unspliced” reads per time point, for H3K27me3 and H3K4me1-targeted cells from the nuclei batch. b. Number of genes detected versus unique RNA fragments (UFIs) recovered per cell, divided by targeted antibody (top: H3K27me3, bottom: H3K4me1). Nuclei are coloured by time point. c. Unique ChIC counts per cell divided by time point and targeted antibody. The amount of signal recovered for H3K27me3 increases with time, while the amount of H3K4me1 detected decreases. d. UMAP embedding (based on RNA), coloured by the target histone modification. e. UMAP showing the prediction confidence of label transfer per cell from Wagner et al. (2018) data to our data (similar to Fig. S3c). f. UMAP with metacell assignment (similar to Fig S3d) for 4-12 hpf data. g. UMAPs based on ChIC and RNA signal (same as Fig 3a) colored by manually assigned “germ layer” annotation.

Comparison of genomic distribution of H4K4me1 and H3K27me3 with time
a. Genome tracks showing H3K4me1 enrichment (blue) and H3K27me3 enrichment (red) on a randomly selected genomic region, demonstrating a high degree of overlap between the two. b. Total UMI counts per cell (similar to Fig 3b) on H3K27me3-unique enriched sites, and random 50-kb bins in the genome. Opposite to Fig 3b, H3K4me1 signal (top) on these regions goes down with time, while H3K27me3 signal remains unchanged. c. Histograms showing the ratio of H3K4me1 to H3K27me3 on (5 kb region centered at) promoters in each metacell. Metacells are sorted (top to bottom) and colored by their annotated germ layers. d. Correlation between histone modifications and gene expression (same as Fig 3c) in metacells, for promoters.

Derivation of latent time and lineages using RNA velocity on 4-12 hpf data
a. RNA velocity phase portraits showing spliced to unspliced count ratios of selected developmental genes. Thick red lines show the fits from the dynamical model (Bergen et al. 2020) used to derive transcription or degradation rates, while the dashed line shows the steady state. b. UMAPs colored by the confidence per cell for global RNA velocity signal extracted from the top 1000 genes. c. Vector field diagrams showing inferred differentiation trajectories projected over UMAP. d-e. UMAPs showing the latent time per cell inferred using cellrank (combining RNA-velocity and diffusion pseudotime), compared to the known real time of embryo collection.

Examples of TFs with predicted activation and repression functions and their epigenetic regulation
a. UMAPs showing expression and motif activities supporting the predicted functions of TFs (similar to Fig. 4b). b. UMAPs showing H3K4me1 and H3K27me3 levels on TF gene bodies, supporting the predicted chromatin regulation status of TFs (similar to Fig. 4c). c. Expression, Activity and the (average) expression of target genes for two activators (ctcf, tbxta) and two repressors (tcf71la, zeb1b). Target genes are identified as genes nearest to the H3K4me1 peaks associated with the selected TF motif.