Single-cell co-mapping reveals relationship between chromatin state and gene expression in early zebrafish development

  1. Vivek Bhardwaj  Is a corresponding author
  2. Alberto Griffa
  3. Helena Viñas Gaza
  4. Peter Zeller
  5. Alexander van Oudenaarden  Is a corresponding author
  1. Hubrecht Institute-KNAW (Royal Netherlands Academy of Arts and Sciences), Oncode Institute, Netherlands
  2. University Medical Center Utrecht, Netherlands
5 figures and 5 additional files

Figures

Figure 1 with 2 supplements
Whole-organism T-ChIC of zebrafish embryos quantifies full-length transcripts and histone modifications from the same single cell.

(a) woT-ChIC experimental workflow: (1) single cells from different time points are labeled with a combination of CellTracer dyes and permeabilized either to retain cytoplasmic RNA (whole-cell), or exclude it (nuclei) before antibody + pA-MNase incubation. (2) Cells are pooled and sorted in 384-well plates with position-indexed RNA and ChIC barcodes. Addition of Ca++ activates the MNase to cut on the target regions. (3) DNA and RNA fragments are ligated and repaired before re-pooling them for IVT and PCR amplification to produce sequencing libraries. (b) UMAP projections of single cells using signals from H3K27me3 (left) and transcriptome (right) and colored by timepoints. The six sampled time points (right) are pooled into three groups (early/middle/late) based on the complexity of H3K27me3 signal. (c) Single-cell track plot showing signal on the 450 kb region around the hoxc gene cluster. The heatmaps show signals (read counts) in single cells (capped from -1 to +1, where -ve signal shows RNA counts and +ve signal shows ChIC counts). The coverage tracks on top show the pseudo-bulk signal (blue: RNA, pink: H3K27me3). Publicly available bulk H3K27me3 datasets are shown for comparison (gray). (d) UMAP projections (based on transcriptome signal) showing gene-level normalized signal for RNA (blue) or H3K27me3 (pink) on two selected genes.

Figure 1—figure supplement 1
Quality control for the RNA and ChIC fraction.

(a) Unique Fragment Counts (UFIs) and genes detected based on the spliced (left) and unspliced (right) counts. Cells with woT-ChIC signal (in blue) show a trend comparable to the cells with only transcriptome signal (T-noChIC) (in gray). (b) UMAP (based on RNA), showing cells with multi-omic signal (blue) co-embed with cells containing only transcriptomic signal (gray). (c) Examples of cell type-specific miRNA expression detected by woT-ChIC’s whole transcriptome approach. (d) Unique ChIC counts (representing MN cuts) per bin against the number of detected bins, for 50 kb genomic bins. Cells passing the ChIC QC are colored in black, while the rest (QC Fail) cells are in gray. (e) Genome track plot showing the effect of QC-based cell filtering on signal enrichment. Top track (black) shows the signal after filtering of bad quality cells, while bottom (gray) track shows the filtered (bad quality) cells. (f) Unique RNA fragments (UFI) per cell, classified as “spliced” or “unspliced”, on pooled early (4–6 hpf), middle (8–12 hpf), and late (24 hpf) timepoints. (g) Unique ChIC counts (UMI) per cell on pooled early (4–6 hpf), middle (8–12 hpf), and late (24 hpf) timepoints. Mb; megabases. Chr; chromosome. UMI; unique molecular identifier. UFI: unique fragment identifier.

Figure 1—figure supplement 2
Evaluation of the quantitative chromatin signal from woT-ChIC.

(a) Pearson correlation of pseudo-bulk H3K27me3 signal for pooled time points from our study with comparable time points from the published datasets. (b) Fraction of cumulative signal of H3K27me3 in bins, ranked lowest to highest. Pooled time points from our study (blue, orange, and green) show a stronger enrichment of H3K27me3 compared to the corresponding time points in public bulk data (red, purple, and brown). A uniformly distributed signal over the genome (no enrichment) would lead to a diagonal line. (c) 2D density plot comparing log-scaled H3K27me3 gene-body signal with gene expression for three genes showing decreasing (hoxc3a), increasing (pcdh1b), and transient (rfx4) silencing during development (from Figure 1c and d) in single cells. (d) UMAP showing the transient silencing of rfx4 by H3K27me3 specific to the non-neural ectoderm.

Figure 2 with 2 supplements
Spatio-temporal spreading of H3K27me3 correlates with gene silencing.

(a) UMAP projection of cells based on H3K27me3 signal (same as Figure 1b, left) indicating the different cell types annotated using the transcriptome signal. (b) An example genomic locus that demonstrates the cis-spreading of H3K27me3 signal (pink) around the zic3 gene with time during development. Note that apart from the zic3 gene, the spreading also correlates with a downregulation of the expression (blue) for the nearby gene (rbmx) with time. (c) Single-cell heatmap. Each row is a single cell selected from ectoderm lineage (34.5% of all cells) showing the average H3K27me3 signal for the top 100 genes detected by the linear model, showing increase in spreading of signal at the 100 kb region surrounding the center bin with pseudotime. The center bin was identified as the bin with non-zero signal in pluripotent cells. PT = pseudotime, RT = real time. (d) Line plots comparing H3K27me3 spreading and gene expression with pseudotime. The bottom panel shows the average fraction of bins that show H3K27me3 signal on two sets of genes (spreading, non-spreading) with time, while the top panel shows the average gene expression of these gene sets along pseudotime. (e) Heatmap of H3K27me3 signals across cells (grouped by cell type, top) for genes with cell type-specific demethylation at 24hpf. Genes (rows) are grouped by celltype in which they are demethelyated. CT = cell type; PT = pseudotime. The analyzed cell types are indicated in the right legend, while top legend shows all cell types (colors same as a). (f) Correlation of H3K27me3 fold-change with change in gene expression, for the selected cell types. Top 5 genes with H3K27me3 loss are labeled per cell type.

Figure 2—figure supplement 1
Annotation of H3K27me3-RNA woT-ChIC data (4–24 hpf).

(a) Heatmaps showing prediction score (based on CCA-MNN) between cell types annotated by Wagner et al., 2018 on single cells from our study. Leiden clusters of our cells (based on RNA) are shown on the left. (b) UMAP (based on RNA) with Leiden clusters for our data. (c) UMAP colored by prediction confidence for the transferred label from Wagner et al. (d) UMAPs showing meta cells (dark circles). Cells and metacells are colored by pseudotime. (e) UMAPs with final annotation of cell types (labels are the same as Figure 2a). (f) Bar plot showing proportion of cell types detected at 10 and 24 hpf within the four biological replicates of woT-ChIC experiments (other time points were not profiled by all four replicates). Colors correspond to the cell type labels (from e).

Figure 2—figure supplement 2
H3K27me3 shows cis-spreading with time.

(a) Plot showing the number of detected bins compared to the average counts per bin in single-cells for the three pooled timepoints. With time, the number of detected bins increases, while the average counts per bin are saturated. (b) Scatterplot showing the H3K27me3 signal in selected bins (i.e., bins with non-zero signal in pluripotent cells) compared to the bins nearby in metacells. Bins that show spreading of H3K27me3 (top panel) are correlated with nearby bins, while others (bottom) are not. (c) Single-cell heatmaps (similar to Figure 2c) showing the spread of signal with pseudotime for bins that were detected as spreading only in the ‘ectoderm’ lineage, within cells selected from mesodermal lineage (15% of all cells). (d) Pseudo-bulk track plots showing signals of pooled timepoints from our data (red) showing the spreading of certain peaks by 24 hpf, bottom bars show the domains (green) and sub-peaks (blue) detected for our data. Gray tracks show publicly available bulk datasets. (e) Alternative to (d), showing the quantification of H3K27me3 signal density (number of cuts/kb) in H3K27me3 peaks vs 100 Kb region around peaks, in pseudo-bulks of annotated cell types (each dot is a region). The correlation improves with more differentiated cells, indicating signal spreading with time. Cluster 1 (red), which contains the hox genes, is already broadly marked in pluripotent cells. (f) Heatmap showing signal localization around promoters (±50 kb region) of mouse genes for H3K27me3 ChIP-seq from Xiang et al., 2020. Wider regions around promoters show a relative increase in H3K27me3 upon differentiation in ectoderm, compared to E5.5 epiblast. (g) Same as Figure 2d (top) showing the expression of genes near, but not at H3K27me3 enriched bins with pseudotime. (h) Volcano plot showing results of the linear regression per peak with its nearest gene. Peaks with strongly correlated/anti-correlated signals between H3K27me3 and RNA are colored in red in the volcano plot. (i) UMAPs showing the H3K27me3 and RNA signal on top genes underlying the selected (red) peaks of (h).

Figure 3 with 3 supplements
Integrative analysis of H3K27me3, H3K4me1, and transcriptome.

(a) UMAP projection of the single cells based on the three modalities H3K27me3 (left), RNA (center), and H3K4me1 (right) after integration of the two batches and annotation of cells. (b) Total UMI counts per cell on H3K4me1 enriched regions from the integrated dataset, divided into H3K4me1-unique regions and regions co-enriched for H3K27me3. H3K4me1 signal (top panels) on these regions remains unchanged with time, while H3K27me3 signal (bottom panels) increases. (c) Correlation between histone modification signal over gene bodies and gene expression, per metacell. Metacells are ordered by latent time (X-axis) and the Pearson correlation coefficient (Y-axis) between H3K4me1 and RNA (top) and H3K27me3 and RNA (bottom). (d) Scatterplot showing H3K4me1 signal and unspliced RNA counts for all genes of the two selected metacells indicated in (c), early (1) and late (2) in latent time. Colors corresponds to the germ layer annotation of the metacell, as indicated in (c).

Figure 3—figure supplement 1
Quality control and comparison of H3K27me and H3K4me1 signal in nuclei.

(a) Unique RNA fragments (UFIs) recovered per cell, for ‘spliced’ and ‘unspliced’ reads per time point, for H3K27me3 and H3K4me1-targeted cells from the nuclei batch. (b) Number of genes detected versus unique RNA fragments (UFIs) recovered per cell, divided by targeted antibody (top: H3K27me3, bottom: H3K4me1). Nuclei are colored by time point. (c) Unique ChIC counts per nuclei divided by time point and targeted antibody. The amount of signal recovered for H3K27me3 increases with time, while the amount of H3K4me1 detected decreases. (d) Integrated UMAP embedding of for 4–12 hpf data based on RNA, colored by the target histone modification (left), prediction confidence of label transfer using Wagner et al., 2018 (center), and metacell assignment (right). (e) Integrated UMAPs based on ChIC and RNA signal (same as Figure 3a), colored by manually assigned ‘germ layer’ annotation.

Figure 3—figure supplement 2
Comparison of genomic distribution of H3K4me1 and H3K27me3 with time.

(a) Genome tracks showing H3K4me1 enrichment (blue) and H3K27me3 enrichment (red) on a randomly selected genomic region, demonstrating a high degree of overlap between the two. (b) Total UMI counts per cell (similar to Figure 3b) on H3K27me3-unique enriched sites, and random 50 kb bins in the genome. H3K4me1 signal (top) on these regions goes down with time, while H3K27me3 signal remains unchanged. (c) Histograms showing the ratio of H3K4me1 to H3K27me3 on (5 kb region centered at) promoters in each metacell. Metacells are sorted (top to bottom) and colored by their annotated germ layers. (d) Correlation (similar to Figure 3c) between histone modification signal over promoters and gene expression, per metacell.

Figure 3—figure supplement 3
Derivation of latent time and lineages using RNA velocity on 4–12 hpf data.

(a) RNA velocity phase portraits showing spliced to unspliced count ratios of selected developmental genes. Thick red lines show the fits from the dynamical model (Bergen et al., 2020) used to derive transcription or degradation rates, while the dashed line shows the steady state. (b) UMAPs colored by the confidence per cell for global RNA velocity signal extracted from the top 1000 genes. (c) Vector field diagrams showing inferred differentiation trajectories projected over UMAP. (d, e) UMAPs showing the latent time per cell inferred using cellrank (combining RNA-velocity and diffusion pseudotime) compared to the known real time of embryo collection.

Figure 4 with 1 supplement
Prediction of TF activity using TF epigenetics and transcription.

(a) Schematic and outputs of the prediction model. (1) (top) Normalized H3K4me1 and H3K27me3 signal on the TF locus, TF (spliced) RNA signal, and indicators of cell state (pseudotime and lineage) are used to predict TF activity (bottom). (2) The lasso regression model is used to select the most useful predictors for each TF. The bottom right plot shows the TFs sorted by the R2 values on the independent test dataset. (3) Coefficients from the final models are ranked and compared to categorize the TFs. The number of TFs classified as activator/repressor, or regulated/independent is shown in the right plot. (b) UMAPs showing the expression and motif activities of TFs classified as ‘activators’ or ‘repressors’ using the above model. TF expression (left) is based on (normalized) spliced RNA and TF activity (right) is based on H3K4me1 signal on TFBS. For activators, the motif activities on TFBS are gained in cells where the TFs are expressed, while for the repressors, the motif activities are lost in the cells expressing the TF. (c) UMAPs show (normalized) H3K4me1 signal and H3K27me3 signal on TF gene body, for TFs classified as ‘regulated’ or ‘independent’. The histone modifications on regulated TFs are correlated with their activities.

Figure 4—figure supplement 1
Examples of TFs with predicted activation and repression functions and their epigenetic regulation.

(a) UMAPs showing expression and motif activities supporting the predicted functions of TFs (similar to Figure 4b). (b) Expression, activity, and the (average) expression of target genes for two activators (ctcf, tbxta) and two repressors (tcf71la, zeb1b). Target genes are identified as genes nearest to the H3K4me1 peaks associated with the selected TF motif. (c) UMAPs showing H3K4me1 and H3K27me3 levels on TF gene bodies, supporting the predicted chromatin regulation status of TFs (similar to Figure 4c).

Author response image 1

(1) (top) expression of tbx16, which was one of the common TFs detected in our study and also targeted by Saunders et al by CRISPR. tbx16 expression is restricted to presomitic mesoderm lineage by 12hpf, and is mostly absent from 24hpf cell types. (bottom) shows DE genes detected in different cellular neighborhoods (circled) in tbx16 crispants from 24hpf subset of cells in Saunders et al. None of these DE genes were detected as “direct targets” in our analysis and therefore seem to be downstream effects. (2) Effect of 3 different concentrations of EZH2 inhibitor (GSK123) on global H3K27me3 quantified by flow cytometry using fluorescent coupled antibody (same as we used in T-ChIC) in two replicates. The cells were incubated between 3 and 10 hpf and collected afterwards for this analysis. We observed a small shift in H3K27me3 signal, but it was inconsistent between replicates.

Additional files

Supplementary file 1

Number of cells acquired per timepoint, mark and batch.

https://cdn.elifesciences.org/articles/110400/elife-110400-supp1-v1.xlsx
Supplementary file 2

Comparison of median counts and detected genes per cell.

https://cdn.elifesciences.org/articles/110400/elife-110400-supp2-v1.xlsx
Supplementary file 3

Genes with differential H3K27me3 signal between selected 24hpf cell types and others.

https://cdn.elifesciences.org/articles/110400/elife-110400-supp3-v1.xlsx
Supplementary file 4

Classification of TFs based on the results of the penalized regression model.

https://cdn.elifesciences.org/articles/110400/elife-110400-supp4-v1.xlsx
MDAR checklist
https://cdn.elifesciences.org/articles/110400/elife-110400-mdarchecklist1-v1.docx

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Vivek Bhardwaj
  2. Alberto Griffa
  3. Helena Viñas Gaza
  4. Peter Zeller
  5. Alexander van Oudenaarden
(2026)
Single-cell co-mapping reveals relationship between chromatin state and gene expression in early zebrafish development
eLife 15:RP110400.
https://doi.org/10.7554/eLife.110400.2