Differentiation of the E12.5 mouse rV2 GABAergic and glutamatergic neurons based on transcriptome and chromatin accessibility

A. UMAP projection of rV2 lineage cells based on scRNA profiles.

B. Pseudotime (VIA) scores shown on the scRNA UMAP projection.

C. RNA expression of marker genes of progenitors and postmitotic precursors of GABAergic and glutamatergic neurons shown on the scRNA UMAP projection.

D. UMAP projection of rV2 lineage cells based on scATAC profiles.

E. Pseudotime (VIA) scores shown on the scATAC UMAP projection.

F. Inferred RNA expression of the marker genes of progenitors and postmitotic precursors of GABAergic and glutamatergic neurons shown on the scATAC UMAP projection (interpolated expression values).

G. Heatmap of scATAC and scRNA cluster similarity scores.

Accessibility of chromatin features in the Tal1 and Vsx2 loci during GABAergic and glutamatergic development.

A. Chromatin accessibility per cell group (normalized signal), Ensembl gene track (Genes), scATAC-seq features (Feat.), feature linkage to gene (Links; LinkPeaks abs(z-score) > 2) and nucleotide conservation (Cons.) within the +/-50 kb region around the Tal1 TSS. Violin plots on the right show the expression levels of Tal1 and Pdzk1ip1 mRNA per scATAC-seq cell group (indicated on the left).

B. Spline-smoothed z-score transformed heatmaps of chromatin accessibility at Tal1-linked scATAC-seq features (Tal1 cCREs) in the single cells of rV2 GABAergic and glutamatergic lineages with RNA expression of Tal1, Gata2, Gad1 and Slc17a6 (sliding window mean(width=6) smoothed) as column covariable. Cells on the x-axis are first grouped per cell group (top) and then ordered by the pseudotime (bottom) within each group.

C. Same as in A, but for the Vsx2 locus.

D. Accessibility of Vsx2 cCREs in the rV2 GABAergic and glutamatergic lineages, shown as in (B). The expression levels of Vsx2, Tal1, Gad1, and Slc17a6 are shown above the heatmaps.
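The cell ordering and smoothing described in (B) and (D) can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the edge-truncated window placement, the per-feature z-score, and the names `sliding_mean`, `zscore` and `order_cells` are assumptions made for the example, and the toy cells are invented.

```python
from statistics import mean, pstdev


def sliding_mean(values, width=6):
    """Sliding-window mean; the window is truncated at the edges (assumption)."""
    half = width // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i - half + width)
        out.append(mean(values[lo:hi]))
    return out


def zscore(values):
    """Z-score transform of one feature across cells (population SD)."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mu) / sd for v in values]


def order_cells(cells, group_order):
    """Group cells by cell group first, then sort by pseudotime within each group."""
    rank = {group: i for i, group in enumerate(group_order)}
    return sorted(cells, key=lambda c: (rank[c["group"]], c["pseudotime"]))


# Toy input (hypothetical cells, not data from the study).
cells = [
    {"id": "c1", "group": "GA1", "pseudotime": 0.8},
    {"id": "c2", "group": "PRO1", "pseudotime": 0.1},
    {"id": "c3", "group": "GA1", "pseudotime": 0.6},
    {"id": "c4", "group": "CO1", "pseudotime": 0.4},
]
ordered_ids = [c["id"] for c in order_cells(cells, ["PRO1", "CO1", "GA1"])]
```

In this sketch the ordering is applied first, so that the smoothing window runs over cells that are neighbours on the heatmap x-axis.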

TF binding in putative regulatory elements of Tal1 and the overlap between TFs interacting with other GABAergic fate selector genes.

A. scATAC features within +/-50 kb of the Tal1 gene. Tal1 cCREs are shown in blue. CUT&Tag: consensus peaks indicating Tal1, Gata2, Gata3, Vsx2, Ebf1 and Insm1 binding in the E12.5 mouse r1. No Tead2 CUT&Tag consensus peaks were located in this region.

B-D. Footprint analysis of the features at +1 kb, +23 kb and +40 kb of the Tal1 TSS. Footprint scores at conserved TFBSs are shown for progenitors (PRO1-2), common precursors (CO1-2), GABAergic precursors (GA1-2) and glutamatergic precursors (GL1-2). In each dotplot, the strength of footprint at TFBS in the feature is shown as colour (Footprint score, average of cell group) and the expression of the TF gene in dot size (log1p). Average feature accessibility in cell groups is shown at the right. TFBS names (Hocomoco v12) are shown at the top and the TF gene names (mouse) are shown under the dotplot.

The red arrowhead in (B) indicates the conserved Gata2 TFBS at –37 bp position required for the neural expression of Tal1 (see also Supplementary Table 4).

E. Overlap of the TFs with footprints in the cCREs of Tal1, Gata2, and Gata3 in the common precursors of rV2 lineages (CO1-2) and in the rV2 GABAergic precursors (GA1-2). Venn diagrams show the number of TFs with an scATAC footprint at a conserved TFBS in Tal1, Gata2, or Gata3 cCREs and with gene expression (log1p) > 1.2 in the analysed cell group (Exp(TF)>1.2; Footprint in cCRE=1; TFBS cons>0.5). The TFs associated with the cCREs of all three selector genes in common precursors (CO1-2) and GABAergic precursors (GA1-2) are listed. Blue text: 19 TFs that interact with cCREs of Tal1, Gata2 and Gata3 in the CO1-2 cells. Some of these TFs continue to be expressed and interact with the Tal1, Gata2 and Gata3 genes in the GA1-2 cells. In total, 22 TFs interact with Tal1, Gata2 and Gata3 in GA1-2 cells. Green text: TFs associated with two selector genes in CO1-2 (green in the Venn diagram of CO1-2) and with all three selector genes in GA1-2. Black text: TFs co-regulating the selector TFs in GA1-2 and not expressed in the CO1-2 cells. The TFs regulating both GABAergic and glutamatergic selectors are marked in bold.

§ The probability of finding n overlapping genes, considering all mouse genes equally, is p<1e-6.

*,** The collective minimum statistical significance of feature-to-gene links for selector genes Tal1, Gata2 and Gata3 cCREs for the given TF is shown as: *p-value<0.05; **p-value<0.01 (with LinkPeaks z-score above 2 or below -2).
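The selection rule behind the Venn diagrams in (E) (Exp(TF)>1.2; Footprint in cCRE=1; TFBS cons>0.5) can be sketched as a filter followed by a set intersection. This is an illustrative sketch only: the record layout, the function names `tfs_for_gene` and `shared_tfs`, and the placeholder names `TF_A` to `TF_C` are hypothetical and do not come from the study.

```python
def tfs_for_gene(records, gene, min_expr=1.2, min_cons=0.5):
    """TFs passing the expression, footprint and conservation thresholds for one gene."""
    return {
        r["tf"]
        for r in records
        if r["gene"] == gene
        and r["expr"] > min_expr      # TF expression (log1p) in the cell group
        and r["footprint"] == 1       # footprint called in a cCRE of the gene
        and r["cons"] > min_cons      # conservation at the TFBS
    }


def shared_tfs(records, genes):
    """TFs associated with the cCREs of every gene in `genes`."""
    return set.intersection(*(tfs_for_gene(records, g) for g in genes))


# Toy records (hypothetical TF names and values).
records = [
    {"tf": "TF_A", "gene": "Tal1", "expr": 2.0, "footprint": 1, "cons": 0.9},
    {"tf": "TF_A", "gene": "Gata2", "expr": 2.0, "footprint": 1, "cons": 0.7},
    {"tf": "TF_A", "gene": "Gata3", "expr": 2.0, "footprint": 1, "cons": 0.8},
    {"tf": "TF_B", "gene": "Tal1", "expr": 1.5, "footprint": 1, "cons": 0.9},
    {"tf": "TF_B", "gene": "Gata2", "expr": 1.5, "footprint": 1, "cons": 0.4},
    {"tf": "TF_C", "gene": "Gata3", "expr": 0.8, "footprint": 1, "cons": 0.9},
]
common = shared_tfs(records, ["Tal1", "Gata2", "Gata3"])
```

Here `TF_B` drops out because its Gata2 site falls below the conservation threshold, and `TF_C` because its expression is below 1.2.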

Expression patterns of transcription factors (TFs) associated with Tal1 candidate cis-regulatory elements (cCREs) in the developing anterior brainstem.

A–C. mRNA in situ hybridization with the indicated probes on transverse paraffin sections of E12.5 wild-type mouse embryos. A scheme (top left) illustrates the rV2 domain; the boxed area corresponds to the region shown in the images. Dashed lines indicate the ventricular surface (VS), marking the apical border of the ventricular zone (VZ). Dashed boxed areas indicate regions shown in the zoom-in panels.

A. mRNA detection of Tal1 (a1), Insm1 (a2) and Sox4 (a5). Overlays show the merged Tal1/Insm1 (a3 and a4), and merged Tal1/Sox4 (a6 and a7) signal.

B. mRNA detection of Tal1 (b1), E2f1 (b2) and Tead2 (b5). Overlays show merged Tal1/E2f1 (b3 and b4) and merged Tal1/Tead2 (b6 and b7) signal.

C. mRNA detection of Tal1 (c1) and Ebf1 (c2), and merged Tal1/Ebf1 (c3 and c4) signal. Arrowheads point to the co-localization of probes. The co-localization signal is mostly cytoplasmic. Scale bars: 25 μm in zoom-ins (a4, a7, b4, b7, c4), others 50 μm.

D-F. Quantification of the mRNA in situ hybridization signal. Example of StarDist ROIs overlaid with DAPI staining is shown in (D). The fluorescence intensity per cell is plotted along the apical–basal differentiation axis. Arrows indicate the apical-to-basal axis. In all plots, the y-axis represents average fluorescence intensity per cell; the x-axis represents the distance from the ventricular surface (VS) using average cell diameter as the unit. Lines represent the mean of three biological replicates; grey areas denote 95% confidence intervals.

G–I. Violin plots showing mRNA expression, inferred from scRNA-seq, in the rV2 neuronal populations. The pseudotime order of the GABAergic and glutamatergic lineage cell groups is shown by the branched arrows.

Dynamic accessibility of TF binding sites suggests a genome-wide function for Gata2, Tal1 and Gata3 in the selection of the GABAergic fate.

A. Genome-wide enrichment of TF binding sites in the accessible regions in GABAergic and glutamatergic lineages. Heatmap of chromVAR z-scores (avg(TFBS motifs per TF)) for the expressed TFs in rV2 cell groups. All TFs for which a TFBS-motif chromVAR score was among the top 10 scores in any cell group are shown.

B. Violin plots of the TF gene expression (average scRNA expression) for Tal1, Gata2, and Gata3.

C. Motif accessibility (chromVAR score) of Tal1, Gata2, and Gata3 HOCOMOCO v12 TFBS motifs in the rV2 cell clusters. HOCOMOCO v12 contains three Tal1 motifs, two Gata2 motifs and two Gata3 motifs. Ordering of cell groups is the same in all violin plots.

Genomic targets of the GABAergic selector TFs.

A. Schematic explaining the strategy of identifying the targets of Tal1, Gata2 and Gata3 selector TFs. Within the TAD containing a gene, features overlapping a CUT&Tag peak for the selector TF, and a footprint for the TF at a position with weighted mean conservation score > 0.5 are identified. Genes linked to features fulfilling these conditions are considered target genes. For linkage, the Spearman correlation-based LinkPeaks score between the feature targeted by the selector TF and expression of the gene is required to be >2 (positive effect link) or <-2 (negative effect link) with a p-value <0.01.

B. Number of target genes and the overlap between Gata2, Gata3 and Tal1 target genes.

C. GSEA of Tal1 targets. Mouse genes are ranked by the difference in the expression in GA1-2 vs GL1-2 cell groups (log2 avg FC). Tal1 target genes are indicated with black lines. In leading edges, the GA1-2-enriched target genes are highlighted in blue and the GL1-2-enriched genes in red. Scatterplots show the expression of the target genes in both edges.

D-F. Characterization of Tal1 target genes.

D. Count of target features by the positive and negative effect link.

E. Count of target genes by the nearest linked feature, stratified by the nearest feature distance bins as indicated.

F. The variability of target gene expression in rV2 lineage cell clusters, stratified by the nearest feature distance bins.

G. Top terms in CellMarker gene set database using the list of Tal1 target genes with exp>0.5 (log1p) in GA1-2 or GL1-2 cell groups or both.

H. GSEA of Gata2 targets, as in (C).

I-L. Characterization of Gata2 target genes, as in (D-G).

M. GSEA of Gata3 targets, as in (C).

N-Q. Characterization of Gata3 target genes, as in (D-G).
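The target-gene criteria summarized in (A) can be written out as a simple filter. The sketch below is a hedged illustration of those stated thresholds (CUT&Tag peak overlap, footprint at a position with conservation > 0.5, LinkPeaks score > 2 or < -2, p-value < 0.01); the record fields, the function names `classify_link` and `target_genes`, and the gene names `geneA` to `geneE` are hypothetical.

```python
def classify_link(link, min_cons=0.5, z_cut=2.0, p_cut=0.01):
    """Return 'positive', 'negative' or None for one feature-to-gene link."""
    if not link["in_gene_tad"]:        # feature must lie within the TAD of the gene
        return None
    if not link["cuttag_peak"]:        # feature must overlap a CUT&Tag peak of the TF
        return None
    cons = link["footprint_cons"]      # conservation at the footprint, None if no footprint
    if cons is None or cons <= min_cons:
        return None
    if link["pvalue"] >= p_cut:
        return None
    if link["zscore"] > z_cut:
        return "positive"
    if link["zscore"] < -z_cut:
        return "negative"
    return None


def target_genes(links):
    """Map each target gene to the set of link effects supporting it."""
    targets = {}
    for link in links:
        effect = classify_link(link)
        if effect is not None:
            targets.setdefault(link["gene"], set()).add(effect)
    return targets


# Toy links (hypothetical genes and values).
links = [
    {"gene": "geneA", "in_gene_tad": True, "cuttag_peak": True, "footprint_cons": 0.8, "zscore": 3.1, "pvalue": 0.001},
    {"gene": "geneB", "in_gene_tad": True, "cuttag_peak": True, "footprint_cons": 0.9, "zscore": -2.6, "pvalue": 0.004},
    {"gene": "geneC", "in_gene_tad": True, "cuttag_peak": False, "footprint_cons": 0.9, "zscore": 4.0, "pvalue": 0.001},
    {"gene": "geneD", "in_gene_tad": True, "cuttag_peak": True, "footprint_cons": 0.3, "zscore": 4.0, "pvalue": 0.001},
    {"gene": "geneE", "in_gene_tad": True, "cuttag_peak": True, "footprint_cons": 0.9, "zscore": 1.5, "pvalue": 0.001},
]
targets = target_genes(links)
```

Keeping the effect sign per gene makes it straightforward to count positive and negative effect links separately, as in panels (D), (I) and (N).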

Proposed gene regulatory network guiding the GABA- vs glutamatergic fate selection in the rV2.

The genes expressed in GABAergic cells (GA1-2) are in blue and the genes expressed in glutamatergic cells (GL1-2) in red. St18 is expressed in both GA1-2 and GL1-2 cell groups, but its expression in GL1-2 is weak. Arrows represent positive regulation and were drawn when the regulator and its target mRNAs were co-expressed in the same cell group. Blunt arrows represent negative regulation and were drawn when the regulator and the target expression were not found in the same cell group.

E12.5 mouse rhombomere 1 single-cell RNAseq.

A. UMAP plot of scRNA-seq clusters (left), and the expression level of Nkx6-1, Tal1, Slc17a6, Gad1, Nes and Phgdh RNA in the clusters (right).

B. The expression of GABA- and glutamatergic neuron markers in the UMAP of the ventrolateral r1 lineage (E12.5 rV2 scRNA-seq clusters).

rV2 progenitors (rV2 pro), common precursors (rV2 Co), GABAergic precursor (rV2 GABA) and glutamatergic precursor branches (rV2 Glut) are labelled and indicated with arrows.

E12.5 mouse rhombomere 1 single-cell ATACseq.

A. UMAP plot of scATAC-seq clusters (left), and the inferred expression level of Nkx6-1, Tal1, Slc17a6, Gad1, Nes and Phgdh RNA in the clusters (right).

B. The expression of GABA- and glutamatergic precursor markers in the UMAP of the ventrolateral r1 lineage clusters (E12.5 rV2 scATAC-seq clusters). Arrows indicate the clusters of rV2 lineage progenitors (rV2 PRO), common precursors (rV2 CO), GABAergic neuron (rV2 GABA) and glutamatergic neuron branches (rV2 Glut).

Comparison of the previously characterized enhancers of Tal1, Gata2, Gata3 and Vsx2 with the scATAC features identified in this study.

A. Tal1 enhancers, cCREs and scATAC features;

B. Gata2 enhancers, cCREs and scATAC features;

C. Gata3 enhancers, cCREs and scATAC features;

D. Vsx2 enhancers, cCREs and scATAC features.

At the genomic loci of the genes, the previously characterized enhancers are indicated in colour. Tissue-specificity is indicated in the legend at the side. The features identified in this study are shown on the ATAC features (E12.5 R1) track. The gene-linked features (cCREs) are labelled in blue text. The details and references of the enhancers shown are given in Supplementary Table 4.

Genomic features and feature accessibility at Gata2 and Gata3 gene loci.

A. Top, normalized scATAC-seq signal in the +/-50 kb region around the Gata2 TSS, in the rV2 lineage cell groups. Bottom, the Ensembl gene models (Genes), the linkage of features to the target gene (Gata2) (Links), the scATAC features (Feat.; cCREs are shown in blue), and by-nucleotide conservation of DNA across vertebrate species (Cons.) are shown. Violin plots show the distribution of Gata2 RNA expression (log1p) in the rV2 cell groups.

B. Smoothed heatmaps of the accessibility of the Gata2 cCREs in the rV2 GABAergic and glutamatergic cell lineages (GABA, GLUT). Cell group identities are shown on top of heatmaps. Cells are ordered by the pseudotime value (Pseudotime). RNA expression (log1p, smoothed with a sliding window mean(width=6)) of Gata2, Vsx2, Gad1 and Slc17a6 is shown above the heatmaps.

C. scATAC-seq signal in the rV2 lineage cell groups and genomic features in the +/-50 kb region around the Gata3 TSS, similar to (A). Violin plots show the Gata3 RNA expression (log1p) in single cells of rV2 cell groups. The Gata3 ATG is located +11 kb from the TSS; its position is indicated in the Genes view.

D. Smoothed heatmaps of the accessibility of the Gata3 cCREs in the rV2 GABAergic and glutamatergic cell lineages (GABA, GLUT) and the RNA expression of Gata3, Tal1, Gad1 and Slc17a6 above the heatmaps.

Comparison of the previously characterized enhancers of Tal1, Gata2, Gata3 and Vsx2 with the scATAC features identified in this study.

A. Stacked bar chart showing the count (n) of scATAC features overlapping a previously characterized enhancer (Known enhancer). Features linked to a gene (cCRE) and features not linked to a gene are counted separately. Violet colour shows the count of characterized enhancers within the gene TAD that do not overlap any scATAC feature. The details and references of known enhancers are given in Supplementary Table 4.

B. Bar chart showing the percentage of cCREs out of all features in gene TAD, the percentage of scATAC features overlapping enhancers and the percentage of enhancers overlapping all scATAC features and cCREs. The colours indicate the gene locus of features and the target gene of enhancers or cCREs.

TF signatures found by scATAC-footprinting.

A. Heatmap of the TF footprint scores (as avg(TFBS footprint scores) per TF) over the cell groups. Rows (y-axis) were hierarchically clustered using Euclidean distance and complete linkage.

Regulatory features of the glutamatergic fate selector Vsx2 and common regulators of Vsx2 and Vsx1.

A. Vsx2 gene with its protein-coding regions and associated chromatin features.

B. CUT&Tag with Gata2, Gata3, Tal1, Vsx2, Ebf1 and Insm1 antibodies in E12.5 mouse r1. Locations of CUT&Tag consensus peaks at the Vsx2 gene locus are shown.

C-E. scATAC footprint analysis of the Vsx2-associated features at +20.5 kb, -61 kb and -68.8 kb of the Vsx2 TSS. In each feature, the strength of footprint (averaged over the cell group shown at the side) at TFBS is shown in colour and the expression of the TF gene in dot size. Feature accessibility in cell groups is shown at the right (Accessibility). TFBS names are shown at the top and the TF gene names (mouse genes) are shown under the dotplot.

F. Venn diagrams of the overlap of the TFs interacting with the cCREs of Vsx1 and Vsx2 in the common precursors of rV2 lineages (CO1-2) and in the rV2 glutamatergic precursors (GL1-2). Common regulators of Vsx1 and Vsx2 in CO1-2 (n=18) and in GL1-2 (n=21) are listed next to the Venn diagrams. The 18 TFs regulating both Vsx1 and Vsx2 in CO1-2 are listed in blue text. The TFs regulating both glutamatergic and GABAergic selectors are marked in bold.

§ The probability of finding n overlapping genes, considering all mouse genes equally, is p<1e-6.

*,** The collective minimum statistical significance of feature-to-gene links for selector genes Vsx1 and Vsx2 cCREs for the given TF is shown as: *p-value<0.05; **p-value<0.01 (with LinkPeaks z-score above 2 or below -2).

CUT&Tag analysis of Tal1- and Vsx2-associated chromatin in the E12.5 mouse r1.

A. Heatmaps and profile plots of the Tal1 CUT&Tag signal intensity along the mouse genes in each replicate. Genes are scaled and the areas of 3 kb before transcription start site (TSS) and 3 kb after transcription end site (TES) are shown. Next to heatmaps, genome views showing examples of CUT&Tag signal, detected peaks in all replicates (n=4, Re1 - Re4) and consensus peaks (Cons peaks).

B. Heatmaps and profile plots of the Vsx2 CUT&Tag signal in E12.5 r1 samples (n=4, Re1-Re4).

C. Genome views of the CUT&Tag signal with the positive control antibody (anti-H3K4me3) and the negative control antibody (anti-IgG, Re1-Re4), in the same gene loci as shown in (A) and (B). The consensus peak track is not shown. Only two consensus peaks were found between Re1-Re4 of the IgG-treated samples.

D. Heatmaps and profile plots of the H3K4me3 CUT&Tag signal (n=1, Re1) and the CUT&Tag signal with anti-IgG antibody (n=4, Re1-Re4). Note that the scales on heatmaps are different.

CUT&Tag analysis of Gata2- and Gata3-associated chromatin in the E12.5 mouse r1.

A. Heatmaps and profile plots of CUT&Tag reads along the mouse genes (gene lengths scaled, and +/- 3kb), in Gata2 CUT&Tag samples. Genome views show the Gata2 CUT&Tag signal at Tal1, Lmo1 and Gata2 gene loci, the detected peaks in all replicates (n=4, Re1 - Re4) and consensus peaks (Cons peaks).

B. Heatmaps and profile plots of CUT&Tag reads along the mouse genes (gene lengths scaled, and +/- 3kb), in Gata3 CUT&Tag samples. Genome views show the Gata3 CUT&Tag signal at Tal1, Lmo1 and Gata2 gene loci, the detected peaks in all replicates (n=3, Re1 - Re3) and consensus peaks (Cons peaks).

C. Correlation heatmaps of the CUT&Tag sample correlation by reads, and by peaks.

D. Venn diagrams showing the peak count and peak overlap between the replicates.

CUT&Tag analysis of Ebf1-, Insm1- and Tead2-associated chromatin in the E12.5 mouse r1.

A-C. CUT&Tag of Ebf1- (A), Insm1- (B) and Tead2-associated (C) chromatin in E12.5 mouse r1. Heatmaps and profile plots show the CUT&Tag signal intensity along the mouse genes (gene lengths scaled, and +/- 3kb) in each replicate (n=2 for each TF, Re1 - Re2). Genome views show examples of the CUT&Tag signal and the detected peaks at indicated gene loci from all replicates and the defined consensus peaks (Cons peaks).

D. Correlation heatmaps showing the CUT&Tag sample correlation by peaks.

E. Venn diagrams showing the peak count and peak overlap between the replicates.

A. Footprint position density over length-normalized CUT&Tag peaks for each TF.

Normalised peak positions 0 - 1 refer to peak start - end positions.
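The length normalization mentioned above maps a footprint position to the interval 0 - 1 along its peak. A minimal sketch, assuming linear scaling between peak start and end and a simple equal-width histogram as a stand-in for the density; the function names `normalised_position` and `position_histogram` and the coordinates are hypothetical.

```python
def normalised_position(footprint_pos, peak_start, peak_end):
    """Scale a genomic position so that peak start maps to 0 and peak end to 1."""
    if peak_end <= peak_start:
        raise ValueError("peak_end must be greater than peak_start")
    return (footprint_pos - peak_start) / (peak_end - peak_start)


def position_histogram(positions, n_bins=10):
    """Count normalised positions in equal-width bins over [0, 1]."""
    counts = [0] * n_bins
    for p in positions:
        if 0.0 <= p <= 1.0:
            # A position of exactly 1.0 is placed in the last bin.
            idx = min(int(p * n_bins), n_bins - 1)
            counts[idx] += 1
    return counts


# Toy example: a footprint in the middle of a 100 bp peak.
mid = normalised_position(150, 100, 200)
```

Because every peak is rescaled to the same unit length, footprint positions from peaks of different widths can be pooled into one distribution per TF.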

Examples of definition of selector gene target features and target genes.

A. Example of corroborating evidence for Tal1, Gata2 and Gata3 regulating Lmo1.

The Cons track shows nucleotide conservation (binary, thresholded at >0.5). cCREs show features linked to Lmo1, and the footprint and CUT&Tag tracks show the locations of bound footprints, the CUT&Tag signal (C&T) and the CUT&Tag peak locations (C&T Peak) for the TF in question. The z-score and p-value of the feature linkage to the gene in the rV2 cell lineage are shown above.