Figures and data

A conserved and neuroectoderm-specific long-range contact between a distant enhancer and the SOX2 locus.
(A) Genomic organization of a 1 Mb domain of the SOX2 TAD in human NSCs. 4C-sequencing with a SOX2 bait, ATAC-sequencing, and H3K27ac and H3K27me3 ChIP-sequencing are shown. (B) Comparison of hESC and NSC HiC sequencing, ChIP-sequencing for CTCF, H3K27ac, and H3K27me3, RNA-sequencing transcript reads, and virtual 4C-sequencing visualizations (derived from HiC sequencing) with a viewpoint of either the SOX2 gene promoter or the putative enhancer region. See also Figure S1A,B. (C) Genome browser view of HS1332 and adjacent HS1332b sequences corresponding to the putative enhancer in the human genome with H3K27ac ChIP-sequencing and ATAC-sequencing from NSCs, PhyloP score for 100 vertebrate genomes, and view of conserved sequences (green bars) for six selected species. See also Figure S1C. (D) Comparison of mouse ESC and NSC HiC sequencing, ChIP-sequencing for CTCF, H3K27ac, and H3K27me3, and virtual 4C-sequencing visualizations (derived from HiC sequencing) with a viewpoint of either the SOX2 gene promoter or the putative enhancer region. (E) Schematic demonstrating the hypothesis tested.

Disruption of HS1332 and the associated 3D chromatin loop impairs neuralization, SOX2 expression, and neuronal differentiation
(A) Graphical representation of targeted deletion of either the HS1332 enhancer or the CTCF nMotif. (B) Gel electrophoresis of PCR products of the HS1332-containing domain in ROSA26 vs ΔHS1332 hESCs (top), or the nMotif-containing domain in ROSA26 vs ΔnMotif hESCs (bottom). Additional information is found in Figures S2 and S3. (C) Representative immunofluorescence images for the hESC markers SOX2 (red), OCT4 (green), and NANOG (teal) across the ROSA26, ΔHS1332, and ΔnMotif conditions at the hESC stage. Nuclei were counterstained with DAPI. (D) Graphical representation of embryoid body (SFEB) formation experiments highlighting the emergence of GFP expression from the Hes5::GFP reporter (top). Representative images from ROSA26, ΔHS1332, and ΔnMotif SFEBs (bottom left). The bar graph (bottom right) represents the quantification of GFP-positive SFEBs (n=3 biological replicates, one-way ANOVA F2,6=13.43, p<0.01; Tukey’s post hoc multiple comparison test: ROSA26 vs ΔHS1332 **p<0.01; ROSA26 vs ΔnMotif *p<0.05). (E) Graphical representation of directed in vitro differentiation to neural rosettes (top) and representative wide-field fluorescence images of each line (bottom left). The bar graph in the middle shows quantification of the percentage of GFP-positive (GFP+) cells after directed differentiation as measured via flow cytometry, normalized to the ROSA26 condition in each biological replicate (n=6 biological replicates, one-way ANOVA F2,15=10.76, p=0.0013; Tukey’s post hoc multiple comparison test: ROSA26 vs ΔHS1332 **p<0.01). The bar graph on the right shows mean GFP intensity in rosettes across experimental conditions (n=6 biological replicates, one-way ANOVA F2,9=5.921, p=0.0228; Tukey’s post hoc multiple comparison test: ROSA26 vs ΔHS1332 **p<0.01). ns, not significant. (F) qRT-PCR analysis shows no change in SOX2 transcript levels at the hESC stage (n=3 biological replicates, one-way ANOVA F2,7=0.1535, p=0.8605) and a decrease in SOX2 transcript in ΔHS1332 and ΔnMotif rosettes (n=4 biological replicates, ANOVA F2,6=11.11, p=0.0096; Tukey’s post hoc multiple comparison test: ROSA26 vs ΔHS1332 *p<0.05, ROSA26 vs ΔnMotif *p<0.05). ns, not significant. (G) Representative immunoblots for SOX2 and β-Actin show SOX2 downregulation in rosettes but not hESCs in the ΔHS1332 and ΔnMotif conditions (left). Bar graphs (right) show summary densitometry statistics for hESCs (n=7 biological replicates, one-way ANOVA F2,18=0.1062, p=0.8998; Tukey’s post hoc multiple comparison test: ROSA26 vs ΔHS1332 p>0.05; ROSA26 vs ΔnMotif p>0.05) and rosettes (n=6 biological replicates, one-way ANOVA F2,15=11.10, p=0.0011; Tukey’s post hoc multiple comparison test: ROSA26 vs ΔHS1332 *p<0.05, ROSA26 vs ΔnMotif ***p<0.001). ns, not significant. (H) Immunofluorescence microscopy before and after differentiation of rosettes to neurons (left) and astrocytes (right). (I) qRT-PCR for RBFOX3 transcript normalized to GAPDH following 2 weeks of neuronal differentiation with BDNF and ascorbic acid (n=4 biological replicates, one-way ANOVA F2,8=8.999, p<0.01; Tukey’s post hoc multiple comparison test: ROSA26 vs ΔHS1332 *p<0.05, ROSA26 vs ΔnMotif *p<0.05). (J) qRT-PCR for SLC1A2 transcript normalized to GAPDH following 2 weeks of astrocytic differentiation with 4% FBS (n=4 biological replicates, one-way ANOVA F2,8=7.469, p=0.0104; Tukey’s post hoc multiple comparison test: ROSA26 vs ΔHS1332 *p<0.05, ROSA26 vs ΔnMotif *p<0.05).

Loss of HS1332 or CTCF-mediated 3D organization of the TAD impairs neuroectodermal development in teratomas
(A) Graphical representation of teratoma formation experiments. (B) Immunofluorescence microscopic analysis of ROSA26, ΔHS1332, and ΔnMotif teratomas harvested after 6 weeks. Representative images are shown for GFP, SOX2, PAX6, HNF3B, and smooth muscle actin (SMA). (C) Quantification of teratoma immunofluorescence markers: GFP (n=4 biological replicates, one-way ANOVA F2,8=28.84, p=0.0002; Tukey’s post hoc multiple comparison test: ROSA26 vs ΔHS1332 **p<0.01, ROSA26 vs ΔnMotif ***p<0.001); PAX6 (n=5 biological replicates, ANOVA F2,13=6.030, p=0.0140; Tukey’s post hoc multiple comparison test: ROSA26 vs ΔHS1332 *p<0.05, ROSA26 vs ΔnMotif *p<0.05); SOX2 (n=4 biological replicates, ANOVA F2,11=4.934, p=0.0295; Tukey’s post hoc multiple comparison test: ROSA26 vs ΔHS1332 *p<0.05, ROSA26 vs ΔnMotif *p<0.05); HNF3B (n=5 biological replicates, ANOVA F3,16=5.297, p=0.0100; Tukey’s post hoc multiple comparison test: ROSA26 vs ΔHS1332 **p<0.01, ROSA26 vs ΔnMotif *p<0.05); SMA (n=3 biological replicates, ANOVA F2,6=0.09587, p=0.9099; Tukey’s post hoc multiple comparison test: ROSA26 vs ΔHS1332 p>0.05, ROSA26 vs ΔnMotif p>0.05). ns, not significant. (D) Single-cell RNA sequencing data from teratomas in UMAP representations. Clusters are labeled by cell identity (left). A comparison of ROSA26 vs ΔHS1332 (center) and ROSA26 vs ΔnMotif (right) clusters with color coding is shown. Epi, epithelium; CycProg, cycling progenitors; Prog, progenitors; MSC, mesenchymal stem cells; Fib, fibroblasts; Myofib, myofibroblasts. (E) Bar graph representing the proportion of cells sequenced in each condition falling into an ectodermal, endodermal, or mesodermal identity. (F) scRNA-seq UMAP representation in which only ectodermal clusters from the prior UMAP (D) were extracted and re-clustered with identities assigned (left). Comparisons of ectodermal sub-clustering in ROSA26 vs ΔHS1332 (center) and ROSA26 vs ΔnMotif (right) with color coding are shown.

Depletion of forebrain identity by excision of HS1332
(A) Gene Ontology (GO) terms depleted in ΔHS1332 and ΔnMotif vs ROSA26 teratomas in scRNA-seq analysis. (B) Violin plots of expression of FOXG1 (left) and EMX2 (right) transcripts across ectodermal clusters in scRNA-seq datasets. (C) Immunofluorescence microscopy of FACS-sorted GFP-positive, hESC-derived rosettes stained for FOXG1, PLZF, and SOX2. PLZF is a transcription factor marker of rosette progenitors.47,73 (D) Graphical representation of cerebral organoid generation from hESCs. (E) Organoids from the ROSA26, ΔHS1332, and ΔnMotif conditions at the 30-day (left) and 40-day (right) time points were stained for the forebrain markers FOXG1 and EMX1, the neuronal markers NeuN and TUJ1, SOX2, and GFP.

Effects of HS1332 excision and 3D chromatin loop perturbation on chromatin modifications and 3D architecture in the SOX2 TAD
(A) Genome browser view of H3K27ac ChIP-sequencing in the SOX2 TAD in hESCs, using an RPGC-normalized signal track, and in rosettes, using an input-normalized ppois signal track. (B) Genome browser view of RPGC-normalized H3K27me3 ChIP-sequencing in the SOX2 TAD in hESCs and rosettes. Additional data for peaks 1-3 are shown in Figure S6C. (C) 4C-sequencing with a SOX2 promoter bait in hESCs and rosettes. (D) Significant interacting peaks, as determined by the Benjamini-Hochberg multiple testing procedure (red), in 4C-sequencing data of rosettes from the ROSA26 and ΔnMotif conditions (*p<0.05).

Loss of SOX2 local chromatin organization affects SOX2 expression and neuroglial differentiation in mature NSCs
(A) Agarose gel electrophoresis of PCR product containing the nMotif domain in control (NT-gRNA) vs ΔnMotif NSCs (left). The Sanger sequencing results of the two PCR products are also shown (right). (B) qRT-PCR for SOX2 transcript in control and ΔnMotif NSCs (n=3 biological replicates, t-test **p<0.01). (C) Immunofluorescence microscopy for SOX2 and NESTIN in control vs ΔnMotif NSCs. (D) Growth/viability measurement by WST-8 absorbance assay in control and ΔnMotif NSCs (n=3 biological replicates, two-way ANOVA F3,6=32.1, p<0.01; Sidak’s multiple comparisons test **p<0.01 at indicated timepoints). (E) Immunofluorescence microscopy for TUJ1 and GFAP in neuronal and astrocytic differentiation of control and ΔnMotif NSCs. (F) qRT-PCR of neuronal transcript RBFOX3 and astrocytic transcript SLC1A2 after neuronal and astrocytic differentiation of control and ΔnMotif NSCs (n=3 biological replicates, t-test p=0.0091 and p=0.0171, respectively). (G) Genome browser view of the SOX2 TAD comparing the following assays in NeuN+ or NeuN− cells isolated from postmortem DLPFC in PsychENCODE: ATAC-sequencing, H3K27ac ChIP-sequencing, SOX2 promoter virtual 4C-sequencing, and HS1332 enhancer virtual 4C-sequencing. (H) HiC plots of the SOX2 TAD in NeuN+ or NeuN− cells of postmortem DLPFC show preservation of 3D architecture in adult neurons and glia.

gRNAs used for targeted CRISPR deletions

Primers for genotyping deletions

4C-sequencing primers: sequences containing 4C bait sequences and sequencing adapters

