Figures and data

Promoter Capture Hi-C identifies long-range regulatory elements in cerebellar granule cell progenitors (GCps).
(A) Schematic of the number of significant interactions (in grey) with a CHiCAGO score ≥5 identified in our pcHi-C datasets following each data filtering step (in ovals). Two independent replicate pcHi-C libraries were pooled to create a ‘Superset’. Following filtering, 106,589 cis protein-coding bait-to-otherEnd interactions (referred to as promoter interacting fragments or PIFs) were taken forward for downstream analyses. (B) Graph showing the cumulative percentage of significant interactions detected with increasing genomic distance from the promoter capture baits. The first blue dotted line indicates the median interaction distance of cis PIFs (∼215kb) and the second blue dotted line indicates that 90% of interactions fell within 6711kb. (C-F) Bar charts illustrating the overlap of PIFs with (C) H3K4me1 ChIP-seq peaks, (D) H3K27ac ChIP-seq peaks, (E) heterochromatin regions defined by Ensembl regulatory annotations for brain (E14.5) and (F) FANTOM5 enhancers active in the cerebellum during p6-p9. Numbers above each graph indicate the enrichment (fold-change) of this overlap relative to 100 random subsets of HindIII fragments that have the same distribution of distances to the bait promoter regions as the significant PIFs. Error bars on random overlaps represent 95% confidence intervals from 100 permutations. (G) Top plot shows the number of total normalised reads detected for each HindIII fragment interacting with the restriction fragment encompassing the Zic1 locus in the pcHi-C dataset. Pink indicates the locations of PIFs identified as ‘significant’ with a CHiCAGO score ≥5. Genome tracks show ATACseq (turquoise) peaks, and H3K27ac (pink) and H3K4me1 (navy) ChIPseq peaks for the same genomic region.
(H) Plots illustrating the proportion of all PIFs (left) or PIFs containing VISTA hindbrain enhancers (right) that interact only with the promoter of their nearest protein-coding gene (light green), that interact with the promoter of their nearest protein-coding gene as well as promoters of distal protein-coding genes (mid-green), or that interact with one or more distal promoters but not with that of their nearest protein-coding gene (blue). Note that the majority of interactions identified are long-range and not with the nearest promoter. (I-K) Examples of interactions detected in our dataset for experimentally validated VISTA hindbrain enhancer regions showing (I) a VISTA enhancer that interacts with its nearest protein-coding gene promoter, Ntrk2, (J) a VISTA enhancer interacting with its nearest protein-coding gene promoter (Tubb2) as well as promoters for the distal genes Bphl and Ripk1 and (K) a VISTA enhancer that does not interact with its nearest protein-coding promoter but does interact with the distal promoter for Kmt2d. Insets show reporter activity for these VISTA enhancers in E11.5 or E12.5 mouse embryos (from the VISTA enhancer site). Interacting HindIII fragments are shown in black with black arcs indicating fragments that localise together in the nucleus. Locations of VISTA enhancer regions are shown in orange. Ensembl gene annotations are shown in dark navy, followed by H3K27ac (pink) and H3K4me1 (navy) ChIPseq data.
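
The fold-enrichment described for panels (C-F) compares the observed overlap of PIFs with a feature against random HindIII fragments matched for distance to the bait. A minimal sketch of one way such a distance-matched permutation could be set up is given below. It is not the authors' implementation: the function names, the distance bin edges, the toy fragment identifiers and the choice to match distances by binning are all illustrative assumptions.

```python
import random
from bisect import bisect_right


def distance_bin(distance, edges):
    """Return the index of the distance bin that `distance` falls into."""
    return bisect_right(edges, distance)


def enrichment(pifs, background, feature_ids, edges, n_perm=100, seed=0):
    """Fold enrichment of feature overlap in PIFs over distance-matched draws.

    pifs, background: lists of (fragment_id, distance_to_bait) tuples.
    feature_ids: set of fragment ids that overlap the feature of interest.
    edges: sorted distance bin edges (hypothetical values, chosen by the user).
    """
    rng = random.Random(seed)
    # Group background fragments by distance bin so draws can be matched per bin.
    pool = {}
    for frag, dist in background:
        pool.setdefault(distance_bin(dist, edges), []).append(frag)
    # Count how many PIFs fall into each distance bin.
    need = {}
    for _, dist in pifs:
        b = distance_bin(dist, edges)
        need[b] = need.get(b, 0) + 1
    observed = sum(frag in feature_ids for frag, _ in pifs)
    random_counts = []
    for _ in range(n_perm):
        count = 0
        for b, k in need.items():
            # Draw as many background fragments from this bin as there are PIFs in it.
            draw = rng.sample(pool[b], k)
            count += sum(frag in feature_ids for frag in draw)
        random_counts.append(count)
    mean_random = sum(random_counts) / n_perm
    fold = observed / mean_random if mean_random else float("inf")
    return observed, mean_random, fold


# Toy data (entirely made up) with two distance bins split at 100 kb.
edges = [100_000]
background = [(f"bg{i}", 50_000 if i % 2 else 500_000) for i in range(200)]
feature_ids = {f"bg{i}" for i in range(0, 200, 5)} | {"pif0", "pif1", "pif2"}
pifs = [("pif0", 20_000), ("pif1", 30_000), ("pif2", 400_000), ("pif3", 600_000)]
observed, mean_random, fold = enrichment(pifs, background, feature_ids, edges)
```

The sketch requires each distance bin to contain at least as many background fragments as PIFs. Whether PIFs themselves are excluded from the background pool is not stated in the legend and is left to the caller here.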

The majority of promoter interacting fragments (PIFs) are within 5kb of a PIF that contains an accessible ATAC peak.
Plots illustrating the number and proportions of (A) PIFs containing an accessible ATAC peak and (B) promoter-PIF interactions with a PIF containing an accessible ATAC peak. Cumulative percentage graphs showing the genomic distance (C) and the number of HindIII restriction fragments (D) separating PIFs without accessible regions (without ATAC peak) from their nearest accessible PIF (PIF with ATAC peak). The blue dotted line in (C) indicates that the median distance between a PIF without an ATAC peak and a PIF with an ATAC peak is 4.5 kb, whilst 66% of PIFs without an ATAC peak are within 5 genomic HindIII fragments of an accessible PIF (D). (E) PIF and accessibility status of the 10 upstream and downstream genomic HindIII restriction fragments surrounding the ‘lead PIF’ with the highest and most significant CHiCAGO score for each captured promoter. The central point of the heatmaps and coverage plots shows the lead PIF (fragment 0). Heatmaps show the PIF status (left, green if the genomic fragment is a PIF, white if not), ATAC status (middle, purple if the genomic fragment contains an ATAC peak, white if not) and combined PIF and ATAC status (right, orange if the genomic fragment is identified as a PIF and contains an ATAC peak, white if not). Each row represents the lead PIF for a different captured promoter, with rows ordered by descending CHiCAGO score, i.e. most significant PIF at the top. Coverage plots at the top of each heatmap indicate the overall proportion of fragments flanking the lead PIF on either side that are also designated as PIFs (left, green), contain ATAC peaks (middle, purple) or are PIFs with an ATAC peak within them (right, orange). (F) Genomic location of ATAC peaks within and outside PIFs. (G) Boxplot showing the distributions of the average expression levels (TPM) of genes with promoters in proximity to PIFs containing ATAC peaks compared to promoters with PIFs that do not contain accessible regions.
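
The distance summary in panel (C) can be illustrated with a short sketch: for each PIF without an ATAC peak, find the closest PIF with an ATAC peak, then report the median distance and the fraction within a chosen cutoff. The function names, the use of fragment midpoints as positions and the toy coordinates below are assumptions for illustration only. The legend does not state how the distance was measured, and a real analysis would be run per chromosome.

```python
from bisect import bisect_left
from statistics import median


def nearest_distance(position, sorted_positions):
    """Distance from `position` to the closest value in `sorted_positions`."""
    i = bisect_left(sorted_positions, position)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = sorted_positions[max(i - 1, 0):i + 1]
    return min(abs(position - p) for p in candidates)


def summarise(inaccessible, accessible, cutoff):
    """Median nearest distance and fraction of inaccessible PIFs within `cutoff`."""
    acc = sorted(accessible)
    dists = [nearest_distance(p, acc) for p in inaccessible]
    within = sum(d <= cutoff for d in dists) / len(dists)
    return median(dists), within


# Toy midpoints (bp) on a single hypothetical chromosome.
accessible_pifs = [1_000, 10_000, 50_000]
inaccessible_pifs = [2_000, 9_000, 30_000, 52_000]
median_distance, fraction_within = summarise(inaccessible_pifs, accessible_pifs, cutoff=5_000)
```

Sorting the accessible positions once and using a binary search keeps the lookup fast even for tens of thousands of fragments.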

The chromatin remodeller CHD7 regulates gene expression in GCps by promoting chromatin accessibility at distal enhancer elements.
(A) Venn diagram showing the number of ATAC accessible chromatin peaks found within PIFs (30,296 – pink), with altered accessibility in Chd7-deficient GCps (5369 – green), that overlap CHD7 ChIPseq peaks (4016 – blue) and the overlap of these subsets. 210 promoter-proximal ATAC peaks with CHD7 recruitment and which change accessibility in Chd7-deficient GCps are considered direct CHD7 targets. (B) Heatmap of direct CHD7 target genes which are differentially expressed in Chd7-deficient GCps and have at least one PIF which overlaps a CHD7 binding site that shows altered accessibility in Chd7-deficient GCps. Values depict z-scores of normalised, log2-transformed and scaled RNAseq data between p7 WT (n=2) and Chd7-deficient KO (n=2) GCps. (C) Genome tracks at the Reln locus (chr5:21843156-22533049) showing the location of three putative Reln enhancers with CHD7 binding signal and decreased accessibility in Chd7-deficient GCps identified 395kb, 394kb and 249kb downstream of the Reln promoter (within the two orange shaded regions). Promoter proximal PIFs are designated by arcs (grey for any PIF with significant CHiCAGO score ≥5, pink for PIFs that also contain an ATAC peak that overlaps with CHD7 binding signal). Regions of differential chromatin accessibility (p-adj < 0.05) are displayed in the interval track (ATAC DA) showing 500bp regions of decreased (green) and increased (purple) accessibility in Chd7-deficient GCps between the WT ATAC (n=3) and Chd7-deficient (KO) ATAC signal (n=3). Representative ATAC signal tracks for WT (green) and CHD7 KO (black) GCps are shown above the ChIPseq signal tracks for CHD7 (grey), H3K27ac (pink) and H3K4me1 (dark blue). (D) UpSet plot showing the predominant combinations of chromatin modifications present at ATAC peaks that overlap CHD7 ChIPseq peaks and change accessibility in Chd7-deficient GCps.
Numbers of ATAC peaks overlapping individual chromatin marks are shown in the smaller horizontal bars on the left of the plot; vertical bars of the main plot indicate the number of ATAC peaks with combinations of H3K27ac, H3K4me1 and H3K4me3 marks as denoted in the panel below the x-axis. Diagram illustrates CHD7-mediated chromatin remodelling (arrow). (E) Dot plot showing the number of ATAC peaks at different genomic locations (y-axis) for different combinations of chromatin marks (x-axis) from (D) for the direct CHD7 target sites that have ATAC peaks which change in accessibility in Chd7-deficient GCps and overlap CHD7 ChIPseq signal. (F) UpSet plots illustrating predominant chromatin mark combinations for ATAC peaks that change in accessibility upon Chd7-depletion but do not exhibit CHD7 recruitment. Diagram illustrates chromatin remodelling (arrow) by an unknown factor (?). (G) UpSet plots of ATAC peaks with CHD7 recruitment but no changes in accessibility upon Chd7-depletion. Diagram illustrates CHD7 presence without detectable chromatin remodelling. (H) Heatmap of significant SEA motif-enrichment results (p<0.05, q<0.05) for TAL-family transcription factors for genomic regions with ATAC peaks that are direct targets of CHD7 with CHD7 occupancy and change in accessibility in Chd7-deficient GCps (DA & CHD7), those that alter accessibility upon CHD7 depletion but do not display CHD7 occupancy in WT (DA no CHD7) and those which display CHD7 recruitment but do not alter accessibility upon CHD7 depletion (CHD7 no DA). White boxes denote no statistically significant enrichment and consensus motifs for TAL family members that show differential enrichment across the subsets are shown on the right. Also see Suppl. Fig. 1.
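
The three peak subsets named in (H) and the combination counts plotted in the UpSet panels (D, F, G) amount to simple set operations. The sketch below is illustrative only: it assumes peaks have already been given shared identifiers after interval overlap, and the function names, peak identifiers and mark assignments are hypothetical.

```python
from collections import Counter


def classify_peaks(da_peaks, chd7_peaks):
    """Split peaks into the three subsets named in the legend."""
    da, bound = set(da_peaks), set(chd7_peaks)
    return {
        "DA & CHD7": da & bound,   # differentially accessible and CHD7-bound
        "DA no CHD7": da - bound,  # differentially accessible, no CHD7 peak
        "CHD7 no DA": bound - da,  # CHD7-bound, accessibility unchanged
    }


def mark_combinations(peak_marks):
    """Count peaks per exact combination of marks (vertical bars of an UpSet plot)."""
    return Counter(frozenset(marks) for marks in peak_marks.values())


# Toy input (made-up peak ids and histone mark calls).
da_peaks = ["peak1", "peak2", "peak3"]
chd7_peaks = ["peak2", "peak3", "peak4"]
marks = {
    "peak1": {"H3K4me1"},
    "peak2": {"H3K27ac", "H3K4me1"},
    "peak3": {"H3K27ac", "H3K4me1"},
    "peak4": {"H3K4me3"},
}
groups = classify_peaks(da_peaks, chd7_peaks)
combos = mark_combinations({p: marks[p] for p in groups["DA & CHD7"]})
```

Using `frozenset` keys means each peak is counted once under its exact combination of marks, which is the convention UpSet plots use for their intersection bars.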

Atoh1-regulated distal regulatory elements and gene targets in primary GCps.
(A) Schematic Venn diagram of the number of accessible ATAC-seq sites in PIFs that overlap with accessible ATAC-seq sites that display Atoh1 occupancy in GCps. The 7406 accessible sites in PIFs that have Atoh1 occupancy are proximal to promoters of 7090 protein coding genes, 599 of which are differentially expressed in Atoh1-deficient GCps. (B) Volcano plot showing the magnitude of multiple testing corrected p-values and log2-fold expression changes of 599 protein coding genes in the Atoh1-deficient cerebellum that have been linked to distal regulatory regions with accessible sites that have Atoh1 occupancy. (C) Plot demonstrating the number and proportion of Atoh1 bound ATAC-seq sites that interact with their nearest promoter (263/1854), their nearest promoter and others (542/1854) and other promoters (1049/1854). (D) Genome tracks (chr9:90,467,566-92,242,745) showing the PIFs proximal to the Zic1 promoter. Promoter proximal PIFs are designated by arcs (grey for any PIF with significant CHiCAGO score ≥5, pink for PIFs that also contain an ATAC peak that overlaps with CHD7 binding signal, orange for PIFs that contain an ATAC peak that overlaps with an Atoh1 ChIP peak and blue for a PIF that contains an ATAC peak overlapping with both CHD7 and Atoh1 binding signals). The interval track below uses the same colour coding as the arcs (with the addition of black regions to show the captured promoter fragment) to show the width and location of the PIFs. A representative male ATAC signal track for WT (green) GCps is shown above the ChIP-seq signal tracks for Atoh1 (orange), CHD7 (grey), H3K27ac (pink) and H3K4me1 (dark blue).
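
The three categories reported in (C), which together account for all 1854 sites (263 + 542 + 1049), follow from comparing the promoters each site interacts with against its nearest promoter. A minimal sketch of that rule is shown below. The function name, category labels, site identifiers and gene names are placeholders, and how the nearest promoter is determined is assumed to have been settled upstream.

```python
from collections import Counter


def classify_site(interacting_genes, nearest_gene):
    """Assign a site to one of three categories based on its promoter contacts."""
    genes = set(interacting_genes)
    if genes == {nearest_gene}:
        return "nearest only"
    if nearest_gene in genes:
        return "nearest and others"
    return "others only"


# Toy input: site -> (genes whose promoters it contacts, nearest gene).
sites = {
    "site1": (["geneA"], "geneA"),
    "site2": (["geneA", "geneB"], "geneA"),
    "site3": (["geneC"], "geneB"),
    "site4": (["geneD", "geneE"], "geneF"),
}
category_counts = Counter(
    classify_site(genes, nearest) for genes, nearest in sites.values()
)
```

The sketch assumes every site contacts at least one promoter, which holds by construction for sites located within PIFs.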

Atoh1 is recruited to the majority of CHD7-regulated enhancers.
(A) Venn diagram demonstrating the total number of accessible chromatin sites in PIFs that also have Atoh1 recruitment (pink). Light blue shows the number of genomic locations that demonstrate altered accessibility in Chd7-deficient GCps, 751/1465 of which were in PIFs that demonstrate Atoh1 recruitment. Light green shows the number of ATAC sites with altered accessibility in Chd7-deficient GCps that also exhibit CHD7 recruitment in PIFs, 632/1132 of which were in PIFs that also demonstrate Atoh1 recruitment. Finally, 197 PIFs were identified that exhibit differential accessibility in Chd7-deficient GCps and to which both Atoh1 and CHD7 are recruited. (B) Genome tracks at the Gli2 locus (chr1:118,717,920-119,121,120) showing the location of three putative, promoter-proximal PIFs designated by arcs (grey for any PIF with significant CHiCAGO score ≥5, orange for PIFs that also contain an ATAC peak that overlaps with Atoh1 binding signal and blue for PIFs that contain accessible regions that overlap with regions of Atoh1 and CHD7 binding). Regions of differential chromatin accessibility (p-adj < 0.05) are displayed in the interval track (ATAC DA) showing 500bp regions of decreased (green) and increased (purple) accessibility in Chd7-deficient GCps between the WT ATAC (n=3) and Chd7-deficient (KO) ATAC signal (n=3). Representative ATAC signal tracks for WT (green) and CHD7 KO (black) GCps are shown above the ChIPseq signal tracks for Atoh1 (orange), CHD7 (grey), H3K27ac (pink) and H3K4me1 (dark blue). (C) Co-immunoprecipitation of ATOH1 and CHD7 in HEK293T cells, transiently transfected with FLAG-tagged CHD7, HA-tagged Atoh1, both or empty vector (EV). CHD7 was immunoprecipitated with anti-FLAG followed by western blot against FLAG and HA. Note the co-immunoprecipitation of HA-tagged ATOH1 upon FLAG-CHD7 immunoprecipitation.


Putative enhancer elements within PIFs exhibit enhancer activity in GCps.
(A) Table showing the genomic location, size and distance from the Zic1 TSS of two putative enhancer-containing fragments from PIF1 (Fig. 1G). (B) Diagram of the Zic1/4 locus with ATAC-seq, histone modification ChIP-seq and pcHi-C interactions shown as pink arcs. Note the location of PIF1, and the location of fragments #1.1 and #1.2 in the magnified view below. Note the overlaps with VISTA enhancers hs1203 and hs654. (C,D) Enhancer activity of hindbrain VISTA enhancers in E11.5 mouse embryos. Mb=midbrain, Cb=cerebellum, RL=rhombic lip, Fb=forebrain. (E) Diagram of luciferase reporter construct. (F) Normalised luciferase activity of empty vector (pGL3) and enhancer fragments in SHH-NPD cells. *p<0.05, **p<0.01, t-test.

CHD7-regulated enhancers are enriched for proneural transcription factor motifs.
A comparison of HOCOMOCO Mouse v11 CORE database transcription factor motifs enriched within regulatory elements linked to CHD7, identified with the SEA tool from the MEME suite. Left panel shows the TFClass family each transcription factor belongs to. Middle panel shows a heatmap of q-values for significant SEA motif-enrichment results (P<0.05, Q<0.05) for genomic regions with ATAC peaks that are direct targets of CHD7 with CHD7 occupancy and change in accessibility in Chd7-deficient GCps (DA & CHD7), those that alter accessibility upon CHD7 depletion but do not display CHD7 occupancy in WT (DA no CHD7) and those which display CHD7 recruitment but do not alter accessibility upon CHD7 depletion (CHD7 no DA). White boxes denote no statistically significant enrichment. RNA expression level of transcription factors in WT GCps (n=2) is shown in the right panel and the colour of each bar indicates whether the RNA expression level of that factor is unchanged in Chd7-depleted GCps (blue), upregulated (green) or downregulated (purple).

Comparison between our pcHi-C data from P7 mouse GCps and Hi-C data from P4 mouse cerebellum by Reddy et al.
(A) Overlap of promoters with distal interactions identified in Reddy et al. (green) vs. our pcHi-C data (blue). (B) Sankey diagram demonstrating the number of E-P interactions from the Reddy et al. dataset replicated in the pcHi-C data along with regions that are unique to the Reddy et al. data, illustrating the proportion of the E-P interactions that we would not expect to replicate in our data due to the technical design of the capture experiment. A substantial subset of Reddy E-P interactions were not found in our data due to technical differences (promoters not captured in our study) or bioinformatic reasons (removal of bait-to-bait interactions). For some of the Reddy E-Ps, no interactions were identified in our data, or different potential enhancers (PIFs) were found, which might reflect differences in cell type and developmental timing.