H3K115ac is associated with CGI promoters.

A) Nucleosome structure looking down on the dyad axis (modified from PDB-5X7X, Taguchi et al., 2017). The two H3 molecules are shown in cyan and yellow, other histones are in orange and DNA in green. N-terminal histone tails are hidden. Both copies of H3K115 (red and asterisked) are juxtaposed at the dyad axis close to the overlying DNA. B) Pearson’s correlation matrix with hierarchical clustering in mESCs. Correlation is computed for read counts in 10 kb windows across the genome for ATAC-seq data, ChIP-seq data for active (H3K122ac/H3K27ac: GSE66023; H3K4me3: GSM1003756; H3K4me1: GSM1003750) and repressive (H3K27me3: GSM1276707) histone H3 modifications, and for H3K115ac. C) Proportions of H3K115ac, H3K122ac and H3K27ac peaks that overlap genomic segments defined by chromHMM in the mouse genome (Ernst and Kellis, 2012; Pintacuda et al., 2017). D) Heatmap of H3K115ac ChIP-seq signal (this study), H3K122ac and H3K27ac (Pradeepa et al., 2016) in mESCs with respect to the TSS of the top 50% of genes by expression, sorted by decreasing gene expression. E) UCSC genome browser screenshot showing ChIP-seq data for H3K115ac, H3K122ac, H3K27ac and ATAC-seq in mESCs at the Sox2, Klf4 and Nanog loci. CpG islands (CGI) are indicated. Genome co-ordinates (Mb) are from the mm10 assembly of the mouse genome. F) Mean normalised reads of (upper left) 4SU-seq centred at the TSS of the top 50% of genes by expression (4SU-seq tags in the TSS +/-500 bp region). Genes are divided into TSSs that do (CGI+) or do not (CGI-) overlap with CpG islands. Upper right and lower panels show average profiles of H3K115ac, H3K122ac and H3K27ac read density at these same TSS classes. The higher H3K115ac read density at CGI+ TSSs is not due to sample size (Wilcoxon test, p-value < 2.2e-16, normalised read coverage within a window spanning TSS +/-500 bp).
G) Average profiles (mean normalised reads) of (top) 4SU-seq in mESCs centred at protein-coding TSSs (+/-2 kb), divided into quartiles (Q1-Q4) of 4SU-seq signal within 500 bp upstream and downstream of the TSS, for (left) TSSs overlapping a CGI (+CGI) or (right) promoters without any CGI (-CGI). Below: H3K115ac signal around these TSS quartiles defined by 4SU

H3K115ac dynamics during differentiation

A) Boxplots displaying the changes in activity (4SU-seq) of promoters that gain or lose H3K115ac across the 7 days of mESC to NPC differentiation. Log2 fold change is shown relative to day 0. Paired Wilcoxon test, p < 0.01. B) Bar plot showing enrichment of gene sets defined based on differential H3K115ac (gain/loss) and differential expression during differentiation (up/down). Enrichments are calculated for CGI and non-CGI promoters. *** Fisher’s exact test, p < 0.01; n.s., p > 0.01 (Supplementary table 2). C) Aggregate profile plots for 4SU-seq and ChIP-seq data for H3K115ac and H3K27me3 (Mikkelsen et al., 2007) from mESCs and NPCs at promoters with no significant change in H3K115ac occupancy but significant transcriptional upregulation during differentiation. D) UCSC genome browser screenshot showing 4SU-seq, ATAC-seq and ChIP-seq data for H3K115ac and H3K27me3 (Mikkelsen et al., 2007) in mESCs and differentiated NPCs at the Efnaδ and Slc22a23 loci. CpG islands (CGI) are indicated. Genome co-ordinates (Mb) are from the mm10 assembly of the mouse genome.

H3K115ac is associated with fragile nucleosomes

A) Contour plots depicting the high-density regions of chromatin fragments around TSSs (+/-400 bp) as a function of fragment length (bp), generated from (top) input MNase library, (centre) H3K27ac ChIP and (bottom) H3K115ac ChIP. Coverage refers to the proportion of ChIP fragments in the indicated colour-coded contours. The region from 100 bp upstream to the TSS is indicated with red dashed lines. B) Mean nucleosome occupancy around mouse TSSs plotted with respect to the NDRs, scaled to a length of 500 bp. MNase-seq data (top, West et al., 2014) were used to define the NDR as the region between the 3’ boundary of the -1 nucleosome and the TSS. Mean occupancy of (middle) H3K27ac or (bottom) H3K115ac ChIP-seq fragments is split into sub- and mono-nucleosomes around the scaled NDRs. C) Fragment length distribution (bp) of MNase-digested native chromatin fractionated by sucrose gradient sedimentation. Fractions with different nucleosome species (based on fragment length) were pooled (indicated in parentheses). D) Input (top) and H3K115ac ChIP-seq (bottom) data, centred at the TSS (+/-1 kb) of the most active genes (top 25%) in mESCs, performed on the different nucleosome species isolated by sucrose gradient sedimentation in panel C. Data are spike-in normalised.
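The scaling of variable-length NDRs to a common length in panel B can be illustrated with a minimal sketch. This is not the pipeline used for the figure: it assumes per-base coverage vectors are already extracted for each region, it uses simple linear interpolation, and the function names `rescale_profile` and `mean_profile` are hypothetical.

```python
def rescale_profile(coverage, n_bins=500):
    """Linearly interpolate a per-base coverage vector onto n_bins points.

    Assumption: linear interpolation is used here for illustration only;
    the source does not state how regions were scaled.
    """
    if not coverage:
        raise ValueError("coverage must not be empty")
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    if len(coverage) == 1 or n_bins == 1:
        return [float(coverage[0])] * n_bins
    # Distance (in original positions) between consecutive output points.
    step = (len(coverage) - 1) / (n_bins - 1)
    scaled = []
    for i in range(n_bins):
        pos = i * step
        left = int(pos)
        right = min(left + 1, len(coverage) - 1)
        frac = pos - left
        scaled.append(coverage[left] * (1 - frac) + coverage[right] * frac)
    return scaled


def mean_profile(profiles):
    """Average equal-length scaled profiles position by position."""
    if not profiles:
        raise ValueError("profiles must not be empty")
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


if __name__ == "__main__":
    # Two hypothetical NDRs of different lengths, scaled to 5 points each.
    short_ndr = rescale_profile([0, 10], n_bins=5)
    long_ndr = rescale_profile([0, 2, 4, 6, 8, 10, 12, 14, 16], n_bins=5)
    print(mean_profile([short_ndr, long_ndr]))
```

Once every region is on the same number of points, profiles from NDRs of different widths can be averaged position by position, which is what a scaled meta-profile requires.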

H3K115ac marks active enhancers

A) Heatmap showing the coverage of H3K115ac, H3K122ac, H3K27ac and ATAC-seq centred on promoter-distal accessible peaks (putative enhancers) in mESCs. Data are grouped into enhancers marked by all three H3 acetylation marks (Group 1; H3K27ac+ H3K122ac+ H3K115ac+), by H3K27ac together with H3K122ac only (Group 2), or by H3K27ac alone (Group 3). B) Heatmaps showing H3K115ac and ATAC-seq signal for common and dynamic enhancers between ESCs and NPCs. Loss/gain of H3K115ac correlates with loss/gain in chromatin accessibility. C) mESC enhancers selected based on the presence of two Oct4 motifs (n=650) within the Tn5-accessible region, with the region between the two motifs scaled to the same length (shaded grey region). Top: MNase-seq signal (black, left y axis) and ATAC-seq (purple, right y axis). H3K115ac ChIP-seq (middle panel) and H3K27ac ChIP-seq (bottom panel) on mononucleosome-sized (monoNuc; green) or subnucleosome-sized (subNuc; blue) fragments. D) Immunoblotting for H3, H3.3 and H3K115ac on whole cell extracts from E14 mESCs, H3.3 knock-out ESCs and the parental ESC line.

H3K115ac marks fragile nucleosomes at sites of high CTCF occupancy

A) Mean occupancy of nucleosomes derived from MNase-seq (from West et al., 2014) around CTCF sites across the four quartiles of CTCF ChIP-seq peak strength. All CTCF motifs are oriented from 5’ to 3’ (left to right). Positions of the first flanking upstream (−1) and downstream (+1) nucleosomes are marked. B) H3K115ac ChIP-seq signal from (top) mononucleosome-sized and (bottom) subnucleosome-sized fragments in mESCs around CTCF motifs across the four quartiles of CTCF ChIP-seq peak strength, as in (A). C) Contour plots depicting high-density regions of (left) input, (centre) H3K115ac ChIP and (right) H3K27ac ChIP paired-end sequenced MNase fragments around the top (Q4) and bottom (Q1) quartiles of CTCF ChIP-seq peaks in mESCs (data from Mas et al., 2018) as a function of fragment length. All CTCF motifs are oriented in the same direction.

H3K115ac antibody specificity. Related to Figure 1.

A) Dot blot for specificity of the H3K115ac antibody against unmodified H3K115 and H3K115R peptides and acetylated H3K115 and H3K122 peptides. B) ChIP-qPCR with the H3K115ac antibody from mESC chromatin spiked with equimolar amounts of bar-coded nucleosome species modified as indicated: acetyl (ac), butyryl (bu), crotonyl (cr), phospho (ph), from the SNAP-ChIP K-acyl-stat panel. The gene body and promoter of Klf4 are included as endogenous targets. Error bars indicate standard deviation from three technical replicates for each of two biological replicates.

H3K115ac dynamics during differentiation. Related to Figure 2.

A) Aggregate profile plots for 4SU-seq and ChIP-seq data for H3K115ac and H3K27me3 (Mikkelsen et al., 2007) at promoters with no significant change in H3K115ac occupancy or transcription (<2 fold) during differentiation to NPCs. B) Gene ontology analysis for the background set of genes in Figure 2C. Top 15 biological processes are shown. C) UCSC genome browser screenshot showing 4SU-seq, ATAC-seq and ChIP-seq data for H3K115ac and H3K27me3 (Mikkelsen et al., 2007) in mESCs and differentiated NPCs at the Insm1 locus. CpG islands (CGI) are indicated. Genome co-ordinates (Mb) are from the mm10 assembly of the mouse genome. D) Enrichment (Log2 observed/expected) of H3K115ac, H3K122ac, H3K64ac and H3K27ac with polycomb target promoters (H3K27me3) in mESCs. *** p value (Fisher’s exact test) < 0.01. n.s., not significant (p > 0.05). E) H3K64ac and H3K122ac mESC ChIP-seq profiles at the gene sets from Figure 2C and Figure 2SA that show no change in H3K115ac during ESC to NPC differentiation and that are either transcriptionally up-regulated (Figure 2C) or show no change in transcription upon differentiation (Figure 2SA) (ChIP-seq data from Pradeepa et al., 2016).
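The enrichment statistic in panel D (Log2 observed/expected with a Fisher’s exact test) can be sketched as follows. This is an illustration under stated assumptions, not the authors’ code: the expected overlap is taken as the product of the two set sizes divided by the number of promoters, the test is assumed to be one-sided (enrichment only), and the function names and counts are hypothetical.

```python
from math import comb, log2


def log2_enrichment(overlap, n_marked, n_targets, n_total):
    """Log2 of observed overlap over the overlap expected by chance.

    Assumption: expected = n_marked * n_targets / n_total, i.e. the two
    promoter sets are independent draws from n_total promoters.
    """
    if overlap <= 0:
        raise ValueError("log2 enrichment is undefined for zero overlap")
    expected = n_marked * n_targets / n_total
    return log2(overlap / expected)


def fisher_enrichment_p(overlap, n_marked, n_targets, n_total):
    """One-sided Fisher's exact test p-value, P(X >= overlap).

    Computed from the hypergeometric distribution with exact integers.
    Assumption: one-sided; the source does not state the alternative.
    """
    max_overlap = min(n_marked, n_targets)
    denominator = comb(n_total, n_marked)
    tail = sum(
        comb(n_targets, k) * comb(n_total - n_targets, n_marked - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Hypothetical counts: 8 promoters, 4 carry the mark, 4 are polycomb
    # targets, and 3 promoters fall in both sets.
    print(log2_enrichment(3, 4, 4, 8))
    print(fisher_enrichment_p(3, 4, 4, 8))
```

For genome-scale promoter counts the same functions apply unchanged, since `math.comb` works on arbitrary-precision integers.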

H3K115ac is associated with sub-nucleosome sized fragments. Related to Figure 3.

A) Mean coverage of fragments in MNase-digested input libraries for the most highly active TSSs (4SU Q4) and minimally active TSSs (Q1) in mESCs, binned into different fragment lengths (bp). B) Schematic of the fragment lengths selected to define sub-nucleosomes and mono-nucleosomes from input libraries. C) Mean occupancy of H3K115ac (top) or H3K27ac (below) ChIP-seq from mESCs plotted around the TSS (+/-1 kb). Sub-nucleosomal and mono-nucleosomal signals are plotted together with the profile from their respective input libraries. D) Distribution of fragment lengths (bp) from paired-end libraries for (top) native MNase ChIP-seq of H3K115ac and its input sample and (bottom) H3K27ac ChIP-seq libraries. Distribution is scaled to library size and the Wilcoxon test was used to calculate significance (Supplementary table 3). E) Density profiles of A/T content for input, H3K115ac and H3K27ac ChIP-seq libraries showing (left) mono-nucleosomes (monoNuc) and (right) sub-nucleosomes (subNuc). A/T content does not differ between the different input libraries (Wilcoxon test, p > 0.01), but H3K115ac-marked subnucleosomes have a higher A/T content than subnucleosomes marked with H3K27ac (Wilcoxon test, p < 0.01), and H3K115ac-marked mono-nucleosomes have a higher A/T content than those marked with H3K27ac (Wilcoxon test, p < 0.01). The A/T content of sub- and mono-nucleosomal particles marked with H3K115ac is not significantly different (Wilcoxon test, p > 0.01). Statistical data are in Supplementary table 3.
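Splitting paired-end fragments into sub- and mono-nucleosome classes by length, as outlined in panel B, might look like the sketch below. The cutoffs (`sub_max`, `mono_min`, `mono_max`) are placeholder assumptions chosen only for illustration; the legend does not state the values used, and the function names are hypothetical.

```python
from collections import Counter


def classify_fragment(length, sub_max=120, mono_min=135, mono_max=165):
    """Assign a fragment to a size class from its length in bp.

    Assumption: the default cutoffs are illustrative placeholders, not
    the thresholds used in the study.
    """
    if length <= sub_max:
        return "subNuc"
    if mono_min <= length <= mono_max:
        return "monoNuc"
    return "other"


def class_fractions(lengths, **cutoffs):
    """Fraction of fragments in each size class, scaled to library size."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    counts = Counter(classify_fragment(n, **cutoffs) for n in lengths)
    total = len(lengths)
    return {name: counts[name] / total for name in ("subNuc", "monoNuc", "other")}


if __name__ == "__main__":
    # Hypothetical fragment lengths (bp) from a paired-end library.
    print(class_fractions([80, 100, 150, 200]))
```

Dividing by the total number of fragments mirrors the scaling to library size described in panel D, so that libraries of different depth can be compared.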

H3K115ac correlates with pioneering activity at mESC enhancers. Related to Figure 4.

A) Overlap of STARR-seq +ve sequences (Peng et al., 2020) with sites corresponding to open (ATAC+) or closed (ATAC-) chromatin in the mESC genome, and with different mESC histone acetyl-lysine peaks. ‘None’ indicates STARR-seq active sequences with no overlap with the indicated post-translational modification class. H2B-NTac refers to regions marked with at least one of the acetyl-lysines in the N-terminus of histone H2B (K5, K11, K12, K16 and K20, Narita et al., 2023). B) The proportion of peaks for different histone acetylation marks that are defined as open or closed by ATAC-seq in mESCs. C) A Random Forest model was trained to predict H3K115ac-positive versus -negative peaks (but marked by H3K27ac and/or H3K122ac) based on ChIP-seq overlap. SHAP values were calculated to determine the impact on the model (top 20 features shown, left). The prediction confusion matrix is shown for the test data (right). D) Bar plot showing % of promoter- or enhancer-associated peaks in mESCs and NPCs. Nearly 33% of all H3K115ac peaks are associated with promoters in both ESCs and NPCs, with 70% of them common between ESCs and NPCs. In contrast, the majority of dynamic H3K115ac peaks are associated with enhancers (ESC-specific, n=1881; NPC-specific, n=2811). E) ATAC-seq signal around Oct4-occupied enhancers in mESCs divided into quartiles (Q1-Q4, low to high) of Oct4 ChIP-seq peak strength. F) H3K115ac (left) and H3K27ac (right) paired-end ChIP signal around Oct4-occupied enhancers. ChIP-seq reads are separated according to fragment length into mononucleosomes (green; monoNuc) and subnucleosomes (blue; subNuc). G) Heatmap depicting H3.3 turnover index (Deaton et al., 2016; GSM2080325_TI_ES.wig in GEO accession GSE78910) centred on H3K115ac +ve and -ve regions.

H3K27ac and H3K115ac relative to CTCF binding sites and motif orientation. Related to Figure 5.

A) H3K27ac ChIP-seq signal from (top) mononucleosome-sized and (bottom) subnucleosome-sized fragments in mESCs around CTCF motifs across the four quartiles of CTCF ChIP-seq peak strength. All CTCF motifs are oriented from 5’ to 3’ (left to right). B) Heatmap of MNase-seq data ranked by quartiles of CTCF occupancy, with CTCF motifs (left) oriented in the same direction or (middle) in randomised orientation. C) As in (B) but divided into quartiles (Q1-Q4) of distance to the closest TAD boundary in mESCs (TAD boundaries from Bonev et al., 2017). Median distance (kb) to the TAD boundary for each quartile is shown on the right; minimum and maximum distances are shown in Supplementary table 4. Within each TAD quartile, sites are sorted by CTCF occupancy in descending order.