The histone acetyltransferase (HAT) Mof is essential for mouse embryonic stem cell (mESC) pluripotency and early development. Mof is the enzymatic subunit of two different HAT complexes, MSL and NSL. The individual contribution of MSL and NSL to transcription regulation in mESCs is not well understood. Our genome-wide analysis shows that i) MSL and NSL bind to specific and common sets of expressed genes, ii) NSL binds exclusively at promoters, and iii) MSL binds in gene bodies. Nsl1 regulates proliferation and cellular homeostasis of mESCs. MSL is the main HAT acetylating H4K16 in mESCs and is enriched at many mESC-specific and bivalent genes. MSL is important to keep a subset of bivalent genes silent in mESCs, while developmental genes require MSL for expression during differentiation. Thus, NSL and MSL HAT complexes differentially regulate specific sets of expressed genes in mESCs and during differentiation. https://doi.org/10.7554/eLife.02104.001
Embryonic stem cells are special cells that have the ability to become many different types of cells, such as skin, muscle, or neuronal cells. This process is called differentiation. They can also undergo a process called self-renewal to produce more embryonic stem cells. These processes are controlled by a complex network of enzymes, and the production of these enzymes depends on various genes within the organism being expressed as proteins.
The DNA that holds the genetic information inside cells spends most of its time wrapped around proteins called histones: this allows the DNA molecules—which can be up to several metres long in some species—to fit inside the cell nucleus; it also protects the DNA molecules, which are quite fragile, from damage. Enzymes that attach chemical groups called acetyl groups to histones have a central role in controlling the self-renewal and differentiation of embryonic stem cells.
Mof is an enzyme that attaches an acetyl group to a specific position in a particular histone. It is a subunit within two larger protein complexes that were originally identified in flies: the male-specific lethal (MSL) complex, which is only found in male flies, and the non-specific lethal (NSL) complex, which is found in both male and female flies. These complexes have been widely studied in flies, and the role of the Mof enzyme is also reasonably well understood in mammals. However, the roles of the MSL and NSL protein complexes in mammals are not fully understood.
Ravens et al. have now used a combination of a technique called ChIP-seq (which can identify binding sites anywhere in the genome) and genetic ‘knock down’ experiments to explore the roles of these two complexes in mouse embryonic stem cells and neuronal progenitor cells.
There is some overlap between the genes that the complexes act on. However, NSL acts on some genes that MSL does not act on, and vice versa. NSL mostly acts on genes that have ‘housekeeping’ functions and are expressed in many different cell types. MSL binds more to genes that are specific to embryonic stem cells, and acts on genes required for the development of neuronal progenitor cells. This means that NSL regulates the growth of embryonic stem cells, whereas MSL controls their development and differentiation. https://doi.org/10.7554/eLife.02104.002
Pluripotent mouse embryonic stem cells (mESCs) have the ability to self-renew or to differentiate into all cell types. Specific transcription factors like Oct4, Sox2, and Nanog form a core transcriptional network, which is required for the maintenance of mESC pluripotency (Orkin et al., 2008). Chromatin-modifying enzymes further regulate mESC transcriptional networks and cellular differentiation processes and can be associated with activation or repression of genes (Orkin and Hochedlinger, 2011). Histone acetylation is important for mESC pluripotency and is regulated by the concerted action of histone acetyltransferases (HATs) and histone deacetylases (HDACs) (Meshorer and Misteli, 2006). Acetylation of histone proteins leads to an open and dynamic chromatin conformation allowing an active transcription state, which is also a signature of mESC pluripotency (Meshorer, 2007; Niwa, 2007; Efroni et al., 2008). During differentiation of mESCs, the overall transcription rates decrease, whereas the chromatin structure becomes more compact with a global reduction of histone H3 and H4 acetylation. In line with the requirement of histone acetylation in mESC maintenance and differentiation, genetic deletion or knockdown of several HATs affects mESC pluripotency (Lin et al., 2007; Fazzio et al., 2008; Gupta et al., 2008; Thomas et al., 2008; Zhong and Jin, 2009; Li et al., 2012).
HATs can be classified into two predominant families: the GCN5-related N-acetyltransferase (GNAT) family (e.g., Gcn5 and p300) and the Moz-Ybf2/Sas3-Sas2-Tip60 (MYST) family (e.g., Tip60 and Mof [male absent on the first]) (reviewed in Kimura et al., 2005). These enzymes often function as part of multi-protein co-activator complexes (reviewed in Lee and Workman, 2007). Mof (also known as Kat8 or Myst1) is a MYST-type HAT specific for histone H4 lysine 16 acetylation (H4K16ac) (Hilfiker et al., 1997; Smith et al., 2001, 2005; Taipale et al., 2005) and has been shown to be the catalytic subunit of two distinct protein complexes in Drosophila (d) and mammals: the male-specific lethal (MSL) and the non-specific lethal (NSL) complexes (Smith et al., 2005; Mendjan et al., 2006; Cai et al., 2010; Raja et al., 2010). In Drosophila, the dMSL complex is targeted to transcribed regions of male X-chromosomal genes, where it mediates dosage compensation (reviewed in Straub and Becker, 2007; Gelbart and Kuroda, 2009; Conrad and Akhtar, 2011). In contrast, dNSL is present at gene promoters of male and female chromosomes, where it regulates transcription of housekeeping genes (Prestel et al., 2010; Raja et al., 2010; Feller et al., 2012; Lam et al., 2012). Mof itself and subunits of the Mof-containing dMSL and dNSL HAT complexes are required for the binding of the two Drosophila Mof-containing complexes at promoters and gene bodies, which leads to H4K16 acetylation and gene expression (Raja et al., 2010; Kadlec et al., 2011).
Inactivation of Mof in mice (m) leads to early embryonic lethality as Mof−/− embryos fail to develop beyond the expanded blastocyst stage and die at implantation (Gupta et al., 2008; Thomas et al., 2008). Mof deletion correlated with cell cycle defects and cell death. Moreover, mESCs could not be derived from Mof−/− mouse embryos. In agreement, it was shown that Mof plays an essential role in the maintenance of mESC pluripotency (Li et al., 2012). H4K16 acetylation levels were undetectable in Mof−/− embryos, whereas the acetylation of other histone lysine residues was unaffected (Thomas et al., 2008). Surprisingly, loss of H4K16 acetylation upon neuronal differentiation of mESCs did not alter higher-order chromatin compaction (Taylor et al., 2013). Moreover H4K16ac and Mof were reported to be present at the transcription start sites (TSSs) of expressed genes in mESCs (Li et al., 2012; Taylor et al., 2013).
Details on the function of the mammalian Mof-containing MSL and NSL complexes have only recently started to emerge and have revealed that Mof fulfils different functions within the two HAT complexes. The human (h) MSL complex is composed of the subunits MSL1, MSL2, MSL3, and MOF (Smith et al., 2005; Mendjan et al., 2006), while the hNSL complex is composed of nine subunits: NSL1, NSL2, NSL3, MCRS1, WDR5, PHF20, HCF1, OGT1, and MOF (Mendjan et al., 2006; Cai et al., 2010). The mammalian NSL complex appears to have broader substrate specificity than the MSL complex, as it is also able to acetylate non-histone targets (Li et al., 2009). However, the function of MSL and NSL complexes in mammalian cells, and especially their role in establishing mESC pluripotency, is not well understood.
To better understand the role of MSL and NSL in gene regulation and their individual contribution in epigenetic changes in mESCs, we have analysed these two Mof-containing complexes by chromatin immunoprecipitation coupled with high throughput sequencing (ChIP-seq) and by shRNA knockdown (KD) experiments in mESCs. The obtained genome-wide binding maps show that MSL and NSL locate to a large number of expressed genes and each complex has a distinct binding profile at promoters or gene bodies. Our combined ChIP-seq and KD data indicate that MSL and NSL have a combinatorial effect on a given set of genes, whereas some specific loci are only MSL- or only NSL-dependent. Our data indicate that NSL binds exclusively at promoters, while MSL binds more in gene bodies. We show that in mESCs NSL regulates cell growth whereas MSL is the main HAT complex acetylating histone H4K16. MSL is present at mESC-specific genes. Moreover, MSL binds to and regulates developmental genes in mESCs and during differentiation. Altogether our data demonstrate that MSL and NSL complexes are present at expressed genes in mESCs, but that MSL is essential for regulation of key mESC-specific and bivalent developmental genes.
To understand the global role of the two Mof-containing complexes in chromatin remodelling and how this regulates genes linked to self-renewal, proliferation, and/or differentiation, we set out to analyse the genome-wide binding of MSL and NSL in mESCs. To this end, we raised antibodies targeting Msl1 or Nsl1, which are specific subunits of the MSL or NSL complexes, respectively, and are known to play a role in the assembly and the regulation of these complexes (Raja et al., 2010; Kadlec et al., 2011). The specificity of the purified antibodies was demonstrated by western blot assays and immunoprecipitations followed by mass spectrometry using the multidimensional protein identification technology (MudPIT) (Figure 1). Western blot assays indicated that both of the generated antibodies are specific (Figure 1A,B). In addition, both antibodies immunoprecipitated (IP-ed) the endogenous MSL and NSL complexes with the previously described polypeptide composition (Cai et al., 2010; Figure 1C). Importantly, Mof was identified in both the IP-ed MSL and NSL complexes, in the same range of abundance as Msl1 or Nsl1 (Figure 1C, Figure 1—source data 1 for all identified proteins by MudPIT). Gel filtration followed by western blot analyses further indicated that Msl1 and Nsl1 are only present in Mof-containing complexes, as they eluted from the Superose 6 column in the same molecular weight fractions as their respective entire endogenous MSL (about 240 kDa) or NSL (about 800 kDa) complexes (Figure 1D). Of note, the enzymatic subunit Mof was detected in the respective MSL and NSL complexes, but in addition as a potentially free form in the 50 kDa range fractions (Figure 1D). These results together demonstrate the incorporation of all nuclear Msl1 or Nsl1, together with Mof, in their respective endogenous complexes and a fraction of ‘free’ Mof that is not present in either MSL or NSL.
To characterize the genome-wide role of MSL and NSL complexes, we carried out ChIP-seq analysis in mESCs using the above-characterized anti-Msl1 and anti-Nsl1 antibodies. The obtained binding maps of Msl1 and Nsl1 in mESCs were then compared in the UCSC genome browser to publicly available ChIP-seq data for Mof, H4K16 acetylation (H4K16ac), RNA Polymerase II (Pol II), and DNase hypersensitive sites (DHSs). At a representative genomic locus, Nsl1 peaks were detected at the TSSs of four expressed genes, where they co-localized with Mof, Pol II and DHSs, whereas Msl1 binding peaks were usually broader and located, together with H4K16ac, downstream of Pol II peaks (Figure 2A). As previously reported, Mof is present at promoters, gene bodies (GBs) and intergenic regions (Li et al., 2012).
Using the MACS14 algorithm (Zhang et al., 2008), we determined high-confidence binding sites (peaks) for Msl1 or Nsl1 (Figure 2—figure supplement 1A, Figure 2—source data 1) and selected peaks with various tag densities for ChIP-qPCR validation. The Msl1 and Nsl1 enrichments at five different loci as detected by ChIP-qPCR faithfully reflected the tag densities measured by ChIP-seq (Figure 2—figure supplement 1B,C). To further verify the specificity of the Msl1 and Nsl1 ChIP-seq results, we used lentiviral small hairpin (sh) RNA vectors to knock down (KD) Msl1 or Nsl1 in mESCs (Figure 2—figure supplement 2A–D) and tested by ChIP-qPCR the decrease of Msl1 or Nsl1 binding at the TSSs and in the GBs of two genes that were co-bound by these factors (Figure 2—figure supplement 2E,D). The predominant binding of Msl1 to GBs was lost upon Msl1 KD, whereas Nsl1 binding to TSSs was reduced following Nsl1 depletion. These results confirm our ChIP-seq analyses.
Next, we asked whether the two complexes bind to common or different loci. A pairwise comparison of the MSL or NSL enrichment at all high confidence binding loci revealed that the binding of both complexes shows two populations and has a Pearson correlation coefficient of 0.23 (p-value = 1.88 × 10^−160) (Figure 2B). This indicates a significant overlap between Msl1 and Nsl1 binding populations, but also suggests a differential genome-wide binding of MSL and NSL. To determine at which genomic regions the identified peaks localize, each peak was annotated either to promoter, GB (containing introns, exons, untranslated regions and transcription termination sites together) or intergenic regions. 74% of all Msl1 peaks are detected at GBs (Figure 2C). In contrast, the majority of identified Nsl1 peaks are present at promoter regions (67%) (Figure 2C). Moreover, only about 10% of all Msl1- or Nsl1-binding sites map to intergenic regions (as defined above, excluding introns). The majority of the 9890 Msl1-, or 6251 Nsl1-specific binding sites are at promoter and/or GB regions (Figure 2C), and after removal of redundant genes, we defined 5844 Msl1- and 4755 Nsl1-bound genes (Figure 2—figure supplement 1A, Figure 2—source data 2). As only 10% of the binding sites were detected at intergenic regions, we focused our further analyses on the role of MSL and NSL complexes in gene regulation at the promoter and/or GB regions.
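The peak annotation step described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the 1 kb promoter window (`PROMOTER_FLANK`), the function names and the toy coordinates are all assumptions made for illustration, and a single chromosome is assumed.

```python
from collections import Counter

# Hypothetical promoter window around the TSS (bp); the cut-off actually
# used in the study is not stated in the text.
PROMOTER_FLANK = 1000


def annotate_peak(summit, genes, flank=PROMOTER_FLANK):
    """Classify a peak summit as 'promoter', 'gene_body' or 'intergenic'.

    genes: list of (tss, tes) coordinate pairs on the same chromosome;
    for minus-strand genes the TSS is larger than the TES.
    A promoter hit on any gene takes precedence over a gene-body hit.
    """
    label = "intergenic"
    for tss, tes in genes:
        if abs(summit - tss) <= flank:
            return "promoter"
        if min(tss, tes) <= summit <= max(tss, tes):
            label = "gene_body"
    return label


def annotation_fractions(summits, genes):
    """Fraction of peak summits falling into each annotation class."""
    counts = Counter(annotate_peak(s, genes) for s in summits)
    total = sum(counts.values())
    return {k: counts[k] / total for k in ("promoter", "gene_body", "intergenic")}


# Toy data (invented): one plus-strand and one minus-strand gene.
genes = [(10_000, 20_000), (50_000, 40_000)]
summits = [10_200, 15_000, 45_000, 49_500, 30_000]
print(annotation_fractions(summits, genes))
```

Applied to real peak summits and a gene annotation, a tabulation of this kind yields the type of promoter, GB and intergenic percentages reported in Figure 2C.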
To understand the genome-wide binding of MSL and NSL, we compared by k-means clustering either Msl1, or Nsl1 binding profiles with that of Mof and the presence of H4K16ac at 30,300 ENSEMBL transcription start sites (TSSs). In good agreement with our results showing that in mESCs Msl1 and Nsl1 incorporate in the endogenous MSL or NSL complexes, respectively, genome-wide Mof binding overlaps with that of Msl1 and Nsl1 around most of the TSSs, which are also H4K16ac positive (Figure 2D).
Interestingly, at promoters, MSL and NSL complexes have distinct binding profiles. Nsl1 and Mof show a sharp binding peak centred at the TSSs, while the average Msl1 binding profile is similar to H4K16ac (see below) and extends downstream from the TSSs in the GB regions (Figure 2E). Moreover, the Msl1 and Mof signals are enriched downstream of promoters at GBs, whereas the control and the Nsl1 signals are not (Figure 2F). Altogether our results demonstrate that MSL and NSL bind mostly to distinct sites in mESCs. NSL binds directly to the TSS region of genes, while the genome-wide location of MSL is both at TSSs and downstream of the TSSs of bound genes.
To assess the relationship between Msl1 and Nsl1 binding and gene expression in mESCs, we took advantage of available RNA-seq data (Tippmann et al., 2012) and compared the average expression of Msl1-, or Nsl1-bound genes (in median log2 FPKM values) to that of all ENSEMBL genes (Figure 3A). The median expression values for Msl1-, or Nsl1-positive genes were significantly higher as compared to all ENSEMBL genes, demonstrating that Msl1 and Nsl1 are mostly present at expressed genes in mESCs.
To determine whether the binding strength of Msl1 or Nsl1 correlates with gene expression, we compared Msl1 and Nsl1 enrichment around TSSs with gene expression data from the corresponding bound genes. Msl1- or Nsl1-positive genes were divided into five categories according to their expression levels (Figure 3B–D). As a control, in the same five categories densities of Pol II peaks at promoters correlated with gene expression, with decreasing Pol II densities from highly to poorly expressed genes (Figure 3A; Barski et al., 2007). Importantly, the boxplot representation revealed a correlation between Msl1 binding and gene expression similar to that of Pol II, indicating that the more strongly a gene is expressed, the more Msl1 and Pol II are enriched at the binding sites (Figure 3B,C). In contrast, there is no significant difference in the Nsl1 median values among the five groups, indicating that Nsl1 binding to promoters is not proportional to the level of expression (Figure 3D). Our results thus demonstrate that both Msl1 and Nsl1 bind to active genes, but that only the binding strength of Msl1, and not that of Nsl1, correlates with mRNA levels, suggesting a different dynamic and/or functional behaviour of the two complexes at the regulated loci.
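The comparison of binding strength across five expression categories can be sketched as follows. This is only an illustration under stated assumptions: the function name, the equal-sized grouping and the toy values are hypothetical, and the study's actual binning criteria are not given in the text.

```python
from statistics import median


def median_enrichment_by_expression(genes, n_groups=5):
    """Median enrichment per expression group.

    genes: list of (expression, enrichment) pairs for bound genes; there
    must be at least n_groups genes. Genes are ranked by expression and
    split into n_groups groups of near-equal size. The returned list runs
    from the most highly expressed group to the least expressed one.
    """
    ranked = sorted(genes, key=lambda g: g[0], reverse=True)
    size, extra = divmod(len(ranked), n_groups)
    medians, start = [], 0
    for i in range(n_groups):
        end = start + size + (1 if i < extra else 0)
        medians.append(median(e for _, e in ranked[start:end]))
        start = end
    return medians


# Toy data (invented): enrichment that scales with expression, as described
# for Msl1, and enrichment that is flat across expression, as described for Nsl1.
msl_like = [(x, 2 * x) for x in range(1, 11)]
nsl_like = [(x, 5) for x in range(1, 11)]
print(median_enrichment_by_expression(msl_like))
print(median_enrichment_by_expression(nsl_like))
```

A monotonic decrease of the group medians corresponds to the Msl1 and Pol II pattern, whereas roughly constant medians correspond to the Nsl1 pattern.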
As H4K16 is a known target of Mof in Drosophila and mammals (Hilfiker et al., 1997; Smith et al., 2001, 2005; Taipale et al., 2005), we compared Msl1 or Nsl1 binding sites with the presence of H4K16ac. Our scatter plot analyses indicate that there is a general overlap of Msl1 or Nsl1 with H4K16ac, whereas the correlation between Msl1 binding sites and H4K16ac is better (Pearson correlation coefficient 0.57) than between Nsl1 and H4K16ac (Pearson correlation coefficient 0.32), which is also reflected in the corresponding p-values (Figure 4A,B). The comparison of the distribution patterns of Msl1, Nsl1, Pol II and H4K16ac around the TSSs (±2 kb) of all Msl1- and Nsl1-bound genes further indicates that the Msl1 binding profile is more similar to the genome-wide presence of H4K16ac, than that of Nsl1 (Figure 4C). H4K16ac levels are enriched downstream of the TSSs overlapping with the binding profile of Msl1 (Figure 4C). In contrast, the centre of the Nsl1 binding profile centred at the TSS region does not overlap with that of the H4K16ac peak (Figure 4C). These binding profiles suggest a link between H4K16 acetylation and the MSL HAT complex (Figure 4).
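For readers less familiar with the statistic, the Pearson correlation coefficient used in these scatter plot comparisons can be computed as in the sketch below. The function name and the toy vectors are illustrative assumptions; in practice the inputs would be the per-site enrichment values of two ChIP-seq data sets.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Toy enrichment values (invented) at the same five binding sites.
factor_signal = [1.0, 2.0, 3.0, 4.0, 5.0]
mark_signal = [2.1, 3.9, 6.2, 7.8, 10.1]
print(round(pearson_r(factor_signal, mark_signal), 3))
```

Values close to 1 indicate that the two signals rise and fall together across sites, which is the sense in which the coefficient of 0.57 for Msl1 and H4K16ac indicates a closer relationship than the coefficient of 0.32 for Nsl1 and H4K16ac.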
Although it was demonstrated that Mof depletion in embryos results in a loss of H4K16ac (Gupta et al., 2008; Thomas et al., 2008), the exact contribution of the two Mof-containing HAT complexes to H4K16 acetylation remains to be determined. To address this question, we analysed the global acetylation of H4K16 after Msl1 or Nsl1 KD and also quantified acetylation of H4K5 and H4K8, two other proposed substrates for hNSL in differentiated human cells (Cai et al., 2010). Western blot analyses of total histone proteins from mESCs expressing shRNAs targeting Msl1 or Nsl1 revealed a dramatic reduction of H4K16ac upon Msl1 depletion, whereas Nsl1 KD did not affect H4K16ac levels (Figure 4D). This is in good agreement with the differential Msl1 and Nsl1 ChIP-seq profiles (Figure 4C). Moreover, H4K5ac and H4K8ac levels did not change in cells expressing either Msl1 or Nsl1 shRNA (Figure 4D). Altogether, the above results indicate that in mESCs (i) the enzymatic activity of the MSL complex is responsible for H4K16 acetylation downstream of the TSS, (ii) MSL is the main acetylase for H4K16 and (iii) the global H4K16 acetylating function of MSL cannot be compensated by other HAT complexes.
To understand the role of the two Mof-containing complexes for gene regulation and regulatory pathways in mESC, we further characterized genes bound either individually or together by Msl1 and/or Nsl1. Out of 10,600 Msl1- and Nsl1-bound genes about one quarter are co-bound by both complexes, while 3274 are only bound by Msl1 and 2185 only by Nsl1 (Figure 5A, Figure 2—source data 2). Our statistical analyses showed that these numbers are significant (Figure 5—figure supplement 1). To identify genes regulated specifically by MSL and/or by NSL, we analysed genes bound by only Msl1, by only Nsl1, or together by Msl1 and Nsl1 for gene ontology (GO). All three categories are enriched for GO terms such as metabolic process, gene expression, cell proliferation, and cell cycle. These GO terms represent housekeeping functions of every cell type, but can also be related to the cellular homeostasis of ESCs. Interestingly, genes bound by only Msl1 are enriched for GO terms such as embryo development, stem cell differentiation and maintenance (Figure 5B). Importantly, almost 50% of all reference genes associated with stem cell maintenance are Msl1 positive (Figure 5B).
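The gene counts given here can be cross-checked against the 5844 Msl1-bound and 4755 Nsl1-bound genes defined earlier. The short sketch below uses only numbers stated in the text. Reading the figure of 10,600 as the sum of the two gene lists is an inference from this arithmetic, not a statement made in the source.

```python
# Counts taken from the text.
msl1_bound = 5844   # all Msl1-bound genes
nsl1_bound = 4755   # all Nsl1-bound genes
msl1_only = 3274    # genes bound only by Msl1
nsl1_only = 2185    # genes bound only by Nsl1

# The co-bound set can be derived from either list; both routes must agree.
co_bound = msl1_bound - msl1_only
if co_bound != nsl1_bound - nsl1_only:
    raise ValueError("inconsistent gene counts")

summed = msl1_bound + nsl1_bound             # counts co-bound genes twice
unique = msl1_only + nsl1_only + co_bound    # each gene counted once

print(co_bound, summed, unique)
print(round(co_bound / summed, 2), round(co_bound / unique, 2))
```

Both routes give the same co-bound count, so the reported numbers are internally consistent. The co-bound genes make up about one quarter of the summed lists and about one third of the non-redundant set.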
Therefore, we investigated the presence of Msl1 (and Nsl1) at mESC-specific genes. Available RNA-seq data of mESCs (Tippmann et al., 2012) allowed us to define 282 genes expressed only in pluripotent mESCs. Out of these 282 mESC-specific genes, 123 (44%) are bound by Msl1 and only 40 (14%) are Nsl1 positive. Furthermore, about 100 mESC-specific genes are bound exclusively by Msl1, while 16 genes are bound only by Nsl1 (Figure 5C). Our statistical analyses indicate that only MSL binding at mESC-specific genes is higher than random (Figure 5—figure supplement 2).
To validate these bioinformatics analyses, all Msl1- and/or Nsl1-bound genes were divided into three categories (Figures 2B and 5A): Msl1- and Nsl1-bound genes (category 1), genes bound only by Nsl1 (category 2) and genes bound only by Msl1 (category 3). Msl1 and Nsl1 binding to the three gene categories were validated by ChIP-qPCR on a few selected genes. In agreement with all our above analyses, we observed that Msl1 binds to TSSs and/or GB regions of most genes from category 1 and 3 (Figure 5D), whereas Nsl1 is detected mostly at the TSSs of genes from category 1 and 2 (Figure 5E). Our results also show that Msl1 positive genes contain several mESC specific genes, including genes related to the core pluripotency network (e.g., Oct4, Nanog and Sox2) (Figure 5E). In summary, we demonstrate that the two Mof-containing complexes bind to shared, MSL-, or NSL-specific gene sets, but only the MSL complex is present at genes regulating the ESC pluripotency network and developmental processes.
In mouse embryos, ablation of Mof results in lethality at embryonic day 3.5 and Mof also affects mESC pluripotency (Gupta et al., 2008; Thomas et al., 2008; Li et al., 2012). To further analyse the cellular roles of MSL and NSL in mESCs, Msl1 or Nsl1 were individually depleted by shRNA KD (see Figure 2—figure supplement 2A–D). To exclude compensation between MSL and NSL complexes, a double KD of Msl1 and Nsl1 (shMsl1/shNsl1) was also carried out (Figure 6—figure supplement 1A). Next, total cell numbers were counted over 6 days. These analyses indicated that KD of Msl1 slightly reduces cell proliferation, while the KD of Nsl1, or the double KD of Msl1 and Nsl1, leads to much slower cell growth (Figure 6A). When analysing cell morphology under these KD conditions, we did not observe any change in mESC shape (Figure 6B). As the reduction of cell numbers under the KD conditions was not due to apoptosis (Figure 6—figure supplement 1B), we next carried out cell cycle analyses. These FACS measurements demonstrated that mESCs treated with shMsl1, shNsl1 and shMsl1/shNsl1 accumulate in the G1-phase of the cell cycle, with the effect of shNsl1 and shMsl1/shNsl1 being more severe than that of shMsl1 (Figure 6C). These results together suggest that Nsl1 might be more important for regulating housekeeping genes involved in cellular homeostasis of mESCs, as reflected in the higher number of G1-phase cells and decreased cell proliferation of shNsl1 and shMsl1/shNsl1 mESCs.
To better understand the function of genes bound by MSL and NSL in mESCs, genome-wide expression changes were analysed by microarrays. To this end, total RNA was isolated from control mESCs, or mESCs depleted for either Msl1 or Nsl1 (Figure 2—figure supplement 2A–D). In shMsl1 KD cells, 275 genes were found to be downregulated (Msl1 itself is in the downregulated list; Figure 7—source data 1), and 500 genes upregulated, as compared to control KDs. By comparing Msl1-bound and Msl1-regulated genes (Figure 7A), we found that Msl1 is present at about 30% (105 genes) of all downregulated and at 20% (107 genes) of all upregulated genes. In Nsl1 KD conditions, 1158 genes are downregulated (including Nsl1 itself) and 429 are upregulated, as compared to KD controls (Figure 7—source data 1). By comparing the genome-wide binding (ChIP-seq) and expression data changes following KD, we show that Nsl1 is present at 43% (441 genes) of all downregulated and at 5% (30 genes) of all upregulated genes (Figure 7B). The Msl1-, or Nsl1-KD affected genes determined by the microarray analyses were then confirmed by RT-qPCR under Msl1-, or Nsl1-KD conditions (Figure 7C,D). Interestingly, we noticed genes known to be involved in differentiation, like Nestin and Ntrk1, among the upregulated genes of shMsl1 mESCs (Figure 7C).
Altogether, our results show that the bound genes whose expression is affected following either Msl1 or Nsl1 KD are those genes that absolutely require either MSL or NSL for their correct regulation. Note, however, that the relatively weak overlap between Msl1- and/or Nsl1-bound genes on one side and Msl1- and/or Nsl1-regulated genes on the other may reflect that the KDs were not sufficiently efficient, and/or that the global gene expression analysis detected only changes in the steady state levels of the mature mRNAs and not the changes in the neosynthesized pre-mRNAs. Along these lines, in our experimental system Msl1 or Nsl1 single or double KD mESCs do not lose Oct4 expression (Figure 7—figure supplement 1A), a common marker of mESC pluripotency.
Since MSL and NSL are transcriptional co-activators, we were interested in the biological function of downregulated genes upon KD of either Msl1 or Nsl1. We also included available expression data from Mof knock-out (KO) mESCs (Li et al., 2012) to overcome the above-described limitations. Analysing the biological function of Msl1, Nsl1, or Mof downregulated genes, we observed GO terms like metabolic processes, gene expression, cell death, or cell cycle control (Figure 7—figure supplement 1B). However, only downregulated genes in Mof KO mESCs are significantly enriched for GO terms like stem cell differentiation or maintenance (Figure 7E). Several of these genes are amongst the Msl1 positive mESC-specific genes, such as Nanog, Sox2, Oct4 (Li et al., 2012) (Figure 7F). Moreover, these genes are bound by Msl1 and Mof (Figure 7—figure supplement 2). Thus, the Mof-regulation and the exclusive binding of Msl1 and Mof to these key pluripotency genes suggest that the MSL complex is a regulator of the pluripotency network in mESCs.
Our above analyses have shown that Msl1 binds not only to mESC-specific genes, but that it locates also to silent, or very weakly expressed, genes that become expressed to control mESC differentiation (Figure 8A, Figure 8—figure supplement 1A). Importantly, KD of Msl1 leads to the upregulation of developmental genes, such as Nestin and Ntrk1 (Figure 7C). These genes often contain both positive (H3K4me3) and negative (H3K27me3) epigenetic modifications (Figure 8A, Figure 8—figure supplement 1A). It is well established that H3K4me3 and H3K27me3 histone modifications co-localize at bivalent domains, which are poised for a quick activation during distinct differentiation processes (Azuara et al., 2006; Bernstein et al., 2006). The Ezh2 subunit of the polycomb repressive complex 2 (PRC2), which catalyzes histone H3K27 tri-methylation, is also a good marker of bivalent domains (Bernstein et al., 2006; Ku et al., 2008). To determine whether Msl1, or Nsl1, would bind to bivalent domains genome-wide, we compared the combined list of all Msl1 and Nsl1 binding sites with Pol II and Ezh2 profiles, together with H3K4me3 and H3K27me3 marks (Figure 8B). The heatmap indicates that about 343 Msl1 binding sites significantly co-localize with Ezh2, H3K27me3, and H3K4me3, which define the bivalent domains (see Cluster C in Figure 8B, Figure 8—figure supplement 1B for statistical analyses). Importantly, these 343 bivalent domain sites are negative for Nsl1 binding. The presence of Msl1 (our study) and Mof (Li et al., 2012) at bivalent genes in mESCs suggests that the MSL complex is involved in keeping these developmental genes silenced, but poised for activation in mESCs.
As MSL, but not NSL, was found to be required for keeping bivalent genes silent or expressed at low levels in pluripotent mESCs, we asked whether MSL could regulate bivalent gene expression during mESC differentiation. For this, mESCs were differentiated into neuronal progenitor cells (NPCs) under control and Msl1 KD conditions (Figure 8—figure supplement 1C). Bivalent genes such as Pax6, Hes5, Mapt2 and Nestin, which are also considered as key markers of NPC differentiation, were upregulated in pluripotent mESCs under Msl1 KD conditions (Figure 8C; and see above). In contrast, but in agreement with the regulatory role of the MSL complex at these genes, these key developmental marker genes were downregulated in NPCs in which Msl1 was silenced by shRNA expression during NPC differentiation (Figure 8D). Note, however, that Msl1 KD cells are still morphologically able to form NPC-like cells (Figure 8—figure supplement 1D). These results together indicate the important regulatory requirement of the MSL complex, first for keeping the subset of bivalent genes poised for activation in mESCs and then for turning them on during mESC differentiation.
In this study, we analysed two Mof-containing complexes, MSL and NSL to understand their transcription regulation function in mESCs. The proteomic characterization of the mMsl1-, or mNsl1-containing complexes indicated that the subunit composition of the mMSL and the mNSL complexes is identical to human complexes (Figure 1C and see Mendjan et al., 2006; Cai et al., 2010). Importantly, the comparison of the abundance of Msl1 or Nsl1 with Mof in the respective complexes (Figure 1C) and our gel filtration analyses (Figure 1D) demonstrated the incorporation of Msl1 and Nsl1 together with Mof in their respective endogenous complexes. Moreover, the gel filtration experiment indicated the potential existence of ‘free’ Mof that may not be present in either MSL or NSL. Together, the proteomic analyses suggest that there is no free Msl1 or Nsl1 in the nuclei of mESCs and that Msl1 and Nsl1 are specific to the MSL and NSL complexes, respectively.
This observation is important for our study as it indicates that the ChIP binding profiles obtained with either anti-Msl1 or anti-Nsl1 antibodies represent the behaviour of the corresponding endogenous Mof-containing MSL, or NSL HAT complexes. Furthermore, our findings are consistent with previous observations, which suggested that Msl1 and Nsl1 directly interact with Mof in their respective complexes, to stabilize the assembly of these complexes and to regulate their HAT activity (Raja et al., 2010; Kadlec et al., 2011).
Similarly to Drosophila (Prestel et al., 2010; Raja et al., 2010; Kadlec et al., 2011; Feller et al., 2012; Lam et al., 2012), our results demonstrate that mouse MSL and NSL have mainly distinct binding profiles at transcribed genes in mESCs. NSL binding overlaps with Pol II binding and DHSs at TSSs, while MSL locates more downstream of promoters towards the GB (Figure 2). These evolutionarily conserved binding profiles further suggest that the functions of the MSL and NSL complexes in transcriptional regulation are also conserved between Drosophila and vertebrate cells.
Surprisingly, in mESCs the KD of Msl1, but not that of Nsl1, leads to a global loss of H4K16ac, without reducing H4K5ac and H4K8ac levels (Figure 4D). In differentiated human cells Mof, and subunits of MSL (i.e. Msl1, Msl3) or the Nsl1 subunit of NSL have been reported to be crucial for global H4K16 acetylation by either MSL or NSL, respectively (Li et al., 2009; Zhao et al., 2013). Thus, while the subunit composition of the two Mof-containing MSL and NSL complexes is conserved between mESCs and differentiated human cells (Figure 1C; Cai et al., 2010), the function of the NSL complex seems to be differently regulated in pluripotent mESCs than in differentiated cells. Our observation that the KD of the Nsl1 subunit of NSL does not abolish global H4K16ac levels in mESCs suggests that in these pluripotent cells NSL may have a very localized HAT activity around TSSs of bound genes and that the loss of acetylation at these loci cannot be detected in total histone preparations, in contrast to the Msl1 KD. This difference may also be due to the more dynamic recruitment of NSL by mESC-specific factors. In contrast to MSL, NSL binding to promoters does not correlate with RNA expression levels (Figure 3B–D). This further suggests that the two Mof-containing complexes have different mechanisms of action in transcriptional regulation in mESCs. Moreover, the depletion of MSL function is expected to recapitulate those chromatin perturbations and related cellular changes that are caused by the Mof KO and are linked to H4K16ac loss.
Strikingly, the expression of only a small number of MSL- or NSL-bound genes was directly affected when either Msl1 or Nsl1 was depleted (Figure 7A,B). This might be due either to inefficient KDs or to the measurement of steady-state mature mRNA levels under our experimental conditions. Because, in contrast to Msl1 and Nsl1, 'free' Mof was detected in mESC nuclear extracts (Figure 1D), we cannot exclude the possibility that, to a certain extent, Mof alone could compensate for the function of the MSL or NSL complex under Nsl1 and Msl1 KD conditions. However, the abolished global H4K16ac levels in shMsl1 mESCs rather suggest that other transcriptional co-activators, modifying histone residues other than H4K16, could compensate for the role of H4K16ac in transcriptional activation.
In summary, our data demonstrating that MSL is the main HAT complex responsible for global H4K16ac in mESCs (Figure 4D), together with the finding that the genome-wide binding profiles of Msl1 and Mof overlap with H4K16ac (Figure 2D), suggest that the co-activator role of MSL is linked to its H4K16 acetylation function at the bound genes.
The transitions between distinct chromatin states, from the open acetylated chromatin of the pluripotent mESCs to the more compact deacetylated chromatin of the differentiated cells, suggest the requirement for a tightly regulated chromatin acetylating/deacetylating balance that participates in defining pluripotency on the one hand and the consequent commitments to distinct differentiation pathways on the other hand. The HAT Mof is important for mESC pluripotency. Mof-deficient embryos have slight cell cycle defects and undergo cell death (Thomas et al., 2008). Under our experimental conditions, KD of Msl1 and Nsl1 alone or together did not affect the expression of key transcription factors of the pluripotency network (Figure 6A, Figure 7). Our observation that in mESCs Nsl1 KD leads to decreased cell numbers during proliferation and to an increase in cells in the G1-phase of the cell cycle shows that NSL might influence mESC proliferation (Figure 6B–D). This further indicates that NSL might be required for the homeostasis of mESCs, either by directly regulating transcription or through acetylation of non-histone targets.
H4K16ac was shown to promote chromatin fibre decompaction in vitro (Shogren-Knaak et al., 2006; Robinson et al., 2008; Allahverdi et al., 2011). Our data showing that Msl1 KD abolishes global H4K16ac levels, together with the aberrant chromatin compaction observed in the Mof-deficient embryos (Thomas et al., 2008), suggest that the MSL complex is an important factor in establishing the high acetylation levels required for a more open chromatin conformation and consequent mESC pluripotency. In addition to its role as a general regulator of H4K16ac in mESCs, the MSL complex seems to be recruited to ESC-specific loci to regulate different steps in the transcription process, such as (i) chromatin accessibility, (ii) pre-initiation complex formation and/or (iii) Pol II transcription elongation rates. Importantly, the Msl1-bound mESC-specific genes are regulated by Mof. Note, however, that these Mof- and Msl1-bound and Mof-regulated genes were not affected by the KD of Msl1 (Figure 5B–D, Figure 7E,F). As explained above, this may be due to the different experimental systems used here and in the Mof KO study. Nevertheless, we assume that these genes are regulated by the whole MSL complex. Altogether, the exclusive co-binding of Msl1 and Mof to pluripotency genes suggests that the MSL complex is a regulator of the pluripotency network in mESCs.
Bivalent genes, which are either repressed or expressed at very low levels in mESCs, can be directly upregulated or completely silenced upon differentiation (Azuara et al., 2006; Bernstein et al., 2006). So far, little is known about the function of HATs at bivalent genes. Interestingly, we show that a subset of bivalent genes (about 350 genes) is bound by MSL in ESCs (Figure 8A,B, Figure 8–figure supplement 1A) and that the KD of Msl1 results in the upregulation of a subset of bivalent genes in mESCs (Figure 8C). Consistent with our study, Mof has also been shown to be present at bivalent genes (Li et al., 2012). It seems that the MSL complex, probably in concert with HDACs and/or other chromatin remodelling factors, can have a silencing function at these bivalent genes. In contrast, the same genes require MSL for expression during differentiation (Figure 8D). Even though the morphology of NPCs was not obviously influenced under Msl1 KD conditions, the expression of key developmental NPC genes, such as Pax6 and Hes5, was downregulated during NPC differentiation (Figure 8D). Thus, our findings, together with the observation that Mof also binds to bivalent genes in mESCs, strongly suggest that the presence of MSL at bivalent loci is important for keeping these bivalent genes poised in pluripotent mESCs, allowing a quick transcriptional upregulation of the same genes during mESC differentiation.
In summary, MSL and NSL are key transcriptional co-activators at a large number of expressed genes in mESCs, with each complex having a distinct binding profile, either at promoters (NSL) or at gene bodies (MSL). MSL and NSL have overlapping and distinct roles in transcriptional regulation in mESCs. NSL binds mostly to genes with housekeeping functions and mediates mESC proliferation, suggesting that NSL is important for the cellular homeostasis of mESCs. MSL is the main acetyltransferase complex acetylating H4K16. Moreover, MSL binds to mESC-specific genes, which are de-regulated in Mof-ablated mESCs. In addition, MSL is present at bivalent domains in mESCs, where it may poise genes for activation during mESC differentiation. Importantly, the expression of those genes is directly regulated by MSL in differentiated NPCs. In the future, it will be interesting to investigate how the genome-wide function of MSL and NSL changes during distinct mESC differentiation pathways.
Wild-type male mESCs (E14.wt) were cultivated on 0.1% gelatine (Sigma, France) and CD1 feeder cells (37°C, 5% CO2) in DMEM (4.5 g/l glucose) with Glutamax-I, 15% ESC-tested foetal calf serum, leukemia inhibitory factor (5 μg) (Sigma), 50 mM β-mercaptoethanol (Invitrogen, France), penicillin/streptomycin (Invitrogen), 200 mM L-glutamine (Invitrogen), and non-essential amino acids (GIBCO, France). To work under feeder-free conditions, cells were treated with 1 mg/ml Collagenase (GIBCO) and 2 mg/ml Dispase (GIBCO) and cultivated for one passage without feeder cells on 0.1% gelatine (Sigma) coated plates. Experiments were conducted at passage 26–29. Mouse embryonic fibroblasts (3T3 ATCC) were cultivated in DMEM (4.5 g/l glucose), 10% newborn calf serum and gentamycin (Invitrogen).
For NPC generation, we followed the protocol of Bibel et al. (2007). Briefly, 6 × 10^6 mESCs were cultured in DMEM (4.5 g/l glucose) with Glutamax-I, 10% ESC-tested foetal calf serum, 50 mM β-mercaptoethanol (Invitrogen), penicillin/streptomycin (Invitrogen), 200 mM L-glutamine (Invitrogen), and non-essential amino acids on bacteriological Petri dishes (37°C, 5% CO2) to start differentiation. After 4 days, retinoic acid (5 μM) (Sigma) was added to induce NPC formation. Experiments were conducted 8 days after differentiation.
Polyclonal anti-Msl1 (3208) and anti-Nsl1 (3130) antibodies were generated by immunization of rabbits with the N-terminal region (amino acids 3–210) of mouse Msl1 or the C-terminal region (amino acids 762–1037) of mouse Nsl1. The fragments were amplified and cloned into the pET28b (Novagen, France) vector to express the proteins in E. coli (BL21). For primer sequences see Supplementary file 1. Polyclonal antibodies were purified through Affi-Gel columns (Bio-Rad). For WB analysis, anti-Msl1 (3208) or anti-Nsl1 (3130) antibodies were diluted 1:2000.
Nuclear extracts were prepared from 30 P15 plates of mESCs at 80% confluency as described in Demeny et al. (2007). Proteins from 3 mg (Msl1) or 1 mg (Nsl1) of mESC nuclear extract were immunoprecipitated (IP) with 100 μl protein A Sepharose beads and 20 μl of the anti-Msl1 (3208) or 20 μl of the anti-Nsl1 (3130) antibody. Antibody-protein A Sepharose beads containing the bound proteins were washed three times with IP buffer (25 mM Tris-HCl pH 7.9, 10% glycerol, 0.1% NP40, 0.5 mM DTT, 5 mM MgCl2) containing 100 mM KCl and afterwards with IP buffer containing 250 mM KCl. Proteins were eluted from the protein A Sepharose beads with 150 μl of 0.1 M Glycine pH 2.6. Elutions were neutralized by adding 50 μl of 2 M Tris pH 8.5.
MudPIT analyses were performed as previously described (Washburn et al., 2001; Florens et al., 2006). In summary, protein mixtures were TCA precipitated, urea-denatured, reduced, alkylated, and digested with endoproteinase Lys-C (Roche) followed by modified trypsin digestion (Promega). Peptide mixtures were loaded onto a triphasic 100 μm diameter fused silica microcapillary column as described (McDonald and Yates, 2002). Loaded microcapillary columns were placed in-line with a Quaternary Dionex Ultimate 3000 HPLC pump and an LTQ Velos linear ion trap mass spectrometer equipped with a nano-LC electrospray ionization source (Thermo Fisher Scientific). A fully automated 12-step MudPIT run was performed as previously described (Florens et al., 2006), during which each full MS scan (from 300 to 1700 m/z range) was followed by 20 MS/MS events using data-dependent acquisition. Proteins were identified by database searching using SEQUEST (Eng et al., 1994) within Thermo Proteome Discoverer 1.3 and 1.4 (Thermo Fisher Scientific). Tandem mass spectra were searched against a Mus musculus protein sequence database containing 16,604 entries (from the Swissprot 2013-04-03 release). In all searches, cysteine residues were considered to be fully carboxyamidomethylated (+57 Da statically added) and methionine was considered to be oxidized (+16 Da dynamically added). Proteins were considered specific in a given IP data set if they were absent from, or at least 10-fold enriched compared with, a mock IP performed on the same protein input using a non-specific antibody targeting yeast TAF90. Relative protein abundance for each protein in either the anti-Msl1 or the anti-Nsl1 IPs was estimated by the calculation of a Normalized Spectral Abundance Factor (NSAF) (Zybailov et al., 2006). NSAF values were calculated from the spectral counts of each identified protein.
To account for the fact that, following enzymatic digestion, larger proteins result in more peptides/spectra than small proteins, each given spectral count was divided by the corresponding protein length to provide a spectral abundance factor (SAF). To obtain NSAF, SAF values were normalized against the sum of all SAF values in the corresponding run. Thus, NSAF values obtained from a given protein mixture, such as immunoprecipitated protein complexes, allow the comparison of the abundance of a given protein/subunit to another in the same mixture/complex.
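The SAF and NSAF arithmetic described above can be illustrated with a minimal sketch. The protein names, spectral counts, and lengths below are hypothetical values chosen only for illustration; they are not data from this study, and the function name `nsaf` is an assumption.

```python
def nsaf(spectral_counts, lengths):
    """Return the NSAF of every protein identified in one run.

    spectral_counts: dict mapping protein name -> spectral count
    lengths: dict mapping protein name -> protein length (amino acids)
    """
    # SAF: spectral count divided by protein length
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    # NSAF: each SAF normalized against the sum of all SAF values in the run
    total_saf = sum(saf.values())
    return {p: value / total_saf for p, value in saf.items()}


# Hypothetical two-protein run (illustrative numbers only)
example_counts = {"subunit_A": 30, "subunit_B": 10}
example_lengths = {"subunit_A": 600, "subunit_B": 100}
example_nsaf = nsaf(example_counts, example_lengths)
```

In this toy run the shorter protein has the higher NSAF despite its lower raw spectral count, which is the length correction the normalization is meant to provide.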
For gel filtration a Superose 6 (10/300) column pre-equilibrated in 25 mM Tris pH 7.9, 1 mM DTT, 5 mM MgCl2, 150 mM KCl, and 5% Glycerol was used. 250 μl calibration mix containing Dextran Blue (2 MDa) and Biorad calibration kit (ref 151-1901) with marker sizes of 670 kDa, 158 kDa, 44 kDa, 17 kDa, and 1.35 kDa were injected at 0.3 μl/min. 500 μl of mESC nuclear extract containing 1 mg protein was injected and run at 0.3 μl/min. 40 fractions were collected and analysed by western blot. For western blot the anti-Mof (A300-992a; Bethyl) antibody was used.
ChIP was carried out as described previously with slight modifications (Krebs et al., 2011). At 80% confluency, mESCs were cross-linked with 1% formaldehyde for 10 min at room temperature, lysed and sheared mechanically using the Covaris E210 to obtain a chromatin fragment size of 200–500 bp. IPs were carried out using 500 μg of chromatin. For the IP, 3 μg of purified Msl1 3208 or Nsl1 3130 antibodies were used. The input was obtained from 50 μg of chromatin, pre-cleared, and directly reverse crosslinked. DNA was purified using a Qiaquick (Qiagen, France) column. Quantitative real-time PCR (qPCR) was performed with SYBR Green (Roche). Primer sequences are summarized in Supplementary file 1.
10 ng of precipitated DNA obtained from ChIP was used for Solexa sequencing. To create a genomic library, we followed the instructions of NEXTFlex v12.03 (BIO Scientific) for Msl1 and the NEBNext protocol (E6240; Biolabs) for Nsl1. Libraries were validated with the Agilent Bioanalyzer. Single-read sequencing was conducted with the HiSeq 2000. Image analysis and base calling were done with the Illumina pipeline (1.8.2). The July 2007 Mus musculus genome assembly (NCBI37/mm9) from NCBI was used for the sequence alignment by the software Bowtie (0.12.7) (Langmead, 2010). All analyses were conducted with unique reads. Bed files were used to create read density (wig) files by extending reads to 200 bp length and creating 25 bp bins. We further included the following sequencing datasets, which were obtained from Gene Expression Omnibus (www.ncbi.nlm.nih.gov/geo/), in our analysis: Input (GSM798320) (Karmodiya et al., 2012), RNA Polymerase II (GSM307623), H3K4me3 (GSM307618), H3K27me3 (GSM307619), Ezh2 (GSM327668) (Mikkelsen et al., 2007), H4K16ac (GSM1156617) (Taylor et al., 2013), and Mof (GSM915227) (Li et al., 2012). Fastq files were generated from SRA lite format and aligned to the NCBI37/mm9 assembly using Bowtie (0.12.7) (Langmead, 2010). DHS were obtained from Encode/UW (GSM1014154).
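The read-density step (extending reads to 200 bp and counting them in 25 bp bins) could be sketched as follows for a single chromosome. This is only an illustration under stated assumptions: the text does not specify how extension handles strand or how reads spanning several bins are counted, so the sketch assumes strand-aware extension from the 5' end and counts a read once in every bin it overlaps. The function name and the example reads are hypothetical.

```python
from collections import Counter


def binned_density(reads, extend_to=200, bin_size=25):
    """Count extended reads per bin on one chromosome.

    reads: iterable of (start, end, strand) with BED-style
           0-based, half-open coordinates.
    Returns a dict mapping bin start coordinate -> read count.
    """
    counts = Counter()
    for start, end, strand in reads:
        if strand == "+":
            # Assumption: extend downstream from the 5' end
            ext_start, ext_end = start, start + extend_to
        else:
            # Assumption: minus-strand reads extend towards lower coordinates
            ext_start, ext_end = max(0, end - extend_to), end
        first_bin = ext_start // bin_size
        last_bin = (ext_end - 1) // bin_size
        for b in range(first_bin, last_bin + 1):
            counts[b * bin_size] += 1
    return dict(counts)


# Two hypothetical 36 bp reads on opposite strands
example_density = binned_density([(100, 136, "+"), (400, 436, "-")])
```

The resulting dictionary maps naturally onto a fixed-step wig track with a 25 bp step, with one value per bin start.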
To detect Msl1 and Nsl1 peaks, the algorithm MACS14 (Zhang et al., 2008) was applied using default parameters with slight modifications. For Msl1 peak detection the p-value cutoff was set to 10^−5, no shifting model was built and the shift size was defined as 200. The annotation was based on the ENSEMBL 67 database (mm9). Peaks were annotated to genomic features (TSS, TTS, CDS Exons, 5′UTR, 3′UTR, Introns and intergenic) using the software HOMER (4.2) (Heinz et al., 2010) with default parameters.
To calculate the Msl1, Nsl1, or Pol II enrichment at a given gene, either the peak tag density of the peak nearest to the TSS (in a region of +2 kb) was obtained through MACS14 (Zhang et al., 2008) or the total tag density around the TSS (+2 kb) was taken. Further analysis and graphical representation were conducted using the software R.
Density profiles around the TSS and GB were obtained through seqMINER (Ye et al., 2011). For the comparison and analysis of genomic features between data sets, the software BEDTools (2.17.0) (Quinlan and Hall, 2010) was used. Scatter plots and Pearson correlations with Pearson p-values were obtained by calculating the log2 values of read densities normalized to the control at the given peaks or around ENSEMBL transcription start sites. K-means linear clustering was conducted and represented with seqMINER. Venn diagrams were generated with Biovenn (Hulsen et al., 2008). Manteia (v.2) was used for GO analysis of batch gene entries to understand the biological function (Tassy and Pourquie, 2013). Only GO levels between 1 and 10 were taken into consideration and compared between groups.
To verify the statistical significance of the obtained Msl1- or Nsl1-bound gene groups in Figure 5A,C and Figure 8B, we performed bootstrap statistical analyses. In all these analyses, we used the total pool of 26,460 ENSEMBL genes. From this pool, we randomly selected the same number of total events (genes or binding sites) as determined non-randomly in the corresponding figures (i.e., 10,600 in Figure 5A; 282 in Figure 5C and 13,505 in Figure 8). This random selection was then compared with the different given gene lists of interest (i.e., 3274, 2570, and 2185 for Figure 5A) and the number of genes (IDs) belonging to the non-random experimental group was determined. We repeated this process of random selection and gene list crossing 10,000 times and represented the number of IDs and their observed frequencies as histograms (see corresponding figure supplements). For each gene list, we computed the average (mean) and the standard deviation (sd) of the number of random matches. A z-score was computed as z = (mean − expect)/sd, where 'expect' is the number of expected interest genes. The p-values associated with these scores are indicated in the corresponding figure legends. On each histogram we indicated in bold the number of IDs found in the non-random experimental group. The p-value represents the significance of the difference between the randomly found average and the experimental ID numbers.
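The resampling procedure described above can be sketched as follows. This is a minimal illustration, not the script used for the figures: the function names, the integer gene IDs, and the small toy sizes are assumptions chosen so that the example runs quickly, and the z-score follows the formula given in the text.

```python
import random
import statistics


def bootstrap_overlap(universe_size, n_selected, interest_genes,
                      n_iter=10_000, seed=0):
    """Randomly draw n_selected genes n_iter times and count how many
    fall into the gene list of interest.

    Returns (mean, sd) of the number of random matches.
    """
    rng = random.Random(seed)
    interest = set(interest_genes)
    universe = range(universe_size)  # genes represented as integer IDs
    matches = []
    for _ in range(n_iter):
        sample = rng.sample(universe, n_selected)  # draw without replacement
        matches.append(sum(1 for gene in sample if gene in interest))
    return statistics.mean(matches), statistics.stdev(matches)


def z_score(mean, sd, expect):
    """z = (mean - expect) / sd, as written in the text."""
    return (mean - expect) / sd


# Toy example: 1000 genes, 100 drawn per iteration, 200 genes of interest
mean_matches, sd_matches = bootstrap_overlap(
    universe_size=1000, n_selected=100,
    interest_genes=range(200), n_iter=2000, seed=1,
)
example_z = z_score(mean_matches, sd_matches, expect=40)
```

With the sizes reported in the text, the same call would use `universe_size=26460` and, for example, `n_selected=10600` with `n_iter=10_000`, at a correspondingly longer run time.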
Gene expression levels are based on the ENSEMBL 67 database (mm9). Raw data of mESCs and NPCs were taken from Gene Expression Omnibus (GSE34473) and processed using the software tools TopHat (Trapnell et al., 2009) and HTSeq with default parameters. FPKM (fragments per kilobase of exon per million fragments mapped) values were calculated with Cufflinks (Roberts et al., 2011). Differentially expressed genes (DE) in mESCs and NPCs were identified with the bioconductor package DESeq (1.14.0) (Anders and Huber, 2010) using default parameters.
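For orientation, the FPKM unit named above corresponds to the following arithmetic. This sketch only illustrates the definition of the unit; it is not Cufflinks' abundance estimation procedure, and the function name and the numbers are hypothetical.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million fragments mapped."""
    kilobases = exon_length_bp / 1_000
    millions_mapped = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions_mapped)


# Hypothetical gene: 500 fragments, 2 kb of exon, 20 million mapped fragments
example_fpkm = fpkm(500, 2_000, 20_000_000)
```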
shRNA approaches were conducted with pLKO.1 puro shRNA vectors (Sigma–Aldrich, France) of the TRC2 library. For Nsl1 KD the TRCN0000241466 shRNA clone and for Msl1 KD the TRCN0000241378 shRNA clone was used. Double KD of Msl1 and Nsl1 was conducted with equal amounts of the TRCN0000241466 and TRCN0000241378 shRNA clones. As a control, the shRNA non-target control (Product No. SHC002) was applied. Production of lentiviral particles as well as infection of mESCs was conducted according to the manufacturer's protocol. 3 days after viral transfection of 2 × 10^6 mESCs, selection with puromycin (2 μg/ml) (InvivoGen) was started. Experiments were conducted 5 days after viral transfection. KD efficiency was tested at the RNA level through reverse transcriptase (RT)-qPCR (see Supplementary file 1) and at the protein level through western blot of whole protein extracts. Moreover, mRNA expression of selected genes was analysed by RT-qPCR; primer sequences are summarized in Supplementary file 1. Total RNA, which was used for gene expression profiles and cDNA synthesis, was isolated with TRIzol reagent (Invitrogen) and treated with DNAse. cDNA was synthesized with Transcriptor reverse transcriptase (Roche) using random hexamers according to the manufacturer's protocol. For normalization of protein amounts in WB analyses, the anti-Tubulin (T6557; Sigma-Aldrich) antibody and ponceau solution (Sigma-Aldrich) were applied. To analyse the pluripotency state of KD mESCs, the anti-Oct4 (611202; BD Labs) antibody was used. To analyse cell morphology, images were taken with the digital inverted EVOS XL core (Fisher Scientific, France) microscope using a 10X objective.
Histones were prepared from mESCs by lysing cells in 10 mM HEPES, pH 7.5, 1.5 mM MgCl2, 10 mM KCl, 0.5 mM DTT, 100 mM sodium butyrate, and 0.2 M HCl for 30 min on ice; lysates were centrifuged and dialysed first against 0.1 M acetic acid and then against water. Samples were analysed by western blot for histone modifications using the anti-H4K16ac (07-329; Millipore, France), anti-H4K5ac (51997; Abcam, UK), anti-H4K8ac (15823; Abcam), and anti-H3 (1791; Abcam) antibodies.
Cell growth analysis was started 6 days after lentiviral infection by plating 1 × 10^5 mESCs per well of 0.1% gelatine-coated 6-well plates in triplicate. mESCs were counted in triplicate using Neubauer cell counting chambers at the indicated time points. mESCs were split every second day to 1 × 10^5 mESCs per well.
7 days after lentiviral infection, 5 × 10^5 mESCs were resuspended in 1 ml PBS (0.1% NaCitrate and 0.1% TritonX 100). Propidium iodide (50 μg/ml) was added and, after 4 hr incubation on ice, cells were analysed with the FACSCalibur. Data were analysed using the CellQuest software.
Cell death was examined using the APOPercentage apoptosis assay (A1000/DC79; Biocolor, France) following the manufacturer's instructions. As a positive control apoptosis was induced with 10 mM hydrogen peroxide for 8 hr in sh control cells. Absorbance was read at 550 nm and normalized to the blank control (without cells).
Experiments were designed with three independent biological replicates. Biotinylated cDNA targets were prepared, starting from 150 ng of total RNA, using the Ambion WT Expression Kit (Cat 4411974) and the Affymetrix GeneChip WT Terminal Labelling Kit (Cat 900671) according to Affymetrix recommendations. Following fragmentation and end-labeling, 3 μg of cDNAs were hybridized on GeneChip Mouse Gene 2.0 ST arrays (Affymetrix, UK) for whole-transcript expression profiles. Washed and stained chips were scanned with the GeneChip Scanner 3000 7G (Affymetrix) at a resolution of 0.7 μm. Obtained raw data (.CEL intensity files) were processed with Affymetrix Expression Console software version 1.1 to calculate probe set signal intensities using Robust Multi-array Average (RMA) algorithms with default settings.
To select the DE genes, we used the fold change rank ordering statistics (FCROS) method (Dembele and Kastner, 2014). In the FCROS method, k pairs of test/control samples are used to compute fold changes (FC). For each pair of test/control samples, the FCs obtained for all genes are ranked in increasing order. The resulting ranks are associated with the genes. Then, the k ranks of each gene are used to calculate a statistic, and the resulting probability (f-value) is used to identify the DE genes with an error level of 5%.
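The ranking step of this procedure can be sketched as below. This is a simplified illustration only: it ranks the fold changes within each test/control pair and averages the k ranks of each gene, scaled by the number of genes. It does not reproduce the derivation of the f-value in Dembele and Kastner (2014); the function name and the toy fold changes are hypothetical.

```python
def average_rank_fractions(fold_changes):
    """Rank fold changes per test/control pair and average ranks per gene.

    fold_changes: dict mapping gene -> list of k fold changes,
                  one per test/control pair.
    Returns a dict mapping gene -> mean rank divided by the number of genes,
    so values near 0 or 1 indicate consistently low or high fold changes.
    """
    genes = list(fold_changes)
    n_genes = len(genes)
    k = len(fold_changes[genes[0]])
    rank_sums = dict.fromkeys(genes, 0)
    for pair in range(k):
        # Increasing order: the smallest fold change gets rank 1
        ordered = sorted(genes, key=lambda g: fold_changes[g][pair])
        for rank, gene in enumerate(ordered, start=1):
            rank_sums[gene] += rank
    return {g: rank_sums[g] / (k * n_genes) for g in genes}


# Toy data: three genes, three test/control pairs
example_fc = {
    "gene_down": [0.5, 0.6, 0.4],
    "gene_flat": [1.0, 1.1, 0.9],
    "gene_up": [2.0, 1.9, 2.2],
}
example_ranks = average_rank_fractions(example_fc)
```

Because only ranks are used, a gene is flagged by the consistency of its position across the k pairs rather than by the magnitude of any single fold change.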
Msl1 and Nsl1 ChIP-seq data sets as well as gene expression profiles of sh control, sh Msl1 and sh Nsl1 mESCs are deposited at Gene Expression Omnibus (www.ncbi.nlm.nih.gov/geo/) under the accession numbers: GSE53797 and GSE56646.
The effects of histone H4 tail acetylations on cation-induced chromatin folding and self-association. Nucleic Acids Research 39:1680–1691. https://doi.org/10.1093/nar/gkq900
Subunit composition and substrate specificity of a MOF-containing histone acetyltransferase distinct from the male-specific lethal (MSL) complex. The Journal of Biological Chemistry 285:4268–4272. https://doi.org/10.1074/jbc.C109.087981
Dosage compensation in Drosophila melanogaster: epigenetic fine-tuning of chromosome-wide transcription. Nature Reviews Genetics 13:123–134. https://doi.org/10.1038/nrg3124
Global transcription in pluripotent embryonic stem cells. Cell Stem Cell 2:437–447. https://doi.org/10.1016/j.stem.2008.03.021
An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. Journal of the American Society for Mass Spectrometry 5:976–989. https://doi.org/10.1016/1044-0305(94)80016-2
Drosophila dosage compensation: a complex voyage to the X chromosome. Development 136:1399–1410. https://doi.org/10.1242/dev.029645
The mammalian ortholog of Drosophila MOF that acetylates histone H4 lysine 16 is essential for embryogenesis and oncogenesis. Molecular and Cellular Biology 28:397–409. https://doi.org/10.1128/MCB.01045-07
Structural basis for MOF and MSL3 recruitment into the dosage compensation complex by MSL1. Nature Structural & Molecular Biology 18:142–149. https://doi.org/10.1038/nsmb.1960
A decade of histone acetylation: marking eukaryotic chromosomes with specific codes. Journal of Biochemistry 138:647–662. https://doi.org/10.1093/jb/mvi184
Current Protocols in Bioinformatics, Chapter 11, Unit 11.7. https://doi.org/10.1002/0471250953.bi1107s32
Histone acetyltransferase complexes: one size doesn't fit all. Nature Reviews Molecular Cell Biology 8:284–295. https://doi.org/10.1038/nrm2145
Developmental potential of Gcn5(−/−) embryonic stem cells in vivo and in vitro. Developmental Dynamics 236:1547–1557. https://doi.org/10.1002/dvdy.21160
Chromatin in embryonic stem cell neuronal differentiation. Histology and Histopathology 22:311–319.
Chromatin in pluripotent embryonic stem cells and differentiation. Nature Reviews Molecular Cell Biology 7:540–546. https://doi.org/10.1038/nrm1938
The transcriptional network controlling pluripotency in ES cells. Cold Spring Harbor Symposia on Quantitative Biology 73:195–202. https://doi.org/10.1101/sqb.2008.72.001
Quantitative proteomic analysis of distinct mammalian Mediator complexes using normalized spectral abundance factors. Proceedings of the National Academy of Sciences of the United States of America 103:18928–18933. https://doi.org/10.1073/pnas.0606379103
30 nm chromatin fibre decompaction requires both H4-K16 acetylation and linker histone eviction. Journal of Molecular Biology 381:816–825. https://doi.org/10.1016/j.jmb.2008.04.050
Linking global histone acetylation to the transcription enhancement of X-chromosomal genes in Drosophila males. The Journal of Biological Chemistry 276:31483–31486. https://doi.org/10.1074/jbc.C100351200
Dosage compensation: the beginning and end of generalization. Nature Reviews Genetics 8:47–57. https://doi.org/10.1038/nrg2013
hMOF histone acetyltransferase is required for histone H4 lysine 16 acetylation in mammalian cells. Molecular and Cellular Biology 25:6798–6810. https://doi.org/10.1128/MCB.25.15.6798-6810.2005
Manteia, a predictive data mining system for vertebrate genes and its applications to human genetic diseases. Nucleic Acids Research 42:D882–D891. https://doi.org/10.1093/nar/gkt807
Large-scale analysis of the yeast proteome by multidimensional protein identification technology. Nature Biotechnology 19:242–247. https://doi.org/10.1038/85686
seqMINER: an integrated ChIP-seq data interpretation platform. Nucleic Acids Research 39:e35. https://doi.org/10.1093/nar/gkq1287
Critical roles of coactivator p300 in mouse embryonic stem cell differentiation and Nanog expression. The Journal of Biological Chemistry 284:9168–9175. https://doi.org/10.1074/jbc.M805562200
Statistical analysis of membrane proteome expression changes in Saccharomyces cerevisiae. Journal of Proteome Research 5:2339–2347. https://doi.org/10.1021/pr060161n
Danny ReinbergReviewing Editor; Howard Hughes Medical Institute, New York University School of Medicine, United States
eLife posts the editorial decision letter and author response on a selection of the published articles (subject to the approval of the authors). An edited version of the letter sent to the authors after peer review is shown, indicating the substantive concerns or comments; minor concerns are not usually shown. Reviewers have the opportunity to discuss the decision before the letter is sent (see review process). Similarly, the author response typically shows only responses to the major concerns raised by the reviewers.
Thank you for sending your work entitled “MSL and NSL HAT complexes have overlapping and distinct roles, with MSL being the embryonic stem cell-specific regulator” for consideration at eLife. Your article has been evaluated by a Senior editor, a Reviewing editor, and 2 reviewers.
The Reviewing editor and the reviewers discussed their comments before we reached this decision, and the Reviewing editor has assembled the following comments to help you prepare a revised submission.
The reviewers were generally supportive of the paper but felt that there were still some fairly significant issues that would need to be addressed before the paper could be considered for publication. Most important relate to the biological functions of the complexes. What is the phenotype of the Msl1 or Nsl1 knockdown and how does this compare to Mof knockdown/depletion? Is all the Mof really in the NSL or MSL complex, as you claim? If you can address the reviewers’ comments (summarized below) in a timely manner we would be happy to consider a revised version.
1) Genetic mutation of Mof has a lethal phenotype in mice and ES cells. The authors apply depletion of Msl1 and Nsl1 but do not seem to provide data on how these affect stem cell behaviour. Self-renewal and differentiation potential need to be analysed carefully. Also if lethality is caused by loss of either complexes (NSL or MSL, the latter more likely seems to be the major factor in ES cells) then changes in gene expression could be due to cell death. This needs to be carefully considered and discussed. Alternatively, does combined Msl1 and Nsl1 depletion recapitulate the Mof mutation?
2) In Figure 2A H4K16ac shows peaks (right most H4K16ac peak) where neither Nsl1 nor Msl1 are enriched. This indicates that other complexes also act. A Mof chromatin IP would be useful to see if H4K16ac overlaps with Mof as would be expected.
3) The authors conclude that there is no free Nsl1 and Msl1 protein as all would associate in Mof complexes. I wonder if the data allows this conclusion. To make this point the authors would need to demonstrate that their measurements would in principle allow detecting 10% free Msl1 or Nsl1 protein. To conclude from “Msl1 or Nsl1, incorporate in their respective complexes with similar abundance as does Mof” is too vague to make this point.
4) The overlap between MSL/NSL and bivalent genes appears not very significant (Figure 6B – only cluster C contains bivalent genes). The large majority of Msl1 and Nsl1 bound genes are not H3K27me3. Throughout the manuscript genes are categorized but statistical testing seems to be not performed. Rigorous statistical tests and control groups need be included to establish meaningful correlations.
5) Acetylation of other residues than H4K16ac is suggested (Discussion) as a redundant mechanism. It is not clear how this speculation fits with the Mof phenotype. https://doi.org/10.7554/eLife.02104.024
1) Genetic mutation of Mof has a lethal phenotype in mice and ES cells. The authors apply depletion of Msl1 and Nsl1 but do not seem to provide data on how these affect stem cell behavior. Self-renewal and differentiation potential need to be analysed carefully. Also if lethality is caused by loss of either complexes (NSL or MSL, the latter more likely seems to be the major factor in ES cells) then changes in gene expression could be due to cell death. This needs to be carefully considered and discussed. Alternatively, does combined Msl1 and Nsl1 depletion recapitulate the Mof mutation?
As requested, we have included the following experiments in our revised manuscript to address the reviewers' first points:
a) To analyse whether Msl1 and Nsl1 play a role in the maintenance of ESC pluripotency, we have conducted single Msl1 (shMsl1) or Nsl1 (shNsl1), or combined Msl1 and Nsl1 (shMsl1/Nsl1), knockdowns (KDs) in mESCs (see new Figure 6–figure supplement 1A). First we determined the total cell numbers of ESCs in which we depleted either Msl1, or Nsl1, or both. Under these conditions we observed that especially shNsl1 and shMsl1/Nsl1 ESCs had a much slower cell proliferation rate when compared to control ESCs, whereas the morphology of these ESCs was not altered (see new Figure 6A and 6B). Next we measured whether the observed reduction of cell numbers was due to apoptosis. However, we did not find any increase in apoptotic cells under these KD conditions (new Figure 6–figure supplement 1B). Therefore, we next tested whether the KD ESCs would be blocked in any particular cell cycle phase by using FACS analyses (new Figure 6C). In agreement with the cell numbers observed above, we found that KD ESCs accumulated in the G1-phase of the cell cycle and that this increase was more severe in shNsl1 and shMsl1/Nsl1 ESCs. We have further described these new observations in the Results section “NSL influences cellular proliferation of mESC”.
b) Note, however, that we do not observe detectable changes in Oct4 (new Figure 7–figure supplement 1A) and alkaline phosphatase (data not shown) protein levels in shMsl1, shNsl1 and shMsl1/Nsl1 KD mESCs. As described in the Results and Discussion sections of our revised version, we cannot exclude that the lentiviral KDs are only partially efficient, or that “free” Mof compensates for MSL and/or NSL function (see also new Figure 1D and our answer to point 3). Even though the pluripotency state does not seem to be affected (as judged by the expression of the markers used) and the double knockdown does not entirely recapitulate the Mof KO phenotype, our new results suggest that Nsl1 is required mainly for regulating housekeeping genes involved in the cellular homeostasis of mESCs. As requested, these new observations are further discussed in the revised manuscript.
c) Since we observed MSL binding at developmental bivalent genes (Figure 5 and Figure 8A and 8B), which are also upregulated in shMsl1 mESCs (new Figure 7C), in agreement with the reviewers question, we have further analysed the differentiation potential of mESCs depleted for Msl1. For this, mESCs were differentiated into neuronal progenitor cells (NPCs) under control and Msl1 KD conditions (see new Figure 8–figure supplement 1C). First, we analysed expression profiles of bivalent as well as developmental genes, such as Pax6, Hes5, Mapt2 and Nestin, which are also considered as key markers of NPC differentiation. Importantly, our new results show that while these key developmental marker genes, including several bivalent genes, become upregulated in pluripotent mESC under Msl1 KD conditions (new Figure 8C), their expression are in contrary downregulated in NPCs in which Msl1 was silenced during NPC differentiation (new Figure 8D). Note however, that Msl1 KD cells morphologically are still able to form NPC-like cells (Figure 8–figure supplement 8D). These new results indicate the important regulatory requirement of the MSL complex for the expression of bivalent genes, known to become upregulated during cellular differentiation, in mESC and further differentiated NPCs. These new observations and figures are described in the Results section and further discussed in the Discussion section.
In conclusion, we show that Nsl1 regulates proliferation and cellular homeostasis of mESCs. Knockdown of Msl1 leads to a global loss of histone H4K16ac indicating that MSL is the main HAT acetylating H4K16 in mESCs. MSL is enriched at many mESC-specific genes, but also at bivalent domains. Interestingly, MSL is important to keep a subset of bivalent genes silent in pluripotent ESCs, while the same genes require MSL for expression during differentiation. In agreement, during neuronal differentiation MSL is essential for the regulation of key developmental genes.
2) In Figure 2A H4K16ac shows peaks (right most H4K16ac peak) where neither Nsl1 nor Msl1 is enriched. This indicates that other complexes also act. A Mof chromatin IP would be useful to see whether H4K16ac overlaps with Mof, as would be expected.
By using the commercially available anti-MOF antibody batches we obtained only very low enrichment values by ChIP-qPCR at defined genomic loci (such as the region shown in Figure 2A). The differences between our Mof ChIP-qPCR results and the available Mof ChIP-seq data might be due to different experimental setups or antibody batches.
Thus, as suggested, we included the published Mof ChIP-seq results (Li et al., 2013) in our analyses (see new Figure 2A, D, E and F). Figure 2A shows that the “right most H4K16ac peak” (with a very low tag density) in the Cdk19 gene overlaps with Mof binding. However, at this specific locus neither Nsl1 nor Msl1 can be detected. This can be explained in several ways: a) the “free” Mof detected in ESC nuclear extracts (see new Figure 1D and our answers to point 3) binds at these sites, b) the anti-mMof antibody used in the ChIP-seq study recognizes, in addition to mouse Mof, another protein (with a similar epitope) that may also bind to DNA, or c) due to their specific conformation and/or involvement in special chromatin structures, neither Msl1 nor Nsl1 can be crosslinked at these sites. Along the same lines, Straub et al. (2013, Genome Research), when comparing ChIP-chip and ChIP-seq profiles, suggested that Drosophila Msl1 and Mof might not be detectable in gene bodies by ChIP-seq approaches because of the extensive fragmentation of chromatin needed to obtain the very small DNA fragments (200 bp) that are then used for sequencing. Thus, extensive fragmentation might disrupt, at least partially, large chromatin-bound complexes, resulting in loss of ChIP-seq signal for proteins not directly associated with the chromatin. If, at certain loci, Mof were closer to the DNA than Msl1 or Nsl1 in the MSL or NSL complexes, respectively, the sole detection of Mof could be explained by the suggestion of Straub et al. (2013).
Importantly, however, our new Figures 2D, E and F together show that the genome-wide Mof binding profile at promoters and in gene bodies is very similar to Nsl1 binding (at promoters) or Msl1 binding (at regions downstream from promoters) (Figure 2E and Figure 2F).
These new results are now described in the Results section.
3) The authors conclude that there is no free Nsl1 and Msl1 protein as all would associate in Mof complexes. I wonder if the data allows this conclusion. To make this point the authors would need to demonstrate that their measurements would in principle allow detecting 10% free Msl1 or Nsl1 protein. To conclude from “Msl1 or Nsl1, incorporate in their respective complexes with similar abundance as does Mof” is too vague to make this point.
We apologize if the conclusions drawn from our mass spec results (Figure 1C) were not well explained. We have therefore improved the description of the meaning of the NSAF abundance calculations in our revised version (see Results section and Materials and methods).
Briefly, the development of non-gel-based, “shotgun” proteomic techniques such as Multidimensional Protein Identification Technology (MudPIT) has provided powerful tools for large-scale protein characterization in complex biological systems. Because large proteins contribute more peptides/spectra than small ones during the enzymatic digestion of protein mixtures for proteomic analyses, a normalized spectral abundance factor (NSAF) was defined to account for the effect of protein length on spectral count when comparing protein abundance in different samples (Zybailov et al. 2006, Journal of Proteome Research; Florens et al. 2006, Methods). NSAF is calculated as the number of spectral counts (SpC) identifying a protein, divided by the protein’s length (L), divided by the sum of SpC/L for all proteins in the experiment. Thus, NSAF allows the abundance of individual proteins to be compared across multiple independent samples and has been applied to quantify subunit abundance in various protein mixtures and in multiprotein complexes (Florens et al. 2006, Methods; Paoletti et al. 2006, PNAS; Bieniossek et al. 2013, Nature). This NSAF counting method provides an easy way of identifying proteins with similar (or different) abundances in immunoprecipitated (IP-ed) protein complexes using MudPIT. This is the method we used. As the NSAF values of Msl1 (7,8) and Mof (13,5), or Nsl1 (9,3) and Mof (14), were in a comparable range (see Figure 1C) in the anti-Msl1 or the anti-Nsl1 immunoprecipitations, respectively, we concluded that all the IP-ed Msl1 or Nsl1 was associated with comparable amounts of Mof in mESCs. If free Msl1 or Nsl1 had been present in mESCs, the NSAF values would have been much higher for Msl1 than for Mof, or for Nsl1 than for Mof, which was clearly not the case.
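The NSAF definition given above can be illustrated with a minimal sketch. The protein names, spectral counts and lengths below are hypothetical and chosen only to make the arithmetic easy to follow; they are not values from our mass spec data, and the function name `nsaf` is an assumption made for this illustration.

```python
def nsaf(spectral_counts, lengths):
    """Return the NSAF of each protein.

    NSAF = (SpC / L) / sum of (SpC / L) over all proteins in the experiment,
    where SpC is the spectral count and L is the protein length.
    """
    # Length-normalised spectral counts (SpC / L) for each protein.
    saf = {name: spectral_counts[name] / lengths[name] for name in spectral_counts}
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectra were counted, NSAF is undefined")
    # Divide by the sum over all proteins so that the values add up to 1.
    return {name: value / total for name, value in saf.items()}


# Hypothetical example (not real data): protA is twice as long as protB and
# yields twice as many spectra, so both end up with the same NSAF.
example_counts = {"protA": 120, "protB": 60, "protC": 30}
example_lengths = {"protA": 600, "protB": 300, "protC": 300}
example_nsaf = nsaf(example_counts, example_lengths)
```

In this toy case the raw spectral counts of protA and protB differ two-fold, but their NSAF values are equal, which is the length correction that makes subunits of one complex comparable.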
In addition, as requested, we further addressed the reviewers’ concern by performing a gel filtration (GF) experiment using nuclear extracts prepared from mouse ESCs (new Figure 1D). These new data further indicate that Msl1 and Nsl1 incorporate in their respective complexes (together with Mof), which elute from the GF column at their respective molecular weights (250 kDa for MSL and 760 kDa for NSL). Moreover, we did not detect any free Msl1 or Nsl1 (in the 65 kDa and 130 kDa range, respectively) in mESC nuclear extracts. Thus, we can assume that Msl1 and Nsl1 ChIP-seq profiles are representative of MSL and NSL complex binding (however, see also our answer to point 2). Moreover, our new GF results, which show elution of Mof in the 50 kDa range (without Nsl1 and Msl1), suggest the existence of a small “free” Mof pool in ESCs.
We think that we have now convincingly demonstrated (as originally stated) that Msl1 and Nsl1 are exclusively present in their respective HAT complexes, in which Mof is present at an abundance similar to that of Msl1 or Nsl1 (with a close to 1:1 subunit ratio).
4) The overlap between MSL / NSL and bivalent genes appears not very significant (Figure 6B – only cluster C contains bivalent genes). The large majority of Msl1 and Nsl1 bound genes are not H3K27me3. Throughout the manuscript genes are categorized but statistical testing seems to be not performed. Rigorous statistical tests and control groups need be included to establish meaningful correlations.
The overlap of MSL and NSL with bivalent sites is now included in our revised manuscript in Figure 8B. We focused on all MSL and NSL binding sites and analysed their overlap with H3K27me3, Ezh2 and H3K4me3, which are common markers of bivalent genes (Figure 8B). In response to the reviewers’ concerns, we have conducted a bootstrap statistical analysis, which is explained in the Materials and methods section. Indeed, our statistical analyses demonstrated that the identified bivalent genes are significantly enriched compared to a random selection of genes (see new Figure 8–figure supplement 1B). The analysis, the result and the obtained p-value are further described in the legend of Figure 8–figure supplement 1B.
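The logic of comparing an observed overlap with random selections of genes can be sketched as follows. This is only an illustrative resampling scheme under stated assumptions: random gene sets of the same size as the bound set are drawn from a background list, and the empirical p-value is the fraction of draws with at least as many bivalent genes as observed. The function name, the gene identifiers and the number of draws are hypothetical, and the exact procedure we used is the one described in the Materials and methods, which may differ in detail.

```python
import random


def enrichment_p_value(bound_genes, bivalent_genes, all_genes, n_draws=1000, seed=0):
    """Empirical p-value for enrichment of bivalent genes among bound genes.

    Draws n_draws random gene sets of the same size as bound_genes from
    all_genes and counts how often a random set contains at least as many
    bivalent genes as the bound set does.
    """
    rng = random.Random(seed)          # fixed seed so the sketch is reproducible
    bound = set(bound_genes)
    bivalent = set(bivalent_genes)
    universe = sorted(set(all_genes))  # sorted list gives a stable sampling order
    observed = len(bound & bivalent)
    hits = 0
    for _ in range(n_draws):
        draw = rng.sample(universe, len(bound))
        if sum(1 for gene in draw if gene in bivalent) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_draws + 1)


# Hypothetical toy data: 1000 background genes, 50 of them bivalent.
toy_universe = [f"g{i}" for i in range(1000)]
toy_bivalent = toy_universe[:50]
# A bound set of 50 genes, 40 of which are bivalent (strong enrichment).
toy_bound = toy_universe[:40] + toy_universe[500:510]
toy_observed, toy_p = enrichment_p_value(toy_bound, toy_bivalent, toy_universe)
# A bound set without any bivalent gene: every random draw matches or exceeds it.
null_observed, null_p = enrichment_p_value(toy_universe[500:550], toy_bivalent, toy_universe)
```

With a finite number of draws the smallest p-value that can be reported is 1/(n_draws + 1), so the number of draws sets the resolution of the test.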
5) Acetylation of residues other than H4K16 is suggested (Discussion) as a redundant mechanism. It is not clear how this speculation fits with the Mof phenotype.
We agree with the reviewers that our speculation did not fit with the Mof phenotype. Therefore, as requested, we have now deleted the original supplementary Figure 6 and have rewritten the corresponding paragraph “MSL is the main H4K16 HAT in mESCs” in the Discussion. https://doi.org/10.7554/eLife.02104.025
- Làszlò Tora
The funder had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
We are grateful to JW Conaway, ME Torres Padilla, V Pavet-Portal and D Langer for materials, helpful discussions and advice. We thank F Klein for help with the gel filtration experiment; M Gerard, D Devys, and A Krebs for critically reading the manuscript and for helpful comments; the IGBMC microarray and sequencing platform for data generation and bioinformatics support; G Duval for antibody generation; the mass-spectrometry facility; and M Hestin and G Rossi for help in ESC culturing. SR was supported by a fellowship from ARC. This work was supported by funds from CNRS, INSERM, Strasbourg University, and ANR (ANR-09-BLAN-0266; ANR-09-BLAN-0052) grants. This study was also supported by the grant ANR-10-LABX-0030-INRT, under the frame programme Investissements ANR-10-IDEX-0002-02.
- Danny Reinberg, Howard Hughes Medical Institute, New York University School of Medicine, United States
© 2014, Ravens et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.