1. Developmental Biology
  2. Stem Cells and Regenerative Medicine
Download icon

TALE factors use two distinct functional modes to control an essential zebrafish gene expression program

  1. Franck Ladam
  2. William Stanney
  3. Ian J Donaldson
  4. Ozge Yildiz
  5. Nicoletta Bobola
  6. Charles G Sagerström  Is a corresponding author
  1. University of Massachusetts Medical School, United States
  2. University of Manchester, United Kingdom
Research Article
  • Cited 0
  • Views 461
  • Annotations
Cite as: eLife 2018;7:e36144 doi: 10.7554/eLife.36144

Abstract

TALE factors are broadly expressed embryonically and known to function in complexes with transcription factors (TFs) like Hox proteins at gastrula/segmentation stages, but it is unclear if such generally expressed factors act by the same mechanism throughout embryogenesis. We identify a TALE-dependent gene regulatory network (GRN) required for anterior development and detect TALE occupancy associated with this GRN throughout embryogenesis. At blastula stages, we uncover a novel functional mode for TALE factors, where they occupy genomic DECA motifs with nearby NF-Y sites. We demonstrate that TALE and NF-Y form complexes and regulate chromatin state at genes of this GRN. At segmentation stages, GRN-associated TALE occupancy expands to include HEXA motifs near PBX:HOX sites. Hence, TALE factors control a key GRN, but utilize distinct DNA motifs and protein partners at different stages – a strategy that may also explain their oncogenic potential and may be employed by other broadly expressed TFs.

https://doi.org/10.7554/eLife.36144.001

Introduction

Many transcription factors (TFs) involved in vertebrate embryogenesis are expressed across relatively large time windows that encompass a variety of cellular and morphological changes. While it seems likely that such TFs function by the same mechanism throughout embryogenesis, there is no a priori reason that this should be the case. One group of TFs in this category is the TALE (three amino acid loop extension) family of homeodomain proteins. The TALE family includes Pbx, as well as the closely related Prep and Meis proteins (Waskiewicz et al., 2002; Deflorian et al., 2004; Pöpperl et al., 2000). Pbx and Prep/Meis were originally identified as factors that form complexes with Hox TFs to drive cell fate decisions and tissue-specific gene expression starting at gastrula/segmentation stages (reviewed in [Moens and Selleri, 2006; Ladam and Sagerström, 2014; Merabet and Mann, 2016]). Accordingly, several Hox-dependent enhancers contain regulatory elements consisting of immediately adjacent Pbx and Hox half-sites, usually of the form TGATNNAT (Pöpperl et al., 1995; Maconochie et al., 1997; Grieder et al., 1997; Ryoo and Mann, 1999), located a short distance from TGACAG (HEXA) binding sites for Prep/Meis monomers (Amin et al., 2015; Ferretti et al., 2005; Tümpel et al., 2007; Jacobs et al., 1999; Ferretti et al., 2000). TALE factors also act in complexes with other tissue-specific TFs (e.g. Pdx1 (Peers et al., 1995), Rnx (Rhee et al., 2004), MyoD (Knoepfler et al., 1999; Berkes et al., 2004), Eng (Kobayashi et al., 2003), Otx2 (Agoston and Schulte, 2009) and Pax6 [Agoston et al., 2014]) during gastrulation/segmentation stages. Additionally, TALE factors have oncogenic potential and have been implicated in various types of leukemia (Kamps and Baltimore, 1993; Nourse et al., 1990; Moskow et al., 1995). In agreement with an important developmental role, disruption of TALE function leads to severe embryonic phenotypes such that mice homozygous for null mutations in pbx1, prep1 or meis1 die in utero, while pbx3 mutants die a few days after birth (Rhee et al., 2004; Selleri et al., 2001; Fernandez-Diaz et al., 2010; Hisa et al., 2004). Similarly, disruption of the earliest expressed TALE genes in zebrafish (prep1.1, pbx2 and pbx4) produces severe embryonic defects (Deflorian et al., 2004; Waskiewicz et al., 2002; Pöpperl et al., 2000).

In spite of their function having been defined primarily at gastrula/segmentation stages, TALE factors are actually present throughout embryogenesis. In particular, zebrafish Prep and Pbx mRNA and protein is both maternally deposited and ubiquitously expressed in the later embryo (Fernandez-Diaz et al., 2010; Deflorian et al., 2004; Choe et al., 2002; Pöpperl et al., 2000; Vlachakis et al., 2000). Since all known TFs that bind TALE factors are not expressed until gastrula stages or later, it follows that TALE factors may have distinct roles prior to gastrula stages. Accordingly, Prep and Pbx can be detected at gene regulatory elements prior to the binding of their partner TFs. For instance, Prep and Pbx occupy the hoxb1a enhancer prior to Hoxb1b binding and before hoxb1a expression (Choe et al., 2014), while Pbx binds the myogenin locus before MyoD and prior to onset of myogenin expression (Berkes et al., 2004). Here, we explore the possibility that TALE factors may have uncharacterized roles during early embryogenesis. We find that maternally deposited TALE factors primarily occupy a 10 bp DECA motif at blastula stages. This motif was previously identified as a binding site for Prep:Pbx dimers (Chang et al., 1997; Knoepfler and Kamps, 1997; De Kumar et al., 2017; Laurent et al., 2015; Penkov et al., 2013), but was not assigned a biological role. We also find that these DECA sites have adjacent binding sites for the NF-Y pioneer TF and we show that TALE and NF-Y form a complex. Furthermore, TALE and NF-Y are required for the gradual transition to an active chromatin state of a gene network controlling anterior embryonic development. By segmentation stages, the binding repertoire of TALE factors expands to also include HEXA sites and PBX:HOX binding sites associated with the same gene network. Hence, TALE TFs control an anterior gene network throughout zebrafish embryogenesis, but do so by employing distinct DNA motifs and protein partners at different embryonic stages.

Results

TALE factors control a gene network regulating formation of anterior embryonic structures in zebrafish

TALE factors play a key role in early vertebrate embryogenesis, as evidenced by the phenotypes observed in TALE loss-of-function animals. In particular, loss of prep1.1, pbx2 and/or pbx4 function in zebrafish produces smaller heads and reduced eye size, as well as CNS defects – including disruptions of hindbrain segmentation – and cardiovascular defects that manifest themselves in the form of cardiac edema (Deflorian et al., 2004; Waskiewicz et al., 2002; Pöpperl et al., 2000), but the genetic basis of these defects is not well understood. In order to comprehensively identify TALE-dependent genes involved in embryogenesis, we used RNA-seq to compare gene expression in wildtype versus TALE loss-of-function animals. We focused on the function of pbx2, pbx4 and prep1.1 since these genes are ubiquitously expressed and represent the predominant TALE factors in the early zebrafish embryo (Deflorian et al., 2004; Waskiewicz et al., 2002; Pöpperl et al., 2000; Choe et al., 2002; Vlachakis et al., 2000). We used gene knock-down (KD; see Figure 1—figure supplement 1A–C for details) to generate embryos lacking Pbx and Prep function (as reported previously [Waskiewicz et al., 2002; Deflorian et al., 2004; Pöpperl et al., 2000]) and we observe the expected phenotype – including a reduced head, smaller eyes, cardiac edema, loss of pectoral fins, loss of hindbrain Mauthner neurons and disrupted cartilage formation in the head region (Figure 1—figure supplement 1A,B). Comparisons of RNA-seq data from control and TALE KD embryos at developmental stages (Figure 1A) when TALE-dependent tissues are being specified (early gastrula; 6hpf) or initiating morphogenesis (segmentation stages; 12hpf) revealed minimal gene expression changes at 6hpf (Figure 1—figure supplement 1D–F), but extensive changes at 12hpf (Figure 1B). Specifically, the expression of 671 genes (526 genes downregulated and 145 upregulated; Figure 1C) is altered in TALE KD embryos compared to control embryos at 12hpf. GO-term analysis on the genes downregulated in 12hpf TALE KD embryos revealed an enrichment for roles in embryonic development – particularly head formation, neural development (including eye and hindbrain development) and circulatory system formation (Figure 1D), consistent with the TALE KD phenotype. Furthermore, these TALE-regulated genes are enriched for transcriptional regulators and a large number encode known TFs (Figure 1D,E), suggesting that this gene set defines a gene regulatory network (GRN). Upon comparison to previously reported TALE loss-of-function phenotypes, we find that of 13 Pbx-dependent genes identified in the zebrafish retina and hindbrain (French et al., 2007), seven (egr2b, mafba, eng2b, rx2, gdf6a, hmx4, meis3) are also downregulated in our analysis. Similarly, of six genes downregulated in Prep loss-of-function zebrafish (Deflorian et al., 2004), four (pax6a, hoxb1a, hoxa2b, hoxb2a) are downregulated in our experiment. This suggests that our RNA-seq analysis captured a comprehensive set of TALE-dependent genes. We conclude that TALE TFs control a gene regulatory network (TALE GRN), which instructs anterior embryonic development and that becomes operative between 6hpf and 12hpf.

Figure 1 with 1 supplement see all
TALE factors control a gene network regulating formation of anterior embryonic structures.

See also Figure 1—figure supplement 1. (A) Schematic of zebrafish embryogenesis indicating time points used for RNA-seq and ChIP-seq analyses. The 3.5hpf time point represents a stage prior to robust zygotic gene expression, while 12hpf corresponds to the time when tissue morphogenesis is initiated. The 6hpf time point for RNA-seq was selected to capture changes in gene expression occuring shortly after ZGA. ZGA = zygotic genome activation; hpf = hours post-fertlization. (B) Scatter plot showing average TPM gene expression as identified by RNA-seq in control vs TALE KD 12hpf embryos. Genes with significant expression variation (p-adj ≤0.01) are highlighted in red. Statistical test = Wald test in DeSeq2. (C) Graph showing the number of genes up/downregulated (p-adj ≤0.01, fold-change ≥1.5) in 12hpf TALE KD samples vs control. (D) DAVID analysis of genes downregulated (p-adj ≤0.01, fold-change ≥1.5) in 12hpf TALE KD samples vs control. Note that only select categories are presented, a full list of GO terms is available in Supplementary file 3. FDR = Benjamini multiple testing False Discovery Rate. (E) Expression fold-change of select genes significantly downregulated in 12hpf TALE KD samples compared to control. Genes were selected based on their role in regulation of relevant embryonic structures.

https://doi.org/10.7554/eLife.36144.002

Genomic TALE occupancy is continuously and dynamically associated with the TALE GRN during embryogenesis

To determine how genomic TALE occupancy relates to the TALE GRN, we carried out ChIP-seq for Prep1.1 in zebrafish embryos. We assessed TALE binding both at 12hpf (early segmentation stage; when TALE-dependent gene expression is detectable; Figure 1A,B), and also at 3.5hpf (late blastula stage; prior to robust zygotic gene expression; Figure 1A, Figure 2—figure supplement 1A,B). Analysis of two biological replicates at each stage (using a cutoff of FE ≥ 10; Figure 2A, Supplementary file 1) yielded ~13,300 peaks at 3.5hpf (Prep3.5hpf) and ~24,200 peaks at 12hpf (Prep12hpf), the majority of which are located within 30 kb of a transcription start site (TSS; Figure 2B). We note that out of the 13,300 Prep3.5hpf peaks, ~60% co-localize with a Prep12hpf peak (Figure 2C), suggesting that a large fraction of binding sites remains occupied throughout embryogenesis. However, an additional ~16,500 peaks detectable at 12hpf do not co-localize with a Prep3.5hpf peak, demonstrating that additional binding sites become occupied at later stages. We refer to binding sites observed only at 12hpf as ‘12hpf-only’ (Prep12hpf-only). We noticed that although the Prep12hpf-only peaks do not co-localize with Prep3.5hpf peaks, the two types of sites nevertheless appear to be preferentially located near one another (Figure 2A). Indeed, a quantitative analysis of peak distribution revealed that 58% of all Prep12hpf-only peaks are located within 40 kb of a Prep3.5hpf peak (Figure 2D, Figure 2—figure supplement 1C).

Figure 2 with 1 supplement see all
Genomic TALE occupancy is continuously and dynamically associated with the TALE GRN during embryogenesis.

See also Figure 2—figure supplement 1. (A) Representative UCSC browser tracks illustrating Prep binding at the hoxb1a and mafba loci in 3.5 and 12hpf embryos. (B) Graph showing the distribution of Prep3.5hpf and Prep12hpf binding sites relative to TSSs. (C) Venn diagram illustrating co-localization of Prep peaks in 3.5hpf and 12hpf embryos. Two peaks are considered to co-localize if their summits are within 50 bp. (D) Chart illustrating percent of Prep12hpf-only peaks found at various distances from Prep3.5hpf peaks. (E) GO term enrichment for Prep3.5hpf and Prep12hpf-only peaks identified by GREAT using the nearest gene within 5 or 30 kb association rule. In the case of GO terms associated with genes within 30 kb, only select categories are presented, a full list of GO terms is available in Supplementary file 3. FDR = Binomial False Discovery Rate. (F) Graph showing percent of TALE GRN genes (p-adj ≤0.01, fold-change ≥1.5) associated (≤5 or 30 kb) with Prep3.5hpf and Prep12hpf-only peaks. p-values for enrichment above a random set of genes were calculated using the Pearson correlation test.

https://doi.org/10.7554/eLife.36144.004

GO-term analyses revealed that genes associated with either Prep3.5hpf or Prep12hpf-only peaks are enriched for functions related to transcriptional regulation and embryonic development – particularly neural development, but also heart and muscle formation (Figure 2E). These functions correspond well with the phenotype observed in TALE KD embryos (Figure 1—figure supplement 1A,B) and with the GO-terms associated with the TALE GRN (Figure 1D), suggesting that Prep occupancy is linked with the TALE GRN throughout embryogenesis. Accordingly, we find that ~70% (350/526) of the TALE GRN genes are located within 30 kb of a Prep3.5hpf or a Prep12hpf-only peak (Figure 2F).

We conclude that Prep occupies genomic binding sites associated with the TALE GRN as early as late blastula stages. ~60% of these sites are also occupied at segmentation stages, but by this stage a large number of additional binding sites (Prep12hpf-only sites) have become bound by Prep. Since these later sites are also associated with the TALE-GRN, Prep binding is dynamically and continuously associated with the TALE GRN during zebrafish embryogenesis.

TALE factors utilize distinct binding motifs at early versus late stages of embryogenesis

The widespread genomic binding of Prep at blastula stages has not been reported previously and we therefore examined the characteristics of these binding sites in greater detail. To this end, we used the MEME de novo motif discovery tool (Bailey et al., 2009; Machanick and Bailey, 2011) and identified a 10 bp TGATTGACAG sequence as the predominant motif centered at Prep3.5hpf peak summits (Figure 3A). This ‘DECA motif’ contains immediately adjacent Pbx and Prep half sites and was initially identified as a binding site for TALE dimers in vitro (Chang et al., 1997; Knoepfler and Kamps, 1997). Subsequently, the DECA motif has been detected at sites co-occupied by Pbx and Prep in embryonic stem cells and in the mouse trunk (Laurent et al., 2015; Penkov et al., 2013; De Kumar et al., 2017), but it has not been assigned a biological function. To test if DECA sites are co-occupied by Pbx also in the zebrafish embryo, we selected twelve binding sites and used ChIP-qPCR to assay Pbx occupancy. We find that Pbx is present at eleven of the twelve sites at 3.5hpf and that all twelve are occupied by Pbx at 12hpf (Figure 3C), revealing that Prep and Pbx co-occupy DECA sites at least through segmentation stages.

TALE factors utilize distinct binding motifs at early versus late stages of embryogenesis.

(A) Sequence logo and localization relative to Prep peak summits of sequence motifs identified by MEME at Prep3.5hpf peaks. (B) Sequence logo and localization relative to Prep peak summits of sequence motifs identified by MEME at Prep12hpf-only peaks. (C) ChIP-qPCR showing Pbx4 binding at Prep-occupied DECA sites at 3.5hpf and 12hpf, labeled with the name of the nearest gene. Data of three independent biological replicates are presented as mean fold change ± SEM of Pbx4 IP vs control IgG. Statistical test: unpaired t-test. (D) Graph showing percent of Prep3.5hpf and Prep12hpf-only peaks that contain DECA or HEXA motifs. (E) Heatmaps displaying chromatin accessibility at 4hpf (derived from ATAC-seq data [Kaaij et al., 2016]) at DECA (left panel) and HEXA (right panel) enriched peaks. (Prep3.5hpf and Prep12hpf-only peaks were used as a source of DECA- and HEXA-enriched sites, respectively.).

https://doi.org/10.7554/eLife.36144.006

Notably, the DECA motif detected at Prep3.5hpf peaks is distinct from the typical configuration of binding motifs recognized by TALE factors in their role as cooperating with tissue-specific TFs (reviewed in [Ladam and Sagerström, 2014; Merabet and Mann, 2016]). Since this role was characterized primarily at segmentation stages (Ferretti et al., 2005, 2000; Jacobs et al., 1999; Tümpel et al., 2007; Pöpperl et al., 1995), we considered the possibility that the Prep12hpf-only peaks may represent TALE factors acting together with tissue-specific TFs. Indeed, MEME analysis of Prep12hpf-only peaks returned a 6 bp TGACAG (HEXA) motif, but not the DECA motif (Figure 3B). HEXA motifs are binding sites for monomeric Prep (or Meis) factors (Chang et al., 1997; Berthelsen et al., 1998; Shen et al., 1997a) and have been found at several Hox-dependent regulatory elements (Amin et al., 2015; Ferretti et al., 2000; Ryoo et al., 1999; Jacobs et al., 1999; Tümpel et al., 2007). Accordingly, MEME also identified a TGATTTAT sequence, which represents a binding site for TALE:HOX dimers (Penkov et al., 2013; Shen et al., 1997b; Chang et al., 1996), at the Prep12hpf-only peaks (Figure 3B). This Hox motif is not located at the center of the Prep peaks, but is off-set by ~10 bp, as has been observed previously at regulatory elements where Prep/Meis acts with Hox TFs (Jacobs et al., 1999; Ferretti et al., 2005, 2000). We next examined the prevalence of the different motifs at Prep3.5hpf versus Prep12hpf-only peaks. We find that 75% of Prep3.5hpf binding sites contain a DECA motif, while only 7% of Prep12hpf-only sites do so. Conversely, 44% of all Prep12hpf-only binding sites, but only 11% of Prep3.5hpf sites, contain a HEXA motif (Figure 3D). Consistent with HEXA motifs being associated with a Prep cofactor role, we also find that PBX:HOX binding sites are more prevalent at Prep12hpf peaks (24%) than at Prep3.5hpf peaks (5%). It is surprising that HEXA sites are not occupied by Prep at blastula stages and we considered the possibility that HEXA sites may not be accessible at this stage. We made use of previously published ATAC-seq data (Kaaij et al., 2016) to examine DNA accessibility at DECA versus HEXA sites at 4hpf and find that HEXA sites are considerably more accessible than DECA sites (Figure 3E), suggesting that chromatin accessibility is not a limiting factor for Prep binding at HEXA sites in the blastula stage embryo.

While both DECA and HEXA sites have been reported previously, our data show for the first time that there is a temporal order to how TALE factors utilize these motifs during embryogenesis. Specifically, TALE factors occupy primarily DECA sites at blastula stages and these motifs remain occupied at least until segmentation stages, but by segmentation stages additional binding sites become utilized so that TALE factors also occupy HEXA motifs associated with binding sites for tissue-specific TFs such as Hox proteins.

Some TALE-occupied sites are associated with chromatin marks at Blastula stages

Previous analyses of individual DNA elements containing HEXA motifs adjacent to PBX:HOX motifs demonstrated that these act as enhancers in mouse and zebrafish (Pöpperl et al., 1995; Jacobs et al., 1999; Ferretti et al., 2005; Choe et al., 2009; Ferretti et al., 2000; Di Rocco et al., 1997; Manzanares et al., 2001; Tümpel et al., 2007; Wassef et al., 2008). Conversely, de novo motif discovery in conserved hindbrain enhancers – combined with functional testing in zebrafish – identified HEXA and PBX:HOX motifs as being essential for enhancer activity (Parker et al., 2011; Grice et al., 2015). Accordingly, we find that the Prep12hpf-only peaks are found at highly conserved regions of the genome (Figure 4—figure supplement 1A) and are associated with chromatin modifications known to mark enhancers (Figure 4—figure supplement 1B). Finally, we find that of 74 hindbrain enhancers active at 48–72hpf (Grice et al., 2015), 19 (26%; Figure 4—figure supplement 1C) are associated with a Prep12hpf-only peak. Hence, the arrangement of HEXA sites associated with PBX:HOX motifs (and other tissue-specific TF motifs) that we observe at 12hpf is very likely to represent enhancer elements.

In contrast, no biological function has yet been assigned to elements containing DECA motifs. We characterized 11 Prep-occupied DECA sites in greater detail and find that eight are associated with genomic regions conserved in five other fish species (Figure 4—figure supplement 1D). Six of these elements are also conserved in mammals, suggesting that they play an evolutionarily important role. To identify a role for these elements, we tested whether Prep3.5hpf peaks correlate with particular chromatin features by comparison to available ChIP-seq data sets from 4.5hpf blastula stage zebrafish embryos (Bogdanovic et al., 2012; Zhang et al., 2014; Lee et al., 2015). Ranking TALE-bound regions based on their level of H3K4me1 (a histone modification associated with enhancers and promoters) reveals a clear pattern (Figure 4A). In particular, K-means clustering produced four clusters of sequences, three of which (representing ~25% of all TALE-occupied sites) are highly marked by H3K4me1. To distinguish TALE-occupied sites associated with chromatin marks from sites that lack (or display very low levels of) such marks, we refer to them as MPADs (Modified Prep Associated Domains) and non-MPADs, respectively. We find that MPADs are also enriched for H3K4me3 (a mark of active promoters) and H3K27ac (a mark of active enhancers and promoters). In addition, MPADs center on nucleosome-depleted regions and are highly enriched for RNA polymerase II occupancy (Figure 4A,B). MPADs are also preferentially found within 5 kb of TSSs (Figure 4C), are enriched near genes involved in transcriptional regulation and embryonic development (Figure 4D, Supplementary file 2) and are found at conserved sites in the genome (Figure 4E). In contrast, the remaining 75% of TALE-occupied sites display only sparsely modified histones at this stage (Figure 4A). These non-MPAD sites lack a nucleosome free region (Figure 4B) and are only weakly associated with RNA Polymerase II, but they are highly methylated on CpG dinucleotides. The non-MPAD sites are mostly found at distances greater than 5 kb from TSSs (Figure 4C), associated genes are not enriched for any specific functions (Figure 4D) and they are not highly conserved (Figure 4E).

Figure 4 with 2 supplements see all
Some TALE-occupied sites are associated with chromatin marks at blastula stages and developmental control genes are enriched near MPADs displaying repressive histone modifications.

See also Figure 4—figure supplements 1 and 2. (A) Heatmaps displaying chromatin features at genomic regions occupied by Prep at 3.5hpf. H3K4me1 signals at Prep-occupied elements was analyzed by K-mean (k = 4) clustering (left panel). H3K4me3, H3K27ac, H3K27me3, nucleosome, RNA-pol2 subunit RPB1 and Methyl CpG signals are displayed based on the H3K4me1 clustering order. (B) Average nucleosome signal at MPADs and non-MPADs (as defined in A). (C) Distribution of MPADs and non-MPADs relative to TSSs. (D) GO term enrichment for MPADs and non-MPADs identified by GREAT (nearest gene within 30 kb). Note that genes associated with Class 3 MPADs or non-MPADS are not enriched for GO terms. Only select categories are presented, a full list of GO terms is available in Supplementary file 2. FDR = Binomial False Discovery Rate. (E) Conservation of 3.5hpf Prep-occupied sites among vertebrates generated using PhastCons vertebrate 8-way comparison. The score shown is the probability (0 ≤ p ≤ 1) that each nucleotide belongs to a conserved genomic element. (F) Heatmaps displaying chromatin features at MPADs. H3K27ac and H3K27me3 signals at MPADs were analyzed by K-mean (k = 4) clustering. H3K4me1, H3K4me3, nucleosome and RBP1 signals are displayed based on the H3K27ac/me3 clustering order. (G) Box plots showing average expression of genes near (≤30 kb) each of the four MPAD classes, as determined by RNA-seq on 6hpf embryos. Data are presented as log2 of mean TPM (transcripts per million) values from three biological replicates. Statistical test: pairwise comparison with Kruskal-Wallis followed by Dunn's post-hoc test.

https://doi.org/10.7554/eLife.36144.008

Prep occupancy has not been assessed in blastula stage embryos of other animal species, but previous analyses in murine embryonic stem cells (mESCs) identified Prep as bound to DECA motifs ([Laurent et al., 2015]; see also Figure 4—figure supplement 2A). We find that ~40% (1595/4008) of the Prep-associated genes in mESCs have orthologs with a nearby Prep3.5hpf peak in zebrafish (Figure 4—figure supplement 2B,C), indicating that Prep binding near developmental control genes is evolutionarily conserved. Sorting Prep-occupied regions from mESCs based on their enrichment for H3K4me1 revealed characteristics similar to those observed in zebrafish (Figure 4—figure supplement 2D,E), although there are many fewer unmodified regions in mESCs than in zebrafish embryos. Hence, at blastula stages, TALE-occupied sites can be divided into ones that are associated with various chromatin marks and are located near promoter regions of developmental control genes (MPADs), and ones that are largely devoid of histone marks and that are not associated with specific gene functions (non-MPADs).

Developmental control genes are enriched near MPADs displaying repressive histone modifications

We noticed that a subset of MPADs shows detectable enrichment for the repressive H3K27me3 histone modification (Figure 4A). To examine this finding further, we ranked MPADs based on their level of H3K27ac and H3K27me3 at blastula stages. K-means clustering divided the resulting distribution into four groups (Figure 4F). For the sake of comparison, we refer to these as Class 1–4 MPADs. In particular, MPADs with high (Class 1) and intermediate (Class 2) levels of H3K27ac are associated with high levels of H3K4me3 and RNA Pol II occupancy, while elements with low levels of H3K27ac (Class 3 and 4) are not. Notably, the subset of MPADs with the lowest level of H3K27ac are associated with high levels of H3K27me3 (Class 4). When we analyze the GO-terms of genes associated with each of the four MPAD classes, we find that H3K27me3-modified Class 4 MPADs are more highly associated with developmental control genes than are Class1-3 MPADs (Figure 4D). In agreement with the chromatin profile at MPADs, RNA-seq analysis at 6hpf (shortly after the onset of zygotic gene expression) revealed that genes associated with Class 1 and 2 MPADs are expressed at higher levels than genes associated with Class 3 and 4 MPADs (Figure 4G). Similarly, ranking MPADs from mESCs based on H3K27ac levels revealed categories analogous to those observed in zebrafish (Figure 4—figure supplement 2F,G).

Hence, MPADs can be further subdivided such that Class 1 and 2 display active chromatin marks and are found near genes expressed at 6hpf. In contrast, Class 4 MPADs are marked by H3K27me3 and are associated with genes involved in developmental processes, but these are not highly expressed at 6hpf. Class 3 MPADs are only marked by H3K4me1 and genes associated with this class show an intermediate level of expression at 6hpf, but they are not enriched for specific biological functions. We conclude that the chromatin state of MPADs correlates with the biological function of nearby genes and that developmental control genes are primarily associated with repressed (H3K27me3-modified) Class 4 MPADs.

Class 4 MPADs transition to an active chromatin state during embryogenesis

We next examined whether chromatin modifications at MPADs change as embryogenesis progresses by comparing their H3K27ac status at the blastula stage (4.5hpf) to that at late gastrula (9hpf) – when the embryonic axes have formed and organogenesis is beginning. We find that Class 1 and 2 MPADs undergo a reduction in the level of H3K27ac modification from 4.5hpf to 9hpf (Figure 5A,B), while RNA-seq at 12hpf (to capture changes in gene expression corresponding to chromatin changes at 9hpf; Figure 5C) shows that the associated genes are expressed at similar levels at 12hpf and 6hpf (Figure 5D). In contrast, Class 4 MPADs display higher levels of H3K27ac at 9hpf than at 4.5hpf and their associated genes show the greatest increase in expression between 6hpf and 12hpf. Class 3 MPADs show an intermediate effect with a small change in H3K27ac levels and a slight increase in expression of associated genes. We also find that many of the TALE-occupied regions that are sparsely modified at 4.5hpf (non-MPADs defined in Figure 4A) become more highly modified by H3K27ac as development progresses (Figure 5—figure supplement 1A,B). Genes associated with the non-MPADs undergoing the greatest increase in H3K27ac levels show the greatest increase in expression (Figure 5—figure supplement 1C) and are also enriched for functions related to later stages of embryogenesis (Figure 5—figure supplement 1D). Hence, Class 4 MPADs (and, to a lesser extent, Class 3 MPADs and non-MPADs) undergo an increase in H3K27ac and expression of the associated genes is significantly upregulated by 12hpf.

Figure 5 with 1 supplement see all
Class 4 MPADs transition to an active chromatin state during embryogenesis.

See also Figure 5—figure supplement 1. (A) Heatmap displaying the change in H3K27ac signal (log2 of fold-change) at MPADs between 4.5 and 9hpf of zebrafish embryogenesis. Ranking of MPADs is the same as in Figure 4F. (B) Average change in H3K27ac signal between 4.5hpf and 9hpf (log2 of fold-change) at MPADs. (C, D) Box plots showing expression of genes associated (≤30 kb) with each of the four MPAD classes, as determined by RNA-seq on 6hpf and 12hpf embryos. Data are presented as log2 of mean TPM values at 12hpf (C) or as log2 of mean TPM fold-change between 12hpf and 6hpf (D). Statistical test: pairwise comparison with Kruskal-Wallis followed by Dunn's post-hoc test.

https://doi.org/10.7554/eLife.36144.011

TALE factors control the chromatin state at class 4 MPADs associated with the anterior GRN

The fact that developmental control genes are associated with Class 4 MPADs suggests that the TALE GRN genes may fall into this category. Indeed, we find that TALE GRN genes are significantly associated with Class 4 (and Class 3), but not Class 1 or 2, MPADs (Figure 6A,B). A closer analysis of the TALE GRN genes associated with Class 3 and 4 MPADs revealed that they are enriched for functions related to transcriptional regulation and early embryonic processes (Figure 6C) that align well with the developmental defects observed in TALE KD embryos. In fact, 27 of the 34 TALE GRN genes associated with Class 4 MPADs encode TFs (Figure 6E) and a literature review uncovered that ~65% (22/34) have been previously implicated in the formation of embryonic structures that are affected in TALE KD embryos (Figure 6E; Supplementary file 4). These findings suggest that TALE factors act via Class 4 (and, to a certain extent, Class 3) MPADs to control a core set of TFs in the TALE GRN. To directly test this possibility, we assessed whether TALE factors are required for the expression of MPAD-associated genes by 12hpf. We find that expression of genes associated with Class 1 and 2 MPADs is relatively insensitive to TALE KD, while genes associated with Class three and, in particular, Class 4 MPADs are downregulated in TALE KD embryos (Figure 6D,E). Since Class 4 MPADs show an increase in H3K27ac between 6hpf and 9hpf (Figure 5A), we examined the impact of TALE TFs on 9hpf H3K27ac levels. Using ChIP-qPCR, we find that H3K27ac levels are reduced at 57% (4/7) of TALE GRN-associated Class 4 MPADs in TALE KD embryos (Figure 6F). These findings indicate that TALE factors act by regulating a chromatin transition – from repressive chromatin in blastula stage embryos to active chromatin in segmentation stage embryos – at a core set of genes encoding TFs that direct primarily anterior development in the zebrafish embryo.

TALE factors control the chromatin state at Class 4 MPADs associated with the TALE GRN.

(A) Localization of TALE KD downregulated genes (p-adj ≤0.01, fold-change ≥1.5) relative to MPADs. The number of TALE-dependent genes within 30 kb of MPADs is indicated above each bar. p-values for enrichment above a random set of genes were calculated using the Pearson correlation test. (B) Representative UCSC browser tracks of the zic5 locus illustrating the position of a Class 4 MPAD and histone modifications in 4.5hpf and 9hpf embryos. (C) DAVID analysis of TALE KD downregulated genes (p-adj ≤0.01, fold-change ≥1.5) near Class 3 and 4 MPADs. Note that only select categories are presented, a full list of GO terms is available in Supplementary file 3. FDR = Benjamini multiple testing False Discovery Rate. (D) Box plots showing change in expression of genes near (≤30 kb) each of the four MPAD classes, as determined by RNA-seq at 12hpf. Data are presented as log2 of mean TPM fold-change between TALE KD and control. Statistical test: pairwise comparison with Kruskal-Wallis followed by Dunn's post-hoc test. (E) Graph showing the TPM expression fold-change in TALE KD vs control 12hpf embryos for all TALE dependent genes (n = 34) near (≤30 kb) Class 4 MPADs. Genes in red control the formation of structures affected by TALE KD (see Supplementary file 4). (F) H3K27ac/Histone H3 signal ratio at Class 4 MPADs as determined by ChIP-qPCR in 9hpf control vs TALE KD embryos. MPADs are labeled with the name of the nearest TALE-dependent gene. Data of three independent biological replicates are presented as mean fold change ± SEM of TALE KD vs control. Statistical test: unpaired t-test.

https://doi.org/10.7554/eLife.36144.013

NF-Y proteins regulate TALE GRN expression and form complexes with TALE factors

Since TALE factors commonly function in complexes with other TFs, it is possible that they have novel interaction partners when bound at DECA motifs. Indeed, the DREME discovery tool detected three motifs in addition to the DECA motif at Prep3.5hpf peaks (Figure 7A). We cannot confidently assign a TF to the AT(A/G)TTAA motif, and the CC(C/A)C(G/A)CCC motif could bind any member of the large Sp/Klf family. The CCAAT motif was detected in a previous Prep ChIP-seq analysis (Penkov et al., 2013), but it was not pursued further. In our analysis, DREME predicted this motif to be selective for the NF-Y transcription factor (Dolfini et al., 2009). While the other motifs are enriched at both Prep3.5hpf and Prep12hpf-only peaks, the NF-Y motif is specifically enriched at Prep3.5hpf peaks (Figure 7B). NF-Y is also maternally deposited in zebrafish (Figure 7—figure supplement 1A), consistent with a joint role for TALE and NF-Y factors at blastula stages. Using ChIP-qPCR, we tested 15 TALE-occupied sites with nearby CCAAT motifs and detect NF-Y binding at nine of them (Figure 7C), demonstrating that co-occupancy is relatively frequent. Accordingly, using ChIP-seq data from mESCs (Oldfield et al., 2014), we find that ~50% of all Prep peaks are found near NF-Y peaks also in this cell type (Figure 7D), demonstrating that co-localization of TALE and NF-Y TFs is evolutionarily conserved.

Figure 7 with 1 supplement see all
NF-Y proteins regulate TALE GRN expression and form complexes with TALE factors.

See also Figure 7—figure supplement 1. (A) Sequence logo and localization relative to Prep peak summits of motifs identified by DREME at Prep3.5hpf peaks. (B) Enrichment of motifs in Prep3.5hpf and Prep12hpf-only peaks as defined by AME. p-values for enrichment above random occurrence (3.5hpf and 12hpf-only columns) or between two Prep peak populations (3.5hpf vs 12hpf-only and 12hpf-only vs 3.5hpf columns) were calculated using the ranksum test in AME. Motifs are represented in IUPAC code (M = A or C; R = A or G). (C) ChIP-qPCR showing NF-YB binding at CCAAT motif-containing MPADs in 9hpf embryos. MPADs are labeled with the name of the nearest gene. Data of three independent biological replicates are presented as mean fold change ± SEM of NF-YB IP vs control IgG. Statistical test: unpaired t-test. (D) Venn diagram illustrating the overlap of Prep and NF-YB peaks in mESCs. Two peaks are considered to overlap if their summits are within 500 bp. (E) RT-qPCR analysis of gene expression in 12hpf NF-YDN injected embryos. Results are shown as gene expression fold-change in NF-YDN vs control for select TALE-dependent genes. Data of three independent experiments are presented as mean fold change ± SEM of NF-YDN injected vs control embryos. Statistical test = unpaired t-test. (F) H3K27ac/Histone H3 signal ratio at MPADs (labeled with the name of the nearest gene) as determined by ChIP-qPCR in 9hpf control vs NF-YDN injected embryos. Data of three independent biological replicates are presented as mean fold change ± SEM of NF-YDN vs control. Statistical test: unpaired t-test. (G) Co-IP experiments showing interaction of Myc-Prep (left panels) and HA-Pbx4 (right panels) with Flag-NF-YB in transfected HEK293 cells. HC = Ig heavy chain. Asterisks indicate non-specific signal. (H) Model diagram. At blastula stages (left side) TALE binds DECA motifs (TGATTGACAG) near NF-Y motifs (CCAAT). At this stage, most binding sites are occupied by nucleosomes and those associated with developmental control genes are marked by H3K27me3 (red lollipops). Binding of TALE and NF-Y leads to deposition of H3K27ac (green lollipops) and improved accessibility. At segmentation stages (right side), TALE continues to bind DECA motifs near NF-Y motifs, but Prep also binds HEXA motifs (TGACAG) near PBX:HOX motifs (TGATTTAT). Most of the HEXA motifs lack nucleosomes and are found within 40 kb of a DECA/NF-Y site (indicated by dashed connecting line). At this stage, developmental control genes are marked by H3K27ac and are expressed.

https://doi.org/10.7554/eLife.36144.014

The role for NF-Y in embryogenesis is not well characterized, but it has been reported that mice mutant for nf-ya (the DNA binding subunit of the NF-Y complex) die in utero prior to embryonic day 8.5 (Bhattacharya et al., 2003), consistent with a role for NF-Y in early embryogenesis. Furthermore, a study targeting zebrafish nf-yb with antisense morpholino oligos described a relatively mild head phenotype that was attributed to defective cartilage formation (Y.-H. Chen et al., 2009). Using a previously reported dominant negative construct (NF-YDN [Nardini et al., 2013; Mantovani et al., 1994]) to disrupt NF-Y function, we observe a small head, as well as defects in development of the eyes, heart and tail (Figure 7—figure supplement 1B). The effect of the NF-YDN is somewhat more severe than that resulting from TALE KD (Figure 1—figure supplement 1A,B), but the two phenotypes share some features – including smaller head and eyes, as well as cardiac edema – suggesting that NF-Y may also regulate the expression of genes in the TALE GRN. To test this, we analyzed expression of 21 TALE-dependent genes associated with Class 4 MPADs (out of the 34 such genes identified in Figure 6A; six of these were also confirmed as associated with NF-Y occupancy in Figure 7C) and find that 18 (86%) are downregulated upon NF-Y disruption (Figure 7E). Furthermore, NF-Y disruption leads to a decrease in H3K27ac at MPADs associated with these genes (Figure 7F), similar to our observation following disruption of TALE function (Figure 6F). A shared role for TALE and NF-Y factors in controlling H3K27ac may be broadly relevant at the blastula stage, since we find that TALE peaks with adjacent CCAAT motifs are generally associated with higher levels of H3K27ac and lower levels of H3K27me3 than TALE peaks that lack a nearby CCAAT box (Figure 7—figure supplement 1C). We do not find any differences in the distribution of NF-Y motifs among the various MPAD classes, suggesting that NF-Y is generally associated with TALE occupancy (Figure 7—figure supplement 1D). We noticed from our bioinformatics analysis that NF-Y sites occur very close to DECA sites, with the average spacing being ~20 bp (Figure 7A), raising the possibility that NF-Y may physically interact with TALE proteins. Since Prep:Pbx is a heterodimer and NF-Y is a heterotrimeric TF, we tested the ability of Prep and Pbx to bind NF-YA and/or NF-YB in pairwise combinations by co-immunoprecipitation from transfected HEK293 cells. In this context, we find that both Prep and Pbx interact with the NF-YB (Figure 7G) and NF-YA (Figure 7—figure supplement 1F) subunits, indicating that Prep:Pbx and NF-Y can form complexes. We conclude that NF-Y binds adjacent to TALE factors at DECA sites and that both factors are required for regulation of the TALE GRN, possibly by functioning in a complex.

As discussed above, genomic elements containing HEXA and PBX:HOX motifs have been shown to function as enhancers (Pöpperl et al., 1995; Jacobs et al., 1999; Ferretti et al., 2005; Choe et al., 2009; Ferretti et al., 2000; Di Rocco et al., 1997; Manzanares et al., 2001; Tümpel et al., 2007; Wassef et al., 2008), but it is not clear if elements containing DECA and NF-Y sites have such activity. In particular, most TALE GRN genes associated with Class 4 MPADs have tissue-specific expression patterns, but the TALE and NF-Y factors are ubiquitously expressed, suggesting that genomic elements containing only DECA and NF-Y sites may not be sufficient to drive gene expression. Accordingly, by testing seven DECA and NF-Y site-containing genomic elements for enhancer activity in HEK293 cells, we find that only one drives luciferase reporter expression (Figure 7—figure supplement 1E). This finding is consistent with previous reports that mis-expression of TALE factors in zebrafish embryos does not cause developmental defects (Vlachakis et al., 2001; Choe et al., 2002) and suggests that elements containing DECA and NF-Y sites function together with other regulatory elements that provide tissue-specific input (see Discussion).

Discussion

In its previously defined role as acting in complexes with Hox TFs, Prep binds at monomeric HEXA sites near binding sites for Pbx:Hox dimers to control gene expression (Ferretti et al., 2005; Tümpel et al., 2007; Jacobs et al., 1999; Ferretti et al., 2000; Amin et al., 2015). Accordingly, our analysis detected HEXA motifs with nearby PBX:HOX motifs at Prep binding sites associated with a TALE-dependent anterior GRN in segmentation stage (12hpf) zebrafish embryos. Strikingly, we find that TALE-occupancy is associated with this GRN already at blastula stages (3.5hpf), but at this stage TALE factors instead utilize DECA sites (consisting of immediately adjacent Pbx and Prep sites). We also discovered that NF-Y binds CCAAT motifs near DECA sites and forms complexes with TALE factors. Finally, we demonstrate that TALE and NF-Y are both required for the transition to an active chromatin profile at GRN-associated genes. Hence, TALE factors control an anterior GRN throughout embryogenesis, but the choice of binding motifs and partner proteins varies such that TALE factors interact with NF-Y at DECA sites starting at blastula stages and then expand their binding repertoire to also include HEXA sites, where they interact with Pbx:Hox dimers, by segmentation stages (see summary model in Figure 7H).

Although DECA sites were identified previously (Penkov et al., 2013; De Kumar et al., 2017; Laurent et al., 2015; Knoepfler and Kamps, 1997; Chang et al., 1997), they have not been assigned a biological function. Our experiments now reveal that genomic elements containing DECA and NF-Y motifs may not be sufficient to act as enhancers. Instead, TALE and NF-Y bind many of these elements prior to the appearance of active chromatin marks. Indeed, we note that many genomic loci bound by Prep at 3.5hpf are highly occupied by nucleosomes (Figures 3E and 4A), indicating that Prep can access its binding sites in compacted embryonic chromatin. Furthermore, we find that TALE factors are required for the deposition of H3K27ac marks at these elements (Figure 6F). This may be a general function of TALE factors since several TALE proteins bind CBP (Choe et al., 2009; Saleh et al., 2000) – the enzyme responsible for H3K27 acetylation (Tie et al., 2009) – and Pbx reportedly promotes active chromatin in a breast cancer cell line (Magnani et al., 2011). Additionally, NF-Y contains a histone-fold and makes both specific and non-specific contacts with DNA (Nardini et al., 2013), suggesting that NF-Y may access its binding site by displacing histones. Hence, the joint activity of TALE and NF-Y may represent a pioneer function (Iwafuchi-Doi and Zaret, 2016) that permits access to DECA/NF-Y sites in compacted chromatin (see summary model in Figure 7H). Although only ~50% of TALE-occupied sites are associated with a NF-Y motif at 3.5hpf, there are also nearby motifs for SP/KLF (Figure 7A) and KLF4 is a pioneer factor (Soufi et al., 2015) that binds TALE proteins (Bjerke et al., 2011), suggesting that TALE proteins may act together with various other TFs in a pioneer role at DECA sites.

We find that many of the TALE-dependent genes identified by our analysis are expressed in the anterior embryo. Since TALE and NF-Y factors are present ubiquitously, this suggests that additional tissue-restricted inputs are required to achieve spatially appropriate expression of these genes during embryogenesis. We therefore hypothesize that TALE and NF-Y pioneer activity is required for nearby tissue-specific enhancers to become functional (see summary model in Figure 7H). In fact, the additional Prep-occupied sites that emerge by 12hpf may represent such tissue-specific enhancers. Some of these sites contain monomeric HEXA motifs near PBX:HOX motifs in an arrangement found at many hindbrain enhancers (Grice et al., 2015) and they are enriched near DECA/NF-Y sites. These 12hpf Prep sites contain not only PBX:HOX binding sites, but also motifs for other tissue-specific TFs (such as myogenic factors) indicating that DECA/NF-Y motifs may play a general role in promoting access to enhancers. We also note that TALE factors arose prior to Hox genes in evolution (Bürglin and Affolter, 2016; Hrycaj and Wellik, 2016; Holland, 2013), suggesting that TALE activity at DECA sites may represent an original function and that TALE factors may have been subsequently co-opted to function together with tissue-specific TFs.

Maternally deposited material controls embryonic development in zebrafish until 3hpf-4hpf. Indeed, TALE and NF-Y are maternally deposited in zebrafish ([Deflorian et al., 2004; Choe et al., 2002; Waskiewicz et al., 2002; Chen et al., 2009]; Figure 2—figure supplement 1A,B; Figure 7—figure supplement 1A) and by 3.5hpf – the stage when we carried out our ChIP-seq analysis – zygotic Prep, Pbx and NF-Y expression is not yet detectable (Figure 2—figure supplement 1A, Figure 7—figure supplement 1A). Hence, the initial activity of TALE and NF-Y at DECA/NF-Y sites at 3.5hpf is likely maternally directed, while DECA/NF-Y sites and HEXA/PBX:HOX sites detected at 12hpf are more likely occupied by zygotically produced factors. Differences between maternally and zygotically controlled stages of embryogenesis may also explain why Prep binds HEXA sites efficiently at 12hpf, but not at 3.5hpf. Specifically, it is possible that Prep cannot bind HEXA sites as a monomer but requires the cooperation of tissue-specific TFs (such as Hox proteins) that are not present maternally. Indeed, our recent work demonstrated that binding of Meis proteins (that are closely related to Prep proteins) to HEXA motifs is stabilized by Hox proteins in segmentation stage mouse embryos (Amin et al., 2015).

Prep binds many genomic loci in the 3.5hpf embryo and these sites display diverse chromatin states, such that Class 1 and 2 MPADs are associated with genes expressed by 6hpf, Class 4 MPADs with genes expressed by 12hpf and non-MPADs with genes expressed at later stages of embryogenesis (Figures 4F–G and 5C–D, Figure 5—figure supplement 1). While our functional analysis indicates that primarily genes associated with Class 4 MPADs are affected by TALE KD (Figure 6D), this is likely a result of our choosing the 12hpf timepoint for RNA-seq. Indeed, we show that non-MPADs continue to transition to an active chromatin state at least until 24hpf (Figure 5—figure supplement 1), but any genes that become expressed as a result of this transition would not have been detected by our analysis. For instance, muscle differentiation involves TALE function (Berkes et al., 2004; Knoepfler et al., 1999) and Prep peaks are found near genes involved in myogenesis (Figure 2E). Although expression of myogenic genes is somewhat affected in TALE KD embryos (Figure 1D,E) much of muscle differentiation takes place after 12hpf suggesting that this expression effect would be more pronounced at later stages. Accordingly, the effect of TALE factors at Class 3 MPAD-associated genes is less pronounced (Figure 6D), possibly because these genes are involved in muscle development (Figure 6C). Genes associated with Class 1 and 2 MPADs are only mildly TALE-dependent (Figure 6D). Strikingly, ~70% of ‘first-wave’ genes (ones activated by maternal factors in the early zygote [Lee et al., 2013]) are located near Prep peaks (Figure 7—figure supplement 1G) – particularly near Class 1 and 2 MPADs (Figure 7—figure supplement 1H) – but expression of these genes is not affected by TALE KD (Figure 1—figure supplement 1D–F). The reason for this is not clear, but the pluripotency factors Nanog, Pou5fl and SoxB1 are required for expression of first-wave genes (Leichsenring et al., 2013; Lee et al., 2013) and may act redundantly with TALE and NF-Y at these early stages. Accordingly, our RNA-seq analysis found that expression of nanog, pou5fl and soxB1 is not disrupted in TALE KD embryos. Alternatively, the onset of the knockdown effect may be delayed, preventing it from disrupting early TALE activity required for first-wave gene expression.

Lastly, TALE factors act as oncogenes in several systems and have been specifically implicated in various types of leukemia (Kamps and Baltimore, 1993; Nourse et al., 1990; Moskow et al., 1995). Their oncogenic potential has generally been considered in the context of their action as transcription cofactors to Hox proteins (Eklund, 2007). Our finding that TALE factors use additional binding motifs and interaction partners, as well as their ability to promote an active chromatin state, suggests that this model should be expanded to also consider non Hox-related mechanisms for TALE factor-mediated leukemogenesis.

Materials and methods

Key resources table
Reagent type
or resources
DesignationSource or referenceIdentifierAdditional
information
AntibodyRabbit polyclonal
anti-Prep
(Choe et al., 2014)N/A
AntibodyRabbit polyclonal
anti-Pbx4
(Choe et al., 2014)N/A
AntibodyRabbit polyclonal
anti-NF-YB
Santa-Cruzsc13045RRID:AB_2152107
AntibodyRabbit polyclonal
anti-H3K27ac
Abcamab4729RRID:AB_2118291
AntibodyRabbit polyclonal
anti-Histone H3
Abcamab1791RRID:AB_302613
AntibodyMouse monoclonal
anti-Myc
Roche11667149001RRID:AB_390912
AntibodyMouse monoclonal
anti-Flag
Sigma-AldrichF3165RRID:AB_259529
AntibodyRabbit polyclonal
anti-HA
Abcamab9110RRID:AB_307019
AntibodyRabbit polyclonal
anti-IgG control
Abcamab46540RRID:AB_2614925
AntibodyMouse polyclonal
anti-IgG control
Millipore12-371bRRID:AB_2617156
AntibodyAnti-mouse IgG,
HRP-linked secondary
antibody
GE healthcareLNA91V/AG
AntibodyAnti-mouse IgG, Alexa
Fluor 488 conjugated
secondary antibody
Molecular ProbesA11001RRID:AB_2534069
AntibodyMouse monoclonal
3A10
Developmental Studies
Hybridoma bank
531874RRID:AB_531874
AntibodyAnti-rabbit IgG,
HRP-linked secondary
antibody
Jackson Laboratories211-032-171RRID:AB_2339149
AntibodyLipofectamine 2000Invitrogen52887
Strain, strain
background (E. coli)
Subcloning Efficiency
DH5α Competent Cells
ThermoFisher Scientific18265017
Chemical compound,
drug
4-ThiouridineSanta-Cruzsc204628
Chemical compound,
drug
EZ-Link HPDP-BiotinPierce21341
peptide, recombinant
protein
Dynabeads MyOne
Streptavidin C1
ThermoFisher Scientific65001
peptide, recombinant
protein
Protein-A DynabeadsThermoFisher Scientific10001D
Commercial assay
or kit
TruSeq ChIP Library
Preparation Kit
IlluminaIP-202–1012
Commercial assay
or kit
TruSeq Stranded mRNA
LT sample prep Kit
IlluminaRS-122–2101
Commercial assay
or kit
mMESSAGE mMACHINE
SP6 Transcription Kit
ThermoFisher ScientificAM1340
Commercial assay
or kit
Q5 Site-Directed
Mutagenesis Kit
New England BiolabsE0554S
OtherPrep ChIP-seq and
Inputs in 3.5hpf
zebrafish embryos
This paperGEODeposited data
OtherPrep ChIP-seq and
Inputs in 12hpf
zebrafish embryos
This paperGEODeposited data
OtherTALE knock-down and
control RNA-seq in
6hpf zebrafish embryos
This paperGSE102662Deposited data
OtherTALE knock-down and
control RNA-seq in
12hpf zebrafish embryos
This paperGSE102662Deposited data
OtherPrep1 ChIP-seq and
Inputs in mESCs,
WIG files
(Laurent et al., 2015)GSM1545025 and GSM1545026Deposited data
OtherATAC-seq in 4hpf zebrafish
embryos, fastq files
(Kaaij et al., 2016)SRR2747531Deposited data
OtherH3K4me1 ChIP-seq in 4.5hpf
zebrafish embryos, WIG files
(Bogdanovic et al., 2012)GSM915193Deposited data
OtherH3K4me3 ChIP-seq in 4.5hpf
zebrafish embryos, WIG files
(Bogdanovic et al., 2012)GSM915189Deposited data
OtherH3K27ac ChIP-seq in 4.5hpf
zebrafish embryos, WIG files
(Bogdanovic et al., 2012)GSM915197Deposited data
OtherH3K27ac ChIP-seq in 9hpf
zebrafish embryos, WIG files
(Bogdanovic et al., 2012)GSM915198Deposited data
OtherH3K27ac ChIP-seq in 24hpf
zebrafish embryos, WIG files
(Bogdanovic et al., 2012)GSM915199Deposited data
OtherH3K27me3 ChIP-seq in 4.5hpf
zebrafish embryos, WIG files
(Zhang et al., 2014)GSM1081557Deposited data
OtherMNase-seq in 4.5hpf zebrafish
embryos, WIG files
(Zhang et al., 2014)GSM1081554Deposited data
OtherRNA-Pol2 ChIP-seq in 4.5hpf
zebrafish embryos, WIG files
(Zhang et al., 2014)GSM1081560Deposited data
OtherMeDIP-seq (Methyl CpG) in
4.5hpf zebrafish embryos,
BedGraph files
(Lee et al., 2015)GSM1274386Deposited data
OtherNF-YA ChIP-seq in mESCs(Oldfield et al., 2014)GSM1370111Deposited data
OtherH3K4me1 in mESCs,
BigWig files
ENCODE
www.encodeproject.org
GSM1000121Deposited data
OtherH3K4me3 in mESCs,
BigWig files
ENCODE
www.encodeproject.org
GSM1000124Deposited data
OtherH3K27ac in mESCs,
BigWig files
ENCODE
www.encodeproject.org
GSM1000126Deposited data
OtherH3K27me3 in mESCs,
BigWig files
ENCODE
www.encodeproject.org
GSM1000089Deposited data
OtherDNase-seq in mESCs,
BigWig files
ENCODE
www.encodeproject.org
GSM1014154Deposited data
OtherMeDIP-seq (Methyl CpG)
in mESCs
(C.-C. Chen et al., 2013)GSM859494Deposited data
Cell line (Human)HEK-293T cellsATCCATCC CRL-3216RRID:CVCL_0063
Strain, strain
background (Zebrafish)
strain EKWEkkwill breedershttp://www.ekkwill.com/
OtherOligonucleotidesSee Supplementary file 5
Recombinant
DNA
6xMyc-Prep1.1 in
PCS2 + MT
(Choe et al., 2002)N/A
Recombinant
DNA
HA-Pbx4 in PCS2+(Choe et al., 2009)N/A
Recombinant
DNA
Flag-NF-YA in PCS2+This PaperN/A
Recombinant
DNA
Flag-NF-YB in PCS2+This PaperN/A
Recombinant
DNA
NF-YDN in PCS2+This paperN/A
Recombinant
DNA
pGL3-Promoter vectorPromegaE1761
Recombinant
DNA
Tle3 element in pGL3
Promoter vector
This paperN/A
Recombinant
DNA
Pax5 element in pGL3
Promoter vector
This paperN/A
Recombinant
DNA
Prdm14 element in pGL3
Promoter vector
This paperN/A
Recombinant
DNA
Tcf3a element in pGL3
Promoter vector
This paperN/A
Recombinant
DNA
Her6 element in pGL3
Promoter vector
This paperN/A
Recombinant
DNA
Dachb element in pGL3
Promoter vector
This paperN/A
Recombinant
DNA
Fgf8 element in pGL3
Promoter vector
This paperN/A
Recombinant
DNA
pGL3-Control vectorPromegaE1741
Software, algorithmFastQCBabraham Institutehttps://www.bioinformatics.babraham.ac.uk/projects/fastqc/RRID:SCR_014583
Software, algorithmFastQ ScreenBabraham Institutehttps://www.bioinformatics.babraham.ac.uk/projects/fastq_screen/RRID:SCR_000141
Software, algorithmTrimmomatic 0.32(Bolger et al., 2014)https://github.com/timflutre/trimmomaticRRID:SCR_011848
Software, algorithmBowtie 2.2.3(Langmead and Salzberg, 2012)https://github.com/BenLangmead/bowtie2RRID:SCR_005476
Software, algorithmSAMtools 0.1.19(Li et al., 2009)https://github.com/samtools/samtoolsRRID:SCR_002105
Software, algorithmMACS 2.1.0.20140616(Zhang et al., 2008)https://github.com/taoliu/MACS
Software, algorithmRSEM 1.2.28 in the Dolphin
interface of University of
Massachuetts Worcester
Biocore
(Li and Dewey, 2011)http://www.umassmed.edu/biocore/introducing-dolphin/RRID:SCR_013027
Software, algorithmDESeq2 in the Dolphin
interface of University of
Massachuetts Worcester
Biocore
(Anders and Huber, 2010)http://www.umassmed.edu/biocore/introducing-dolphin/RRID:SCR_015687
Software, algorithmGalaxy web interface(Goecks et al., 2010)https://usegalaxy.orgRRID:SCR_006281
Software, algorithmBedTools in galaxy(Quinlan and Hall, 2010)https://usegalaxy.orgRRID:SCR_006646
Software, algorithmDeepTools in galaxy(Ramírez et al., 2014)https://usegalaxy.org
Software, algorithmMEME-ChIP(Machanick and Bailey, 2011;
Bailey et al., 2009)
http://meme-suite.org/tools/meme-chipRRID:SCR_001783
Software, algorithmDAVID 6.8(Huang et al., 2009b2009a)https://david.ncifcrf.gov/RRID:SCR_001881
Software, algorithmGREAT 3.0.0(McLean et al., 2010;
Hiller et al., 2013)
http://bejerano.stanford.edu/great/public/htmlRRID:SCR_005807
Otheranti-Prep1.1 morpholino
oligonucleotide
Gene Tools, LLCN/A
Other5'-TGGACACAGACTGGGCAG
CCATCAT-3'Fluorescein
(Deflorian et al., 2004)
Otheranti-Pbx2 morpholino
oligonucleotide
Gene Tools, LLCN/A
Other5'-CCGTTGCCTGTGATG
GGCTGCTGCG-3'
(Erickson et al., 2007)
Otheranti-Pbx4 morpholino
oligonucleotide
Gene Tools, LLCN/A
Other5'-AATACTTTTGAGCCGA
ATCTCTCCG-3'
(Erickson et al., 2007)

Animal care

All procedures on zebrafish adults and embryos were approved by the University of Massachusetts Institutional Animal Care and Use Committee (IACUC). EKW zebrafish were kept in groups of 10 individuals under constant water flow at 28°C. To collect embryos, 2 males and three females were crossed for 30 min. Subsequently, the embryos were collected in egg water (60 ug/ml of instant ocean salts, 0.0002% methylene blue). After 2 hr, dead and un-fertilized embryos were manually removed and the remainder left to develop until they reached the appropriate developmental stage and then used in the experimental procedures described below.

Interference with protein function in embryos

Injection of capped messenger RNAs encoding an NF-Y or a Prep/Meis dominant negative protein (NF-YDN and PBCAB, respectively [Mantovani et al., 1994; Choe et al., 2002]) or a cocktail of morpholino antisense oligonucleotides directed against the TALE proteins, were used to interfere with NF-Y and TALE function. TALE knockdown was achieved by injection of antisense morpholino oligos (MOs) targeting pbx2, pbx4 and prep1.1 as reported previously (Deflorian et al., 2004; Waskiewicz et al., 2002). The use of MOs is necessitated by the fact that mutant lines are not available for all TALE factors, and the existing mutants are embryonic lethal. Hence, MOs allow us to produce the large number of embryos required for RNA-seq and ChIP-qPCR experiments. Importantly, the phenotype of pbx4 MO-injected embryos is indistinguishable from that of pbx4 mutant embryos (Waskiewicz et al., 2002), demonstrating that pbx4 MOs are specific. prep1.1 MOs produce the same phenotype as pbx4 mutants (Deflorian et al., 2004), as expected of proteins acting together in a dimer. prep1.1 MOs also produce the same phenotype as embryos injected with a dominant negative construct disrupting Prep/Meis function (Choe et al., 2002), further indicating that the knockdown is specific.

Sample size was not selected based on statistical analysis, but on previous published reports demonstrating that these reagents produce phenotypes in >85% of injected embryos (Deflorian et al., 2004; Waskiewicz et al., 2002; Choe et al., 2014; Mantovani et al., 1994). Embryos were randomly selected for inclusion in injected or control pools. Dead animals were excluded from RNA-seq and ChIP-seq experiments, but not from phenotypic analyses in Figure 1—figure supplement 1 and Figure 7—figure supplement 1. No other animals were excluded. Experiments were not blinded.

In vitro synthesis of capped mRNAs

PCS2 + plasmids containing the NF-YDN or PBCAB coding sequence was linearized by NotI digest and purified with a PCR purification kit column (Qiagen). Capped messenger RNAs were synthesized using the SP6 mMessage mMachine kit (ThermoFisher Scientific) from 2 ug of linearized plasmid following manufacturer's instructions. The DNA template was then removed by the addition of 2 µl of TURBO DNase and incubation at 37°C for 15 min. Subsequently, synthesized capped mRNAs were purified on the RNeasy kit columns (Qiagen), quantified on a Nanodrop (ThermoFisher Scientifics) and their quality assessed on a 2% agarose gel.

Injections into zebrafish embryos

300 pg of mRNA or a mixture of morpholinos (Prep1.1, Pbx2 and Pbx4 at 2.7 ng each) mixed with water and 0.1% phenol red dye were injected into 1 to 2 cell stage zebrafish embryos. Following the injection, embryos were raised to the desired time point and used for experimental procedures.

Assessment of TALE loss of function phenotype

For whole-mount immunostaining, 48hpf embryos were fixed in 4% paraformaldehyde/8% sucrose/1x PBS overnight. Fluorescent staining with the 3A10 primary antibody (1:100; Developmental Studies Hybridoma Bank) and the goat anti-mouse Alexa Fluor 488 secondary antibody (1:200; Molecular Probes A11001) was used to detect Mauthner neurons. For assessment of cartilage formation, 5dpf embryos were fixed in 4% paraformaldehyde/1X PBS overnight, bleached in 30% hydrogen peroxide for 2 hr and stained overnight in 1% HCL/70% ethanol/0.1% alcian blue.

Identification of in vivo TF binding sites

ChIP-seq

Groups of 500 zebrafish embryos (total of 10,000 at 3.5hpf and 5000 at 12hpf per biological replicate) were dissociated in 1XPBS by pipetting and fixed for 10 min in 1% formaldehyde. Fixation was stopped by the addition of glycine to a final concentration of 125 mM and cells were pelleted and frozen in liquid nitrogen. Subsequently, cell pellets were processed following a ChIP protocol described previously (Amin et al., 2015). Nuclei were extracted by the addition of 500 μl L1 buffer (50 mM Tris-HCl pH8.0, 2 mM EDTA, 0.1% NP-40, 10% glycerol, 1 mM PMSF) followed by incubation for 5 min on ice and pelleted by centrifugation (3000 rpm, 5 min at 8°C). Nuclei were lysed in 300 μl SDS lysis buffer (50 mM Tris-HCl pH8.0, 10 mM EDTA, 1% SDS) and chromatin sheared into smaller fragments (300 bp on average) by 3 rounds of sonication with a Palmer sonicator (10 s ON – 2 s OFF for a total of 1 min per round, amplitude 40%).

Samples were diluted 10 times in dilution buffer (50 mM Tris-HCl pH8.0, 5 mM EDTA, 200 mM NaCl, 0.5% NP-40, 1 mM PMSF) and pre-cleared by the addition of 50 μl protein-A dynabeads (ThermoFisher Scientific) and incubation for 3 hr at 4°C. After removal of the beads, 10 ul of anti-Prep or pre-bleed antiserum was added (Key Resources Table). Immune complexes were precipitated by the addition of 50 μl of protein-A dynabeads (ThermoFisher Scientific) and incubated for 3 hr at 4°C. Beads were washed five times in wash buffer (20 mM Tris-HCl pH8.0, 2 mM EDTA, 500 mM NaCl, 1% NP-40, 0.1% SDS, 1 mM PMSF), three times in LiCl buffer (20 mM Tris-HCl pH8.0, 2 mM EDTA, 500 mM LiCl, 1% NP-40, 0.1% SDS, 1 mM PMSF) and three times in TE buffer (10 mM Tris-HCl pH8.0, 1 mM EDTA, 1 mM PMSF).

Chromatin fragments were eluted by the addition of 50 μl of freshly made elution buffer (10 mM Tris-HCl pH8.0, 1 mM EDTA, 2% SDS) and incubation at 25°C for 15 min followed by an incubation at 65°C for another 15 min. Then, DNA fragments were reverse cross-linked by adding 2.5 μl of 5M NaCl and incubating at 65°C O/N. Finally, DNA fragments were recovered in 10 μl nuclease free water using a PCR purification mini-elute kit (Qiagen).

ChIP DNA fragments and their corresponding input were quantified on a Qubit with the dsDNA HS assay kit (ThermoFisher Scientific). 10 ng of DNA was used for library preparation using the Tru-seq ChIP Sample Preparation Guide (Illumina Inc). For samples containing less than 10 ng of DNA the entire eluted DNA was used. Briefly, sample DNA was blunt-ended and phosphorylated, and a single 'A' nucleotide added to the 3' ends of the fragments in preparation for ligation to an adapter with a single-base 'T' overhang. Omitting the size selection step, the ligation products were then PCR-amplified to enrich for fragments with adapters on both ends. Libraries were sequenced on an Illumina HiSeq2500 Sequencer.

ChIP-qPCR

The ChIP protocol for ChIP-qPCR is the same as described in the ChIP-seq section above except that a total of 1000 wild-type or injected embryos were collected for NF-YB and Pbx4 ChIPs and 200 embryos for Histone H3 and H3K27ac ChIPs. The following antibodies were used: 10 µl of anti-Prep1.1 and anti-Pbx4 in house sera and their corresponding pre-bleed control sera; 8 µg of anti-NF-YB rabbit polyclonal antibody and control rabbit polyclonal IgG. The relative quantification of select genomic regions was determined by qPCR using specific primers pairs (see Supplementary file 5) and 2 µl of ChIP DNA eluate.

Quantification of gene expression

Total RNA extraction from zebrafish embryos

Total RNA from 50 to 100 6hpf or 12hpf zebrafish whole embryos was extracted with the RNeasy kit (Qiagen) following manufacturer's instructions. Total RNA was then used in RNA-seq and RT-qPCR reactions.

RNA-seq

Total RNA quantification and quality assessment was performed on a Bioanalyzer (Agilent) and only total RNAs with a RNA Integrity Number above nine were further considered. Then, 3 ug of total RNA was used to construct RNA-seq libraries with the Illumina Truseq stranded mRNA library kit after PolyA + RNA enrichment. The quality and size of the fragments was determined on a Bioanalyzer (Agilent) and single-end 100 bp reads were generated on a Hi-Seq sequencer at the molecular biology core of the University of Massachusetts Medical School.

RT-qPCR

500 ng to 1 µg of total RNA was reverse transcribed using the high capacity cDNA kit (ThermoFisher Scientific). The relative quantity of select mRNAs was determined by qPCR: each 25 ul total PCR reaction contained 2 µl of cDNA diluted 10-fold, 0.2 mM of each specific primer (see Supplementary file 5) and qPCR master mix (Biotool) to a 1X final concentration. The reactions were loaded onto a 7300 real-time PCR system (Applied Biosystems).

Generation of expression vectors

Myc-Prep1.1 (NM_131891.3), HA-Pbx4 (NM_131447.1) encoding plasmids were described previously (Choe et al., 2009, 2002). Flag-NF-YA and Flag-NF-YB plasmids were generated by PCR amplification of the zebrafish NF-YA (NM_001082795.1) and NF-YB (NM_001013322.2) coding sequences from 24hpf zebrafish cDNA using specific primers bearing EcoRI/XhoI and XbaI/SnabI restriction sites respectively. The amplified sequences were then introduced into a PCS2 + plasmid backbone. Subsequently, a Flag tag sequence was PCR amplified from a p3xFLAG-CMV−7.1 vector using specific primers bearing EcoRI (for NF-YA) or StuI/XbaI (for NF-YB) and cloned 5' to the NF-YA or B coding sequences. The NF-YDN plasmid was constructed as previously described (Mantovani et al., 1994). Briefly, three point mutations (R279A, G280A, D281A) located in the conserved NF-YA DNA binding domain, preventing NF-YA DNA binding but not interactions with the other members of the NF-Y complex, were introduced using the Q5 site directed mutagenesis kit (New England Biolabs) and primers bearing the mutations. Plasmids for luciferase reporter assays were generated by amplifying ~500 bp genomic fragments containing the Prep binding sites associated with the tle3a, pax5, prdm14, tcf3a, her6, dachb and fgf8 loci (using the primers listed in Supplementary file 5) and cloning into the XhoI sites of the pGL3-Promoter vector (Promega E1761)

All the plasmids were validated by Sanger sequencing, amplified in DH5α bacterial cells and extracted using the PureLink HiPure Plasmid Midiprep Kit (ThermoFisher Scientific). All primer sequences can be found in Supplementary file 5.

Luciferase assays and assessment of protein-protein interactions

Transfection

3 × 106 HEK-293T cells were seeded on 10 cm dishes and allowed to grow overnight in antibiotic-free growth medium (DMEM (Gibco) supplemented with 10% FBS (Hyclone)). HEK293T cells were obtained from ATCC (ATCC CRL-3216). These cells were not independently authenticated and were not tested for mycoplasma. The next day, the cells were incubated for 5 hr in Opti-MEM (Gibco) medium containing a mixture of plasmid DNA and Lipofectamine 2000 (Invitrogen) following manufacturer’s instructions. Subsequently, the cells were incubated overnight in fresh antibiotic-free growth medium.

Immunoprecipitation of TALE-NF-Y protein complexes

Transfected cells were lysed in 4 mL of ice cold Co-IP Buffer (50 mM Tris-HCl pH 7.5, 150 mM NaCl, 0.2 mM EDTA, 1 mM DTT, 0.5% Triton X100, 1X Complete Protease Inhibitor (Roche)) and incubated on ice for 30 min. Cell lysates were centrifuged at 2,000 g for 10 min at 4°C to remove cell debris and pre-cleared by incubation at 4°C after the addition of 50 μL of Protein A/G Agarose Beads blocked in 1% BSA for 1 hr (Roche). To immunoprecipitate the target protein, 8 μg of the appropriate antibody (see Key Resources Table) was added to each sample before incubation at 4°C overnight. The next morning 40 μL of Protein A/G Agarose beads blocked with 1% BSA was added and each sample incubated for 4 hr at 4°C. Non-specific binding was eliminated by five washes in 1 mL of Co-IP Buffer. Finally, the immune-complexes were eluted in 80 μL of 1X Laëmmli Buffer (Biorad) containing 2.5% beta-mercaptoethanol and agitated for five minutes at 95°C.

Western blot

20 μL of each IP sample or 13 μL of each Input sample were loaded onto a 4–20% gradient polyacrylamide gel (Bio-Rad) and the proteins separated at 200V until the dye front reached the end of the gel. The separated proteins were then transferred onto a methanol-activated PVDF membrane at 100V for one hour. After incubation for one hour in blocking buffer (5% non-fat dehydrated milk in Tris Buffered Saline with Tween (TBST; 50 mM Tris-HCl pH 7.5, 150 mM NaCl, 0.1% Tween 20)) the membranes were probed with specific antibodies (see Key Resources Table) diluted in TBS-Tween plus 5% BSA and incubated overnight at 4°C. The next day after four washes of 10 min in TBS-Tween the membrane were probed with the appropriate secondary antibody diluted in TBS-Tween plus 5% BSA and incubated at 4°C for two hours. After four washes of ten minutes in TBS-Tween the ECL reaction was performed and chemiluminescence detected with a LAS3000 (Fuji) machine.

Luciferase reporter assay

For the reporter assays, 100 or 400 ng of each luciferase reporter plasmid was co-transfected (see above for transfection protocol) with 200 ng of each TF (Meis, Pbx, NF-YA and NF-YB) or with 800 ng of control plasmid, as well as together with 50 ng of a plasmid expressing renilla luciferase. Luciferase was quantified using the DualGlo Luciferase system (Promega E2920) in a Perkin Elmer Envision 2104 Multiplate reader and firefly luciferase levels were normalized to renilla levels. Each assay was performed in triplicate and is presented as mean fold induction ± SD over transfection with empty vector. A vector containing the SV40 enhancer (pGL3-Control vector; Promega E1741) was used as positive control.

Quantification and statistical analysis

Analysis of expression and ChIP data was done as outlined below using standard bioinformatics packages. Default statistics tools included in each package were used (except as indicated) and the exact parameters for each type of analysis are listed below.

Processing of RNA-seq data

Fastq files containing strand specific trimmed and filtered reads were processed using the University of Massachusetts Medical School Dolphin web interface (see Key Resources Table). Reads were quality checked with FastQC aligned to the DanRer10 zebrafish transcriptome and normalized gene expression TPM (Transcripts Per Million) values calculated using RSEM_v1.2.28 with parameters -p4 --bowtie-e 70 --bowtie-chunkmbs 100 (Li and Dewey, 2011). Identification of differentially expressed genes (DEGs) was performed with DeSeq2 (Anders and Huber, 2010) on three independent biological replicates for each control or TALE KD conditions except for RNA-seq data of TALE KD vs Control embryos at 12hpf. In this latter experiment one outlier replicate was excluded from the analysis. DeSeq2 identified DEG with p-adj ≤0.05 (Benjamini and Hochberg FDR) and to compensate for the loss of one biological replicate only DEGs with p-adj ≤0.01 were used in all subsequent analyses.

Processing of ChIP-seq data

Fastq files for ChIP-seq analysis contained 101 bp paired-end sequence for Prep 3.5hpf and 12hpf, two biological replicates each, and matched input-DNA controls. After an assessment of the raw sequence quality using FastQC (Babraham Institute. n.d, 2016) and Fastq-screen (Babraham Institute. n.d, 2016) the sequence reads were filtered to remove any remaining adapter sequence or poor quality 3’ end sequence using Trimmomatic version 0.32 (Bolger et al., 2014). Default parameters for ILLUMINACLIP and SLIDINGWINDOW were used. MINLENGTH was set to 50 bp, except for Prep 3.5hpf replicate 2 with which 36 bp was used. The reads were then mapped to the GRCz10 (danRer10/September 2014) release of the entire zebrafish genome from the UCSC browser (Tyner et al., 2017) using Bowtie2 version 2.2.3 (Langmead and Salzberg, 2012). The output SAM file was further filtered to remove reads with poor mapping quality and discordant mapped read pairs, using SAMtools view version 0.1.19 (Li et al., 2009) (with flags used -f 2 -q30). Peak calling was performed using MACS2 version 2.1.0.20140616 (Zhang et al., 2008), excluding reads that mapped to the mitochondrial genome and unassembled contigs in the assembly. Default parameters were used, except that the effective genome size was set to 1.03e9 (this equates to 75% of the total genome sequence, excluding ‘N’ bases. The q-value threshold was set to 0.05. Candidate binding regions were then filtered to retain those with a fold enrichment of ≥10. Upon applying these criteria, we noticed that one biological replicate for each ChIP-seq experiment (3.5hpf and 12hpf) underperformed, but more than 95% of the peaks were identified also in the second biological replicate (see ‘Quantification of ChIP peak overlap’ below and Supplementary file 1). Therefore, the best biological replicate for each experimental condition was considered for downstream analysis.

Analysis of qPCR results

Gene expression analysis

Gene expression was determined and normalized to gapdh expression using the following formula (0.5gene of interest Ct value/0.5 gapdh Ct value). The mean value and standard error of the mean (SEM) for three independent biological replicates of control and experimental conditions were calculated using Excel. Statistical significance of mean variations between two conditions was calculated using an unpaired t-test in Excel. Two conditions are considered significantly different if p-value≤0.05.

ChIP DNA enrichment analysis

DNA enrichment was determined and normalized to input values using the following formula (0.5IP Ct value/0.5 Input Ct value). Then the mean value and standard error of the mean (SEM) for three independent biological replicates of control and experimental conditions were calculated using Excel. When necessary the results were expressed as a fold change of specific ChIP signal over control IgG ChIP signal. Statistical significance of mean variations between two conditions was calculated using an unpaired t-test in Excel. Two conditions are considered significantly different if p-value≤0.05.

Analysis of GO term enrichment

GREAT (version 3.0.0 [McLean et al., 2010; Hiller et al., 2013]) allowed for the analysis of GO term enrichment using Prep binding site coordinates as Input. The analysis was performed using the single nearest gene within 5 or 30 kb association rule since most Prep sites are found within 30 kb of a TSS. GO terms were ranked by Binomial False Discovery Rate (FDR) values. The results are presented as -log2 transformed FDR values and only GO terms with FDR ≤ 0.05 (-log2(FDR) ≤ 4.32) were considered significant.

DAVID (version 6.8 [Huang et al., 2009b, 2009a]) was used to identify enriched GO terms associated with genes identified in the RNA-seq analysis and/or found to be near Prep binding sites. The Benjamini multiple testing False Discovery Rate (FDR) was use to rank the identified GO terms. The results are presented as -log2 transformed FDR values and only GO terms with FDR ≤ 0.05 (-log2(FDR) ≤ 4.32) were considered significant.

Analysis of TF peak features

All TF binding site coordinates used in the following analysis were defined as 200 bp coordinates centered on the ChIP peak summit. Unless otherwise specified, only peaks with an FE ≥ 10 were considered.

Prep binding sites distribution relative to TSSs

The distribution of Prep binding sites relative to TSSs was calculated using the windowbed tool from the bedtools suite (Quinlan and Hall, 2010) in the Galaxy toolshed (Goecks et al., 2010) searching for the number of Prep binding sites found within 5 or 30 kb (from their center) of any Ensembl zebrafish (Zv9) or mouse (Mm9) TSSs.

Identification of prep peak associated genes

A gene was considered associated with a Prep binding site if any of its Ensembl (Zv9) TSS was found within 5 or 30 kb from a Prep peak. Prep-associated genes were defined using the windowbed tool from the Bedtools suite in Galaxy searching for Ensembl TSS (for instance those of differentially expressed genes in TALE KD embryos or first-wave wave genes) found within 5 or 30 kb of the center of any Prep binding site. Statistical significance of Prep binding association with genes of interest (first wave genes and TALE KD differentially expressed genes) over a random population of genes was determined with a Pearson correlation test with a statistical significance ≤0.05.

Quantification of ChIP peak overlap

The overlap between two populations of ChIP peaks was analyzed using the intersect tool from the Galaxy toolshed. Two Prep peaks (in different ChIP biological replicates or in ChIP-seq results from 3.5hpf vs. 12hpf) were considered to overlap if their summits were within 50 bp (See also Processing of ChIP-seq data above). Prep and NF-YA peaks in mESCs were considered to overlap if their summits were within 500 bp.

Identification of the Prep12hpf-only peak population

Prep12hpf-only ChIP-seq peaks were identified by subtracting Prep12hpf peaks overlapping with all Prep13.5hpf peaks identified by MACS2 without applying any enrichment cut-off. This strategy allowed for stringent identification of 11468 Prep12hpf-only binding sites not occurring at 3.5hpf that were used for subsequent analysis.

TF binding motif analysis

MEME and DREME (MEME-suite version 4.11.1 [Machanick and Bailey, 2011; Bailey et al., 2009]) were used to identify significantly enriched de novo binding motifs. DREME ran in a default mode, MEME was set to search for a maximum of six 4 to 12 nucleotide long motifs. Motif distribution relative to ChIP-seq peak summit was defined by CENTRIMO using default parameters. AME (MEME-suite version 4.11.1 [Machanick and Bailey, 2011; Bailey et al., 2009]) was used to calculate the relative enrichment between two datasets using default parameters (Ranksum test, p-value≤0.05). In the case of a relative enrichment against a control set of sequence, the « shuffled input sequences » mode was selected. The occurrence of TF binding motifs in Figure 3D and Figure 7—figure supplement 1D) was calculated using a custom Python script (moth.py, Source code 1) with the input files provided in Figure 3—source data 1. To do so, regular expression matches were identified on both strands of the input sequences, and the number of sequences containing at least one occurrence of a motif was calculated. HEXA motifs were identified in sequences that did not contain any DECA motif.

Sequence conservation analysis

Average conservation score around Prep1 binding sites was computed in the Deeptool suite using Prep1-bound sequences and the UCSC vertebrate PhastCons eight way (Zebrafish, Medaka, Stickleback, Tetraodon, Fugu, X. tropicalis, Mouse, Human) wig file as regions of interest and score input files respectively. For Figure 4—figure supplement 1A, a set of 11000 random chromosomal coordinates was generated from the zv11 zebrafish genome assembly using the randCoord.py custom python script (Source code 2).

Analysis of chromatin features

Chromatin heatmaps and mean score profiles of Prep binding sites in fish embryos and mESCs were generated with the Deeptools (version 2.0 [Ramírez et al., 2014]) suite of tools in the Galaxy toolshed. BED files containing Prep biding site coordinates and wiggle files of previously published datasets (Key Resources Table) downloaded from GEO or ENCODE were used as inputs. First, signal matrices at Prep bound regions were made using the compute matrix tool in reference-point mode with the following parameters: distance upstream and downstream of the start site of the regions defined in the BED file: 1000 or 2,000 bp, bin size: 25 bp. When necessary, the regions were ranked based on mean signal values. Second, score matrices were used to generate heatmaps and mean score profiles with the plot heatmaps and plot profile tools respectively. We note that the public ChIP-seq and ATAC-seq datasets are from slightly different timepoints (4.5hpf and 4hpf, respectively) than our Prep ChIP-seq dataset (3.5hpf). Since each dataset requires hundreds to thousands of embryos (that cannot be individually staged) and zebrafish development is slightly asynchronous, it is likely that collecting embryos at these three timepoints will result in considerable overlap of the actual stages analyzed.

References

  1. 1
  2. 2
  3. 3
  4. 4
  5. 5
    FastQ screen
    1. Babraham Institute. n.d
    (2016)
    Accessed October 28, 2016.
  6. 6
    FastQC
    1. Babraham Institute. n.d
    (2016)
    Accessed October 28, 2016.
  7. 7
  8. 8
  9. 9
  10. 10
    The B subunit of the CCAAT box binding transcription factor complex (CBF/NF-Y) Is essential for early mouse development and cell proliferation
    1. A Bhattacharya
    2. JM Deng
    3. Z Zhang
    4. R Behringer
    5. B de Crombrugghe
    6. SN Maity
    (2003)
    Cancer Research 63:8167–8172.
  11. 11
  12. 12
  13. 13
  14. 14
  15. 15
  16. 16
  17. 17
  18. 18
  19. 19
  20. 20
  21. 21
    Meis family proteins are required for hindbrain development in the zebrafish
    1. SK Choe
    2. N Vlachakis
    3. CG Sagerström
    (2002)
    Development 129:585–595.
  22. 22
  23. 23
  24. 24
  25. 25
  26. 26
  27. 27
  28. 28
  29. 29
  30. 30
    Segmental expression of Hoxb2 in R4 requires two separate sites that integrate cooperative interactions between Prep1, pbx and hox proteins
    1. E Ferretti
    2. H Marshall
    3. H Pöpperl
    4. M Maconochie
    5. R Krumlauf
    6. F Blasi
    (2000)
    Development 127:155–166.
  31. 31
  32. 32
  33. 33
  34. 34
  35. 35
  36. 36
  37. 37
  38. 38
    Evolution of homeobox genes
    1. PW Holland
    (2013)
    Wiley Interdisciplinary Reviews: Developmental Biology 2:31–45.
    https://doi.org/10.1002/wdev.78
  39. 39
  40. 40
  41. 41
  42. 42
  43. 43
  44. 44
  45. 45
  46. 46
  47. 47
  48. 48
  49. 49
  50. 50
  51. 51
  52. 52
  53. 53
  54. 54
  55. 55
  56. 56
  57. 57
  58. 58
  59. 59
  60. 60
    Dominant negative analogs of NF-YA
    1. R Mantovani
    2. XY Li
    3. U Pessara
    4. R Hooft van Huisjduijnen
    5. C Benoist
    6. D Mathis
    (1994)
    The Journal of Biological Chemistry 269:20340–20346.
  61. 61
    Independent regulation of initiation and maintenance phases of Hoxa3 expression in the vertebrate hindbrain involve auto- and Cross-Regulatory mechanisms
    1. M Manzanares
    2. S Bel-Vialar
    3. L Ariza-McNaughton
    4. E Ferretti
    5. H Marshall
    6. MM Maconochie
    7. F Blasi
    8. R Krumlauf
    (2001)
    Development 128:3595–3607.
  62. 62
  63. 63
  64. 64
  65. 65
  66. 66
  67. 67
  68. 68
  69. 69
  70. 70
  71. 71
  72. 72
  73. 73
  74. 74
  75. 75
  76. 76
  77. 77
  78. 78
    Regulation of Hox target genes by a DNA bound homothorax/Hox/Extradenticle complex
    1. HD Ryoo
    2. T Marty
    3. F Casares
    4. M Affolter
    5. RS Mann
    (1999)
    Development 126:5137–5148.
  79. 79
  80. 80
    Requirement for Pbx1 in skeletal patterning and programming chondrocyte proliferation and differentiation
    1. L Selleri
    2. MJ Depew
    3. Y Jacobs
    4. SK Chanda
    5. KY Tsang
    6. KS Cheah
    7. JL Rubenstein
    8. S O'Gorman
    9. ML Cleary
    (2001)
    Development 128:3543–3557.
  81. 81
  82. 82
  83. 83
  84. 84
  85. 85
  86. 86
  87. 87
    Meis3 synergizes with Pbx4 and Hoxb1b in promoting hindbrain fates in the zebrafish
    1. N Vlachakis
    2. SK Choe
    3. CG Sagerström
    (2001)
    Development 128:1299–1312.
  88. 88
  89. 89
  90. 90
  91. 91
  92. 92

Decision letter

  1. Marianne Bronner
    Reviewing Editor; California Institute of Technology, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "TALE factors use two distinct functional modes to control an essential zebrafish gene expression program" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by Marianne Bronner as the Senior/Reviewing Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Hugo Parker (Reviewer #1); Licia Selleri (Reviewer #2); Miguel Torres (Reviewer #3).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission. The essential revisions are described below but we also include the full reviews of the three reviewers for further details.

Summary:

In this manuscript, Ladam et al. investigate how TALE factors contribute to early zebrafish development by characterizing Prep1.1 DNA-binding profiles at blastula and segmentation stages. They correlate these binding profiles with chromatin marks at these loci and with gene expression data from wild-type vs. TALE knock-down embryos. By focusing on two developmental timepoints, they make the interesting finding that the DNA-binding preference of Prep1.1 appears to expand during development. Strikingly, early binding to 'DECA' motifs is a feature at the blastula stage and additional binding to 'HEXA' motifs with adjacent Pbx-Hox motifs is seen by the segmentation stage. Exploring this further, they identify NF-Y as a potential binding partner of Pbx and Prep in blastula embryos. This leads to a model in which TALE factors utilise different modes of DNA-binding at different times in development, depending on the availability of co-factors such as NF-Y and Hox factors.

Essential revisions:

1) The authors must demonstrate that their model is correct by testing several of the identified elements by reporter assays in zebrafish embryos (or at the very least in cell lines in vitro), coupled with mutation of predicted TALE and NF-Y binding sites to address the importance of these sites for enhancer function.

2) The TALE knock-down phenotype needs to be better characterized with appropriate validation of the specificity of the morpholino.

3) The authors need to explain the temporal discrepancies of all their assays.

4) The authors should provide clarifications/explanations regarding the statement that TALE GRN genes are significantly associated with Class 4 and Class 3, but not Class 1 or 2, MPADs (Figure 6A, B).

5) The authors should include a final diagram depicting the model of Prep1.1/TALE DNA-binding dynamics across developmental time and how this relates to the activation of the components of the TALE-GRN, changes in chromatin state and interactions with co-factors.

6) Absolute statements should be avoided (e.g. "TALE factors control gene expression by regulating a chromatin transition… at a core set of genes encoding TFs that direct anterior development.")

Reviewer #1:

This work addresses an important question because Prep and Pbx proteins are crucial factors for vertebrate development and are present throughout embryogenesis but most of the focus up to now has been on their roles during segmentation stages, with their early roles and mode/s of DNA-binding receiving relatively little attention. The finding that these factors exhibit an expansion of their DNA-binding repertoire between blastula and segmentation stages is novel and interesting, representing a significant advance in our understanding of the dynamic roles of these factors during early development. However, I have a few points that I think should be addressed, that would considerably strengthen the conclusions and the manuscript.

i) The TALE knock-down phenotype needs to be described/characterised in more detail, to provide a more comprehensive view of the developmental context and to validate the specificity of the morpholino cocktail. For instance, hindbrain segmentation/neuroanatomy and craniofacial morphology should be characterised in the triple morphants to provide more detailed evidence that the knockdown is as expected given the previously characterised MO phenotypes. I also suggest moving the justification for using MO's that is in the Figure 1—figure supplement 1 legend to the Materials and methods section or the main text, where it will be more prominent.

ii) An assumption made is that Prep1.1-bound sites, or at least a sub-set of them, represent enhancer elements. The authors must demonstrate that this is true by testing a few such elements by reporter assay in zebrafish embryos, coupled with mutation of predicted TALE and NF-Y sites to address the importance of these sites for enhancer function. This can be done relatively quickly by transient transgenesis, is frequently used for mechanistic dissection of cis-regulatory elements, and will provide crucial evidence for the functionality of these putative enhancers.

iii) The authors use data from mouse ESCs to infer evolutionary conservation of the TALE GRN and of TALE-NF-Y co-localised binding, which expands the scope beyond zebrafish. A complementary approach is to address how many Prep1.1-bound peaks overlap with fish-mammal conserved non-coding elements that have been described in the literature. It is also worth checking if any are homologous to elements in the VISTA enhancer browser and have been experimentally validated in transgenic mice. This is straightforward to do and could potentially add weight to the argument that these interactions are evolutionarily conserved.

iv) This manuscript would really benefit from a final diagram depicting the model of Prep1.1/TALE DNA-binding dynamics across developmental time and how this relates to the activation of the components of the TALE-GRN, changes in chromatin state and interactions with co-factors.

Reviewer #2:

The paper by Franck Ladam et al. presents original studies on unexplored mechanisms that underlie very early roles of TALE transcription factors (TFs) in blastula/gastrula versus later roles during segmentation stages of zebrafish development.

This interesting study makes a substantial leap forward by identifying binding of TALE factors at genomic Pbx:Prep (DECA) sites during early zebrafish developmental stages, at 3.5 hpf. The authors also report that binding motifs for the maternal NF-Y TF are enriched near DECA sites and that NF-Y can form complexes with TALE proteins. Interestingly, the authors demonstrate that in the later post-gastrula embryo, at segmentation stages, GRN-associated TALE occupancy expands to include HEXA motifs with adjacent PBX:HOX sites. Therefore, the authors convincingly demonstrate that TALE factors control a key GRN, but utilize distinct DNA motifs and protein partners at different developmental stages, which is a novel and as yet unexplored mechanism underlying differential and temporally-restricted functions of TALE TFs in vertebrate embryogenesis.

The manuscript comprises an arsenal of high-quality results that are illustrated in complex and impeccable figures. The identification of distinct sequence-based mechanisms that underlie TALE binding to DNA thus directing different developmental functions at successive stages of embryogenesis is per se a fundamental finding. This study will be of high importance and interest for TALE biology, which sorely lacks mechanistic research conducted in vivo in animal models. The original findings reported in this paper can also open newpaths of investigation on roles of TALE factors in the earliest stages of embryogenesis in other organisms, including mammals. In addition, the distinct strategies that are being employed by TALE factors to execute developmental processes at different stages of embryogenesis might be similarly adopted by other TFs with widespread expression patterns, thus opening additional avenues for broader investigations.

Concerns that should be addressed:

1) Results – General Consideration:

The authors conduct multiple genome-wide experiments and/or mine available genome-wide datasets, including RNA-Seq, ChIP-seq for Prep1, ATAC Seq, ChIP-seq for chromatin marks. The time-point analyzed are not fully consistent across these experimental approaches, e.g. RNA-Seq for controls and TALE KD zebrafish embryos is performed at 6hpf (early gastrula) and at 12hpf (segmentation stage); ChIP-seq for Prep1.1 is performed at 3.5hpf (blastula) and at 12hpf (segmentation stage); ATAC-Seq at 4hpf (blastula; available datasets); and ChIP-seq for chromatin marks at 4.5hpf (blastula; available datasets). While this could be somewhat concerning, the findings and the overall message emerging from the study are strong and do not appear to be weakened by the slight temporal discrepancies. Indeed: a) out of the 13,300 Prep peaks at 3.5hpf, ~60% co-localize with a Prep peak at 12hpf, suggesting that a large fraction of binding sites remains occupied throughout embryogenesis; b) an additional ~16,500 peaks detectable at 12hpf do not co-localize with Prep 3.5hpf peaks, demonstrating that additional binding sites become occupied at later developmental stages; c) Prep binding is dynamically and continuously associated with the TALE GRN during zebrafish embryogenesis; d) TALE factors utilize distinct binding motifs at very early versus late stages of embryogenesis; e) TALE-occupied sites are associated with specific chromatin states at blastula stages; f) developmental control genes are enriched near Modified Prep Associated Domains (MPADs) displaying repressive histone modifications; g) Class 4 MPADs transition to an active chromatin state during later stages of embryogenesis; h) TALE and NF-Y factors have joint roles at very early developmental stages and can form complexes.

This reviewer is not particularly concerned about the minor temporal discrepancies present among the various genome-wide assays conducted in this study, given the strong message emerging from all of the reported high-throughput experiments. However, it might be useful to underscore throughout the text and in the Discussion that TALE factors adopt distinct mechanistic strategies in zebrafish blastula and early gastrula versus segmentation stages; in other words to simply cluster together the functions of TALE TFs in blastula and early gastrula within one single group (comprising 3.5, 4, 4.5, and 6 hpf). To this end, it would help to slightly modify the cartoon illustrating the subsequent zebrafish developmental stages in Figure 1A. The authors could group [blastula stages and early gastrula stages] within one single bracket or inside one single box andthe [segmentation stages] inside another bracket or box. Accordingly, this clustering could be clarified in the figure legend.

2) Results, subsection “TALE factors control the chromatin state at Class 4 MPADs associated with the anterior GRN”: The authors state that TALE GRN genes are significantly associated with Class 4 and Class 3, but not Class 1 or 2, MPADs (Figure 6A, B). However, RNA-seq (Figure 5C) shows that genes associated with Class 3 MPADs (and also Class 1 and 2 MPADs) are expressed at similar levels at 12hpf and 6hpf (Figure 5D). In contrast, Class 4 MPADs display higher levels of H3K27ac at 9hpf than at 4.5hpf (Figure 5A, B) and their associated genes show the greatest increase in expression between 6hpf and 12hpf. In addition, only class 4 MPADs showed a strong switch to an active chromatin state from 4.5hpf to 9hpf during zebrafish embryogenesis (Figure 5A, B), while class 3 MPADs did not exhibit any significant switch (Figure 5A, B). Collectively, these results are somewhat difficult to reconcile. The authors should qualify these findings and try to explain these differences. Are other factors necessary for activation of Class 3 MPADs? Or do the acetylation changes appear at a later time-point for Class 3 MPADs? Can other scenarios be envisaged? It would be helpful to add these considerations at least to the Discussion section of the paper.

3) Results: The authors state: "These findings indicate that TALE factors control gene expression by regulating a chromatin transition – from repressive chromatin in blastula stage embryos to active chromatin in segmentation stage embryos – at a core set of genes encoding TFs that direct anterior development."

This statement is very strong and absolute, whereas it does not hold across the entire animal kingdom. In fact, in the mouse TALE factors substantially affect "posterior" development, as shown by the presence of severe posterior developmental defects in various mouse models with LOF for different TALE TFs. For example, in Pbx LOF mouse embryos it has been reported that the posterior axial skeleton, the hindlimb, and the urogenital system are severely compromised, among other posterior structures. In addition, also in the zebrafish embryo, Prep binds many loci in addition to ones associated with the anterior GRN and some of these additional sites are near genes that regulate other developmental processes known to involve TALE function. For instance, Prep peaks are found near genes involved in myogenesis (Figure 2D, E) and expression of myogenic genes is disrupted in TALE KD embryos, as the authors report in the Discussion section of the paper. Given all of these considerations, the statement above should be at least qualified and also toned down:

"These findings indicate that TALE factors control gene expression by regulating a chromatin transition – from repressive chromatin in blastula stage embryos to active chromatin in segmentation stage embryos – at a core set of genes encoding TFs that direct primarily anterior development in the zebrafish embryo."

4) Results, subsection “NF-Y proteins regulate TALE GRN expression and form complexes with TALE factors”, first paragraph: The authors describe a NF-Y motif that is specifically enriched at Prep3.5hpf peaks (Figure 7B). NF-Y is also maternally deposited in zebrafish (Figure 7—figure supplement 1A), consistent with a joint role for TALE and NF-Y factors at very early developmental stages. This is a critical finding, as it identifies a potential new cofactor for TALE proteins. The authors could better emphasize this exciting finding, for example drawing a parallel between the roles of NF-Y in zebrafish and in mouse embryonic development. In fact, also in the mouse NF-Y has been shown to have critical functions for very early embryonic development (Bhattacharya et al., 2003). This parallel would broaden the impact and breadth of the reported findings and would relate them also to other species, e.g. mammals.

5) Discussion: Given the wealth of high-quality results and novel findings that are reported in this interesting study, it would greatly help to add one additional figure – or figure panel – with a cartoon that summarizes and illustrates in a succinct manner the most salient findings. This would leave the reader with a strong and easy-to-remember 'take-home' message.

Reviewer #3:

The manuscript by Ladam et al. reports a ChIPseq analysis of Prep proteins at two developmental stages of the zebrafish embryo -blastula and segmentation- and the correlation of these data with transcriptome modifications associated to combined downregulation of Prep and Pbx factors using a morpholino approach. This work identifies a shift in binding preference by Prep factors between the two stages analyzed. While in the blastula the preferred binding site is a Prep-Pbx combined site, in the segmenting embryo the preference shifts to binding sites in which Prep binds to the Prep-alone binding site or the Pbx-Hox binding site (to which is can bind forming trimers). Classification of the blastula binding sites by their Histone Modification profile identifies a subset of targets that show repressive marks at the blastula stage and become active at the segmentation stage in a regionally-restricted manner. This class is enriched in genes whose expression and Histone modification profile is sensitive to the loss of Prep-Pbx function, indicating a pioneer function for Prep-Pbx in this tissue-specific genes.

In addition to this, authors analyze the occurrence binding sites for the pioneer transcription complex, which appear strongly associated to the Prep-Pbx binding motifs. They characterize association of Prep-Pbx to subunits of the NF-Y complex and they demonstrate that NF-Y activity is required for the transcriptional and epigenetic activation of Prep-Pbx responsive genes.

This is an important work mainly for two conclusions; one is that TALE transcription factors show stage-specific preference for binding sites defined by binding sequence. The data are compelling on this conclusion and suggest that it is the activation of tissue-specific transcriptional cofactors at later stages what directs this specificity.

The second finding is the identification of a cooperative complex between two pioneering complexes Prep-Pbx and NF-Y. This is an important finding, although some aspects remain only superficially explained or are not fully conclusive. My main concern on this point is that the authors claim to have discovered the function of the Prep-Pbx-NF-Y binding sites, however they do not perform any functional assay on this sequence. In my opinion, the data shown are only correlative with respect to this point. While elimination of Prep-Pbx function or elimination of NF-Y function affects a common set of targets, this does not demonstrate that the effect is through the action of these proteins as a complex on the Prep-Pbx-NF-Y binding sequence. Also, is the Prep/Pbx or NF-Y function different for those sites/genes in which only the Prep-Pbx sites are present versus those in which the Prep-Pbx-NF-Y site is found?. Also, given that the set of sensitive genes are tissue-specific and their activation associates with the colonization of nearby Prep-only and Pbx-Hox sites, could the function of these new sites be the one that is relevant for chromatin opening and transcriptional activation and not the interaction at the Prep-Pbx site?. Therefore, if possible, experiments in which the Prep-Pbx-NF-Y binding sequence is functionally analyzed for factor binding and enhancer activity should be included to make conclusions stronger.

In connection to this, a more comprehensive description of the prevalence of the Prep-Pbx and Prep-Pbx-NF-Y sites among the classes studied would help understanding whether there is specific association to subclasses. This would include the classification according to stage (blastula-specific, segmentation-specific and common) and MPAD classes. Also a detailed description of the occupancy of specific sites associated to the TALE GRN containing the Prep-Pbx-, Prep-Pbx-NFy, Prep-only or pbx-Hox binding sites and how these evolve between blastula and segmentation would be a very valuable addition to the manuscript.

https://doi.org/10.7554/eLife.36144.065

Author response

Essential revisions:

1) The authors must demonstrate that their model is correct by testing several of the identified elements by reporter assays in zebrafish embryos (or at the very least in cell lines in vitro), coupled with mutation of predicted TALE and NF-Y binding sites to address the importance of these sites for enhancer function.

There are two types of elements identified by our analyses – one where Prep occupies monomeric HEXA binding sites associated with PBX:HOX motifs (this type of element is much more frequently occupied at 12hpf than at 3.5hpf) and one where Prep occupies DECA sites associated with NF-Y motifs (this is the predominantly occupied motif at 3.5hpf).

Elements containing HEXA+PBX:HOX motifs have been studied previously. Analyses of individual such elements in the mouse demonstrated that they act as enhancers and that the Prep, Pbx and Hox binding sites are essential for enhancer function (Pöpperl et al., 1995; Jacobs, Schnabel, and Cleary 1999; Ferretti et al., 2005; Ferretti et al., 2000; Di Rocco, Mavilio, and Zappavigna 1997; Manzanares et al., 2001; Tümpel et al., 2007; Wassef et al., 2008). Accordingly, we have previously shown that an element from the zebrafish hoxb1a locus (that contains a Prep+Pbx:Hox motif) drives gene expression in vivo and in vitro and that the TALE and Hox sites are essential (Choe et al., 2009; Vlachakis, Ellstrom, and Sagerström 2000). Furthermore, motif discovery in CNEs combined with functional testing in zebrafish recently identified HEXA+PBX:HOX motifs as being essential for enhancer activity (Grice et al., 2015; Parker et al., 2011). Hence, previous work has demonstrated that HEXA+PBX:HOX containing elements act as enhancers. This is now discussed in the Results subsection “Some TALE-occupied sites are associated with chromatin marks at blastula stages”, first paragraph. The revised manuscript now also includes additional data showing that the HEXA+PBX:HOX elements identified in our study are evolutionarily conserved (Figure 4—figure supplement 1A) and associated with known enhancer marks (Figure 4—figure supplement 1B). Finally, we now show that, of 74 enhancers known to be active in the zebrafish hindbrain (Grice et al., 2015; Parker et al., 2011), 19 (26%; Figure 4—figure supplement 1C) coincide with Prep12hpf-only peaks. Since enhancer activity was assayed at 2-3 days of development, but our ChIP data is derived from 12hpf (and we applied a stringent FE>10 cutoff for the Prep peaks), this represents a substantial overlap. These results are now discussed in the aforementioned Results subsection. Hence, our analyses of HEXA+PBX:HOX elements are consistent with previous reports defining these elements as enhancers.

Since the DECA+NFY containing elements are novel, they have not been previously assayed for enhancer activity. Our data show that many such elements are associated with chromatin marks indicative of enhancers. In particular, our analysis in Figure 4A, B uses H3K4me1 (a modification that is well established as a mark of enhancers) to define MPADs at 4.5hpf and Figure 4F shows that many MPADs are marked by H3K27ac (a mark found at active enhancers) at 4.5hpf. Notably, many MPADs that lack H3K27ac at 4.5hpf, gain it over the next several hours (Figure 5A, B). In fact, even some Prep-bound sites that lack H3K4me1 at 4.5hpf (non-MPADs) gain enhancer marks during embryogenesis (Figure 5—figure supplement 1). Hence, many DECA+NFY elements appear to be (or become) associated with enhancer marks. We also find that these elements are enriched at regions of the genome that are conserved with mammals (revised Figure 4E), consistent with DECA+NFY elements playing an evolutionarily critical role. Importantly, even though DECA+NFY motifs are associated with enhancers, these motifs are not necessarily expected to be sufficient to convey enhancer activity. In fact, we have previously shown both in vivo and in vitro that TALE factors are not sufficient to drive gene expression (Choe, Ladam, and Sagerström 2014; Choe et al., 2009; Vlachakis, Choe, and Sagerström 2001), indicating that TALE factors alone do not mediate enhancer activity. Furthermore, Prep, Pbx and NF-Y are all ubiquitously expressed, but their binding is associated with genes that are expressed in a tissue-restricted manner, indicating that there must be additional regulatory input at enhancers for these genes. Accordingly, in the original manuscript, we did not propose that the DECA+NFY elements act as enhancers. Instead, our observation that TALE and NF-Y occupy many genomic sites prior to the appearance of enhancer marks (Figure 4A) and are required for the deposition of H3K27ac modifications at TALE-dependent genes (Figure 6F, 7F), led us to suggest that these factors are instead required at a step prior to establishment/action of enhancers (possibly in a ‘pioneer’ role). This is now clarified in the second paragraph of the Discussion and in a new model diagram in Figure 7H. Nevertheless, as requested by the reviewers, we have now cloned and tested genomic regions containing DECA+NFY sites in HEK293 cells. We have previously used transfection in HEK293 cells to demonstrate that HEXA+PBX:HOX containing elements from the zebrafish hoxb1a locus acts as an enhancer (Choe et al., 2009), so this is a reasonable system to assay zebrafish enhancers. We find that only one of the seven elements drives expression of a luciferase reporter (this is true also when TALE and NF-Y TFs are co-transfected with the reporter plasmids; data shown in Figure 7—figure supplement 1E), indicating that DECA+NFY elements may not function as classical enhancers. This is consistent with our proposed model that Prep, Pbx and NF-Y instead act at these elements to play an earlier role (possibly as pioneer factors) to permit the subsequent action of tissue-specific TFs (such as Hox TFs) at nearby enhancers. This model is clarified in the new Figure 7H.

2) The TALE knock-down phenotype needs to be better characterized with appropriate validation of the specificity of the morpholino.

[Additional details from reviewer 1: The TALE knock-down phenotype needs to be described/characterised in more detail, to provide a more comprehensive view of the developmental context and to validate the specificity of the morpholino cocktail. For instance, hindbrain segmentation/neuroanatomy and craniofacial morphology should be characterised in the triple morphants to provide more detailed evidence that the knockdown is as expected given the previously characterised MO phenotypes. I also suggest moving the justification for using MO's that is in the Figure 1—figure supplement 1 legend to the Materials and methods section or the main text, where it will be more prominent.]

Several published studies indicate that complete removal of Pbx or Prep function in zebrafish converges on the same phenotype (Pöpperl et al., 2000; Waskiewicz, Rikhof, and Moens 2002; Waskiewicz et al., 2001; Choe, Vlachakis, and Sagerström 2002; Deflorian et al., 2004). Accordingly, we expect that simultaneous disruption of Pbx and Prep function will also produce this phenotype. The published phenotype has been characterized morphologically, as well as by alcian blue staining for head cartilage, by immunostaining for hindbrain neurons and by in situ hybridization/microarray analysis for changes in gene expression. In the original submission of this manuscript, we show that our TALE KD phenotype is morphologically indistinguishable from the published phenotype (Figure 1—figure supplement 1A). We have now also assayed head cartilage formation and differentiation of hindbrain Mauthner neurons – we find that both are affected as expected based on the published reports (Figure 1—figure supplement 1B; this analysis also revealed that the pectoral fins are lost in our TALE KD embryos, as expected (Pöpperl et al., 2000)). This analysis is now covered in the Results subsection “TALE factors control a gene network regulating formation of anterior embryonic structures in zebrafish”. Lastly, we compared our RNA-seq data from TALE KD embryos to gene expression changes in two published reports (French et al., 2007; Deflorian et al., 2004). It is difficult to do a direct quantitative comparison since our RNA-seq was done at 12hpf after KD of both Prep and Pbx, while French et al. knocked down only Pbx and assayed gene expression (by microarray and in situ hybridization) at 18hpf and Deflorian et al. knocked down only Prep and assayed (only by in situ hybridization) at 20-24hpf. In spite of these caveats, of the 13 genes downregulated in French et al., seven were downregulated in our analysis and of the six genes downregulated in Deflorian et al., four were downregulated in our experiment (this is covered in the aforementioned subsection). We note that gene expression changes in TALE KD embryos become more pronounced over time (i.e., we see greater changes at 12hpf than at 6hpf), likely explaining why not all previously reported TALE-dependent genes are detected by our RNA-seq. In sum, we conclude that our TALE KD produces a phenotype very similar to the published phenotype resulting from Pbx or Prep knockdown.

As requested, the MO justification has been moved to the Materials and methods subsection “Interference with protein function in embryos”.

3) The authors need to explain the temporal discrepancies of all their assays.

[Additional details from reviewer 2: The time-point analyzed are not fully consistent across these experimental approaches, e.g. RNA-Seq for controls and TALE KD zebrafish embryos is performed at 6hpf (early gastrula) and at 12hpf (segmentation stage); ChIP-seq for Prep1.1 is performed at 3.5hpf (blastula) and at 12hpf (segmentation stage); ATAC-Seq at 4hpf (blastula; available datasets); and ChIP-seq for chromatin marks at 4.5hpf (blastula; available datasets). While this could be somewhat concerning, the findings and the overall message emerging from the study are strong and do not appear to be weakened by the slight temporal discrepancies.]

Our analyses center on two timepoints – late blastula (3.5-4.5hpf) and early segmentation (12hpf). These timepoints were chosen because they represent a stage prior to zygotic gene expression (late blastula) and a stage coincident with the onset of tissue morphogenesis (early segmentation). These stages are now indicated in Figure 1A and the rationale for their selection is explained in the legend to Figure 1.

For the late blastula stage, we use chromatin ChIP-seq (4.5hpf) and ATAC-seq (4hpf) datasets for comparisons to our 3.5hpf Prep ChIP-seq data. The use of datasets from slightly different timepoints allowed us to use published datasets instead of replicating previous work. We agree with reviewer 2 that the difference between 3.5hpf, 4hpf and 4.5hpf is unlikely to affect our conclusions for two key reasons. 1) at these stages all zebrafish blastomere cells appear to be pluripotent and capable of differentiating into any tissue derivative (Ho and Kimmel 1993). 2) genome-wide analyses at these stages require large numbers of embryos (hundreds to thousands) that cannot be individually staged. Since zebrafish development is not perfectly synchronized, embryos will develop at slightly different speeds – even in a clutch from a single female. Therefore, when large numbers of embryos are collected, they are likely to contain a range of slightly different stages, meaning that ChIP-seq/ATAC-seq analyses done at 3.5hpf, 4hpf and 4.5hpf will actually have considerable overlap in the actual stages of the embryos analyzed. This fact is now mentioned in the Materials and methods subsection “Analysis of chromatin features”. We did not carry out RNA-seq at 3.5hpf because zygotic gene expression is not yet active at this timepoint. Instead, we selected the 6hpf timepoint (early gastrula), when zygotic gene expression is more robust. However, we do not detect any differentially expressed genes at 6hpf, so this RNA-seq dataset is not used to identify TALE-dependent genes.

4) The authors should provide clarifications/explanations regarding the statement that TALE GRN genes are significantly associated with Class 4 and Class 3, but not Class 1 or 2, MPADs (Figure 6A, B).

[Additional details from Reviewer 2: The authors state that TALE GRN genes are significantly associated with Class 4 and Class 3, but not Class 1 or 2, MPADs (Figure 6A, B). However, RNA-seq (Figure 5C) shows that genes associated with Class 3 MPADs (and also Class 1 and 2 MPADs) are expressed at similar levels at 12hpf and 6hpf (Figure 5D). In contrast, Class 4 MPADs display higher levels of H3K27ac at 9hpf than at 4.5hpf (Figure 5A, B) and their associated genes show the greatest increase in expression between 6hpf and 12hpf. In addition, only class 4 MPADs showed a strong switch to an active chromatin state from 4.5hpf to 9hpf during zebrafish embryogenesis (Figure 5A, B), while class 3 MPADs did not exhibit any significant switch (Figure 5A, B). Collectively, these results are somewhat difficult to reconcile. The authors should qualify these findings and try to explain these differences. Are other factors necessary for activation of Class 3 MPADs? Or do the acetylation changes appear at a later time-point for Class 3 MPADs? Can other scenarios be envisaged? It would be helpful to add these considerations at least to the Discussion section of the paper.]

Our data show that TALE-dependent genes are associated with Class 3 and 4, but not Class 1 or 2 MPADs (Figure 6A). The reviewer comments that Class 3 MPAD-associated genes appear to behave more like Class 1 and 2 MPAD-associated genes in some regards and asks us to address this issue. However, our data indicate that, while the Class 3 MPAD-associated genes undergo smaller changes than Class 4-associated genes, they nevertheless show a greater increase in expression and higher H3K27ac levels than genes associated with Class 1 and 2 MPADs (Figure 5A-D), suggesting that they differ from Class 1 and 2 genes in these regards. These differences are now emphasized on lines 296-297. We do not know why Class 3 MPAD-associated genes show less pronounced signs of activation than Class 4 MPAD-associated genes, but we note that Class 3 MPAD-associated genes are more highly enriched for functions related to muscle differentiation (Figure 6C). This process is just beginning at the timepoint we used to assay gene expression (RNA-seq in Figure 5C, D) and H3K27ac (ChIP-seq in Figure 5A), suggesting that activation of Class 3 MPAD-associated genes may still be ongoing at this stage. This possibility is now discussed in the fifth paragraph of the Discussion.

5) The authors should include a final diagram depicting the model of Prep1.1/TALE DNA-binding dynamics across developmental time and how this relates to the activation of the components of the TALE-GRN, changes in chromatin state and interactions with co-factors.

A diagram is now included as Figure 7H.

6) Absolute statements should be avoided (e.g. "TALE factors control gene expression by regulating a chromatin transition… at a core set of genes encoding TFs that direct anterior development.")

[Additional details from Reviewer 2: This statement is very strong and absolute, whereas it does not hold across the entire animal kingdom. In fact, in the mouse TALE factors substantially affect "posterior" development, as shown by the presence of severe posterior developmental defects in various mouse models with LOF for different TALE TFs. For example, in Pbx LOF mouse embryos it has been reported that the posterior axial skeleton, the hindlimb, and the urogenital system are severely compromised, among other posterior structures. In addition, also in the zebrafish embryo, Prep binds many loci in addition to ones associated with the anterior GRN and some of these additional sites are near genes that regulate other developmental processes known to involve TALE function. For instance, Prep peaks are found near genes involved in myogenesis (Figure 2D, E) and expression of myogenic genes is disrupted in TALE KD embryos, as the Authors report in the Discussion section of the paper. Given all of these considerations, the statement above should be at least qualified and also toned down: "These findings indicate that TALE factors control gene expression by regulating a chromatin transition – from repressive chromatin in blastula stage embryos to active chromatin in segmentation stage embryos – at a core set of genes encoding TFs that direct primarily anterior development in the zebrafish embryo."]

The statement referenced by reviewer 2 has been revised as requested (subsection “TALE factors control the chromatin state at Class 4 MPADs associated with the anterior GRN”). We have also reviewed the text to make sure that our statements do not imply general conclusions, unless warranted.

Reviewer #1:

[…] i) The TALE knock-down phenotype needs to be described/characterised in more detail, to provide a more comprehensive view of the developmental context and to validate the specificity of the morpholino cocktail. For instance, hindbrain segmentation/neuroanatomy and craniofacial morphology should be characterised in the triple morphants to provide more detailed evidence that the knockdown is as expected given the previously characterised MO phenotypes. I also suggest moving the justification for using MO's that is in the Figure 1—figure supplement 1 legend to the Materials and methods section or the main text, where it will be more prominent.

This issue is addressed under Essential revisions – point 2 above.

ii) An assumption made is that Prep1.1-bound sites, or at least a sub-set of them, represent enhancer elements. The authors must demonstrate that this is true by testing a few such elements by reporter assay in zebrafish embryos, coupled with mutation of predicted TALE and NF-Y sites to address the importance of these sites for enhancer function. This can be done relatively quickly by transient transgenesis, is frequently used for mechanistic dissection of cis-regulatory elements, and will provide crucial evidence for the functionality of these putative enhancers.

This issue is addressed under Essential revisions – point 1 above.

iii) The authors use data from mouse ESCs to infer evolutionary conservation of the TALE GRN and of TALE-NF-Y co-localised binding, which expands the scope beyond zebrafish. A complementary approach is to address how many Prep1.1-bound peaks overlap with fish-mammal conserved non-coding elements that have been described in the literature. It is also worth checking if any are homologous to elements in the VISTA enhancer browser and have been experimentally validated in transgenic mice. This is straightforward to do and could potentially add weight to the argument that these interactions are evolutionarily conserved.

This issue was addressed as part of Essential revisions – point 1 above. Specifically, the revised Figure 4E now shows that MPAD sites (particularly the Class 4 sites that are found near TALE-dependent developmental regulators) coincide with genomic regions conserved among vertebrates. In addition, we selected a subset of elements for more in depth analyses of conservation and this data is now presented in Figure 4—figure supplement 1D. Lastly, we find that many enhancers shown to be active in the hindbrain (Grice et al., 2015; Parker et al., 2011), coincide with Preppeaks (Figure 4—figure supplement 1C and subsection “Some TALE-occupied sites are associated with chromatin marks at blastula stages”, first paragraph).

iv) This manuscript would really benefit from a final diagram depicting the model of Prep1.1/TALE DNA-binding dynamics across developmental time and how this relates to the activation of the components of the TALE-GRN, changes in chromatin state and interactions with co-factors.

A diagram is now included as Figure 7H.

Reviewer #2:

[…] Concerns that should be addressed:

1) Results – General Consideration:

[…] However, it might be useful to underscore throughout the text and in the Discussion that TALE factors adopt distinct mechanistic strategies in zebrafish blastula and early gastrula versus segmentation stages; in other words to simply cluster together the functions of TALE TFs in blastula and early gastrula within one single group (comprising 3.5, 4, 4.5, and 6 hpf). To this end, it would help to slightly modify the cartoon illustrating the subsequent zebrafish developmental stages in Figure 1A. The authors could group [blastula stages and early gastrula stages] within one single bracket or inside one single box andthe [segmentation stages] inside another bracket or box. Accordingly, this clustering could be clarified in the figure legend.

This issue is addressed under Essential revisions – point 3 above.

2) Results, subsection “TALE factors control the chromatin state at Class 4 MPADs associated with the anterior GRN”: The authors state that TALE GRN genes are significantly associated with Class 4 and Class 3, but not Class 1 or 2, MPADs (Figure 6A, B). However, RNA-seq (Figure 5C) shows that genes associated with Class 3 MPADs (and also Class 1 and 2 MPADs) are expressed at similar levels at 12hpf and 6hpf (Figure 5D). In contrast, Class 4 MPADs display higher levels of H3K27ac at 9hpf than at 4.5hpf (Figure 5A, B) and their associated genes show the greatest increase in expression between 6hpf and 12hpf. In addition, only class 4 MPADs showed a strong switch to an active chromatin state from 4.5hpf to 9hpf during zebrafish embryogenesis (Figure 5A, B), while class 3 MPADs did not exhibit any significant switch (Figure 5A, B). Collectively, these results are somewhat difficult to reconcile. The authors should qualify these findings and try to explain these differences. Are other factors necessary for activation of Class 3 MPADs? Or do the acetylation changes appear at a later time-point for Class 3 MPADs? Can other scenarios be envisaged? It would be helpful to add these considerations at least to the Discussion section of the paper.

This issue is addressed under Essential revisions – point 4 above.

3) Results: The authors state: "These findings indicate that TALE factors control gene expression by regulating a chromatin transition – from repressive chromatin in blastula stage embryos to active chromatin in segmentation stage embryos – at a core set of genes encoding TFs that direct anterior development."

This statement is very strong and absolute, whereas it does not hold across the entire animal kingdom. In fact, in the mouse TALE factors substantially affect "posterior" development, as shown by the presence of severe posterior developmental defects in various mouse models with LOF for different TALE TFs. For example, in Pbx LOF mouse embryos it has been reported that the posterior axial skeleton, the hindlimb, and the urogenital system are severely compromised, among other posterior structures. In addition, also in the zebrafish embryo, Prep binds many loci in addition to ones associated with the anterior GRN and some of these additional sites are near genes that regulate other developmental processes known to involve TALE function. For instance, Prep peaks are found near genes involved in myogenesis (Figure 2D, E) and expression of myogenic genes is disrupted in TALE KD embryos, as the authors report in the Discussion section of the paper. Given all of these considerations, the statement above should be at least qualified and also toned down:

"These findings indicate that TALE factors control gene expression by regulating a chromatin transition – from repressive chromatin in blastula stage embryos to active chromatin in segmentation stage embryos – at a core set of genes encoding TFs that direct primarily anterior development in the zebrafish embryo."

This issue is addressed under Essential revisions – point 6 above.

4) Results, subsection “NF-Y proteins regulate TALE GRN expression and form complexes with TALE factors”, first paragraph: The authors describe a NF-Y motif that is specifically enriched at Prep3.5hpf peaks (Figure 7B). NF-Y is also maternally deposited in zebrafish (Figure 7—figure supplement 1A), consistent with a joint role for TALE and NF-Y factors at very early developmental stages. This is a critical finding, as it identifies a potential new cofactor for TALE proteins. The authors could better emphasize this exciting finding, for example drawing a parallel between the roles of NF-Y in zebrafish and in mouse embryonic development. In fact, also in the mouse NF-Y has been shown to have critical functions for very early embryonic development (Bhattacharya et al., 2003). This parallel would broaden the impact and breadth of the reported findings and would relate them also to other species, e.g. mammals.

A more extensive discussion of previous work on NF-Y in embryogenesis (including the Bhattacharya reference) is now included in the Results subsection “NF-Y proteins regulate TALE GRN expression and form complexes with TALE factors”, second paragraph.

5) Discussion: Given the wealth of high-quality results and novel findings that are reported in this interesting study, it would greatly help to add one additional figure – or figure panel – with a cartoon that summarizes and illustrates in a succinct manner the most salient findings. This would leave the reader with a strong and easy-to-remember 'take-home' message.

A diagram is now included as Figure 7H.

Reviewer #3:

[…] This is an important work mainly for two conclusions; one is that TALE transcription factors show stage-specific preference for binding sites defined by binding sequence. The data are compelling on this conclusion and suggest that it is the activation of tissue-specific transcriptional cofactors at later stages what directs this specificity.

The second finding is the identification of a cooperative complex between two pioneering complexes Prep-Pbx and NF-Y. This is an important finding, although some aspects remain only superficially explained or are not fully conclusive. My main concern on this point is that the authors claim to have discovered the function of the Prep-Pbx-NF-Y binding sites, however they do not perform any functional assay on this sequence. In my opinion, the data shown are only correlative with respect to this point. While elimination of Prep-Pbx function or elimination of NF-Y function affects a common set of targets, this does not demonstrate that the effect is through the action of these proteins as a complex on the Prep-Pbx-NF-Y binding sequence.

We have now carried out a functional analysis of these binding sites (see Essential revisions – point 1 above).

In the original manuscript, we were careful not to claim that a Prep:Pbx:NFY complex represents the functional unit controlling the TALE GRN – since this is difficult to prove experimentally. Our data show 1) that TALE and NF-Y TFs occupy binding sites (DECA and CCAAT) that are close to each other (average distance = 20bp), 2) that these TFs regulate a shared set of genes and 3) that they can form a complex. Hence, “We conclude that NF-Y binds adjacent to TALE factors at DECA sites and that both factors are required for regulation of the TALE GRN, possibly by functioning in a complex.”

Also, is the Prep/Pbx or NF-Y function different for those sites/genes in which only the Prep-Pbx sites are present versus those in which the Prep-Pbx-NF-Y site is found?

We partially addressed this point in the original manuscript (new Figure 7—figure supplement 1C), showing that TALE-occupied sites with adjacent NF-Y motifs have higher levels of H3K27ac, reduced nucleosome occupancy and reduced levels of H3K27me3 compared to ones lacking NF-Y motifs. However, these effects are relatively subtle and we did not examine them further. The main limitation with this avenue of inquiry is that we do not have NF-Y ChIP-seq data in zebrafish. Hence, we do not know which NF-Y sites are occupied (our ChIP-qPCR analysis in Figure 7C indicates that ~60% of NF-Y sites are occupied), which makes it difficult to draw conclusions about NF-Y activity at these sites.

Also, given that the set of sensitive genes are tissue-specific and their activation associates with the colonization of nearby Prep-only and Pbx-Hox sites, could the function of these new sites be the one that is relevant for chromatin opening and transcriptional activation and not the interaction at the Prep-Pbx site?.

The experiments in Figure 6F and 7F, which demonstrate that TALE and NF-Y are required for increased H3K27 acetylation, specifically assay the chromatin state at MPADs (that contain DECA+NFY motifs) and were done at a stage (9hpf) prior to expression of most tissue-specific genes. This suggests that TALE and NF-Y acting at DECA+NFY motifs play an important role.

Therefore, if possible, experiments in which the Prep-Pbx-NF-Y binding sequence is functionally analyzed for factor binding and enhancer activity should be included to make conclusions stronger.

This issue is addressed under Essential revisions – point 1 above.

In connection to this, a more comprehensive description of the prevalence of the Prep-Pbx and Prep-Pbx-NF-Y sites among the classes studied would help understanding whether there is specific association to subclasses. This would include the classification according to stage (blastula-specific, segmentation-specific and common) and MPAD classes. Also a detailed description of the occupancy of specific sites associated to the TALE GRN containing the Prep-Pbx-, Prep-Pbx-NFy, Prep-only or pbx-Hox binding sites and how these evolve between blastula and segmentation would be a very valuable addition to the manuscript.

We do not find any significant differences in the distribution of DECA versus DECA+NFY sites among the various regulatory elements (MPADs and non-MPADs), nor do we find a difference in their association with TALE GRN genes. This is now shown in Figure 7—figure supplement 1D and discussed in the second paragraph of the subsection “NF-Y proteins regulate TALE GRN expression and form complexes with TALE factors”.

https://doi.org/10.7554/eLife.36144.066

Article and author information

Author details

  1. Franck Ladam

    Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, United States
    Contribution
    Conceptualization, Formal analysis, Investigation, Visualization, Writing—review and editing
    Competing interests
    No competing interests declared
  2. William Stanney

    Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, United States
    Contribution
    Investigation, Writing—review and editing
    Competing interests
    No competing interests declared
  3. Ian J Donaldson

    Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
    Contribution
    Formal analysis, Writing—review and editing
    Competing interests
    No competing interests declared
  4. Ozge Yildiz

    Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, United States
    Contribution
    Investigation, Writing—review and editing
    Competing interests
    No competing interests declared
  5. Nicoletta Bobola

    Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
    Contribution
    Conceptualization, Supervision, Funding acquisition, Writing—review and editing
    Competing interests
    No competing interests declared
    ORCID icon 0000-0002-7103-4932
  6. Charles G Sagerström

    Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, United States
    Contribution
    Conceptualization, Supervision, Funding acquisition, Writing—original draft, Project administration, Writing—review and editing
    For correspondence
    Charles.Sagerstrom@umassmed.edu
    Competing interests
    No competing interests declared
    ORCID icon 0000-0002-1509-5810

Funding

National Institute of Neurological Disorders and Stroke (NS38183)

  • Charles G Sagerström

Biotechnology and Biological Sciences Research Council (BB/N00907X/1)

  • Nicoletta Bobola

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We are grateful to the Genomic Technologies and Bioinformatics Core Facilities at the University of Manchester, UK and to Alper Kucukural at the University of Massachusetts Bioinformatics Core for assistance.

Ethics

Animal experimentation: This study was submitted to and approved by the University of Massachusetts Medical School Institutional Animal Care and Use Committee (protocol A-1565) and the University of Massachusetts Medical School Institutional Review Board (protocol I-149).

Reviewing Editor

  1. Marianne Bronner, Reviewing Editor, California Institute of Technology, United States

Publication history

  1. Received: February 22, 2018
  2. Accepted: June 8, 2018
  3. Accepted Manuscript published: June 18, 2018 (version 1)
  4. Version of Record published: June 28, 2018 (version 2)

Copyright

© 2018, Ladam et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 461
    Page views
  • 63
    Downloads
  • 0
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading

    1. Developmental Biology
    Ariel M Pani, Robert Goldstein
    Research Article
    1. Cell Biology
    2. Developmental Biology
    Matthew Richard Johnson et al.
    Research Article Updated