1. Developmental Biology
  2. Chromosomes and Gene Expression
Download icon

Establishment and maintenance of heritable chromatin structure during early Drosophila embryogenesis

  1. Shelby A Blythe  Is a corresponding author
  2. Eric F Wieschaus  Is a corresponding author
  1. Howard Hughes Medical Institute, Princeton University, United States
Short Report
  • Cited 30
  • Views 3,466
  • Annotations
Cite this article as: eLife 2016;5:e20148 doi: 10.7554/eLife.20148

Abstract

During embryogenesis, the initial chromatin state is established during a period of rapid proliferative activity. We have measured with 3-min time resolution how heritable patterns of chromatin structure are initially established and maintained during the midblastula transition (MBT). We find that regions of accessibility are established sequentially, where enhancers are opened in advance of promoters and insulators. These open states are stably maintained in highly condensed mitotic chromatin to ensure faithful inheritance of prior accessibility status across cell divisions. The temporal progression of establishment is controlled by the biological timers that control the onset of the MBT. In general, acquisition of promoter accessibility is controlled by the biological timer that measures the nucleo-cytoplasmic (N:C) ratio, whereas timing of enhancer accessibility is regulated independently of the N:C ratio. These different timing classes each associate with binding sites for two transcription factors, GAGA-factor and Zelda, previously implicated in controlling chromatin accessibility at ZGA.

https://doi.org/10.7554/eLife.20148.001

Introduction

Early during development, the genetic loci involved in cell fate specification and differentiation adopt chromatin structure that allows cells to respond accurately to spatial and temporal developmental cues (Guenther et al., 2007; Zeitlinger et al., 2007; Vastenhouw et al., 2010; Levine, 2011; Li et al., 2014; Zhang et al., 2014; Harrison and Eisen, 2015; Sun et al., 2015). Cells of the early embryo must propagate these chromatin states following mitosis and DNA replication (Ferraro et al., 2016). However, it is unclear how such patterns of chromatin accessibility are established in the short interphases and rapid cell cycle progression that characterize early embryonic development, and it is equally unclear how such patterns are inherited following each cell division.

In order for chromatin structure to be maintained continually over multiple cell divisions, mechanisms must exist to overcome nucleosome disruption during DNA replication and mitosis. Nucleosomes are displaced during replication of the parental DNA strand, and re-establishment of prior open and accessible states requires the direct competition between mechanisms for de novo nucleosome assembly and cis-regulatory factors (Ramachandran and Henikoff, 2016). Similarly, mitosis is characterized by highly condensed chromatin structure, and it remains unclear the degree to which prior accessibility states are maintained under these conditions. Notably, despite this highly condensed state, it is possible that the underlying structure retains the overall organization and accessibility state of interphase nuclei (Earnshaw and Laemmli, 1983). Indeed, a recent study of chromatin structure in murine erythroblasts revealed that the patterns of DNase hypersensitivity are largely preserved during mitosis (Hsiung et al., 2015). We wanted to assess when such features of chromatin structure are acquired during early embryogenesis.

Historically, these questions have been difficult to address in embryos. Classical methodologies for measuring chromatin accessibility (e.g., DNase hypersensitivity) and nucleosome positioning (e.g., Micrococcal nuclease digestion) typically require large quantities of input chromatin, on the order of 103–105 embryos. Such a high requirement for input material significantly limits the temporal resolution of these experiments, requiring embryos to be pooled from collections spanning hours. Yet, changes in early embryonic chromatin structure are predicted to occur over minute timescales (Blythe and Wieschaus, 2015b). A recently developed approach, ATAC-seq (Buenrostro et al., 2013), overcomes many of these limitations, allowing for sensitive measurement of both chromatin accessibility and nucleosome positioning in samples as small as single cells.

ATAC-seq generates profiles of chromatin accessibility by fragmenting the genome with Tn5 transposase. Fragmentation preferentially occurs at open, nucleosome-free regions. Accessible regions of the genome are recovered as small (50–100 bp) fragments. In addition, because nucleosome occupancy is refractory to fragmentation, a second population of larger protected fragments is also recovered, and these are used to predict nucleosome positions (Buenrostro et al., 2013). In practice, nucleosome positioning is best estimated by coverage of large protected fragments that are flanked by small accessible fragments (Schep et al., 2015). Therefore, the organization of generally ‘open’ chromatin, composed of a mixture of short tracts of accessible DNA with interspersed nucleosomes is effectively measured by this technique.

In the following, we have applied ATAC-seq to measure changes in chromatin accessibility and nucleosome positioning over multiple cell divisions with 3-min time resolution during the three syncytial cell cycles (NC11-13) preceding the Drosophila midblastula transition (MBT) and large-scale zygotic genome activation (ZGA) (Farrell and O'Farrell, 2014; Blythe and Wieschaus, 2015a; Harrison and Eisen, 2015). During this period, nuclei rapidly cycle synchronously between S-phase and M-phase with stereotypic cell cycle timing and little or no distinction between early- and late-replicating chromatin compartments (Farrell and O'Farrell, 2014). Despite this intense cell cycle activity, embryos arrive at the MBT having established the initial chromatin state upon which developmental patterning systems will initially operate (Blythe and Wieschaus, 2015a; Harrison and Eisen, 2015). It is unclear whether such regions are established anew during each cell cycle, or instead whether establishment entails the acquisition of mechanisms to counteract the otherwise deleterious consequences of mitosis and DNA replication. To address this question, we measured how patterns of chromatin accessibility are established and maintained in early embryos.

Results and discussion

We performed ATAC-seq on samples from 13 timepoints (n ≥ 3 embryos per timepoint) spanning NC11 and NC13 in 3-min intervals and called peaks in order to identify when regions of ‘open’ chromatin first become accessible (see Materials and methods). Within a total set of 9824 accessible peaks, two general temporal classes emerge from this analysis that reflect the timing of the initial acquisition of accessibility: peaks that are accessible early and that persist throughout the entire period of observation (Figure 1A, middle panel, and Figure 1B ‘Open by NC11’, n = 3084 (31%)), and peaks that gain accessibility during this period (Figure 1A: right panel; and Figure 1B: New at NC12 or NC13, n = 6740 (69%)). These peaks were assigned to genomic features based on existing annotations (promoters, insulators, and enhancers; see Figure 1B and Materials and methods). During this period, 78% (3027 of 3887) of all promoters and insulators dynamically gain accessibility during NC12 and NC13, whereas 47% (1538 of 3260) of all enhancers are open early, already by NC11. We also detect accessibility at 717 peaks overlapping with 854 experimentally validated enhancers from the Vienna Fly Enhancers collection (19.9% of all enhancers in the collection, n = 3604) (Kvon et al., 2014). This subset of enhancers is similarly enriched for sites that gain accessibility by NC11 (41%, 294 of 717: p=1.1x10−7 by Fisher’s Exact Test for ± NC11 by ± Fly Enhancer). These results suggest that, on average, enhancer accessibility precedes that of promoters. This effect is evident on both short and long timescales. Over short timescales on the order of minutes, we observe stable enhancer accessibility precedes that of associated promoters (e.g. hunchback P1, tribbles, empty spiracles, Ultrabithorax, sloppy paired 2, forkhead) (Figure 1A and Figure 1—figure supplements 48). This observation is consistent with a model where promoter accessibility is limiting for activation of expression of these genes in spatially restricted patterns, and that enhancer elements could be ‘primed’ independently of these promoters in advance of expression. In support of this, we also observe that within the set of Vienna Fly Enhancers overlapping our set of accessible peaks, although 386 (45.2%) of these enhancers drive active gene expression in early embryos (stage 4–6), 468 of these enhancers (54.8%) are not transcriptionally active until hours later in development (Figure 1C). Notably, compared with peaks that open early (NC11), the set of peaks that open late (NC12-13) are enriched for enhancers that will be active later in development (>stage 4–6: n = 158 for NC11 and 310 for NC12 and 13: p=1.13x10−7 by Fisher’s Exact Test for ± NC11 by ± stage 4–6 expression).

Figure 1 with 26 supplements see all
Sequential establishment of chromatin accessibility at the MBT.

(A) (Left column) Single embryos were collected for ATAC seq at the indicated time points. Micrographs show stage-matched panels from a time-lapse image of a single Histone H2Av-GFP embryo. Scale bar = 10 μm. Coverage of sequence reads from accessible chromatin over the hunchback (center) and scylla (right) loci are shown. The hunchback P2 promoter/enhancer and associated shadow enhancer in addition to the later-acting stripe enhancer (arrows) are open and accessible at all time points measured. A schematic summarizing the regulatory interactions of the hunchback locus is shown at bottom left. The scylla TSS gains accessibility at NC12 + 9’ and is stably maintained thereafter. Scale bars (red) equal 1 kb. Plots show mean coverage from at least n = 3 replicates. (B) The fraction of each genomic feature present during each cell cycle is plotted as a stacked bar chart. ‘NC12 new’ and ‘NC13 new’ indicate the set of peaks newly called present in each respective cell cycle. Not shown: 2898 peaks not found to overlap with available genomic annotations used to classify enhancers, promoters, or insulators. (C) The association of functionally validated enhancers within each ATAC-seq timing class was calculated, and this plot shows what fraction of these are active at the indicated timepoints. Solid bars indicate the fraction of enhancers whose expression is first detected at the indicated timepoint. The lighter remaining bar indicates the total fraction of active enhancers associated with each ATAC-seq timing class. Color coding is as for panel B, and enhancers not overlapping with the ATAC-seq peaks are shown in red. The estimated elapsed time post NC11-NC13 for each scored stage of development (bottom axis) is indicated on the top axis. (D) Odds ratios for enrichment of the indicated genomic features and transcriptional regulators were calculated. Early chromatin accessibility is enriched for enhancers (p=2.54x10−109) and binding of Zelda (p=3.7x10−225). Late or dynamic chromatin accessibility is enriched for promoters (p=4.62x10−42), insulators (p=7.71x10−14), and binding of GAF (p=3.76x10−19). p-Values are from two-sided Fisher’s exact test on contingency tables constructed on [-/+ feature by early/dynamic].

https://doi.org/10.7554/eLife.20148.002

In agreement with prior evaluations of the onset of zygotic gene expression (Pritchard and Schubiger, 1996; De Renzis et al., 2007; Liang et al., 2008; Harrison et al., 2011; Lott et al., 2011; Chen et al., 2013; Ali-Murthy et al., 2013; Blythe and Wieschaus, 2015b), we find that promoters for early expressed genes associated with the initial steps of embryonic patterning are in general open and accessible early, from NC11 onward. This includes the sets of gap genes (giant, hunchback P1, Krüppel, and knirps, see Figure 1A and Figure 1—figure supplements 911), pair-rule genes (even-skipped, hairy, odd-skipped, fushi tarazu, runt, paired, sloppy paired 1, and odd paired, see Figure 1—figure supplements 1219), and terminal genes (huckebein and tailless, see Figure 1—figure supplements 20 and 21) that initially subdivide the embryonic anterior-posterior axis and termini. Likewise, we find that promoters for genes required to pattern the dorsal-ventral axis (decapentaplegic, zerknullt, short gastrulation, brinker, snail, and twist, data not shown) are scored open by NC11. Later acting embryonic patterning genes, such as the segment polarity genes, demonstrate variable timing for acquisition of chromatin accessibility, with the wingless promoter opening by NC11, but those of engrailed and patched open at NC12, and the hedgehog promoter becomes open by NC13 (Figure 1—figure supplements 2225). Similarly, the ‘head gap genes’ also show variable acquisition of accessibility, with orthodenticle (oc), cap’n’collar and buttonhead opening by NC11 (data not shown), but empty spiracles opens at NC12, and crocodile opens at NC13 (Figure 1—figure supplements 5 and 25). A complete tabulation of recovered promoters and categorization of timing classes is provided in Supplementary file 4.

These two timing classes of chromatin regions are also differentially enriched for binding of Zelda and GAGA-factor (GAF), two transcription factors previously implicated in driving large-scale changes in chromatin structure during ZGA (Blythe and Wieschaus, 2015b; Schulz et al., 2015; Sun et al., 2015). Loci that are open early are enriched for binding of Zelda, whereas dynamic loci are enriched for binding of GAF (Figure 1D). These results suggest that the initial establishment of open chromatin proceeds in a stepwise or sequential manner where accessibility of genomic cis-regulatory elements precedes that of promoter and insulator elements. The coordinated change in promoter accessibility that occurs during NC12 and NC13 coincides with the large-scale recruitment of RNA Pol II to thousands of genes at ZGA (Blythe and Wieschaus, 2015b). These observations further support a model where the gain of promoter accessibility is the limiting factor for large-scale ZGA.

The timing of the MBT is controlled by at least two biological clocks, one which measures the relative nucleo-cytoplasmic (N:C) ratio and another which is N:C ratio independent and instead depends on elapsed developmental time (Newport and Kirschner, 1982; Edgar et al., 1986; Edgar and Schubiger, 1986; Tadros et al., 2003, 2007; Lu et al., 2009; Blythe and Wieschaus, 2015a). We have previously predicted that chromatin structure is the primary target of regulation by these biological timers (Blythe and Wieschaus, 2015b; Blythe and Wieschaus, 2015a). If this were the case, then we would expect that delaying the onset of the MBT would lead to corresponding delays in acquisition of chromatin accessibility. To distinguish the relative contribution of each timer to establishment of chromatin accessibility, we performed an ATAC-seq timecourse on pairs of N:C ratio-matched haploid embryos maternally mutant for the sesame locus (ssm, flybase: Hira, see also Materials and methods) (Loppin et al., 2000; Bonnefoy et al., 2007). Haploid embryos undergo one additional pre-MBT mitotic division in order to achieve the same N:C ratio as their diploid counterparts (Figure 2A) (Edgar et al., 1986). Therefore, we predicted that any N:C ratio-dependent chromatin remodeling would be delayed in haploid compared with diploid embryos (Lu et al., 2009), whereas the timing of N:C ratio independent events would remain unchanged.

Figure 2 with 2 supplements see all
Dynamic acquisition of chromatin accessibility is regulated by the N:C ratio.

(A) Cell cycle times for diploid (+/+, n = 21) and haploid embryos (ssm, n = 12) were measured by time lapse confocal imaging of His2Av-GFP. Mean cell cycle times ± SEM are indicated. Colored regions indicate periods of equivalent N:C ratios (0.25–1.0) between genotypes. (B) Accessible peaks were identified for haploid embryos and the fraction of overlap or co-occurence between diploids and haploids was calculated and plotted for the indicated cell cycles. (C) Median accessibility for haploid and diploid embryos was calculated for the sets of regions indicated in the legend (n = 2630 N:C ratio dependent regions, n = 1658 time-dependent regions). All differences in magnitude between time- and N:C ratio-dependent peaks in haploid embryos are statistically significant at p<0.01 by a randomization test, whereas only the NC13 + 15’ time point achieves p<0.01 in the diploid dataset. Significance testing performed by proportionally dividing all datapoints into two groups at random and testing whether observed differences in median accessibility were equal to or greater than the difference observed between the ‘N:C ratio’ and ‘time’ groupings (one-tailed permutation test for n = 1×104 trials). (D) Median accessibility data from panel C was re-scaled along the x-axis to match relative N:C ratios as shown in panel A. (E) Odds ratios for enrichment of the indicated genomic features and transcriptional regulators were calculated. Time-dependent chromatin accessibility is enriched for enhancers (p=5.78x10−7) and Zelda binding (p=2.42x10−12), whereas N:C ratio dependent chromatin accessibility is enriched for promoters (p=5.66x10−17), insulators (p=2.42x10−4), and GAF binding (p=2.72x10−09). p-Values are from two-sided Fisher’s exact test on contingency tables constructed on [-/+ feature by time/N:C]. GAF odds ratios were calculated from regions that were GAF+ and Zelda-. GAF is significantly enriched in both N:C and time-dependent classes if co-occurrence with Zelda is considered.

https://doi.org/10.7554/eLife.20148.029

Open and accessible chromatin regions in haploids were identified on a per-cell-cycle basis as described above (n = 7217 total peaks in ssm). At equivalent developmental times, only 34.3 ± 8.1% of dynamic peaks are accessible in haploids compared with diploids. In contrast, at equivalent N:C ratios, 84.8 ± 11.8% of dynamic peaks are accessible in haploids versus diploids (Figure 2B, n = 4288 scored peaks). Based on this analysis, we classified each scored peak as N:C ratio- or time-dependent (n = 2630 and 1658, respectively). In diploid embryos, changes in accessibility occur with similar dynamics in both time- and N:C-ratio-dependent loci. In haploids, time-dependent loci gain accessibility in advance of N:C-ratio-dependent loci (Figure 2C), which in turn only become accessible upon attaining the correct N:C ratio (Figure 2D). Importantly, classification into these different timing classes recapitulates the early versus late/dynamic dichotomy we observe between different genomic regulatory features (Figure 2E). Time-dependent loci are significantly enriched for enhancers and Zelda-associated regions (Figure 2E), similarly to the set of early accessible domains (Figure 1D). In contrast, N:C-ratio-dependent loci are significantly enriched for both promoter, insulator and GAF-associated regions (Figure 2E), similarly to the set of late or dynamic regions (Figure 1D). These results indicate that, similar to large-scale ZGA and cell cycle remodeling, the acquisition of chromatin accessibility at the MBT is likewise sensitive to the embryonic N:C ratio. Taken together, these results are consistent with a model where the biological timers that determine the MBT operate via distinct sets of chromatin regulatory factors such as Zelda and GAF in order to impart the sequential establishment of chromatin structure in advance of ZGA.

One general feature of chromatin we observe is that patterns of accessibility, once established, are maintained during both DNA replication and mitosis. We expected that large-scale chromatin condensation at metaphase would result in loss of accessibility during mitosis. This is not the case. Regulatory elements, such as promoters and enhancers (for example: hb P2 promoter/enhancer and shadow enhancer, Figure 1A, middle panel arrows) remain accessible even under conditions of large-scale chromatin condensation that accompanies mitotic metaphase (NC11 + 9’, NC12 + 12’, and NC13 + 18’, see also Figure 1—figure supplement 1). Genome-wide, 73.5 ± 10% (mean ± sd) of all peaks maintain accessibility during metaphase (see also Figure 1—figure supplement 2). These results indicate that mitotic chromatin condensation does not necessarily erase prior accessibility states and that functional heritability of such states (Ferraro et al., 2016) across cell cycles could therefore stem from a simple mechanism for mitotic stability of chromatin accessibility.

When examined on a finer scale, established open regions display subtle cyclic periods of maximal and minimal accessibility that arise from DNA-replication-associated disruption and recovery of nucleosome structure. The period of minimal average chromatin accessibility correlates with the initial post-mitotic burst of DNA replication (Blumenthal et al., 1974), as estimated by the intensity of nuclear foci of PCNA-EGFP at replication factories (McCleland et al., 2009) (Figure 3A, see also Figure 3—figure supplement 1 and Materials and methods). For example, during each cycle, the hb P2 promoter/enhancer displays minimum accessibility 3 min after entering interphase and subsequently reaches maximum accessibility during mitotic prophase (Figure 3B, red trace). The observed increase in overall accessibility during interphase positively correlates with the transcriptional activity of this locus, as measured by an MS2::MCP reporter (Garcia et al., 2013) (Figure 3B, blue trace). However, at mitotic metaphase, although transcriptional activity ceases (Figure 3B) and RNA Polymerase II is largely evicted from chromatin (Figure 3C), chromatin accessibility is nonetheless maintained (Figure 3B, grey bars). This suggests that the major constraint on the maintenance of chromatin architecture is nucleosome disruption during DNA replication and is consistent with a model where regulatory factors function to rapidly re-establish patterns of accessibility following passage of the replication fork.

Figure 3 with 1 supplement see all
Patterns of stable and dynamic chromatin accessibility over the cell cycle.

(A) Genome-wide mean ATAC-open coverage is plotted (red, error bars show std. dev. between peaks, n = 9824) over the mean scaled heterogeneity in PCNA-EGFP to estimate DNA replication activity (blue, n = 4, error bars show std. dev. between embryos). Mitotic phases are indicated by grey shading. (B) Mean coverage of sequencing reads from accessible chromatin over the hunchback P2 promoter/enhancer is plotted (red) over the mean measured spot fluorescence intensity from a hunchback P2> MS2(24) LacZ: : MCP-GFP reporter (blue, n = 3, error bars show std. dev. between embryos). Mitotic phases are indicated by grey shading. (C) Immunofluorecence for chromatin morphology (DAPI) and RNA Polymerase II (CTD pSer5) in a region of an NC13 embryo transitioning from prophase to metaphase is shown. Observed mitotic states are indicated at left. Scale bar = 10 μm.

https://doi.org/10.7554/eLife.20148.032

Indeed, previous studies indicate that DNA replication can disturb patterns of nucleosome distribution, resulting in short-term interference with chromatin accessibility at newly replicated loci, which is counteracted by competition with regulatory factors that promote re-establishment of interphase chromatin architecture (Ramachandran and Henikoff, 2016). To measure how nucleosome positioning changes during these early embryonic cell cycles, we mapped nucleosome positions flanking the total set of accessible chromatin regions using NucleoATAC (Schep et al., 2015). One conserved feature of promoter chromatin architecture is a characteristic nucleosome-free region (NFR) upstream of the TSS. Promoters are useful for evaluating nucleosome positioning because, unlike in enhancers, the occupying nucleosomes adopt stereotypical phased positions around the NFR, anchored to a determined point of reference, the TSS, which facilitates the evaluation of dynamic nucleosome behavior at accessible sites following DNA replication.

In the Drosophila genome, on average, the +1 nucleosome is centered 135 bp downstream of the TSS, and the −1 nucleosome is centered 180 bp upstream of the TSS (Mavrich et al., 2008). Such NFR structures are observed at promoters between NC11 and NC13 (Figure 4A and Figure 4—figure supplement 1B). However, significant disruption to this promoter structure is clearly observed during DNA replication, particularly at 3 min into NC11 (Figure 4A and B, NC11 + 3’). Promoters largely recover nucleosome positioning by prophase and metaphase (Figure 4B and Figure 4—figure supplement 1, NC11 + 6’ and +9’, respectively). Nucleosome organization at promoters undergoes a similar disordering and recovery during NC12. However, the ability of replication to disrupt nucleosome structure is strikingly reduced during NC13, the developmental period when large-scale changes in zygotic transcriptional activity are first observed (Figure 4A,B, and Figure 4—figure supplement 1, NC13 + 3’ and +6’) (Chen et al., 2013; Blythe and Wieschaus, 2015b). This effect is also reflected in the times required for prior accessibility levels to be recovered at these regions following replication. On average, NFRs recover prior accessibility levels by 9 min into NC12. Despite the longer overall cell cycle time and incipient changes to the rate of S-phase progression during NC13, recovery of prior accessibility states is achieved on average by 6 min into NC13 (Figure 4—figure supplement 2). These results indicate that over this period there is a shift in the balance between replication-coupled nucleosome disordering and the counteracting regulatory factors that maintain patterns of chromatin accessibility. Initially, post-replicative gains in accessibility, although maintained through mitosis, are erased by DNA replication during the following cell cycle. Stabilization of nucleosome structure during NC13, however, would allow for large-scale recruitment of RNA Pol II to promoters early in the cell cycle. These observations are consistent with previous observations that the temporal regulation of the MBT stems from conflicting interactions between RNA transcription and DNA replication, at the level of the initial establishment of accessible chromatin structure (Blythe and Wieschaus, 2015b).

Figure 4 with 2 supplements see all
Attenuation of replication coupled nucleosome disruption at the MBT.

(A) Predicted nucleosome occupancy is plotted for the selected time points (top) for the 800 bp region flanking a set of early embryonic promoters, centered over the TSS. These timepoints correspond to the initial phases of DNA replication in each cell cycle. Nucleosome profiles for the complete timecourse are provided in Figure 4—figure supplement 1. Promoters are ordered on the y-axis by the average −1 nucleosome position over the entire time course, and the relative nucleosome occupancy is represented by the colorbar at left. The log10 average occupancy signal for each time point is plotted below each heatmap (red). Maximum nucleosome signal at the TSS over the entire timecourse is indicated by the grey line. (B) Mean-normalized relative NFR sizes for time (red) and N:C ratio (blue) dependent loci are plotted for haploid embryos for comparison with diploid embryos (green). Data are plotted as a function of N:C ratio (x-axis). NFRs for time-dependent loci demonstrate minimal closing during metaphase (N:C = 0.25, p<0.01, asterisk) and early S-phase (N:C = 0.5, p<0.01, asterisk) compared with N:C-ratio-dependent loci and N:C ratio matched diploid loci. p-Values indicate frequencies of observing differences between randomly selected loci greater than or equal to that observed for the time- and N:C-ratio-dependent groups (one-tailed permutation test for n = 1×104 trials). Error bars show the 95% confidence interval for differences between the plotted median values. (C) Mean-normalized relative NFR sizes for Zelda-associated (red) and GAF-associated (blue) promoters was plotted for haploid embryos. Plotting and significance testing is as described for panel B. Zelda-associated promoters demonstrate increased NFR stability during early S-phase during both N:C = 0.25 and N:C = 0.5 (asterisks, p<0.01, one-tailed permutation test for n = 1×104 trials). Error bars show the 95% confidence interval for differences between the plotted median values.

https://doi.org/10.7554/eLife.20148.034

We next addressed whether the resistance to replication-coupled nucleosome disruption acquired by open regions at NC13 arises independently of the changes DNA replication rate and origin firing that also occur at this cycle (Shermoen et al., 2010; Farrell and O'Farrell, 2014). To distinguish these possibilities, we measured changes in NFR size between time- and N:C-ratio-dependent promoters in haploid embryos (Figure 4B). The average NFR size for N:C-ratio-dependent promoters in haploids correlates well with that of diploid embryos when plotted over equivalent N:C ratios (Figure 4B, blue trace), demonstrating attenuated replication coupled nucleosome disruption during the final cell cycle before the MBT. In contrast, the set of time-dependent promoters in haploids demonstrate attenuated disruption of NFR size one cell cycle earlier than their N:C ratio dependent counterparts, prior to S-phase remodeling (Figure 4B, red trace, asterisks). Likewise, in haploid embryos, early, time-dependent Zelda-associated promoters demonstrate increased resistance to replication-associated nucleosome disruption compared with later, N:C-ratio-dependent GAF-associated promoters (Figure 4C). This indicates that resistance to replication-coupled nucleosome disruption occurs independently of S-phase remodeling and is linked to the mechanisms that drive the initial establishment of accessible states.

Taken together, our observations are consistent with a model for ZGA where chromatin accessibility at genomic features is conferred in defined steps and that establishment of particular states depends on competition between DNA replication machinery and cis-regulatory factors. Such competition is not limited to embryonic tissues but is likely a general mechanism for recovery of accessibility states following replication (Ramachandran and Henikoff, 2016). Our results extend these observations by indicating that such competition may be a major point of regulation for the developmental acquisition of stable patterns of chromatin accessibility. Our results also indicate that stability of nucleosome positioning during mitosis is a major mechanism for epigenetic inheritance of prior chromatin states from one cell cycle to the next. Whether such stability requires an active mechanism for stabilization, as has been proposed in mammalian systems (Blobel et al., 2009; Hsiung et al., 2015), remains to be determined. Our model therefore predicts that the initial establishment of stable accessible chromatin states during periods of intense mitotic activity is limited by the availability of transcription factors. Upon reaching a critical activity threshold, regulatory factors such as Zelda and GAF would successfully compete with nucleosomes to efficiently re-establish patterns of chromatin accessibility following DNA replication. This competition model is consistent with a prior observation that reduction of function for either GAF or Zelda is sufficient to reduce conflicts between the DNA replication machinery and some feature of chromatin remodeling at ZGA (Blythe and Wieschaus, 2015b). Such competitive interactions between systems for transcription and replication may also be conserved timing mechanisms in vertebrate embryos (Almouzni and Wolffe, 1995; Amodeo et al., 2015).

Materials and methods

Model organism

All reported experiments were performed on embryos from transgenic and mutant variants of the Fruit Fly Drosophila melanogaster (NCBI Taxon 7227).

ATAC-seq sample preparation

Embryos produced from either wild-type (w; His2Av-GFP) or homozygous sesame (w ssm185b; His2Av-GFP) mothers were collected on yeasted apple juice agar plates. Individual embryos were selected from plates and arrayed in individual wells in Nunc Microwell Mini Trays (VWR, Radnor, PA), and nuclear morphology was observed by Histone-GFP fluorescence under an AMG EVOS FL digital microscope at 20x magnification. Temperatures for sample collection were maintained between 21°C and 24°C to minimize variation in cell cycle timing (Supplementary file 2). Under these conditions, cell cycle times were indistinguishable from those measured by confocal microscopy and reported in Figure 4A (Blythe and Wieschaus, 2015b). Developmental stage was determined as follows: NC12 = 12 to 18 nuclei per 2500 μm2; NC13 = 22–30 nuclei per 2500 μm2; and NC14 = > 30 nuclei per 2500 μm2. Cell cycles were timed from the onset of anaphase of the previous cell cycle. The staging of samples at 3-min intervals relies on the synchrony of nuclear divisions and the extraordinary reproducibility in the timing and duration of cell cycles in the cleavage stage Drosophila embryos. Each embryo was hand-selected for analysis at metaphase of the preceding cycle. To ensure synchrony, only embryos in which all nuclei had entered metaphase within a 30 s window were selected. Anaphase was then required to have begun across the length of the embryo within a 60 s window and the total elapsed time for mitosis (metaphase start to anaphase start) was not allowed to exceed 3 min. Embryos exceeding these limits ( ~10%) were discarded. Under these conditions, embryos selected at cycle 10 anaphase reach metaphase of cycle 11 in 8.2 ± 0.4 min ± s.d., embryos selected at cycle 11 anaphase reach metaphase of cycle 12 in 10.7 ± 0.9 min ± s.d., embryos selected at cycle 12 anaphase reach metaphase 13 in 17.2 ± 0.9 min ± s.d. This precision allows us to select and process embryos for each metaphase stage by selecting at the previous anaphase. Consistent with the reproducible progression of the Drosophila cell cycles, embryos processed at metaphase and timed from the previous cycle interval cluster with each other and single metaphase embryos selected in the previous cell cycle show highly similar and correlated patterns of accessibility (see Figure 1—figure supplement 1).

Upon selecting an embryo, a timer was started upon observation of anaphase of the cell cycle prior to the target collection stage. Because processing embryos for ATAC-seq requires approximately 2.5 min, to collect at NC12 + 6 min, for example, a previously selected embryo was collected from its well at 3.5 min post-anaphase. It was then dechorionated by a 45 s exposure to freshly prepared 4% bleach (Clorox) followed by 3 x ~6 ml washes in deionized water. Subsequently, the embryo was placed inside a detached cap of a 1.5-ml microcentrifuge tube and at 6 min post-anaphase, the embryo was overlaid with 10 μl of ice cold Lysis Buffer (Buenrostro et al., 2015) (10 mM Tris-HCl pH 7.4; 10 mM NaCl; 3 mM MgCl2; 0.1% Igepal CA-630) and immediately macerated to homogeneity with a fire-polished microcapillary tube (Drummond Microcap 20 μl, Sigma-Aldrich, St. Louis, MO). The cap was next inverted and placed over a 1.5 ml low-retention microcentrifuge tube (Eppendorf) containing an additional 40 μl of Lysis Buffer and spun for 10 min at 500 RCF at 4°C. Supernatant removal was monitored under a stereomicroscope to ensure that the crude nuclear pellet, visible as a yellow-grey mass, was not disturbed. The nuclear pellet was then placed in dry ice prior to further processing until several additional samples were collected. All samples were frozen for at least 30 min prior to fragmentation and library preparation. In a set of optimization experiments (data not shown), freezing the pellet for short periods of time had no effect on the overall efficiency or quality of ATAC-seq data.

Fragmentation and amplification of ATAC-seq libraries were performed essentially as described previously (Buenrostro et al., 2015). Fragmentation reaction volumes were 10 μl for a single embryo, and used 2.5 μl of Tn5 Transposase from the Illumina Nextera Sample Preparation Kit. The volume of Tn5 was determined empirically, where 0.5 μl produced insufficient fragmentation, and 5 μl was over-fragmented. Fragmentation reactions took place for 30 min on an Eppendorf Thermomixer set to 37°C and 800 rpm. Samples were purified through Qiagen Minelute columns following the manufacturer’s protocol for enzymatic reaction clean up, eluting in 10 μl. Optimization experiments indicated that this purification was essential for high-quality ATAC-seq libraries from single embryos, although this step is considered ‘optional’ in published protocols (Buenrostro et al., 2015).

Library amplification was performed as described previously (Buenrostro et al., 2015). All libraries amplified with a range of 11–14 total PCR cycles, and the number of cycles roughly corresponded to developmental stage, where older embryos required fewer and younger embryos required more cycles (data not shown, see Supplementary file 2). Amplified libraries were purified from PCR reactions with 1.8x Ampure SPRI beads following the manufacturer’s instructions and 1 μl of a 1:10 diluted library prep was evaluated on a Bioanalyzer HS-DNA chip to estimate quality. Concentrations were estimated using a Qubit fluorometer. To control for sample loss or contamination during preparation, samples showing significant deviations were discarded, based on final library concentration, known DNA content of the starting material, total number of PCR cycles, and the overall variance of these values for all libraries for a single experiment.

For wild-type NC12 and NC13 samples, single embryos were collected for each time point (Supplementary file 1). The wild-type NC11 sample set was collected subsequently, with the purpose of comparing directly with the wild-type NC12 data set, where we observe a transitional state of chromatin structure. To facilitate this direct comparison, two NC11 embryos were collected per time point, per replicate, in order to minimize differences introduced by reduced genomic nuclear content. Similarly, the haploid NC12 and NC13 samples were collected with the intent of direct comparison with wild-type NC12 and NC13. As such, two haploid embryos were collected per time point, per replicate. Single haploid NC14 embryos were collected per time point per replicate. As such, samples ranged between 4096 and 8192 estimated diploid genomic equivalents (Supplementary file 2).

For samples where two embryos were collected per time point, per replicate, individual embryos were collected and processed through the fragmentation step described above. Samples were combined during the Qiagen Minelute clean up step. This approach enabled samples to be fragmented under identical conditions. During optimization experiments, combination of two embryos in a single 10 μl fragmentation volume resulted in inefficient fragmentation compared with individual embryos, and could stem from increased non-nuclear/genomic contaminating material per unit volume of fragmentation reaction.

We note that the input sample to the fragmentation reaction contains not only genomic DNA, but also a significant proportion of mitochondrial DNA. During library preparation, we made no effort to separate nuclear and mitochondrial fractions. During early development (0–2.5 hr), mitochondrial DNA comprises between 50 and 99% of the total DNA content in the egg. Replication of mitochondrial DNA is not coupled to that of nuclear DNA, and whereas nuclear DNA amplifies exponentially in early development, mitochondrial DNA does not replicate until much later (>6 hr) (Rubenstein et al., 1977). Therefore, since mitochondrial DNA content remains effectively constant during the period of sample preparation and is abundantly recovered, this conveniently ‘buffers’ the library preparation despite the large changes in genomic, nuclear DNA content.

Attention was given to sample randomization, where different timepoints were collected on different days whenever possible. Likewise, distribution of libraries within pools was designed to minimize inclusion of identical timepoints and identical library prep dates within the same pool. Sample collection, library preparation, and analysis was not blinded.

When sequenced to extreme depth, ATAC-seq can reveal ‘footprints’ of transcription factor binding sites (Buenrostro et al., 2013). Although we did not formally attempt to measure whether transcription factor footprints are resolved in this data set, it is likely that these sequencing libraries are not of sufficient depth to reliably detect this phenomenon. On average, our libraries were sequenced to a depth (~107 reads) that was sufficient for the purposes of calling peaks and distinguishing nucleosome positions. However, this depth is an order of magnitude less than the recommended depth for resolving footprints (~108 reads) (Buenrostro et al., 2015).

Pooled, barcoded ATAC-seq libraries were subjected to 2×67 bp paired end sequencing on an Illumina HiSeq 2500 at the Lewis-Sigler Institute for Integrative Genomics Sequencing Core Facility, Princeton, NJ.

Data pre-processing

Barcode-split sequencing files were first subjected to adapter trimming, using TrimGalore version 0.4.0 (www.bioinformatics.babraham.ac.uk) with the optional parameters --trim1 and --paired.

Trimmed reads were mapped to the Drosophila melanogaster genome (dm6) using BWA (Li and Durbin, 2009) (http://bio-bwa.sourceforge.net). First, individual read ends were mapped using BWA aln with default parameters. Paired-end mapped reads were then compiled using BWA sampe with option –a 5000.

Mapped reads were further processed using samtools (Li et al., 2009) (http://www.htslib.org). BWA output was imported using samtools import with default parameters. Imported reads were sorted (samtools sort), and indexed (samtools index). Reads likely originating from PCR or optical duplicates were marked using Picard MarkDuplicates (https://broadinstitute.github.io/picard/) using default parameters.

Reads were filtered using samtools view. Reads were required to have a map quality score greater than or equal to 30 (-q 30), to be mapped, to be proper pairs, and not to be secondary mappings. Duplicates were also removed.

Paired-end reads were subsequently imported into R as a GenomicRanges (Lawrence et al., 2013) (https://bioconductor.org) object. Only reads mapping to chrX, chr2L, chr2R, chr3L, chr3R, and chr4 were used for downstream analysis, although chrY was also imported to sex single-embryo samples. Fragment ends were adjusted to reflect the original Tn5 interaction site, as described previously (Buenrostro et al., 2013): Watson strand start sites had four base pairs subtracted, and Crick-strand start sites had five base pairs subtracted. In practice, this is executed on a GRanges object where paired end reads have been collapsed into single intervals by calling: start(granges.object) - 4; end(granges.object) + 5.

Biological replicates for individual time points were merged for subsequent analysis.

To distinguish reads originating from open/accessible chromatin reads, samples were fit to a mixed exponential/Gaussian distribution as described previously(Buenrostro et al., 2013). Based on these fits, a ≤98 bp cutoff was selected to identify reads originating from open chromatin regions. Merged timepoint datasets were filtered for reads of width ≤98 bp and designated as ‘open’ reads.

Such reads were either kept in GenomicRanges format for further analysis, or were exported as bed files for peak-calling by Zinba (Rashid et al., 2011).

Read length distribution and estimation of stage-specific library preparation bias

To examine the effect of cell cycle stage on the recovery of sequenced fragments, we plotted histograms of the distribution of recovered fragment sizes (Figure 1—figure supplement 3). As discussed in the introduction, the estimate of accessibility and nucleosome positioning by ATAC seq is sensitive to chromatin topology, where these features can be measured reliably in generally ‘open’ chromatin, but that the expected reduced recovery of small accessible fragments in generally ‘closed’ chromatin could limit the resolution of the data. Given that mitotic chromatin, in particular, is superficially expected to have a compact and largely inaccessible structure, we paid close attention to the relative recovery of small and large sequencing fragments for these stages. We observe that indeed, a greater proportion of large, nucleosome-protected fragments are recovered in both metaphase and early interphase (+3 min) samples (Figure 1—figure supplement 3). The reason for this distribution could be biological in nature: a greater proportion of the genome may be stably associated with nucleosomes during metaphase and early interphase compared with later times. However, we were careful to consider that this distribution could instead reflect an artifact. In principle, during periods of intense chromatin compaction, a significantly smaller proportion of genomic DNA could perhaps be fragmented by Tn5, but since libraries are PCR-amplified to similar final concentrations, then relatively fewer unique ‘open’ regions would be recovered than expected. We predicted that if such an artifact were present, that we would have less fragmented DNA in samples originating from metaphase stages. However, as discussed above, the abundance of mitochondrial DNA effectively buffers samples from wide variation in input DNA content. In this case, we would nonetheless expect to observe fewer overall unique recovered sequence reads that map to genomic DNA, if a significant artifact was present in the metaphase samples. To address this possibility, we plotted the inferred quantity of initial fragmented DNA from our individual library preps, and the number of unique sequence reads recovered for each library (Figure 1—figure supplement 3). Although there is a trend for metaphase samples on average to have less starting fragmented DNA content and fewer uniquely mapping reads, the magnitude of the difference is small, and indistinguishable from a random permutation of samples. We were therefore confident that open regions recovered from metaphase samples do in fact reflect regions of chromatin accessibility and are minimally influenced from major factors pertaining to sample preparation.

Peak calling

Sample sizes (n ≥ 3 biological replicates per timepoint) were sufficient to yield on average 15 million unique sequencing reads per pooled timepoint that map to the canonical chromosome assemblies (dm6) (see Supplementary file 1). We found that peak-calling was sensitive to sequencing read depth (data not shown), where reproducibility was significantly affected when fewer than 17.5 million reads were used. Therefore, we pooled samples by cell cycle, or by genotype (all timepoints) in order to call peaks on open chromatin reads.

Peaks were called using Zinba (Rashid et al., 2011) version 2.03.1 (https://code.google.com/archive/p/zinba/issues/69). To call peaks for any individual cell cycle, merged open datasets for all timepoints within an individual cell cycle were combined. Likewise, all open chromatin datasets were merged together for calling peaks on the entire dataset for a particular genotype.

A custom mappability file for Drosophila melanogaster genome assembly dm6 with 67 bp reads was generated using the Gerstein Mappability Map software (Rozowsky et al., 2009) (http://archive.gersteinlab.org/proj/PeakSeq/Mappability_Map/Code/). For paired-end reads, extension values for generateAlignability(), basealigncount(), and run.zinba() were set as the average fragment size in the set of reads originating from input open chromatin datasets, 65 ± 2 bp.

Peaks were called using run.zinba() with the following optional parameters: input=’none’, winSize=300, offset=50, extension=(see above), selectmodel=FALSE, formula=exp_count~exp_cnvwin_log+align_perc, formulaE=exp_count~exp_cnvwin_log+align_perc, formulaZ=exp_count~align_perc, FDR=TRUE, threshold=0.05, winGap=0, cnvWinSize=7.5E + 4, refinepeaks=TRUE.

Parameters were selected to optimize resolution of closely positioned regions of open chromatin, e.g. the even-skipped locus on chr2R. With these parameters, individual closely-packed known enhancers, insulators, and coding elements are effectively distinguished.

Following peak calling, only regions with posterior probability scores ≥0.8 were retained.

To estimate sample reproducibility within biological replicates from identical timepoints (Figure 1—figure supplement 1, Supplementary file 1), CPM normalized counts of open chromatin fragments within peaks were summed and correlation coefficients and standard deviations between replicates were calculated.

Open chromatin coverage calculation and normalization

Example R code for calculating coverage and performing normalization are provided as a supplementary file.

To calculate coverage of open chromatin reads, the average number of ‘open’ reads within 10 bp windows tiling the chromosomes of interest was calculated. Coverage values were normalized to the counts per million of the original, unsplit (‘open’ and >98 bp read) dataset. The rationale for normalizing to the full dataset rather than to just open reads was that this would likely minimize artificial inflation of open read coverage in cases where open reads comprised a smaller fraction of the overall dataset.

Finally, all open chromatin coverage measurements were normalized by standardization to the mean and standard deviation of coverage over a set of 25,000 randomly selected background regions. To select background regions, the set of peak open regions were widened to 20,000 bp, reduced, and subtracted from the genome assembly. Thereafter, 25,000 random positions were selected and widened to reflect the distribution of widths in the set of open peaks. Coverage within these background regions was then calculated, and regions with zero coverage were discarded (~5%). The distribution of counts within background regions approximated a log-normal distribution. Mean and standard deviation of these background regions was calculated and used to transform the coverage measurements for the entire genome.

This normalization strategy relies on the assumption that randomly selected background regions would have a similar likelihood of background fragmentation during sample preparation, regardless of cell cycle stage or developmental stage. In practice, this transformation has the beneficial effect of setting a consistent floor for the data across all timepoints. During this analysis, all major conclusions derived from the open chromatin datasets were confirmed with both simple CPM normalization, and this background standardization and were determined to be essentially equivalent.

Assigning open peaks to genomic features

Peaks were assigned to either promoter, insulator, or enhancer categories by identifying peaks that overlap known genomic annotations (Harrison et al., 2011; Celniker et al., 2009; Kvon et al., 2014). A peak was assigned to the ‘promoter’ category if any or all of the peak overlapped with the 50 bp region flanking the transcriptional start site. Peaks were assigned to the ‘insulator’ category if a peak was not assigned to the ‘promoter’ class, and if any or all of a peak overlapped with peaks bound by two or more of the following insulator proteins: CTCF, GAF, BEAF32, CP190, Mod(Mdg4), and Su(Hw). Finally, peaks were assigned to the ‘enhancer’ category if a peak was not assigned to the ‘promoter’ or ‘insulator’ class, and if any or all of a peak overlapped with peaks associated with either a published set of experimentally validated enhancers (Kvon et al., 2014) or two or more of the following chromatin marks/ transcription factors: CBP, Histone H3K4me1, Histone H3K27ac, or Zelda. The rationale for selecting this panel of factors to categorize putative enhancers was that the set of functionally validated enhancers is not complete and represents less than 20% of the genome. The CBP/K4me1/K27ac set of marks is typically used to identify active enhancers in other genomics studies (Heintzman et al., 2009; Creyghton et al., 2010; Zentner et al., 2011; Rada-Iglesias et al., 2012; Calo and Wysocka, 2013). Finally, Zelda binding outside of promoter regions is often associated with highly occupied target regions that often attract additional transcription factors and these regions are typically over-represented for developmentally active enhancer regions [(Kvon et al., 2012) and references therein)]. Peaks that did not fall into one of these three categories were categorized as ‘other’.

Datasets from modEncode were downloaded from (http://data.modencode.org) as peak lists. Genomic coordinates were converted to the dm6 assembly and combined: CTCF 0–12 hr ChIP-chip (769 and 770); GAF 0–12 hr ChIP-chip (23); BEAF32 0–12 hr ChIP-chip (21); CP190 0–12 hr ChIP-chip (22); Mod(Mdg4) 0–12 hr ChIP-chip (24); Su(Hw) 0–12 hr ChIP-chip (27 and 901); Nej/CBP 0–4 hr ChIP-chip (875 and 900); H3K4me1 0–4 hr ChIP-chip/seq (423 and 777); H3K27ac 0–4 hr ChIP-chip/seq (424, 970 and 834).

Haploid embryos

Embryos produced from mothers mutant for sesame/Hira (ssm185b) are fertilized normally but are unable to de-condense the male pronucleus during the first zygotic cell cycle (Loppin et al., 2000). These embryos subsequently develop as haploids and therefore have altered N:C ratios at equivalent elapsed times during early development. sesame mutant embryos are morphologically normal during the early stages of development through gastrulation, and a significant proportion of embryos progress to form cuticle. We chose to generate haploid embryos using sesame, because the two other mutants that yield haploid embryos (maternal haploid, and ms(3)K81) develop with additional, uncharacterized defects that affect overall cell cycle duration, and high rates of DNA damage and abnormal nucleokinesis ([Edgar et al., 1986], and data not shown). We expected that these additional phenotypes would confound the temporal comparison between diploid and haploid embryos. sesame mutants, in contrast, develop with nearly indistinguishable cell cycle rates and overall fidelity of nuclear division compared with wild-type embryos (Figure 2A, and data not shown). The product of the sesame locus, Hira, is a histone chaperone responsible for the DNA replication independent deposition of the histone variant H3.3 (Ray-Gallet et al., 2002). H3.3 is typically deposited in regions of open, transcriptionally active chromatin. Notably, the primary essential function of Hira in early embryos is to facilitate male pronuclear decondensation: H3.3 deposition is grossly unaffected in sesame mutant embryos, indicating a Hira-independent mechanism for H3.3 deposition in early embryos (Bonnefoy et al., 2007). On the basis of these observations, we expected that differences observed between wild-type and sesame mutant embryos would primarily reflect alterations in embryonic DNA content, and would be minimally confounded by H3.3-dependent effects on chromatin accessibility. At present, however, we are unable to rule out any H3.3-independent effects of Hira on the acquisition of chromatin accessibility in our experimental results.

Classification of N:C-ratio-dependent and –independent peaks

Peaks were called on ssm datasets as described for wild-type. For each cell cycle, we calculated the co-occurrence of peaks in either ssm or wild type. To assign a peak to either the N:C-ratio-dependent or –independent class, wild-type peaks that gain accessibility in either NC12 or NC13 (late or dynamic peaks) only were scored, and a peak was required also to be ‘present’ in the set of ssm peaks. We consistently observed that 27% (2654 of 9824) of accessible peaks in wild type are not called open in ssm mutant embryos at similar degrees of sequencing coverage. These ‘missing’ peaks were not assigned to any timing category (see Supplementary file 3), nor were wild type peaks that were designated ‘Open by NC11’.

Nucleosome positioning

Nucleosome positions were calculated using the NucleoATAC software package (Schep et al., 2015) (https://github.com/GreenleafLab/NucleoATAC) version 0.3.1 using default parameters. Zinba peak regions were widened to 2500 bp, centered over the maximum read density value calculated, and used as input regions of interest for NucleoATAC analysis. BAM-formatted reads were used, wherein duplicate-filtered reads mapping to canonical chromosomes from biological replicates were pooled by timepoint and genotype.

NucleoATAC returns the predicted positions and occupancy scores of nucleosomes based on the coverage of nucleosome-protected ATAC fragments delimited by flanking ‘open’ ATAC fragments. Following nucleosome position calling, predicted dyad centers were modeled into nucleosomes by widening to 147 bp. NFRs were calculated by finding the predicted NucleoATAC dyad centers immediately upstream (−1) and downstream ( +1) of the position 25 bp upstream of the TSS. We found that using the −25 position for calculation of the −1 and +1 nucleosomes was necessary because the +1 nucleosome can at certain timepoints occlude the TSS, and in these circumstances using the TSS as the vantage point for NFR identification will mistakenly identify the +1 and+2 nucleosome. For certain highly expressed genes (e.g., slam, bnk, sry-alpha) a +1 nucleosome was not called by NucleoATAC. For approximately 33% of all promoters, either the −1 or +1 nucleosome was not called by NucleoATAC. In these cases, the promoter was omitted from further analysis. Plots in Figure 4A show the distribution of modeled nucleosomes weighted by the calculated occupancy score reported by NucleoATAC. Plots in Figure 4B and C show the relative changes and deviations in NFR size between different genotypes and subgroupings, the NFR measurements were mean-normalized such that the average NFR size per group has a 0 bp size. Non-mean normalized NFR sizes (i.e., raw average base-pair distances) for wild-type are presented in Figure 4—figure supplement 1.

EGFP-PCNA imaging and analysis

EGFP-PCNA was produced by via a transgene consisting of a genomic fragment sufficient to complement PCNA loss of function (dm6 chr2R:20261602–20263418) into which an N-terminal Drosophila codon optimized EGFP cassette and linker sequence was engineered (sequence available upon request). Data were collected by time-lapse laser scanning confocal microscopy of embryos expressing Histone H2Av-RFP and EGFP-PCNA at a 15 s frame rate with a 63×1.4 NA oil immersion objective. Z-stacks were collected to cover the entire span of cortically localized nuclei between NC10 and NC14 at 0.5 μm intervals. Images were maximum-projected for analysis with Matlab (2014A). PCNA tracks along with DNA Polymerase during replication and the GFP-fusions appear as bright, heterogeneously-distributed foci within S-phase nuclei (Figure 3—figure supplement 1) (McCleland et al., 2009). To estimate the overall timing of S-phase between NC11 and NC13, we quantified the degree of PCNA heterogeneity over time. Nuclei were segmented on the Histone RFP channel, and heterogeneity in EGFP-PCNA signal was measured using a grayscale correlation matrix (Figure 3—figure supplement 1). EGFP-PCNA signal was down-sampled to six grayscale levels and correlation matrices were calculated for radial sets of eight pixels separated by 200 nm distance, using the Matlab function graycomatrix.m. For each frame, heterogeneity was calculated as 1 – ‘Homogeneity’, as reported by graycomatrix.m. This yields a value between 0 and 1 where 0 indicates that all pixels neighbor pixels of similar intensities, and one indicates that all pixels neighbor pixels of different intensities. The by-frame heterogeneity measurements were next scaled to the mean EGFP-PCNA fluorescence intensity to yield an estimate of the fraction of PCNA associated with heterogeneously distributed fluorescence intensity. We note that similar results have been reported using a different quantification approach that measures the overall coefficient of variation in nuclear PCNA fluorescence intensity rather than grayscale correlation (Deneke et al., 2016).

MS2::MCP-GFP reporter imaging and analysis

Embryos from a cross between w; Histone H2Av-RFP/+; MCP-GFP(4F)/+ females and y w; HbP2-promoter>MS2(24)-LacZ males (Garcia et al., 2013) were imaged by laser scanning confocal microscopy at a 30 s frame rate with a 63×1.4 NA oil-immersion objective from NC10 through NC14. Z-stacks were collected at 0.5 μm intervals for a 12 μm cortical region of the embryo. Image processing and analysis was performed as described previously (Garcia et al., 2013).

Cell cycle timing

Cell cycle timing for w ssm185b; His2Av-GFP was performed as described previously. Wild-type cell cycle times reflect data reported previously (Blythe and Wieschaus, 2015b).

Data availability

Raw sequence files, coverage files, nucleosome positions, and an annotated peak list have been deposited in Gene Expression Omnibus GSE83851.

References

  1. 1
  2. 2
    Constraints on transcriptional activator function contribute to transcriptional quiescence during early Xenopus embryogenesis
    1. G Almouzni
    2. AP Wolffe
    (1995)
    The EMBO Journal 14:1752–1765.
  3. 3
  4. 4
  5. 5
  6. 6
  7. 7
  8. 8
  9. 9
  10. 10
  11. 11
  12. 12
  13. 13
  14. 14
  15. 15
  16. 16
  17. 17
  18. 18
  19. 19
  20. 20
  21. 21
  22. 22
  23. 23
  24. 24
  25. 25
  26. 26
  27. 27
  28. 28
  29. 29
  30. 30
  31. 31
  32. 32
  33. 33
  34. 34
  35. 35
  36. 36
  37. 37
  38. 38
  39. 39
  40. 40
  41. 41
  42. 42
  43. 43
  44. 44
  45. 45
  46. 46
  47. 47
  48. 48
  49. 49
  50. 50
  51. 51
  52. 52
  53. 53
  54. 54
    Regulation of maternal transcript destabilization during egg activation in Drosophila
    1. W Tadros
    2. SA Houston
    3. A Bashirullah
    4. RL Cooperstock
    5. JL Semotok
    6. BH Reed
    7. HD Lipshitz
    (2003)
    Genetics 164:989–1001.
  55. 55
  56. 56
  57. 57
  58. 58

Decision letter

  1. Allan C Spradling
    Reviewing Editor; Howard Hughes Medical Institute, Carnegie Institution for Science, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Establishment and maintenance of heritable chromatin structure during early Drosophila embryogenesis" for consideration by eLife. Your article has been favorably evaluated by Kevin Struhl (Senior Editor) and three reviewers, one of whom is a member of our Board of Reviewing Editors. Two of the three reviewers have agreed to reveal their identity: Michael Eisen (Reviewer #2) and Ken Zaret (Reviewer #3).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

This is a very nice study, that was thoughtfully and carefully executed. The authors used ATAC-seq to probe the chromatin structure of individual pre-blastoderm Drosophila embryos during the final three cleavage divisions leading up to the mid-blastula transition (MBT), a fundamental transition for developing embryos that is of wide general interest. Major chromatin changes leading up to MBT are expected from previous work in Drosophila documenting cytological heterochromatin formation during this period. Employing staged individual embryos allows an exceptionally high and unprecedented level of temporal and cell cycle resolution. Consequently, the study is able to address widely interesting and important issues, including how chromatin accessibility changes during gene activation, and how chromatin is affected by rapid DNA replication and short cell cycle times. While the results are mostly descriptive, they strongly support particular mechanisms and provide a new, higher-level framework for all future studies of this period of embryonic development. Overall, the work represents an important step forward in our understanding of chromatin changes in the early embryo, and it will have a big influence on the field.

Essential revisions:

1) Greater clarity of presentation for a broad audience. The manuscript was well written but in a short and highly compressed format. Consequently, multiple intellectual and technical issues are handled in a user-unfriendly manner, apparently to save space. Although carrying out new experiments is not necessary, the authors should take advantage of eLife's flexible format to make changes that will render the presentation more accessible to a broad audience. Instead of leaving most discussion to the end of the paper, the manuscript would become more accessible if critical questions of interest were raised first, then the rationale of how the experiments and bioinformatic processing can test the issues was presented, all before describing the results and conclusions.

For example, the ATAC-seq method underlies the entire paper and its strengths and limitations should be discussed in an introductory section. Based on their experiments, and previous knowledge, authors should discuss whether ATAC-seq "openness" is mostly a reflection of nucleosome positioning along the primary chromatin fiber. Is there any sensitivity to "higher order" structure? If so, it is not currently apparent and should be documented. Whether the binding of regulatory proteins can be detected should be addressed, as transcription factor binding is not currently apparent given that enhancers become more open in association with gene activation.

It is essential to present real numerical data for the y-axes of Figure 1A that are comparable for all the tracks shown and transparently reveal all data manipulations that were used. Currently, it is not fully clear how the presented data could be reproduced, since there is vague mention of normalization steps, discarding of data and the occasional use of two rather than one embryo (do you divide by 2?).

As an example of user unfriendly data analysis, Figure 3 plots "mean-scaled heterogeneity in PCNA-EGFP". There is no explanation why "mean-scaled heterogeneity" represents a logical way to summarize DNA replication during the cell cycle (or even what it means). The majority of readers will remain mystified. Several of the authors' later arguments require that DNA replication take place in a highly front-loaded manner during S phase. That way, reductions in nucleosome positioning that persist only during the start of S can be ascribed to DNA replication interference. However, a case that PCNA-EGFP heterogeneity provides such information is not made.

One of the most interesting and unique aspects of these data are their implication that rapid replication shortly after M phase represents a significant impediment to establishing/maintaining nucleosome positions and by inference, gene regulation. The authors should raise the issue of interference between replication and nucleosome positioning early in the paper, during analysis of Figure 1 data. Existing ATAC-seq peaks should be plotted and shown to double across the cell cycle and fall after M. Discrepancies (such more than a twofold decrease in early S phase) should be quantitated numerically and in duration to provide a quantitative basis for discussing interference. The paper should tout the analysis of NFR both as a mark of promoter opening, as well as a tool for analyzing replication effects and better explain the data processing, modeling and conversion to "relative NFR." Finally, the authors should explain why replication interference would be manifested as a cell cycle dependent oscillation in relative NFR, rather than simply as an oscillation in NFR signal strength without a size change due to essentially random nucleosome removal.

2) Too few primary data are shown to appropriately document these findings and address closely related questions. Although carrying out new experiments is not necessary, sample data at NC11-13 from other genomic regions containing well known genes activated in early embryos should be shown to allow the behavior of more peak regions to be observed. These should include a large gene such as Ubx which has been claimed to be too large to transcribe fully prior to MBT. Can ATAC-seq detect heterochromatin formation during NC11-13 in unique sequence centric regions? Including any available data on NC9 and NC10 embryo in Figure 1A, even if of lower resolution, would help reveal more about the onset of the 33% of peaks already present at NC11.

3) More biological analysis to enhance the paper's impact for the general community. An important part of the study concerns the grouping of open region peaks into functional categories: promoters, enhancers, insulators, etc. The accuracy and biological meaning of these groupings is not currently presented in a critical or convincing manner. First, enhancers and insulators were identified using ChIP-seq data from much older embryos. What is the relevance of 0-12hr embryo data to NC11-NC13? The definitions used for insulators and enhancers in terms of protein binding and/or chromatin modifications seemed questionable and were not discussed or justified. The authors should prepare a (smaller) dataset of elements that have been functionally validated during NC11-13 and the next two hours of development and determine if they are consistent with conclusions based on their current dataset. The conclusion that (on average) enhancers become open before promoters, is touted in the Abstract, but its meaning and significance are scarcely discussed. Do such conclusions about opening of promoters and enhancers hold up using groups of specific known genes that become active at different times? When do the enhancers and promoters of gap and pair-rule genes become open, for example? How does the observed landscape of chromatin changes relate to what is known about gene action in early embryos?

4) Address issues regarding the maintenance of chromatin states through mitosis. There are three issues here: 1) whether open configurations are maintained through M, 2) whether maintenance is due to an epigenetically based "transcriptional memory", and 3) whether maintenance has an actual function- namely to ensure faithful inheritance. The fact open configurations survive M seems clear, since embryos in M phase display no loss of open features. However, is it possible, due to changes in embryonic chromatin during sample preparation, that apparent maintenance might be an artifact of partial progression of some processes into the next cell cycle? Since changing levels of trans acting factors likely cause these chromatin features in the first place, the re-appearance of features after M phase is to be expected whether or not there is any epigenetic "transcriptional memory." None of the data presented by the authors currently presents a convincing case that nucleosome positioning is inherited "epigenetically" and beyond that there were no functional tests of the importance of such inheritance. What in this study goes beyond the work of Hsuing and Blobel published this last year? Ramachandran and Henikoff 2016 should be cited and be clear about their own novelty here.

5) Address issues regarding the analysis of elements in haploid vs diploid embryos. Haploid embryos are mutant, and may not be entirely normal. The authors describe how many elements responded to N/C ratio (defined by the haploid/diploid test) or not, but the biological importance of this distinction is not well justified. The authors describe how many peaks overlap with binding sites for Zelda or for GAF (not necessarily in NC11-13 embryo cells), and find correlations with N/C dependent or independent classes. While this is consistent with a model in which GAGA factor plays a role in N/C ratio sensing, it is far from proof of such a relationship. It would be useful to see if there is genomic or other data out there on the genes targeted by Zelda and GAF factors in CHIP or knockout studies around the MBT and if the expected genes of the authors are affected.

https://doi.org/10.7554/eLife.20148.051

Author response

[…]

Essential revisions:

1) Greater clarity of presentation for a broad audience. The manuscript was well written but in a short and highly compressed format. Consequently, multiple intellectual and technical issues are handled in a user-unfriendly manner, apparently to save space. Although carrying out new experiments is not necessary, the authors should take advantage of eLife's flexible format to make changes that will render the presentation more accessible to a broad audience. Instead of leaving most discussion to the end of the paper, the manuscript would become more accessible if critical questions of interest were raised first, then the rationale of how the experiments and bioinformatic processing can test the issues was presented, all before describing the results and conclusions.

We have revised the manuscript to clarify the rationale, approach, and experimental results with a more general reader in mind. We have added additional introductory information to the manuscript that presents the critical questions we are addressing. See also: further responses below.

For example, the ATAC-seq method underlies the entire paper and its strengths and limitations should be discussed in an introductory section. Based on their experiments, and previous knowledge, authors should discuss whether ATAC-seq "openness" is mostly a reflection of nucleosome positioning along the primary chromatin fiber. Is there any sensitivity to "higher order" structure? If so, it is not currently apparent and should be documented. Whether the binding of regulatory proteins can be detected should be addressed, as transcription factor binding is not currently apparent given that enhancers become more open in association with gene activation.

We have added a passage to the Introduction describing ATAC seq, its limitations, and the biological interpretation of ‘openness’ as measured by the assay (Introduction paragraph 4). Because ATAC-seq is a relatively new technology, some of the likely technical limitations have not been fully resolved in the literature. We have therefore approached the possibility of technical limitations empirically. The major “higher order structure” that could be problematic in our study is the condensed state of mitotic chromatin. We have added a section to the Materials and methods (“Read Length Distribution and Estimation of Stage-Specific Library Preparation Bias”) that outlines in detail how we approached the issue of comparing interphase and mitotic chromatin samples. This includes an additional supplemental figure (Figure 1—figure supplement 3) with raw read length distribution data and estimates of initial fragmented DNA concentration and actual unique sequence read distributions from interphase and metaphase staged samples (raw data available in Supplementary file 2). We do observe the expected change in relative proportions of nucleosome associated and open chromatin fragments depending on cell cycle stage. However, we find no inherent overall bias in the library preparation from metaphase-staged embryos, and therefore conclude that the data accurately reflect the accessibility status of metaphase chromatin.

With respect to regulatory protein binding, this phenomenon of ‘footprinting’ originally observed in Buenrostro 2013 can be seen with really high sequencing depth, beyond – we assume – what we have done here. We have added a passage to the Materials and methods (under the “ATAC-seq Sample Preparation” section) that mentions this. Practically, we had the choice to sequence many timepoints to good coverage, or to sequence fewer samples to very deep coverage. We chose to resolve time in this case. As we mention in the added text, we did not make any effort to determine whether, at the current sequencing depth, we nonetheless observe this TF footprinting.

It is essential to present real numerical data for the y-axes of Figure 1A that are comparable for all the tracks shown and transparently reveal all data manipulations that were used. Currently, it is not fully clear how the presented data could be reproduced, since there is vague mention of normalization steps, discarding of data and the occasional use of two rather than one embryo (do you divide by 2?).

The y-axes for the coverage plots in Figure 1A are now shown in the revised version.

We have now included an R markdown document that has annotated code for performing normalization of the data. The normalization approach described in the new code supplement effectively follows step by step the initial description of data normalization we included in the Materials and methods. In principle, a reader could prepare input datasets (either their own replicates of this data analysis, or their own homemade ATAC-seq data) and change a few lines of code in the markdown, and perform the same normalization as is used in the paper.

The “discarding” of data mentioned above refers not to discarding of sequenced libraries (i.e., cherry-picking), but rather discarding of libraries as a result of quality control assessments made during preparation (see end of fourth paragraph of Materials and methods, subheading “ATAC-seq Sample Preparation”). To clarify, all that is meant by ‘discarding’ here is that, following preparation of all libraries for a particular genotype, we estimated the starting quantity of fragmented DNA from a single sample by dividing the final library concentration by 2^(number of PCR cycles). The distribution of these values were plotted and outliers (3x the interquartile range) were discarded. The rationale for this is that, when preparing libraries from a single embryo, we were concerned that either some of the nuclear pellet would be lost during the initial sample preparation, or that the sample would be contaminated by exogenous DNA from the environment (e.g., contamination of embryo preps with brewer’s yeast from the embryo collection plates). If any of these were the case, then we would expect to resolve these as extreme outliers less than or greater than the population mean. In practice, these outliers were rare and constituted <5% of all samples prepared for this study. Those that we did find represented samples where we suspected we lost some or all of the nuclear pellet prior to Tn5 fragmentation.

We did not divide by two when we used two embryos. In general, we found that the quantity of DNA in the libraries did not vary significantly with respect to stage or DNA content (the expected 4-fold difference per embryo in genomic DNA content across all stages tested). This can be attributed to the large quantity of mitochondrial DNA that is also present in the library preparations. We have added a new paragraph to the Materials and methods, under the “ATAC-seq Sample Preparation” subheading, that describes the ‘buffering’ role of mitochondrial DNA in the library preparation steps.

As an example of user unfriendly data analysis, Figure 3 plots "mean-scaled heterogeneity in PCNA-EGFP". There is no explanation why "mean-scaled heterogeneity" represents a logical way to summarize DNA replication during the cell cycle (or even what it means). The majority of readers will remain mystified. Several of the authors' later arguments require that DNA replication take place in a highly front-loaded manner during S phase. That way, reductions in nucleosome positioning that persist only during the start of S can be ascribed to DNA replication interference. However, a case that PCNA-EGFP heterogeneity provides such information is not made.

We agree that this was confusing in the manuscript we initially submitted, and we have made the following changes to clarify this analysis.

We have added a supplemental figure attached to Figure 3 that illustrates directly what we are measuring, and have added additional clarification of this rationale to the Materials and methods. PCNA forms bright foci in syncytial blastoderm embryos at sites of DNA replication. The supplemental figure demonstrates representative frames from one of the movies that we used in this analysis, and these frames are matched to individual points along a plot of “heterogeneity” (1-Homogeneity) to help the reader interpret the plotted data. McCleland et al., JCB 2009 showed nicely that these patterns of heterogeneous fluorescence intensity are directly linked to DNA replication activity, insofar as blocking DNA origin licensing effectively eliminates the formation of these foci. We also note in the Materials and methods that a recent report has also approached this problem to the same end by quantifying the coefficient of variation in PCNA fluorescence (Deneke et al., Dev.Cell 2016). Follow-up work from the O’Farrell group (Shermoen, et al. Curr.Biol 2010) has demonstrated that replication during NC11 and NC12 does not largely distinguish between early and late replicating domains that are typically ascribed to eu- and hetero-chromatic compartments, but that during NC13 late replicating domains partly begin to emerge. Since we exclusively consider “euchromatic” portions of the genome in this manuscript, we are confident that our PCNA measurements provide a useful estimate of when DNA replication could be interfering with the underlying organization of chromatin in these regions. The ‘front-loaded’ aspect of replication in early embryos is a long-standing observation, which is consistent with our observations here. By measuring frequencies of observed ‘eye-forms’ of replication forks in EM-spreads of blastoderm DNA, Blumenthal et al. 1974 estimated that the interval of time between initiation and completion of replication is approximately 1.25 minutes, and that the vast majority of interphase genomic chromatin is post-replicative. See Figure 5 and text on pages 209-210 in Blumenthal for their argument. This is consistent with a model where DNA replication during the blastoderm cell cycle is highly front-loaded.

One of the most interesting and unique aspects of these data are their implication that rapid replication shortly after M phase represents a significant impediment to establishing/maintaining nucleosome positions and by inference, gene regulation. The authors should raise the issue of interference between replication and nucleosome positioning early in the paper, during analysis of Figure 1 data. Existing ATAC-seq peaks should be plotted and shown to double across the cell cycle and fall after M. Discrepancies (such more than a twofold decrease in early S phase) should be quantitated numerically and in duration to provide a quantitative basis for discussing interference. The paper should tout the analysis of NFR both as a mark of promoter opening, as well as a tool for analyzing replication effects and better explain the data processing, modeling and conversion to "relative NFR." Finally, the authors should explain why replication interference would be manifested as a cell cycle dependent oscillation in relative NFR, rather than simply as an oscillation in NFR signal strength without a size change due to essentially random nucleosome removal.

We considered rearranging the paper to include a more in-depth presentation of the replication/nucleosome question earlier in the Results section and chose instead to introduce the problem more generally in the Introduction (Paragraph 2 of the Introduction now covers the general issue of both replication and mitosis with respect to maintenance of accessibility states). In the current organization of the manuscript, we do in fact use the Figure 1A data to introduce our specific treatment of the replication/nucleosome question, albeit later in the Results section than the first mention of Figure 1A. The reason for this is that we organized the paper to first deal with ‘establishment’ and then ‘maintenance’ because we felt this provided a more intuitive flow to the logic of the paper. We hope that this is a satisfactory compromise.

We have touted the usefulness of the NFR as an analysis tool in the revised Results section.

We have added additional technical detail for the calculation of the NFR sizes to the Materials and methods under the “Nucleosome Positioning” subheading. We also corrected some mis-numbered references to figures in this section.

The reason that replication associated nucleosome disruption is manifested as an oscillation in NFR size is related to the way our calculation of NFR size reacts to the effect of nucleosome placement following replication. From a single vantage point (the TSS) we are calculating the position of the first upstream and first downstream nucleosome, not the overall signal intensity within the maximum limits of the NFR region. This distinction, we think, addresses the heart of the request above. When nucleosome insertion occurs following replication, there is a chance that the insertion will occur within the NFR, in which case, our calculation indicates a reduction in NFR size. Alternatively, to invoke the Ramachandran and Henikoff model, if sufficient competition exists between regulatory factors and nucleosome assembly mechanisms, then it is less likely that a nucleosome will be inserted within the NFR (or that it will be captured in our measurements). In this case, our calculation indicates less of an effect on NFR size.

To address this in another way, we have now also measured changes in recovery of ATAC-seq fragments over the NFR as requested by the reviewer. To do this, we determined the maximum extent of the NFR by calculating the extreme positions of the -1 and +1 nucleosomes as predicted by our analysis. Then, we calculated the maximum ‘open’ ATAC-seq coverage value within each NFR. Accessibility increases by approximately two fold over each of the cell cycles measured, and there is less than a two-fold decrease in accessibility from mitosis into the subsequent early S-phase. We also measured how long it takes during either NC12 or NC13 for accessibility to recover to levels greater than or equal to accessibility during the preceding metaphase. On average, recovery of accessibility to prior metaphase levels occurs by 9 minutes into NC12, but by 6 minutes into NC13. We have added this observation to the Results text and have plotted the data in Figure 4—figure supplement 2.

We are grateful for the suggestion to pursue this analysis, as we became intrigued by the distribution of recovery times for NFRs. Examination of the distribution of this data indicated that the distribution of recovery times is biphasic in both cell cycles. To visualize this, we performed spline interpolation of the accessibility values for each NFR, and repeated the calculation of recovery times. During each cell cycle, there is an early (3-minute) and late (6-9 minute) population of recovering NFRs. The proportion of early and late NFRs is nearly equal during NC12, whereas there are a substantially greater proportion of early-recovering NFRs during NC13. These observations support our prior conclusion that recovery of chromatin architecture at NFRs becomes more robust during NC13. This new analysis is summarized as well in Figure 4—figure supplement 2.

2) Too few primary data are shown to appropriately document these findings and address closely related questions. Although carrying out new experiments is not necessary, sample data at NC11-13 from other genomic regions containing well known genes activated in early embryos should be shown to allow the behavior of more peak regions to be observed. These should include a large gene such as Ubx which has been claimed to be too large to transcribe fully prior to MBT. Can ATAC-seq detect heterochromatin formation during NC11-13 in unique sequence centric regions? Including any available data on NC9 and NC10 embryo in Figure 1A, even if of lower resolution, would help reveal more about the onset of the 33% of peaks already present at NC11.

We have added a selection of 23 plots (including Ubx) similar to the ones shown in Figure 1A as supplemental figures to Figure 1. These are most but not all of the genes referred to in the revised text. In reality, it would be impractical to show a nice large set of example regions that would cover all the early embryo favorites. The necessary files for making these plots are available through GEO, and we have posted a link on the lab webpage to allow readers to browse through the data through the UCSC genome browser (http://molbiolabs.princeton.edu/wieschaus/data). If there are even more ways to disseminate these data files beyond what we have done here, we would be interested in pursuing those as well. We encourage the referees to examine the link above and comment on the presentation of the data. We note in both the Figure 1 legend and in the Materials and methods the options available to the reader for viewing their own favorite loci.

We are likewise interested in the question of heterochromatin formation in early embryos, but are not satisfied that we have made any significant progress on this question. If we take the centric regions of chromosomes 2, 3, and X, mask them for regions that are uniquely mappable, and calculate the average ATAC-seq ‘open’ coverage over these regions, we find that even at NC11, mean heterochromatic accessibility is lower than that of an equal number of randomly sampled euchromatic regions. In line with the prediction that cytological heterochromatin emerges at or around the MBT, we observe that NC12 and NC13 have progressively less open heterochromatin. These differences between eu- and hetero-chromatin are statistically significant by permutation test at p<0.001 for all timepoints in the analysis, and we have not tested whether the progressive reductions we observe from NC11 to 13 are themselves statistically significant. In this latter case, we are unsure of the significance of the magnitude of the observed effect, and would overall prefer to link these measurements to functional assays that are currently in development. This initial analysis is promising, but we are not ready to include this as a major observation associated with the current study.

We are also interested in evaluating accessibility in earlier embryos, but have not yet performed such experiments. We agree that this would help resolve the NC11 peaks into more distinct classes. One reason that we did not push earlier in this analysis is that NC11 is the earliest developmental stage where we can confidently ascribe a timepoint within a nuclear cycle. Mitosis 9 occurs below the cortical layer of the embryo and therefore the beginning of NC10 is therefore difficult to observe. As a result, such earlier timepoints will indeed represent data of lower time resolution. This, coupled with the fact that we’d likely have to pool several earlier embryos of various stages, led us to the conclusion that such experiments would not necessarily fit with the current dataset from a sampling standpoint.

3) More biological analysis to enhance the paper's impact for the general community. An important part of the study concerns the grouping of open region peaks into functional categories: promoters, enhancers, insulators, etc. The accuracy and biological meaning of these groupings is not currently presented in a critical or convincing manner. First, enhancers and insulators were identified using ChIP-seq data from much older embryos. What is the relevance of 0-12hr embryo data to NC11-NC13? The definitions used for insulators and enhancers in terms of protein binding and/or chromatin modifications seemed questionable and were not discussed or justified. The authors should prepare a (smaller) dataset of elements that have been functionally validated during NC11-13 and the next two hours of development and determine if they are consistent with conclusions based on their current dataset. The conclusion that (on average) enhancers become open before promoters, is touted in the Abstract, but its meaning and significance are scarcely discussed. Do such conclusions about opening of promoters and enhancers hold up using groups of specific known genes that become active at different times? When do the enhancers and promoters of gap and pair-rule genes become open, for example? How does the observed landscape of chromatin changes relate to what is known about gene action in early embryos?

We have added additional text to the Materials and methods to justify the categorization of the enhancer class, which derives largely from previous observations of chromatin modifications and protein occupancy from a broad collection of genomics studies.

We would like to point out that for enhancers, we used a combination of functionally validated enhancers from the Stark Lab (Kvon et al., Nature 2014), binding of Zelda (Eisen Lab, 1-3h embryos), binding of CBP (modEncode 0-4h embryos), Histone H3K4 monomethylation (modEncode, 0-4h), and Histone H3K27 acetylation (modEncode, 0-4h). All of these data sets are as closely matched temporally as possible to the stages we are examining in the paper.

To validate the core conclusion we derive from the categorization of enhancers, we added the suggested analysis on the Kvon/Stark set of functionally validated enhancers to the Results and Discussion section and Figure 1. In brief, this subset of enhancers demonstrates the preferential early accessibility we observe with the broader set of ‘enhancers’. We note here that this collection does not include several well-characterized early enhancers (e.g., hunchback P2) and is therefore likely to be under-respresented for features in the early time class.

In the process of examining the validated enhancers, we also measured when these enhancers are active, based on the annotations supplied by Kvon/Stark. First, as expected, compared with the total set of Kvon/Stark enhancers our NC11-NC13 accessible regions are enriched for enhancers with the earliest scored expression (stage 4-6, where NC11-13 is during stage 4 and stage 6 is approximately 1.5 hours later, during gastrulation). Second, the enhancers that are accessible between NC11 and NC13 continue to be expressed throughout development (through stage 16, approximately ≥12 hours later). Third, enhancers that are open early (NC11) are more likely to be expressed early, whereas enhancers that open between NC12 and 13 are significantly enriched for enhancers whose first expression comes much later in development, suggesting that these enhancers are ‘primed’ during NC12 and 13 for expression at a later stage. Taken together, these observations support the conclusions we draw from the broader categorization of ‘enhancers’, and further suggest routes for future investigation regarding the phenomenon of enhancer priming.

With respect to our use of 0-12 hour ChIP-chip data for assigning insulators: we acknowledge that this data set is not ideal, but it is the most comprehensive available dataset that measures multiple insulator binding proteins, and we feel that even with this caveat, our analysis responsibly uses this data for categorization of this genomic feature. Our options for this analysis were 1) to use the 0-12 hour data; 2) generate all of this data ourselves for our stages of interest; 3) use a 2-4 hour dataset that lacks one of the major insulator proteins mod(mdg4); 4) not score insulators at all. Formally, we designate a region in our data that has measured accessible chromatin between NC11 and NC13 as an insulator if it ‘can’ bind two or more of the insulator proteins between 0 and 12 hours in the modEncode dataset. Given that these insulator proteins are expressed between NC11 and 13, it seems reasonable to assume that they would bind these regions once they were made accessible. Furthermore, our analysis with the Kvon/Stark enhancer set described above does suggest that events in NC11-NC13 do have lasting effects until much later in development. A more careful and direct analysis will therefore be necessary to truly answer what similarities there are between a 12 hour embryo and a syncytial blastoderm, but from the point of view of assigning genomic features, we think that this approach is the best available given the current state of the field.

Finally, we agree that it is important to link these observations back to real knowledge about gene activity and function in early embryogenesis. Similar to our response to the major comment above regarding providing more examples of primary data for more genes of interest, it would be difficult to cover all the favorites here. We have therefore added more general discussion regarding groups of certain favorite/developmentally rich genes to the Results and Discussion, and have provided a new supplemental table that summarizes the set of peaks corresponding to promoters, identifies them by associated flybase gene identifiers and names, and provides the scored “open by” category, and the NC-ratio independent or dependent classification, when appropriate. With this information, the general behavior of any reader’s favorite gene can be easily looked up, and coupled with the provided data visualization either on our website or by downloading the processed data, further investigation can be made into the finer details. For the majority of the genes we have discussed in this new section, we have included browser views in the Supplement to Figure 1, as described above.

4) Address issues regarding the maintenance of chromatin states through mitosis. There are three issues here: 1) whether open configurations are maintained through M, 2) whether maintenance is due to an epigenetically based "transcriptional memory", and 3) whether maintenance has an actual function- namely to ensure faithful inheritance. The fact open configurations survive M seems clear, since embryos in M phase display no loss of open features. However, is it possible, due to changes in embryonic chromatin during sample preparation, that apparent maintenance might be an artifact of partial progression of some processes into the next cell cycle? Since changing levels of trans acting factors likely cause these chromatin features in the first place, the re-appearance of features after M phase is to be expected whether or not there is any epigenetic "transcriptional memory." None of the data presented by the authors currently presents a convincing case that nucleosome positioning is inherited "epigenetically" and beyond that there were no functional tests of the importance of such inheritance. What in this study goes beyond the work of Hsuing and Blobel published this last year? Ramachandran and Henikoff 2016 should be cited and be clear about their own novelty here.

From a logical standpoint, we cannot unequivocally exclude the possibility that sample preparation somehow introduces an artifact that allows for the ‘partial progression of some processes into the next cell cycle’. However, we feel that this is unlikely. Another way to state the problem is whether our observation of maintained accessibility during metaphase is indeed real. We initially approached this observation with healthy skepticism along the lines of the abovementioned artifact. However, we were encouraged when we found Hsiung and Blobel’s recent Genome Research paper, which reports a similar maintenance of DNase hypersensitive sites in a murine erythroblast cell line during mitosis. Given the importance and relevance of Hsiung and Ramachandran’s previous studies, we have cited them more prominently in the manuscript, and have more clearly discussed how we advance their important prior observations.

Hsiung’s observations are made in a completely unrelated experimental model. In addition, Hsiung uses a completely different assay system, DNase hypersensitivity, to measure accessibility, and their approach to collecting mitotically staged samples is done pharmacologically, i.e., nocodazole arrest. Finally, Hsiung fixes the samples with formaldehyde, which more than likely prevents further biological activity. Given the similarity between our observations despite the significant differences in experimental approach, we take our results as demonstrating preserved local accessibility patterns in an otherwise condensed mitotic chromatin state, especially given that any artifact would have to apply to both our and Blobel’s divergent experimental systems.

Moreover, the reviewers’ question implies that “epigenetic transcriptional memory” is distinct from the effects of “trans-acting factors”, that transcription factors do one thing, and there exists a separate mechanism for memory. This is not necessarily the case. We agree that these are interesting questions, but our failure to identify changes in accessibility during mitosis argues that physical structures are maintained and thus preclude our addressing more subtle epigenetic mechanisms that might maintain chromatin memory over longer periods. We do not know the mechanism responsible for the stability of nucleosome positioning/ chromatin accessibility during mitosis. It may be the most thermodynamically favorable way for the cell to deal with the task of chromatin condensation, or it may result from a previously unappreciated ability for trans-acting factors to remain associated with their DNA targets during mitosis. It may also reflect a separate mechanism for epigenetic memory. We also have not determined whether the loss and recovery of nucleosome stability following DNA replication is deleterious to the precision with which transcriptional reactivation takes place. All these are big questions we hope to address in future investigations.

5) Address issues regarding the analysis of elements in haploid vs diploid embryos. Haploid embryos are mutant, and may not be entirely normal. The authors describe how many elements responded to N/C ratio (defined by the haploid/diploid test) or not, but the biological importance of this distinction is not well justified. The authors describe how many peaks overlap with binding sites for Zelda or for GAF (not necessarily in NC11-13 embryo cells), and find correlations with N/C dependent or independent classes. While this is consistent with a model in which GAGA factor plays a role in N/C ratio sensing, it is far from proof of such a relationship. It would be useful to see if there is genomic or other data out there on the genes targeted by Zelda and GAF factors in CHIP or knockout studies around the MBT and if the expected genes of the authors are affected.

We have clarified the experimental rationale for examining haploids in the Results and Discussion section.

We have added a detailed description of sesame and have clearly acknowledged any potentially confounding features of the of the mutant phenotype in the Materials and methods (new section entitled “Haploid Embryos”).

We agree and acknowledge that the correlations between Zelda/GAF and regions of open chromatin are simply that, correlations, and not proof of N:C ratio sensing. While detailed information regarding the effects of Zelda loss of function on chromatin accessibility have been reported previously (Schulz 2015, Sun 2015), loss of function for GAF in embryos is difficult to achieve, and – at best – represents rare pleiotropic escapers from an earlier ovarian sterile phenotype (Bhat et al., 1996, Greenberg et al., 2001). Correspondingly, all of our attempts to generate GAF-null embryos have failed and we are currently pursuing several alternative approaches to definitively address the mechanistic relationship between GAGA and the MBT timers.

To further substantiate the correlation between Zelda and N:C-ratio independent chromatin regions, we obtained a previously published Faire-seq dataset identifying regions of chromatin that are dependent on Zelda for accessibility (Schulz, 2015) and performed statistical analysis for enrichment of Zelda-dependent loci and our different timing classes. We note that in Schulz et.al, the fraction of Zelda-bound regions (i.e., ChIP-peaks) is much larger than the set of identified regions that are absolutely dependent on Zelda for full chromatin accessibility, and therefore represent a small subpopulation of the Zelda binding regions both we and Schultz et al. have used to identify “Zelda targets”. Nonetheless, we observe that the Schultz-Zelda-dependent regions are likewise enriched for N:C ratio independent acquisition of chromatin accessibility. Performing the identical Fisher’s exact test we use in Figure 2, we find that Zelda-dependent regions are enriched for N:C ratio independence with a p-value of 0.001415 and a log2 odds ratio of 1.11.

We were concerned that the Schultz dataset could potentially be missing some Zelda-dependent regions owing to differences in sample collection and because of unappreciated technical differences between ATAC-seq and Faire-seq. We therefore double-checked these results using an unpublished Zelda ATAC-seq dataset that is currently part of an independent manuscript from our laboratory that is currently under review. These data were collected from embryos staged twelve minutes into NC14 (approximately 15 minutes older than the oldest embryo examined in this manuscript). Of note, our unpublished dataset identifies substantially more regions that depend on Zelda for chromatin accessibility, yet is still consistent with Schulz et al. in the fact that not all Zelda bound regions require Zelda for accessibility. When we perform the statistical analysis as described above, we also find that regions that require Zelda for chromatin accessibility are more likely to be N:C ratio independent with a p-value of 2.097x10-08, and a log2 odds ratio of 0.78. We feel that these supplemental tests strongly support the conclusion from our manuscript that on average the effect of Zelda’s interaction with chromatin is timed independently of the nucleo-cytoplasmic ratio.

We also double-checked the correlation of GAF binding with N:C ratio-dependence. As mentioned above, no chromatin accessibility data from GAF-null embryos exists, or is technically possible to obtain at the moment. Therefore, we obtained a 0-4 hour GAF ChIP-seq dataset from modEncode (accession# 3814) and tested for a correlation between binding and N:C ratio dependence. With the statistical analysis described above, we find that regions bound by GAF and not Zelda in this dataset are more likely to be NC ratio dependent with a p-value of 0.014 and a log2 odds ratio of 0.56. In the future, we hope to be able to address this issue more directly with more closely time-resolved GAF ChIP-seq data and hopefully loss of function accessibility data.

We note that we have included the details of these analyses here in our response, but have not incorporated them into the manuscript proper.

https://doi.org/10.7554/eLife.20148.052

Article and author information

Author details

  1. Shelby A Blythe

    Howard Hughes Medical Institute, Princeton University, Princeton, United States
    Contribution
    SAB, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    For correspondence
    sblythe@princeton.edu
    Competing interests
    The authors declare that no competing interests exist.
  2. Eric F Wieschaus

    Howard Hughes Medical Institute, Princeton University, Princeton, United States
    Contribution
    EFW, Analysis and interpretation of data, Drafting or revising the article
    For correspondence
    efw@princeton.edu
    Competing interests
    The authors declare that no competing interests exist.
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-0727-3349

Funding

National Institutes of Health (F32HD072653)

  • Shelby A Blythe

Howard Hughes Medical Institute

  • Shelby A Blythe
  • Eric F Wieschaus

National Institutes of Health (R37HD15587)

  • Eric F Wieschaus

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank S Di Talia, P Klein, M Levo, and S Little for comments on the manuscript; A Amodeo, K Chen, T Gregor, C Hannon, M Levine, P Schedl, and T Schüpbach for helpful discussions; H Garcia for assistance with quantification of MS2-MCP data; and W Wang and the staff of the Lewis-Sigler Institute Sequencing Core Facility. This work was supported in part by NIH grant number R37HD15587 (EFW) and Ruth Kirschstein NRSA Postdoctoral Fellowship F32HD072653 (SAB). EFW is an investigator with the Howard Hughes Medical Institute.

Reviewing Editor

  1. Allan C Spradling, Howard Hughes Medical Institute, Carnegie Institution for Science, United States

Publication history

  1. Received: July 28, 2016
  2. Accepted: November 21, 2016
  3. Accepted Manuscript published: November 23, 2016 (version 1)
  4. Version of Record published: December 14, 2016 (version 2)

Copyright

© 2016, Blythe et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 3,466
    Page views
  • 1,026
    Downloads
  • 30
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, Scopus, PubMed Central.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading

    1. Developmental Biology
    2. Evolutionary Biology
    Suparna Ray et al.
    Research Communication
    1. Cell Biology
    2. Developmental Biology
    Torey R Arnold et al.
    Research Article Updated