Abstract
Thermodynamic models of gene regulation can predict transcriptional regulation in bacteria, but in eukaryotes, chromatin accessibility and energy expenditure may call for a different framework. Here, we systematically tested the predictive power of models of DNA accessibility based on the MonodWymanChangeux (MWC) model of allostery, which posits that chromatin fluctuates between accessible and inaccessible states. We dissected the regulatory dynamics of hunchback by the activator Bicoid and the pioneerlike transcription factor Zelda in living Drosophila embryos and showed that no thermodynamic or nonequilibrium MWC model can recapitulate hunchback transcription. Therefore, we explored a model where DNA accessibility is not the result of thermal fluctuations but is catalyzed by Bicoid and Zelda, possibly through histone acetylation, and found that this model can predict hunchback dynamics. Thus, our theoryexperiment dialogue uncovered potential molecular mechanisms of transcriptional regulatory dynamics, a key step toward reaching a predictive understanding of developmental decisionmaking.
eLife digest
Cells in the brain, liver and skin, as well as many other organs, all contain the same DNA, yet behave in very different ways. This is because before a gene can produce its corresponding protein, it must first be transcribed into messenger RNA. As an organism grows, the transcription of certain genes is switched on or off by regulatory molecules called transcription factors, which guide cells towards a specific ‘fate’.
These molecules bind to specific locations within the regulatory regions of DNA, and for decades biologist have tried to use the arrangement of these sites to predict which proteins a cell will make. Theoretical models known as thermodynamic models have been able to successfully predict transcription in bacteria. However, this has proved more challenging to do in eukaryotes, such as yeast, fruit flies and humans.
One of the key differences is that DNA in eukaryotes is typically tightly wound into bundles called nucleosomes, which must be disentangled in order for transcription factors to access the DNA. Previous thermodynamic models have suggested that DNA in eukaryotes randomly switches between being in a wound and unwound state. The models assume that once unwound, regulatory proteins stabilize the DNA in this form, making it easier for other transcription factors to bind to the DNA.
Now, Eck, Liu et al. have tested some of these models by studying the transcription of a gene involved in the development of fruit flies. The experiments showed that no thermodynamic model could accurately mimic how this gene is regulated in the embryos of fruit flies. This led Eck, Liu et al. to identify a model that is better at predicting the activation pattern of this developmental gene. In this model, instead of just ‘locking’ DNA into an unwound shape, transcription factors can also actively speed up the unwinding of DNA.
This improved understanding builds towards the goal of predicting gene regulation, where DNA sequences can be used to tell where and when cell decisions will be made. In the future, this could allow the development of new types of therapies that can regulate transcription in different diseases.
Introduction
Over the last decade, hopeful analogies between genetic and electronic circuits have posed the challenge of predicting the output gene expression of a DNA regulatory sequence in much the same way that the output current of an electronic circuit can be predicted from its wiring diagram (Endy, 2005). This challenge has been met with a plethora of theoretical works, including thermodynamic models, which use equilibrium statistical mechanics to calculate the probability of finding transcription factors bound to DNA and to relate this probability to the output rate of mRNA production (Ackers et al., 1982; Buchler et al., 2003; Vilar and Leibler, 2003; Bolouri and Davidson, 2003; Bintu et al., 2005a; Bintu et al., 2005b; Sherman and Cohen, 2012). Thermodynamic models of bacterial transcription launched a dialogue between theory and experiments that has largely confirmed their predictive power for several operons (Ackers et al., 1982; Bakk et al., 2004; Zeng et al., 2010; He et al., 2010; Garcia and Phillips, 2011; Brewster et al., 2012; Cui et al., 2013; Brewster et al., 2014; Sepúlveda et al., 2016; RazoMejia et al., 2018) with a few potential exceptions (Garcia et al., 2012; Hammar et al., 2014).
Following these successes, thermodynamic models have been widely applied to eukaryotes to describe transcriptional regulation in yeast (Segal et al., 2006; Gertz et al., 2009; Sharon et al., 2012; Zeigler and Cohen, 2014), human cells (Giorgetti et al., 2010), and the fruit fly Drosophila melanogaster (Jaeger et al., 2004a; Zinzen et al., 2006; Segal et al., 2008; Fakhouri et al., 2010; Parker et al., 2011; Kanodia et al., 2012; White et al., 2012; Samee et al., 2015; Sayal et al., 2016). However, two key differences between bacteria and eukaryotes cast doubt on the applicability of thermodynamic models to predict transcriptional regulation in the latter. First, in eukaryotes, DNA is tightly packed in nucleosomes and must become accessible in order for transcription factor binding and transcription to occur (Polach and Widom, 1995; Levine, 2010; Schulze and Wallrath, 2007; Lam et al., 2008; RavehSadka et al., 2009; Li et al., 2011; Fussner et al., 2011; Bai et al., 2011; Li et al., 2014b; Hansen and O'Shea, 2015). Second, recent reports have speculated that, unlike in bacteria, the equilibrium framework may be insufficient to account for the energyexpending steps involved in eukaryotic transcriptional regulation, such as histone modifications and nucleosome remodeling, calling for nonequilibrium models of transcriptional regulation (Kim and O'Shea, 2008; Estrada et al., 2016; Li et al., 2018; Park et al., 2019).
Recently, various theoretical models have incorporated chromatin accessibility and energy expenditure in theoretical descriptions of eukaryotic transcriptional regulation. First, models by Mirny, 2010, Narula and Igoshin, 2010, and Marzen et al., 2013 accounted for chromatin occluding transcriptionfactor binding by extending thermodynamic models to incorporate the MonodWymanChangeux (MWC) model of allostery (Figure 1A; Monod et al., 1965). This thermodynamic MWC model assumes that chromatin rapidly transitions between accessible and inaccessible states via thermal fluctuations, and that the binding of transcription factors to accessible DNA shifts this equilibrium toward the accessible state. Like all thermodynamic models, this model relies on the ‘occupancy hypothesis’ (Hammar et al., 2014; Garcia et al., 2012; Phillips et al., 2019): the probability ${p}_{bound}$ of finding RNA polymerase (RNAP) bound to the promoter, a quantity that can be easily computed, is linearly related to the rate of mRNA production $\frac{\mathrm{d}\mathrm{m}\mathrm{R}\mathrm{N}\mathrm{A}}{\mathrm{d}t}$, a quantity that can be experimentally measured, such that
Here, R is the rate of mRNA production when the system is in an RNAPbound state (see Appendix section 1.1 for a more detailed overview). Additionally, in all thermodynamic models, the transitions between states are assumed to be much faster than both the rate of transcriptional initiation and changes in transcription factor concentrations. This separation of time scales, combined with a lack of energy dissipation in the process of regulation, makes it possible to consider the states to be in equilibrium such that the probability of each state can be computed using its Boltzmann weight (Garcia et al., 2007).
Despite the predictive power of thermodynamic models, eukaryotic transcription may not adhere to the requirements imposed by the thermodynamic framework. Indeed, Narula and Igoshin, 2010, Hammar et al., 2014, Estrada et al., 2016, Scholes et al., 2017, and Li et al., 2018 have proposed theoretical treatments of transcriptional regulation that maintain the occupancy hypothesis, but make no assumptions about separation of time scales or energy expenditure in the process of regulation. When combined with the MWC mechanism of DNA allostery, these models result in a nonequilibrium MWC model (Figure 1B). Here, no constraints are imposed on the relative values of the transition rates between states and energy can be dissipated over time. To our knowledge, neither the thermodynamic MWC model nor the nonequilibrium MWC model have been tested experimentally in eukaryotic transcriptional regulation.
Here, we performed a systematic dissection of the predictive power of these MWC models of DNA allostery in the embryonic development of the fruit fly Drosophila melanogaster in the context of the steplike activation of the hunchback gene by the Bicoid activator and the pioneerlike transcription factor Zelda (Driever et al., 1989; Nien et al., 2011; Xu et al., 2014). Specifically, we compared the predictions from these MWC models against dynamical measurements of input Bicoid and Zelda concentrations and output hunchback transcriptional activity. Using this approach, we discovered that no thermodynamic or nonequilibrium MWC model featuring the regulation of hunchback by Bicoid and Zelda could describe the transcriptional dynamics of this gene. Following recent reports of the regulation of hunchback and snail (Desponds et al., 2016; Dufourt et al., 2018) and inspired by discussions of nonequilibrium schemes of transcriptional regulation (Coulon et al., 2013; Wong and Gunawardena, 2020), we proposed a model in which Bicoid and Zelda, rather than passively biasing thermal fluctuations of chromatin toward the accessible state, actively assist the overcoming of an energetic barrier to make chromatin accessible through the recruitment of energyconsuming histone modifiers or chromatin remodelers. This model (Figure 1C) recapitulated all of our experimental observations. This interplay between theory and experiment establishes a clear path to identify the molecular steps that make DNA accessible, to systematically test our model of transcriptionfactordriven chromatin accessibility, and to make progress toward a predictive understanding of transcriptional regulation in development.
Results
A thermodynamic MWC model of activation and chromatin accessibility by Bicoid and Zelda
During the first 2 hr of embryonic development, the hunchback P2 minimal enhancer (Margolis et al., 1995; Driever et al., 1989; Perry et al., 2012; Park et al., 2019) is believed to be devoid of significant input signals other than activation by Bicoid and regulation of chromatin accessibility by both Bicoid and Zelda (Perry et al., 2012; Xu et al., 2014; Hannon et al., 2017). As a result, the early regulation of hunchback provides an ideal scaffold for a stringent test of simple theoretical models of eukaryotic transcriptional regulation.
Our implementation of the thermodynamic MWC model (Figure 1A) in the context of hunchback states that in the inaccessible state, neither Bicoid nor Zelda can bind DNA. In the accessible state, DNA is unwrapped and the binding sites become accessible to these transcription factors. Due to the energetic cost of opening the chromatin ($\mathrm{\Delta}{\epsilon}_{\text{chrom}}$), the accessible state is less likely to occur than the inaccessible one. However, the binding of Bicoid or Zelda can shift the equilibrium toward the accessible state (Adams and Workman, 1995; Miller and Widom, 2003; Mirny, 2010; Narula and Igoshin, 2010; Marzen et al., 2013).
In our model, we assume that all binding sites for a given molecular species have the same binding affinity. Relaxing this assumption does not affect any of our conclusions (as we will see below in Sections 'The thermodynamic MWC model fails to predict activation of hunchback in the absence of Zelda' and 'No thermodynamic model can recapitulate the activation of hunchback by Bicoid alone'). Bicoid upregulates transcription by recruiting RNAP through a proteinprotein interaction characterized by the parameter ${\omega}_{bp}$. We allow cooperative proteinprotein interactions between Bicoid molecules, described by ${\omega}_{b}$. However, since to our knowledge there is no evidence of direct interaction between Zelda and any other proteins, we assume no interaction between Zelda and Bicoid, or between Zelda and RNAP.
In Figure 2A, we illustrate the simplified case of two Bicoid binding sites and one Zelda binding site, plus the corresponding statistical weights of each state given by their Boltzmann factors. Note that the actual model utilized throughout this work accounts for at least 6 Bicoidbinding sites and 10 Zeldabinding sites that have been identified within the hunchback P2 enhancer (Section 'Predicting Zelda binding sites'; Driever and NüssleinVolhard, 1988; Driever and NüssleinVolhard, 1989; Park et al., 2019). This general model is described in detail in Appendix section 1.2.
The probability of finding RNAP bound to the promoter is calculated by dividing the sum of all statistical weights featuring RNAP by the sum of the weights corresponding to all possible system states. This leads to
where $b=[Bicoid]/{K}_{b}$, $z=[Zelda]/{K}_{z}$, and $p=[RNAP]/{K}_{p}$, with $[Bicoid]$, $[Zelda]$, and $[RNAP]$ being the concentrations of Bicoid, Zelda, and RNAP, respectively, and ${K}_{b}$, ${K}_{z}$, and ${K}_{p}$ their dissociation constants (see Appendix sections 1.1 and 1.2 for a detailed derivation). Given a set of model parameters, plugging ${p}_{bound}$ into Equation 1 predicts the rate of RNAP loading as a function of Bicoid and Zelda concentrations as shown in Figure 2B. Note that in this work, we treat the rate of transcriptional initiation and the rate of RNAP loading interchangeably.
Dynamical prediction and measurement of inputoutput functions in development
In order to experimentally test the theoretical model in Figure 2, it is necessary to measure both the inputs – the concentrations of Bicoid and Zelda – as well as the output rate of RNAP loading. Typically, when testing models of transcriptional regulation in bacteria and eukaryotes, input transcriptionfactor concentrations are assumed to not be modulated in time: regulation is in steady state (Ackers et al., 1982; Bakk et al., 2004; Segal et al., 2008; Garcia and Phillips, 2011; Sherman and Cohen, 2012; Cui et al., 2013; Little et al., 2013; RavehSadka et al., 2009; Sharon et al., 2012; Zeigler and Cohen, 2014; Xu et al., 2015; Sepúlveda et al., 2016; Estrada et al., 2016; RazoMejia et al., 2018; Zoller et al., 2018; Park et al., 2019). However, embryonic development is a highly dynamic process in which the concentrations of transcription factors are constantly changing due to their nuclear import and export dynamics, and due to protein production, diffusion, and degradation (Edgar and Schubiger, 1986; Edgar et al., 1987; Jaeger et al., 2004b; Gregor et al., 2007b). As a result, it is necessary to go beyond steadystate assumptions and to predict and measure how the instantaneous, timevarying concentrations of Bicoid and Zelda at each point in space dictate hunchback output transcriptional dynamics.
In order to quantify the concentration dynamics of Bicoid, we utilized an established BicoideGFP line (Sections 'Fly Strains', 'Sample preparation and data collection' and 'Image analysis'; Figure 3A and Appendix 1—figure 3A; Video 1; Gregor et al., 2007b; Liu et al., 2013). As expected, this line displayed the exponential Bicoid gradient across the length of the embryo (Appendix section 2.1; Appendix 1—figure 3B).We measured mean Bicoid nuclear concentration dynamics along the anteriorposterior axis of the embryo, as exemplified for two positions in Figure 3A. As previously reported (Gregor et al., 2007b), after anaphase and nuclear envelope formation, the Bicoid nuclear concentration quickly increases as a result of nuclear import. These measurements were used as inputs into the theoretical model in Figure 2.
Zelda concentration dynamics were measured in a ZeldasfGFP line (Sections 'Fly Strains', 'Sample preparation and data collection', and 'Image analysis'; Figure 3B; Video 2; Hamm et al., 2017). Consistent with previous results (Staudt et al., 2006; Liang et al., 2008; Dufourt et al., 2018), the Zelda concentration was spatially uniform along the embryo (Appendix 1—figure 3). Contrasting Figure 3A and B reveals that the overall concentration dynamics of both Bicoid and Zelda are qualitatively comparable. As a result of Zelda’s spatial uniformity, we used mean Zelda nuclear concentration dynamics averaged across all nuclei within the field of view to test our model (Appendix section 2.1; Figure 3B).
Given the high reproducibility of the concentration dynamics of Bicoid and Zelda (Appendix 1—figure 3), we combined measurements from multiple embryos by synchronizing their anaphase in order to create an ‘averaged embryo’ (Appendix section 2.1), an approach that has been repeatedly used to describe protein and transcriptional dynamics in the early fly embryo (Garcia et al., 2013; Bothma et al., 2014; Bothma et al., 2015; Berrocal et al., 2018; Lammers et al., 2020).
Our model assumes that hunchback output depends on the instantaneous concentration of input transcription factors. As a result, at each position along the anteriorposterior axis of the embryo, the combined Bicoid and Zelda concentration dynamics define a trajectory over time along the predicted inputoutput function surface (Figure 3C). The resulting trajectory predicts the rate of RNAP loading as a function of time. However, instead of focusing on calculating RNAP loading rate, we used it to compute the number of RNAP molecules actively transcribing hunchback at each point in space and time, a more experimentally accessible quantity (Section 'The thermodynamic MWC model fails to predict activation of hunchback in the absence of Zelda'). This quantity can be obtained by accounting for the RNAP elongation rate and the cleavage of nascent RNA upon termination (Appendix section 2.2; Appendix 1—figure 4; Bothma et al., 2014; Lammers et al., 2020) yielding the predictions shown in Figure 3D.
Instead of examining the full timedependent nature of our data, we analyzed two main dynamical features stemming from our prediction of the number of RNAP molecules actively transcribing hunchback: the initial rate of RNAP loading and the transcriptional onset time, ${t}_{on}$, defined by the slope of the initial rise in the predicted number of RNAP molecules, and the time after anaphase at which transcription starts as determined by the xintercept of the linear fit to the initial rise, respectively (Figure 3D).
Examples of the predictions generated by our theoretical model are shown in Figure 3E and F, where we calculate the initial rate of RNAP loading and ${t}_{on}$ for different values of the Bicoid dissociation constant ${K}_{b}$. This framework for quantitatively investigating dynamic inputoutput functions in living embryos is a necessary step toward testing the predictions of theoretical models of transcriptional regulation in development.
The thermodynamic MWC model fails to predict activation of hunchback in the absence of Zelda
In order to test the predictions of the thermodynamic MWC model (Figure 3E and F), we used the MS2 system (Bertrand et al., 1998; Garcia et al., 2013; Lucas et al., 2013). Here, 24 repeats of the MS2 loop are inserted in the 5′ untranslated region of the hunchback P2 reporter (Garcia et al., 2013), resulting in the fluorescent labeling of sites of nascent transcript formation (Figure 4A; Video 3). This fluorescence is proportional to the number of RNAP molecules actively transcribing the gene (Garcia et al., 2013). The experimental mean fluorescence as a function of time measured in a narrow window (2.5% of the total embryo length, averaged across nuclei in the window) along the length of the embryo (Figure 4B) is in qualitative agreement with the theoretical prediction (Figure 3D).
To compare theory and experiment, we next obtained the initial RNAP loading rates (Figure 4C, blue points) and ${t}_{on}$ (Figure 4D, blue points) from the experimental data (Appendix section 2.3; Appendix 1—figure 5B). The steplike shape of the RNAP loading rate (Figure 4C, blue points) agrees with previous measurements performed on this same reporter construct (Garcia et al., 2013). The plateaus at the extreme anterior and posterior positions were used to constrain the maximum and minimum theoretically allowed values in the model (Appendix section 1.3). With these constraints in place, we attempted to simultaneously fit the thermodynamic MWC model to both the initial rate of RNAP loading and ${t}_{on}$. For a given set of model parameters, the measurements of Bicoid and Zelda concentration dynamics predicted a corresponding initial rate of RNAP loading and ${t}_{on}$ (Figure 3E and F). The model parameters were then iterated using standard curvefitting techniques (Section 'Data analysis') until the best fit to the experimental data was achieved (Figure 4C and D, blue lines).
Although the model accounted for the initial rate of RNAP loading (Figure 4C, blue line), it produced transcriptional onset times that were much lower than those that we experimentally observed (Appendix 1—figure 6B, purple line). We hypothesized that this disagreement was due to our model not accounting for mitotic repression, when the transcriptional machinery appears to be silent immediately after cell division (Shermoen and O'Farrell, 1991; Gottesfeld and Forbes, 1997; Parsons and Spencer, 1997; Garcia et al., 2013). Thus, we modified the thermodynamic MWC model to include a mitotic repression window term, implemented as a time window at the start of the nuclear cycle during which no transcription could occur; the rate of mRNA production is thus given by
where $R$ and ${p}_{bound}$ are as defined in Equations 1 and 2, respectively, and ${t}_{MitRep}$ is the mitotic repression time window over which no transcription can take place after anaphase (Appendix sections 1.2 and 3). After incorporating mitotic repression, the thermodynamic MWC model successfully fit both the rates of RNAP loading and ${t}_{on}$ (Figure 4C and D, blue lines, Appendix 1—figure 6A and B, blue lines).
Given this success, we next challenged the model to perform the simpler task of explaining Bicoidmediated regulation in the absence of Zelda. This scenario corresponds to setting the concentration of Zelda to zero in the models in Appendix section 1.2 and Figure 2. In order to test this seemingly simpler model, we repeated our measurements in embryos devoid of Zelda protein (Video 4). These $zeld{a}^{}$ embryos were created by inducing clones of nonfunctional zelda mutant (${\text{\mathit{z}\mathit{e}\mathit{l}\mathit{d}\mathit{a}}}^{294}$) germ cells in female adults (Sections 'Fly Strains', 'Zelda germline clones'; Liang et al., 2008). All embryos from these mothers lack maternally deposited Zelda; female embryos still have a functional copy of zelda from their father, but this copy is not transcribed until after the maternaltozygotic transition, during nuclear cycle 14 (Liang et al., 2008). We confirmed that the absence of Zelda did not have a substantial effect on the spatiotemporal pattern of Bicoid (Appendix section 4; Xu et al., 2014).
While close to 100% of nuclei in wildtype embryos exhibited transcription along the length of the embryo (Figure 4E, blue; Video 5), measurements in the zelda^{−} background revealed that some nuclei never displayed any transcription during the entire nuclear cycle (Video 6). Specifically, transcription occurred only in the anterior part of the embryo, with transcription disappearing completely in positions posterior to about 40% of the embryo length (Figure 4E, red). We confirmed that no visible transcription spots were present in zelda^{−} embryo posteriors by imaging in the posteriors of three zelda^{−} embryos. These embryos are not included in our total embryo counts.
From those positions in the mutant embryos that did exhibit transcription in at least 30% of observed nuclei, we extracted the initial rate of RNAP loading and ${t}_{on}$ as a function of position. Interestingly, these RNAP loading rates were comparable to the corresponding rates in wildtype embryos (Figure 4C, red points). However, unlike in the wildtype case (Figure 4D, blue points), ${t}_{on}$ was not constant in the $zeld{a}^{}$ background. Instead, ${t}_{on}$ became increasingly delayed in more posterior positions until transcription ceased posterior to 40% of the embryo length (Figure 4D, red points). Together, these observations indicated that removing Zelda primarily results in a delay of transcription with only negligible effects on the underlying rates of RNAP loading, consistent with previous fixedembryo experiments (Nien et al., 2011; Foo et al., 2014) and with recent liveimaging measurements in which Zelda binding was reduced at specific enhancers (Dufourt et al., 2018; Yamada et al., 2019). We speculate that the loss of transcriptionally active nuclei posterior to 40% of the embryo length is a direct result of this delay in ${t}_{on}$: by the time that onset would occur in those nuclei, the processes leading to the next mitosis have already started and repressed transcriptional activity.
Next, we attempted to simultaneously fit the model to the initial rates of RNAP loading and ${t}_{on}$ in the zelda^{−} mutant background. Although the model recapitulated the observed initial RNAP loading rates (Figure 4C, red line), we noticed a discrepancy between the observed and fitted transcriptional onset times of up to ∼5 min (Figure 4D, red). While the mutant data exhibited a substantial delay in more posterior nuclei, the model did not produce significant delays (Figure 4D, red line). Further, our model could not account for the lack of transcriptional activity posterior to 40% of the embryo length in the zelda^{−} mutant (Figure 4E, red).
These discrepancies suggest that the thermodynamic MWC model cannot fully describe the transcriptional regulation of the hunchback promoter by Bicoid and Zelda. However, the attempted fits in Figure 4C and D correspond to a particular set of model parameters and therefore do not completely rule out the possibility that there exists some parameter set of the thermodynamic MWC model capable of recapitulating the zelda^{−} data.
In order to determine whether this model is at all capable of accounting for the zelda^{−} transcriptional behavior, we systematically explored how its parameters dictate its predictions. To characterize and visualize the limits of our model, we examined two relevant quantitative features of our data. First, we defined the offset in the transcriptional onset time as the value of the onset time at the position 20% along the embryo length, the most anterior position studied here (Figure 5A), namely
where x is the position along the embryo. Second, we measured the average transcriptional onset delay along the anteriorposterior axis (Figure 5A). This quantity is defined as the area under the curve of ${t}_{on}$ versus embryo position, from 20% to 37.5% along the embryo (the positions where the zelda^{−} embryos display transcription in at least 30% of nuclei), divided by the corresponding distance along the embryo
where the offset in the onset time was used to define the zero of this integral (Appendix section 5.1). While the offset in ${t}_{on}$ is similar for both wildtype and zelda^{−} backgrounds (approximately 4 min), the average ${t}_{on}$ delay corresponding to the wildtype data is close to 0 min, and is different from the value of about 0.7 min obtained from measurements in the zelda^{−} background within experimental error (Figure 5C, ellipses).
Based on Estrada et al., 2016 and as detailed in Appendix section 5.1, we used an algorithm to efficiently sample the parameter space of the thermodynamic MWC model (dissociation constants ${K}_{b}$ and ${K}_{z}$, proteinprotein interaction terms ${\omega}_{b}$ and ${\omega}_{bp}$, energy to make the DNA accessible $\mathrm{\Delta}{\epsilon}_{\text{chrom}}$, and length of the mitotic repression window ${t}_{MitRep}$), and to calculate the corresponding ${t}_{on}$ offset and average ${t}_{on}$ delay for each parameter set. Figure 5B features three specific realizations of this parameter search; for each combination of parameters considered, the predicted ${t}_{on}$ is calculated and the corresponding ${t}_{on}$ offset and average ${t}_{on}$ delay computed. Although the wildtype data overlap with the thermodynamic MWC model region, the range of the ${t}_{on}$ offset and average ${t}_{on}$ delay predicted by the model (Figure 5C, green) did not overlap with that of the zelda^{−} data. We concluded that our thermodynamic MWC model is not sufficient to explain the regulation of hunchback by Bicoid and Zelda.
No thermodynamic model can recapitulate the activation of hunchback by Bicoid alone
Since the failure of the thermodynamic MWC model to predict the zelda^{−} data does not necessarily rule out the existence of another thermodynamic model that can account for our experimental measurements, we considered other possible thermodynamic models. Conveniently, an arbitrary thermodynamic model featuring n_{b} Bicoid binding sites can be generalized using the mathematical expression
where ${p}_{inacc}$ and ${P}_{r,i}$ are arbitrary weights describing the states in our generalized thermodynamic model, $R$ is a rate constant that relates promoter occupancy to transcription rate, and the r and i summations refer to the numbers of RNAP and Bicoid molecules bound to the enhancer, respectively (Appendix section 6.1; Bintu et al., 2005a; Estrada et al., 2016; Scholes et al., 2017). Note, that this generalized thermodynamic model also included the possibility of Bicoid binding to the inaccessible chromatin state (Appendix section 6.3).
Although this generalized thermodynamic model contains many more parameters than the thermodynamic MWC model previously considered, we could still systematically explore reasonable values of these parameters and the resulting ${t}_{on}$ offsets and average ${t}_{on}$ delays (Appendix section 6.2). For added generality, and to account for recent reports suggesting the presence of more than six Bicoidbinding sites in the hunchback minimal enhancer (Park et al., 2019), we expanded this model to include up to 12 Bicoidbinding sites.
The generalized thermodynamic model also failed to explain the zelda^{−} data (Appendix section 6.2; Figure 5C, yellow). Note that the region of parameter space occupied by the generalized thermodynamic model does not entirely include that of the thermodynamic MWC model due to differences in the constraints of parameter values used in the parameter exploration, as described in Appendix sections 1.3 and 6.2. Nevertheless, our results strongly suggest that no thermodynamic model of Bicoidactivated hunchback transcription can predict transcriptional onset in the absence of Zelda, casting doubt on the general applicability of these models to transcriptional regulation in development.
Qualitatively, the reason for the failure of thermodynamic models to predict hunchback transcriptional is revealed by comparing Bicoid and Zelda concentration dynamics to those of the MS2 output signal (Appendix 1—figure 10). The thermodynamic models investigated in this work have assumed that the system responds instantaneously to any changes in input transcription factor concentration. As a result, since Bicoid and Zelda are imported into the nucleus by around 3 min into the nuclear cycle (Figure 3A and B), these models always predict that transcription will ensue at approximately that time. Thus, thermodynamic models cannot accommodate delays in the ${t}_{on}$ such as those revealed by the zelda^{−} data (see Appendix section 6.4 for a more detailed explanation). Rather than further complicating our thermodynamic models with additional molecular players to attempt to describe the data, we instead decided to examine the broader purview of nonequilibrium models to attempt to reach an agreement between theory and experiment.
A nonequilibrium MWC model also fails to describe the zelda^{−} data
Thermodynamic models based on equilibrium statistical mechanics can be seen as limiting cases of more general kinetic models that lie out of equilibrium (Appendix section 6.5; Figure 1B). Following recent reports (Estrada et al., 2016; Li et al., 2018; Park et al., 2019) that the theoretical description of transcriptional regulation in eukaryotes may call for models rooted in nonequilibrium processes – where the assumptions of separation of time scales and no energy expenditure may break down – we extended our earlier models to produce a nonequilibrium MWC model (Appendix sections 6.5 and 7.1; Kim and O'Shea, 2008; Narula and Igoshin, 2010). This model, shown for the case of two Bicoid binding sites in Figure 6A, accounts for the dynamics of the MWC mechanism by positing transition rates between the inaccessible and accessible chromatin states, but makes no assumptions about the relative magnitudes of these rates, or about the rates of Bicoid and RNAP binding and unbinding.
Since this model can operate out of steady state, we calculate the probabilities of each state as a function of time by solving the system of coupled ordinary differential equations (ODEs) associated with the system shown in Figure 6A. Consistent with prior measurements (Blythe and Wieschaus, 2016), we assume that chromatin is inaccessible at the start of the nuclear cycle. Over time, the system evolves such that the probability of it occupying each state becomes nonzero, making it possible to calculate the fraction of time RNAP is bound to the promoter and, through the occupancy hypothesis, the rate of RNAP loading. Mitotic repression is still incorporated using the term ${t}_{MitRep}$. For times $t<{t}_{MitRep}$, the system can evolve in time but the ensuing transcription rate is fixed at zero.
We systematically varied the magnitudes of the transition rates and solved the system of ODEs in order to calculate the corresponding ${t}_{on}$ offset and average ${t}_{on}$ delay. Due to the combinatorial increase of free parameters as more Bicoidbinding sites are included in the model, we could only explore the parameter space for models containing up to five Bicoidbinding sites (Appendix section 7.2; Figure 6B and Appendix 1—figure 9). Regardless, none of the nonequilibrium MWC models with up to five Bicoidbinding sites came close to reaching the mutant ${t}_{on}$ offset and average ${t}_{on}$ delay (Figure 6B). Additionally, an alternative version of this nonequilibrium MWC model where the system could not evolve in time until after the mitotic repression window had elapsed yielded similar conclusions (see Appendix section 7.3 for details). We conjecture that the observed behavior extends to the biologically relevant case of six or more binding sites. Thus, we conclude that the more comprehensive nonequilibrium MWC model still cannot account for the experimental data, motivating an additional reexamination of our assumptions.
Transcriptionfactordriven chromatin accessibility can capture all aspects of the data
Since even nonequilibrium MWC models incorporating energy expenditure and nonsteady behavior could not explain the zelda^{−} data, we further revised the assumptions of our model in an effort to quantitatively predict the regulation of ${t}_{on}$ along the embryo. In accordance with the MWC model of allostery, all of our theoretical treatments so far have posited that the DNA is an allosteric molecule that transitions between open and closed states as a result of thermal fluctuations (Narula and Igoshin, 2010; Mirny, 2010; Marzen et al., 2013; Phillips et al., 2013).
In the MWC models considered here, the presence of Zelda and Bicoid does not affect the microscopic rates of DNA opening and closing; rather, their binding to open DNA shifts the equilibrium of the DNA conformation toward the accessible state. However, recent biochemical work has suggested that Zelda and Bicoid play a more direct role in making chromatin accessible. Specifically, Zelda has been implicated in the acetylation of chromatin, a histone modification that renders nucleosomes unstable and increases DNA accessibility (Li et al., 2014a; Li and Eisen, 2018). Further, Bicoid has been shown to interact with the coactivator dCBP, which possesses histone acetyltransferase activity (Fu et al., 2004). Additionally, recent studies by Desponds et al., 2016 in hunchback and by Dufourt et al., 2018 in snail have proposed the existence of multiple transcriptionally silent steps that the promoter needs to transition through before transcriptional onset. These steps could correspond to, for example, the recruitment of histone modifiers, nucleosome remodelers, and the transcriptional machinery (Li et al., 2014a; Park et al., 2019), or to the stepwise unraveling of discrete histoneDNA contacts (Culkin et al., 2017). Further, Dufourt et al., 2018 proposed that Zelda plays a role in modulating the number of these steps and their transition rates.
We therefore proposed a model of transcriptionfactordriven chromatin accessibility in which, in order for the DNA to become accessible and transcription to ensue, the system slowly and irreversibly transitions through m transcriptionally silent states (Appendix section 8.1; Figure 7A). We assume that the transitions between these states are all governed by the same rate constant π. Finally, in a stark deviation from the MWC framework, we posit that these transitions can be catalyzed by the presence of Bicoid and Zelda such that
Here, π describes the rate (in units of inverse time) of each irreversible step, expressed as a sum of rates that depend separately on the concentrations of Bicoid and Zelda, and c_{b} and c_{z} are rate constants that scale the relative contribution of each transcription factor to the overall rate (see Appendix section 8.2 for a more detailed discussion of this choice). We emphasize that this is only one potential model, and there may exist several other nonequilibrium models capable of describing our data.
In this model of transcriptionfactordriven chromatin accessibility, once the DNA becomes irreversibly accessible after transitioning through the m nonproductive states, we assume that, for the rest of the nuclear cycle, the system equilibrates rapidly such that the probability of it occupying any of its possible states is still described by equilibrium statistical mechanics. Like in our previous models, transcription only occurs in the RNAPbound states, obeying the occupancy hypothesis. Further, our model assumes that if the transcriptional onset time of a given nucleus exceeds that of the next mitosis, this nucleus will not engage in transcription. Thus, this transcriptionfactordriven model is an extension of the nonequilibrium MWC model with two crucial differences: (i) we allow for multiple inaccessible states preceding the transcriptionally active state, and (ii) the transitions between these states are actively driven by Bicoid or Zelda.
Unlike the thermodynamic and nonequilibrium MWC models, this model of transcriptionfactordriven chromatin accessibility quantitatively recapitulated the observation that posterior nuclei in zelda^{−} embryos do not engage in transcription as well as the initial rate of RNAP loading, and ${t}_{on}$ for both the wildtype and zelda^{−} mutant data (Figure 7B and C). Additionally, we found that a minimum of $m=3$ steps was required to sufficiently explain the data (Appendix section 8.3; Appendix 1—figure 14). Interestingly, unlike all previously considered models, the model of transcriptionfactordriven chromatin accessibility did not require mitotic repression to explain ${t}_{on}$ (Appendix sections 3 and 8.1). Instead, the timing of transcriptional output arose directly from the model’s initial irreversible transitions (Appendix 1—figure 14), obviating the need for an arbitrary suppression window in the beginning of the nuclear cycle. The only substantive disagreement between our theoretical model and the experimental data was that the model predicted that no nuclei should transcribe posterior to 60% of the embryo length, whereas no transcription posterior to 40% was experimentally observed in the embryo (Figure 7B and C). Finally, note that this model encompasses a much larger region of parameter space than the thermodynamic and nonequilibrium MWC models and, as expected from the agreement between model and experiment described above, contained both the wildtype and zelda^{−} data points within its domain (Figure 7D).
Discussion
For four decades, thermodynamic models rooted in equilibrium statistical mechanics have constituted the null theoretical model for calculating how the number, placement and affinity of transcription factor binding sites on regulatory DNA dictates gene expression (Bintu et al., 2005a; Bintu et al., 2005b). Further, the MWC mechanism of allostery has been proposed as an extra layer that allows thermodynamic and more general nonequilibrium models to account for the regulation of chromatin accessibility (Mirny, 2010; Narula and Igoshin, 2010; Marzen et al., 2013).
In this investigation, we tested thermodynamic and nonequilibrium MWC models of chromatin accessibility and transcriptional regulation in the context of hunchback activation in the early embryo of the fruit fly D. melanogaster (Driever et al., 1989; Nien et al., 2011; Xu et al., 2014). While chromatin state (accessibility, posttranslational modifications) is highly likely to influence transcriptional dynamics of associated promoters, specifically measuring the influence of chromatin state on transcriptional dynamics is challenging because of the sequential relationship between changes in chromatin state and transcriptional regulation. However, the hunchback P2 minimal enhancer provides a unique opportunity to dissect the relative contribution of chromatin regulation on transcriptional dynamics because, in the early embryo, chromatin accessibility at hunchback is granted by both Bicoid and Zelda (Hannon et al., 2017). The degree of hunchback transcriptional activity, however, is regulated directly by Bicoid (Driever and NüssleinVolhard, 1989; Driever et al., 1989; Struhl et al., 1989). Therefore, while genetic elimination of Zelda function interferes with acquisition of full chromatin accessibility, the hunchback locus retains a measurable degree of accessibility and transcriptional activity stemming from Bicoid function, allowing for a quantitative determination of the contribution of Zeldadependent chromatin accessibility on the transcriptional dynamics of the locus.
With these attributes in mind, we constructed a thermodynamic MWC model which, given a set of parameters, predicted an output rate of hunchback transcription as a function of the input Bicoid and Zelda concentrations (Figure 2B). In order to test this model, it was necessary to acknowledge that development is not in steadystate, and that both Bicoid and Zelda concentrations change dramatically in space and time (Figure 3A and B). As a result, we went beyond widespread steadystate descriptions of development and introduced a novel approach that incorporated transient dynamics of input transcriptionfactor concentrations in order to predict the instantaneous output transcriptional dynamics of hunchback (Figure 3C). Given input dynamics quantified with fluorescent protein fusions to Bicoid and Zelda, we both predicted output transcriptional activity and measured it with an MS2 reporter (Figures 3D and 4B).
This approach revealed that the thermodynamic MWC model sufficiently predicts the timing of the onset of transcription and the subsequent initial rate of RNAP loading as a function of Bicoid and Zelda concentration. However, when confronted with the much simpler case of Bicoidonly regulation in a zelda mutant, the thermodynamic MWC model failed to account for the observations that only a fraction of nuclei along the embryo engaged in transcription, and that the transcriptional onset time of those nuclei that do transcribe was significantly delayed with respect to the wildtype setting (Figure 4D and E). Our systematic exploration of all thermodynamic models (over a reasonable parameter range) showed that that no thermodynamic model featuring regulation by Bicoid alone could quantitatively recapitulate the measurements performed in the zelda mutant background (Figure 5C, yellow).
This disagreement could be resolved by invoking an unknown transcription factor that regulates the hunchback reporter in addition to Bicoid. However, at the early stages of development analyzed here, such a factor would need to be both maternally provided and patterned in a spatial gradient to produce the observed positiondependent transcriptional onset times. To our knowledge, none of the known maternal genes regulate the expression of this hunchback reporter in such a fashion (Chen et al., 2012; Perry et al., 2012; Xu et al., 2014). We conclude that the MWC thermodynamic model cannot accurately predict hunchback transcriptional dynamics.
To explore nonequilibrium models, we retained the MWC mechanism of chromatin accessibility, but did not demand that the accessible and inaccessible states be in thermal equilibrium. Further, we allowed for the process of Bicoid and RNAP binding, as well as their interactions, to consume energy. For up to five Bicoidbinding sites, no set of model parameters could quantitatively account for the transcriptional onset features in the zelda mutant background (Figure 6B). While we were unable to investigate models with more than five Bicoidbinding sites due to computational complexity (Estrada et al., 2016), the substantial distance in parameter space between the mutant data and the investigated models (Figure 6B) suggested that a successful model with more than five Bicoidbinding sites would probably operate near the limits of its explanatory power, similar to the conclusions from studies that explored hunchback regulation under the steadystate assumption (Park et al., 2019). Thus, despite the simplicity and success of the MWC model in predicting the effects of protein allostery in a wide range of biological contexts (Keymer et al., 2006; Swem et al., 2008; Martins and Swain, 2011; Marzen et al., 2013; Rapp and Yifrach, 2017; RazoMejia et al., 2018; Chure et al., 2019; Rapp and Yifrach, 2019), the observed transcriptional onset times could not be described by any previously proposed thermodynamic MWC mechanism of chromatin accessibility, or even by a more generic nonequilibrium MWC model in which energy is continuously dissipated (Tu, 2008; Kim and O'Shea, 2008; Narula and Igoshin, 2010; Estrada et al., 2016; Wang et al., 2017).
Since Zelda is associated with histone acetylation, which is correlated with increased chromatin accessibility (Li et al., 2014a; Li and Eisen, 2018), and Bicoid interacts with the coactivator dCBP, which has histone acetyltransferase activity (Fu et al., 2004; Fu and Ma, 2005; Park et al., 2019), we suspect that both Bicoid and Zelda actively drive DNA accessibility. A molecular pathway shared by Bicoid and Zelda to render chromatin accessible is consistent with our results, and with recent genomewide experiments showing that Bicoid can rescue the function of Zeldadependent enhancers at high enough concentrations (Hannon et al., 2017). Thus, the binding of Bicoid and Zelda, rather than just biasing the equilibrium toward the open chromatin state as in the MWC mechanism, may trigger a set of molecular events that locks DNA into an accessible state. In addition, the promoters of hunchback (Desponds et al., 2016) and snail (Dufourt et al., 2018) may transition through a set of intermediate, nonproductive states before transcription begins.
We therefore explored a model in which Bicoid and Zelda catalyze the transition of chromatin into the accessible state via a series of slow, effectively irreversible steps. These steps may be interpreted as energy barriers that are overcome through the action of Bicoid and Zelda, consistent with the coupling of these transcription factors to histone modifiers, nucleosome remodelers (Fu et al., 2004; Li et al., 2014a; Li and Eisen, 2018; Park et al., 2019), and with the stepwise breaking of discrete histoneDNA contacts to unwrap nucleosomal DNA (Culkin et al., 2017). In this model, once accessible, the chromatin remains in that state and the subsequent activation of hunchback by Bicoid is described by a thermodynamic model.
Crucially, this transcriptionfactordriven chromatin accessibility model successfully replicated all of our experimental observations. A minimum of three transcriptionally silent states were necessary to explain our data (Figure 7D and Appendix 1—figure 14C). Interestingly, recent work dissecting the transcriptional onset time distribution of snail also suggested the existence of three such intermediate steps in the context of that gene (Dufourt et al., 2018). Given that, as in hunchback, the removal and addition of Zelda modulates the timing of transcriptional onset of sog and snail (Dufourt et al., 2018; Yamada et al., 2019), we speculate that transcriptionfactordriven chromatin accessibility may also be at play in these pathways. Thus, taken in consideration with similar works examining the dynamics of transcription onset (Desponds et al., 2016; Dufourt et al., 2018; Fritzsch et al., 2018; Li et al., 2018), our results strongly suggest that chromatin state does not fluctuate thermodynamically, but rather progresses through a series of stepwise, transcriptionfactordriven transitions into a final RNAPaccessible configuration (Coulon et al., 2013).
Intriguingly, accounting for these intermediate states also obviated the need for the ad hoc imposition of a mitotic repression window (Appendix sections 3 and 8.1), which was required in the thermodynamic MWC model (Appendix 1—figure 6). Our results thus suggest a mechanistic interpretation of the phenomenon of mitotic repression after anaphase, where the promoter must traverse through intermediary transcriptionally silent states before transcriptional onset can occur.
These clues into the molecular mechanisms of action of Bicoid, Zelda, and their associated modifications to the chromatin landscape pertain to a time scale of a few minutes, a temporal scale that is inaccessible with widespread genomewide and fixedtissue approaches. Here, we revealed the regulatory action of Bicoid and Zelda by utilizing the dynamic information provided by live imaging to analyze the transient nature of the transcriptional onset time, highlighting the need for descriptions of development that go beyond steady state and acknowledge the highly dynamic changes in transcriptionfactor concentrations that drive developmental programs.
While we showed that one model incorporating transcriptionfactordriven chromatin accessibility could recapitulate hunchback transcriptional regulation by Bicoid and Zelda, and is consistent with molecular evidence on the modes of action of these transcription factors, other models may have comparable explanatory power. In the future, a systematic exploration of different classes of models and their unique predictions will identify measurements that determine which specific model is the most appropriate description of transcriptional regulation in development and how it is implemented at the molecular level. While all the analyses in this work relied on mean levels of input concentrations and output transcription levels, detailed studies of singlecell features of transcriptional dynamics such as the distribution of transcriptional onset times (Narula and Igoshin, 2010; Dufourt et al., 2018; Fritzsch et al., 2018) could shed light on these chromatinregulating mechanisms. Simultaneous measurement of local transcriptionfactor concentrations at sites of transcription and of transcriptional initiation with high spatiotemporal resolution, such as afforded by lattice lightsheet microscopy (Mir et al., 2018), could provide further information about chromatin accessibility dynamics. Finally, different theoretical models may make distinct predictions about the effect of modulating the number, placement, and affinity of Bicoid and Zelda sites (and even of nucleosomes) in the hunchback enhancer. These models could be tested with future experiments that implement these modulations in reporter constructs.
In sum, here we engaged in a theoryexperiment dialogue to respond to the theoretical challenges of proposing a passive MWC mechanism for chromatin accessibility in eukaryotes (Mirny, 2010; Narula and Igoshin, 2010; Marzen et al., 2013); we also questioned the suitability of thermodynamic models in the context of development (Estrada et al., 2016). At least regarding the activation of hunchback, and likely similar developmental genes such as snail and sog (Dufourt et al., 2018; Yamada et al., 2019), we speculate that Bicoid and Zelda actively drive chromatin accessibility, possibly through histone acetylation. Once chromatin becomes accessible, thermodynamic models can predict hunchback transcription without the need to invoke energy expenditure and nonequilibrium models. Regardless of whether we have identified the only possible model of chromatin accessibility and regulation, we have demonstrated that this dialogue between theoretical models and the experimental testing of their predictions at high spatiotemporal resolution is a powerful tool for biological discovery. The new insights afforded by this dialogue will undoubtedly refine theoretical descriptions of transcriptional regulation as a further step toward a predictive understanding of cellular decisionmaking in development.
Materials and methods
Predicting Zeldabinding sites
Request a detailed protocolZeldabinding sites in the hunchback promoter were identified as heptamers scoring three or higher using a Zelda alignment matrix (Harrison et al., 2011) and the Advanced PASTER entry form online (http://stormo.wustl.edu/consensus/cgibin/Server/Interface/patser.cgi) (Hertz et al., 1990; Hertz and Stormo, 1999). PATSER was run with setting ‘Seq. Alphabet and Normalization’ as ‘a:t 3 g:c 2’ to provide the approximate background frequencies as annotated in the Berkeley Drosophila Genome Project (BDGP)/Celera Release 1. Reverse complementary sequences were also scored.
Fly strains
Request a detailed protocolBicoid nuclear concentration was imaged in embryos from line yw;his2avmrfp1;bicoidE1,egfpbicoid (Gregor et al., 2007b). Similarly, Zelda nuclear concentration was determined by imaging embryos from line sfgfpzelda;+;hisirfp. The sfgfpzelda transgene was obtained from Hamm et al., 2017 and the hisiRFP transgene is courtesy of Kenneth Irvine and Yuanwang Pan.
Transcription from the hunchback promoter was measured by imaging embryos resulting from crossing female virgins yw;HistoneRFP;MCPNoNLS(2) with male yw;P2PMS2LacZ/cyo;+ (Garcia et al., 2013).
In order to image transcription in embryos lacking maternally deposited Zelda protein, we crossed mother flies whose germline was w,his2avmrfp1,zelda(294),FRT19A;+;MCPegfp(4F)/+ obtained through germline clones (see below) with fathers carrying the yw;P2PMS2LacZ/cyo;+ reporter. The ${\text{\mathit{z}\mathit{e}\mathit{l}\mathit{d}\mathit{a}}}^{294}$ transgene is courtesy of Christine Rushlow (Liang et al., 2008). The MCPegfp(4F) transgene expresses approximately double the amount of MCP than the MCPegfp(2) (Garcia et al., 2013), ensuring similar levels of MCP in the embryo in all experiments.
Imaging Bicoid nuclear concentration in embryos lacking maternally deposited Zelda protein was accomplished by replacing the MCPegfp(4F) transgene described in the previous paragraph with the bicoidE1,egfpbicoid transgene used for imaging nuclear Bicoid in a wildtype background. We crossed mother flies whose germline was w,his2avmrfp1,zelda(294),FRT19A;+;bicoidE1,egfpbicoid/+ obtained through germline clones (see below) with yw fathers.
Zelda germline clones
Request a detailed protocolIn order to generate mother flies containing a germline homozygous null for zelda, we first crossed virgin females of w,his2avmrfp1,zelda(294),FRT19A/FM7,y,B;+;MCPegfp(4F)/TM3,ser (or w,his2avmrfp1,zelda(294),FRT19A;+;bicoidE1,egfpbicoid/+ to image nuclear Bicoid) with males of ovoD,hsFLP,FRT19A;+;+ (Liang et al., 2008). The resulting heterozygotic offspring were heatshocked in order to create maternal germline clones as described in Liang et al., 2008. The resulting female virgins were crossed with male yw;P2PMS2LacZ/cyo;+ (Garcia et al., 2013) to image transcription or male yw to image nuclear Bicoid concentration.
Male offspring are null for zygotic zelda. Female offspring are heterozygotic for functional zelda, but zygotic zelda is not transcribed until nuclear cycle 14 (Liang et al., 2008), which occurs after the analysis in this work. All embryos lacking maternally deposited Zelda showed aberrant morphology in nuclear size and shape (data not shown), as previously reported (Liang et al., 2008; Staudt et al., 2006).
Sample preparation and data collection
Request a detailed protocolSample preparation followed procedures described in Bothma et al., 2014, Garcia and Gregor, 2018, and Lammers et al., 2020.
Embryos were collected and mounted in halocarbon oil 27 between a semipermeable membrane (Lumox film, Starstedt, Germany) and a coverslip. Data collection was performed using a Leica SP8 scanning confocal microscope (Leica Microsystems, Biberach, Germany). Imaging settings for the MS2 experiments were the same as in Lammers et al., 2020, except the Hybrid Detector (HyD) for the HisRFP signal used a spectral window of 556–715 nm. The settings for the BicoidGFP measurements were the same, except for the following. The power setting for the 488 nm line was 10 μW. The confocal stack was only 10 slices in this case, rather than 21, resulting in a spacing of 1.11 μm between planes. The images were acquired at a time resolution of 30 s, using an image resolution of 512 × 128 pixels.
The settings for the ZeldasfGFP measurements were the same as the BicoidGFP measurements, except different laser lines were used for the different fluorophores. The sfGFP excitation line was set at 485 nm, using a power setting of 10 μW. The HisiRFP excitation line was set at 670 nm. The HyD for the HisiRFP signal was set at a 680–800 nm spectral window. All specimens were imaged over the duration of nuclear cycle 13.
Image analysis
Request a detailed protocolImages were analyzed using customwritten software following the protocol in Garcia et al., 2013. Briefly, this procedure involved segmenting individual nuclei using the histone signal as a nulear mask, segmenting each transcription spot based on its fluorescence, and calculating the intensity of each MCPGFP transcriptional spot inside a nucleus as a function of time.
Additionally, the nuclear protein fluorescences of the BicoidGFP and ZeldasfGFP fly lines were calculated as follows. Using the histonelabeled nuclear mask for each individual nucleus, the fluorescence signal within the mask was extracted in xyz, as well as through time. For each timepoint, the xy signal was averaged to give an average nuclear fluorescence as a function of z and time. This signal was then maximum projected in z, resulting in an average nuclear concentration as a function of time, per single nucleus. These single nucleus concentrations were then averaged over anteriorposterior position to create the protein concentrations reported in the main text.
Data analysis
Request a detailed protocolAll fits in the main text were performed by minimizing the leastsquares error between the data and the model predictions. Unless stated otherwise, error bars reflect standard error of the mean over multiple embryo measurements. See Appendix section 2.1 for more details on how this was carried out for model predictions.
Appendix 1
1. Equilibrium models of transcription
1.1 An overview of equilibrium thermodynamics models of transcription
In this section, we give a brief overview of the theoretical concepts behind equilibrium thermodynamics models of transcription. For a more detailed overview, we refer the reader to Bintu et al., 2005b and Bintu et al., 2005a. These models invoke statistical mechanics in order to to calculate bulk properties of a system by enumerating the probability of each possible microstate of the system. The probability of a given microstate is proportional to its Boltzmann weight ${e}^{\beta \epsilon}$, where ε is the energy of the microstate and $\beta ={({k}_{B}T)}^{1}$ with ${k}_{B}$ being the Boltzmann constant and $T$ the absolute temperature of the system (Garcia et al., 2007).
Specific examples of these microstates in the context of simple activation are featured in Appendix 1—figure 1. As reviewed in Garcia et al., 2007, the Boltzmann weight of each of these microstates can also be written in a thermodynamic language that accounts for the concentration of the molecular species, their dissociation constant to DNA, and a cooperativity term ω that accounts for the proteinprotein interactions between the activator and RNAP. To calculate the probability of finding RNAP bound to the promoter ${p}_{bound}$, we divide the sum of the weights of the RNAPbound states by the sum of all possible states
Here, $[P]$ and $[A]$ are the concentrations of RNAP and activator, respectively. ${K}_{p}$ and ${K}_{a}$ are their corresponding dissociation constants, and ω indicates an interaction between activator and RNAP: $\omega >1$ corresponds to cooperativity, whereas $0<\omega <1$ corresponds to anticooperativity.
Using ${p}_{bound}$, we write the subsequent rate of mRNA production by assuming the occupancy hypothesis, which states that
where $R$ is an underlying rate of transcriptional initiation (usually interpreted as the rate of loading RNAP from the promoterbound state). In the case of simple activation illustrated in Appendix 1—figure 1, the overall transcriptional initiation rate is then given by
From Appendix Equation 1, one can derive the Hill equation that is frequently used to model biophysical binding. In the limit of high cooperativity, $\omega \frac{[P]}{{K}_{p}}\gg 1$ and $\omega \frac{[A]}{{K}_{a}}\gg 1$ such that
If we then define a new binding constant ${K}_{a}^{\prime}=\frac{{K}_{a}{K}_{p}}{\omega [P]}$, we get the familiar Hill equation of order 1 with a binding constant ${K}_{a}^{\prime}$
In general, any Hill equation of order n can be derived from a more fundamental equilibrium thermodynamic model of simple activation possessing n activatorbinding sites in the appropriate limits of high cooperativity. Thus, any time a Hill equation is invoked, equilibrium thermodynamics is implicitly used, bringing with it all of the underlying assumptions described in Appendix section 6.5. This highlights the importance of rigorously grounding the assumptions made in any model of transcription, to better discriminate between the effects of equilibrium and nonequilibrium processes.
1.2 Thermodynamic MWC model
In the thermodynamic MWC model, we consider a system with 6 Bicoidbinding sites and 10 Zeldabinding sites. In addition, we allow for RNAP binding to the promoter.
In our model, the DNA can be in either an accessible or an inaccessible state. The difference in free energy between the two states is given by $\mathrm{\Delta}{\epsilon}_{\text{chrom}}$, where $\mathrm{\Delta}{\epsilon}_{\text{chrom}}$ is defined as
Here, ${\epsilon}_{\text{accessible}}$ and $\epsilon}_{{\textstyle \text{inaccessible}}$ are the energies of the accessible and inaccessible states, respectively. A positive $\mathrm{\Delta}{\epsilon}_{\text{chrom}}$ signifies that the inaccessible state is at a lower energy level, and therefore more probable, than the accessible state. We assume that all binding sites for a given molecular species have the same binding affinity, and that all accessible states exist at the same energy level compared to the inaccessible state. Thus, the total number of states is determined by the combinations of occupancy states of the three types of binding sites as well as the presence of the inaccessible, unbound state. We choose to not allow any transcription factor or RNAP binding when the DNA is inaccessible.
In this equilibrium model, the statistical weight of each accessible microstate is given by the thermodynamic dissociation constants ${K}_{b}$, ${K}_{z}$, and ${K}_{p}$ of Bicoid, Zelda, and RNAP, respectively. The statistical weight for the inaccessible state is ${e}^{\mathrm{\Delta}{\epsilon}_{\text{chrom}}}$. We allow for a proteinprotein interaction term ${\omega}_{b}$ between nearestneighbor Bicoid molecules, as well as a pairwise cooperativity ${\omega}_{bp}$ between Bicoid and RNAP. However, we posit that Zelda does not interact directly with either Bicoid or RNAP. For notational convenience, we express the statistical weights in terms of the nondimensionalized concentrations of Bicoid, Zelda, and RNAP, given by b, z and p, respectively, such that, for example, $b\equiv \frac{[Bicoid]}{{K}_{b}}$. Appendix 1—figure 2 shows the states and statistical weights for this thermodynamic MWC model, with all the associated parameters.
Incorporating all the microstates, we can calculate a statistical mechanical partition function, the sum of all possible weights, which is given by
Using the binomial theorem
Appendix Equation 7 can be expressed more compactly as
From this partition function, we can calculate ${p}_{bound}$, the probability of being in an RNAPbound state. This term is given by the sum of the statistical weights of the RNAPbound states divided by the partition function
In this model, we once again assume that the transcription associated with each microstate is zero unless RNAP is bound, in which case the associated rate is $R$. Then, the overall transcriptional initiation rate is given by the product of ${p}_{bound}$ and $R$
Note that since the MS2 technology only measures nascent transcripts, we can ignore the effects of mRNA degradation and focus on transcriptional initiation.
1.3 Constraining model parameters
The transcription rate R of the RNAPbound states can be experimentally constrained by making use of the fact that the hunchback minimal reporter used in this work produces a steplike pattern of transcription across the length of the fly embryo (Figure 4C, blue points). Since in the anterior end of the embryo, the observed transcription appears to level out to a maximum value, we assume that Bicoid binding is saturated in this anterior end of the embryo such that
In this limit, Appendix Equation 10 can be written as
where ${R}_{max}$ is the maximum possible transcription rate. Importantly, ${R}_{max}$ is an experimentally observed quantity rather than a free parameter. As a result, the model parameter $R$ is determined by experimentally measurable quantity ${R}_{max}$.
The value of p can also be constrained by measuring the transcription rate in the embryo’s posterior, where we assume Bicoid concentration to be negligible. Here, the observed transcription bottoms out to a minimum level ${R}_{min}$ (Figure 4C, blue points), which we can connect with the model’s theoretical minimum rate. Specifically, in this limit, b approaches zero in Appendix Equation 10 such that all Bicoiddependent terms drop out, resulting in
where we have replaced $R$ with ${R}_{max}$ as described above. Next, we can express p in terms of the other parameters such that
Thus, p is no longer a free parameter, but is instead constrained by the experimentally observed maximum and minimum rates of transcription ${R}_{max}$ and ${R}_{min}$, as well as our choices of ${K}_{z}$ and $\mathrm{\Delta}{\epsilon}_{chrom}$. In our analysis, ${R}_{max}$ and ${R}_{min}$ are calculated by taking the mean RNAP loading rate across all embryos from the anterior and posterior of the embryo respectively, extrapolated using the trapezoidal fitting scheme described in Appendix section 2.3.
Finally, we expand this thermodynamic MWC model to also account for suppression of transcription in the beginning of the nuclear cycle via mechanisms such as mitotic repression (Appendix section 3). To make this possible, we include a trigger time term ${t}_{MitRep}$, before which we posit that no readout of Bicoid or Zelda by hunchback is possible and the rate of RNAP loading is fixed at 0. For times $t>{t}_{MitRep}$, the system behaves according to Appendix Equation 10. Thus, given the constraints stemming from direct measurements of ${R}_{max}$ and ${R}_{min}$, the model has six free parameters: $\mathrm{\Delta}{\epsilon}_{chrom}$, ${\omega}_{b}$, ${\omega}_{bp}$, ${K}_{b}$, ${K}_{z}$, and ${t}_{MitRep}$. The final calculated transcription rate is then integrated in time to produce a predicted MS2 fluorescence as a function of time (Appendix section 2.2).
For subsequent parameter exploration of this model (Appendix section 5.1), constraints were placed on the parameters to ensure sensible results. Each parameter was constrained to be strictly positive such that:
$\mathrm{\Delta}{\epsilon}_{chrom}>0$
${K}_{b}>0$
${K}_{z}>0$
${\omega}_{b}>0$
${\omega}_{bp}>0$
$0<{t}_{MitRep}<10$.
where an upper limit of 10 min was placed on the mitotic repression term to ensure efficient parameter exploration. This was justified because none of the observed transcriptional onset times in the data were larger than this value (Figure 4D).
2. Inputoutput measurements, predictions, and characterization
2.1 Input measurement methodology
Input transcriptionfactor measurements were carried out separately in individual embryos containing a eGFPBicoid transgene in a bicoid null mutant background (Gregor et al., 2007b) or a ZeldasfGFP CRISPRmediated homologous recombination at the endogenous zelda locus (Hamm et al., 2017). Over the course of nuclear cycle 13, the fluorescence inside each nucleus was extracted (details given in Section 'Image analysis'), resulting in a measurement of the nuclear concentration of each transcription factor over time. Six eGFPBicoid and three ZeldasfGFP embryos were imaged.
Representative fluorescence traces of eGFPBicoid for a single embryo indicate that the magnitude of eGFPBicoid fluorescence decreases for nuclei located toward the posterior of the embryo (Appendix 1—figure 3A). Further, the nuclear fluorescence of eGFPBicoid at 8 min into nuclear cycle 13 (Appendix 1—figure 3B) exhibited the known exponential decay of Bicoid, with a mean decay length of 23.5% ± 0.6% of the total embryo length, consistent with but slightly different than previous measurements that suggested a mean decay length of 19.1% ± 0.8% (Liu et al., 2013). This discrepancy could stem, for example, from minor differences in acquisition from the laserscanning twophoton microscope used in Liu et al., 2013 versus the laserscanning confocal microscope used here, such as differences in axial resolution (due both to different choices of objectives and the inherent differences in axial resolution of onephoton and twophoton fluorescence excitation processes). Nevertheless, the difference was minute enough that we felt confident in our eGFPBicoid measurements.
Intraembryo variability in eGFPBicoid nuclear fluorescence, defined by the standard deviation across nuclei within a single embryo divided by the mean, was in the range of 10–30%, as was the interembryo variability, defined by the standard deviation of the mean amongst nuclei, across different embryos (Appendix 1—figure 3C, blue and black, respectively). Six separate eGFPBicoid embryos were measured.
Similarly, representative fluorescence time traces of ZeldasfGFP for a single embryo are shown in Appendix 1—figure 3D. Unlike the eGFPBicoid profile, the ZeldasfGFP nuclear fluorescence was approximately uniform across embryo position (Appendix 1—figure 3E), consistent with previous fixedtissue measurements (Staudt et al., 2006; Liang et al., 2008). Intraembryo variability in ZeldasfGFP nuclear fluorescence was very low (less than 10%), whereas interembryo variability was relatively higher, up to 20% (Appendix 1—figure 3F, red and black, respectively). Three separate ZeldasfGFP embryos were measured.
Due to the consistency of ZeldasfGFP nuclear fluorescence, we assumed the Zelda profile to be spatially uniform in our analysis, and thus created a mean ZeldasfGFP measurement for each individual embryo by averaging all mean nuclear fluorescence traces in space across the anteriorposterior axis of the embryo (Appendix 1—figure 3D, inset). This mean measurement was used as an input in the theoretical models. However, we still retained interembryo variability in Zelda, as described below.
To combine multiple embryo datasets as inputs to the models explored throughout this work, the fluorescence traces corresponding to each dataset were aligned at the start of nuclear cycle 13, defined as the start of anaphase. Because each embryo may have possessed slightly different nuclear cycle lengths and/or experimental sampling rates (due to the manual realignment of the zstack to keep nuclei in focus), the individual datasets were not combined in order to create average Bicoid and Zelda profiles across embryos. Instead, a simulation and model prediction were performed for each combination of measured input Bicoid and Zelda datasets, essentially an in silico experiment covering a portion of the full embryo length. In all, outputs at each embryo position were predicted in at least three separate simulations. Subsequent analyses used the mean and standard error of the mean of these amalgamated simulations. With six GFPBicoid datasets and three ZeldaGFP datasets, there were 18 unique combinations of input embryo datasets; for a single set of parameters used in a particular model, each derived metric (e.g. ${t}_{on}$) was calculated using predicted outputs from each of the 18 possible input combinations. This procedure provided full embryo coverage and resulted in a distribution of the derived metric for that particular set of parameters. From this distribution, the mean and standard error of the mean were calculated, leading to the lines and shading in plots such as Appendix 1—figure 6.
2.2 MS2 fluorescence simulation protocol
To calculate a predicted MS2 fluorescence trace from measured Bicoid and Zelda inputs for a given theoretical model, we utilized a simple model of transcription initiation, elongation, and termination. First, the dynamic transcriptionfactor concentrations were used as inputs to each of the theoretical models outlined throughout the paper. These models generated a rate of RNAP loading as a function of time and space across the embryo over the course of nuclear cycle 13.
For each position along the anteriorposterior axis, the predicted rate of RNAP loading was integrated over time to generate a predicted MS2 fluorescence trace. Given the known reporter construct length $L$ of 5.2 kb (Garcia et al., 2013), we assume that RNAP molecules are loaded onto the start of the gene at a rate $R(t)$ predicted by the particular model under consideration (Appendix 1—figure 4; see Appendix sections 1.2, 6.1, 7.1, and 8.1 for model details). Each RNAP molecule traverses the gene at a constant velocity v of 1.54 kb/min, as measured experimentally by Garcia et al., 2013. With these numbers, we calculate an elongation time
Finally, we assume that upon reaching the end of the reporter gene, the RNAP molecules terminate and disappear instantly such that they no longer contribute to spot fluorescence.
The MS2 fluorescence signal reports on the number of RNAP molecules actively occupying the gene at any given time and, under the assumptions outlined above, is given by the integral
where $F(t)$ is the predicted fluorescence value, $R(t)$ is the RNAP loading rate predicted by each specific model, $R(t{t}_{elon})$ is the timeshifted loading rate that accounts for RNAP molecules finishing transcription at the end of the gene, and α is an arbitrary scaling factor to convert from absolute numbers of RNAP molecules to arbitrary fluorescence units. The predicted value $F(t)$ was scaled by α to match the experimental data.
The final predicted MS2 signal was modified in a few additional ways. First, any RNAP molecule that had not yet reached the position of the MS2 stem loops had its fluorescence value set to zero (Appendix 1—figure 4, i), since only RNAP molecules downstream of the MS2 stem loop sequence exhibit a fluorescent signal. Second, RNAP molecules that were only partially done elongating the MS2 stem loops contributed a partial fluorescence intensity, given by the ratio of the distance traversed through the stem loops to the total length of the stem loops
where ${F}_{partial}$ is the partial fluorescence contributed by an RNAP molecule within the stem loop sequence region, ${L}_{partial}$ is the distance within the stem loop sequence traversed, and ${L}_{loops}$ is the length of the stem loop sequence (Appendix 1—figure 4, ii). For this reporter construct, the length of the stem loops was approximately ${L}_{loops}=1.28\phantom{\rule{mediummathspace}{0ex}}kb$. RNAP molecules that had finished transcribing the MS2 stem loops contributed the full amount of fluorescence (Appendix 1—figure 4, iii). Finally, to make this simulation compatible with the trapezoidal fitting scheme in Appendix section 2.3, we included a falling signal at the end of the nuclear cycle, achieved by setting $R(t)=0$ after 17 min into the nuclear cycle and thus preventing new transcription initiation events.
Given the predicted MS2 fluorescence trace, the rate of RNAP loading and ${t}_{on}$ were extracted with the fitting procedure used on the experimental data (Appendix section 2.3).
2.3 Extracting initial RNAP loading rate and transcriptional onset time
To extract the initial rate of RNAP loading and the transcriptional onset time ${t}_{on}$ used in the data analysis, we fit both the experimental and calculated MS2 signals to a constant loading rate model, the trapezoidal model (Garcia et al., 2013).
The trapezoidal model provides a heuristic fit of the main features of the MS2 signal by assuming that the RNAP loading rate is either zero or some constant value r (Appendix 1—figure 5A). At time ${t}_{on}$, the loading rate switches from zero to this constant value r, producing a linear rise in the MS2 signal. After the elongation time ${t}_{elong}$, the loading of new RNAP molecules onto the gene is balanced by the loss of RNAP molecules at the end of the gene, producing a plateau in the MS2 signal. Finally, at the end of the nuclear cycle, transcription ceases at ${t}_{off}$ and the RNAP loading rate switches back to zero, producing the falling edge of the MS2 signal and completing the trapezoidal shape. Because we only consider the initial dynamics of transcription in the nuclear cycle in this investigation, we do not explore the behavior of ${t}_{off}$.
Appendix 1—figure 5B shows the results of fitting the mean MS2 fluorescence from a narrow window within a single embryo to the trapezoidal model. With this fit, we can extract the initial rate of RNAP loading (given by the initial slope) as well as ${t}_{on}$ (given by the intercept of the fit onto the xaxis).
As a consistency check, the ${t}_{on}$ values extrapolated from the trapezoidal fit of the data were compared with the experimental time points at which the first MS2 spots were observed for both the wildtype and zelda^{−} mutant experiments (Appendix 1—figure 5C). Due to the detection limit of the microscope, this latter method reports on the time at which a few RNAP molecules have already begun transcribing the reporter gene, rather than a ‘true’ transcriptional onset time. Using the first frame of spot detection yields similar trends to the trapezoidal fits, except that the measured first frame times are systematically larger, especially in the mutant data. Additionally, utilizing the first frame of detection to measure ${t}_{on}$ appears to be a noisier method, likely because the actual MS2 spots cannot be observed below a finite signaldetection limit, whereas the extrapolated ${t}_{on}$ from the trapezoidal fit corresponds to a ‘true’ onset time below the signaldetection limit. For this reason, we decided to rely on the trapezoidal fit to extract ${t}_{on}$, rather than using the first frame of spot detection.
3. Mitotic repression is necessary to recapitulate Bicoid and Zeldamediated regulation of hunchback using the thermodynamic MWC model
As described in Section 'The thermodynamic MWC model fails to predict activation of hunchback in the absence of Zelda' of the main text, a mitotic repression window was incorporated into the thermodynamic MWC model (Appendix section 1.2) in order to explain the observed transcriptional onset times of hunchback. Here, we justify and explain this theoretical modification in greater detail.
Appendix 1—figure 6A and B depicts the experimentally observed initial rates of RNAP loading and ${t}_{on}$ across the length of the embryo (blue points) for the wildtype background. After constraining the maximum and minimum theoretically allowed rates of RNAP loading (Appendix section 1.3), we attempted to simultaneously fit the thermodynamic MWC model to both the rate of RNAP loading and ${t}_{on}$.
The fit results demonstrate that while the thermodynamic MWC model can recapitulate the measured steplike rate of RNAP loading at hunchback (Appendix 1—figure 6A, purple line), it fails to predict the ${t}_{on}$ throughout the embryo (Appendix 1—figure 6B, purple line; see Appendix sections 2.2 and 2.3 for details about experimental and theoretical calculations). This model yields values of ${t}_{on}$ that are much smaller than those experimentally observed, a trend that holds throughout the length of the embryo. This disagreement becomes more evident when comparing the output transcriptional activity reported by the measured MS2 fluorescence with the input concentrations of Bicoid and Zelda. Specifically, the Bicoid and Zelda concentration measurements at 45% along the embryo, shown for a single embryo in Appendix 1—figure 6C, are used in conjunction with the previously mentioned bestfit model parameters to predict the output MS2 signal at the same position. This prediction can then be directly compared with experimental data (Appendix 1—figure 6D, purple line vs. black points, respectively). Although the model predicts that transcription will commence around 1 min after anaphase due to the concurrent increase in the Bicoid and Zelda concentrations, the observed MS2 signal begins to increase around 4 min after anaphase (Appendix 1—figure 6D). As a result, the predicted transcriptional dynamics in Appendix 1—figure 6D are systematically shifted in time with respect to the observed data.
The observed disagreement in ${t}_{on}$ suggests that in this model, transcription is prevented from starting at the time dictated solely by the increase of Bicoid and Zelda concentrations. While we speculate that this effect could stem from processes such as RNAP escape from the promoter, DNA replication at the start of the cell cycle, and postmitotic nucleosome clearance from the promoter, we choose not to commit to a detailed molecular picture and instead ascribe this transcriptional refractory period at the beginning of the nuclear cycle to mitotic repression, the observation that the transcriptional machinery cannot operate during mitosis (Shermoen and O'Farrell, 1991; Gottesfeld and Forbes, 1997; Parsons and Spencer, 1997; Garcia et al., 2013). To account for this phenomenon, we revised our thermodynamic MWC model by stating that hunchback can only read out the inputs and begin transcription after a specified mitotic repression time window following the previous anaphase (Appendix section 1.3).
Since we expect mitotic repression to operate independently of position along the length of the embryo (Shermoen and O'Farrell, 1991), we assumed that the duration of mitotic repression was uniform throughout the embryo. After incorporating a uniform 3 min mitotic repression window into the thermodynamic MWC model (Appendix 1—figure 6C and D, greyshaded region), the model successfully recapitulates ${t}_{on}$ throughout the embryo (Appendix 1—figure 6B and D, blue curves), while still explaining the observed rates of RNAP loading (Appendix 1—figure 6A, blue curve). Thus, once mitotic repression is accounted for, the thermodynamic MWC model based on statistical mechanics can quantitatively recapitulate the regulation of hunchback transcription by Bicoid and Zelda.
4. The effect of the zelda^{−} background on the Bicoid concentration spatiotemporal profile
Our models rest on the assumption that the Bicoid gradient remains unaltered regardless of whether these measurements are made in the wildtype or zelda^{−} backgrounds. To confirm this assumption, we measured eGFPBicoid concentrations in a zelda^{−} background. These flies were heterozygous for eGFPlabeled Bicoid and for wildtype Bicoid, resulting in roughly 50% of total Bicoid being labeled with eGFP. As shown in Appendix 1—figure 7A and B, the resultant eGFPBicoid nuclear fluorescence levels in nuclear cycle 13 in the zelda^{−} background (red) were roughly half the magnitude of the equivalent measurements in the wildtype background (blue), a trend that held both in time and along the embryo. After doubling the heterozygote eGFPBicoid nuclear fluorescence measurements to rescale them (Appendix 1—figure 7B, black), the two eGFPBicoid curves became similar, although the zelda^{−} eGFPBicoid values were systematically lower than in the wildtype background. The normalized difference, defined as the absolute value of the difference between the wildtype and zelda^{−} profiles at each position in the embryo divided by the value of the wildtype profile at the position, averaged across all measured positions, was $15\%\pm 2\%$. This value is within the range of the interembryo variability of eGFP Bicoid in wildtype background embryos (Appendix 1—figure 3C). Measuring the decay length of the eGFPBicoid profile in the zelda^{−} background also yielded a slightly different result: $21\%\pm 1\%$ of the total embryo length, as opposed to $23.5\%\pm 0.6\%$ in the wildtype background (dashed curves, see also Appendix 1—figure 3B).
Having compared the spatial profile of Bicoid in both backgrounds, we then contrasted the dynamics of nuclear Bicoid import. To quantify this analysis, we calculated the time to reach 50% and 90% of the maximum eGFPBicoid fluorescence signal for wildtype and zelda^{−} embryos, at each position along the anteriorposterior axis (Appendix 1—figure 7A, blue and red dashed lines). Because the raw fluorescence signals were noisy enough to confound this calculation, we first smoothed the signals using a moving average filter of ten datapoints (Appendix 1—figure 7A, lines).
Appendix 1—figure 7C and D show the times to reach 50% and 90% of maximum fluorescence for the anterior positions in both embryo backgrounds, where transcription was observed, respectively. In both backgrounds, the 50% and 90% times are similar to within approximately 1 min, indicating that the dynamics of nuclear eGFPBicoid at the start of nuclear cycle 13 are quantitatively comparable. Thus, we concluded that differences in transcription between the two embryo backgrounds do not stem from differences in Bicoid dynamics.
In summary, the dynamics of nuclear Bicoid concentration are quantitatively comparable in both wildtype and zelda^{− }backgrounds, whereas the overall Bicoid concentration is slightly lower in the zelda^{−} case. Nevertheless, these differences in concentration would have a negligible effect on our overall conclusions: in the context of our models, an overall rescaling in the magnitude of the Bicoid gradient between the wildtype and zelda^{−} backgrounds can be compensated by a corresponding rescaling in the dissociation constant of Bicoid, ${K}_{b}$. Because our systematic exploration of theoretical models considers many possible parameter values (Appendix section 5.1), this rescaling has no effect on our conclusion that the equilibrium models are insufficient to explain the zelda^{−} data. As a result, and given that our statistics for the wildtype eGFPBicoid data consisted of more embryos than the data for the zelda^{−} background, we used this wildtype data in our analyses as an input to both the wildtype and zelda^{−} model calculations.
5. Statespace exploration of theoretical models
5.1 General methodology of statespace exploration
To help visualize the limits of our models, we collapsed our observations onto a threedimensional state space, following a method similar to that described in Estrada et al., 2016. In this space, the xaxis was the average ${t}_{on}$ delay. This magnitude was computed by integrating the ${t}_{on}$ across 20% to 37.5% of the embryo length, corresponding to the range in which both wildtype and zelda^{−} experiments exhibited transcription in at least 30% of observed nuclei (Appendix 1—figures 8A and 5A, as defined in Equation 5). The offset in ${t}_{on}$ at 20% embryo length (Equation 4) was the yaxis in the state space. The zaxis was given by the average initial rate of RNAP loading between 20% and 37.5% of the embryo length (Appendix 1—figure 8B).
Combined, the average ${t}_{on}$ delay, ${t}_{on}$ offset, and average initial RNAP loading rate provide a simplified description of our data as well as of our theoretical predictions. Each theoretical model inhabits a finite region in this threedimensional state space, which we can calculate by systematically varying model parameters. Appendix 1—figure 8A and B show an example of how the three parameters are calculated using the zelda^{−} background data presented in Figure 4C and D (red points) in the main text.
Due to the large number of parameters in each model explored, the corresponding statespace boundaries were generated by efficient sampling of the underlying highdimensional parameter space. Although in actuality the state space contained three dimensions, we illustrate the sampling process here with a twodimensional example, using only the offset and average delay in transcriptional onset time, for ease of visualization (Appendix 1—figure 8C). The methodology is similar to the one described in Estrada et al., 2016. Briefly, a starting set of 50 points was generated, each with a randomized set of initial parameters, the specifics of which depended on the model being tested (Appendix 1—figure 8Ci). The state space was sectioned into 100 slices along each orthogonal axis (Appendix 1—figure 8Cii). The most extremal points in each slice were found, resulting in two extremal values each for the ${t}_{on}$ offset and average ${t}_{on}$ delay (Appendix 1—figure 8Ciii). For each of these points, a new set of five points was generated using random parameters within a small neighborhood of the seed points determined by the extremal points of the previous iteration (Appendix 1—figure 8Civ). These new points were plotted; some of these points may be more extreme than the previous set of points. Steps iiiv were iterated, resulting in a growing boundary over time (Appendix 1—figure 8Cv). This algorithm was run in the full threedimensional state space, where 100 threedimensional columns along the orthogonal xy, yz, and xzplanes were used instead of twodimensional slices.
Constraints imposed by the data were used to filter unrealistic results and ensure rapid convergence of the algorithm. First, if the simulated average ${t}_{on}$ delay was less than −0.5 min or greater than 2 min, the point was filtered out. This removal was justified experimentally, since none of the observed average ${t}_{on}$ delays were outside of this range (Figure 5A). Second, if the simulated average initial loading rate was smaller than 1 AU/min or greater than 4 AU/min, the point was also filtered out. This was also justified experimentally, since none of the observed initial RNAP loading rates between 20% and 37.5% embryo position lay outside this range (Figure 4C). Points that fulfilled these constraints were retained for the next iteration of the algorithm. This process was repeated until the resulting space of points no longer grew appreciably, resulting in an estimate of the size and shape of the state space for each of the models presented in Appendix sections 1.2, 6.1, 7.1, and 8.1.
To determine whether the algorithm had indeed converged, the total volume of each model’s region in state space was tracked with each iteration number. If the algorithm worked well, then this volume would approach some maximum value. Appendix 1—figure 8D shows the volume of the state space corresponding to each model presented in this work (normalized by the volume at the final iteration number) as a function of the iteration number. Each model converged to a finite value, indicating that the parameter space occupied by the models had been thoroughly explored.
5.2 State space exploration with the thermodynamic MWC model
Appendix 1—figure 9A and Appendix 1—video 1 show the resulting threedimensional state space for the thermodynamic MWC model (green), as well as all of the theoretical models considered here. We plotted the wildtype and zelda^{−} data on the same state space, represented as small ellipsoids of uncertainty. Any successful model must occupy a region that overlaps both the wildtype and zelda^{−} data.
As shown in Appendix 1—figure 9A and Appendix 1—video 1, the state space corresponding to the thermodynamic MWC model fails to overlap with the zelda^{−} data. To more clearly reveal this disagreement, this threedimensional state space was projected onto the xyplane, the space incorporating the average ${t}_{on}$ delay and ${t}_{on}$ offset information. To do this projection, we noticed that both the wildtype and zelda^{−} data only occupied average initial loading rate values between 2.5 AU/min and 3.6 AU/min (Appendix 1—figure 9A and Appendix 1—video 1). As a result, only points in that range of initial loading rates were retained for the projection. The resulting twodimensional representation of our exploration is shown in Appendix 1—figure 9B. Even in this simplified representation, the failure of the thermodynamic MWC model (Appendix 1—figure 9B, green) is evident. Therefore, we utilized this representation throughout the main text and Appendix (Figures 5C, 6B and 7D and Appendix 1—figures 13 and 14C).
6. Failures and assumptions of thermodynamic models of transcription
6.1 Generalized thermodynamic model
The generalized thermodynamic model is an extension of the thermodynamic MWC model presented in Appendix section 1.2. For extra generality, we assume the presence of twelve Bicoid binding sites and one RNAP binding site, but do not include the action of Zelda since the objective was to attempt to recapitulate the zelda^{−} mutant experimental data. We still allow for an inaccessible DNA state.
In this generalized model, the weight of each microstate can be arbitrary, rather than determined by underlying biophysical parameters. Since ${p}_{bound}$ only depends on whether RNAP is bound, there is no need to distinguish between different microstates that have the same number of Bicoid molecules bound: the arbitrary coefficients allow separate microstates to effectively be combined together into the same weight. Thus, each microstate corresponds only to the overall number of bound molecules, regardless of binding site ordering. With 12 Bicoid sites, in addition to the inaccessible state, there are 27 total microstates and 26 free parameters describing the weights of each state (with the accessible, unbound microstate normalized to unity). Like with the thermodynamic MWC model, we assume that transcription only occurs when RNAP is bound, with the same constrained maximum rate of RNAP loading ${R}_{max}$. However, since the weights of each microstate are arbitrary, we no longer have a variable p that can be constrained by ${R}_{min}$ like in Appendix Equation 14.
This generalized model is much more powerful than the thermodynamic MWC model due to a lack of coupling between individual microstate weights. Whereas in the previous model the underlying parameters ${K}_{b}$ and ${\omega}_{b}$ caused similar microstates to be related mathematically, now the statistical weights for each microstate are completely independent. Physically, this scenario can arise due to, for example, higherorder cooperativities or nonidentical binding energies between binding sites (Estrada et al., 2016).
The partition function in this generalized thermodynamic model is given by the polynomial
where ${p}_{inacc}$ is the weight of the inaccessible state and ${P}_{r,n}$ is the weight of the accessible state with r RNAP molecules bound and n Bicoid molecules bound. The overall transcriptional initiation rate is now
where ${P}_{1,n}$ is the statistical weight of each RNAPbound state and $R$ is the corresponding rate of transcriptional initiation. Note that, as described above, $R$ is still equal to ${R}_{max}$, the constraint described in Appendix section 1.3, but we no longer use the ${R}_{min}$ constraint.
The resulting rate of transcriptional initiation is integrated over time to produce a simulated MS2 fluorescence trace using the same procedure as for the models presented in Appendix sections 1.2, 7.1, and 8.1 (see Appendix section 2.2 for details). As with the thermodynamic MWC model, we allow for a mitotic repression time window to account for the lack of transcription early in the nuclear cycle.
6.2 Generalized thermodynamic model state space exploration
Due to the highdimensional parameter space of the generalized thermodynamic model, constraints were necessary to efficiently explore this parameter space (Appendix section 5.1). These constraints were placed on the values of the individual microstate weights ${P}_{r,n}$, based on dimensional analysis and heuristic arguments. Specifically, each weight ${P}_{r,n}$ is derived from a product of binding constants ${K}_{d}$ for either Bicoid or RNAP, pairwise cooperativity parameters ω, and higher order cooperativity terms. For the purposes of these parameter constraints, we only consider the $K}_{ds$ and ωs, and ignore constraints on higherorder cooperativities. In principle, each Bicoidbinding site possesses a unique ${K}_{d}$ and proteinprotein interaction terms ω with other Bicoid molecules and/or with RNAP. However, as described below, these biophysical parameters, once nondimensionalized, can be constrained to reasonable values by scaling relations through a simple bounding scheme.
For illustrative purposes, consider the microstate with RNAP and one Bicoid molecule bound. Its weight depends on two independent binding constants $p,b$ and a cooperativity term between RNAP and Bicoid ${\omega}_{bp}$. First, we assume that the $p,b$ terms are nondimensionalized, that is they take the form $p=[RNAP]/{K}_{p}$ and $b=[Bicoid]/{K}_{b}$. Although the two individual $p,b$ terms are in principle different since RNAP and Bicoid have can different binding energies, we can be generous about the constraints and assume that the nondimensionalized forms are both bounded below and above by 0 and 1000, respectively. This strategy is justified by assuming that neither RNAP nor Bicoid exist in concentrations three orders of magnitude above their dissociation constants, and do not exist at negative concentrations (Estrada et al., 2016). Similarly, we can be generous about any possible cooperativities and say that ${\omega}_{bp}$ and ${\omega}_{b}$ have a similar bound between 0 and 1000, thus accounting for both positive and negative cooperativities. For this state with RNAP and one Bicoid molecule bound, we can say that
which has bounds
and thus provide a bound for the possible values that the weight ${P}_{1,1}$ can take.
In general, this process can be applied to enforce bounds on any microstate weight ${P}_{r,n}$ through constraining of the possible values of p, b, ${\omega}_{bp}$, and ${\omega}_{b}$. As a result, the weight of a microstate with more Bicoid bound (i.e. higher values of n) will have a more generous dynamic range, due to the larger powers of b and ${\omega}_{b}$. In this way, exploration of parameter space can be made more constrained by restricting the possible values of the microstate weights ${P}_{r,n}$. In addition, the mitotic repression term was constrained like in the thermodynamic MWC model, where $0<{t}_{MitRep}<10$.
As a result of these constraints, the region occupied by the generalized thermodynamic model in the ${t}_{on}$ offset and average ${t}_{on}$ delay space does not entirely include that of the thermodynamic MWC model, whose parameters were only constrained to be positive values (Appendix section 1.3). Nevertheless, this model still fails to capture the delays observed in the zelda^{−} data (Appendix 1—figure 9B, yellow).
6.3 Extended generalized thermodynamic model with transcription factor binding in the inaccessible state
The generalized thermodynamic model (Appendix section 6.1) encompasses all possible thermodynamic models with up to twelve Bicoidbinding sites that can be bound in the accessible state. However, a potentially more general class of models involves those where Bicoid can also bind to the inaccessible state. For example, Bicoid action could conceivably result in some pioneering activity by directly binding to chromatin in the inaccessible state and facilitating RNAP binding and transcription. Here, we show that these models can be reformulated into the generalized thermodynamic model presented above.
If we allow for Bicoid to bind to any of the 12 binding sites in the inaccessible states, then we introduce l new microstates with individual Boltzmann weights ${P}_{l}$, one for each Bicoidbound inaccessible state, in addition to the unbound inaccessible state with weight ${P}_{inacc}$. Nevertheless, as long as the ensuing transcription rate of each Bicoidbound inaccessible state is zero, then the net effect of these additional inaccessible states could simply be described by a single effective inaccessible state with Boltzmann weight ${P}_{inacc}^{\prime}={P}_{inacc}+{\sum}_{l}{P}_{l}$. The resulting state space exploration (Section 'No thermodynamic model can recapitulate the activation of hunchback by Bicoid alone' and Figure 5C, yellow), which explores the whole parameter space of reasonable values of ${P}_{inacc}$, would thus also capture the behavior of this single effective inaccessible state. As a result, models that consider the binding of Bicoid to the inaccessible states are contained within our generalized thermodynamics model.
6.4 Investigation of the failure of thermodynamic models
Here, we provide an intuitive explanation for why thermodynamic models fail to recapitulate the delay in ${t}_{on}$ for $zeld{a}^{}$ embryos. The combination of the occupancy hypothesis and the assumption of separation of times scales described in Appendix section 6.5 imply that the rate of transcriptional initiation at any moment in time is an instantaneous readout of the Bicoid concentration at that time point. Thus, any thermodynamic model is memoryless. Intuitively, this means that a thermodynamic model requires transcription to begin as soon as the Bicoid concentration crosses a certain ‘threshold’ since time delays between input and output require some sense of memory. Examination of the dynamic measurements of MS2 output in $zeld{a}^{}$ embryos reveals that no matter what ‘threshold’ concentration of Bicoid is assigned for the start of transcription, the model cannot simultaneously describe two values of ${t}_{on}$ corresponding to different positions along the anteriorposterior axis (Appendix 1—figure 10A and B).
Another selfconsistency check of a thermodynamic model is to examine the concentration of Bicoid at ${t}_{on}$ for various positions along the embryo. Due to the memoryless nature of thermal equilibrium, a valid thermodynamic model predicts that, at different positions along the embryo, ${t}_{on}$ will occur when Bicoid reaches the same threshold value. For the $zeld{a}^{}$ data, however, the level of Bicoid at each anteriorposterior position’s ${t}_{on}$ value actually decreases with increasing ${t}_{on}$, suggesting the failure of the thermodynamic model (Appendix 1—figure 10C). Thus, the strong positiondependent delay in ${t}_{on}$ for the $zeld{a}^{}$ data cannot be explained by an instantaneous Bicoid readout mechanism.
More generally, the memoryless nature of thermodynamic models implies that, given any inputoutput function that increases monotonically with Bicoid and Zelda concentration, the ensuing onset time of transcription cannot be later than the time at which Bicoid or Zelda reach their maximal values. This is a reflection of a generic feature of thermodynamic models, namely that only instantaneous couplings in time can exist, and that time delays are impossible (Coulon et al., 2013; Wong and Gunawardena, 2020). By inspecting the nuclear concentrations of Bicoid and Zelda in Appendix 1—figure 3, we notice that times of maximal nuclear concentration for both transcription factors all occur around 4.5 min. This time is much earlier than the delayed transcriptional onsets exhibited in the zelda^{−} data (Figure 4D, red points), providing further evidence for the unsuitability of thermodynamic models in describing the observed delay in the transcriptional onset time along the anteriorposterior axis of the embryo.
6.5 Reexamining thermodynamic models of transcriptional regulation
Thermodynamic models based on equilibrium statistical mechanics can be seen as limiting cases of more general kinetic models. For example, consider simple activation, where an activator whose concentration is modulated in time regulates transcription by binding to a single site (Appendix 1—figure 11). In this generic model, the presence of activator can modulate the rates of activator and RNAP binding and unbinding through the parameters α, β, γ, and δ.
In order to reduce kinetic models to thermodynamic models where the probabilities of each state are dictated by Boltzmann weights such as those in Figure 2A, four conditions must be fulfilled. First, the rate of mRNA production must be linearly related to the probability of finding RNAP bound to the promoter (Appendix 1—figure 11i). This occupancy hypothesis is necessary for Appendix Equation 2 to hold. Second, the time scales of binding and unbinding of RNAP and transcription factors must be much faster than the time scales of the concentration dynamics of these proteins (Appendix 1—figure 11ii). Third, these time scales must also be much faster than the rate of transcriptional initiation and mRNA production (Appendix 1—figure 11iii). Under these conditions of separation of time scales, the binding and unbinding of proteins quickly reaches steady state while the overall concentrations of these molecular players are modulated (Segel and Slemrod, 1989). Fourth, there must be no energy input into the system (Appendix 1—figure 11iv). This condition demands ‘detailed balance’ (Vilar and Leibler, 2003; Ahsendorf et al., 2014; Hill, 1985): the product of state transition rates in the clockwise direction over a closed loop is equal to the product going in the counterclockwise direction, a constraint known as the cycle condition (Estrada et al., 2016). In the case of Appendix 1—figure 11, this requirement implies that
If these four conditions are met, then the system is effectively in equilibrium and the various binding states adopt probabilities that can be calculated using equilibrium statistical mechanics.
7. Nonequilibrium MWC model
7.1 Nonequilibrium MWC model
The nonequilibrium MWC model is an extension of the thermodynamic MWC model presented in Appendix section 1.2, where we now relax the assumption of separation of time scales (Appendix 1—figure 11ii and iii) and make it possible to assume, for example, that the system responds instantaneously to changes in activator concentration. Here, we explicitly simulate the full system of ordinary differential equations (ODEs) that describe the dynamics of the system out of steady state. Additionally, we allow for energy to be expended and thus do not enforce detailed balance through the cycle condition (Appendix 1—figure 11iv). We still employ a mitotic repression window term, before which no transcription is allowed.
We consider a generic model with n Bicoid binding sites, and again ignore Zelda since we are only interested in recapitulating the zelda^{−} mutant data. As a result, this new model has $n+1$ total binding sites which, together with the closed chromatin state, results in a total of ${2}^{n+1}+1=N$ microstates. In the case of six Bicoid binding sites, this results in $N=129$ total microstates. We assign each microstate x_{i} a label i and describe the transition rate from state j to state i using ${k}_{ij}$, where $i,j$ range from 0 to $N1$, inclusive.
In matrix notation, we write the system of ODEs as
where $\overrightarrow{X}$ is a vector containing the fractional occupancy of each microstate x_{i} and $K$ is a matrix containing all the transition rates ${k}_{ij}$. Normalizing such that the sum of all the components in the vector $\overrightarrow{X}$ is unity, we now have a vector representing the instantaneous probability of being in each microstate.
To relate the occupancies of the different states to the rate of transcriptional initiation, we retain the occupancy hypothesis presented earlier: that ${p}_{bound}$, the probability of being in a microstate with a bound RNAP molecule, is linearly related to the overall average transcriptional initiation rate that we determine from experimental measurements.
For this particular system, it is helpful to define an intuitive microstate labeling system. Because the relevant physical processes are the binding and unbinding of Bicoid and RNAP molecules, we can represent any microstate in binary form, where the total number of digits is the total number of binding sites $n+1$, and each digit represents an individual binding site. Our convention is to assign the first digit to the promoter, and the subsequent ones to the Bicoid sites. By assigning 0 to an unbound site and one to a bound site, we can rewrite each unique microstate’s label i in binary form. For example, for a model with six Bicoid sites, the label for the microstate with no RNAP bound and the first two Bicoid sites occupied is represented with
Here, bin() indicates taking the base 2 value of the binary label in the parentheses. The closed chromatin state is added manually and assigned to the last position in our binary label, ${x}_{N1}$. This convention allows us to intuitively define each unique label for the system’s microstates and provides a way to map the physical contents of a microstate with its associated label i.
In general, the overall transition matrix $K$ can be very complex. However, we benefit from the fact that the only nonzero transitions ${k}_{ij}$ are the ones that correspond to physical processes: modifying the open/closed chromatin state, and binding and unbinding of Bicoid or RNAP molecules. In this binary notation, these constraints imply that the only nonzero transitions are the ones that represent individual flips between 0 and 1, as well as between the open and closed states 0 and $N1$. The transition matrix $K$ is then easier to write, since it is clear from the binary representation which transitions must be nonzero. Finally, diagonal elements ${k}_{ii}$ are entirely constrained because they represent probability loss from a particular state i, and must be equal to the negative of the rest of the column i, such that the sum over each column in $K$ is zero.
Given that the Bicoid concentration changes as a function of time and that we assume firstorder binding kinetics, whichever rates ${k}_{ij}$ correspond to Bicoid binding rates must be multiplied by this timedependent nuclear concentration. In contrast, all offrates are independent of Bicoid concentration. To keep subsequent parameter exploration simple, we nondimensionalized the Bicoid concentration by rescaling it by its approximate scale. This was achieved by dividing all Bicoid concentrations by the average Bicoid concentration, calculated by averaging the mean Bicoid nuclear fluorescence across all datasets, anteriorposterior positions, and time points, yielding approximately 35 arbitrary fluorescence units. Thus, all the transition rates ${k}_{ij}$ in the model here are expressed in units of inverse minutes.
To model transcription specifically, we assumed that at the beginning of the nuclear cycle, the system is in the closed chromatin state: ${x}_{i}(t=0)=0$ except for the closed chromatin state ${x}_{N1}(t=0)=1$. We simulated the full trajectory of all the microstates x_{i} over time by solving the system of ODEs given in Equation 22. Finally, we calculated ${p}_{bound}$ by summing the x_{i}’s that correspond to RNAPbound states, and then computed the subsequent transcriptional initiation rate by multiplying ${p}_{bound}$ with the transcription rate $R$. Here, $R$ is the same ${R}_{max}$ as in Appendix sections 1.2 and 6.1 but again we do not constrain the model using ${R}_{min}$, just as in Appendix section 6.1.
Appendix 1—figure 12A shows an example of this model for a system with only one Bicoid binding site and no closed chromatin state, for simplicity, resulting in a fourstate network. The binary indexing labels (shown beneath each state in light pink) can be converted into the base10 labels (light teal) ranging from 0 to 3. The connection matrix for this system is
and the corresponding transition rate matrix $K$ is
where, in this example, ${k}_{02}$ represents the transition rate from state j to state i. The diagonal elements ${k}_{ii}$ are equal to the negative of the sum of the elements in the rest of the column in order to preserve conservation of probability. For example, ${k}_{00}=({k}_{10}+{k}_{20}+{k}_{30})$.
With all this information in hand, we solve for the occupancy of each of the four states using the matrix ODE
In this case, the occupancy hypothesis relates ${p}_{bound}$ to the overall transcription rate, resulting in
This model can produce timedependent behavior not found in the thermodynamic models. Appendix 1—figure 12B contains an example of a hypothetical input Bicoid activator concentration that switches instantaneously from zero to a finite value. In the thermodynamic models, the predicted transcriptional initiation rate also responds instantaneously (Appendix 1—figure 12B, top). In contrast, for a suitable set of parameters, the nonequilibrium MWC model predicts a slow response over time (Appendix 1—figure 12B, bottom).
To produce a simulated MS2 fluorescence trace, the resulting rate of mRNA production is integrated over time using the same procedure (Appendix section 2.2) as the models presented in Appendix sections 1.2, 6.1, and 8.1. As with the thermodynamic MWC model, we allow for a time window of mitotic repression to account for the lack of transcription early in the nuclear cycle. Specifically, this was implemented by allowing the system to evolve over time, but fixing transcription to zero ($R=0$) until after the mitotic repression time ${t}_{MitRep}$. An alternative formulation of the model, in which the whole system is frozen such that no transitions between states are allowed until after ${t}_{MitRep}$, is discussed below in Appendix section 7.3.
7.2 Nonequilibrium MWC model state space exploration
In the parameter exploration of this model (Appendix section 5.1), the transition rates ${k}_{ij}$ were constrained with minimum and maximum values of ${k}_{min}=1$ and ${k}_{max}={10}^{5}$ respectively, in units of inverse minutes. These bounds were conservatively chosen using the following estimates. First, we estimate the values of the possible unbinding rates ${k}_{off}$. We assume that RNAP and Bicoid obey the same unbinding kinetics. Estimates of in vivo singlemolecule binding kinetics inferred from Mir et al., 2018 indicate that the lifetime of Bicoid on DNA is on the order of $3\phantom{\rule{mediummathspace}{0ex}}{s}^{1}$. Second, we estimate the values of the possible onrates ${k}_{on}$ using the classic BergPurcell equation for the case of a diffusionlimited binding to a perfectly absorbing spherical receptor (Berg and Purcell, 1977). In this case, the onrate of molecule binding is given by
where $D$ is the diffusion coefficient of the molecule, a is the estimated size of the spherical receptor, and ${c}_{0}$ is the background concentration of the molecular species. Since here we are talking about transcription factor binding to a Bicoid binding site, we assume a to be on the order of 5 nm. We assume that RNAP and Bicoid obey the same diffusion characteristics, leading to a diffusion coefficient of approximately $0.3\phantom{\rule{mediummathspace}{0ex}}\mu {m}^{2}{s}^{1}$(Gregor et al., 2007b). Finally, Bicoid is is present at concentrations between 10 nM and 55 nM in the nucleus (Gregor et al., 2007a), and we assume that nuclear RNAP concentrations exist within the same range. Plugging these values into Appendix Equation 28 yields estimates for the maximum and minimum onrates:
and
Thus, our maximum and minimum transition rate bounds of $k}_{min}=1\phantom{\rule{mediummathspace}{0ex}}mi{n}^{1$ and $k}_{max}={10}^{5}\phantom{\rule{mediummathspace}{0ex}}mi{n}^{1$ lie outside these estimated binding and unbinding rates. The mitotic repression term was constrained like in the thermodynamic MWC model, where $0<{t}_{MitRep}<10$.
One caveat of the statespace exploration approach is that the high dimensionality of the nonequilibrium MWC model prevented us from calculating the full statespace boundary using six Bicoidbinding sites. Due to computational costs, we were only able to accurately produce a statespace boundary for this model (Appendix section 7.1) using five Bicoidbinding sites. Running the exploration for a model with six Bicoidbinding sites took over 2 weeks on our own server, and the algorithm had not noticeably converged in the end.
The results of the state space exploration for the nonequilibrium MWC model using five Bicoidbinding sites resulted in larger average ${t}_{on}$ delays than the thermodynamic models (Appendix sections 1.2 and 6.1). However, this model, like those, failed to reproduce the delays observed in the zelda^{−} data (Appendix 1—figure 9B, cyan).
Interestingly, the total areas covered by each nonequilibrium MWC model did not monotonically increase with Bicoidbinding site number (Figure 6B). This phenomenon where the state space of a model does not strictly increase with binding site number has been previously observed (Estrada et al., 2016) and the reason for this effect remains uncertain.
7.3 Alternative nonequilibrium MWC model with strong mitotic repression
In the main text, we entertained a nonequilibrium MWC model where mitotic repression blocks any productive transcription ($R=0$) until the mitotic repression window ${t}_{MitRep}$ has passed. Before this time, in this model, the system can nevertheless transition through its different states over time.
In an alternate formulation of this nonequilibrium MWC model, we consider a form of mitotic repression that we call strong mitotic repression. Here, the system itself is frozen in the initial inaccessible state and not allowed to evolve until after ${t}_{MitRep}$. After ${t}_{MitRep}$, the system evolves through time according to the same rules as the original nonequilibrium MWC model.
Repeating the state space exploration for this model, for up to five Bicoid binding sites, yielded similar conclusions. Namely, the model could not describe the average delay and offset in transcriptional onset time in the absence of Zelda (Appendix 1—figure 13). The intuition behind this is that, while this stronger form of mitotic repression could potentially achieve longer delays, the crucial feature of the zelda^{−} data is not merely a delayed transcription onset time, but a positiondependent delay that increases towards the posterior of the embryo. This stronger form of mitotic repression does not result in a mechanism capable of achieving such delay. In contrast, the final transcriptionfactordriven model (Appendix section 8.1) does provide such a mechanism by coupling the inaccessibletoaccessible transition to the positiondependent Bicoid gradient.
8. Transcriptionfactordriven model of chromatin accessibility
8.1 Transcriptionfactordriven model of chromatin accessibility
The transcriptionfactordriven model of chromatin accessibility is a slight modification of the thermodynamic MWC model (Appendix section 1.2) that replaces the MWC mechanism of chromatin transitions with a direct driving action due to Bicoid and Zelda. Here, we retain the idea of inaccessible vs. accessible states, but no longer demand that these states be in thermodynamic equilibrium. Instead, the system begins in the inaccessible state and undergoes a series of m identical, slow, and effectively irreversible transitions to the accessible state. Once these transitions into the accessible state occur, the system can rapidly and reversibly occupy all of its accessible microstates such that the probability of the system being in any of these microstates is described by thermodynamic equilibrium. The accessible states are governed by the same rules and parameters as the thermodynamic MWC model (Appendix section 1.2), albeit without the $\mathrm{\Delta}{\epsilon}_{chrom}$ parameter since now the transition from the inaccessible to accessible state is unidirectional.
We consider two possible contributions for these irreversible transitions: a Bicoiddependent pathway and a Zeldadependent pathway (Appendix 1—figure 15A, see Appendix section 8.2 for a discussion on this choice of parameterization). We assume the transition rates to be firstorder in Bicoid and Zelda, respectively, such that
and
Here, ${\pi}_{b}$ is the Bicoiddependent contribution to the transition rates and ${\pi}_{z}$ is the corresponding Zeldadependent contribution. There are two input parameters c_{b} and c_{z} that give the relative speed of each transition rate contribution. The overall rate π of each irreversible transition is given by the sum
Because the accessible states are in thermodynamic equilibrium with each other, we can effectively treat them as a single state and describe the entire system with $m+1$ states, corresponding to the inaccessible, intermediate, and accessible states. We label the inaccessible state with 0, the $m1$ intermediate states with one through $m1$, and the final accessible state with m. Thus, we describe the probability p_{i} of the system being in the state i with the probability vector $\overrightarrow{P}$
Calculating the overall RNAP loading rate then simply corresponds to rescaling ${p}_{bound}$ with the overall probability ${p}_{m}(t)$ of being in the accessible state:
where $R$ is the same maximum rate used in Appendix section 1.2. Note that ${p}_{m}(t)$ is a timedependent quantity that changes over time. To calculate ${p}_{m}(t)$, we solve the corresponding system of ODEs that describes the time evolution of $\overrightarrow{P}$.
where $\mathrm{\Pi}$ is the transition rate matrix describing the time evolution of the system. $\mathrm{\Pi}$, by definition, is a square matrix with dimension $m+1$. Given the initial condition that the system begins in the inaccessible state
the system of ODEs can be solved to find the probability of being in the accessible state ${p}_{m}(t)$. For example, for $m=3$ irreversible steps, $\mathrm{\Pi}$ takes the form
where π is given by Appendix Equation 31.
For simplicity, the time evolution of $\overrightarrow{P}$ was solved using MATLAB’s ode15s solver.
With the probability ${p}_{m}(t)$ of the system being in the accessible state calculated, we now calculate the probability ${p}_{bound}$ of RNAP bound to the promoter in the accessible states, which lie in thermodynamic equilibrium with each other. Because we now only have accessible states, the partition function is
where z, p, and b correspond to the nondimensionalized concentrations of Zelda, RNAP, and Bicoid, respectively, and ${\omega}_{b}$ and ${\omega}_{bp}$ are the cooperativities between Bicoid molecules and between Bicoid and RNAP, respectively. Thus, the overall transcriptional initiation rate is given by
Due to the lack of the inaccessible state in the partition function and because we assume that Zelda does not directly interact with Bicoid or RNAP, now the presence of Zelda mathematically separates out so that only Bicoid influences transcription. The calculation above is a standard equilibrium statistical mechanical calculation, except that we have weighted the final result with ${p}_{m}(t)$, the probability of being in the accessible states. The resulting rate is integrated to produce a simulated MS2 fluorescence trace using the same procedure (Appendix section 2.2) as the models presented in Appendix sections 1.2, 6.1, and 7.1.
Interestingly, we found that a mitotic repression term was not necessary to recapitulate the data, since the presence of intermediary states produced the necessary delay to explain the experimentally observed ${t}_{on}$ values in the data (Figure 4D, points).
In order to sufficiently explain the data, we found that a minimum of $m=3$ irreversible steps was necessary. Appendix 1—figure 14A and B show the results of fitting this model to the observed rates of RNAP loading and ${t}_{on}$ for the wildtype and zelda^{−} data, for increasing values of m (wildtype results not shown, since all values of m easily explained the wildtype data). We see that while lower values of m do a poor job of recapitulating the data, once we reach $m=3$ the model sufficiently predicts the experimental data within experimental error. For values of m higher than 3, explanatory power increases marginally. Considering the parameter exploration of this model (Appendix section 8.3) highlights the necessity of having at least $m=3$ steps.
8.2 Exploring alternatives to the additive transcriptionfactordriven transition rate
In Appendix section 8.1, we defined the transition rate between the transcriptionally silent states in our transcriptionfactordriven model of chromatin accessibility as
Here, we assumed that Zelda and Bicoid operate independently and in parallel to catalyze the transitions from the inaccessible to accessible state (Appendix 1—figure 15A). Our choice in using two independent Zelda and Bicoidmediated transitions was primarily motivated by the fact that, to our knowledge, no direct interactions between Bicoid and Zelda have been reported to date. However, this is not the only possible choice of model formulation. Here, we discuss and rule out two alternative mechanisms of Zelda and Bicoidmediated transitions from the inaccessible to accessible state.
As a first alternative, instead of an independent and additive mechanism, we could imagine a scenario where Bicoid and Zelda act simultaneously (Appendix 1—figure 15B). Here, each stochastic transition is given by
where c is some constant with units of ${[Bicoid]}^{1}{[Zelda]}^{1}mi{n}^{1}$.
In a second alternative, Bicoid and Zelda could act sequentially. Here, each stochastic transition contains an intermediary state (Appendix 1—figure 15C). In this case, the transition rate will be dependent on Bicoid and Zelda such that
where ${c}_{1}$ and ${c}_{2}$ are constants with units of ${[Bicoid]}^{1}mi{n}^{1}$ and ${[Zelda]}^{1}mi{n}^{1}$.
One critical experimental observation is that transcription occurs even in the absence of Zelda, albeit at a delayed capacity. Since removing Zelda would set π to zero in these alternative models, transcription would not occur at all, and so both of the proposed alternative mechanisms can be ruled out. More generally, the existence of transcription in the absence of Zelda requires that there must exist some independence between BicoidandZeldamediated transitions from the OFF to the ON state. Otherwise, no transition, and hence no transcription, could occur in the absence of Zelda.
8.3 Transcriptionfactordriven model of chromatin accessibility state space exploration
In the parameter exploration of this model (Appendix section 5.1), the parameters were constrained as
${c}_{b}>0$
${c}_{z}>0.$
The parameters shared with the thermodynamic MWC model retained the constraints described in Appendix section 1.3.
Appendix 1—figure 14C shows the state space explorations (see Appendix section 5.1) of this transcriptionfactordriven model for increasing numbers of intermediate steps m. Not until $m=3$ does the model explain the both the wildtype and zelda^{−} data, indicating that $m=3$ is the minimum number of irreversible steps necessary. In the state space exploration shown in Figure 7D and Appendix 1—figure 9, the number of irreversible steps was fixed at $m=3$.
Unlike the other models investigated (Appendix sections 1.2, 6.1, and 7.1), the transcriptionfactordriven model of chromatin accessibility occupied a region in state space that encompassed both the wildtype and zelda^{−} data (Appendix 1—figure 9, purple).
References
 1

2
Binding of disparate transcriptional activators to nucleosomal DNA is inherently cooperativeMolecular and Cellular Biology 15:1405–1421.https://doi.org/10.1128/MCB.15.3.1405
 3
 4

5
Sensitivity of OR in phage lambdaBiophysical Journal 86:58–66.https://doi.org/10.1016/S00063495(04)740837

6
Physics of chemoreceptionBiophysical Journal 20:193–219.https://doi.org/10.1016/S00063495(77)855446
 7

8
Localization of ASH1 mRNA particles in living yeastMolecular Cell 2:437–445.https://doi.org/10.1016/S10972765(00)801434

9
Transcriptional regulation by the numbers: applicationsCurrent Opinion in Genetics & Development 15:125–135.https://doi.org/10.1016/j.gde.2005.02.006

10
Transcriptional regulation by the numbers: modelsCurrent Opinion in Genetics & Development 15:116–124.https://doi.org/10.1016/j.gde.2005.02.007
 11
 12
 13
 14

15
Tuning promoter strength through RNA polymerase binding site design in Escherichia coliPLOS Computational Biology 8:e1002811.https://doi.org/10.1371/journal.pcbi.1002811
 16
 17
 18
 19

20
Eukaryotic transcriptional dynamics: from single molecules to cell populationsNature Reviews Genetics 14:572–584.https://doi.org/10.1038/nrg3484
 21

22
The role of DNA sequence in nucleosome breathingThe European Physical Journal E 40:106.https://doi.org/10.1140/epje/i2017115962

23
Precision of readout at the hunchback gene: analyzing short transcription time traces in living fly embryosPLOS Computational Biology 12:e1005256.https://doi.org/10.1371/journal.pcbi.1005256
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33

34
Estrogendependent control and celltocell variability of transcriptional burstingMolecular Systems Biology 14:e7678.https://doi.org/10.15252/msb.20177678

35
The coactivator CREBbinding protein participates in enhancerdependent activities of bicoidJournal of Biological Chemistry 279:48725–48733.https://doi.org/10.1074/jbc.M407066200

36
Interplay between positive and negative activities that influence the role of Bicoid in transcriptionNucleic Acids Research 33:3985–3993.https://doi.org/10.1093/nar/gki691

37
Living without 30nm chromatin fibersTrends in Biochemical Sciences 36:1–6.https://doi.org/10.1016/j.tibs.2010.09.002
 38
 39
 40

41
Live imaging of mrna synthesis in Drosophila. Book SectionIn: I Gaspar, editors. RNA Detection: Methods and Protocols. Springer. pp. 349–357.https://doi.org/10.1007/9781493972135
 42
 43
 44

45
Mitotic repression of the transcriptional machineryTrends in Biochemical Sciences 22:197–202.https://doi.org/10.1016/S09680004(97)010451
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58

59
Pattern formation by graded and uniform signals in the early Drosophila embryoBiophysical Journal 102:427–433.https://doi.org/10.1016/j.bpj.2011.12.042
 60

61
A quantitative model of transcription factoractivated gene expressionNature Structural & Molecular Biology 15:1192–1198.https://doi.org/10.1038/nsmb.1500
 62
 63

64
Transcriptional enhancers in animal development and evolutionCurrent Biology 20:R754–R763.https://doi.org/10.1016/j.cub.2010.06.070
 65
 66
 67
 68
 69
 70
 71
 72

73
Live imaging of bicoiddependent transcription in Drosophila embryosCurrent Biology 23:2135–2139.https://doi.org/10.1016/j.cub.2013.08.053

74
Posterior stripe expression of hunchback is driven from two promoters by a common enhancer elementDevelopment 121:3067–3077.

75
Tradeoffs and constraints in allosteric sensingPLOS Computational Biology 7:e1002261.https://doi.org/10.1371/journal.pcbi.1002261

76
Statistical mechanics of MonodWymanChangeux (MWC) modelsJournal of Molecular Biology 425:1433–1460.https://doi.org/10.1016/j.jmb.2013.03.013

77
Collaborative competition mechanism for gene activation in vivoMolecular and Cellular Biology 23:1623–1632.https://doi.org/10.1128/MCB.23.5.16231632.2003
 78
 79

80
On the nature of allosteric transitions: a plausible modelJournal of Molecular Biology 12:88–118.https://doi.org/10.1016/S00222836(65)802856

81
Thermodynamic models of combinatorial gene regulation by distant enhancersIET Systems Biology 4:393–408.https://doi.org/10.1049/ietsyb.2010.0010
 82
 83
 84

85
Mitotic repression of RNA polymerase II transcription is accompanied by release of transcription elongation complexesMolecular and Cellular Biology 17:5791–5802.https://doi.org/10.1128/MCB.17.10.5791

86
Precision of hunchback expression in the Drosophila embryoCurrent Biology 22:2247–2252.https://doi.org/10.1016/j.cub.2012.09.051
 87

88
Figure 1 theory meets figure 2 experiments in the study of gene expressionAnnual Review of Biophysics 48:121–163.https://doi.org/10.1146/annurevbiophys052118115525

89
Mechanism of protein access to specific DNA sequences in chromatin: a dynamic equilibrium model for gene regulationJournal of Molecular Biology 254:130–149.https://doi.org/10.1006/jmbi.1995.0606
 90
 91
 92
 93
 94
 95
 96

97
Gene regulation by chromatin structure: paradigms established in Drosophila melanogasterAnnual Review of Entomology 52:171–192.https://doi.org/10.1146/annurev.ento.51.110104.151007
 98
 99
 100
 101
 102

103
Thermodynamic state ensemble models of cisregulationPLOS Computational Biology 8:e1002407.https://doi.org/10.1371/journal.pcbi.1002407
 104
 105
 106
 107
 108

109
DNA looping and physical constraints on transcription regulationJournal of Molecular Biology 331:981–989.https://doi.org/10.1016/S00222836(03)007642
 110
 111

112
Gene regulation in and out of equilibriumAnnual Review of Biophysics 49:199–226.https://doi.org/10.1146/annurevbiophys121219081542
 113
 114
 115

116
Discrimination between thermodynamic models of cisregulation using transcription factor occupancy dataNucleic Acids Research 42:2224–2234.https://doi.org/10.1093/nar/gkt1230
 117
 118
 119
Decision letter

Pierre SensReviewing Editor; Institut Curie, PSL Research University, CNRS, France

Naama BarkaiSenior Editor; Weizmann Institute of Science, Israel

Anatoly KolomeiskyReviewer; Rice University, United States
In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.
Acceptance summary:
This is a combined experimentaltheoretical investigation of transcription in development using elegant experimental methods and a comprehensive theoretical analysis. The aim is to distinguish between computational models predicting the transcriptional output of a developmental gene from timevarying concentrations of transcription factors. Based on quantitative comparisons between livecell imaging measurements of transcription in the fly embryo and the predictions from three classes of computational models, the authors conclude that the data can only be accounted for by a nonequilibrium scheme describing the progression towards 'accessible' chromatin state as a multistep process driven by transcription factors Bicoid and pioneering factor Zelda.
Decision letter after peer review:
Thank you for submitting your article "Quantitative dissection of transcription in development suggests transcription factordriven chromatin accessibility" for consideration by eLife. Your article has been reviewed by two peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Naama Barkai as the Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Anatoly Kolomeisky (Reviewer #2).
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
Both reviewers and the Reviewing editor found the paper very interesting, thought provoking, and very well written. We are convinced that a nonequilibrium approach to transcription is very important. Your experimental approach is elegant and the data are of high quality. There is need for additional experiments. However, reviewers raised concerns about the novelty of your findings, and questioned certain modelling hypotheses. These important issues must be satisfactorily addressed before a decision regarding publication can be made. In our opinion, this can be done without additional wet lab experiments, and within a reasonable timeframe.
As the editors have judged that your manuscript is of interest, but as described below that major revisions are required, in particular regarding modelling and discussion, before it is published, we would like to draw your attention to changes in our revision policy that we have made in response to COVID19 (https://elifesciences.org/articles/57162). First, because many researchers have temporarily lost access to the labs, we will give authors as much time as they need to submit revised manuscripts. We are also offering, if you choose, to post the manuscript to bioRxiv (if it is not already there) along with this decision letter and a formal designation that the manuscript is "in revision at eLife". Please let us know if you would like to pursue this option. (If your work is more suitable for medRxiv, you will need to post the preprint yourself, as the mechanisms for us to do so are still in development.)
Summary:
This is a combined experimentaltheoretical investigation of transcription in development using elegant experimental methods and a comprehensive theoretical analysis.
The aim is to distinguish between computational models predicting the transcriptional output of a developmental gene from timevarying concentrations of transcription factors. Based on quantitative comparisons between livecell imaging measurements of transcription in the fly embryo and the predictions from three classes of computational models, the authors conclude that the data can only be accounted for by a nonequilibrium scheme describing the progression towards 'accessible' chromatin state as a multistep process driven by transcription factors Bicoid and pioneering factor Zelda.
Essential revisions:
Novelty and context
1) The novelty of the work in comparison with exiting studies reaching similar conclusions must be clarified. In particular in comparison [Dufourt et al., 2018] in which a different gene is studied, but within the same organism, in the same nuclear cycle and controlled by the same pioneering factor Zelda.
2) The Discussion must be extended to put the results in a broader context of existing literature. Nonequilibrium view on transcriptional regulation has been discussed before. is presented as new, although these ideas have been around and studied for a number of years already. See for instance Coulon et al., 2013, and discuss the parallels with the studies cited therein, which points to the same mechanisms of stepwise, irreversible, TFdriven progression of chromatin state towards a Pol IIaccessible configuration.
Dufourt et al., 2018; Fritzsch et al., 2018; Coulon et al., 2013.
Further data analysis
3) The data seem underexploited: The authors have access to singlecell information but only analyze ensembleaverage traces. This is surprising since the model they put forward is intrinsically stochastic (governed by a few ratelimiting steps). Signatures of such stochastic behavior should be clear in the data. For instance: Given that the model posits that transcription starts after m=5 equivalent first order reactions, then a key prediction is that the first appearance of the MS2 signal among all individual traces at a given position along the embryo should be Gammadistributed (with shape parameter m=5). Even if the first appearance of the MS2 spot is a noisy measurement, this should be measurable (as in [Fritzsch et al., 2018] Figure 4A) and the authors already have this data at hand.
Concerns regarding the model
3) In the equilibrium MWC model (Figure 2A), Zelda, although being a pioneering factor, is not represented as such. Here it acts by binding the 'accessible' state, hence simply by keeping chromatin from closing. Instead, a pioneering factor should be able to bind in the 'inaccessible' state and make this state unfavorable (e.g. weight = exp(∆_chrom / kT) * z * w_{z,chrom} with w_{z,chrom } > 0). How would this more realistic scenario affect the conclusions?
4) The model (Equation 6) essentially assumes that bicoid and zelda act independently, and in an additive manner. Is this supported by data? Does transcription happen without bicoid, ,as Equation 6 suggests. If the two factors act simultaneously, one would expect the rate $\pi =[Bicoid][Zelda]$. If they act sequentially, one would expect $\pi =\frac{[Zelda][Bicoid]}{{c}_{1}[Zelda]+{c}_{2}[Bicoid]}$. The use of Equation 6 must be justified better, and the difference between the different possible microscopic scenario should be checked numerically and discussed.
https://doi.org/10.7554/eLife.56429.sa1Author response
Essential revisions:
Novelty and context
1) The novelty of the work in comparison with exiting studies reaching similar conclusions must be clarified. In particular in comparison [Dufourt et al., 2018] in which a different gene is studied, but within the same organism, in the same nuclear cycle and controlled by the same pioneering factor Zelda.
We apologize for not having compared and contrasted our work with that of Dufourt et al. more explicitly. Indeed, they found that, just like with hunchback, the removal of Zelda resulted in a delay of transcriptional onset of the snail gene (and that the addition of Zelda binding sites led to an earlier onset of transcription). Dufourt et al. then proposed a model for explaining this delay by positing the existence of a series of identical, irreversible states that each promoter needs to traverse at the start of the nuclear cycle before transcription can ensue, and find the number of states and the transition rate most consistent with their data. Our transcription factordriven model of chromatin accessibility (Figure 7) is very much an extension of the model proposed by Dufourt et al.
Despite the similarities and inspiration from this previous work, there are key differences between Dufourt et al. and our manuscript. First, we correlated output hunchback transcriptional activity—both onset time and the rate of mRNA production upon transcriptional onset—with the concentration dynamics of the Bicoid and Zelda inputs, a first in the fruit fly to our knowledge. In contrast, the work by Dufourt et al. only examined output transcriptional dynamics and did not visualize or incorporate into their model the dynamics of the various regulatory inputs to their snail reporter gene.
Second, these direct quantitative investigations of the relationship between input concentrations and output dynamics made it possible to investigate progressively more complex models of transcriptional regulation (equilibrium model MWC, nonequilibrium MWC model, and transcription factordriven model of chromatin accessibility) that can predict both the transcriptional onset time and the rate of transcriptional initiation. Such systematic dissection ultimately brought us to the transcription factordriven model of chromatin accessibility inspired by the work of Dufourt et al. In contrast, their work focused on using the singlecell distribution of transcriptional onset times to investigate how Zelda modulates the number of hypothesized transcriptionally silent states and their transition rates. While our work stands on their shoulders and we do not want to diminish their contributions by any means, we believe that our work is broader and more systematic.
Finally, we test the transcription factordriven model of chromatin accessibility from a different perspective than that of Dufourt et al. They focus on examining the singlecell distribution of transcriptional onset times. A key assumption of their model is that the concentrations of input transcription factors and, hence, the number of states and their transition rates, do not change over time. However, as we show in our manuscript, these concentrations do change significantly. By considering concentration dynamics, we are able to examine the model under more realistic assumptions, and constrain it using the mean transcriptional onset time.
In summary, by performing dynamical measurements of both inputs and outputs, our work systematically explores a broad swath of models of transcriptional regulation, including that proposed by Dufourt et al. We then go beyond their work by dissecting the model from the perspective of transcription factor dynamics and complement their testing of the model by examining the mean predicted transcriptional onset time as well as the rate of transcriptional initiation. We apologize for our lack of scholarship in contrasting our work to that of Dufourt et al. and have now made this comparison more explicit in the Discussion, starting in the ninth paragraph.
2) The Discussion must be extended to put the results in a broader context of existing literature. Nonequilibrium view on transcriptional regulation has been discussed before. is presented as new, although these ideas have been around and studied for a number of years already. See for instance Coulon et al., 2013 and discuss the parallels with the studies cited therein, which points to the same mechanisms of stepwise, irreversible, TFdriven progression of chromatin state towards a Pol IIaccessible configuration.
Dufourt et al., 2018; Fritzsch et al., 2018; Coulon et al., 2013.
We thank the reviewer for pointing out our oversight in addressing similar studies of nonequilibrium transcriptional regulation. We now more clearly acknowledge these works in the Introduction (last paragraph). We have also added an additional paragraph in the Discussion (ninth paragraph) that contextualizes our findings in the broader setting of nonequilibrium transcriptional regulation. Specifically, we acknowledge that the idea of stepwise, transcription factordriven transitions of chromatin into a transcribing state is not new, but do highlight the novelty of our work in quantitatively investigating the relationship between dynamic transcription factor concentrations and the rates of these transitions.
Further data analysis
3) The data seem underexploited: The authors have access to singlecell information but only analyze ensembleaverage traces. This is surprising since the model they put forward is intrinsically stochastic (governed by a few ratelimiting steps). Signatures of such stochastic behavior should be clear in the data. For instance: Given that the model posits that transcription starts after m=5 equivalent first order reactions, then a key prediction is that the first appearance of the MS2 signal among all individual traces at a given position along the embryo should be Gammadistributed (with shape parameter m=5). Even if the first appearance of the MS2 spot is a noisy measurement, this should be measurable (as in [Fritzsch et al., 2018] Figure 4A) and the authors already have this data at hand.
We would have loved to perform the analysis suggested by the reviewers in this manuscript. The analysis of transcription onset times using Gammadistributed stochastic models, as the reviewer points out, is not uncommon in the field, and has been performed in numerous studies, for example in Dufourt et al., 2018, and Fritzsch et al., 2018. We acknowledge that by only examining the means of the transcription onset time, we are underutilizing the data, which are taken at singlecell resolution. However, after carefully considering the possibility of extending our work to account for full distributions and capture stochastic effects of the model, we concluded that the results of such an analysis would be inconclusive given the current data and will require a future followup study that lies outside the scope of this paper. Below, we outline our reasoning in full detail. In essence, the explicit coupling of our model’s transition rates to dynamic transcription factor concentrations complicates the resultant theoretical analysis, as well as drastically reduces the statistical size of our datasets to a level where meaningful conclusions about higher moments of the distribution are difficult to draw.
One key difference between our study and comparable works is that our model explicitly accounts for how transcription factor concentrations modulate the transition rates between the silent states that precede transcriptional onset. Specifically, we posit that Bicoid and Zelda concentrations dictate these transition rates. Thus, in order for our analysis to be meaningful, it must take into account transcription factor concentrations. For example, in the context of explaining the zelda^{} mutant data, our analysis must take the spatiotemporal modulation of the Bicoid concentration gradient into account. As a result, for each position along the length of the embryo, a different dynamic Bicoid profile will cause the transition rates in the stochastic model to be timevarying. Thus, the resultant distribution of transcription onset times will no longer be a simple Gammadistribution that reflects the correct shape parameter.
Author response image 1 demonstrates how acknowledging Bicoid concentration dynamics significantly changes the shape of the singlecell transcriptional onset times distribution and its interpretation. Here, we took the transcription factordriven model used in the main text and numerically simulated the zelda^{} transcription onset times for two cases, (1) using the full dynamic Bicoid input at each embryo position, and (2) using a static Bicoid input obtained by averaging the dynamic Bicoid concentration in the first 8 minutes of the nuclear cycle (author response image 1A). For model parameters, we chose the same bestfit parameters used in Figure 7 of the main text. author response image 1B shows the resulting predicted distributions of transcription onset times for the dynamic and static cases (red and blue, respectively) for a model with k=5 steps. The figure reveals that Bicoid dynamics cause the overall distribution to shift to the right. Nevertheless, both distributions still fit well to Gammadistributions (purple and yellow curves).
Repeating these simulations for models ranging from k=1 to k=6 steps at a particular embryo position and fitting Gammadistributions to both static and dynamic cases shows that, as expected, the static input model yields Gammadistribution fits that match the underlying truth (Author response image 1C, blue), where the effective shape parameter (i.e. number of steps) matches to the correct value. However, we see that the dynamic input model systematically yields effective shape parameters that are higher than the actual number of states assumed in our simulation (Author response image 1C, red). Thus, the dynamic nature of the Bicoid concentration leads to a higher effective shape parameter. This trend holds for various embryo positions as well. Author response image 1D shows the k=5 steps model for various positions along the embryo, demonstrating that the dynamic input systematically produces distributions with higher effective shape parameters than the static input.
While the need to consider transcription factor dynamics complicates the analysis from the standpoint of analytical expressions for the probability distribution, we can still carry forward stochastic numerical simulations of transcription onset times taking into account Bicoid dynamics. However, after doing this, we encountered a severe experimental barrier, namely that the statistical size of each sample (i.e. each position along the embryo with unique Bicoid concentration dynamics) was very low.
In an attempt to analyze the singlecell results, we followed the analysis carried out by Dufourt et al., 2018. Specifically, we calculated the effective number of intermediary steps at each embryo position, defined as the square of the mean transcription onset time divided by the variance. For a true Gammadistribution, this yields the shape parameter (the number of intermediary steps). Here, due to the dynamic nature of Bicoid, we interpreted this as an effective number of intermediary steps to be compared with the stochastic simulation of the model.
To calculate the transcription onset time for a single cell, we invoked the technique used in the main text consisting of fitting a straight line to the initial rise in the MS2 signal, thus defining the xintercept as the onset time (Author response image 1E). After grouping the single cell fits into bins along embryo position, we calculated the effective number of intermediary steps (Author response image 1F, datapoints). We then simulated N=500 cells for the transcription factordriven model ranging from k=1 to k=6 steps between the inaccessible and accessible state, using the dynamic Bicoid concentration at each embryo position and the parameters used in Figure 7 of the main text. From these singlecell simulations, we calculated the effective number of intermediary steps (Author response image 1F, black dashed line). While the k=1 and k=2 models appear to clearly fall outside the data, it is difficult to distinguish between models with higher step numbers, simply because the data were sparse enough that the bootstrapped error in the effective step number was too high. For example, at 20% along the embryo, the effective step number in the data possessed a mean of approximately 7, with a bootstrapped standard error of almost 5. This high error prevents us from drawing precise conclusions about the numbers of intermediary steps that best fit the data.
The reason for these high uncertainties is the low statistical size of the dataset (Author response image 1F, bars). At most, a single position bin possessed ~25 cells’ worth of data. For this reason, we did not place much faith into the analysis of higher moments of the transcription onset times. This low statistical size, in our opinion, makes such a singlecell analysis inconclusive in the context of this work and calls for a new experiment where the number of data sets is dramatically increased. However, germline clone zelda^{} mutant experiments are incredibly timeconsuming to perform. Simply doubling the amount of data would take >2 months of fulltime imaging and, we believe, would still be insufficient for rigorous statistical analysis. Thus, we hope that the reviewer will agree that, while exciting and certainly a next step, a thorough analysis of singlecell distributions of transcription onset times will require a more efficient experimental scheme that lies outside of this work’s scope. A more detailed discussion of the potential for future insights to be gained from examining singlecell distributions can be found in the twelfth paragraph of the Discussion. Regardless, though we believe this falls outside the scope of our current work, we would be happy to incorporate the analysis described here in the Appendix if the reviewers deem it necessary.
Concerns regarding the model
3) In the equilibrium MWC model (Figure 2A), Zelda, although being a pioneering factor, is not represented as such. Here it acts by binding the 'accessible' state, hence simply by keeping chromatin from closing. Instead, a pioneering factor should be able to bind in the 'inaccessible' state and make this state unfavorable (e.g. weight = exp(∆_chrom / kT) * z * w_{z,chrom} with w_{z,chrom } > 0). How would this more realistic scenario affect the conclusions?
We agree with the reviewer that our equilibrium MWC model does not encompass all the possible mechanisms of Zelda as a pioneering factor. However, it is important to note that the initial equilibrium MWC model was sufficient to explain our wildtype data, and it was only the zelda^{} mutant data that could not be described by any of our thermodynamic models. Thus, it was actually the removal of Zelda that seemed to demonstrate the failure of the equilibrium paradigm. As a result, the specific theoretical action of Zelda will not affect the conclusion that the thermodynamic framework fails to explain the key result that removing Zelda drastically delays onset times of transcription in our system (Figure 4D, red points).
Nevertheless, one could extend this reasoning to wonder if a potential pioneering activity of Bicoid would change the conclusions. After pursuing this line of thought, we concluded that this would also not change our conclusions, since such a mechanism is largely captured in our generalized thermodynamic model, which only included Bicoid and RNAP binding to attempt to explain the zelda^{} mutant data (subsection “No thermodynamic model can recapitulate the activation of hunchback by Bicoid alone” and Appendix subsection “Generalized thermodynamic model”). Here, the Boltzmann weights for each microstate were left as an arbitrary coefficient, with the weight of the inaccessible state denoted by ${P}_{\mathrm{\text{inacc}}}$. Extending the model to allow Bicoid to bind to the inaccessible state would introduce $l$ new microstates with individual Boltzmann weights ${P}_{l}$, one for each Bicoidbound inaccessible state. Nevertheless, as long as the ensuing transcription rate of each inaccessible state with bound Bicoid molecules is zero, an assumption that we believe is reasonable, then the net effect of these additional inaccessible states could simply be described by a single effective inaccessible state with Boltzmann weight ${P}_{\mathrm{\text{inacc}}}\prime ={P}_{\mathrm{\text{inacc}}}+{\sum}_{l}{P}_{l}$. The resulting state space exploration (subsection “No thermodynamic model can recapitulate the activation of hunchback by Bicoid alone” and Figure 5C), which explores the whole parameter space of reasonable values of ${P}_{\mathrm{\text{inacc}}}$, would thus also capture the behavior of this single effective inaccessible state.
Finally, as pointed out in the reviewer comment #9, the constrained dynamics of both Bicoid and Zelda, which reach a peak value around 4.5 min, imply that any thermodynamic model will not be able to produce onset times significantly later than that, due to the separation of time scales assumption. Thus, our main conclusion that the thermodynamic framework is insufficient to explain the zelda^{} data is a rather general point that is independent of the specific implementation of the model. We have incorporated these ideas as a new section in the Appendix (Appendix subsection “Extended generalized thermodynamic model with transcription factor binding in the inaccessible state”) and referenced it in the main text (subsection “No thermodynamic model can recapitulate the activation of hunchback by Bicoid alone”).
4) The model (Equation 6) essentially assumes that bicoid and zelda act independently, and in an additive manner. Is this supported by data? Does transcription happen without bicoid, ,as Equation 6 suggests. If the two factors act simultaneously, one would expect the rate $\pi =[Bicoid][Zelda]$. If they act sequentially, one would expect $\pi =\frac{[Zelda][Bicoid]}{{c}_{1}[Zelda]+{c}_{2}[Bicoid]}$. The use of Equation 6 must be justified better, and the difference between the different possible microscopic scenario should be checked numerically and discussed.
First, we would like to clarify a possible misunderstanding. Equation 6 only describes the rate of transitioning between the silent states preceding transcription and the transcriptionally active state, and holds no information about the ensuing rate of transcription, which is then given by a standard thermodynamic model involving Bicoid (Figure 7A). Equation 6 alone does not suggest that transcription happens without Bicoid, only that the system can transition to the ON state with Zelda alone. Once in the ON state, the absence of Bicoid will thus result in merely whatever basal transcriptional activity is supported by the presence of RNAP.
Second, the reviewer brings up salient points about the specific form of Equation 6. We would first like to stress that the main goal of this section of the paper was to propose a simple nonequilibrium extension to the thermodynamic framework that could explain all aspects of the experimental data. To our knowledge, there have been no works suggesting any direct interactions between Bicoid and Zelda. In the absence of evidence, we thus chose a model where Bicoid and Zelda acted independently and additively. A more comprehensive comparison of various nonequilibrium models is a project that we believe to be beyond the scope of this paper. Indeed, the exploration of these models still remains challenging even in the simpler case of bacterial transcriptional regulation as exemplified by the work by Hammar et al., 2014, and the recent bioRxiv preprint of Morrison et al. (2020). Nevertheless, we can at least comment on the two alternative mechanisms proposed in this comment.
If Bicoid and Zelda act simultaneously, then one would indeed expect Equation 6 to have the form $\pi =[Bicoid][Zelda]$. However, one critical experimental fact is that transcription occurs even in the absence of Zelda, albeit at a delayed capacity. Thus, this purely simultaneous action mechanism can be ruled out, since removing Zelda would set all transition rates $\pi $ to zero. Similarly, this rules out a sequential mechanism where $\pi =\frac{[Bicoid][Zelda]}{{c}_{1}[Bicoid]+{c}_{2}[Zelda]}$, since removing Zelda would similarly forbid any transitions from the OFF to ON state.
More generally, the existence of transcription in the absence of Zelda implies that there must exist some independence between BicoidandZeldamediated transitions from the OFF to ON state. If Bicoid and Zelda did not work in parallel to catalyze the transition into the ON state, transcription could not occur in the absence of Zelda. To discuss this point explicitly, we have added this argument as an appendix section in Appendix subsection “Exploring alternatives to the additive transcription factordriven transition rate” and Appendix—figure 15, and have referenced it in the subsection “Transcription factordriven chromatin accessibility can capture all aspects of the data”.
https://doi.org/10.7554/eLife.56429.sa2Article and author information
Author details
Funding
National Science Foundation (DGE1752814)
 Elizabeth Eck
University of California Berkeley (Chancellor's Fellowship)
 Elizabeth Eck
Department of Defense (Graduate Student Fellowship)
 Jonathan Liu
Burroughs Wellcome Fund (Career Award)
 Hernan G Garcia
Sloan Research Foundation
 Hernan G Garcia
Human Frontier Science Program
 Hernan G Garcia
Searle Scholars Program
 Hernan G Garcia
Shurl and Kay Curci Foundation
 Hernan G Garcia
Hellman Foundation
 Hernan G Garcia
National Institutes of Health (DP2 OD02454101)
 Hernan G Garcia
National Science Foundation (1652236)
 Hernan G Garcia
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
We are grateful to Jack Bateman, Jacques Bothma, Mike Eisen, Jeremy Gunawardena, Jane Kondev, Oleg Igoshin, Rob Phillips, Christine Rushlow and Peter Whitney for their guidance and comments on our manuscript. We thank Kenneth Irvine and Yuanwang Pan for providing the hisirfp fly line. This work was supported by the Burroughs Wellcome Fund Career Award at the Scientific Interface, the Sloan Research Foundation, the Human Frontiers Science Program, the Searle Scholars Program, the Shurl and Kay Curci Foundation, the Hellman Foundation, the NIH Director's New Innovator Award (DP2 OD02454101), and an NSF CAREER Award (1652236) (HGG), an NSF GRFP (DGE 1752814) and UC Berkeley Chancellor’s Fellowship (EE), and the DoD NDSEG graduate fellowship (JL).
Senior Editor
 Naama Barkai, Weizmann Institute of Science, Israel
Reviewing Editor
 Pierre Sens, Institut Curie, PSL Research University, CNRS, France
Reviewer
 Anatoly Kolomeisky, Rice University, United States
Publication history
 Received: February 27, 2020
 Accepted: October 16, 2020
 Accepted Manuscript published: October 19, 2020 (version 1)
 Version of Record published: December 15, 2020 (version 2)
Copyright
© 2020, Eck et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics

 1,924
 Page views

 351
 Downloads

 0
 Citations
Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.