1. Chromosomes and Gene Expression
  2. Computational and Systems Biology
Download icon

How subtle changes in 3D structure can create large changes in transcription

  1. Jordan Yupeng Xiao
  2. Antonina Hafner
  3. Alistair N Boettiger  Is a corresponding author
  1. Program in Biophysics, Stanford University, United States
  2. Department of Developmental Biology, Stanford University, United States
Research Article
  • Cited 2
  • Views 2,175
  • Annotations
Cite this article as: eLife 2021;10:e64320 doi: 10.7554/eLife.64320

Abstract

Animal genomes are organized into topologically associated domains (TADs). TADs are thought to contribute to gene regulation by facilitating enhancer-promoter (E-P) contacts within a TAD and preventing these contacts across TAD borders. However, the absolute difference in contact frequency across TAD boundaries is usually less than 2-fold, even though disruptions of TAD borders can change gene expression by 10-fold. Existing models fail to explain this hypersensitive response. Here, we propose a futile cycle model of enhancer-mediated regulation that can exhibit hypersensitivity through bistability and hysteresis. Consistent with recent experiments, this regulation does not exhibit strong correlation between E-P contact and promoter activity, even though regulation occurs through contact. Through mathematical analysis and stochastic simulation, we show that this system can create an illusion of E-P biochemical specificity and explain the importance of weak TAD boundaries. It also offers a mechanism to reconcile apparently contradictory results from recent global TAD disruption with local TAD boundary deletion experiments. Together, these analyses advance our understanding of cis-regulatory contacts in controlling gene expression and suggest new experimental directions.

Introduction

The genomes of many organisms have been shown to adopt a domain-like structure commonly referred to as topologically associated domains (TADs) (Ibrahim and Mundlos, 2020; Jerković et al., 2020; McCord et al., 2020; Rowley and Corces, 2018), defined as contiguous regions of the genome where intra-region 3D proximity is greater than inter-region (Dixon et al., 2012; Nora et al., 2012). When plotted as a heat map of contact frequency as a function of two genomic coordinates, TADs appear as boxes on the diagonal at specific genomic coordinates. They can be detected by a range of distinct techniques, including methods relying on proximity ligation like 3C/Hi-C (Jerković et al., 2020; Kempfer and Pombo, 2020; McCord et al., 2020; Rowley and Corces, 2018), ligation-free sequencing methods like GAM (Beagrie et al., 2017) and SPRITE (Quinodoz et al., 2018), microscopy methods such as ORCA (Bintu et al., 2018; Mateo et al., 2019), and even live-cell measurements like DAMC (Redolfi et al., 2019). Recent microscopy experiments have shown that TADs are a statistical property that emerges from a population of cells dynamically exploring multiple conformational states, rather than a static structure found in all cells (Bintu et al., 2018).

Despite broad agreement on the existence of TADs, their role in transcriptional regulation has recently become a subject of controversy as different recent results have been interpreted to support or refute a functional link (Finn and Misteli, 2019a; Ghavi-Helm et al., 2019; Mir et al., 2019). For example, many TAD boundaries demarcate regions of co-expressed genes and separate differentially expressed genes, suggesting that they play a role in cis-regulatory specificity (Long et al., 2016; McCord et al., 2020; Spielmann et al., 2018). Opposing this interpretation, it has been noted that the quantitative difference in interaction frequency is generally small: inter-TAD contacts often reach only half the frequency of intra-TAD contacts, raising questions of how such small differences could explain such clear separation of expression (McCord et al., 2020). The disruption of TAD boundaries by cis-mutation has been proposed as the causal driver of the substantial alterations in gene expression for a variety of disease conditions, a conclusion supported by animal models (Dowen et al., 2014; Franke et al., 2016; Hnisz et al., 2016; Lupiáñez et al., 2015) reviewed in Long et al., 2016; McCord et al., 2020; Spielmann et al., 2018. However, global disruption of TAD boundaries by depletion of boundary-associated factors led to few substantial changes in gene expression in cell culture models (Cuartero et al., 2018; Nora et al., 2017; Rao et al., 2017; Schwarzer et al., 2017; Stik et al., 2020; Zuin et al., 2014), though such depletion does obstruct the ability of macrophages to respond to inflammatory stimuli (Cuartero et al., 2018; Stik et al., 2020). Genetic experiments have made it clear that enhancers act on promoters only in cis, suggesting a requirement for physical proximity that would explain the functional role of TADs (Furlong and Levine, 2018). However, recent microscopy experiments uncovered little correlation between enhancer-promoter (E-P) proximity and nascent transcription activity at well-studied loci (Alexander et al., 2019; Mateo et al., 2019) reviewed in Finn and Misteli, 2019b. This surprising disagreement has led to various speculative explanations, such as action at a distance through condensates with diameters of tens of molecules (Alexander et al., 2019; Benabdallah et al., 2019; Cavalheiro et al., 2021; Heist et al., 2019; Lim and Levine, 2021).

We wondered whether these apparent contradictions can be reasonably reconciled with known biochemical mechanisms. To answer these questions, we examined simple biophysical models of proximity-dependent E-P communication grounded in known physical laws (the chemical master equation [CME]). Here, we identified biochemical mechanisms whose relevance for the spatial organization of the genome has not previously been considered and showed how the said mechanisms provide a simple reconciliation of all three apparent contradictions. We discuss the implications for this revised view of contact-mediated cis-regulation of gene expression for interpreting experimental results.

Results

A transcriptional response hypersensitive to changes in contact frequency

To better understand the connection between E-P contact frequency and transcriptional behavior, we started with a quantitative analysis of recently published super-resolution microscopy data (Mateo et al., 2019). These experiments used Optical Reconstruction of Chromatin Architecture to observe the 3D path of chromatin through two neighboring regulatory domains of the Drosophila Hox genes. The upstream domain contained the gene Ubx and its enhancers, and the downstream region contained the gene abd-A and its enhancers (Figure 1A). In wildtype Drosophila embryos, interactions between the two domains were less frequent than within the domains, producing two clearly distinct TADs in population average data (Figure 1A). Deletion of only a few kilobases of sequence at the border of these TADs resulted in a clear increase in interaction between the domains, including between Ubx enhancers and the abd-A promoter. However, interaction frequency increased only by a factor of 1–2.5-fold across the population (Figure 1A). Moreover, while the structures were easily distinguished at the population level, the structures of single cells were more variable (Figure 1—figure supplement 1A, B), with 25% of individual wildtype cells showing some cross-TAD border mixing between the Ubx enhancers and the abd-A promoter compared to 50% of mutant cells (Figure 1—figure supplement 1C). The change in gene expression, as assayed by single-molecule fluorescent in situ hybridization (smFISH), was hypersensitive to this moderate change in contact frequency, with an average fivefold increase in mRNA counts of abd-A (Figure 1B). Similar changes were also seen for Ubx (Figure 1—figure supplement 2). From these single-cell analyses, we concluded that the quantitative disconnect between weak TAD borders (see also Figure 1—figure supplement 3) and the large transcriptional effects of border disruption (see also Figure 1—figure supplement 4) is not an artifact of Hi-C or of population averaging. Instead, small differences in the frequency of contact appear sufficient to drive large changes in expression, requiring us to re-examine the textbook picture of E-P regulation (Alberts et al., 2018).

Figure 1 with 4 supplements see all
A futile cycle promoter explains super-linear transcriptional responses following subtle changes in enhancer-promoter contact.

(A) Quantification of the minor difference in cross-border contact frequency between the Ubx and abd-A regulatory domains in wildtype and domain-border-deleted embryos, quantified by Optical Reconstruction of Chromatin Architecture. (B) Quantification of the major change in transcription of abd-A between wildtype and domain-border-deleted embryos. Raw data taken from Mateo et al., 2019.

© 2019, Mateo et al. The inset image in Figure 1A is reproduced with permission from Figure 5B of Mateo et al., 2019. Further reproduction of this panel would need permission from the copyright holder.

Modeling nonlinear transcriptional response

We set out to explore theoretical models that can explain how changes in E-P interaction frequency (3D genome folding) can lead to nonlinear effects on transcription. To do this, we used the chemical master equation (CME) approach, represented as a continuous-time Markov process (Durrett, 2012; Strogatz, 2019Figure 2A), to model the discrete/stochastic nature of enhancer-promoter interactions.

Figure 2 with 1 supplement see all
The emergence of hypersensitivity in simple stochastic transcription models.

(A) A stochastic model for the evolution of the chemical system, which specifies how the probability of observing the system in state i at time s given it was in state j at time t evolves in time (the forward Kolmogorov equation). M is a transition matrix for the continuous-time Markov system. (B) Cartoon illustration of chemical states for the 2-state enhancer model and the corresponding transition matrix M, as a function of the concentration of transcription factor (TF) activator, A. (C) The probability of TF binding as a function of molecule count, for the2-state and 3-state systems. The Hill equation with a coefficient n = 2 is shown for comparison. (D) As in (B), but for the 3-state enhancer model with two potentially cooperative binding sites. (E) Cartoon depiction of state space for the enhancer cooperativity model and corresponding matrix M for the Markov system/chemical master equation model. The DNA is denoted by a blue polymer, the positions of the enhancers by green segments, and the promoter region by an orange segment. Contacts form with probability c1 and c2 for enhancers 1 and 2, respectively, and separate from loop configuration with probability u1 and u2. (F) Probability of being in the two-loop state as a function of the fold change in loop frequency, for a selection of rate constants chosen to demonstrate the existence of a hypersensitive response. (G) Cartoon depiction of the expected change in contact frequency according to the cooperativity model presented in (E). Enhancer cooperativity leads to the formation of specific loop interactions.

Nonlinear transcriptional responses have been a subject of intense research, both experimental and theoretical (Bergmann et al., 2007; Bintu et al., 2005a; Bintu et al., 2005b; Eldar et al., 2002; Gregor et al., 2007; Haskel-Ittah et al., 2012; He et al., 2010; Manu et al., 2009; Ozbudak et al., 2004; Sanchez et al., 2011; Ben-Tabou de-Leon and Davidson, 2009; Zinzen and Papatsenko, 2007; Zinzen et al., 2006; Bentovim et al., 2017). While this previous modeling work has focused on connecting transcriptional output to changes in transcription factor (TF) concentrations and not to E-P contact, we used it as a starting point to build intuition for how to obtain a hypersensitive response. Additionally, as CMEs and continuous-time Markov processes may be less familiar to some readers, we first applied it to a familiar problem: a model of the molecular mechanisms that render transcription hypersensitive to changes in nucleoplasmic TF concentration. All models in this work start with the common framework of transcription (more details in Materials and Methods): TFs bind an enhancer, the enhancer interacts with a promoter, and the promoter state determines transcriptional output. For simplicity, we assume that E-P interaction and binding of General Transcription Factors (GTFs) occur at a much faster rate than TF-enhancer binding, in which case transcription rate is simply proportional to the fraction of time the enhancer is bound (see Materials and Methods for a discussion on relaxing this assumption).

Consider first an enhancer with only one TF-binding site for factor A. The enhancer has two states: unbound and TF-bound (Figure 2B). It transitions from unbound to bound with a probability per unit time proportional to the number of molecules of A: konA, and transitions from bound to unbound with fixed probability per unit time koff (Figure 2B). This stochastic process can be compactly expressed by writing down the corresponding stochastic matrix, M, associated with the continuous-time Markov jump process (Figure 2B). Here, Pij(s,t) is the probability the system is in state i at time s given that it was in state j at time t. The time evolution of P is governed by the forward Kolmogorov equation (Figure 2A). Standard probability theorems for Markov processes enable us to compute the complete stochastic dynamic behavior of this system, frequently through straightforward matrix operations (Durrett, 2012). While trivial cases like this example may be solved without this matrix formalism, the formalism will become helpful for more complex models and the matrix operations will remain identical to those introduced here. Of immediate interest, we can see a stationary state exists because M is irreducible and recurrent. And by solving for the normalized kernel of M, the stationary probability that the system is in the TF-bound state (state 2) is π2 = A/(koff/kon + A) (Figure 2C). From this result, we see, for any value kon or koff, the probability that the enhancer is bound (and therefore whether it can activate transcription) exhibits only sublinear responses to A (i.e., is never hypersensitive) (Figure 2C).

Now consider a minor variation: an enhancer that has two binding sites for factor A (Figure 2D). For simplicity, let the two sites be identical, so the system can be described with only three states: (1) no sites bound, (2) either bound, or (3) both bound. If only the state with both sites bound is transcriptionally active, we are interested in the probability the system is in state 3, which at stationary state is π3 = A2/((koff1koff2)/(kon1kon2) + koff2/kon2A + A2) (the normalized kernel of the matrix M in Figure 2D). From this equation, we observe that for the special case where koff2/kon2 < koff1/kon1, this system is hypersensitive to A (Figure 2C). The stronger this difference, the more the system approaches the behavior of f(A) = An/(cn+An), which is the Hill equation: a sigmoidal curve, starting at zero and reaching a maximum value of 1 as A increases, with an inflection at A = c (Figure 2C, Figure 2—figure supplement 1). As n increases, the gentle curve sharpens into a step function, where A = 0 for A < c and = 1 for A > c. Thus, the Hill equation provides a convenient way of parameterizing a sigmoidal response in terms of its critical threshold c and its sensitivity, n (Figure 2—figure supplement 1).

We next wondered if a simple adaptation of this model, replacing TF interactions with enhancer-enhancer interactions, could explain the hypersensitive transcriptional response to changes in genome structure (Figure 2E–F). In this scenario, a small change in the initial interaction frequency increases slightly the probability that two enhancers interact, much like how a small change in TF concentration improves the probability two TFs bind the same enhancer (Figure 2G). Mutual affinity between these enhancers stabilizes their interaction with the promoter, much like mutual affinity between TFs stabilizes their interaction with an enhancer. In the hypersensitive regime, a minor initial change in contact frequency (x-fold), due to disruption of the TAD border, leads to a major change in E-P contact frequency (y-fold > x-fold), due to cooperativity in forming the hub, which could in turn produce a major change in transcription (Figure 2F, G).

Unfortunately, this model is immediately rejected by available data. Upon perturbation of TAD boundaries, we (Figure 1A) and others do not observe the appearance or loss of cooperative enhancer contacts of great enough magnitude to explain the change in transcription (Figure 1B). These and other recent multi-contact data have revealed the existence of cooperative three-way interactions (Allahyar et al., 2018; Bintu et al., 2018; Oudelaar et al., 2019, Oudelaar et al., 2018). Yet even with these cooperative effects, the fold change in measured contact frequency is not as large as the fold change in transcription before and after TAD border deletion.

Promoter futile cycles and hypersensitive response

We thus sought to identify other mechanisms that could give rise to hypersensitivity to E-P contact frequency. Inspired by the hypersensitive cell-signaling pathways dependent on futile cycle competition between phosphatases and kinases (Ferrell, 2012; Huang and Ferrell, 1996; Qiao et al., 2007), we considered a mechanism in which the promoter is not waiting passively for an enhancer signal, but rather is engaged in a futile cycle of accumulation and removal of factors that favor transcription. We propose two chemically explicit realizations of this model to establish concrete examples.

The first is the ‘condensate version,’ in which a ubiquitously expressed GTF, such as PolII itself, is allowed to accumulate at individual promoters. Additional molecules join the condensate and dissolve back into the nucleoplasm at rates r and g, respectively. Condensates with n molecules already at the promoter provide n chances per interaction for a newly arrived molecule to be captured, such that capture is more likely at larger condensates (Figure 3A, B). For simplicity, we assumed that the transcription rate of the gene is proportional to the size of the condensate (see Materials and methods for a discussion on relaxing this assumption). As in other futile cycle signal transduction systems, these competing processes of addition and removal happen in all cells at all times—they are not dependent on interactions with an enhancer or the presence of cell-type-specific TFs. Promoters where condensates dissolve faster than they form will generally be silent, but retain a non-zero probability of transitioning to the active state. Cell-type-specific/enhancer-dependent gene activation arises as follows: each time an active enhancer contacts the promoter, a single additional molecule is transferred to the condensate. We denoted the probability of contact per unit time as e, which is a property of the 3D structural dynamics of the domain. While enhancer contact is not explicitly required for promoter activation, we will show that this simple model still exhibits rapid and relatively homogenous responses to enhancer activation/deactivation. For the moment, we assumed the enhancer is already activated by the necessary TF binding and requires only looping to influence transcription, allowing us to focus on the effects of 3D structure. The relevant states in the chemical master equation are shown in cartoon form in Figure 3B, along with the stochastic matrix of the corresponding Markov jump process (Figure 3C). In the interest of simplicity, we have introduced a maximum cluster size as a fixed parameter. The effects of relaxing this assumption are discussed in Materials and methods.

Figure 3 with 1 supplement see all
Systems with hypersensitive transcriptional responses to structural perturbation.

(A) Generalized cartoon depiction of probabilistic events that affect promoter activity. (B) Schematic of the general transcription factor (GTF) condensate version of the futile cycle model. (C) Markov matrix corresponding to the GTF model version depicted in (B). (D) Simulated condensate size distribution as a function of enhancer-promoter looping rate e. Each violin plot represents a single simulation with fixed e. The superlinear regime reflects where fold change in transcription (linearly proportional to condensate size) is greater than corresponding fold change in e. (E–G) Sensitivity, critical threshold, and maximum value of the GTF system as a function of condensate dissociation rate and max cluster size. (H–J) Same as in (B), (C), and (D), but for the post-translational modification (PTM) model version. Numerical differences in model outputs between D and J, reflect primarily differences in parameter choices rather than intrinsic differences between the model versions, as seen also in the parameter sweeps (E-G, Figure 3—figure supplement 1).

This simple model version includes a regime in which condensate size (a proxy for transcription) is hypersensitive to changes in 3D structure, as illustrated by simulations of the Markov jump process (Figure 3D). (Specific numerical parameters for the simulations shown in Figure 3D and all following figures may be found in Materials and methods.) In this regime, at low E-P contact frequencies, condensates dissolve faster than they form at the promoter, and most of the population has no PolII bound. As E-P contact frequency approaches the hypersensitive regime, a higher percentage of the population binds multiple PolII molecules, but most are still unbound (Figure 3D). At this point, the fold change in contact is still greater than the fold change in transcription, and the system is in a transcriptional regime largely unaffected by structural change. However, a further increase in the E-P contact frequency tips the balance, such that most of the population shifts from an unbound to a mature condensate state. This is the hypersensitive regime—a small (2-fold) increase in e resulted in a large (10-fold) increase in the average amount of PolII per promoter, and thus on transcription. Further increases in contact frequency have sublinear effects on transcription output.

To investigate the generality of this hypersensitive regime, we conducted a sweep of model parameters (Figure 3E–G). Normalizing to association probability per unit time, we characterized the response as a function of maximum cluster size, cmax, and the dissolution constant, g. We parameterized the response by fitting the average condensate size vs. enhancer loop rate to a Hill function and plotting the sensitivity (n), critical threshold (c), and max value of that function (v) (see Figure 2—figure supplement 1). We saw that sensitivity increased with cluster size, and hypersensitive behavior required cluster size greater than 2 (Figure 3E). In addition, the PolII off-rate had to be not too small, or the stability of the off-state was too low to permit hypersensitivity, and not too large, or the mature condensates were never stable (Figure 3F). A higher dissolution constant resulted in greater sensitivity up until the point where g was so high that the cluster state was no longer reached, as can be seen in the max-value plot (Figure 3G). As expected, increasing the max cluster size linearly increased the cluster size obtained at the maximal E-P looping rate (Figure 3G). The critical threshold increases as a function of g as well, also to the point where high g prevented the system from ever reaching the max cluster size (Figure 3F). For larger cmax, larger values of g can be used without preventing access to this state, allowing higher critical thresholds and higher sensitivity (Figure 3E, F).

While we have described the condensate model version using PolII as an example, any variety of GTF (or even combination thereof) could play this role in the cell. Recent experimental work supports the formation of condensates of GTFs such as PolII, Mediator, and BRD4 at the promoters, (Chong et al., 2018; Cho et al., 2018; Sabari et al., 2018; Lu et al., 2018), and provides experimental evidence for enhanced capture/reduced release (Chong et al., 2018) as expected for phase separating condensates. The biological significance for transcription of such condensates remains disputed.

The second realization of the futile cycle model is the ‘post-translational modification (PTM) version’ (Figure 3H). It is based on accumulation of PTMs, ‘tags,’ deposited by constitutively expressed enzymes in the vicinity of the promoter, and does not require any condensate of molecules to form (Figure 3H). We assumed that the enzyme can physically interact with deposited tags, which allosterically stimulate its effective addition rate from r0 to r1, r2, …, rn, where n is the total number of stimulated states. For simplicity, we focused on the minimal case of n = 2 (i.e., the enzyme has only two stimulated states). The probabilities of enzyme-tag association and dissociation are proportional to the amount of tag existing at the promoter, with respective rate constants kn and τn. The promoter also has an intrinsic rate of tag removal g. Finally, as in the condensate version, cell-type-specific regulation occurs through physical interaction with an enhancer that happens at rate e, determined by the 3D chromatin structure. Each E-P interaction adds one tag. The corresponding stochastic transition matrix is shown in Figure 3I. Note that in enumerating the states of the system we tracked all possible combinations of enzyme stimulation states and promoter tag levels.

The PTM futile cycle model version can also achieve a hypersensitive regime in which small fold changes in E-P contact lead to large fold changes in transcription (Figure 3J), around the experimentally observed range (Figure 1A, B). This hypersensitive behavior suggests that the relatively weak effects of most TAD boundaries (Figure 1—figure supplement 3) may still have important consequences for the control of gene expression. While the higher-dimensional parameter space of the PTM version is more cumbersome to scan comprehensively and less practical to plot, the minimal conditions for hypersensitivity can be derived by considering the deterministic limits of the stochastic system (see Figure 3—figure supplement 1 and Materials and methods).

Thus, the requirements for hypersensitivity to structural change in the PTM version are conceptually similar to those of the condensate version. The tag-removal rate must be neither too small, or the system will activate too easily, nor too large, or the system will never be able to fully activate. This is analogous to the dissolution rate restrictions in the condensate version. The rate of accumulation of tags must be larger for systems with a larger number of tags to start with, which is achieved when nmax >= 2 and rrr2 in the PTM version. In the condensate version, a conceptually similar bias is achieved through the larger capture probability of the larger condensates. As long as these conditions are met, the system admits hypersensitivity E-P loop rate e.

Promoter-specific properties can create an illusion of E-P specificity

We next investigated how differences in promoter-specific rate constants in these model versions affect E-P responsiveness to TAD boundary removal. We considered two promoters that differ by a factor of 2 in g (the condensate dissolution or tag removal rate), but are otherwise identical. Both start the simulation in a low-transcription state and experience the same twofold increase in E-P interaction at the start of the simulation, simulating the effect of TAD border disruption (Figure 4A).

Differential sensitivity to enhancer-promoter (E-P) contact can lead to an illusion of specificity.

(A) Cartoon depiction of two simulated promoters undergoing the same change in enhancer contact frequency after a topologically associated domain (TAD)-border deletion. (B) Violin plots of the changes in transcription rates (which are linearly proportional to promoter tag levels) after a twofold increase in the E-P contact rate for each promoter, simulating the border deletion depicted in (A). (C) Cartoon representation of the ‘lock and key’ explanation of E-P specificity, contrasted with the ‘tipping point’ explanation suggested by results in (B).

For both versions of the model, Promoter 1 exhibited a large (greater than twofold) increase in transcription, yet Promoter 2 remained almost unaffected in response to the twofold perturbation of the E-P interaction frequency (Figure 4B). Because of its higher intrinsic affinity for the condensate dissolution or tag removal machinery, Promoter 2 was still in the sublinear, low-transcription-response regime (Figure 4B). Further increases in E-P interaction would be sufficient to drive it into a high-transcribing state like Promoter 1 (due to the sigmoidal behavior explored above, Figure 3E–G). Quantitative differences in the simulation results largely reflect the particular parameter regime of the model (Figure 4B). Note that ‘low-transcribing state’ and ‘high-transcribing state’ refer to the qualitatively distinct behavioral modes of the system shown in Figure 4. It is this qualitative switch and the relative magnitude of the change that is of interest to us, not the numerical values. The total number of tags or molecules in the condensate between the low and high state may vary between different simulations without affecting the existence of a hypersensitive regime or the ability for a single parameter like tag removal to shift the system from the hypersensitive to the sublinear-low or sublinear-high regimes. These simulations illustrate that the apparent specificity (‘lock and key’) may be an illusion resulting from different tipping points among promoters, which arise from regulating accumulation/depletion of TF condensates or promoter-associated PTMs (Figure 4C). The promoter-specific response observed in these simulations follows from sigmoidal sensitivity to E-P contact frequency described above, which exist across a broad range of parameters.

Hysteresis in promoter activation leads to a unique response at different time scales

While experiments that perturb chromatin structure by deleting TAD borders have reported large (over 8-fold) changes in gene expression (Cavalheiro et al., 2021; McCord et al., 2020Figure 1—figure supplement 4), experiments that globally abrogated TADs by depleting cohesin observed few genes changing expression by more than 2-fold (Rao et al., 2014; Zuin et al., 2014Figure 5A, B). As these experiments differ by orders of magnitude in the time scale at which transcription changes are assessed (due to technical limitations), we wanted to explore the temporal behavior of the futile cycle promoters by assessing the effect of a TAD boundary deletion/fusion at an early time point relative to a late time point.

Figure 5 with 1 supplement see all
Transcriptional effects of contact perturbation depend on experiment time scale.

(A) Schematic of global cohesin loop disruption experiments. (B) Schematic of individual topologically associated domain (TAD) boundary deletion experiments. (C) Violin plots of transcription as a function of enhancer contact, as in Figure 4. Blue and red violin plots represent cell populations that start in the OFF (blue, zero-molecule condensate) or ON state (red, condensate of six molecules). Blue and red lines connect the median behavior from the corresponding simulations. (D) Same as in (C) but for the post-translational modification (PTM) model version (OFF = 0 tags, ON = 30 tags). (E) Median transcription rate for populations starting in ‘OFF’ or ‘ON’ states, observed at different times after initialization of the simulation as indicated by the color legend. (F) Simulation of futile cycle promoter transcription rate as a function of time under different enhancer perturbations for the general transcription factor (GTF) condensate model version. As described in the text, ‘early’ is the amount of time it takes for the promoter to switch off following enhancer deactivation, and late is an order of magnitude or longer later. (G) Same as in (F), but for the PTM model version.

We found that, in addition to exhibiting hypersensitivity, both the condensate and PTM model versions readily exhibited hysteresis. Hysteresis means that the system has memory such that its behavior depends not only on its current state but also the preceding states (Figure 5C, D). For example, the transition from the high-transcribing state to the low-transcribing state occurs at different E-P contact frequency depending on whether the system started in the high or low regime initially (Figure 5C, D). In the continuum limit, as shown with the PTM model version, even at t = infinity promoters that started in the high-tag (‘ON’) state transitioned to the low-tag (‘OFF’) state at a different value of e than promoters starting in the low-tag state transition to the high-tag state (Figure 5—figure supplement 1), making it clear this hysteresis is an emergent property of the dynamic feedback and not a trivial delay. In the discrete stochastic case, the memory of the starting state is not permanent. As there is always a finite probability of tag removal or condensate shrinkage, there is always a finite probability that n such loss events occur consecutively before a gain event, and in infinite time, such an event is guaranteed to happen, so no promoter will stay active forever. The length of the memory is determined by the difference in the average rate of gain vs. the average rate of loss in the respective high and low states. However, because of the hysteresis in the system, we observed qualitatively distinct sensitivities of the average transcription rate to small changes in E-P interaction at short time scales than we saw at long time scales (Figure 5E). At short time scales, the effect of a twofold change in contact frequency was minor across the whole range of E-P loop frequency compared to the change observed at long time scales in the hypersensitive regime (Figure 5E).

To explore this further, we examined the distribution of transcription rates in simulation for both the condensate and PTM model versions for promoters experiencing a twofold change in the E-P rate (simulating the effect of TAD loss by border deletion or cohesin depletion) (Figure 5F, G). We chose model parameters that positioned these promoters in the hypersensitive regime at long time scales (as in Figures 4 and 5; see Materials and methods). As the absolute kinetic rate constants for these model versions are not known empirically, the units of time in the simulation are arbitrary. We defined ‘early’ as the amount of simulation time required for a population of active promoters to transition into the non-transcribing state when the enhancer is deactivated (e.g., bound by repressors). Empirical measures of such time vary substantially among genes, but generally lie on the minutes-to-hours time scale. This scale corresponds with the time used in the cohesin-degron experiments and is distinct from the many-cell-generations/days required for testing genetic boundary deletions. We defined ‘late’ time points as one or more orders of magnitude longer than ‘early.’ With these definitions in mind, we examined the difference in simulated transcriptional response at different times. Both the condensate and the PTM simulations exhibited a weak (less than twofold) change in transcription in most of the simulated cells of the population by the early time point, even though this was sufficient for enhancer deactivation to lead most of the population to drop into the low expressing state (Figure 5F, G). As before, the parameter of interest is the relative change in transcription rate (tag concentration or condensate size), not the numerical value.

Transcription and E-P contact may be largely uncorrelated in single cells

Having examined the temporal dynamics at the population scale, we turned to the dynamic behaviors in single cells. We asked whether a promoter driven by contact-dependent enhancer activity and futile cycles, as described above, would be expected to show correlation between E-P looping and transcription at the single-cell level (Figure 6A). From our simulated cells, we computed the odds ratios for observing transcription in a window of time given observation of contact in that window of time, in either the condensate (Figure 6B) or PTM (Figure 6C) version of the model. Across the different regimes of E-P contact frequency, the odds ratios were statistically greater than 1, but only by a small margin of 0–30%, paralleling our recent empirical observation (Mateo et al., 2019Figure 6D). These simulations demonstrate that contact-dependent mechanisms of E-P regulation can be still consistent with empirical data in which many transcribing cells lack contacts and vice versa. Since the frequency of E-P interaction only tips the scales in the existing futile cycles of promoter modification, rather than acting as a triggering event for transcription, a strong correlation is not expected. The odds ratio still remains different from 1, however, as promoters that are experiencing a high frequency of E-P contact are more likely to transcribe, increasing the probability the two events co-occur.

Figure 6 with 1 supplement see all
Enhancer-promoter (E-P) contact events may exhibit minimal correlation to transcription events in single cells.

(A) Schematic of the textbook expectation of correlation between E-P contact and transcription. (B, C) Odds ratios for observing nascent transcription given E-P proximity, from simulations of the condensate and post-translational modification (PTM) model versions. (D) As in (B) and (C), but for actual data from recent super-resolution imaging experiments (Mateo et al., 2019). Error bars show standard error from bootstrapping. (E) Average measured distance between enhancer and promoter probes relative to the time of transcriptional bursts, reproduced from Figure 6E of Alexander et al., 2019. (F, G) Average E-P distance from simulations of both versions of the futile cycle model, where 0 represents contact and 1 represents no contact, relative to time transcription burst, modeled as a tag concentration > 20. Shaded region denotes ± standard error of the mean. (H, I) As in F and G, with sample size increased to 40-fold.

To further study the relation between E-P contact and transcription, we examined contact frequency in the stochastic models relative to the timing of transcription events. Such temporally ordered analysis requires live imaging data. Pioneering work by Alexander and colleagues observed no significant change in E-P distance across any time scale relative to detected transcription bursts, which was interpreted to refute a contact-dependent model of gene enhancer-mediated regulation (Alexander et al., 2019Figure 6E). For a moderate number of simulated bursts (~500), we also found no detectable increase in proximity at any time scales relative to that of the burst (Figure 6F, G). This is surprising at first since by construction transcription is dependent on contact. Because the transition also depends on promoter-intrinsic addition and removal events of tags or GTFs, which also occur stochastically with variable timing, the dependence was obscured by stochastic noise. Thus, simulations involving a similar number of total cells as experiments also lacked detectable correlation between the fraction of time spent transcribing and average E-P distance or contact rate for individual cells, same as observed in the experimental data (Figure 6—figure supplement 1A, B, D, E). Substantially increasing the number of simulated cells (~20,000) uncovered weak but statistically significant correlations in each of these cases (Figure 6H, I and Figure 6—figure supplement 1C, F), similar to the observations from the Drosophila data (Figure 6D). Similar results were observed with either the condensate or PTM version of the futile cycle. Thus, these models show that enhancers could regulate genes like Sox2 or the Bithorax genes in a purely contact-dependent manner, even without strong correlation between contact events and transcription at the single-cell level.

Discussion

Paradoxes and apparent contradictions arising from recent experimental results are reconciled by a minimal model of promoter activity

Here, we set out to understand one paradox: what molecular mechanisms, if any, could enable subtle changes in chromatin structure to lead to large changes in transcription? We developed a futile cycle model and described two possible biochemical realizations/versions that both exhibit hypersensitivity and hysteresis: TF condensates and PTM accumulation. These two distinct formulations of promoter behavior highlight the common features required for a promoter to become hypersensitive to minor changes in E-P contact frequency. Both versions propose the existence of a promoter that accumulates or loses a molecular species, the abundance of which correlates with transcriptional activity of the promoter. In both model versions, accumulation and loss proceed in futile cycles, whether or not the activating signals at the enhancer are present. For a broad range of parameter values, this futile competition renders the promoter sensitive to the activation state of its associated enhancer, and in some regimes hypersensitive to small changes in contact frequency with the active enhancer. While both TF condensates and PTM accumulation have been extensively studied empirically, their role in 3D transcriptional regulation by enhancers has received less attention in modeling, hithertofore. Using the two concrete examples, we showed that the general futile cycle model provides an unexpected explanation of three recent, controversial results. Here, we discuss these three controversies, identify which features of the model offer explanation of the controversy, and identify experiments which may test these features in the future.

Some promoters may be insensitive to contact perturbations, but still activate in a contact-dependent manner

The observation that many TADs harbor genes whose spatial-temporal patterns of expression differ substantially across development poses an obstacle to the hypothesis that 3D chromatin structures like TADs contribute to regulatory specificity of gene expression (Kvon et al., 2014; Schwarzer and Spitz, 2014; Symmons et al., 2014). For these patterns to differ, enhancers within the TAD must be able to regulate some promoters and not others even though all experience a similar frequency of E-P interaction. A simple and common explanation of these differences is to conclude that contact frequency differences do not generally play a role in E-P specificity (since two promoters seeing similar contacts respond the same). Similarly, many enhancers are known to ‘hop over’ intervening promoters (van Arensbergen et al., 2014)—activating expression of more distal genes without notably affecting the transcription of more proximal ones, such as Shh limb enhancer (Lettice et al., 2003) and several FGF8 limb enhancers (Marinić et al., 2013) or the snail shadow enhancer in flies (Perry et al., 2010). In most of these cases, the more distal gene experiences less frequent contact in available measurements. Such cases have been used to argue that chemical mechanisms determine E-P specificity, and as a corollary, that structural differences that affect E-P contact frequency are of little biological relevance (Ghavi-Helm et al., 2019; Kumar et al., 2020; Mir et al., 2019).

Our simulations indicate that such conclusions are premature. A futile cycle promoter exhibits not just hypersensitive response to contact interactions, but also substantial regimes of sublinear or insensitive response. Promoter-specific differences in the intrinsic addition or removal of tags (or the recruitment of GTFs to a promoter condensate) can render one promoter unresponsive to an enhancer that activates a second promoter, even though the two experience similar absolute interaction frequency. Indeed, the unresponsive one may even experience a higher interaction frequency (e.g., because it is physically closer). It is thus premature to interpret such results as evidence of biochemical specificity or to dismiss a role for chromatin structure. This is notable since transgene assays indicate most regulatory elements will promiscuously activate most promoters if placed close enough (Arnold et al., 2013; Furlong and Levine, 2018; Kvon, 2015; Kvon et al., 2014). We do not argue either that one should dismiss a role for chemical specificity—several previous experiments suggest that an enhancer that strongly activates one promoter may only weakly activate a different one, and that a second enhancer may have reversed preferences (Haberle et al., 2019; Hong and Cohen, 2021). However, we do propose that existing data still supports a model in which 3D structure is of critical importance for E-P interactions genome wide. If certain perturbations show no effect, it may be because the promoter is in the insensitive rather than hypersensitive regime.

The model also suggests further experiments to test this mechanism. In particular, promoters that exhibit hypersensitive responses to certain perturbations, such as increased contact upon insulator removal, should be relatively insensitive in other regimes, which could be tested by secondary mutations. For example, increasing the contact frequency between Ubx enhancers and the abd-A promoter following deletion of the border element, by removing some of the 30–60 kb that separates them, is predicted to have sublinear effects on promoter activity since the hypersensitive regime was already crossed following the initial deletion.

A more comprehensive test of this prediction of the models would be an experiment that measured gene expression from a reporter construct, while systematically varying the separation between an enhancer and promoter across a wide range of distances, from most proximal, until too far away to have any effect. Here, E-P contact frequency would be proportional to E-P genomic distance. The models predict that there is a threshold distance, below which the promoter is active and above which it is inactive. The response around this threshold is thus hypersensitive, and further changes at either side of the threshold are sublinear. This experiment is predicted to exhibit the sigmoidal response seen in the models: two sublinear regimes separated by a hypersensitive regime.

Such an experiment has recently been published with Drosophila enhancers, and results are consistent with the model prediction—decreasing E-P contact frequency by increasing E-P distance within a TAD produced a sharp drop in expression (Yokoshi et al., 2020). Similarly, manipulations of the Sox9 locus, which progressively increased the contact frequency between Sox9 enhancers and the adjacent Kcnj2 gene by deleting first the TAD border between them and then further shortening the distance, also exhibited just such a nonlinear response (Despang et al., 2019). Most notably, systematic variation of the distance between the Sox2 promoter and one of its enhancers, both inserted into an ectopic site, reveals a strikingly hypersensitive dependence of transcription on E-P contact frequency (Zuin et al., 2021).

Understanding global disruption of TADs

Examination of the time sensitivity of the model provided a possible explanation of a second set of controversial experiments. It was reported in 2017 that rapid removal of cohesin from the genome substantially altered global contact patterns, removing TADs and so-called ‘loops’ or corner points from contact maps genome wide (Rao et al., 2017). Surprisingly, this structural perturbation resulted in few transcriptional changes, with hardly any gene changing nascent transcription levels by more than a factor of 2. This apparently contradicted results in earlier and concurrent reports, in which disruption of single TAD boundaries resulted in substantial change in expression to the local genes. While it is important to note that the experiments were conducted in different cellular backgrounds, our modeling work demonstrates that a simple explanation may exist. Twofold or smaller differences in contact frequency in the model show significant hysteresis and will retain their initial transcription rate for many hours after such perturbation, but this memory is lost on longer time scales. Transcription was assayed only hours after the contact changes in the degron experiment, which may explain why little change was detected. In contrast, experiments that genetically deleted border elements assayed transcription days and many cell generations later, at which point some promoters saw major changes in expression.

Notably, TAD boundaries have been globally weakened or removed through a variety of alternative methods, including degradation of CTCF, deletion of the cohesin-associated factor NIPBL, protease cleavage of cohesin, and siRNA (Cuartero et al., 2018; Nora et al., 2017; Schwarzer et al., 2017; Stik et al., 2020; Zuin et al., 2014). A direct comparison between the studies is complicated by the variations in cell type and genetic background used. Yet, by and large, transcriptional assays performed at longer intervals after first detection of E-P contact disruption exhibit greater transcriptional changes, consistent with the explanation suggested by these simulations.

How can we connect modeling time scales with the experimental measurements? While too little is known about the detailed chemical kinetics in vivo to confidently extrapolate these results to absolute time scales, it is clear there is a fast regime and a slow regime, with moderate change looping rates associated with a slower time scale. Available in vivo data on recruiting epigenetic modifiers to promoters predominantly shows significant response on the time scale of hours (Braun et al., 2017; Hathaway et al., 2012; Ho et al., 1996), though subtle responses can be detected with stronger recruitment in as little as 15 min in some studies (Braun et al., 2017). If the promoter tags of our PTM model version are realized by such epigenetic mechanisms (discussed below), our results are consistent with the expectation that a decreased or increased contact frequency over only 6 hr will be insufficient to change the promoter state, while a substantially longer period such as the multiple days/generations between deletion and measurement can give the promoter enough time to change states. Less is known about condensate kinetics, whose very existence and reliable detection in vivo remain controversial. Nonetheless, studies reporting condensates of GTFs such as Med1, PolII, and Brd4 indicate a stability that generally exceeds the imaging time (Chong et al., 2018; Cho et al., 2018; Sabari et al., 2018).

Our model suggests experiments to test the hypothesis that promoters are hysteretic. Deletion of essential enhancers, such as the Sox2 control region, results in substantial decreases in gene expression, even though alterations to the interaction frequency do not. If expression were to be measured within 6 hr of enhancer deletion or inactivation, rather than many cell generations later, the model predicts the change would be much less dramatic. Though not a trivial experiment to execute, this could be attempted using CRISPR to induce the deletion in a population (or CRISPRi to silence the enhancer), followed by fixation within a few hours and combined single-molecule RNA FISH and DNA FISH. RNA FISH would quantify RNA expression per cell, and the DNA FISH would sort out which cells actually contained the deletion.

Transcription may be dependent on E-P contact even when the two are not correlated in single cells

It has been widely expected that active promoters should colocalize with their active enhancers, based largely on bulk population assays such 3C and H3K27ac Hi-ChIP (Bartman et al., 2016; Deng et al., 2012; Kagey et al., 2010; Mumbach et al., 2017, Mumbach et al., 2016; Rao et al., 2014; Williamson et al., 2016); for review see Furlong and Levine, 2018. Surprisingly, recent single-cell measurements found that cells with close E-P proximity were only slightly more likely to exhibit nascent transcription than those lacking contact (Finn and Misteli, 2019b; Mateo et al., 2019). It is possible the correlation between looping and expression is missed in these fixed cell assays because the loops are transient relative to the rate of transcription, such that loops dissociate by the time the nascent RNA transcript is detectable. However, recent live-cell imaging experiments suggest otherwise: live-cell tagging of chromatin proximal to the enhancer and promoter of Sox2 coupled with live-cell measurements of nascent RNA transcription uncovered no significant correlation between E-P proximity and transcription activity (Alexander et al., 2019). These data have supported speculation that enhancers must be able to influence promoter activity from a distance, for example, through bridging by large protein condensates (Alexander et al., 2019; Benabdallah et al., 2019; Heist et al., 2019).

The futile cycle model demonstrates that it is possible for transcription to be regulated by E-P contact and yet show little temporal correlation, even in live, single-cell time-lapse data. This is because individual E-P contact events have little impact on promoter state, contributing to only marginally more promoter tags or larger condensates. However, repeated interaction events, coupled with promoter-intrinsic feedback to accumulate tags, eventually lead to activation. This explanation can be directly tested with larger data sets—with sufficient observations, a weak correlation should be detectable (as seen in the fixed cell data), since cells experiencing a higher frequency of contact are more likely to have a high level of tag and thus more likely to transcribe, even though it is not a one-to-one dependency.

A simple test of this could be performed by collecting deeper single-cell live imaging data sets—a feasible though exceedingly laborious task given the available throughput of live imaging. An order of magnitude more measurements may allow detection of the expected weak temporal correlation. A yet more direct experiment would be to take single-cell measurements of the protein state of the promoter, with simultaneous measurement of transcriptional state—for example, monitoring the level of H3K4me3 or PolII. Of course, this experiment is exceedingly difficult with current technology, which lacks reliable detection of protein state at individual promoter loci, and is further complicated by the substantial number of candidates to test for promoter accumulation signals. While dynamic measurements would lead to stronger conclusions, even in fixed cell populations, correlating promoter state in terms of protein occupancy and abundance to the RNA expression state would be informative. While out of reach of current technologies, given the rapid improvements in experimental methods for enhanced sensitivity and versatility, it is imaginable that such direct measures will one day be possible.

Repression also supports hypersensitivity and hysteresis

For simplicity, the futile cycle model presented above assumed that the condensates or the tags that accumulated at the promoter are activating, such that transcription is proportional to tag level. However, it is equally plausible that futile cycles of repressive condensates or tags at the promoter account for hypersensitivity of some promoters. In this case, transcription can be modeled as inversely proportional to the concentration of the promoter tag; everything else about the analysis and its conclusions remains unchanged. The distal regulatory element that adds more of the tag should be termed a silencer rather than an enhancer. In response to increased contact with the silencer, the promoter would transition through a weakly sensitive ON regime, a hypersensitive bistable regime, to a weakly sensitive OFF regime. Alternatively, a model in which the enhancer removes the repressive tag from the promoter would support all three regimes.

Conclusion

We have described and analyzed a simple futile cycle model of enhancer-contact-dependent promoter activity and presented two concrete realizations for analysis. In this view, the promoter is not merely an intermediary in the regulation of transcription. It does not only relay the activity state of the enhancer into transcriptional activity of its target gene, nor function only as an amplifier of enhancer activity to set the levels of expression. Instead, through accumulation of local tags or GTFs in a condensate, the promoter is able to integrate its history of interaction with one or more regulatory elements and respond in striking nonlinear ways to changes in these signals. In addition to its core sequence, the accumulation of other factors at the promoter, be they histone modifications or transcription factors, provides the necessary mechanism for the dynamical system to exhibit memory.

This view of the promoter as an integrator of signaling by accumulation of molecules is not new and does not posit the existence of any unknown molecular mechanisms. However, we found that a minimal model that captures the key features of such promoters can exhibit a much more complex dependency on chromatin structure than previously acknowledged. Our modeling results offer a simple explanation to controversies of major importance to the chromatin structure and transcription community. These include (1) why TAD boundaries can be weak and why insulators only mildly affect contact frequency and yet be major regulators of transcription; (2) why many genes thought to be responsive to distal enhancers are insensitive to some structural changes; (3) why global disruption of structure and acute disruption of structure have been reported to have distinct effects on transcription; and (4) why physical distance between known distal enhancers and their cognate gene does not show obvious correlation with transcriptional state in single cells.

Materials and methods

Transcriptional framework and terminology

Request a detailed protocol

The terms enhancer and promoter have fluid definitions in the literature, with overlapping empirical criteria. For clarity, we will use the following simplified definitions. Regulated genes are genes activated by cell-type-specific combinations of transcription factors (TFs), which bind regulatory DNA sequence elements called enhancers. The enhancer loops to the promoter at some frequency determined by the genome structure. The promoter is a different regulatory DNA element that contains the transcription start site (TSS) and binding sites for general transcription factors (GTFs). As they bind the promoter, GTFs can influence transcription immediately, whereas TFs, which bind enhancers, only affect transcription when an E-P loop occurs. There is no loss of generality with these definitions—a TF-binding site that is close to the TSS, rather than being described as part of the ‘promoter,’ is, in this view, part of an enhancer with a high interaction frequency (one which is likely never rate-limiting). Many informative prior models of transcriptional regulation (Ben-Tabou de-Leon and Davidson, 2009; Gregor et al., 2007; Manu et al., 2009; Sanchez et al., 2011; Zinzen et al., 2006; Zinzen and Papatsenko, 2007) assume that the rate of transcription is simply proportional to TF occupancy. These models are consistent with the general definition we just described, with the added assumptions that E-P contact and GTF-promoter binding are fast relative to TF binding, and that transcription itself is proportional to GTF binding. This broader definition will help us see how genome structure would be expected to impact the behavior of these models.

Nonlinear mathematical frameworks

Request a detailed protocol

Two distinct mathematical frameworks have been used to model nonlinear transcriptional systems. Mass action kinetics models chemical entities using continuous variables representing concentrations. Interaction rates are quantified by on/off rate constants. The mathematical tools of ordinary differential equations (ODEs) can be used to study the behavior of systems under these assumptions (Strogatz, 2019). Alternatively, the CME models the chemical species as discrete molecules, which interact stochastically. The mathematical tools of probability theory, particularly continuous-time Markov chains, allow analysis of the behavior of such discrete systems (Durrett, 2012). Each chemical state of the system is enumerated, and transitions between states occur with a finite probability that depends only on the current state. The CME’s assumptions are less restrictive than the ODE approach, and it allows the study of variability in stochastic systems. In the limit where the number of discrete molecules is large, the master equation is accurately described by the corresponding continuous ODEs. Stochastic models of transcription have been instructive in understanding promoter bursting and how different chemical signaling pathways buffer or amplify stochastic variations, which in turn improves robustness or increases heterogeneity (Ycart and Peccoud 1995, Raj 2006, Sanchez et al., 2011, Boettiger 2013).

Solutions to the chemical master equation/Markov jump process

Request a detailed protocol

Consider a system with n states. Note n is not required to be finite. Let P(t) be a matrix where element pi,j(t) indicates the probability the system is in state i at time t given that it started in state j at time 0. The evolution of P(t) is given by the forward Kolmogorov equation, d/dt P(t= P(t)M, where M is the transition probability matrix, shown in each of the model figures (see Figures 24).

The stationary distribution of Pπ is given by the normalized kernel of M. In the two- and three-state models shown in Figure 2, we were interested only in the probability the system was in the fully occupied state, and thus inspect only the last element of π. In other models, like the condensate model, transcription was assumed to be proportional to the size of the condensate, and we must sum all the elements of π scaled by their corresponding transcription rate for the corresponding condensate size in order to determine the expected distribution of transcription.

It is also instructive to look at solutions not at steady state. This solution is straightforward for small matrices (see Durrett, 2012), but for large matrices is more efficiently explored by stochastic simulation of the underlying jump process as discussed below.

Stochastic simulation of the Markov jump process

Request a detailed protocol

Stochastic simulations were written in MATLAB 2020a. Executable source code to run these simulations is available here: https://github.com/BoettigerLab/model-transcription-looping (copy archived at swh:1:rev:8af1e7fcf1fa126908465558c4ef6d127000a9c8), Boettiger, 2021. A detailed table of parameters values and parameter descriptions is provided at the start of each script. The associated figure is indicated in the filename and in the header of the corresponding script for each simulation.

Why use stochastic simulations when analytic methods are available? While the matrix formalism can be used to solve analytically the time-dependent and stationary probability distributions for all the states of interest in both the PTM mechanism and condensate mechanism flavors of the model, we find the complexity of the expressions in full variable form does not easily facilitate interpretation. The analytic solutions in variable form require many pages just to write down, and as such do not facilitate the mental evaluation of limit behavior as demonstrated with simpler models. By contrast, the parametric approximation of the numerical results we find more intuitive.

In brief, these simulations model the chemical master equation in explicit time and discrete time steps, with discrete transition probabilities for the addition and removal of promoter tags. For simplicity, we equate promoter modification and transcription, reducing the total parameter space without loss of generality.

The number of time steps used in the simulation was explored systematically to benchmark their effects on model behavior (e.g., see Figure 5E). As can be seen from these simulations and inferred from the structure of the model, the system asymptotically approaches the stationary distribution, such that the difference in the distribution of states observed after 104 or after 105 units of simulation time is small. The simulations in Figure 5 explored also the more transient behaviors at shorter time scales, which were benchmarked against the time it took the simulation to respond to enhancer deactivation. The initial conditions were selected to have all promoters start in the OFF state with no tags or condensate molecules, unless indicated as a ‘start on’ simulation. In the latter case, all promoters were initialized with a condensate or tag size equal to that observed on average at stationarity. The specific number of time steps used in each simulation and the specific initialization conditions are recorded along with the corresponding rate constants used in the supplemental code provided.

Model assumptions and justification

The free concentration of high abundance molecular species, like nucleotides, soluble GTFs, or tags, is unchanged by their accumulation at the promoter

Request a detailed protocol

RNA production consumes ATP to produce multiple mRNA molecules from a DNA template and so these models do not explicitly conserve energy or mass. This does not violate any laws of physics; other processes in the cell, not explicitly described in the model, produce (and consume) species such as ATP in such a way that mass energy in the universe is still conserved. When these components are in sufficient excess and produced at a sufficient rate, they are reasonably approximated as constants.

This assumption can be removed with no observable change to the results, save added complexity in their computation. This follows as the relative effect of a minor depletion of the bulk population, from N to N-1, has an effect size of 1/N on the addition rate.

Transcription is proportional to condensate size/tag amount

Request a detailed protocol

To keep the parameter list minimal and emphasize the chemical features chiefly responsible for the hypersensitivity to E-P contact frequency, we assumed transcription is proportional to condensate size and tag amount. This assumption can be replaced with a variety of more complex, more stochastic models of transcription. As long as transcription is correlated with the accumulated amount of these factors, the hypersensitivity will remain. The degree of hypersensitivity will be lessened (or require averaging over many more promoters in order to emerge) if the correlation is weak.

Enhancers are already bound by TFs

Request a detailed protocol

It is straightforward to add a more complex model of the enhancer, one, for example, which cooperatively binds multiple TFs and thus whose occupancy depends on the concentration of the respective TF activators and repressors as in Zinzen 2006 and related works. One must decide which TF occupancy configurations are capable of influencing transcription via PTM or condensate recruitment. The addition then is a function not just of the loop frequency e, but rescaled by the probability the enhancer is in an active state(s), which is a function of the TF concentrations.

Condensates have a fixed maximum size while promoter PTMs do not

Request a detailed protocol

To keep the parameter space small in the condensate model version and facilitate a systematic scan over parameters, we modeled the system with a fixed maximum condensate size. A more realistic model would postulate that this value is a stochastic number, determined by the concentration at which the GTF forming the condensate passes below the solubility critical point and stops condensing. If the concentration is always below this threshold, no condensation will be observed. For systems with concentration above the critical point, condensates form, and compete to associate out the extra molecules. Should distinct liquid condensates meet in 3D space, they will fuse, though the frequency of such events will be dramatically reduced if the condensates are tethered on chromatin as postulated in our model, as the dense packaging of these long polymer fibers limits their diffusive mobility. In the tethered regime, condensate size is limited by the competition with other condensates for the limited pool of GTF molecules and the limited lifetime of the processes permissive to formation at a given site (such as the activity of the enhancer). Importantly, the soluble pool of GTFs remains large even once condensates have formed, and thus individual GTFs will still leave and fuse with existing condensates. The tethered condensate model is analogous to beads of dew condensing on a glass window as the air cools, forcing a fraction of water molecules out of the gas phase and into the liquid phase—the surface tension keeps the beads generally in place, so they do not all condense into a single droplet. The size of the beads is limited by the amount of water available above the critical dew point (which depends on the size of the temperature drop and how saturated the air was to begin with). Individual water molecules in the dew drops continually escape back into the gas phase, and individual molecules in the air continually join the dew drops, and are preferentially captured by the larger drops. Local variations in temperature of the substrate, much like local variations in the enhancer activity of different promoters in the model, will cause variations in the growth and dissolution rates of the different droplets.

No fixed saturation for the total amount of tag is assumed in the PTM model version. Since tag removal is proportional to current abundance, for all non-zero removal rates the amount of tag will always have a finite average. Since an individual promoter can always add one more tag though, the transition matrix is infinite. As the substrates for common PTMs (phosphates, methyl groups, acetyl groups) are in high abundance and under rapid turnover in the cell, the addition of a tag to the promoter is not expected to deplete the available pool and so impact the next modification.

Analytic inference of the parameter space of the PTM model

Request a detailed protocol

Standard dynamical systems theory (Strogatz, 2019) reveals the continuum limit version of the PTM mechanism has at most three real, critical points, one unstable and two stable, at high and low tag levels, respectively, arising from the intersection of a sigmoidal tag production curve, p(a) with a linear decay d(a) (Figure 3—figure supplement 1). Hypersensitivity in this model arises for parameter regimes in which a minor increase or decrease in e leads to the loss of either the upper or lower stable point in a saddle node bifurcation, causing the system to tip into the remaining stable point. The conditions for the existence and annihilation of these stable points can all be seen from a phase-portrait analysis of the shape of the curves p(a) and d(a). The effects of the individual rate constants on the shape of p(a) and d(a) can be seen from the analytical forms of these equations (Figure 3—figure supplement 1A, B). The amplitude of p(a) sets the maximal transcriptional difference between enhancer-induced and -uninduced states. The sensitivity and stability of the response to enhancer promoter contact is determined by the sensitivity of p(a). When p(a) is a gentle sigmoid, it is possible for a very small change in e to transition the system from a mono-stable off-state, through the bistable regime, into the monostable on-state (Figure 3—figure supplement 1C). In the corresponding stochastic regime, the difference in p(a) and d(a) for values of a left of the upper stable point determines the ultimate stability of the upper stable point. In this stochastic regime, when the difference between p(a) and d(a) is small, there is a non-zero probability of losing more tags than gained in a certain time window, even when p(a) is on average greater than d(a). Thus, hypersensitivity to e is reduced/lost as the Hill coefficient of p(a) approaches 1 or approaches infinity: moderate Hill coefficients in p(a) produce the most hypersensitivity to e (Figure 3—figure supplement 1C–E). Similarly, if d(a) becomes too large or too small, it also limits the sensitivity, as can be seen schematically by its effect on the position of stable points (Figure 3—figure supplement 1F).

Derivation of the continuum limit equations

The ODE model (Figure 3—figure supplement 1A) was derived from the discrete chemical master equation using mass-action kinetics as described below. The stimulated enzyme example presented below is only one implementation of the PTM version, but used here because it is the simplest and most intuitive system that recapitulates the desired quantitative behavior.

Model parameter definitions

Request a detailed protocol

Consider a promoter-bound enzyme that attaches molecular tags to a promoter at some basal rate r0. This enzyme can also associate with n promoter-bound tags up to some number nmax, which changes the tagging rate to rn. The enzyme-tag association rates kn are dependent on the current association state (e.g., cooperativity), while dissociation is assumed to be constant (for now, but can be generalized later). 'Free' tags not associated with the enzyme but still promoter-bound are removed from the promoter with a rate constant g.

Reaction and rate equations

Request a detailed protocol

For illustrative purposes, we will start with the nmax=2 case (later, we will show this is sufficient to capture nonlinear behavior of interest). Then, the enzyme has three 'association states' (n = 0, 1, or 2):

Stimulation of enzyme activity by tag association:

  • E+aEa with rate k1 forward, τ1 backward

  • Ea+aEa2 with rate k2 forward, τ2 backward

where E is enzyme and a is tag. In general, there may be up to n different stimulation states:

  • Ean+aEan+1 with rate kn rightward, τn leftward

Tagging/detagging:

  • EE+a with rate r0

  • EaEa+a with rate r1

  • Ea2Ea2+a with rate r2

  • a0 with rate g

In general:

  • EanEan+a with rate rn

Note: The enzyme adds a tag to the vicinity of the promoter by reacting with abundant ambient substrate or coenzyme (e.g., Acetyl-CoA). As described above, the abundance of this substrate/coenzyme is assumed to be sufficiently large that its addition to the promoter does not appreciably change the total abundance, enabling us to reduce the number of necessary parameters by accounting for this constant concentration in the production rates rn. In other words, a appears to be created ‘from nothing,’ but the requisite material pool actually exists on both sides of each reaction equation and is simply omitted for readability.

Now we can write out the rate equations:

(1) d[E]dt=k1[E][a]+τ[Ea]
(2) d[Ea]dt=k1[E][a]-τ[Ea]-k2[Ea][a]+τ[Ea2]
(3) d[Ea2]dt=k2[Ea][a]-τ[Ea2]
(4) d[a]dt=r0[E]+r1[Ea]+r2[Ea2]-g[a]+e

where e is the tagging contribution from E-P contact. Assuming the total enzyme number is constant,

(5) [Etot]=[E]+[Ea]+[Ea2]

Our goal is to find d[a]dt as a function of [a] and constants, and to study the steady-state behavior of the system, that is, finding the solutions for [a] when Equations (1)–(4) are all set to zero. Solving (1) and (3), then substituting [Ea] and [Ea2] into (4) and (5):

(6) d[a]dt=r0[E]+r1k1τ[E][a]+r2k1k2τ2[E][a]2-g[a]+e
[Etot]=[E]+k1τ[E][a]+k1k2τ2[E][a]2
(7) [E]=[Etot]1+k1τ[a]+k1k2τ2[a]2

Combining (6) and (7):

d[a]dt=[Etot](r0+r1k1τ[a]+r2k1k2τ2[a]21+k1τ[a]+k1k2τ2[a]2)g[a]+e
(8) =[Etot](r0τ2k1k2+r1τk2[a]+r2[a]2τ2k1k2+τk2[a]+[a]2)g[a]+Ξ

The terms with rn represent enzyme tagging, and the term with g is detagging. This nonlinear system with n = 2 can be solved explicitly for the steady-state values of a, though the solutions are less compact than the final equation, and are most easily interpreted in the phase portraits of Equation (6). Analytic steady-state solutions were calculated in MATLAB using the symbolic algebra package. Scripts to reproduce these calculations are available on our GitHub site at: https://github.com/BoettigerLab/model-transcription-looping.

Insulation score calculations

Request a detailed protocol

Several variations of this metric exist, but we define it as follows: first, the contact frequency map is normalized relative to linear genomic distance by dividing each off-diagonal row by the sum of the row. This improves the interpretability of the measure as the inter-domain triangle contains more interactions further from the diagonal than the intra-domain ones (Figure 1—figure supplement 3). We then scan the whole genome, computing the average, normalized interaction frequency within each of the three triangular regions shown in Figure 1—figure supplement 3: upstream (intra-domain U), downstream, intra-domain D, and between inter-window (domain I) of query point. We report the ratio of intra-domain to inter-domain interactions for this sliding window as the insulation score for that genomic position, as indicated in Figure 1—figure supplement 3. This metric can be computed for a range of sliding window sizes. As long as the window size is smaller than the typical TAD and large enough not to be sensitive to measurement-error fluctuations associated with over-binning the data, the distribution of insulation scores does not depend strongly on bin size. A comparison of several different computational algorithms for identifying TADs showed greatest agreement across replicates and robustness to data normalization using this metric (Gong et al., 2018; Lazaris et al., 2017).

Balanced, Hi-C matrices from Rao et al., 2014 were downloaded and exported using Juicebox (Durand et al., 2016) for manipulation in MATLAB for computing insulation scores.

Data availability

This is a theoretical and computational paper using published experimental data.

The following previously published data sets were used
    1. Rao SSP
    2. Huntley MH
    3. Durand NC
    4. Stamenova EK
    5. Bochkov ID
    6. Robinson JT
    7. Sanborn AL
    8. Machol I
    9. Omer AD
    10. Lander ES
    11. Aiden EL
    (2014) NCBI Gene Expression Omnibus
    ID GSE63525. A three-dimensional map of the human genome at kilobase resolution reveals prinicples of chromatin looping.

References

  1. Book
    1. Alberts B
    2. Johnson A
    3. Lewis J
    4. Morgan D
    5. Raff M
    6. Keith Roberts PW O
    (2018)
    Molecular Biology of the Cell
    W. W. Norton & Company.

Decision letter

  1. Naama Barkai
    Senior Editor; Weizmann Institute of Science, Israel
  2. Job Dekker
    Reviewing Editor; University of Massachusetts Medical School, United States
  3. Katie S Pollard
    Reviewer; Gladstone Institutes, United States
  4. Job Dekker
    Reviewer; University of Massachusetts Medical School, United States

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

The work describes a simple theoretical model for enhancer action that explains several major controversies in the field of long-range gene regulation and the role of topologically associating domains and insulating boundaries in modulating enhancer-promoter interactions. Further, the model makes predictions that be experimentally tested. This is valuable for the field of gene regulation.

Decision letter after peer review:

Thank you for submitting your article "How subtle changes in 3D structure can create large changes in transcription" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, including Job Dekker as the Reviewing Editor and Reviewer #3, and the evaluation has been overseen by Naama Barkai as the Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Katie S. Pollard (Reviewer #2).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

As the editors have judged that your manuscript is of interest, but as described below that additional experiments are required before it is published, we would like to draw your attention to changes in our revision policy that we have made in response to COVID-19 (https://elifesciences.org/articles/57162). First, because many researchers have temporarily lost access to the labs, we will give authors as much time as they need to submit revised manuscripts. We are also offering, if you choose, to post the manuscript to bioRxiv (if it is not already there) along with this decision letter and a formal designation that the manuscript is "in revision at eLife". Please let us know if you would like to pursue this option. (If your work is more suitable for medRxiv, you will need to post the preprint yourself, as the mechanisms for us to do so are still in development.)

Summary:

The work describes a simple theoretical model for enhancer action that explains several major controversies in the field of long-range gene regulation and the role of topologically associating domains and insulating boundaries in modulating enhancer-promoter interactions. Further, the model makes predictions that can be experimentally tested. This is valuable for the field of gene regulation.

Essential revisions:

All three reviewers agreed the work addresses a major question in the field but also raised important issues that should be addressed. These primarily revolve around how the model is described, and the extent to which the full parameter space is explored. Further, the authors have not explored alternative models which would help clarify how surprising and robust their results are. There is also an opportunity to emphasize that the proposed model is not necessarily absolutely correct, but one of many plausible models that can produce a non-linear relationship between genome structure (enhancer-promoter contact) and transcription. Any thoughts on other models that could generate similar dynamics would be a useful discussion point.

1. The presentation of the model is unclear. It is currently present in the text, lines 110-122, in pure qualitative description. Authors define only rates in the text; definitions of other model parameters are not present. For example, E and a are not specifically defined in the text or Methods section. Since both terms "enzyme" and "enhancer" are being used and in fact "enzyme tagging" and "enhancer tagging" occur simultaneously in the model, it is not possible to say for sure when do authors call which one in the model and thus the methods section can be interpreted in different ways. Moreover, the cartoon is missing a legend confirming, which molecular player is which. The figure caption mentions only green triangles being the tags, but no other parts of the cartoon are being explained. Taken together, this makes it very difficult to verify the mechanics of the model.

– Given its centrality to the manuscript, we recommend describing the overall strategy in more detail in Results. For example, at line 124 (Pg. 4) the authors could talk about how the simulations are done, including where the variability comes from (e.g., random starting conditions vs. probabilistic events vs. different parameters).

– – The authors should provide a detailed technical description of their model directly in the text, including description of their parameters, list their constitutive equations and identify all parameters in their cartoon Figure 1C.

– Axes labels in all figures should be expressed in the parameters/variables of the model (as in Figure 6C-D) directly connecting to inputs/outputs of the model.

2. In the Methods section, it appears that in lines 577-580 of the model description, the mass is not conserved. This issue is critical to address, because when mass is not conserved the model can not be physical.

3. Xiao et al. make several key assumptions to dramatically simplify their model. Namely, it is assumed that promoter modification and transcription are equivalent and that enhancer-promoter contact influences transcription instead of transcription influencing structure. Steady-state equilibrium must also be assumed. It would be helpful if the authors explicitly stated these assumptions and provided references to support their being reasonable.

4. For clarity, it would be helpful to discuss model parameters in greater detail. First, we suggest noting which parameters shift the location of the curve and which increase the steepness of the curve. Second, we recommend including a phase diagram exploring when sigmoidal behavior and any other key model predictions arise across parameter space. In what circumstances does hypersensitivity or time lag emerge? The authors demonstrate that a narrow set of parameters is sufficient to produce a super-linear relationship between enhancer-promoter contact and transcription in Figure 6. One potential dilemma is this model's ability to explain many experimental observations by indicating that minimal changes all occur in the sub-linear regime while observable changes occur in the super-linear regime. Given that one needs specific parameters to replicate an example of the hyper-linear regime (including at least three degrees of stimulation and increasing stimulation of the successive states), it could be valuable to demonstrate how large the plausible parameter space is. Without an exhaustive search across the space of minimal parameters, it is not clear when this property emerges or how common it is within the full parameter space. The authors could vary model parameters and plot a grid visualizing behavior (e.g., steepness of the curve or Hill coefficient).

5. The authors observe hysteresis in median transcription rate as a function of enhancer contact frequency. However, the presented violin plots suggest a presence of two states, one with low and one with high transcription rates. In the intermediate regime of enhancer contact frequency, where authors report hysteresis, the violin plots show bimodal distributions suggesting coexistence of these two states. This would suggest that the system exists in and switches between two distinct states with a discontinuous transition, instead of a continuous hysteretic behavior as suggested by the median behavior.

6. There is also an opportunity to emphasize that the proposed model is not necessarily absolutely correct, but one of many plausible models that can produce a non-linear relationship between genome structure (enhancer-promoter contact) and transcription. Any thoughts on other models that could generate similar dynamics would be a useful discussion point. There are parallels to both sigmoidal dose-response curves, where drug concentration is plotted against response, and transcription factor binding curves, where free ligand concentration is plotted against the fraction bound. We recommend providing background context on these types of models or the Hill equation to illustrate why non-linear behavior is or is not surprising given the proposed model.

7. It is not totally clear why the authors decide to call their proposed approach the futile cycle model. There are similarities to other well-known models in biochemistry and biophysics that should be noted. It might make sense to simply call this a mechanistic model of cooperative promoter activation. If the authors stick with "futile cycle", the relationship between promoter activation through tags and metabolic signaling should be described in more detail.

8. At the beginning of the Discussion section authors state they will propose future experiments in each section. However, in some of the sections it is not clear what specifically authors are proposing. These suggestions should be made clearer.

9. If I understand the model correctly, the non-linearity arises because of the increase rate of tag addition when tag is already present. The authors then speculate histone modifications can be one such tag. However, there is only so many sites of modification at a promoter. Can the authors analyze how the possible range of tag densities affects performance of the model? Is the range required biologically plausible?

10. Can the authors do more analysis to explore how rapid changes in gene expression may occur (e.g. upon signaling a gene may go up within minutes)? How much more frequent does the E-P interaction need to be for rapid switch to the active promoter state? Can the authors do an analysis where they change the rates of the futile cycle upon some signal: at what time scale does transcription then change (keeping E-P frequency the same)?

11. Due to the lack of description, in many sections it is not clear what are the specific inputs and outputs of the model (e.g. Figure 2).

12. The Methods section describes the chemical kinetics of the suggested reactions and the insulation score calculations. But it is not clear how do these inform each other, how are contact-frequency maps chosen/computed and cross-referenced with the local E-P kinetics?

13. In 587-588, the index of k is 2(n+1), which equals to 2n+2, but then in the next line the following assumption is made 2n+1 → n+1.

14. The authors make assumptions that their kinetic considerations hold for n>2. What is the evidence?

15. The language of the paper is often not technically precise with qualifiers missing, which could lead to ambiguities and misinterpretations. Here are some examples:

p. 1, line 10, "difference in contact across TAD borders is usually less than twofold";

p. 1, line 17, "results from recent cohesion disruption";

p. 2, line 71, "A simple model of hypersensitivity to changes in contact frequency".

16. On p. 13, line 483, authors define Ostwald ripening as given by weak multivalent interactions; however, Ostwald ripening is a thermodynamic process. In addition, they propose that liquid condensates become larger due to Ostwald ripening, but there are also other processes that may occur, such as coalescence of condensates, which would also lead to larger condensates.

17. An alternative explanation for TAD-specific enhancer action is that an E-P interaction within a TAD (between two convergent CTCF sites), one that is brought about by extruding cohesin, is not equivalent to an interaction that occurs between two loci on either side of a CTCF site and that can be a random collision that is not mediated by extruding cohesin. In other words, two interactions can be of the same frequency but can be of a very different molecular nature. I agree that this model would not explain the results of the experiment where cohesin is acutely removed.

18. In the beginning of the introduction the authors introduce TADS. I recommend that the authors present this in a more nuanced way: compartment domains also appear as boxes along the diagonal, an issue that has led some in the chromosome folding field to be confused. This reviewer believes TADS are those domains that strictly depend on cohesin mediated loop extrusion, whereas compartment domains are not. If the authors agree, perhaps they can rewrite this section?

https://doi.org/10.7554/eLife.64320.sa1

Author response

Essential revisions:

All three reviewers agreed the work addresses a major question in the field but also raised important issues that should be addressed. These primarily revolve around how the model is described, and the extent to which the full parameter space is explored. Further, the authors have not explored alternative models which would help clarify how surprising and robust their results are. There is also an opportunity to emphasize that the proposed model is not necessarily absolutely correct, but one of many plausible models that can produce a non-linear relationship between genome structure (enhancer-promoter contact) and transcription. Any thoughts on other models that could generate similar dynamics would be a useful discussion point.

We thank the reviewers for these very helpful comments. We provide line-by-line responses below.

We have substantially rewritten the manuscript to address these concerns. We added a new section, “Modeling nonlinear transcriptional response”, (pages 6-11), that now provides a detailed background on the modeling approach used, including a discussion situating this work relative to prior work in the field. This section includes 1 new figure (Figure 2) and 1 new supplemental Figure (Figure 2 supplement 1). The section on the futile cycle model, “Promoter Futile Cycles and Hypersensitive Response”, (pages 11-15) provides a detailed description of the futile cycle model, also with detailed new figures (Figure 3 and Figure 3 supplement 1). To help the reader appreciate which features of the model are general while still providing a picture of what specific molecules and biochemical mechanisms might execute these general behaviours, we now present two distinct versions of the Futile Cycle model. The “PTM” version is based on the accumulation of Post-Translational Modifications at the promoter. This is the model which appeared in the first submission, though it is now presented in much more detail. The second “Condensate” version is based on the accumulation of GTF condensates at the promoter. The stochastic transition matrices for Kolomogorov’s Forward Equation are written explicitly in the figure to remove any ambiguity about the mathematical approach used, remove uncertainty about which transition probabilities link which states, and clarify what the states are in the model. This equation and some basic concepts of Markov chains and their application in the stochastic Chemical Master Equation are all introduced in the revised background. To facilitate interpretation for the general reader, we introduced cartoon schematics in addition to these stochastic matrices, which summarize the same states and transition probabilities in pictorial form instead of equations.

By describing two biochemically distinct versions (condensate formation/dissolution and PTM accumulation/turnover) and highlighting their common features which give rise to hypersensitivity, hysteresis, and ability to reproduce recent seemingly-contradictory findings, we hope to address the question of the Futile Cycle model’s generality. We hope you find this revised presentation to strike a balance between abstract generality and molecular reality (e.g. in place of abstract promoter states 1..n, we consider condensates of TFs and PTMs).

1. The presentation of the model is unclear. It is currently present in the text, lines 110-122, in pure qualitative description. Authors define only rates in the text; definitions of other model parameters are not present. For example, E and a are not specifically defined in the text or Methods section. Since both terms "enzyme" and "enhancer" are being used and in fact "enzyme tagging" and "enhancer tagging" occur simultaneously in the model, it is not possible to say for sure when do authors call which one in the model and thus the methods section can be interpreted in different ways. Moreover, the cartoon is missing a legend confirming, which molecular player is which. The figure caption mentions only green triangles being the tags, but no other parts of the cartoon are being explained. Taken together, this makes it very difficult to verify the mechanics of the model.

We agree the presentation of the model was unclear and has been completely rewritten, see pages 11-15 for the new text, and new Figure 3.

We now describe essential qualitative features of the model, and propose two different chemical mechanisms which exhibit those features. One chemical mechanism is based on the accumulation of post-translational modifications, which we renamed “PTM model version” from “Futile cycle” model in the first submission. The other model version requires no enzymes but assumes instead that TFs can form condensates at promoters (the “Condensate model version”). The Chemical Master Equation (CME) for each model is now shown explicitly in the new Figure 3. For simplicity we solve the CME by numerical simulation.

To aid in general readers unfamiliar with stochastic systems, Markov processes, or the Chemical Master Equation, we have provided a brief introduction to these systems as well (see Figure 2). We use this stochastic approach to briefly re-derive classic results on transcriptional hypersensitivity to cellular signals, and contrast that case to the focus of our work, hypersensitivity to changes in 3D structure. See section “Modeling nonlinear transcriptional response” on pages 5-11.

In addition to these substantial additions to the main text as suggested, we have substantially expanded the methods section to provide further technical details about the analytic methods, simulations, and parameter values. We also highlight two reference texts that provide the curious reader an excellent introduction to mathematical frameworks used in this study, continuous time Markov processes (Rick Durrett: Essentials of Stochastic Processes (Durrett 2012)) and Nonlinear Dynamical Systems (Strogatz 2018). The simulation parameters and model code are provided on our Github page, and we have fixed the broken link in the text (which was previously broken—a sorry oversight on our part). This includes new scripts with symbolic mathematical analyses of several of the Markov processes described in the text, along with simulations of the new condensate version and updated simulations of the PTM version of the model.

We hope this revised presentation is much clearer about the chemical processes postulated, and the mathematical approaches used in the analysis.

– Given its centrality to the manuscript, we recommend describing the overall strategy in more detail in Results. For example, at line 124 (Pg. 4) the authors could talk about how the simulations are done, including where the variability comes from (e.g., random starting conditions vs. probabilistic events vs. different parameters).

We thank the reviewers for this suggestion and have rewritten the first half of the manuscript to introduce the overall strategy in detail. Please see pages 5-15. It will now be clear that the stochasticity arises from our use of a Chemical Master Equation approach—modeling the molecular interactions as a continuous-time Markov process. As described in our introduction to the CME for non-specialists (see pages 7-8), stochasticity arises from a probabilistic treatment of molecular interactions. To aid in building an intuitive understanding of the model’s general behavior, we also make use of the continuum limit, in which the CME is reasonably approximated by a system of ODEs, allowing us to use tools from the analysis of deterministic dynamical systems. While the primary conclusions of the paper rest on the stochastic simulations, we found these limiting cases to be useful in building intuition, and understanding and predicting the model behavior. In the hopes that it will be of interest to the mathematically inclined reader, we include the detailed analytical work in the supplement. The revised text clearly flags where continuum theory is being discussed and where dynamic CMEs are being used.

We have also substantially expanded the methods, adding an explicit section on the assumptions, including the selection of starting conditions and their impact on model behavior.

– The authors should provide a detailed technical description of their model directly in the text, including description of their parameters, list their constitutive equations and identify all parameters in their cartoon Figure 1C.

This has been added (see above), see also equations added in main figures.

– Axes labels in all figures should be expressed in the parameters/variables of the model (as in Figure 6C-D) directly connecting to inputs/outputs of the model.

This has been done, and the relation between the model parameters (tag count or condensate size) have been explicitly added to the figures as well.

2. In the Methods section, it appears that in lines 577-580 of the model description, the mass is not conserved. This issue is critical to address, because when mass is not conserved the model cannot be physical.

See Equations 4 and 5. This ODE version of the model is used only for building intuition, as should now be clear.

The revised presentation of the model should now make it clear where simplifying assumptions arise (see next reply). Regarding the reservoir assumption (when some molecular species are in such excess relative to their levels at the promoter that gaining a molecule may be safely assumed to have a negligible change on the available pool), we assumed that the substrate material for the tag (acetyl groups, methyl groups and/or phosphate groups) is in considerable excess in the cell, and under continual production and destruction by a myriad of other cellular processes, such that the addition of one acetyl group to one promoter does not significantly deplete the pool in such a way as to make a measurable impact on the probability that a second acetyl be added.

3. Xiao et al. make several key assumptions to dramatically simplify their model. Namely, it is assumed that promoter modification and transcription are equivalent and that enhancer-promoter contact influences transcription instead of transcription influencing structure. Steady-state equilibrium must also be assumed. It would be helpful if the authors explicitly stated these assumptions and provided references to support their being reasonable.

The necessary assumptions have been more clearly stated when introduced in the main text.

Additionally, we have added a section to the Methods to recap all the assumptions in one place.

In particular we specify:

a) Tags are in excess such that addition of one tag does not affect the probability of the addition of the next in the PTM model. As acetyl/methyl/phosphate groups are in rapid turnover in 100s of cellular processes that occur on the timescales of transcriptional regulation and are available in uM concentrations in cells, this assumption appears reasonable. See

[https://bionumbers.hms.harvard.edu/bionumber.aspx?id=101259&ver=3&trm=acetyl+CoA&org=].

b) In the ‘condensate’ version of the model we assume that TFs such as PolII and Mediator, which make up the condensate model, are in sufficient excess such that a single molecule joining a promoter condensate (of up to a max of <30 total molecules) does not affect the probability of binding of the next. As the typical cells contain thousands to tens of thousands of such molecules, this assumption seems reasonable.

c) We assume transcription rates are linearly proportional to condensate size or tag number with a proportionality of unity. This assumption simplifies the model in such a way it is easier to understand, and can be trivially relaxed. As we are only interested in relative transcription changes, any extension of the model to assume a deterministic number of transcripts per tag per unit time immediately cancels out in the relative transcription change (and would trivially rescale the y-axis in our graphs without changing the qualitative behavior). A more realistic model may assume some stochastic probability of transcription initiation (or burst of initiation) as a function of tag number / condensate size. If the stochasticity in this process is very large, such that transcription is all-but-random and uncorrelated to tag or condensate size, then regimes of the model which predicted hypersensitive transcriptional responses to structure change will no longer show that the condensate size is still hypersensitive, and the transcriptional response is all-but random (by construction). But as long as the two are correlated (or anti-correlated) the key qualitative behavior will persist. (This result is mathematically trivial—a sigmoid with random noise added is still sigmoidal, even if averaging is required to make the form visible).

d) The revised presentation should make it clear that steady state is not assumed. We in fact explore the qualitative behavior of the model across different time scales, including the very instructive stationary states of the Markov process and the stable states of the corresponding ODEs in the continuum limit. Relevant time scales become an essential feature of this analysis, which we have elaborated on in the manuscript to describe how the time points are selected (see page 19).

e) Initial conditions are now specified in the Methods as follows (see also the source code):

Unless otherwise indicated, the promoters were all started in the OFF state, defined as 0 tags (PTM version) or 0 proteins in a condensate (condensate version). In several cases, clearly indicated in the text, we explored the memory effects at moderate time scales by starting the system in the high/ON state. In this case we used an initial condition in which all promoters started at the average tag-level at the maximum enhancer-promoter frequency examined (typically 40 tags) or the maximum condensate size (typically 6 to 20). These details have been added to the Methods, in addition to residing in the simulation code itself (for which the links should now be fixed!).

4. For clarity, it would be helpful to discuss model parameters in greater detail. First, we suggest noting which parameters shift the location of the curve and which increase the steepness of the curve. Second, we recommend including a phase diagram exploring when sigmoidal behavior and any other key model predictions arise across parameter space. In what circumstances does hypersensitivity or time lag emerge? The authors demonstrate that a narrow set of parameters is sufficient to produce a super-linear relationship between enhancer-promoter contact and transcription in Figure 6. One potential dilemma is this model's ability to explain many experimental observations by indicating that minimal changes all occur in the sub-linear regime while observable changes occur in the super-linear regime. Given that one needs specific parameters to replicate an example of the hyper-linear regime (including at least three degrees of stimulation and increasing stimulation of the successive states), it could be valuable to demonstrate how large the plausible parameter space is. Without an exhaustive search across the space of minimal parameters, it is not clear when this property emerges or how common it is within the full parameter space. The authors could vary model parameters and plot a grid visualizing behavior (e.g., steepness of the curve or Hill coefficient).

We have added the parameter sweeps as requested (see Figure 3E-G, and new Figure 3 supplement 1 and 7). We have characterized these with respect to the parameters of the Hill function (see Figure 2 supplement 1).

Our explorations identify a set of minimal conditions (parameters) that support hypersensitivity.

These are now described individually in the text:

a) Something must accumulate at the promoter.

b) Accumulation must be opposed by a removal process.

c) Accumulation must have positive feedback in such a way that promoters with more of the accumulating entity are more likely to get more than those that have little or none, up to some maximum value.

These high-level properties may be realized by a variety of molecular processes. Depending on the explicit model, they may be affected by a combination of kinetic parameters or by single parameters (see Figure 3, Figure 3 supplement 1, and Figure 5 supplement 1). We find these 3 conditions to be much more intuitive and accessible to the reader than writing k1 < tau1 and k2 > tau2 and k2 > k1, especially given that the model can be recast as a GTF condensate in which these three general principles still hold, even though realized through a different set of chemical states with distinct state-transition probabilities that are unrelated to enzyme kinetics (see Figure 3).

5. The authors observe hysteresis in median transcription rate as a function of enhancer contact frequency. However, the presented violin plots suggest a presence of two states, one with low and one with high transcription rates. In the intermediate regime of enhancer contact frequency, where authors report hysteresis, the violin plots show bimodal distributions suggesting coexistence of these two states. This would suggest that the system exists in and switches between two distinct states with a discontinuous transition, instead of a continuous hysteretic behavior as suggested by the median behavior.

The reviewer’s understanding of the model is largely correct—the system isbistable and the violin plots reveal the bistability. This bistability produces hysteresis, as can be seen not only in the medians (Figure 5C-E) but also in the newly added plot overlaying the violins (Figure 3D and J, Figure 5C-D). In addition to showing the violin plots, we have overlaid graphs showing the averages and/or median transcription rate across the experiment in many of our figures, as these population level quantities are the ones measured in many bulk experiments.

As should be clearer in our revised text, and from the responses above, the model is a discrete stochastic system (a CME or equivalently, a continuous-time Markov process). It proceeds in discrete jumps (e.g. from 0 tags to 1 tag, even though the mean tag count can be 0.1, 0.3 individual promoters only have integer tag values). There are no discontinuous jumps: the model, by construction, only gains one tag at a time (or one protein at a time) and similarly only loses one protein at a time. It is an emergent property of the model that a promoter that has lost beyond a critical number of tags is rather more likely to lose the rest (one-by-one) than to gain any back, and a promoter that has over the critical number is more likely to gain another. This dynamic behavior is most easily understood by considering the continuum limit of the model, which allows one to use the stability analysis for ODE models common in deterministic systems (see Strogatz: Dynamical Systems, Chapter 1-2).

6. There is also an opportunity to emphasize that the proposed model is not necessarily absolutely correct, but one of many plausible models that can produce a non-linear relationship between genome structure (enhancer-promoter contact) and transcription. Any thoughts on other models that could generate similar dynamics would be a useful discussion point. There are parallels to both sigmoidal dose-response curves, where drug concentration is plotted against response, and transcription factor binding curves, where free ligand concentration is plotted against the fraction bound. We recommend providing background context on these types of models or the Hill equation to illustrate why non-linear behavior is or is not surprising given the proposed model.

We have now added a substantial discussion of prior work on hypersensitive transcriptional responses, including the classic model of transcription-factor cooperativity which is one mechanism to create a hypersensitive response to chemical signals (like morphogens or drug dose-response), pages 5-11. This helpful addition provides an opportunity to highlight both the parallels and differences between these models and those which we present subsequently, improving the clarity of the presentation. Please see the new Figure 2 and the associated text on pages 8-11.

As the reviewer points out, the ‘Futile Cycle model’ which we renamed ‘PTM model version’ is not the only plausible model. We clarified this in text and provide another example with the ‘Condensate model version’ of how hypersensitivity to E-P contact can be achieved.

See also response 1 and 4.

7. It is not totally clear why the authors decide to call their proposed approach the futile cycle model. There are similarities to other well-known models in biochemistry and biophysics that should be noted. It might make sense to simply call this a mechanistic model of cooperative promoter activation. If the authors stick with "futile cycle", the relationship between promoter activation through tags and metabolic signaling should be described in more detail.

The new section, “Promoter Futile Cycles and Hypersensitive Response”, introduces the model and concept of Futile Cycle. Moreover, the presentation of two completely different versions of the model, one which involves addition/removal of PTMs and the other formation/dissolution of GTF condensates we hope will also improve the clarity for why we chose this name.

We fear that “cooperative promoter activation” would be confused with the cooperative binding of transcription factors, which explains hypersensitivity to changes in transcription factor concentration but not hypersensitivity to structural change affecting E-P contact frequency. Based on these comments, and related feedback from colleagues, we have added an additional section to explain why cooperativity among enhancer-contacts is indeed an alternative mechanism to achieve sensitivity to structural change but not consistent with published data. In contrast the models we describe are consistent with published data. Please see Figure 2E-G and pages 10-11.

8. At the beginning of the Discussion section authors state they will propose future experiments in each section. However, in some of the sections it is not clear what specifically authors are proposing. These suggestions should be made clearer.

We have added several additional experimental suggestions to the discussion, and elaborated on several of our previous ones. We have also updated the formatting of this presentation with clear paragraph separation of the proposed experiments. We thank the reviewer for this suggestion which we hope will substantially improve the interest of the manuscript.

9. If I understand the model correctly, the non-linearity arises because of the increase rate of tag addition when tag is already present. The authors then speculate histone modifications can be one such tag. However, there is only so many sites of modification at a promoter. Can the authors analyze how the possible range of tag densities affects performance of the model? Is the range required biologically plausible?

Yes, the reviewer understands the model correctly in terms of the origin of nonlinear response. The number of promoter tags or number of condensate molecules needed to achieve hypersensitive responses as a function of other model parameters is now discussed in Figure 3 and Figure 3 supplement 1 and Figure 5 supplement 1. The plausibility of the assumptions of tag number is now discussed in the Methods in our new section on Model Assumptions (see page 30-31).

In what we now call the “PTM version” of the model, in order to keep the number of parameters to a minimum, we did not set an explicit saturation point for deposition of tags, and instead let the removal rate set the effective maximum. In the condensate model, depletion of the pool of proteins below the saturation threshold for condensation sets the maximum number of molecules at the promoter. For simplicity, we model this effect with a fixed constant we call “max cluster size”.

The numbers used in these simulations, 10-40, likely do approach saturation of available substrate sites (though perhaps multiple tags, such as H3K4me and histone acetylation together accumulate to produce the stable active state). There are multiple histone tails available in a single nucleosome, which contain multiple potential modification sites for some of the tags which are found at promoters. The abundant lysine residues in histone tails can each accumulate 3 methyl tags (though the position of the lysine in the tail can have distinct effects on transcription probability). Many of the histone tagging enzymes will also add their modifications (like acetylation) to other non-histone promoter-associated proteins. Alternatively, a greater number of enzyme states, for example, achieved by an enzyme whose stability of association with the promoter-complex increases steadily with additional PTMs, would allow a very hypersensitive response to emerge even with few total tags in the ON state.

10. Can the authors do more analysis to explore how rapid changes in gene expression may occur (e.g. upon signaling a gene may go up within minutes)? How much more frequent does the E-P interaction need to be for rapid switch to the active promoter state? Can the authors do an analysis where they change the rates of the futile cycle upon some signal: at what time scale does transcription then change (keeping E-P frequency the same)?

The temporal dynamics for a promoter that starts in an off-state, to switch to an on state, given a particular rate or E-P contact, is now shown in Figure 5e. The relation in the amount of time expected for transcription to change when E-P contact frequency is perturbed moderately, compared to the amount of time required for enhancer mediated gene activation is discussed on page 19. The relationship of these model properties to previous biochemical measurements is described in the discussion on page 26.

Briefly:

The absolute kinetic properties for the molecular species in our models are not known (how many acetyl or methyl groups per minute are added or removed from a promoter? How many GTFs condense or redissolve per minute?). We have also not taken the molecular specification of the model to such as an extreme as insisting that the ‘tag’ be histone-acetylation rather than histone methylation or PolII tail-phosphorylation, or insisting the condensate be made of PolII rather than Mediator—and the rate constants may be different for the different molecular species. Thus the model is of more use in understanding relative differences in response time between different mechanisms (e.g. enhancer activation vs. loop disruption), rather than providing precise predictions in absolute time.

However, we can and do compare relative temporal responses to different perturbations. In particular, for the discussion related to the timescales of cohesin depletion vs. TAD border deletion, we determine the amount of time it takes the promoter to switch off upon enhancer deactivation (in arbitrary units of model time) and compare this to the amount of time it would take the promoter to switch off if the E-P contact frequency was modulated by a factor of two (see Figure 5F,G). For a more comprehensive parameter sweep, we also plot the promoter state as a function of time for a whole parameter scan of E-P contact frequencies and different lag times (see Figure 5E).

From these analyses we concluded, if typical promoters take minutes to an hour to respond to enhancer activation/deactivation, then a few hours is insufficient time for those same promoters to respond to a small structural change (a 2 fold reduction in enhancer activity), and that much greater transcriptional changes would be observed at time scales orders of magnitude longer than the typical time for promoter activation/deactivation via enhancer activation/deactivation.

11. Due to the lack of description, in many sections it is not clear what are the specific inputs and outputs of the model (e.g. Figure 2).

This should now be clear. In short, the inputs are generally E-P contact frequency and the output is transcription rate, which we simplify by assuming it is equivalent to “condensate size” or “tag count”—as all the interesting behaviors of our model are already established by these parameters, there is was no need to add gratuitous complexity. See also responses to Major points 1 and 4 above.

12. The Methods section describes the chemical kinetics of the suggested reactions and the insulation score calculations. But it is not clear how do these inform each other, how are contact-frequency maps chosen/computed and cross-referenced with the local E-P kinetics?

The Methods section has been extended to describe each simulation presented.

Insulation score is a measure of relative contact frequency within a TAD vs. between TADs. Since much of the experimental data involves perturbations to TAD boundaries resulting in partial or complete merger of the TADs, the insulation score provides an upper bound on the magnitude of the change in contact frequency. It is because insulation scores are rarely larger than 2 that we repeatedly focus on 2 fold changes in contact frequency in our simulations. The calculation of insulation score is presented in the methods so the reader knows precisely where the claim that almost all TAD boundaries correspond to factor-of-2 or less differences in contact frequency.

13. In 587-588, the index of k is 2(n+1), which equals to 2n+2, but then in the next line the following assumption is made 2n+1 → n+1.

We apologize for the original version of this section, which included a confusing and unnecessary renumbering of the variables indexed by k. It has been completely rewritten and we hope the new version will be more readable.

14. The authors make assumptions that their kinetic considerations hold for n>2. What is the evidence?

For simplicity we performed the PTM model simulations with n=2. All of the ‘paradoxes’ discussed can be resolved with a PTM model in which only two stimulated states are needed.

An analysis of the continuum limit version of the PTM model shows that the effective Hill coefficient of the tag-addition process is set by the number of states. We presented the derivation for n=2, a straightforward reproduction of this derivation with n=3 or n=4 leads to the appearance of a tag-addition curve where the highest order polynomials are now n=3 and n=4 respectively. Choosing rate constants such that these terms dominate leads to a case where tag-addition converges to a Hill Function with Hill Coefficient n=3 or 4 (the less these terms dominate the flatter the effective Hill Coefficient). The effects of changing the steepness (i.e. effective Hill Coefficient) of the tag-addition curve is shown in Figure 3 supplement 1 and Figure 5 supplement 1. A detailed analysis of the effect for all options between n=1 and the expected behavior in the limit n-> infinity is now discussed in more detail in the section “Analytic inference of the parameter space of the PTM model” in the methods, see page 31-32.

Additionally, we presented a condensate model in which no enzymes are involved at all.

15. The language of the paper is often not technically precise with qualifiers missing, which could lead to ambiguities and misinterpretations. Here are some examples:

We have carefully revised the text for clarity of language.

p. 1, line 10, "difference in contact across TAD borders is usually less than twofold";

In this sentence, we highlight a qualitative difference in how structure changes (less than two fold), to how transcription changes (as much as ten fold, sometimes more). We find this qualitative sentence informative and readable, and supported by substantial evidence in the literature. Later in the text we provide explicit and precise quantitative justification for the statement. For example, Figure 1 supplement 4 plots the difference in contact across ALL TAD borders in the human genome from IMR90 cells using the Hi-C data from Rao et al. 2014, and shows that these insulations scores are rarely greater than 2 (less than 10% of TAD borders in that data have a score >2, the median score is ~1.32).

We feel it would unnecessarily interrupt the readability of the manuscript to say that “in IMR90 cells, 95% of TAD borders demarcate differences of less than 2 fold change in contact frequency”. It would also give the naive reader the impression this property is specific to IMR90 cells rather than generally true of available Hi-C data across the cell types and species which have been analyzed. We appreciate that this line in the abstract would feel better justified if it carried a citation, however eLife does not use citations in the abstract.

p. 1, line 17, "results from recent cohesion disruption";

We have added the citation to the ‘recent result’.

p. 2, line 71, "A simple model of hypersensitivity to changes in contact frequency".

We have reworded this line for clarity.

16. On p. 13, line 483, authors define Ostwald ripening as given by weak multivalent interactions; however, Ostwald ripening is a thermodynamic process. In addition, they propose that liquid condensates become larger due to Ostwald ripening, but there are also other processes that may occur, such as coalescence of condensates, which would also lead to larger condensates.

This part of the text has been removed.

17. An alternative explanation for TAD-specific enhancer action is that an E-P interaction within a TAD (between two convergent CTCF sites), one that is brought about by extruding cohesin, is not equivalent to an interaction that occurs between two loci on either side of a CTCF site and that can be a random collision that is not mediated by extruding cohesin. In other words, two interactions can be of the same frequency but can be of a very different molecular nature. I agree that this model would not explain the results of the experiment where cohesin is acutely removed.

We agree. Indeed a number of additional mechanisms could be proposed to reconcile different subsets of the paradoxes we discuss.

18. In the beginning of the introduction the authors introduce TADS. I recommend that the authors present this in a more nuanced way: compartment domains also appear as boxes along the diagonal, an issue that has led some in the chromosome folding field to be confused. This reviewer believes TADS are those domains that strictly depend on cohesin mediated loop extrusion, whereas compartment domains are not. If the authors agree, perhaps they can rewrite this section?

Because the term “TAD” is frequently used in discussing the paradoxes we seek to address with our model, we find it necessary to introduce it to the text.

We appreciate the term has become controversial in the field. Here, we use the term TAD as it was originally defined when it was originally coined in the literature (in work one of our reviewers co-authored, and before any cohesin depletion experiments were done): linear stretches of the genome in which intra-domain contact is greater than interdomain contact—features which appear as boxes or triangles in the popularly used heatmap representation of these data.

We disagree with (re)definingTADs to be features created by loop-extrusion for the following reasons: The term was coined before cohesin depletion experiments, and has been extensively applied to describe results in many cell types and many experiments in which cohesin has not yet been perturbed. If we redefine TADs as the product of loop extrusion, the TADs of Dixon 2012 in IMR90 cells, the TADs of Lupianez 2015 in mouse limbs, can be called at best “prospective TADs”, since it has not been shown that in these cells or limbs that they arise from loop extrusion and are not also compartment boundaries. Indeed, several of the examples of TAD boundaries highlighted in Dixon 2012 and Nora 2012 and Lupianez 2015 are also compartment boundaries, and are at genomic positions which retain a detectable border in HCT116 and or mESC cells following cohesin depletion.

While it is useful to talk about ‘structural features created by cohesin’ we propose it would be much less confusing to just call those ‘structural features created by cohesin (SFCCs ?), rather than co-opting the word TAD to this new purpose. Nearly a decade of discussing high resolution Hi-C maps, since Dixon 2012/Nora 2012, has demonstrated it is also useful to have a word to refer to boxes or triangles that we see on these maps, regardless of what we know of the molecular mechanisms that gave rise. The term TAD was introduced to do this in 2012 and seems to serve the purpose well, and we continue to use it as such.

https://doi.org/10.7554/eLife.64320.sa2

Article and author information

Author details

  1. Jordan Yupeng Xiao

    Program in Biophysics, Stanford University, Stanford, United States
    Contribution
    Conceptualization, Formal analysis, Methodology, Writing - original draft
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-8072-9341
  2. Antonina Hafner

    Department of Developmental Biology, Stanford University, Stanford, United States
    Contribution
    Supervision, Validation, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-4927-5227
  3. Alistair N Boettiger

    1. Program in Biophysics, Stanford University, Stanford, United States
    2. Department of Developmental Biology, Stanford University, Stanford, United States
    Contribution
    Conceptualization, Software, Formal analysis, Funding acquisition, Methodology, Writing - original draft, Project administration
    For correspondence
    boettiger@stanford.edu
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-3554-5196

Funding

National Institutes of Health (U01 DK127419)

  • Alistair N Boettiger

NIH (DGM132935A)

  • Alistair N Boettiger

Stanford University (GM008294)

  • Jordan Yupeng Xiao

Stanford University (The Walter V. and Idun Berry Postdoctoral Fellowship Program)

  • Antonina Hafner

Burroughs Wellcome Fund (CASI)

  • Alistair N Boettiger

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Luca Giorgetti and Gregory Roth for informative discussions and members of the Boettiger lab for a critical reading of the manuscript. We thank Daniel Ibrahim for insightful and inspiring discussions about the behavior of structural perturbations near the Sox9 locus. This work was supported by a CASI award from the Burroughs Wellcome Foundation a NIH New Innovator Award (DGM132935A) to ANB and support from the 4DN consortia (U01 DK127419). JX was supported by the Molecular Biophysics Training Program at Stanford (GM008294). AH was supported by a Walter V. and Idun Berry Postdoctoral Fellowship.

Senior Editor

  1. Naama Barkai, Weizmann Institute of Science, Israel

Reviewing Editor

  1. Job Dekker, University of Massachusetts Medical School, United States

Reviewers

  1. Katie S Pollard, Gladstone Institutes, United States
  2. Job Dekker, University of Massachusetts Medical School, United States

Publication history

  1. Received: October 25, 2020
  2. Accepted: June 25, 2021
  3. Accepted Manuscript published: July 9, 2021 (version 1)
  4. Version of Record published: August 9, 2021 (version 2)

Copyright

© 2021, Xiao et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 2,175
    Page views
  • 346
    Downloads
  • 2
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading

    1. Chromosomes and Gene Expression
    2. Developmental Biology
    Benoit Roch et al.
    Research Article Updated

    We developed an Xrcc4M61R separation of function mouse line to overcome the embryonic lethality of Xrcc4-deficient mice. XRCC4M61R protein does not interact with Xlf, thus obliterating XRCC4-Xlf filament formation while preserving the ability to stabilize DNA ligase IV. X4M61R mice, which are DNA repair deficient, phenocopy the Nhej1-/- (known as Xlf -/-) setting with a minor impact on the development of the adaptive immune system. The core non-homologous end-joining (NHEJ) DNA repair factor XRCC4 is therefore not mandatory for V(D)J recombination aside from its role in stabilizing DNA ligase IV. In contrast, Xrcc4M61R mice crossed on Paxx-/-, Nhej1-/-, or Atm-/- backgrounds are severely immunocompromised, owing to aborted V(D)J recombination as in Xlf-Paxx and Xlf-Atm double Knock Out (DKO) settings. Furthermore, massive apoptosis of post-mitotic neurons causes embryonic lethality of Xrcc4M61R -Nhej1-/- double mutants. These in vivo results reveal new functional interplays between XRCC4 and PAXX, ATM and Xlf in mouse development and provide new insights into the understanding of the clinical manifestations of human XRCC4-deficient condition, in particular its absence of immune deficiency.

    1. Chromosomes and Gene Expression
    2. Computational and Systems Biology
    Zelin Liu et al.
    Tools and Resources

    Circular RNAs (circRNAs) act through multiple mechanisms via their sequence features to fine-tune gene expression networks. Due to overlapping sequences with linear cognates, identifying internal sequences of circRNAs remains a challenge, which hinders a comprehensive understanding of circRNA functions and mechanisms. Here, based on rolling circular reverse transcription (RCRT) and nanopore sequencing, we developed circFL-seq, a full-length circRNA sequencing method, to profile circRNA at the isoform level. With a customized computational pipeline to directly identify full-length sequences from rolling circular reads, we reconstructed 77,606 high-quality circRNAs from seven human cell lines and two human tissues. circFL-seq benefits from rolling circles and long-read sequencing, and the results showed more than tenfold enrichment of circRNA reads and advantages for both detection and quantification at the isoform level compared to those for short-read RNA sequencing. The concordance of the RT-qPCR and circFL-seq results for the identification of differential alternative splicing suggested wide application prospects for functional studies of internal variants in circRNAs. Moreover, the detection of fusion circRNAs at the omics scale may further expand the application of circFL-seq. Together, the accurate identification and quantification of full-length circRNAs make circFL-seq a potential tool for large-scale screening of functional circRNAs.