Regulation of chromatin architecture by transcription factor binding

  1. Stephanie Portillo-Ledesma
  2. Suckwoo Chung
  3. Jill Hoffman
  4. Tamar Schlick  Is a corresponding author
  1. Department of Chemistry, 100 Washington Square East, Silver Building, New York University, United States
  2. Simons Center for Computational Physical Chemistry, 24 Waverly Place, Silver Building, New York University, United States
  3. Courant Institute of Mathematical Sciences, New York University, United States
  4. New York University-East China Normal University Center for Computational Chemistry, New York University Shanghai, China

Abstract

Transcription factors (TF) bind to chromatin and regulate the expression of genes. The pair Myc:Max binds to E-box regulatory DNA elements throughout the genome to control the transcription of a large group of specific genes. We introduce an implicit modeling protocol for Myc:Max binding to mesoscale chromatin fibers at nucleosome resolution to determine TF effect on chromatin architecture and shed light into its mechanism of gene regulation. We first bind Myc:Max to different chromatin locations and show how it can direct fiber folding and formation of microdomains, and how this depends on the linker DNA length. Second, by simulating increasing concentrations of Myc:Max binding to fibers that differ in the DNA linker length, linker histone density, and acetylation levels, we assess the interplay between Myc:Max and other chromatin internal parameters. Third, we study the mechanism of gene silencing by Myc:Max binding to the Eed gene loci. Overall, our results show how chromatin architecture can be regulated by TF binding. The position of TF binding dictates the formation of microdomains that appear visible only at the ensemble level. At the same time, the level of linker histone and tail acetylation, or different linker DNA lengths, regulates the concentration-dependent effect of TF binding. Furthermore, we show how TF binding can repress gene expression by increasing fiber folding motifs that help compact and occlude the promoter region. Importantly, this effect can be reversed by increasing linker histone density. Overall, these results shed light on the epigenetic control of the genome dictated by TF binding.

eLife assessment

In this important study, chromatin is simulated as a polymer at the scale of genes, and the 3D organization of chromatin is analyzed at nucleosome resolution. There is convincing evidence for the emergence of chromatin microdomains due to the action of transcription factors, based on the simulation incorporating well-known biophysical properties of DNA, of nucleosomes, of linker histones, and of the transcription factor pair Myc:Max, as well as considering how the 3D organization of chromatin results from bending and looping of DNA. The work greatly improves our understanding of how the joint action of transcription factors and chromatin features affects chromatin structure and accessibility, which is of interest to anyone studying gene regulation.

https://doi.org/10.7554/eLife.91320.3.sa0

Introduction

Eukaryotic genomes are compactly packaged inside the cell nucleus by wrapping ∼147 bp of DNA around an octamer of histone proteins. This first level of organization establishes the nucleosome, the chromatin basic repeating unit (Luger et al., 1997; Kornberg, 1977; Onufriev and Schiessel, 2019). Chains of nucleosomes form chromatin fibers that undergo additional folding that increases compaction. This folding can be aided by the binding of proteins. For example, CTCF and cohesin execute loop extrusion to form Topologically Associated Domains or TADs (Rowley and Corces, 2018). Linker histones (LH) bind to the nucleosome at the entry/exit DNA sites to compact chromatin fibers and regulate their architecture (Portillo-Ledesma et al., 2022; Perišić et al., 2019). Finally, TFs bind to regulatory DNA elements, modulating chromatin architecture and accessibility and, thereby gene expression (Weidemüller et al., 2021).

Myc is a TF with a C-terminal basic helix-loop-helix leucine zipper (bHLHZip) motif that has DNA-binding activity and can establish protein-protein interactions (Meyer and Penn, 2008). Myc regulates a large number of cellular processes, such as proliferation, growth, differentiation and pluripotency, metabolism, and apoptosis (Eilers and Eisenman, 2008).

Myc heterodimerizes with another bHLHZip protein, Max, to form the Myc:Max complex that binds to E-box (5’-CACGTG-3’) regulatory DNA elements throughout the genome to control transcription (Blackwood and Eisenman, 1991). Myc:Max heterodimers can further form bivalent heterotetramers, allowing them to bring together sequence-distant regions of the genome (Nair and Burley, 2003). The heterotetramer assembles in a head-to-tail way through the individual leucine zippers of each heterodimer, resulting in the formation of an antiparallel four-helix bundle. This interaction between Myc and Max is essential for gene transcriptional repression (Mao et al., 2003). Several Myc-regulated genes contain multiple E-boxes within promoters that are usually separated by 100 bp (Coller et al., 2000). The binding of bivalent Myc:Max to such regions can create loops in the chromatin and allow the cooperative regulation at promoters and enhancers containing multiple E-boxes.

Efforts to model TF binding to chromatin have been made at various spatial levels. At the nucleosome level, all-atom and coarse-grained molecular dynamics simulations have been applied to study the binding of the pioneer TF Oct4, alone (Huertas et al., 2020) or with Sox2 (Tan and Takada, 2020), to single nucleosomes. At the kb or chromosome level, TF binding has been introduced in polymer models as beads that can reversibly interact with other proteins or with the DNA (Brackley et al., 2013; Buckle et al., 2018). Although the implicit binding of high-mobility group (HMG) proteins to chromatin fibers has been studied at nucleosome resolution (Bajpai et al., 2017), to the best of our knowledge, the effect of TF binding on chromatin architecture has not been studied at nucleosome resolution level.

Here, we perform mesoscale chromatin simulations at nucleosome resolution under the implicit binding of the Myc:Max complex to study this complex’s effect on microdomain formation (TAD-like structures at the kb level), its interplay with other chromatin elements (e.g. LH, histone tail acetylation, and linker DNA length), and its effect on gene folding. Overall, we find that depending on the linker DNA length, the TF binding position dictates the folding of fibers and the formation of microdomains. The extent of the effect of TF binding on chromatin compaction and overall shape depends on the presence of LH and histone acetylation, and on the length of linker DNA. While long linker DNAs and high acetylation levels allow higher TF concentrations to further compact the chromatin fibers, short linkers, and LH impose structural restraints that limit the effect of TF binding. Finally, we show that the binding of Myc:Max can repress the expression of the gene Eed by increasing fiber folding and compaction in a way that occludes the promoter region; interestingly, this repression can be reversed with high LH density.

Overall, our study provides new evidence on the regulation of chromatin architecture by TF binding and how such effects are regulated by chromatin composition and epigenetic marks.

Results

TF binding mediates the formation of microdomains

Protein binding to chromatin fibers has been shown to produce regions of enriched contacts in Micro-C or Hi-C maps. For example, loops associated with chromosome domain boundaries overlap with CTCF binding regions (Rao et al., 2014). Similarly, the enrichment of Hi-C contacts can be used as a reporter of the strength of interaction between a pair of TF binding sites (Ma et al., 2018). Finally, the binding of TFs, cofactors, or chromatin modifiers produces fine-scale domains, smaller than TADs, that can be identified as stripes and dots in Micro C maps (Hsieh et al., 2020).

To better understand how TF binding position affects chromatin architecture, we simulate 50-nucleosome uniform fibers of 26, 44, and 62 bp linker DNAs, as well as 50-nucleosome life-like fibers with TF binding in four different scenarios (Figure 1). Interestingly, contact maps of fibers with both medium (44 bp) (Figure 1A) and long (62 bp) (Figure 1B) linker DNAs show clear regions of high-frequency contact, or microdomains, as a result of the TF binding, that are dependent on the TF binding positions. In fibers with short linkers (Figure 1C), like 26 bp, the formation of microdomains is less clearly identified in the contact maps. Namely, high-intensity regions close to the diagonal are diffuse and not as well defined as in the fibers with 44 and 62 bp linkers. See for example, topologies 2 and 3 in panels A, B, and C of Figure 1. On the other hand, other topologies, like topologies 1 and 4, allow the identification of microdomains. The 44 and 62 bp linker fibers are less compact and more globular than the 26 bp linker fiber (Perišić et al., 2010). Additionally, these fibers are more sensitive to changes in salt concentration and the presence of LH (Perišić et al., 2010; Portillo-Ledesma et al., 2022). The presence of more diffuse microdomains in 26 bp fibers can be explained by the rigidity of short-linker fibers in which the linker DNA dictates fiber architecture and produces a highly bent 10 nm ladder-like form (Collepardo-Guevara and Schlick, 2014) with low sensitivity to external and internal parameters, such as LH binding (Portillo-Ledesma et al., 2022; Perišić et al., 2010; Routh et al., 2008). Thus, longer linker DNAs that give chromatin fibers more flexibility, facilitate their folding modulation by TF binding.

Figure 1 with 7 supplements see all
Transcription factor (TF) binding location drives the formation of microdomains.

50-nucleosome chromatin fibers with: (A) 44 bp linkers, (B) 62 bp linkers, (C) 26 bp linkers, and (D) Nonuniform linkers simulated with four different TF topologies. At the top, for the 44 bp system, we show an ideal zigzag fiber coloring in red the DNA with TF binding regions. Arcs show the possible binding geometries. The binding positions and geometries that define each topology are the same in all systems. For each system, we show the cumulative contact map calculated from 10 independent trajectories and a representative fiber structure also showing in red the TF binding regions. Additional representative structures are shown in Figure 1—figure supplements 36.

Similar to the 44 and 62 bp uniform systems, microdomains emerge in ensemble-based contact maps of life-like fibers (Figure 1D). Due to the fiber polymorphism triggered by variations in the linker DNA (Collepardo-Guevara and Schlick, 2014), the microdomains are slightly more diffuse than in the 44 and 62 bp uniform fibers (Figure 1A and B). Thus, TF binding positions define the organization of life-like chromatin fibers, but the irregularity of the fibers smooths out this effect compared to uniform systems; fibers that combine short and long linkers form fluid heterogeneous bent ladders that accommodate TF binding in a more relaxed way than compact 30 nm regular forms.

As shown in Figure 1—figure supplements 1 and 2, these microdomains can also be identified by a clustering analysis of the contact map in nucleosome resolution. Fibers with 44 and 62 bp linkers (Figure 1—figure supplement 1) show clear TF-topology-dependent nucleosome clusters. In contrast, for the 26 bp and life-like fibers (Figure 1—figure supplement 2), the TF dependency of the nucleosome clusters is not as clear. These results support the notion that microdomains are more diffuse in these two systems due to fiber rigidity or polymorphism, respectively. Similarly, internucleosome contact plots show some TF binding position variations in medium and long-range interactions for the 44 and 62 bp fibers (Figure 1—figure supplement 1), but less variations for the 26 bp and life-like systems (Figure 1—figure supplement 2). See for example the plots corresponding to Topology 1 of each system where the 44 and 62 bp fibers reveal four clear high-frequency regions at medium and long-range interactions that are not as clear in the 26 bp and life-like systems.

Importantly, the contact maps that reveal microdomains are ensemble-based maps in which the contacts from 10 independent trajectories are summed. These maps are analogous to experimental Hi-C maps obtained from a population of cells. On the other hand, the contact maps corresponding to single trajectories, equivalent to single-cell Hi-C maps, do not show a clear formation of microdomains. As we see from Figure 2 for the 62 bp system with five TF binding regions (Topology 1), checker-board patterns in the ensemble-based map are a product of TF binding not discernible in individual contact maps obtained from single trajectories. These results agree with experimental data showing that cohesin dictates the preferential positioning of boundaries at CTCF sites to produce ensemble TAD structures (Bintu et al., 2018). Here, such microdomain structures emerge naturally from our tagging of TF binding regions.

Microdomains emerge only from ensemble-based contact maps.

The 10 single-trajectory contact maps for the 62 bp system Topology 1 (five Transcription factor (TF) binding regions) at left reveals various microdomains. The large ensemble-based contact map at right obtained by summing the 10 individual contact maps reveals all possible microdomains contacts.

TF binding effect is regulated by linker DNA length, histone acetylation, and LH density

As we have shown previously, histone acetylation affects chromatin compaction (Collepardo-Guevara et al., 2015), size, and compaction of nucleosome clutches (Portillo-Ledesma et al., 2021), and can produce the segregation of domains (Rao et al., 2017). Similarly, LH density controls chromatin higher-order folding and compaction (Portillo-Ledesma et al., 2022; Perišić et al., 2019; Grigoryev et al., 2016), and the size and compaction of nucleosome clutches (Portillo-Ledesma et al., 2021). In addition, LH binding and tail acetylation act cooperatively to direct fiber folding (Bascom and Schlick, 2017).

To investigate how TF binding and other chromatin regulators, such as LH and tail acetylation act together to modulate chromatin architecture, we study TF binding in the presence of LH and tail acetylation for fibers with different linker DNAs. In particular, we determine TF saturation curves for fibers with uniform linkers (26, 35, 44, 53, 62, 70, and 80 bp), and for nonuniform “life-like” fibers (see Methods) by measuring both packing ratios and sedimentation coefficients at increasing TF concentration (percentage of linker DNA beads that can bind TF), from 0 to 100%.

As we see from Figure 3A and Figure 3—figure supplement 1, when the linker DNA is short, such as 26 and 35 bp, TF binding does not increase the packing ratio of the fiber. However, the sedimentation coefficient increases significantly from 0 to 25% [TF], and then remains flat. This indicates that TF binding saturation occurs at a concentration of 25% and that higher TF concentrations do not affect fiber folding. Namely, fibers cannot fold into a more compact form due to excluded volume interactions. As clearly shown by the fiber configurations of the 26 bp system (Figure 4), while in the absence of TF the fiber has a ladder-like extended structure (short linkers), it folds over itself when TF is bound, explaining the higher sedimentation coefficients. On the other hand, for medium and long linker fibers, both packing ratios and sedimentation coefficients increase upon TF binding (Figure 3A). The longer linkers allow nucleosomes to approach closer to one another upon TF binding, increasing the packing ratio. Additionally, the longer the linker DNA, the larger the change in packing ratio upon TF binding at a concentration of 25%. Similar to the short linker fibers, TF binding also produces a more globular fiber (Figure 4) with a higher sedimentation coefficient (Figure 3A).

Figure 3 with 1 supplement see all
Transcription factor (TF) saturation curves are affected by histone acetylation and linker histone (LH).

Graphs (A–C) show packing ratios and sedimentation coefficients as a function of TF concentration for the 26 bp, 44 bp, 62 bp, and life-like fibers in different conditions: (A) systems without LH and acetylation; (B) systems with four acetylation islands; and (C) systems with LH density ρ = 1. Average and standard deviation values are obtained from ensembles of 1000 structures. At the top left, we show the fiber axis (red trace) and position of nucleosomes (blue dots) for a 70 bp chromatin fiber to illustrate the increase of packing ratio (number of nucleosomes per 11 nm of fiber length) upon TF binding. At the top right, we show a 70 bp linker chromatin fiber to illustrate the decrease of chromatin global size upon TF binding. Results including uniform systems with 35, 53, 70, and 80 bp are shown in Figure 3—figure supplement 1.

Transcription factor (TF) binding affects chromatin architecture.

Chromatin uniform fibers of 44, 62, and 26 bp linkers, as well as nonuniform life-like fibers at increasing TF concentration. Wildtype (WT) shows structures of the wildtype systems (no acetylation and no LH). +Ac shows structures of the acetylated systems with acetylated tails drawn in red and wildtype tails in blue. +Linker histone (LH) shows structures of systems with an LH density of 1 LH/nucleosome, with LHs drawn in cyan.

For life-like fibers, the effect on packing ratio and sedimentation coefficient is similar to that observed in the medium and long linker uniform fibers; fibers become more compact and globular upon TF binding (Figure 3A and Figure 4). However, in agreement with the trends observed for the TF binding regions (Figure 1), fluid life-like fibers appear to be less sensitive to the increase of TF concentration than the 44 and 62 bp systems.

When acetylation islands are incorporated into the chromatin systems (Figure 3B), the trends in packing ratios and sedimentation coefficients are similar to wildtype systems, but the extent of the TF binding effect is diminished. There is a smaller change in packing ratio when we increase TF binding from 0 to 25%. This is because histone acetylation disrupts internucleosome contacts and opens the chromatin fiber structure (Collepardo-Guevara et al., 2015), which is opposite to the repressive effect observed by TF binding. In agreement, compared to wildtype systems, fiber structures (Figure 4) show a smaller change in global shape and compaction upon TF binding.

Similarly, when LH is bound to medium and long linker DNA fibers (Figure 3C), the effect of TF on chromatin compaction is diminished compared to the wildtype systems. In this case, LH is a chromatin compactor, like TF binding. Thus, LH and TF compete for compacting the chromatin fiber. At 0% TF, the packing ratio of the systems with LH is much higher than the wildtype systems due to LH binding. Thus, upon TF binding, the fiber packing ratio cannot increase as when no LHs are bound. Moreover, LH binding produces straighter and ordered structures (Portillo-Ledesma et al., 2022), whereas TF binding produces more globular fibers. Thus, the effects of TF and LH on fiber global shape are in opposition. Upon TF binding, fibers with LH are less globular and more straight than the analogous wildtype fibers (Figure 4).

For short linker fibers like 26 and 35 bp, as we saw in the wildtype system, no increase can be observed in the packing ratio upon TF binding. However, very different trends are observed for the sedimentation coefficients of the 26 bp system. In contrast to the wildtype system (Figure 3A), TF binding does not increase the sedimentation coefficient (Figure 3C), and larger standard deviations are obtained, indicating more variability in the fiber structures that create the 1000-structure ensemble. This is because with LH, some TF dimer/dimer interactions cannot form in some of the trajectories. For example, for the 50% [TF] system, from the 10 independent trajectories, four reach the 50% [TF], four have 0% [TF], and two have less than 20% [TF], producing an effective [TF] in the ensemble of ∼22%. Thus, LH bound to short linker DNAs produces a fiber rigidity that impairs TF function. A similar effect has been seen for condensin and topoisomerase II binding (Choppakatla et al., 2021), where LH depletion causes excess loading of these proteins to chromosomes, producing aberrant chromosomes.

Finally, for life-like fibers (Figure 3C), the trends in packing ratio and sedimentation appear similar to those of the wildtype systems (Figure 3A). Interestingly, the increase in packing ratio upon TF binding at a 25% concentration is slightly higher than for the wildtype system (0.85 vs. 0.45 increase). This indicates that the simultaneous binding of LH and TF to life-like fibers is better accommodated than in uniform fibers. Thus, fiber fluidity triggered by variations of linker DNA (Collepardo-Guevara and Schlick, 2014) modulates the effect of protein binding on chromatin architecture.

TF binding-mediated compaction as possible mechanism of gene locus repression

Myc helps maintain cell pluripotency and inhibits differentiation by activating genes needed for pluripotency and repressing genes that trigger differentiation (Smith et al., 2010; Krepelova et al., 2014). For example, the embryonic ectoderm development gene, Eed, expresses a protein member of the Polycomb-group (PcG) family required for silencing pluripotency genes upon embryonic stem cell (ESC) differentiation (Obier et al., 2015). Thus, in ESCs, Eed is repressed so that cell differentiation is suppressed. To determine the mechanism of gene repression in Eed by Myc:Max, we fold de novo the Eed gene in the presence and absence of TF binding. To study Eed gene activation upon cell differentiation, we additionally simulate the folding of Eed with and without TF binding, and with increasing LH density (ρ) to 0.8 LH/nucleosome, the level observed in mouse somatic cells (Woodcock et al., 2006).

Figure 5A shows the ensemble-based contact map for the Eed gene with and without TF binding. We see that TF binding increases the number and frequency of internucleosome contacts. The density of the matrix increases by ∼9% upon TF binding. On the right upper corner of the contact maps, there is a clear difference: microdomains emerge as product of the TF binding. Contact maps obtained from single trajectories (Figure 5B) clearly show that TF binding increases hairpin motifs (medium-range contacts) and hierarchical loops (long-range contacts); these motifs are evident from diagonal and perpendicular regions to the main diagonal of the map (Grigoryev et al., 2016). Such an increase in local loop interactions is also observed in the plot of contact frequency versus genomic distance (Figure 5—figure supplement 1) showing that while short local interactions slightly decrease upon TF binding, local loop interactions increase markedly. These folding motifs increase fiber compaction (Figure 5C). Fiber configurations show that upon TF binding, the TF binding regions are in close proximity, producing a more compact fiber.

Figure 5 with 1 supplement see all
Transcription factor (TF) binding represses the Eed gene loci.

(A) Ensemble-based nucleosome contact maps obtained from 20 independent trajectories of the Eed gene in the absence and presence of TF binding. (B) Nucleosome contact maps obtained from a single trajectory of the Eed gene in the absence and presence of TF binding. (C) Representative chromatin fibers of the Eed gene in the absence and presence of TF binding. In magenta are shown the TF binding regions. Linker histones (LHs) are shown in cyan.

Furthermore, the fiber sedimentation coefficient increases, while the radius of gyration decreases, upon TF binding (Table 1). Such radii of gyration measurements agree with experimentally and theoretically determined radii of gyrations for similar size fibers (Kadam et al., 2023; Boettiger et al., 2016) (e.g. our 62 ± 14 nm Rg is similar to the 43 ± 9 nm Rg predicted for 26 kb regions in Drosophila cells Boettiger et al., 2016). Significantly, we find that the TF binding region 1, which overlaps with the promoter region, becomes occluded, thus less accessible. As reported in Table 1, the area and volume of the promoter region are reduced ∼1.5 times upon TF binding. Thus, Eed mechanism of repression by Myc:Max involves the compaction and occlusion of the promoter region due to increased fiber folding. A less accessible promoter will impair gene transcription.

Table 1
Compaction parameters: Sedimentation coefficient and radius of gyration for the entire Eed system, and area and volume for the promoter region of Eed.
Fiber Rg (nm)Fiber Sc (S20,w)Promoter area (nm2)Promoter volume (nm3)
Eed62 ± 14162 ± 19252 ± 451753 ± 239
Eed + TF60 ± 13171 ± 18171 ± 421262 ± 202
Eed + 0.8 LH61 ± 9173 ± 15196 ± 471447 ±123
Eed + 0.8 LH + TF57 ± 8175 ±16210 ± 461420 ± 104

Upon cell differentiation, LH density increases to ∼0.8 LH/nucleosome (Woodcock et al., 2006), and Eed needs to be activated so the polycomb protein is expressed and the pluripotency genes are repressed. Thus, to determine how such an LH density increase upon cell differentiation might affect Eed repression and help its activation, we study the Eed gene loci with an LH density ρ = 0.8. When ρ increases, the fiber becomes more compact and less flexible (Figure 6 and Table 1) due to the repressive effect of LH binding (Portillo-Ledesma et al., 2022; Luque et al., 2014). This change in fiber compaction and flexibility modulates the impact of TF. Similar to the effect observed for the 26 bp system in Figure 3, the Myc:Max dimers cannot interact when the fiber becomes more rigid and straight due to LH binding. Indeed, from the 20 trajectories ran for this system, only one trajectory has dimers engaged, and thus, the systems with ρ = 0.8 with and without TF exhibit similar properties (Table 1), structures (Figure 6), and genomic contacts (Figure 6—figure supplement 1). These results implicate LH in the activation of genes by impairing the effect of repressive proteins.

Figure 6 with 1 supplement see all
Activation of the Eed gene loci depends on linker histone (LH) density.

(A) Ensemble-based nucleosome contact maps obtained from 20 independent trajectories of the Eed gene with an LH density of 0.8 LH/nucleosome in the absence and presence of transcription factor (TF) binding. (B) Representative chromatin fibers of the Eed gene in the absence and presence of TF binding showing that the two TF binding regions remain apart upon TF binding. In magenta are shown the TF binding regions. LHs are shown in cyan.

Discussion

Our Monte Carlo simulations of chromatin fibers at nucleosome resolution with implicitly bound TFs demonstrate how these proteins shape chromatin architecture and thus regulate gene expression. Our results on different TF binding regions (Figure 1) show that TF binding creates microdomains that are visible in the ensemble-based contact maps; these microdomains are more clearly evident in, and enhanced by longer-linker DNAs. These results agree with experimental Hi-C maps (Rao et al., 2014; Ma et al., 2018; Hsieh et al., 2020) and polymer modeling studies (Brackley et al., 2013; Brackley et al., 2016; Brackley et al., 2017) showing that the clustering of proteins that bridge genomic segments results in TAD-like structures. Our results showing that fibers with short linkers have more blurred microdomains to geometrical restrictions agree with observations that silent chromatin has longer linker DNA than active chromatin (Valouev et al., 2011). Longer linkers provide the chromatin fiber with the flexibility needed to fold and create loops that compact the structure and repress transcription. We showed this for metaphase chromatin where longer linkers and the absence of LH favor the formation of hierarchical loops that compact genes and chromosomes (Grigoryev et al., 2016). Importantly, life-like fibers, representative of real-life chromatin, are sensitive to TF binding positions and form microdomains, although these are less defined than in uniform fibers due to the high plasticity and fluidity of the heterogeneous chromatin (Collepardo-Guevara and Schlick, 2014).

Similar to the effect on microdomain formation, we also suggest that, upon TF binding, fibers with short linkers (26 and 35 bp) do not change their packing ratio, whereas fibers with medium and long (44–80 bp) linkers, as well as life-like fibers, become more compact (Figure 3A). Thus, the role of linker DNA appears to be coupled to the effect that repressive proteins have on the chromatin structure. This agrees with recent cryo-EM results showing that long linker DNAs produce relaxed DNA trajectories that enable the binding of LH and produce transcriptionally silent heterochromatin regions, whereas short linker DNAs preclude the binding of LH and produce transcriptionally active chromatin (Dombrowski et al., 2022).

Histone tails extend from the nucleosome core and interact with other nucleosomes and with the linker DNA (Arya and Schlick, 2006). When tails are acetylated, they become globular, impairing internucleosome interactions and opening up the chromatin fiber (Collepardo-Guevara et al., 2015). When we introduced histone acetylation, we saw that, for fibers with medium, long, and life-like linker DNAs, the effect of TF binding on chromatin compaction is reduced (Figure 3B). This is because the effects of acetylation and TF binding (Myc:Max in particular) on chromatin compaction and architecture are opposite. In line with these results, it has been reported that histone acetylation can prevent the binding of some pioneer TFs (Fuglerud et al., 2018). Our results thus highlight how histone acetylation can counter and modulate the repressive effect of proteins that bind to chromatin.

When studying the effect of LH binding, we found that LH can impair the effect of TF binding on chromatin architecture (Figure 3C). Particularly affected are fibers with short linkers. The C-terminal domain of LH interacts with the linker DNAs to compact chromatin fibers (Luque et al., 2014). Thus, LH competes for the linker DNA with TF binding. Indeed, recent experiments have shown that although LH remains bound to the nucleosome dyad upon TF binding, the C-terminal domain unbinds from the linker DNA (Burge et al., 2022). These results highlight the influence of the LH C-terminal domain on regulating TF binding. LH binding produces a rigid and straight fiber (Portillo-Ledesma et al., 2022) that is not flexible enough to be maneuvered by TF binding. Thus, although TF and LH binding might not be mutually exclusive as they do not share the same genomic binding sites, LH can indirectly impair TF function by affecting the chromatin fiber architecture. Interestingly, this effect seems to be less important in life-like fibers; these fibers better accommodate simultaneous LH and TF binding. Thus, fiber fluidity appears essential for the regulation of chromatin fibers by multiple protein binding.

Our folding of the Eed gene loci upon repression and activation revealed that TF binding produces the formation of microdomains that can occlude the promoter of Eed, repressing its transcription (Figure 5). However, this effect is eliminated by increasing the LH density (Figure 6), as the fiber flexibility does not allow for the DNA regions with TF bound to come together. Thus, TF and LH binding work together to regulate gene expression. This cooperation further underscores the idea that LH does not always act as a repressor of global transcription. Furthermore, our emphasis of the LH and TF synergy agrees with results found in mESC showing that LH depletion can lead to both an increase and a decrease in gene expression (Fan et al., 2005), and that LH depletion in Drosophila can down-regulate euchromatin genes (Vujatovic et al., 2012).

As our nucleosome-resolution modeling study at the gene level reveals, the epigenetic regulation of chromatin by transcription factors is a complex dance that depends on the delicate composition of other internal and external factors present in the cell. Thus, TFs collaborate with epigenetic marks and intrinsic chromatin features like linker lengths and nucleosome-free regions to regulate the folding of chromatin at multiple scales and hence gene expression.

Because the microdomains and loops affect access to and topologies of transcription start sites, these structural effects translate into effective regulation of cell differentiation and development, as well as human disease progression.

Of course, genome regulation in the live cell involves much more than transcription factors. Other chromatin activators and repressors, architectural proteins, and RNAs are also involved.

Although our implicit modeling of TF binding is a simple strategy, it provides clear trends and guiding insights into the regulation of chromatin architecture by protein binding. Further modeling with explicit proteins or consideration of various biomolecules will undoubtedly increase our understanding of genome regulation on the many structural and temporal levels involved in complex cellular contexts.

Methods

Chromatin mesoscale model

Briefly, our chromatin mesoscale model contains coarse-grained elements that define the nucleosome cores, linker DNA, histone core tails, and LH (Bascom and Schlick, 2017; Arya and Schlick, 2006; Collepardo-Guevara and Schlick, 2014; Perišić et al., 2010; Figure 7A). In particular, cores composed of eight histone proteins and ∼147 bp of DNA are described as rigid cylinders with 300 Debye-Hückel charges distributed on their irregular surface calculated by the DiSCO algorithm (Zhang et al., 2003). Linker DNA connecting nucleosomes is treated with a combined worm-like chain and bead model (Jian et al., 1997) based on the charged colloidal cylinder approach derived by Stigter, 1977 where each bead has a salt-dependent charge and a resolution of ∼9 bp. Histone tails (Arya and Schlick, 2006) and LHs (Luque et al., 2014; Perišić et al., 2019) are coarse-grained as five residues per bead with the Levitt-Warshel united-atom protein model. The charges on histone tails beads are calculated as the sum of the five amino acids that compose the bead, whereas charges on LH beads are determined by the DiSCO algorithm. Acetylation of histone tails is modeled by alternatively folded rigid tails with altered force constants for stretching and bending with energies 100 times larger than wildtype tails (Collepardo-Guevara et al., 2015). Tail coordinates are determined separately for wildtype and acetylated tails, by swapping coordinates when a change occurs. The new tail configuration after the swapping is subject to a standard Metropolis acceptance/rejection criterion based on the changes in local electrostatic energy (Bascom and Schlick, 2017).

Our chromatin mesoscale model and modeling of Myc:Max binding to chromatin and the Eed gene.

(A) Coarse-grained chromatin elements (nucleosome core, linker DNA, histone tails, and linker histone, LH) form chromatin fibers at the kb level. (B) Crystal structure of Myc:Max forming a heterotetramer that binds to sequence-distant DNA regions. (C) Schematic representation of Myc:Max heterodimers and heterotetramers. (D) Illustration of our implicit modeling of Myc:Max binding to chromatin fibers: distance constraints following Hooke’s spring law are enforced when two linker DNAs with Myc:Max bound are in close spatial proximity (20 nm) and at at least 30 linker DNAs apart. (E) Design of the Eed gene loci using Mnase-seq data to position nucleosome free regions (NFRs) (orange track) and Chip-seq data to position LHs (blue track) and Myc:Max binding regions (green track).

The potential energy function of the model includes stretching, bending, and twisting terms for the linker DNA (ES, EB, ET), stretching and bending terms for histone tails (EtS, EtB) and LHs (ElhS, ElhB), and excluded volume (EV) and electrostatic (EC) terms for all beads as:

(1) E(r)=ES+EB+ET+EtS+EtB+ElhS+ElhB+EV+EC,

where r is the collective position vector.

Further details on the method parameters can be found in Bascom and Schlick, 2017; Arya and Schlick, 2006; Collepardo-Guevara and Schlick, 2014; Perišić et al., 2010.

TF Binding modeling

Based on the crystal structure (PDBID 1NKP) of the Myc:Max complex (Figure 7B) showing that heterotetramers can bind to sequence-distant regions, we simulate the binding of Myc:Max implicitly (Figure 7C) by adding restraints between two genome loci bi and bj (Figure 7D). In particular, a harmonic energy penalty of the form:

(2) Ebibj=k(lbibjl0)2

is applied to two target DNA linker beads. Here l0 is selected as 13 nm based on the distance between the two DNA chains in the crystal structure, and k is set to 20 kcal/molnm2 as this value ensures that the energy penalty is not strong enough to produce overlapping of DNA beads or cores, but small compared to the total energy of the system.

During each trajectory, two DNA beads that have TF bound can engage in a constraint if the distance between them is less than 20 nm and if they are separated by at least 30 beads. If during the simulation, two beads that were engaged in a constraint separate more than 20 nm, the constraint is eliminated.

Systems

TF binding to different regions

To determine the effect of TF binding location on chromatin architecture, we simulate 50-nucleosome fibers without LH and histone acetylation, and with short, medium, or long linker DNA lengths, such as 26, 44, and 62 bp. Additionally, because real-life chromatin fibers are non uniform, we also simulate a life-like fiber with the linker DNAs that follow the distribution of linkers found in mouse embryonic stem cells (mESCs) (Voong et al., 2016): 30% for 26 bp, 17% for 35 bp, 15% for 44 bp, 13% for 53 bp, 9% for 62 bp, 7% for 70 bp, and 9% for 80 bp, as determined in our previous work (Portillo-Ledesma et al., 2022).

For each system, we study four different binding patterns (see Figure 1A):

  1. Topology 1: five equally distributed binding regions that occupy each the linker DNA of 5 consecutive nucleosomes

  2. Topology 2: two binding regions, one at the beginning and one at the end of the fiber, that occupy each the linker DNA of 15 consecutive nucleosomes

  3. Topology 3: one binding region located in the middle of the fiber that occupies the linker DNA of 15 consecutive nucleosomes

  4. Topology 4: two binding regions, one located close to the beginning and another one located close to the end of the fiber, that occupy each the linker DNA of five consecutive nucleosomes

Note that by binding region we mean a region whose linker DNA beads are marked by one Myc:Max dimer.

TF binding at increasing concentrations

To determine the interplay between TF binding and chromatin internal parameters like LH, histone acetylation, and linker DNA length, we study TF binding at increasing concentrations such as 0, 25, 50, 75, and 100%. In particular, we study 50-nucleosome fibers of 26, 35, 44, 53, 62, 70, and 80 bp linker DNAs, as well as life-like fibers in three conditions:

  1. Wildtype, systems have no LH and no acetylation

  2. +LH, systems have a density of 1 LH/nucleosome

  3. +Acetylation, systems have four histone acetylation islands located at nucleosomes 6–11, 17–22, 28–33, and 39–44

In these three conditions, TFs are allowed to bind to any linker DNA region. TF concentration is calculated as the percentage of linker DNA beads that can bind TF.

Gene repression by Myc:Max binding

In the mm9 mouse genome assembly, Eed is located on chr7:97,103,164–97,129,486, occupying ∼26 kbp. To build Eed, Mnase-seq data (GSM2083107) (Mieczkowski et al., 2016) of mESCs were used to position nucleosome-free regions (NFRs). In particular, the data were downloaded in bedGraph format from the Genome Omnibus Expression (GOE) repository and loaded into the UCSC Genome Browser without further processing. NFRs were visually inspected and identified as genomic regions with the absence of signal. Details of MNase-seq data processing can be found at the GEO site where the data are deposited. As shown in Figure 7E, 7 NFRs can be identified. NFRs 1 and 2 are long and we model them with ∼350 bp, whereas NFRs 3–7 are short, and we model them with ∼200 bp. Similar to the life-like fibers, the linker length distribution of mESCs (Voong et al., 2016) was used to determine the distribution of nonuniform linker lengths, and the nucleosome repeat length (NRL) typical of mESCs, 189 bp (Woodcock et al., 2006), was used to calculate the number of nucleosomes in the 24 kbp fiber (length of the fiber without the NFRs), obtaining 129 nucleosomes. In the Eed gene, two binding regions for Myc:Max have been detected with Chip-seq experiments (GSE48175) (Krepelova et al., 2014; Figure 7E). One between 97,128,221 and 97,129,793 bp, and a second one between 97,124,070 and 97,124,896 bp. Thus, in the Eed model, these regions are defined as TF binding regions 1 and 2 (Figure 7E). Similarly, LH positions were determined based on Chip-seq data (GSE46134) (Cao et al., 2013) where peaks with frequencies at least 10% of the highest frequency peak were selected (Figure 7E), producing a density of 0.37 LH/nucleosome, close to the average LH density of 0.5 found in mESCs (Woodcock et al., 2006). Histone acetylation was not introduced in the model, as analyzed Chip-seq data (Zhang et al., 2022b; Shen et al., 2012; Zhang et al., 2022a) show no acetylation in the Eed region. To determine the role of LH in the Eed activation mechanism, we additionally simulate the Eed system (with and without TF) with LH density ρ = 0.8 (LHs randomly distributed), as found in mouse somatic cells (Woodcock et al., 2006). The list of linker DNAs, LH positions, and Myc:Max positions are detailed in Supporting Table 1.

Simulation and analysis

Each system is sampled with Monte Carlo (MC) simulations. For the 50-nucleosome systems, we use 50 million MC steps and 10 independent replicas. For the Eed system of 129 nucleosomes, we use 70 million MC steps and 20 independent replicas. All replicas are started from different random seeds and a DNA twisting angle of 0, +12, or –12 degrees to mimic natural variations (Drew and Travers, 1985). Frames are saved every 100,000 MC steps.

The MC moves include a global pivot move that selects the shorter fiber end passing through a random axis and rotates it, local translation and rotation moves of DNA beads and cores, and local translation of LH beads (Bascom and Schlick, 2017). Histone tails are sampled with a bead-by-bead regrowth move by the Rosenbluth scheme (Arya and Schlick, 2006).

Trajectory convergence is monitored by the system energy and global (end-to-end distance and sedimentation coefficient) and local (nucleosome triplet angle) parameters (Figure 1—figure supplement 7) over the entire ensemble (10 trajectories for each 50-nucleosome system and 20 trajectories for the Eed gene). For the 50-nucleosome systems, the last 10 million steps, or 100 structures, of each of the 10 trajectories are used to create ensembles of 1000 structures per system. For the Eed gene systems, we create ensembles of 2000 structures with the last 10 million steps of 20 trajectories. These ensembles are analyzed to calculate fiber packing ratio, sedimentation coefficients, radius of gyration, and volume and area of the promoter region.

The fiber packing ratio is calculated as:

(3) Pr=11NCFl,

where NC is the total number of nucleosomes and Fl is the fiber length calculated by defining the fiber axis with a cubic smoothing spline interpolation to the nucleosomes x, y, and z coordinates; see details in the supporting information of Portillo-Ledesma et al., 2022.

Sedimentation coefficients are calculated as:

(4) S20,w=((S1S0)ρ+S0)(1+R1NCij1Rij),

where S0 and S1 are the sedimentation coefficients of a mononucleosome without LH (S0=11.1S) (Garcia-Ramirez et al., 1992) and with LH (S1=12S) (Butler and Thomas, 1998), respectively, ρ is the LH density on the fiber, R1 is the radius of a nucleosome (R1= 5.5 nm), NC is the number of nucleosomes in the chromatin fiber, and Rij is the distance between the nucleosomes i and j.

The radius of gyration, which describes the overall dimension of the chromatin fiber, is measured as the root mean squared distance of each nucleosome from the center of mass according to:

(5) Rg2=1NCj=1NC(rjrmean)2,

where NC is the number of nucleosomes, rj is the center position of the nucleosome core j, and rmean is the average of all core positions.

The volume and area of the Eed promoter region that overlaps with the TF binding region 1 are calculated using the alpha shape function of Matlab. With alphaShape, a bounding area or volume is created to envelop the nucleosome coordinates. We use an alpha value of 100 to create a loose shape. In particular, we use the x, y (for area) or x, y, z (for volume) coordinates of nucleosomes 123–129 as vertices, as they are located in the promoter region.

Interactions among nucleosomes are calculated every 100,000 MC steps for each trajectory, normalized across all trajectory frames, and summed to create a contact map that we plot at a logarithmic scale. Two nucleosomes i and j are considered to be in contact if any element of nucleosome i, such as core, tails, or linker DNA, is less than 2 nm from any element of nucleosome j. These matrices are calculated in both bp and nucleosome resolution.

Internucleosome interaction matrices at nucleosome resolution are decomposed into one-dimensional plots that depict the magnitude of i, i±k interactions, or contact patterns, as follows:

(6) I(k)=i=1NCI(i,i±k)j=1NCI(j),

where NC is the number of nucleosomes, I is the internucleosome interaction matrix, and k is the number of nucleosomes between cores i and j.

To estimate the number of microdomains in each system, we convert the interaction matrices in nucleosome resolution to distance matrices by calculating the inverse of each element in the matrix (d=1/frequency) and perform a clustering analysis with the DBSCAN algorithm (Ester et al., 1996). To define the clusters, we set the parameter minpoints as five for every system, and the radius of search, ε, as 3 for the 62 bp system, 2 for the 44 bp and life-like systems, and 1.4 for the 26 bp system.

Data availability

Matlab scripts for data analysis, as well as representative structures (in pdb format) for the 26 bp, 44 bp, 62 bp, and life-like fibers, and the Eed system are deposited on Zenodo.

The following data sets were generated
    1. Portillo-Ledesma S
    2. Chung S
    3. Hoffman J
    4. Schlick T
    (2023) Zenodo
    Transcription Factor Binding Regulates Chromatin Architecture.
    https://doi.org/10.5281/zenodo.8199993
The following previously published data sets were used
    1. Cao K
    2. Lailler N
    3. Dong X
    4. Bouhassira EE
    5. Fan Y
    (2013) NCBI Gene Expression Omnibus
    ID GSE46134. High Resolution Mapping of H1 Linker Histone Variants in Embryonic Stem Cells.

References

  1. Book
    1. Bascom GD
    2. Schlick T
    (2017) Mesoscale modeling of Chromatin fibers
    In: Lavelle C, Victor JM, editors. Nuclear Architecture and Dynamics. Academic Press. pp. 123–147.
    https://doi.org/10.1016/B978-0-12-803480-4.00005-3
  2. Conference
    1. Ester M
    2. Kriegel HP
    3. Sander J
    (1996)
    A density-based algorithm for discovering clusters in large spatial databases with noise
    Proceedings of 2nd International Conference on Knowledge Discovery and Data Mining.
    1. Garcia-Ramirez M
    2. Dong F
    3. Ausio J
    (1992)
    Role of the histone “tails” in the folding of oligonucleosomes depleted of histone H1
    The Journal of Biological Chemistry 267:19587–19595.

Peer review

Reviewer #1 (Public Review):

In this study, authors performed multiple sets of mesoscale chromatin simulations at nucleosome resolution to study the effects of TF binding on chromatin structures. Through simulations at various conditions, authors performed systemically analysis to investigate how linker histone, tail acetylation, and linker DNA length can operate together with TFs to regulate chromatin architecture. Using gene Eed as one example, authors found that binding of Myc:Max could repress the gene expression by increasing fiber folding and compaction and this repression can be reversed by the linker histone. Understanding how transcription factors bind to regulatory DNA elements and modulate chromatin structure and accessibility is an essential question in epigenetics. Through modelling of TF binding to chromatin structures at nucleosome levels, authors demonstrated that TF binding could create microdomains that are visible in the ensemble-based contact maps and short DNA linkers prevent the formation microdomains. It has also been shown that tail acetylation and TF binding have opposite effects on chromatin compaction and linker histone can compete for the linker DNA with TF binding to impair the effect of TF binding. This study improves our knowledge on how TFs collaborate with different epigenetic marks and chromatin features to regulate chromatin structure and accessibility, which will be of broad interest to the community.

https://doi.org/10.7554/eLife.91320.3.sa1

Reviewer #2 (Public Review):

In this paper, Portillo-Ledesma et al. study chromatin organization in the length scale of a gene, simulating the polymer at nucleosome resolution. The authors have presented an extensive simulation study with an excellent model of chromatin. The model has linker DNA and nucleosomes with all relevant interactions (electrostatics, tails, etc). Authors simulate 10 to 26 kb chromatin with varying linker lengths, linker histones (LH), and acetylated tails. The authors then study the effect of a transcription factor (TF) Myc: Max binding. The critical physical feature of the TF in the model is that it binds to the linker region and bends the DNA to make loops/intra-chromatin contacts. Authors systematically investigate the interplay between different variables such as linker DNA length, LH density, and the TF concentration in determining chromatin compaction and 3D organization.

The manuscript is well-written and is a relevant study with many useful results. The biggest strength of the work is the fact that the authors start with a relevant model that incorporates well-known biophysical properties of DNA, nucleosomes, linker histones, and the transcription factor Myc:Max. One of the novel results is the demonstration of how linker lengths play an important role in chromatin compaction (measured by computing packing ratio) in the presence of DNA-bending TFs. As the TF concentration increases, chromatin with short linker lengths does not compact much (only a small change in packing ratio). If the linker lengths are long, a higher percentage of TFs leads to an increase in packing ratio (higher compaction). Authors further show that TFs are able to compact life-like chromatin fiber with linker length taken from a realistic distribution. The authors compute inter-nucleosomal contact maps from their simulated configurations and show that the map has features similar to what is observed in Hi-C/Micro-C experiments. Authors study the compaction of the Eed gene locus and show that TF binding leads to the formation of small domains known as micro-domains. Authors have predicted many relevant and testable quantities. Many of the results agree with known experiments like the formation of the micro-domains. Hence, the conclusions made in this study are justified - they follow from the simulation results.

https://doi.org/10.7554/eLife.91320.3.sa2

Author response

The following is the authors’ response to the original reviews.

Reviewer #1:

1. The results that TF binding produces microdomains at medium and long linker DNA but not short linker is very interesting. Although the differences can be observed from the figure, it still lacks of quantitative comparison. It is not clear the exact definition of the microdomain observed from simulations and what numbers of microdomains can be identified under different conditions. A quantitative comparison of different conditions could also be provided.

We thank the reviewer for this suggestion. Our intent was to show qualitatively how TF binding locations that we design can direct fiber folding and create microdomains, which we define in the paper as high frequency contact regions in the contact maps, similar to the TADs observed in HiC maps. Together with the fiber configurations, contact maps allow us to identify formation of such microdomains, and to observe how these microdomains change depending on the conditions we build into the model, such as TF binding region or linker DNA length.

To address your point, we have added a clustering analysis of the contact matrices with nucleosome resolution and assign each contact along the genome position (nucleosome index) to a cluster. In Figure 1- figure supplement 1 and 2, we show how DBSCAN clustering provides a clustering distribution that quantitatively describes the microdomains observed in the matrices and estimates the number of microdomains. For example, in the 44 and 62 bp systems, the contacts along the genomic distance separate into 5, 2, and 1 nucleosome groups for topologies 1 to 3, and into 2 and 1 group for topology 4, respectively. In the 26 bp and life-like systems, where microdomains are more diffuse due to fiber rigidity or polymorphism, we see that the clustering results are not as TF-topology-dependent as in the 44 and 62 bp systems. We also decomposed the contact matrices into one dimensional plots that depict the magnitude of i, i ± k internucleosome interactions. We see that internucleosome patterns change with the TF binding topology, and that the 26 bp and life-like systems show the least changes.

1. When increasing TF concentration, from 0 to 100%, it seems that both packing ratio and sedimentation coefficients are not sensitive to the TF concentrations after 25%. Is it due to the saturation of TF binding? How many TF binding sites are considered at each concentration?

Yes, in most cases, at TF concentrations higher than 25%, the fiber compaction does not change due to saturation of TF binding. Although the TF concentrations are reached, such as 50%, 70%, or 100%, these do not influence the fiber architecture. A higher order folding and compaction cannot be reached due to excluded volume interactions that impede overlapping of beads in the model.

We have clarified this in the manuscript.

As stated in the Methods section, the TF concentration refers to the number of linker DNA beads that can engage in a constraint compared to the total number of linker DNA beads. Thus, at 25% TF, 25% of linker DNA beads are engaged in TF constraints. We have added a comment on this in the Results section.

1. It is shown that the contact maps that reveal microdomains are ensemble-based maps and single trajectories do not show clear formation of microdomains. Does the formation of microdomains increase with the number of combined trajectories?

The formation of microdomains occurs in each single trajectory. However, the microdomains formed in each trajectory can be different. That is why ensemble-based maps show clearer trends of microdomains that might not be as visible in single-trajectory maps. If we increase the number of trajectories, the macrodomains will be more visible and there will be more macrodomains in the contact map, but the formation of microdomains will not increase in each single trajectory.

1. "As we see from Figure 3A, when the linker DNA is short, such as 26 and 35 bp, TF binding does not increase the packing ratio of the fiber." The results of 35bp cannot be found in Figure 3A. In addition, the color of 44 and 62 bp should be changed since they are very similar in the figure.

Thank you for catching this. The results corresponding to the 35 bp system are presented in the Figure 3-figure supplement 1. We have changed the text to read “As we see from Figure 3A and Figure3- figure supplement 1..”.

We have changed the color of the 62 bp trace to blue in the plots of Figure 3. Consistently, we have also changed the color of the 62 bp fiber in Figure 1 and Figure 4.

1. For modelling of TF binding at increasing concentrations, it is mentioned that in these three conditions, TFs are allowed to bind to any region. Do you mean TF can also bind to nucleosomal DNA? Nucleosome structure prevents the binding of many TFs.

In our model, only linker DNA beads can engage in the constraints (bind TF).

We have changed the text to read “TFs are allowed to bind to any linker DNA region”.

1. The details of the Mnase-seq dataset and how NFRs are identified should be provided, such as the coverage of the data and what read fragments are selected for NFR mapping.

MNase data in bedgraph format were downloaded from the Genome Expression Omnibus (GSM2083107) repository and loaded without further processing into the Genome Browser. NFRs were visually inspected and detected as genomic regions without peaks. As detailed in the GEO repository, the sequenced paired-end reads were mapped to the mm9 genome. Only uniquely mapped reads with no more than two mismatches were retained and reads with insert sizes less than 50 or larger than 500 bp were discarded.

We have clarified this in the manuscript.

1. The calculations of volume and area of the Eed promoter region should be further elucidated.

Thank you. We now elaborate upon these calculations. In particular, the Eed promoter region is defined between cores 123 and 129. The x,y or x,y,z coordinates of those cores are used to create the bounding area or volume by defining the shape’s vertices.

1. In Figure 2, it is not clear how different topology are identified.

In Figure 2 the topology, or TF binding regions, is the same for each of the 10 contact maps as these emerge from trajectory replicas of the same system which we named Topology 1. Different microdomains are formed in each individual trajectory as the high-frequency regions appear in different locations on each contact map. However, when these 10 maps are summed, the ensemble contact map clearly shows consensus microdomains in each region where TF binds.

Reviewer #2:

To further improve the manuscript, I have the following suggestions/comments.

1. While most of the conclusions in this paper follow from the evidence provided by the ximulations, the result in section 3.3 title "Gene locus repression is medicated by TF finding," may not follow from the results. In my opinion, repression is a more complex process, and many more factors (such as nucleosome positioning, nucleosome sliding, histone methylation, and other proteins such as PRC or HP1, etc) may be involved in repression. While compaction is often associated with repressed chromatin (heterochromatin), recent studies have shown that heterochromatin fibers are highly diverse, and compaction alone may not be the criteria for repression (eg. see Spracklin et al. Nat. Struct. Mol. Biol. 30, 38-51 (2023).). In this light, I would recommend slightly modifying the title to say, "TF binding-mediated compaction can help in gene locus repression" or something similar.

Yes! We completely agree that gene repression is a very complex phenomenon that involves many factors that we are approaching by modeling starting from the simplest strategy. Thus, we have changed the subtitle to read “TF binding-mediated compaction as possible mechanism of gene locus repression”.

1. Authors could also present the contact probability versus genomic distance. This may provide some generic features at nucleosome resolution, given the variability in linker length and LH density.

We thank the reviewer for this suggestion. We have now calculated the contact probability for the EED gene with and without TF binding (Figure 5- figure supplement 1). We see that the contact probability corresponding to short range interactions (i ± 2, 3, 4, 5, and 6) is slightly lower for the EED gene upon TF binding. However, a striking increase in the contact probability upon TF binding is seen in the genomic region between 3 and 5 kb, which corresponds to local loop interactions. Thus, TF binding slightly decreases local interactions but increases chromatin loops. Such changes are not observed for the EED system with LH density 0.8 (Figure 6- figure supplement 1), further supporting the idea that an increase in LH density hampers the effect of TF binding for the EED gene architecture.

We have now added these results to the manuscript.

1. Write a short paragraph about the limitations of the model/study. For example, one of the limitations could be that, as of now, it has only the effect of a few proteins, but to predict repression, one may need to incorporate the effect of several proteins.

We agree with the reviewer that our model is a simple, first-step approach. Nonetheless, even the simplest mathematical model can be enlightening in helping dissect essential factors. Here, our model clearly shows how TF binding location modulates fiber architecture and the interplay between TF binding and other chromatin elements, like linker DNA length, LH density, and histone acetylation. We have now stated in the Discussion section that although limited due to being implicit and not considering other protein partners, our model can provide insights on the regulation of chromatin architecture by protein binding. Future modeling with explicit protein binding or combination of several proteins will further help us understand genome folding regulation.

1. The radius of gyration of 26 kb chromatin is around ~60nm in this paper. Is there any experimental measurement to compare (approximate order of magnitude)? While I do not know any measurement for Eed gene locus, I am aware of the results in the Boettiger et al. paper from Xiaowei Zhuang lab (Nature 2016). There, they find that the Rg of a 26 kb region is above 100nm. But that is for a different organism, a different set of genes. Also, see Sangram Kadam et al. Nature Communications 14 (1), 4108, 2023.

Thank you for this suggestion. To the best of our knowledge, there are no radius of gyration measurements for the EED gene. Regarding the two papers you cite, in the paper from Boettiger et al. (1) they determine by microscopy experiments that Rg ∝ L! where L is the genomic length and c is 0.37 ± 0.02 for active chromatin (Figure 1d of the paper). In such case, the Rg for a 26 kb region would be 43 ± 9 nm. Considering that these are Drosophila cells, our value of 62 nm is in good agreement with that estimate. Regarding the Kadam et al. paper (2), by coarse grained modeling they find an Rg of around 100 nm for different genes. Considering that the radius of gyration depends on cell type and fiber configuration (see for example (3) for the dependency of Rg on loop number and persistence length), we believe that our measurements in the same ball park as experimental results and other theoretical modeling studies are good indicators of our model’s reasonableness.

We have added this comparison to the manuscript.

1. The reason why it is useful to compare some distance measurements (physical dimension) with experiments is the following: The contact map in Hi-C only gives relative contact probabilities. It does not give absolute contact probabilities. To convert a Hi-C map into a physical distance, one requires comparison with some experimentally measured 3D distance. The radius of gyration is an ideal quantity to compare. From my experience, the contact probability is often much smaller than 1, suggesting that the chromatin is more expanded. But this could be due to the effect of many other proteins in vivo and the crowding, etc. I do not expect this work to incorporate all those effects. However, it may be useful to make a comment about it in the manuscript.

Thank you. We have added to the discussion a comment on our first-generation model of TF binding to chromatin and the neglect of many associated protein and RNA cofactors that certainly influence chromosome folding and domain formation on higher scales. Some distance measures are also added to the Results as mentioned above.

References

1. Boettiger,A.N., Bintu,B., Moffitt,J.R., Wang,S., Beliveau,B.J., Fudenberg,G., Imakaev,M., Mirny,L.A., Wu,C. and Zhuang,X. (2016) Super-resolution imaging reveals distinct chromatin folding for different epigenetic states. Nature, 529, 418–422.

2. Kadam,S., Kumari,K., Manivannan,V., Dutta,S., Mitra,M.K. and Padinhateeri,R. (2023) Predicting scale-dependent chromatin polymer properties from systematic coarsegraining. Nat. Commun., 14, 4108.

3. Wachsmuth,M., Knoch,T.A. and Rippe,K. (2016) Dynamic properties of independent chromatin domains measured by correlation spectroscopy in living cells. Epigenetics Chromatin, 9, 57.

https://doi.org/10.7554/eLife.91320.3.sa3

Article and author information

Author details

  1. Stephanie Portillo-Ledesma

    1. Department of Chemistry, 100 Washington Square East, Silver Building, New York University, New York, United States
    2. Simons Center for Computational Physical Chemistry, 24 Waverly Place, Silver Building, New York University, New York, United States
    Contribution
    Conceptualization, Data curation, Software, Formal analysis, Investigation, Visualization, Methodology, Writing - original draft, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-8211-5447
  2. Suckwoo Chung

    Department of Chemistry, 100 Washington Square East, Silver Building, New York University, New York, United States
    Contribution
    Data curation, Formal analysis, Investigation, Visualization
    Competing interests
    No competing interests declared
  3. Jill Hoffman

    Department of Chemistry, 100 Washington Square East, Silver Building, New York University, New York, United States
    Contribution
    Software, Formal analysis, Investigation
    Competing interests
    No competing interests declared
  4. Tamar Schlick

    1. Department of Chemistry, 100 Washington Square East, Silver Building, New York University, New York, United States
    2. Simons Center for Computational Physical Chemistry, 24 Waverly Place, Silver Building, New York University, New York, United States
    3. Courant Institute of Mathematical Sciences, New York University, New York, United States
    4. New York University-East China Normal University Center for Computational Chemistry, New York University Shanghai, Shanghai, China
    Contribution
    Conceptualization, Resources, Supervision, Funding acquisition, Investigation, Project administration, Writing – review and editing
    For correspondence
    schlick@nyu.edu
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-2392-2062

Funding

National Institute of General Medical Sciences (R35-GM122562)

  • Tamar Schlick

National Science Foundation (2151777)

  • Tamar Schlick

New York University

  • Stephanie Portillo-Ledesma

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This work builds upon initial explorations of protein binding to chromatin fibers in our chromatin code by Dr. Gavin Bascom. Support from the National Institutes of Health, National Institute of General Medical Sciences Award R35-GM122562, National Science Foundation Award (2151777) from the Division of Mathematical Sciences, and Philip-Morris USA Inc to TS is gratefully acknowledged. The authors thank the HPC team at NYU for providing computational resources on the Greene NYU Supercomputer. SPL is grateful for financial support from the Simons Center for Computational Physical Chemistry at NYU.

Senior and Reviewing Editor

  1. Detlef Weigel, Max Planck Institute for Biology Tübingen, Germany

Version history

  1. Sent for peer review: August 30, 2023
  2. Preprint posted: September 27, 2023 (view preprint)
  3. Preprint posted: November 30, 2023 (view preprint)
  4. Preprint posted: December 18, 2023 (view preprint)
  5. Version of Record published: January 19, 2024 (version 1)

Cite all versions

You can cite all versions using the DOI https://doi.org/10.7554/eLife.91320. This DOI represents all versions, and will always resolve to the latest one.

Copyright

© 2023, Portillo-Ledesma et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 397
    Page views
  • 57
    Downloads
  • 0
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Stephanie Portillo-Ledesma
  2. Suckwoo Chung
  3. Jill Hoffman
  4. Tamar Schlick
(2024)
Regulation of chromatin architecture by transcription factor binding
eLife 12:RP91320.
https://doi.org/10.7554/eLife.91320.3

Share this article

https://doi.org/10.7554/eLife.91320

Further reading

    1. Cell Biology
    2. Chromosomes and Gene Expression
    Carolline Ascenção, Jennie R Sims ... Marcus B Smolka
    Research Article

    Meiotic sex chromosome inactivation (MSCI) is a critical feature of meiotic prophase I progression in males. While the ATR kinase and its activator TOPBP1 are key drivers of MSCI within the specialized sex body (SB) domain of the nucleus, how they promote silencing remains unclear given their multifaceted meiotic functions that also include DNA repair, chromosome synapsis, and SB formation. Here we report a novel mutant mouse harboring mutations in the TOPBP1-BRCT5 domain. Topbp1B5/B5 males are infertile, with impaired MSCI despite displaying grossly normal events of early prophase I, including synapsis and SB formation. Specific ATR-dependent events are disrupted, including phosphorylation and localization of the RNA:DNA helicase Senataxin. Topbp1B5/B5 spermatocytes initiate, but cannot maintain ongoing, MSCI. These findings reveal a non-canonical role for the ATR-TOPBP1 signaling axis in MSCI dynamics at advanced stages in pachynema and establish the first mouse mutant that separates ATR signaling and MSCI from SB formation.

    1. Chromosomes and Gene Expression
    Masaaki Sokabe, Christopher S Fraser
    Insight

    A new in vitro system called Rec-Seq sheds light on how mRNA molecules compete for the machinery that translates their genetic sequence into proteins.