OpenNucleome for high-resolution nuclear structural and dynamical modeling
Abstract
The intricate structural organization of the human nucleus is fundamental to cellular function and gene regulation. Recent advancements in experimental techniques, including high-throughput sequencing and microscopy, have provided valuable insights into nuclear organization. Computational modeling has played significant roles in interpreting experimental observations by reconstructing high-resolution structural ensembles and uncovering organization principles. However, the absence of standardized modeling tools poses challenges for furthering nuclear investigations. We present OpenNucleome, an open-source software package designed for conducting GPU-accelerated molecular dynamics simulations of the human nucleus. OpenNucleome offers particle-based representations of chromosomes at a resolution of 100 KB, encompassing nuclear lamina, nucleoli, and speckles. This software furnishes highly accurate structural models of nuclear architecture, affording the means for dynamic simulations of condensate formation, fusion, and exploration of non-equilibrium effects. We applied OpenNucleome to uncover the mechanisms driving the emergence of ‘fixed points’ within the nucleus, signifying genomic loci robustly anchored in proximity to specific nuclear bodies for functional purposes. This anchoring remains resilient even amidst significant fluctuations in chromosome radial positions and nuclear shapes within individual cells. Our findings lend support to a nuclear zoning model that elucidates genome functionality. We anticipate OpenNucleome to serve as a valuable tool for nuclear investigations, streamlining mechanistic explorations and enhancing the interpretation of experimental observations.
eLife assessment
This important work significantly advances the field of computational modeling of genome organization through the development of OpenNucleome. The evidence supporting the tool's effectiveness is compelling as the authors compare their predictions with experimental data. It is anticipated that OpenNucleome will attract significant interest from the biophysics and genomics communities.
https://doi.org/10.7554/eLife.93223.3.sa0
Introduction
The highly complex structural organization of the human nucleus plays a crucial role in the functioning and regulation of our cells (Dekker et al., 2017; Hübner et al., 2013; Bickmore, 2013; Gorkin et al., 2014; Dekker and Mirny, 2016; Furlong and Levine, 2018; Finn and Misteli, 2019; Chen and Belmont, 2019; Lin et al., 2021; Liu et al., 2024). The complexity arises from the diverse range of nuclear landmarks, such as nucleoli (Lafontaine et al., 2021), nuclear speckles (Chen and Belmont, 2019; Lamond and Spector, 2003), and the nuclear lamina (van Steensel and Belmont, 2017), each serving distinct functions. These landmarks provide specialized environments for various nuclear processes, allowing for efficient coordination and regulation of gene expression. Moreover, the spatial arrangement of chromosomes within the nucleus, intertwined with the nuclear landmarks, is critical for proper gene regulation and communication between different genome regions. Disruptions or abnormalities in the nuclear organization can have profound consequences on cellular function and can contribute to the development of diseases, including cancer and genetic disorders (Seruga et al., 2008; SchusterBöckler and Lehner, 2012).
Recent advancements in experimental techniques have significantly enhanced our understanding of nuclear organization (Bickmore, 2013; Schmitt et al., 2016; McCord et al., 2020; Parmar et al., 2019; Jerkovic and Cavalli, 2021; Chen et al., 2016). The advent of high-throughput sequencing-based methods, such as genome-wide chromosome-conformation capture (Hi-C), has unveiled crucial structural elements of the genome (Dekker et al., 2002; Lieberman-Aiden et al., 2009), including chromatin loops (Rao et al., 2014), topologically associating domains (Dixon et al., 2016; Dekker and Heard, 2015), and compartments (Lieberman-Aiden et al., 2009). Additionally, sequencing-based techniques such as DamID (Greil et al., 2006), ChIP-Seq (Park, 2009), and TSA-Seq (Chen et al., 2018) have revealed valuable information regarding interactions between chromosomes and nuclear landmarks. However, it is worth noting that these sequencing methods often offer averaged contacts, which can mask the heterogeneity present across populations, although single-cell techniques are also emerging (Wen et al., 2020; Ramani et al., 2017; Nagano et al., 2013). Moreover, translating contact data into spatial positions can be challenging, adding complexity to interpreting experimental findings.
To complement these sequencing approaches, microscopic imaging techniques directly probe the spatial positions within individual nuclei (Bickmore, 2013; van Steensel and Belmont, 2017; Chen et al., 2015; Boettiger et al., 2016; Shachar et al., 2015). Recent advancements in DNA FISH (fluorescence in situ hybridization) have enabled high-throughput imaging of thousands of loci simultaneously (Su et al., 2020; Takei et al., 2021). These imaging studies have not only confirmed the structural features observed through sequencing techniques but have also provided valuable insights into the heterogeneity present at the single-cell level.
The abundance of available experimental data in the field of nuclear organization provides a fertile ground for structural modeling (Qi et al., 2020; Qi and Zhang, 2019; Boninsegna et al., 2022; Fujishiro and Sasai, 2022; Shi and Thirumalai, 2021; Dekker et al., 2013; Jost et al., 2014; Giorgetti et al., 2014; Di Pierro et al., 2017; Buckle et al., 2018; Nuebler et al., 2018; Bianco et al., 2018; Shi et al., 2018; MacPherson et al., 2018; Shin et al., 2023; AmiadPavlov et al., 2021; Brahmachari et al., 2022; Jiang et al., 2022; Ganai et al., 2014; Liu et al., 2018; Laghmach et al., 2020; Chu and Wang, 2021; Lappala et al., 2021; Chu and Wang, 2022; Goychuk et al., 2023; Sun et al., 2021; Kadam et al., 2023). To make sense of this wealth of information, various computational approaches have been introduced, with polymer simulation approaches being extensively utilized. These simulation techniques aid in reconstructing structural ensembles that closely replicate experimental data, offering valuable insights into the mechanisms underlying chromosome folding. In recent studies, these approaches have also been employed to investigate the interplay between the genome and the nuclear lamina (Bajpai et al., 2021; Kamat et al., 2023; Laghmach et al., 2021; Stephens et al., 2018), as well as nucleoli (Qi and Zhang, 2021), shedding light on their dynamic relationships.
Despite the progress made in computational modeling, the absence of well-documented software with easy-to-follow tutorials poses a challenge. Many research groups develop their own independent software, which complicates cross-validation and hinders the establishment of best practices for genome modeling (Fujishiro and Sasai, 2022; Yildirim et al., 2023; Oliveira Junior et al., 2021). Moreover, comprehensive models of the entire nucleus, especially at high resolution, remain scarce. Addressing these limitations and fostering collaboration in the scientific community can be achieved through the development of open-source tools. By promoting transparency and accessibility, such tools have the potential to greatly facilitate nuclear modeling and contribute to a more unified and collaborative research environment.
We present OpenNucleome, an open-source software package designed for conducting molecular dynamics (MD) simulations of the human nucleus. This software streamlines the process of setting up whole nucleus simulations through just a few lines of Python scripting. OpenNucleome can unveil intricate, high-resolution structural and dynamic chromosome arrangements at a 100 KB resolution. It empowers researchers to track the kinetics of condensate formation and fusion while also exploring the influence of chemical modifications on condensate stability. Furthermore, it facilitates the examination of nuclear envelope deformation’s impact on genome organization. The software’s modular architecture enhances its adaptability and extensibility. Leveraging the power of OpenMM (Eastman et al., 2017), a GPU-accelerated MD engine, OpenNucleome ensures efficient simulations.
Our work demonstrates the fidelity of the simulated nuclear organizations by faithfully reproducing Hi-C, Lamin B DamID, TSA-Seq, and DNA-MERFISH data. The dynamic insights extracted from this model are pivotal in advancing our understanding of nuclear organization mechanisms. Our findings reveal that inherent heterogeneity in chromosome contacts naturally emerges within single cells. Interestingly, robust contacts between chromosomes and nuclear bodies can also be established due to a coupled self-assembly mechanism. Notably, the resilience of contacts involving nuclear bodies supports a nuclear zoning model for genome function. In the realm of nuclear investigations, we anticipate OpenNucleome to serve as an invaluable tool, seamlessly complementing experimental techniques.
Results
Nonequilibrium nucleus model at 100 KB resolution
We present an open-source implementation of a computational framework that facilitates the structural and dynamical characterization of the human nucleus. This framework builds upon a previous investigation but incorporates several significant modifications. Firstly, we enhance the model resolution by a factor of 10, enabling the precise determination of the spatial positioning of each chromatin segment measuring 100 KB in length. Secondly, we present a kinetic scheme for speckles that accounts for the phosphorylation of protein molecules. This inclusion captures the influence of chemical reactions on the stability and dynamics of nuclear bodies. Thirdly, we incorporate explicit nuclear envelope dynamics to explore the impact of large-scale deformations on genome organization. Finally, our implementation in OpenMM offers the advantages of Python scripting and GPU acceleration, facilitating easy extension and customization. These features should support the broad applicability and adoption of the proposed model.
The nucleus model provides particle-based representations for chromosomes, nucleoli, speckles, and the nuclear envelope. As shown in Figure 1A and B, each of the 46 chromosomes is represented as a beads-on-a-string polymer, where each bead represents a 100-KB-long genomic segment. Based on Hi-C data, we further assign each bead as compartment A, B, or C to signify euchromatin, heterochromatin, or pericentromeric regions. The lamina was modeled as a spherical enclosure with a 10 µm diameter, using discrete particles arranged to represent a mesh grid with covalent bonds linking together nearest neighbors (Strom et al., 2021). We modeled nucleoli and speckles as liquid droplets that emerge through the spontaneous phase separation of coarse-grained particles, representing protein and RNA molecule aggregates (Chen and Belmont, 2019; Lafontaine et al., 2021). These particles exhibited attractive interactions within the same type to promote condensation. More details about the various components of the system can be found in Appendix 1, section ‘Components of the whole nucleus model’.
The energy function of the nucleus model includes three components that account for the selfassembly of chromosomes, the assembly of nuclear bodies, and the coupling between chromosomes and nuclear landmarks. Therefore, the model approximates nuclear organization as a coupled selfassembly process. The chromosome energy function (see Equation 7 in Appendix 1, section ‘HiC inspired interactions for the diploid human genome’) includes terms that account for the polymer connectivity and excluded volume effect, an ideal potential, compartmentspecific interactions, and specific interchromosomal interactions. As shown in Figure 1C, the ideal potential is only applied for beads from the same chromosome to approximate the effect of loop extrusion by Cohesin molecules (Sanborn et al., 2015; Fudenberg et al., 2016) for chromosome compaction and territory formation (Di Pierro et al., 2016; Zhang and Wolynes, 2017). Compartmentspecific interactions, on the other hand, promote microphase separation and compartmentalization of euchromatin and heterochromatin. Finally, interchromosomal interactions account for sequencespecific effects that compartmentdependent potentials cannot capture.
Interactions among coarsegrained particles that form nuclear bodies were designed to promote and stabilize the formation of liquid droplets, as has been revealed by many experiments (Handwerger et al., 2005; Caragine et al., 2018; Caragine et al., 2019). We adopted the Lennard–Jones potential for nucleolar particles to mimic the weak, multivalent interactions that arise from protein and RNA molecules that make up the nucleoli. As a first attempt to approximate their complex dynamics, we considered two types of particles that form speckles: phosphorylated (P) and dephosphorylated (dP). The two types can interconvert via chemical reactions (Brackley et al., 2017; Söding et al., 2020; Carrero et al., 2006) and dP particles share attractive interactions modeled with the Lennard–Jones potential.
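For reference, the Lennard–Jones form mentioned here can be written as a short function. This is a minimal sketch in reduced units; the function name and the default values of epsilon and sigma are illustrative assumptions and do not reproduce the parameters or cutoff scheme used in OpenNucleome.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Standard 12-6 Lennard-Jones energy at separation r (reduced units).

    epsilon (well depth) and sigma (zero-crossing distance) are
    placeholder values chosen for illustration only.
    """
    if r <= 0:
        raise ValueError("separation must be positive")
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)
```

The energy is zero at r = sigma and reaches its minimum of -epsilon at r = 2^(1/6) sigma, which is the attractive well that drives condensation of like particles.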
As shown in Figure 1C, to recognize specific interactions between chromosomes and nuclear landmarks, we introduced contact potentials between them. These potentials are inspired by the experimental techniques that probe the corresponding contacts. Appendix 1, sections ‘Chromosome–nuclear landmark interactions’ and ‘Nuclear landmark–nuclear landmark interactions’ contain more details about all the nuclear landmarkrelated energy functions.
Optimization of model parameters with experimental data
The nucleus model was designed to be interpretable such that energy terms represent physical processes. Furthermore, the expressions of the interaction potentials were also designed such that their parameters can be determined from experimental data via the maximum entropy optimization algorithm (Lin et al., 2021; Xie and Zhang, 2019; Schuette et al., 2023). Below, we briefly outline the procedure used for parameter optimization; further details can be found in Appendix 1, section ‘Optimization of the whole nucleus model parameters’.
As illustrated in Figure 2, starting from a given set of parameters, we first perform MD simulations to produce a collection of 3D structures for the diploid genome and various nuclear bodies. These structures are then transformed into a contact map or contact probabilities between chromatin beads and nuclear landmarks by averaging over homologous chromosomes. Constraints corresponding to different energy terms are obtained from the simulated results and compared with those estimated from Hi-C, SON TSA-Seq, and Lamin B DamID profiles. Finally, the model parameters are updated based on the difference between simulated and experimental constraints using the adaptive moment estimation (Adam) optimization algorithm (Kingma and Ba, 2014). The three steps can be repeated with updated parameters to improve the simulation-experiment agreement further.
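The parameter-update step of this loop can be sketched with a plain Adam iteration. The code below is a hedged illustration, not the package's optimizer: the function names, the learning rate, and the sign convention (a parameter is raised when its simulated constraint exceeds the experimental one, as if it multiplies that constraint in the energy) are all assumptions made for this example.

```python
import math

def new_state(n):
    """Fresh Adam state (step counter, first and second moments) for n parameters."""
    return {"t": 0, "m": [0.0] * n, "v": [0.0] * n}

def adam_step(params, sim_vals, exp_vals, state,
              lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update driven by the simulation-experiment difference.

    Assumed convention: grad = -(simulated - experimental), so descending
    along grad increases a parameter whose constraint is over-represented.
    """
    state["t"] += 1
    t = state["t"]
    updated = []
    for i, (p, s, e) in enumerate(zip(params, sim_vals, exp_vals)):
        grad = -(s - e)
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * grad
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * grad * grad
        m_hat = state["m"][i] / (1 - beta1 ** t)  # bias-corrected first moment
        v_hat = state["v"][i] / (1 - beta2 ** t)  # bias-corrected second moment
        updated.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
    return updated
```

In a full cycle, each call to `adam_step` would be preceded by a new round of simulations that supplies `sim_vals` for the current parameters.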
No quantitative experimental data exists for interactions among nuclear body particles to serve as constraints. We varied the strength of the interaction potential to produce 2–3 nucleoli and ∼30 speckle clusters during the simulations (Figure 2—figure supplement 1) while ensuring the fluidity of the resulting droplets.
Molecular dynamics simulations with GPU acceleration
We implemented the nucleus model into the MD engine OpenMM (Eastman et al., 2017). OpenMM offers an excellent interface with Python scripting, significantly improving the readability and customizability of the model. The code was designed into functional modules, with different components, such as chromosomes and nuclear landmarks, written as separate classes. This design further facilitates the introduction of additional nuclear components, if desired, with minimal changes to existing code. We provide examples of simulation set up, trajectory analysis, parameter optimization, and introducing new features in the GitHub repository.
Figure 3A illustrates the workflow for setting up and executing whole nucleus simulations. A configuration file that provides the position of individual particles in the PDB file format is needed to initialize the simulations. This file also contains topological information regarding whether a particle represents chromosomes or nuclear landmarks and the identity of specific chromosomes. The input file can be generated with provided Python scripts by randomly distributing the positions of chromosomes, speckles, and nucleoli, though optimized configurations are also included in the GitHub repository. By default, the lamina particles will be uniformly placed on a sphere of 10 μm in diameter. Upon parsing the configuration file, interactions among various components can be set up with optimized parameters. This step will produce an object that can be used for MD simulations. As shown in Figure 3B, the workflow only requires a few lines of code. The package also includes analysis scripts to compute contact maps, monitor conformational dynamics, and track nuclear bodies.
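As an illustration of the default lamina setup, the sketch below places particles on a sphere of 10 μm diameter. It uses a Fibonacci lattice as a stand-in for uniform placement; the function name, the lattice choice, and the particle count are assumptions for illustration and are not taken from the OpenNucleome scripts.

```python
import math

def sphere_points(n, diameter=10.0):
    """Return n roughly evenly spaced points on a sphere (Fibonacci lattice).

    Coordinates are in the same length unit as `diameter` (here micrometers).
    """
    radius = diameter / 2.0
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n   # evenly spaced heights in (-1, 1)
        rho = math.sqrt(1.0 - z * z)     # radius of the circle at height z
        phi = golden_angle * i           # azimuth advances by the golden angle
        points.append((radius * rho * math.cos(phi),
                       radius * rho * math.sin(phi),
                       radius * z))
    return points
```

Every returned point lies at a distance of diameter/2 from the origin, so the particles form a closed shell around the chromosomes and nuclear bodies.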
A significant benefit of OpenMM is its native support of GPU acceleration. As shown in Figure 3C, simulations with one Nvidia Volta V100 GPU run 150 times faster than those with four Intel Xeon Platinum 8260 CPU cores. Notably, this performance enhancement cannot be achieved by simply increasing the number of CPU cores. For example, the simulation speed with 32 CPU cores is less than twice that with 4 CPU cores, potentially due to the system’s heterogeneous distribution of particles.
Simulations reproduce and predict diverse experimental data
We extensively validated the parameterized nucleus model to examine its biological relevance. MD simulations initialized from 50 different initial configurations were performed to build an ensemble of structures. As mentioned in the following section, a diverse set of initial configurations is essential for reproducing interchromosomal contacts probed in HiC. From the simulated structures, we computed various quantities for direct comparison with experimental measurements. Given that the majority of experimental data were analyzed for the haploid genome, we adopted a similar approach by averaging over paternal and maternal chromosomes to facilitate direct comparison. More details on data analysis can be found in Appendix 1, section ‘Details of simulation data analysis’.
We compared the simulated contact probabilities among chromosomes with Hi-C data. As shown in Figure 4A and Figure 4—figure supplement 1, the simulated and experimental contact maps are highly correlated. The squares along the diagonal support the formation of chromosome territories that promote intrachromosomal contacts, and the apparent checkerboard patterns follow the compartmentalization of various chromatin types. We further examined the decay of intrachromosomal contacts as a function of the sequence separation, which is known to deviate from that of an equilibrium globule (Lieberman-Aiden et al., 2009). As shown in Figure 4B, the simulated results overlap well with the Hi-C data (orange curve). In addition, the simulated average contact probabilities between various compartment types match values estimated from Hi-C data. Moreover, the simulated and experimental average contact probabilities between pairs of chromosomes agree well, and the Pearson correlation coefficient between the two datasets reaches 0.89.
We further examined the contacts between chromosomes and nuclear landmarks. As illustrated in Figure 4C, the simulated Lamin B DamID signals for chromosome 7 match well with the experimental results, capturing the complex contact pattern that weaves chromatin toward and away from the nuclear envelope. Similarly, SON TSA-Seq data that quantify the contact between chromosomes and speckles are well captured by simulated structures. The anticorrelation between DamID and TSA-Seq is clearly visible. The observed agreement between simulation and experimental results is not limited to any particular chromosome; good agreement is achieved for all chromosomes.
The simulations also provide 3D representations of the nucleus that can be compared with DNA-MERFISH data (Su et al., 2020). We found that the simulated radius of gyration of individual chromosomes matches well with experimental values (Figure 5A). The simulated and experimental average normalized chromosome radial positions also correlate strongly, as shown in Figure 5B. We note that while the sequencing results presented in Figure 4 were used for model parameterization, the MERFISH data were not. Therefore, the simulation results here are de novo predictions, and their agreement with experimental data strongly supports the coupled assembly mechanism used for designing the energy function.
A significant advantage of MD simulation-based models is the dynamical information they naturally produce. We measured the dynamics of telomeres by tracking the mean-square displacements (MSDs), $\langle \mathbf{r}^{2}(\Delta t)\rangle$, as a function of time. In Figure 5C, we plot representative MSD trajectories over a 1 hr timescale. In line with previous research (Di Pierro et al., 2018; Bronstein et al., 2009; Lee et al., 2021), telomeres display anomalous subdiffusive motion. When fitted with the equation $\langle \mathbf{r}^{2}(\Delta t)\rangle = D_{\alpha}\,\Delta t^{\alpha}$, these trajectories yield a spectrum of α values, with a peak around 0.59. The exponent and the diffusion coefficient $D_{\alpha}=(27\pm 11)\times 10^{-4}\ \mu\mathrm{m}^{2}\cdot \mathrm{s}^{-\alpha}$ both match well with the experimental values (Bronshtein et al., 2015; Jack et al., 2022), upon setting the nucleoplasmic viscosity as $1\ \mathrm{Pa}\cdot \mathrm{s}$ (see Appendix 1, section ‘Mapping the reduced time unit to real time’ for more details).
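The Pearson correlation coefficient used for such comparisons can be computed as in the sketch below. The function name is hypothetical, and the inputs stand for any two equally long lists of values, for example simulated and experimental average contact probabilities for the same chromosome pairs.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

The same function applies to comparisons between profiles from different trajectories, provided the two profiles are binned identically.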
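A power-law fit of this kind is commonly done by linear regression in log-log space. The sketch below is one such approach under that assumption; it is not necessarily the fitting procedure used for Figure 5C, and the function name is hypothetical.

```python
import math

def fit_power_law(lag_times, msd):
    """Fit msd = D * t**alpha by least squares on log(msd) versus log(t).

    Returns (D, alpha). All lag times and MSD values must be positive.
    """
    xs = [math.log(t) for t in lag_times]
    ys = [math.log(m) for m in msd]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx                          # slope gives the exponent
    d_alpha = math.exp(mean_y - alpha * mean_x)  # intercept gives the prefactor
    return d_alpha, alpha
```

Applied to noise-free synthetic data generated from a known exponent and prefactor, the fit recovers both; real MSD curves are noisy, so the choice of lag-time window will affect the result.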
The good agreement in the dynamics of individual loci further inspired us to examine the diffusion of whole chromosomes. In particular, we plotted the normalized chromosome radial positions as a function of time in Figure 6A. Remarkably, we found that chromosomes appear arrested and no significant changes in their positions are observed over timescales comparable to the cell cycle (see also Figure 6—figure supplement 1). Therefore, our simulations predict that largescale movements of chromosomes are unlikely during the G1 phase.
Heterogeneity and robustness of the simulated conformational ensemble
The lack of relaxation of chromosome radial positions suggests the importance of starting configurations used to initialize the simulations. Statistical averages of the resulting ensemble of nuclear structures depend crucially on these starting configurations. Using an optimization procedure, we selected them from 1000 configurations to maximize the agreement with experimental laminB DamID and interchromosomal contact probabilities. Appendix 1, section ‘Initial configurations for simulations’ provides more details on preparing the 1000 initial configurations.
We selected a total of 50 starting configurations to initiate independent simulations. Smaller sets of starting configurations are not sufficient to reproduce the interchromosomal contact probabilities, as shown in Figure 6—figure supplement 2B. Notably, different sets of 50 configurations selected from independent trials show significant overlap (Figure 6—figure supplement 2D), supporting the robustness of the selection protocol in detecting conserved features of genome organization.
While the ensemble as a whole is relatively robust, individual configurations within the ensemble exhibit significant differences. For example, the Lamin B DamID profiles produced from different trajectories are only weakly correlated (Figure 6C), with an average correlation coefficient of 0.53. These weak correlations result from significant differences in the normalized radial positions of chromosomes, as can be seen in representative configurations from two simulation trajectories (Figure 6B). The fluctuations of normalized radial positions cause changes in contacts between chromosomes as well, resulting in little correlation between interchromosomal contact matrices (Figure 6D).
We examined genome organizations reported by Su et al. and found a similar variation of interchromosomal contact probabilities across individual cells (Figure 6—figure supplement 2A and D). Notably, the simulated configurations capture the fluctuations of interchromosomal contacts observed in DNAMERFISH data, further supporting the biological relevance of the reported in silico structures.
Despite the differences in interchromosomal contacts across trajectories, high conservation of connections between chromosomes and speckles can be observed in individual simulations. For example, the average correlation coefficient between in silico SON TSASeq profiles produced from different trajectories is 0.72, much higher than the corresponding value for Lamin B DamID profiles. Conservation of contacts between chromosomes and nuclear bodies (zones) across individual cells has indeed been reported in a previous study that simultaneously images chromatin and various subnuclear structures (Takei et al., 2021).
Nuclear deformation preserves chromosome–nuclear body contacts
Numerous studies have highlighted the remarkable influence of nuclear shape on the positioning of chromosomes and the regulation of gene expression (Brahmachari et al., 2022; Contessoto et al., 2023). The nucleus, once regarded as a mere compartment for DNA storage, is increasingly recognized as a dynamic and intricately structured organelle. To better understand the interplay between nuclear shape and genome organization as a fundamental mechanism that shapes the transcriptional landscape, we performed additional simulations in which the nuclear lamina was altered from a sphere into more ellipsoidal shapes by applying a force along the zaxis (Figure 7A). More details about these simulations can be found in Appendix 1, section ‘Nuclear envelope deformation simulations’.
As illustrated in Figure 7B, the presence of external forces resulted in significant alterations in nuclear shape. We conducted two independent simulations with different force strengths, leading to varying degrees of deformation in the nuclear lamina. This deformation, in turn, caused a reorganization of chromosomes, affecting their normalized radial positions and pairwise contacts (see Figure 7—figure supplement 1 and Figure 7C). We observed that more deformed nuclei exhibited lower correlation coefficients for interchromosomal contacts compared to results obtained from simulations in a spherical nucleus. Similarly, the DamID profiles exhibited significant variations upon nucleus deformation, whereas TSASeq signals were much less affected and remained highly correlated with the results from the spherical nucleus simulations.
Therefore, it appears that speckles, and potentially other nuclear condensates, can dynamically reorganize in response to changes in chromosome conformations to maintain contacts with genomic loci. This robustness in nuclear body contacts may be essential for ensuring the robust functioning of the genome in a population of cells with significant variability in nuclear shape.
Discussion
We introduced a computational model, OpenNucleome, to facilitate simulations of the human nucleus at high structural and temporal resolution. We conducted extensive cross-validation with experimental data to support the biological relevance of simulated 3D structures. Implementing the model in the MD package OpenMM enables GPU acceleration for long-timescale simulations. Tutorials in the format of Python scripts with extensive documentation are provided to facilitate the adoption of the model by the community.
Our software enhances the capabilities of existing genome simulation tools (Fujishiro and Sasai, 2022; Yildirim et al., 2023; Oliveira Junior et al., 2021). Specifically, OpenNucleome aligns with the design principles of OpenMiChroM (Oliveira Junior et al., 2021), prioritizing open-source accessibility while expanding simulation capabilities to the entire nucleus. Similar to software from the Alber lab (Yildirim et al., 2023), OpenNucleome offers high-resolution genome organization that faithfully reproduces a diverse range of experimental data. Furthermore, beyond static structures, OpenNucleome facilitates dynamic simulations with explicit representations of various nuclear condensates, akin to the model developed by Fujishiro and Sasai (2022).
A significant advantage of OpenNucleome lies in its predictive power for dynamical information. For example, the model succeeded in reproducing the subdiffusive behavior of telomeres. We further showed that the dynamics of individual chromosomes are slow and their radial positions do not relax over the time course of a cell cycle. This is consistent with previous theoretical estimations on chromosome dynamics (Rosa and Everaers, 2008) and recent observations of solid behavior of chromatin in vivo (Strickfaden et al., 2020). Live cell experiments that directly track the positions of multiple chromosomes could further validate/falsify this prediction. We anticipate the model will greatly facilitate the investigation of the dynamics of genomic loci and nuclear bodies and the interpreting of live cell imaging results.
Slow chromosome dynamics and a lack of conformational relaxation naturally result in the heterogeneity of chromosome radial positions across individual cells. This heterogeneity raises doubts about the notion that chromosome radial positions provide robust and reliable mechanisms for gene regulation (Hübner et al., 2013; Maeshima et al., 2010; Fraser and Bickmore, 2007; Takizawa et al., 2008). Instead, our results support the nuclear zoning model for gene regulation (Takei et al., 2021), where specific loci function as ‘fixed points’ anchored to certain nuclear bodies in all cells. This anchoring mechanism robustly creates the desired molecular environment surrounding these genomic segments. Unlike chromosome radial positions, contacts between genomic loci and speckles can be robustly established in individual cells, as shown in our simulations. It was achieved through a nucleation process that attracts speckle particles toward specific loci due to specific interactions. Nucleation occurs much more rapidly than chromosome rearrangement due to the smaller size of speckle particles. The coupled selfassembly mechanism for chromosomes and nuclear bodies can similarly facilitate the formation of other nuclear zones for different kinds of fixed points.
Despite the heterogeneity in chromosome positions and interchromosomal contacts, the ensemble of nuclear structures as a whole is not random and exhibits conserved features. For example, on average, certain chromosomes remain closer to the nuclear envelope than others (see Figure 5B). Similarly, the average contact frequency between certain chromosome pairs is higher than others, though this trend can be frequently violated in individual cells. How such conserved features arise as cells exit from the mitotic phase remains unclear and would be interesting for further explorations.
Methods
Molecular dynamics simulation details
We used the software package OpenMM (Eastman et al., 2017) to perform MD simulations in reduced units at constant temperature (T = 1.0). Unless otherwise specified, we froze the lamina particles and only propagated the dynamics of chromatin, nucleoli, and speckles.
Two integration schemes were used, both with a time step of dt = 0.005, to efficiently generate structural ensembles and to produce realistic dynamical information, respectively. For simulations used in parameter optimization and building structural ensembles, we employed the Langevin integrator with a damping coefficient of $\gamma^{-1}=10.0$. For the MSD calculations shown in Figure 6, we utilized Brownian dynamics with a damping coefficient of $\gamma^{-1}=0.01$. The higher damping (larger $\gamma$) in the Brownian dynamics simulations provides a better approximation to the viscous nuclear environment, while the smaller value in the Langevin integrator facilitates conformational sampling with faster diffusion rates.
We employed the semi-grand Monte Carlo technique (Sadigh et al., 2012) to simulate chemical transitions between two types of speckle particles. Every 4000 simulation steps, we attempt a total of $N_{\text{Sp}}$ chemical reactions, each of which converts a speckle particle of one type to the other type with a probability of 0.2. $N_{\text{Sp}}$ corresponds to the total number of speckle particles, and the switching probability was chosen to be comparable to the experimental phosphorylation rate. More details on the speckle dynamics are provided in Appendix 1, section ‘Speckles as phase-separated droplets undergoing chemical modifications’.
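The switching step described above can be illustrated with a minimal sketch. It assumes that each attempt picks one speckle particle uniformly at random and flips its type with a fixed probability, ignoring any energy-dependent acceptance that a full semi-grand scheme may include; the function name and the 0/1 type encoding are hypothetical and not taken from the OpenNucleome code.

```python
import numpy as np

def attempt_speckle_reactions(types, p_switch, rng):
    """Attempt len(types) reactions on an array of speckle types.

    types: integer array, 0 = phosphorylated (P), 1 = dephosphorylated (dP).
    Each attempt selects a random particle (assumption) and flips it
    with probability p_switch.
    """
    n_sp = len(types)
    picks = rng.integers(0, n_sp, size=n_sp)   # one particle per attempt
    accepted = rng.random(n_sp) < p_switch     # which attempts succeed
    for idx in picks[accepted]:                # apply flips sequentially
        types[idx] = 1 - types[idx]
    return types

rng = np.random.default_rng(0)
types = np.zeros(1600, dtype=int)              # start with all P particles
for sweep in range(50):                        # one sweep per 4000 MD steps
    types = attempt_speckle_reactions(types, 0.2, rng)
dp_fraction = types.mean()                     # fraction of dP particles
```

Because the forward and reverse probabilities are equal in this sketch, the dP fraction relaxes toward one half regardless of the starting composition.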
When deforming the nuclear envelope, we unfroze the lamina particles and evolved them dynamically like the rest of the nucleus. Bonded interactions among lamina particles held the nuclear envelope together as a particle mesh. A harmonic force along the z-axis was introduced to compress the particle mesh. More details are provided in Appendix 1, section ‘Nuclear envelope deformation simulations’.
For simulations used to optimize parameters, a total of 50 independent 3-million-step-long trajectories were performed. Configurations were recorded every 2000 simulation steps for analysis. The first 500,000 steps of each trajectory were discarded as equilibration. For production simulations, we performed 50 independent 10-million-step-long trajectories starting from different initial configurations. Nuclear structures were again recorded every 2000 steps to determine the statistical averages presented in the article. An additional eight simulations of 30 million steps in length were performed to compute telomere MSDs.
We mapped the reduced units to real units with a length scale of σ = 385 nm and a timescale in Brownian dynamics simulations of $\tau =0.65\,\text{s}$. These conversions were determined as detailed in Appendix 1, section ‘Unit conversion’.
Experimental data processing and analysis
We obtained the in situ Hi-C data, SON TSA-seq data, and Lamin B DamID data of HFF cell lines from the 4DN data portal. The intra- and interchromosomal interactions were calculated at 100 KB resolution with VC_SQRT normalization applied to the interaction matrices. Hi-C data extraction and normalization were performed using Juicer tools (Durand et al., 2016). We followed the same processing and normalization method described in Zhang et al., 2021 to analyze TSA-seq data. Two biological replicates of Lamin B DamID data were merged, and the normalized counts over the Dam-only control were used for analysis. The SON TSA-seq and Lamin B DamID data were processed at 25 KB resolution, and the average values at 100 KB resolution were used in Figure 4 for model validation.
Appendix 1
Components of the whole nucleus model
As outlined in the main text, the whole nucleus model consists of chromosomes, nucleoli, speckles, and the nuclear lamina. Below, we provide details on the particle-based representations of the various components, totaling 70542 coarse-grained beads. Abbreviations are frequently used for clarity in notation, with N for nucleus, La for lamina, No for nucleoli, and Sp for speckles.
Chromosomes as beads-on-a-string polymers
We explicitly modeled the 46 human chromosomes as beads-on-a-string polymers. Each coarse-grained bead represents a 100 KB genomic segment, totaling 60642 beads for the genome. We assigned each bead as either compartment type A, B, C, or N. The compartment assignments for types A and B were extracted from the Hi-C contact matrix for HFF cells (Krietenstein et al., 2020) using the cooltools software (Venev et al., 2020), and compartment C was identified as centromeric regions based on the DNA sequence. Compartment N denotes genomic regions that cannot be assigned as A, B, or C due to a lack of Hi-C data.
The nuclear lamina as a particle-based mesh
The nuclear envelope provides an enclosure to confine DNA and a repressive environment to organize chromatin with specific interactions (Hetzer, 2010). To account for the role of the nuclear lamina while keeping our model simple, we approximate it with discrete particles uniformly placed on a sphere.
Following our previous work (Kamat et al., 2023), we used the Fibonacci grid to initialize the lamina particles, which form a uniform and almost equidistant network on the surface of the nucleus (Swinbank and James Purser, 2006; Li et al., 2007). The Cartesian coordinates associated with the $i$th lamina particle are defined as
where $N_{\text{La}}=8000$ represents the number of lamina particles, $i\in \{0,1,\dots ,N_{\text{La}}-2,N_{\text{La}}-1\}$, and $\Phi =\pi \times (3-\sqrt{5})$ is the golden angle. We set $R_{\text{N}}=5\,\mu\mathrm{m}$ as the radius of the human foreskin fibroblast (HFF) cell nucleus.
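As an illustration of this construction, the sketch below generates a Fibonacci grid using the commonly used golden-angle spiral. The exact coordinate convention (which axis carries the linear spacing, and the $i/(N_{\text{La}}-1)$ normalization) is an assumption, as is the function name; the radius of 13 is the nucleus radius in reduced units quoted in section ‘Unit conversion’.

```python
import numpy as np

def fibonacci_sphere(n_points, radius):
    """Place n_points on a sphere with a golden-angle spiral (assumed convention)."""
    golden_angle = np.pi * (3.0 - np.sqrt(5.0))
    i = np.arange(n_points)
    z = 1.0 - 2.0 * i / (n_points - 1)                 # evenly spaced heights in [-1, 1]
    rho = np.sqrt(np.clip(1.0 - z**2, 0.0, None))      # radius of each latitude circle
    theta = golden_angle * i                           # azimuth advances by the golden angle
    xyz = np.column_stack((rho * np.cos(theta), rho * np.sin(theta), z))
    return radius * xyz

# 8000 lamina particles on a nucleus of radius 13 (reduced units)
lamina = fibonacci_sphere(8000, 13.0)
```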
Nucleoli as phase-separated droplets
Nucleoli have been shown to behave as liquid droplets that form through phase separation (Lafontaine et al., 2021; Pederson, 2011; Shin and Brangwynne, 2017). We modeled the droplets with coarse-grained beads. While the composition of nucleoli is rather complex, we only used one type of particle for simplicity. In our simulations, we fixed the number of nucleolus particles, $N_{\text{No}}$, based on the experimental concentration of nuclear protein NPM1, $c=1\,\mu\text{M}$ (Qi and Zhang, 2021; Kamat et al., 2023; Zhu et al., 2019). For example,
where $N_{\text{A}}$ is the Avogadro constant and $R_{\text{No}}=0.5\,\mu\mathrm{m}$ is the average nucleolus size (Caragine et al., 2018; Caragine et al., 2019).
Speckles as phase-separated droplets undergoing chemical modifications
Similar to nucleoli, speckles have also been shown to behave as liquid droplets (Chen and Belmont, 2019). However, one crucial unique feature of speckles is the constant chemical modifications of protein molecules comprising them, such as splicing factors (Spector and Lamond, 2011). The phosphorylation of these molecules has been argued to be essential for the dynamics and the number of speckles. Therefore, we implemented a kinetic scheme introduced by de Vries and coworkers to account for the chemical reactions. In this scheme, we consider two types of speckle molecules: phosphorylated (SpP) and dephosphorylated (SpdP). Only SpdP particles share attractive interactions.
The two protein types can interconvert via chemical reactions with a transition probability matrix T defined as
For simplicity, we assume the forward transition rate from SpP to SpdP particles is identical to the reverse rate. Because of the symmetry in transition rates, the average number of dP particles $\langle {N}_{\text{SpdP}}\rangle =0.5{N}_{\text{Sp}}$, where ${N}_{\text{Sp}}$ is the total number of speckle particles.
We chose the transition probability as 0.2 to be consistent with the phosphorylation rate. In particular, we estimate the rate as
where τ is the time interval between consecutive attempts of chemical reactions. As detailed in section ‘Molecular dynamics simulation details’, the reactions were attempted every 4000 simulation steps, with a time step of 0.005. The time unit in our simulations is 0.65 s (see section ‘Mapping the reduced time unit to real time’). The estimated value for $k_{12}$ is of the same order as the experimental phosphorylation rate (Velazquez-Dones et al., 2005).
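The arithmetic behind this estimate can be made explicit. The short sketch below assumes that the rate is obtained as the switching probability divided by the real-time interval between attempts, $k_{12}\approx 0.2/\tau$; this relation and the variable names are assumptions made for illustration.

```python
# Real-time interval between reaction attempts
dt = 0.005                      # MD time step (reduced units)
steps_between_attempts = 4000   # MD steps between reaction attempts
time_unit_s = 0.65              # seconds per reduced time unit
p_switch = 0.2                  # switching probability per attempt

tau_s = steps_between_attempts * dt * time_unit_s   # 13 s between attempts
k12 = p_switch / tau_s                              # about 0.015 per second (assumed relation)
```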
We estimated the total number of speckle particles as follows. Assuming that there is a total of 30 speckles (Galganski et al., 2017), we have $N_{\text{SpdP}}=30\times N_{\text{s}}$, where $N_{\text{s}}$ is the number of SpdP particles in each cluster. This estimation assumes that only SpdP particles share attractive interactions and contribute to cluster formation. From the experimentally estimated relative mass densities of the protein concentrations in the speckle and nucleolus droplet as $\frac{170}{203}$ (Handwerger et al., 2005), we have
We assumed that speckle and nucleolus particles have identical mass and that each nucleolus has 100 particles. The radii of speckles and nucleoli were approximated as 0.3 and $0.5\,\mu \mathrm{m}$, yielding $N_{\text{s}}\approx 20$ and $N_{\text{SpdP}}\approx 600$. Because of the kinetic scheme defined in Equation 3, only part of the SpdP particles will participate in droplet formation during the simulations. Therefore, we increased the particle number and set $\langle N_{\text{SpdP}}\rangle =800$, which yields $N_{\text{Sp}}=1600$.
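The numbers quoted above can be reproduced with a few lines of arithmetic. The sketch assumes that the density ratio is applied as (particles per unit volume in a speckle) / (particles per unit volume in a nucleolus) = 170/203, with droplet volumes scaling as the cube of the radius; this reading of the estimate and all variable names are assumptions.

```python
density_ratio = 170.0 / 203.0   # speckle / nucleolus mass density
n_per_nucleolus = 100           # particles per nucleolus
r_speckle_um = 0.3              # speckle radius (micrometers)
r_nucleolus_um = 0.5            # nucleolus radius (micrometers)

# dP particles per speckle from matching densities (assumed relation)
n_s = density_ratio * n_per_nucleolus * (r_speckle_um / r_nucleolus_um) ** 3

n_s_rounded = 20                # value adopted in the text
n_spdp = 30 * n_s_rounded       # 30 speckles -> 600 dP particles
mean_spdp = 800                 # increased value adopted in the text
n_sp_total = int(mean_spdp / 0.5)   # half of all speckle particles are dP on average
```

Under these assumptions the raw estimate is about 18 particles per speckle, which is consistent with the rounded value of 20 used in the text.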
Energy function of the whole nucleus model
As detailed below, the energy function of the whole nucleus, ${U}_{\text{Nucleus}}$, consists of interactions among chromosomes, among nuclear landmarks, and cross interactions between the two. Therefore,
Hi-C-inspired interactions for the diploid human genome
The energy function of the genome model is defined as
${U}_{\text{homo}}(r)$ determines a generic polymeric topology of chromosomes with excluded volume effect:
where the subscripts $i,i+1$, and i+2 represent the index of ${i}^{th},{(i+1)}^{th}$, and ${(i+2)}^{th}$ beads, respectively, and ${u}_{\text{bond}}({r}_{i,i+1})$ and ${u}_{\text{angle}}({r}_{i,i+1},{r}_{i+1,i+2})$ denote the bonding and angular potential applied for neighboring beads to ensure the connectivity of the chromatin chain and follow:
where, as discussed in Equation 32, $r_{0}=0.5\sigma $ represents the size of the chromatin bead. The soft-core potential provides excluded volume effects for pairs of beads from the same or different chromosomes and follows:
where $u_{\text{sc}}(r_{ij})$ denotes a soft-core potential added to each pair formed by beads $i$ and $j$ to account for the excluded volume effect while allowing a finite probability of crossover of polymer chains.
which corresponds to the Lennard–Jones potential capped off at a finite value within a repulsive core to allow for chain crossing at a finite energy cost. $E_{\text{cut}}=4\epsilon$ and $r_{\text{cut}}$ is chosen as the distance at which $U_{\text{LJ}}(r)=0.5E_{\text{cut}}$.
$U_{\text{ideal}}(r)$ is the intrachromosomal potential applied to genomic loci within the same chromosome, while $U_{\text{compt}}(r)$ is the compartment-specific interaction potential. The ideal potential, which can be rigorously derived following the maximum entropy principle (Roux and Weare, 2013; Zhang and Wolynes, 2015), adopts the following form:
where $I$ indexes over each chromosome and $i$ and $j$ index over pairs of beads on that chromosome. $\alpha_{\text{ideal}}(|i-j|)$ depends only on the sequence separation between two beads $i$ and $j$. $f(r_{ij})$ measures the probability of contact formation for two loci separated by a distance of $r_{ij}$, and its ensemble average corresponds to the contact probability measured in Hi-C experiments. $f(r_{ij})$ adopts the form
The numerical value of $r_{c}$ was determined from the Hi-C contact map, as detailed in the next section. This contact probability function indicates that when $r<r_{c}$, $f\approx 1$, but when $r>r_{c}$, $f\approx (r_{c}/r)^{4}$. The power-law decay with an exponent of 4 is consistent with the relationship between contact probability and spatial distances revealed in imaging studies (Qi and Zhang, 2019; Wang et al., 2017). The tanh function ensures the continuity of the function and its derivative around $r_{c}$ (Appendix 1—figure 1). Additionally, we truncated the ideal potential to be applicable for a sequence separation less than or equal to 100 MB and set the parameters for larger sequence separations to be zero. As shown in Figure 4 of the main text, our parameterized ideal potential produced chromosomes with sizes comparable to imaging results. Incorporating longer-range interactions to improve the model further is straightforward but would also significantly increase the number of parameters.
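To make the behavior of such a contact function concrete, the sketch below implements one functional form consistent with the description: a tanh switch that interpolates between a plateau near one and an $(r_c/r)^4$ tail. The specific expression, the `steepness` parameter and its value are assumptions for illustration and are not taken from the text; $r_c=0.54$ is the value reported in section ‘Connecting imaging and Hi-C data with the contact function’.

```python
import numpy as np

def contact_probability(r, r_c=0.54, steepness=10.0):
    """Smooth contact function: close to 1 below r_c, (r_c / r)**4 well above r_c.

    The tanh-based switch and the steepness value are hypothetical choices.
    """
    r = np.asarray(r, dtype=float)
    switch = np.tanh(steepness * (r_c - r))      # +1 well inside r_c, -1 far outside
    return 0.5 * (1.0 + switch) + 0.5 * (1.0 - switch) * (r_c / r) ** 4

f_at_rc = float(contact_probability(0.54))        # value at the threshold distance
f_far = float(contact_probability(5 * 0.54))      # far-field value, power-law regime
```

With this form the function equals one at $r=r_c$ and follows the power law at large separations; below $r_c$ it stays close to, though not exactly at, one.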
Similar to the ideal potential discussed above, we have
where $T_{i}$ and $T_{j}$ denote the compartment types of beads $i$ and $j$, which can be A, B, or C. Therefore, CG beads of the same compartment types will share the same interaction parameter $\alpha_{\text{compt}}\left(T_{i},T_{j}\right)$, which will be derived from average Hi-C contact frequencies as detailed in the following sections.
To account for specific interactions between chromosomes, we introduced the interchromosomal potential as
$I,J\in \{1,2,\dots ,23\}$ index the haploid chromosomes, and paternal and maternal chromosomes share identical parameters. This potential allows the model to capture interactions beyond those arising purely from compartmentalization as defined in Equation 14.
All parameters in the energy function are summarized in Appendix 1—table 1. The procedure used for parameter optimization is detailed in the following sections.
Nuclear landmark–nuclear landmark interactions
The general energy function for interactions among nuclear landmark particles is defined as
The nuclear lamina was modeled as a particle mesh, and bonded potentials were introduced for nearest neighbor particles defined as
with $r_{o}=0.5\sigma $. $i$ indexes all the lamina particles, and $j$ represents the nearest four neighbors around $i$ determined from the initial configuration, in which the particles were placed on a Fibonacci grid. To avoid counting pairs $(i, j)$ more than once, we always set $j$ larger than $i$.
Short-ranged, non-bonded interactions were introduced among nuclear landmark particles to account for attractions that promote phase separation and the excluded volume effect. These interactions were modeled with a cut and shifted Lennard–Jones (LJ) potential defined as
with $E_{\text{cut}}=4\epsilon\left[\left(\frac{\sigma}{r_{\text{cut}}}\right)^{12}-\left(\frac{\sigma}{r_{\text{cut}}}\right)^{6}\right]$. We note that when $r_{\text{cut}}$ is set to $\sigma_{\text{LJ}}\times 2^{1/6}$, the potential has no attractive regime and only serves to prevent overlap among particles, that is, the excluded volume effect.
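A minimal sketch of a cut and shifted LJ potential is given below, using the standard form in which the LJ energy at the cutoff is subtracted so that the potential vanishes continuously at $r_{\text{cut}}$. The function name is hypothetical, and the parameter values are the ones quoted in this section for attractive and purely repulsive pairs.

```python
import numpy as np

def lj_cut_shifted(r, epsilon, sigma, r_cut):
    """Lennard-Jones potential shifted by its value at r_cut and set to zero beyond it."""
    r = np.asarray(r, dtype=float)

    def lj(x):
        sr6 = (sigma / x) ** 6
        return 4.0 * epsilon * (sr6**2 - sr6)

    e_cut = lj(r_cut)                              # LJ energy at the cutoff distance
    return np.where(r < r_cut, lj(r) - e_cut, 0.0)

r = np.linspace(0.45, 2.0, 500)
# Attractive case: epsilon = 3.0, sigma = 0.5, r_cut = 1.5
u_attractive = lj_cut_shifted(r, 3.0, 0.5, 1.5)
# Purely repulsive case: cutoff at the LJ minimum, sigma * 2**(1/6)
u_repulsive = lj_cut_shifted(r, 1.0, 0.5, 0.5 * 2 ** (1 / 6))
```

Placing the cutoff at the LJ minimum removes the attractive well entirely, which is why the same functional form can serve both for phase-separating and for purely excluded-volume pairs.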
For attractive interactions between nucleolus particles, and between type dP speckle particles, we set the parameters as $\epsilon_{\text{LJ}}=3.0$, $\sigma_{\text{LJ}}=0.5$, and $r_{\text{cut}}=1.5$. Therefore,
where the sums iterate over pairs of nucleolus particles and speckle dP particles.
For the excluded volume effect between nucleolus and speckle particles, between dP and P particles, and between P particles, we set the parameters as $\epsilon_{\text{LJ}}=1.0$, $\sigma_{\text{LJ}}=0.5$, and $r_{\text{cut}}=0.5\times 2^{1/6}$. These potentials are consistent with the estimated size of 0.5 σ for speckle and nucleolus particles.
The excluded volume effect was also introduced between lamina and nucleolus particles and between lamina and speckle particles to confine the nuclear bodies inside the nuclear envelope. We set the parameters as $\epsilon_{\text{LJ}}=1.0$, $\sigma_{\text{LJ}}=0.75$, and $r_{\text{cut}}=0.75\times 2^{1/6}$. The value for $\sigma_{\text{LJ}}$ was chosen based on a linear combination of the lamina particle size (1.0 σ) and the speckle/nucleolus particle size (0.5 σ).
Therefore, the excluded volume potential can be written as
We used abbreviations to denote various nuclear landmarks, with La for the nuclear lamina, SpP for P-type speckle particles, SpdP for dP-type speckle particles, and No for nucleolus particles. All the interaction parameters for the nuclear landmarks are listed in Appendix 1—table 3 for convenient reference.
Chromosome–nuclear landmark interactions
The energy function for interactions between chromosome and nuclear landmark particles is defined as
The functional form of the potential used to describe interactions between chromosomes and nuclear landmarks is inspired by experimental techniques that probe their contacts, such as Lamin B DamID and SON TSA-seq. For example, the average contact probability between a chromatin bead $i$ and the nuclear lamina can be estimated as
where j indexes over the lamina particles. $c({r}_{ij})$ is defined as
It is a switching function that approaches one for ${r}_{ij}<{r}_{c}$, a threshold distance at which we set chromatin and the lamina as in contact. We chose $\eta =4.0$ to obtain a reasonable decay of contact probability between chromosomes and nuclear landmarks. ${r}_{c}=0.75$ was selected as the average size of the lamina (1.0 σ) and chromatin (0.5 σ) particles.
For the computational model to reproduce the experimental contact probability, following the maximum entropy argument (Roux and Weare, 2013; Zhang and Wolynes, 2015), the interaction potential between chromosomes and the nuclear lamina adopts the following form:
A similar argument to the one outlined above was used to derive the interactions among chromosomes from Hi-C data, that is, Equations 12, 14, and 15 (Hetzer, 2010). The individual parameters ${\alpha}_{i}^{\text{CLa}}$ were optimized to ensure a match between simulated and experimental Lamin B DamID data. The second term was included to account for the excluded volume effect and prevent chromatin from moving outside the envelope.
The interaction potential between chromosomes and the speckles adopts a similar form defined as
The second sum over $j$ only includes dP-type speckle particles. The individual parameters ${\alpha}_{i}^{\text{CSp}}$ were optimized to ensure a match between simulated and experimental SON TSA-seq data.
Finally, the interaction potential between chromosomes and nucleoli is defined as
Because of the low data quality of the ChIP-Seq experiments for detecting chromatin-nucleoli contacts, we did not perform systematic optimizations for ${\alpha}_{i}^{\text{CNo}}$. Instead, we simply set them as ${\alpha}_{i}^{\text{CNo}}=P_{i}^{\text{N}}\epsilon$, with $\epsilon=1.0$. $P_{i}^{\text{N}}$ is the probability for the chromatin bead $i$ to contact nucleoli as quantified by the software SPIN (Wang et al., 2021).
We list all the interaction parameters between chromosomes and the nuclear landmarks in Appendix 1—table 4.
Optimization of the whole nucleus model parameters
Below, we describe the procedures used to derive model parameters.
Connecting imaging and Hi-C data with the contact function
The function $f(r)$ defined in Equation 13 was used to determine the chromatin contact probabilities. The availability of spatial positions and Hi-C data makes possible the definition of a contact function, $f(r)$, that converts distances into contact probabilities. In particular, we determined $r_{c}$ as the value at which the simulated average interchromosomal contact probability ${\langle f({r}_{c})\rangle}_{\text{inter}}^{\text{sim}}$ matches the experimental value, that is,
The angular brackets represent ensemble averaging, performed using the structures at 100 KB resolution reported in our previous work (Kamat et al., 2023). Matching simulation and experimental values produced $r_{c}=0.54\sigma \approx 208$ nm. We note that this estimate for $r_{c}$ is comparable to the average bond length (0.5 σ), thus ensuring that nearest-neighbor genomic regions are in contact with a probability close to 1, that is, $\langle f\left(r_{i,i+1}\right)\rangle\approx 1$.
Adam optimizer for chromosome interaction parameters
Mathematical expressions for the various energy terms in $U_{\text{Genome}}$ were designed such that their ensemble averages can be mapped onto combinations of contact frequencies measured in Hi-C. The correspondence between the energy functions and Hi-C measurements allows model parameterization with an efficient adaptive moment estimation (Adam) algorithm (Kingma and Ba, 2014). Specifically, $\alpha_{\text{ideal}}\left(|i-j|\right)$, $\alpha_{\text{compt}}\left(T_{i},T_{j}\right)$, and $\alpha_{\text{inter}}(I,J)$ were tuned to satisfy the following constraints:
where ${\delta}_{{T}_{i},{T}_{1}}$ is the Kronecker delta function with the following definition:
The angular bracket represents the ensemble average, and ${f}_{ij}^{\text{exp}}$ is the corresponding experimental contact frequency.
During the optimization process, our aim was to minimize the disparity between experimental findings and simulated data. To achieve this, we defined the cost function as follows:
where the index i iterates over all the constraints defined in Equation 28.
The details of the algorithm for parameter optimization are as follows:
Starting with a set of values for $\alpha_{\text{ideal}}\left(|i-j|\right)$, $\alpha_{\text{compt}}\left(T_{i},T_{j}\right)$, and $\alpha_{\text{inter}}\left(I,J\right)$, we performed 50 independent 3-million-step-long MD simulations to obtain an ensemble of nuclear configurations. The first 500K steps of each trajectory were discarded as equilibration. We collected the configurations at every 2000 simulation steps from the rest of the simulation trajectories to compute the ensemble averages defined on the left-hand side of Equation 13.
Check the convergence of the optimization by calculating the percentage of error defined as $\sum _{i}\left(\langle f_{i}\rangle - f_{i}^{\mathrm{exp}}\right)/\sum _{i}f_{i}^{\mathrm{exp}}$. The summation over $i$ includes all the average contact probabilities defined in Equation 28.
If the error is less than a tolerance value ${e}_{\text{tol}}$, the optimization has converged, and we stop the simulations. Otherwise, we update the parameters, α, using the Adam optimizer (Kingma and Ba, 2014). With the new parameter values, we return to step one and restart the iteration.
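The iterative procedure above can be sketched in a few lines. In this illustration the expensive MD ensemble average is replaced by a hypothetical surrogate, `toy_contacts`, in which the contact probability decreases monotonically with the interaction parameter; the learning rate, tolerance, target values, and the use of absolute differences in the error measure are likewise assumptions. Only the Adam update itself follows the standard algorithm of Kingma and Ba.

```python
import numpy as np

def toy_contacts(alpha):
    """Hypothetical stand-in for simulated <f_i>: larger alpha lowers the contact probability."""
    return 1.0 / (1.0 + np.exp(alpha))

def adam_step(alpha, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard Adam update with bias-corrected first and second moments."""
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad**2
    m_hat = m / (1.0 - beta1**t)
    v_hat = v / (1.0 - beta2**t)
    return alpha - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

f_exp = np.array([0.2, 0.5, 0.8])        # synthetic "experimental" contact frequencies
alpha = np.zeros_like(f_exp)             # initial parameters
m = np.zeros_like(alpha)
v = np.zeros_like(alpha)
tol = 0.01

for t in range(1, 5001):
    f_sim = toy_contacts(alpha)                          # step 1: "simulate" and average
    err = np.abs(f_sim - f_exp).sum() / f_exp.sum()      # step 2: relative error
    if err < tol:                                        # step 3: stop when converged
        break
    grad = f_exp - f_sim      # contacts too frequent -> negative grad -> alpha increases
    alpha, m, v = adam_step(alpha, grad, m, v, t)
```

In the actual workflow each evaluation of the ensemble averages requires a new set of MD trajectories, so the number of iterations, rather than the cost of the update itself, dominates the expense.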
Adam optimizer for chromosome–nuclear body interaction parameters
Similar to those among chromatin particles, the interaction parameters between chromatin and nuclear landmarks were optimized with the Adam algorithm to reproduce experimental constraints.
The constraints that we aimed to reproduce were defined as follows:
where $C_{i}^{\text{La}}$ and $C_{i}^{\text{Sp}}$ measure the contacts between chromatin bead $i$ and the nuclear lamina and speckles, respectively, as defined in Equations 42 and 45. $\text{LAF}_{i}$ and $\text{SAF}_{i}$ denote the lamina and speckle association frequencies for chromatin bead $i$ as measured in Lamin B DamID and SON TSA-seq experiments. $N$ denotes the number of chromatin beads. We combined the constraints defined in Equation 31 with those in Equation 28 to simultaneously optimize the parameters using the iterative algorithm outlined in the previous section. We note that the interaction potential between chromatin and speckles defined in Equation 25 did not use precisely the same function as in $C_{i}^{\text{Sp}}$. We chose to sum over all speckle dP particles, rather than identifying the droplets, which is difficult to do during the simulations.
Parameter optimization for nuclear body–nuclear body interactions
As much remains to be known about the organization of nuclear bodies, we designed the interaction potentials and parameters based on qualitative observations without extensive fine-tuning. For example, we used the standard Lennard–Jones potential (Equation 18) to mimic short-range interactions. The length scales, $\sigma_{\text{LJ}}$, in these potentials were chosen based on a linear combination of the sizes of the interacting particles, as discussed in section ‘Unit conversion’.
The interaction strength, $\epsilon_{\text{LJ}}$, was set as 1.0 to be on the same order as thermal energy ($k_{\text{B}}T$) when the potential was used to account for the excluded volume effect.
For attractive interactions that promote phase separation and nuclear body formation, we set $\epsilon_{\text{LJ}}=3.0$. Smaller values failed to produce clustered nucleoli, while much larger values significantly decreased the fluidity of the resulting droplets. The same value was used for speckle dP particles and produced droplet numbers comparable to experimental observations (Figure 2—figure supplement 1).
Unit conversion
The reduced unit of length is denoted as σ. We set the nucleus radius as 13σ. Assuming a nucleus with an average radius of 5 μm, we have σ = 385 nm.
Mapping chromatin bead size to real unit
We estimated the size of the chromosome bead as 192.5 nm based on super-resolution imaging data as follows. The median radius of gyration has been shown to follow a power-law scaling as a function of domain length with an exponent of 0.3 (Boettiger et al., 2016). Assuming that the radius of a domain is proportional to the radius of gyration, we have
We previously estimated the size of 1 MB bead as ${R}_{\text{1MB}}=\sigma =385$ nm, and Equation 32 yields the size of 100 KB as ${R}_{\text{100KB}}=0.5\sigma $.
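The two length conversions above amount to simple arithmetic, sketched here under the assumption that the bead size scales as $R\propto L^{0.3}$ with genomic length $L$; the variable names are illustrative.

```python
nucleus_radius_nm = 5000.0                    # 5 micrometer nucleus radius
sigma_nm = nucleus_radius_nm / 13.0           # reduced length unit, about 385 nm

scaling_exponent = 0.3                        # power-law exponent for domain size
size_ratio = (100.0 / 1000.0) ** scaling_exponent   # R_100KB / R_1MB, about 0.50
r_100kb_nm = size_ratio * sigma_nm            # about 193 nm, i.e. roughly 0.5 sigma
```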
Mapping lamina bead size to real unit
We chose the number and the diameter of lamina beads, $N_{\text{La}}$ and $\sigma_{\text{La}}$, by estimating the distance between nearest-neighbor lamina beads. We found that at $N_{\text{La}}=8000$, when the lamina particles were placed on the Fibonacci grid over the spherical surface, the average nearest-neighbor distance was 0.52 σ. Therefore, we set $\sigma_{\text{La}}=0.5\sigma $ when considering the excluded volume effect between lamina particles. However, when modeling the excluded volume effect between lamina and chromatin, nucleolus, or speckle particles, we used $\sigma_{\text{La}}=1.0\sigma$ (see Equation 20). A larger value provides a stronger excluded volume effect that prevents these particles from crossing the nucleus boundary or getting stuck in the gaps of the lamina particle mesh.
Mapping nucleoli bead size to real unit
The size of nucleolus particles ($\sigma_{\text{No}}$) was estimated as follows. Since the average number of nucleoli inside a cell nucleus ranges from 2 to 5, we approximate the number of particles comprising individual droplets as $N_{\text{No}}/3$, assuming a total of three nucleoli. $N_{\text{No}}$ corresponds to the total number of nucleolus particles. With a space-filling model, the ratio of the volume between one nucleolus and the cell nucleus can be estimated as
where $2^{1/6}\sigma_{\text{No}}/2$ denotes the effective radius of a nucleolus particle, and $R_{\text{N}}$ is the nucleus size. Using experimental values for the nucleolus and nucleus size (Caragine et al., 2018; Caragine et al., 2019) as $R_{\text{No}}=0.5\,\mu\mathrm{m}$ and $R_{\text{N}}=5\,\mu\mathrm{m}$, we have $\sigma_{\text{No}}=0.5\sigma$.
Mapping speckle bead size to real unit
A procedure similar to that in the previous section was used to estimate the size of speckle particles, $\sigma_{\text{Sp}}$. Since approximately 600 dP-type speckle particles form speckle clusters, each speckle cluster consists of around 20 particles. This estimation assumes a total of 30 speckle droplets in the system, consistent with the experimentally reported range of 20–50 speckles.
With a space-filling model, the ratio of the volume between one speckle and the cell nucleus can be estimated as
where $N_{\text{Sp}}=20$. Using experimental values for the speckle and nucleus size (Handwerger et al., 2005) as $R_{\text{Sp}}=0.3\,\mu\mathrm{m}$ and $R_{\text{N}}=5\,\mu\mathrm{m}$, we have $\sigma_{\text{Sp}}=0.5\sigma$.
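The particle sizes quoted for nucleoli and speckles can be checked numerically. The sketch assumes the space-filling relation $N\,(2^{1/6}\sigma_{\text{p}}/2)^{3}=R_{\text{droplet}}^{3}$, that is, the particle volumes add up to the droplet volume; this relation and the helper name are assumptions made for illustration.

```python
def particle_diameter_in_sigma(droplet_radius_um, n_particles, sigma_um=0.385):
    """Particle diameter (in units of sigma) for n_particles filling a spherical droplet."""
    effective_radius_um = droplet_radius_um / n_particles ** (1.0 / 3.0)
    diameter_um = 2.0 * effective_radius_um / 2.0 ** (1.0 / 6.0)
    return diameter_um / sigma_um

sigma_no = particle_diameter_in_sigma(0.5, 100)   # nucleolus: 300 particles / 3 droplets
sigma_sp = particle_diameter_in_sigma(0.3, 20)    # speckle: about 20 dP particles each
```

Both estimates come out close to 0.5 in reduced units under these assumptions, consistent with the single bead size used for the two nuclear bodies.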
Mapping the reduced time unit to real time
We determined the timescale mapping by matching the simulated diffusion coefficient of chromatin particles with experimental values. The diffusion coefficient in our simulations can be estimated from the fluctuation-dissipation theorem (Kubo, 1966) as $D=\frac{k_{\text{B}}T}{\zeta}$, where the friction coefficient $\zeta =m\gamma $. Using the conversion $\frac{k_{\text{B}}T}{m}=\frac{\sigma^{2}}{\tau_{\text{B}}^{2}}$, we have
We used the simulation setup $\gamma^{-1}=10^{-2}\tau_{\text{B}}$ when deriving the last equation.
In the meantime, from the Stokes–Einstein (SE) equation, we have $D=\frac{{k}_{B}T}{6\pi \eta r}$, where η is the viscosity and $r=0.25\sigma $ is the radius of chromatin beads. Therefore,
and
Setting the nucleoplasmic viscosity as $1\,\text{Pa}\cdot \text{s}$ produces $\tau_{\text{B}}\approx 0.65\,\text{s}$. This mapping produced diffusion coefficients and MSD curves that match well with experimental measurements presented in Bronshtein et al., 2015, as discussed in the main text. We note that the chosen value for the nucleoplasmic viscosity indeed falls into the range of reported experimental values from $10^{-1}\,\text{Pa}\cdot \text{s}$ to $10^{2}\,\text{Pa}\cdot \text{s}$ (Platani et al., 2002; Tseng et al., 2004).
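The timescale estimate can be reproduced by equating the two expressions for the diffusion coefficient, $10^{-2}\sigma^{2}/\tau_{\text{B}}=k_{\text{B}}T/(6\pi \eta r)$, and solving for $\tau_{\text{B}}$. The sketch below does this in SI units; the temperature of 300 K is an assumption, since the text does not state the value used.

```python
import math

k_B = 1.380649e-23           # Boltzmann constant (J/K)
temperature_K = 300.0        # assumed temperature, not stated in the text
sigma_m = 385e-9             # reduced length unit (m)
bead_radius_m = 0.25 * sigma_m
viscosity_Pa_s = 1.0         # nucleoplasmic viscosity
gamma_inv_reduced = 1e-2     # gamma^-1 in units of tau_B

# D_sim = gamma_inv_reduced * sigma^2 / tau_B must equal k_B T / (6 pi eta r)
stokes_drag = 6.0 * math.pi * viscosity_Pa_s * bead_radius_m
tau_B_s = gamma_inv_reduced * sigma_m**2 * stokes_drag / (k_B * temperature_K)
```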
Molecular dynamics simulation details
Initial configurations for simulations
Due to the slow relaxation dynamics of whole chromosomes relative to the simulation timescale, the reported results are sensitive to the configurations used to initialize the simulations. Therefore, we designed the following protocol to prepare the initial configurations and ensure the biological relevance of simulation results.
We first created a total of 1000 configurations for the genome by sequentially generating the conformation of each of the 46 chromosomes as follows. For a given chromosome, we start by placing the first bead at the center (origin) of the nucleus. The position of each subsequent bead, $i$, was determined from that of the $(i-1)$th bead as $r_{i}=r_{i-1}+0.5v$, where $v$ is a normalized random vector and 0.5 was selected as the bond length between neighboring beads. To produce globular chromosome conformations, we rejected vectors, $v$, that led to bead positions with a distance from the center larger than $4\sigma $. Upon creating the conformation of a chromosome $i$, we shift its center of mass to a value $r_{\text{com}}^{i}$ determined as follows. We first compute a mean radial distance, $r_{\text{o}}^{i}$, with the following equation:
where $D_{i}$ is the average value of the Lamin B DamID profile for chromosome $i$. $D_{\text{hi}}$ and $D_{\text{lo}}$ represent the highest and lowest average DamID values of all chromosomes, and $6\sigma $ and $2\sigma $ represent the upper and lower bounds in radial positions for chromosomes. As shown in Appendix 1—figure 2, the average Lamin B DamID profiles are highly correlated with normalized chromosome radial positions as reported by DNA MERFISH (Su et al., 2020), supporting their use as a proxy for estimating normalized chromosome radial positions. We then select $r_{\text{com}}^{i}$ as a uniformly distributed random variable within the range $\left[r_{\text{o}}^{i}-2\sigma ,r_{\text{o}}^{i}+2\sigma \right]$. Without loss of generality, we randomly chose the directions for shifting all 46 chromosomes.
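The chain-growth step of this protocol, a random walk with rejection of moves that leave a sphere of radius $4\sigma$, can be sketched as follows. The function name and the way random unit vectors are drawn (normalized Gaussian vectors) are assumptions; the subsequent center-of-mass shift is not included.

```python
import numpy as np

def grow_chromosome(n_beads, rng, bond=0.5, r_max=4.0):
    """Grow a chain from the origin, rejecting steps that end outside radius r_max."""
    positions = np.zeros((n_beads, 3))
    for i in range(1, n_beads):
        while True:
            v = rng.normal(size=3)
            v /= np.linalg.norm(v)                 # uniformly distributed unit vector
            trial = positions[i - 1] + bond * v
            if np.linalg.norm(trial) <= r_max:     # keep the chain globular
                positions[i] = trial
                break
    return positions

rng = np.random.default_rng(1)
chain = grow_chromosome(500, rng)
bond_lengths = np.linalg.norm(np.diff(chain, axis=0), axis=1)
```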
We further relaxed the 1000 configurations to build more realistic genome structures. Following an energy minimization process, 1-million-step MD simulations were performed starting from each configuration. Simulations were performed with the following energy function:
where ${U}_{\text{Genome}}$ is defined as in Equation 7. ${U}_{\text{GLa}}$ is the excluded volume potential between chromosomes and lamina, that is, only the second term in Equation 24. Parameters in ${U}_{\text{Genome}}$ were from a preliminary optimization. The end configurations of the MD simulations were collected to build the final configuration ensemble (FCE).
We further computed the Pearson correlation coefficient of pairwise interchromosomal contacts between different structures in FCE (see section ‘Computing pairwise interchromosomal contact probabilities’). As shown in Figure 6—figure supplement 2A, the probability distribution of these correlation coefficients is comparable with that determined from DNA MERFISH structures, supporting the biological relevance of the structural diversity in the constructed ensemble.
From the 1000 relaxed configurations, we selected a subset of structures to initialize the simulations presented in the main text. An optimization procedure was introduced for structure selection. We start this procedure by randomly selecting $N$ structures to build the initial configuration ensemble (ICE). We then iteratively go through every configuration in ICE and replace it with a structure from FCE that is not already included in ICE. We then compute the Pearson correlation coefficient between the new average ICE interchromosomal contact probabilities and experimental values. If the Pearson correlation coefficient is higher than the value determined from the original ICE, the new structure is accepted and the ICE is updated. Otherwise, the new structure is rejected. We stop the selection process when the Pearson correlation coefficient stops improving.
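The selection procedure can be illustrated with a short sketch on synthetic data. Here each structure is represented by a vector of interchromosomal contact values, and the replacement candidate for each ICE slot is drawn at random from the unused FCE structures; the random choice of candidates, the sweep limit, and all names are assumptions, since the text does not specify how candidates are picked.

```python
import numpy as np

def pearson(a, b):
    return float(np.corrcoef(a, b)[0, 1])

def select_ice(fce_contacts, exp_contacts, n_select, rng, max_sweeps=20):
    """Greedy swap selection; fce_contacts has shape (n_structures, n_pairs)."""
    n_total = fce_contacts.shape[0]
    ice = [int(k) for k in rng.choice(n_total, size=n_select, replace=False)]
    initial = pearson(fce_contacts[ice].mean(axis=0), exp_contacts)
    best = initial
    for _ in range(max_sweeps):
        improved = False
        for slot in range(n_select):
            pool = [k for k in range(n_total) if k not in ice]
            trial = ice.copy()
            trial[slot] = int(rng.choice(pool))        # candidate not yet in ICE
            score = pearson(fce_contacts[trial].mean(axis=0), exp_contacts)
            if score > best:                           # accept only improvements
                ice, best, improved = trial, score, True
        if not improved:                               # stop when a sweep brings no gain
            break
    return ice, initial, best

rng = np.random.default_rng(0)
fce = rng.random((40, 30))        # synthetic contact vectors for 40 structures
target = rng.random(30)           # synthetic "experimental" contact probabilities
ice, initial_score, final_score = select_ice(fce, target, 5, rng)
```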
We found that as N increases, the agreement between ICE interchromosomal contact probabilities and experimental values continues to increase (Figure 6—figure supplement 2B). We set N = 50, which produces a Pearson correlation coefficient between ICE and experimental interchromosomal contact probabilities of 0.9. Further increasing N does not significantly improve the agreement but incurs more computational cost.
It is worth noting that the outcomes of the selection procedure depend on the initial set of configurations included in ICE at the beginning. However, we found that the ICEs produced from 20 independent trials are highly correlated (Figure 6—figure supplement 2C) and all reproduce the heterogeneity in interchromosomal contacts seen in DNA MERFISH data (Figure 6—figure supplement 2D). Therefore, the selection procedure is robust and can produce biologically meaningful configurations to initialize simulations.
With the chromosome positions prepared, we randomly placed 300 nucleoli and 1600 speckle particles inside the nucleus to complete the setup of initial configurations.
Langevin dynamics simulations
We used the Langevin integrator with the damping coefficient ${\gamma}^{-1}=10$ to control the temperature at T = 1.0 for simulations used for parameter optimization and for producing an ensemble of nucleus structures. Langevin dynamics simulations allow faster chromosome movements than Brownian dynamics simulations, facilitating conformational sampling. In these simulations, the lamina particles were frozen and no explicit dynamics were considered for the nuclear envelope.
Brownian dynamics simulations
We also performed Brownian dynamics simulations with damping coefficient ${\gamma}^{-1}={10}^{2}$ to control the temperature at T = 1.0. These simulations provide better approximations of the overdamped dynamics of chromatin for direct comparison with live-cell imaging studies. As detailed in section ‘Unit conversion’, upon mapping the coarse-grained timescale to the physical unit, Brownian dynamics simulations produce diffusion coefficients for telomeres comparable to experimental values (see Figure 5).
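To make the overdamped limit concrete, the sketch below implements a generic Euler-Maruyama update for Brownian dynamics in plain NumPy and measures the mean-square displacement of free particles. It is not the integrator used by the software; the names `brownian_step` and `free_diffusion_msd`, the `friction` parameter, and all numerical values are illustrative assumptions in reduced units.

```python
import numpy as np

def brownian_step(x, force, dt, friction, kT, rng):
    """One Euler-Maruyama step of overdamped (Brownian) dynamics.

    Drift is force / friction; the noise amplitude follows from the
    diffusion coefficient D = kT / friction.
    """
    noise = rng.standard_normal(x.shape)
    return x + force(x) * dt / friction + np.sqrt(2.0 * kT * dt / friction) * noise

def free_diffusion_msd(n_particles=2000, n_steps=100, dt=0.01,
                       friction=100.0, kT=1.0, seed=0):
    """MSD of force-free particles after n_steps (illustrative parameters)."""
    rng = np.random.default_rng(seed)
    x0 = np.zeros((n_particles, 3))
    x = x0.copy()
    for _ in range(n_steps):
        x = brownian_step(x, lambda r: np.zeros_like(r), dt, friction, kT, rng)
    return float(np.mean(np.sum((x - x0) ** 2, axis=1)))
```

For force-free particles in three dimensions the expected result is 6 D t with D = kT / friction, which offers a quick sanity check of any timescale mapping built on such trajectories.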
Nuclear envelope deformation simulations
We performed Langevin dynamics simulations to investigate the impact of nuclear envelope deformation on genome organization. To induce a compressing force along the z-axis, we introduced a harmonic potential in the form of
where $z_{i}$ is the z coordinate of the $i$th lamina bead, and ${N}_{\text{La}}$ represents the total number of lamina beads. The particles in the system evolve under the combined effect of ${U}_{\text{compress}}$ and ${U}_{\text{Nucleus}}$ defined in Equation 6.
Details of simulation data analysis
The computer simulations yield 3D coordinates of the diploid genome. However, when comparing directly with experimental data processed for the haploid genome, unless stated otherwise, we computed averages across paternal and maternal chromosomes to ascertain various genome-wide properties as listed below.
Computing simulated contact probabilities
Simulated contact probability maps were computed by averaging over chromosome configurations collected from all trajectories. For a given configuration, the contact probability between two chromatin segments (i and j) was evaluated using the contact function defined in Equation 13.
Computing the Pearson correlation coefficients between experimental and simulated contact maps
We computed the Pearson correlation coefficients (PCCs) between experimental and simulated contact maps in Figure 4A and Figure 4—figure supplement 1 as
where $x_{i}$ and $y_{i}$ represent the experimental and simulated contact probabilities, and $n$ is the total number of data points. Only non-redundant data points, that is, half of the pairwise contacts, are used in the PCC calculation.
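As an illustration of the calculation, the sketch below evaluates the standard Pearson formula on the upper triangle of two symmetric contact maps. The function name `contact_map_pcc` is hypothetical, and excluding the diagonal (`k=1`) is an assumption about what counts as a non-redundant data point.

```python
import numpy as np

def contact_map_pcc(exp_map, sim_map):
    """Pearson correlation over the upper-triangle entries of two contact maps."""
    exp_map = np.asarray(exp_map, dtype=float)
    sim_map = np.asarray(sim_map, dtype=float)
    # Keep each pair (i, j) once; k=1 drops the diagonal (an assumption).
    iu = np.triu_indices(exp_map.shape[0], k=1)
    x, y = exp_map[iu], sim_map[iu]
    dx, dy = x - x.mean(), y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))
```

Using only one triangle avoids counting every contact twice, which would leave the coefficient unchanged but would misstate the number of independent data points.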
Computing pairwise interchromosomal contact probabilities
For a given genome structure, we computed the pairwise interchromosomal contacts as follows. For every pair of chromosomes, we determined their contact probability by averaging all genomic pairs from two chromosomes using Equation 13. We then averaged over all four pairs of diploid chromosomes to compute the haploid average contacts. In total, there are ${C}_{22}^{2}=231$ contact pairs between haploid chromosomes excluding the sex chromosomes.
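The homolog averaging can be sketched as below, starting from a matrix of chromosome-by-chromosome contacts for the diploid genome. The function name `haploid_interchromosomal` and, in particular, the index layout (homologs of autosome `a` stored at rows `2a` and `2a + 1`) are hypothetical choices for the example and need not match the ordering used by the software.

```python
import numpy as np
from itertools import combinations

def haploid_interchromosomal(diploid, n_autosomes=22):
    """Average the four homolog pairs for every pair of autosomes.

    diploid: square matrix of chromosome-chromosome contact probabilities.
    Assumes (hypothetically) that homologs of autosome a sit at rows 2a, 2a+1.
    """
    diploid = np.asarray(diploid, dtype=float)
    values = []
    for a, b in combinations(range(n_autosomes), 2):
        # 2 x 2 block holding the four homolog combinations of (a, b).
        block = diploid[2 * a:2 * a + 2, 2 * b:2 * b + 2]
        values.append(block.mean())
    return np.array(values)
```

With 22 autosomes the returned vector has 231 entries, matching the number of haploid contact pairs quoted above.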
Distances from nuclear bodies and association frequencies
The contacts of a chromatin bead i with the nuclear lamina were evaluated as
with ${r}_{c}=0.75\sigma $. We average over the ensemble of nuclear configurations and homologs to compute the in silico Lamin B DamID signal as
where the angular brackets indicate ensemble averaging. ${\overline{C}}^{\text{La}}$ is defined as the genome-wide average of $\langle {C}_{i}^{\text{La}}\rangle $.
For chromatin-speckle contacts, we first identified the speckles formed in any given structure using the density-based spatial clustering algorithm DBSCAN (Ester et al., 1996) as implemented in the scikit-learn library for Python (Pedregosa et al., 2011). For the identified droplets, we computed their center of mass coordinates, ${\overrightarrow{r}}^{\text{com}}$, and radii of gyration, $R$. With the identified clusters, we then determined the distance from the $i$th chromatin bead to the $s$th speckle as
where $\Vert \cdot \Vert $ represents the L2 norm. We subtract the radius of the speckle cluster in the above equation to determine the distance to the droplet surface. From the list of distances to different speckles, the contact between chromatin bead $i$ and speckles is computed as
where we sum over all the $N_{s}$ speckle clusters. A similar expression was used for determining the contacts between chromatin and nucleoli.
Finally, we average over the ensemble of nuclear configurations and homologs to compute the in silico SON TSA-seq signal as
where the angular brackets indicate ensemble averaging. ${\overline{C}}^{\text{Sp}}$ is defined as the genome-wide average of $\langle {C}_{i}^{\text{Sp}}\rangle $.
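The clustering and distance steps can be sketched with scikit-learn's `DBSCAN` as follows. The function name `speckle_surface_distances` and the `eps` and `min_samples` values are placeholders, not the settings used in the paper, and the sketch stops at the bead-to-surface distances without applying the contact function.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def speckle_surface_distances(chromatin_xyz, speckle_xyz, eps=1.0, min_samples=2):
    """Distance from each chromatin bead to the surface of each speckle cluster.

    eps and min_samples are illustrative DBSCAN settings (assumptions).
    Returns an array of shape (n_beads, n_clusters).
    """
    chromatin_xyz = np.asarray(chromatin_xyz, dtype=float)
    speckle_xyz = np.asarray(speckle_xyz, dtype=float)
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit(speckle_xyz).labels_
    centers, radii = [], []
    for label in sorted(set(labels) - {-1}):       # label -1 marks noise points
        members = speckle_xyz[labels == label]
        com = members.mean(axis=0)                 # center of mass, equal masses
        rg = np.sqrt(np.mean(np.sum((members - com) ** 2, axis=1)))
        centers.append(com)
        radii.append(rg)
    centers = np.array(centers).reshape(-1, 3)
    radii = np.array(radii)
    # Center-to-bead L2 distance minus the cluster radius of gyration.
    diff = chromatin_xyz[:, None, :] - centers[None, :, :]
    return np.linalg.norm(diff, axis=2) - radii[None, :]
```

Negative entries indicate beads lying inside the radius of gyration of a cluster.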
Computing simulated normalized chromosome radial positions
For a given chromosome $i$, we first determined its center of mass position, denoted as $C_{i}$. Starting from the center of the nucleus, $O$, we extend the vector ${v}_{O{C}_{i}}$ to identify its intersection point with the nuclear lamina, $P_{i}$. The normalized radial position of chromosome $i$ is then defined as $\frac{\Vert {v}_{O{C}_{i}}\Vert }{\Vert {v}_{O{P}_{i}}\Vert }$, where $\Vert \cdot \Vert $ represents the L2 norm.
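When the nuclear boundary is approximated by an ellipsoid ${(x-c)}^{T}A(x-c)=1$, the ratio has a closed form, sketched below. The ellipsoidal boundary, equal bead masses, and the function name `normalized_radial_position` are assumptions for this example; a spherical nucleus of radius $R$ corresponds to $A=I/{R}^{2}$.

```python
import numpy as np

def normalized_radial_position(bead_xyz, center, A):
    """Ratio |OC| / |OP| for a boundary (x - c)^T A (x - c) = 1.

    With v = C - O, the boundary point is P = O + t v where t^2 v^T A v = 1,
    so |OC| / |OP| = 1 / t = sqrt(v^T A v).
    """
    com = np.asarray(bead_xyz, dtype=float).mean(axis=0)  # equal-mass beads
    v = com - np.asarray(center, dtype=float)
    return float(np.sqrt(v @ np.asarray(A, dtype=float) @ v))
```

A value of 0 places the chromosome at the nuclear center and a value of 1 places it on the boundary.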
Computing simulated chromosome radii of gyration
The radius of gyration for a chromosome is computed as
where ${r}_{\text{com}}$ and $n$ are the center of mass and the number of beads of the chromosome, $i$ indexes the chromosome beads, and ${r}_{i}$ corresponds to the Cartesian coordinates of bead $i$. $\Vert \cdot \Vert $ represents the L2 norm.
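A minimal sketch of this calculation is given below, assuming the standard definition ${R}_{g}=\sqrt{\frac{1}{n}{\sum }_{i}{\Vert {r}_{i}-{r}_{\text{com}}\Vert }^{2}}$ with equal bead masses; the function name `radius_of_gyration` is chosen for illustration.

```python
import numpy as np

def radius_of_gyration(bead_xyz):
    """Root-mean-square distance of beads from their center of mass."""
    r = np.asarray(bead_xyz, dtype=float)
    r_com = r.mean(axis=0)  # center of mass for equal-mass beads
    return float(np.sqrt(np.mean(np.sum((r - r_com) ** 2, axis=1))))
```

The same function applies to simulated and experimental coordinates as long as both are expressed in the same length unit.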
Computing simulated meansquare displacement
MSDs for telomeres were computed as
where $\Delta t,\delta t$, and ${N}_{\text{step}}$ represent the time interval, the time step, and the total number of steps, respectively. The summation over t corresponds to averaging over eight independent trajectories. MSDs of telomeres from paternal and maternal chromosomes are computed and analyzed separately.
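One common way to evaluate such a quantity is a time-averaged MSD pooled over trajectories, sketched below. The function name `time_averaged_msd`, the array layout, and the choice to average over all time origins are assumptions; the exact normalization used in the paper may differ.

```python
import numpy as np

def time_averaged_msd(trajs, lag):
    """MSD at a lag (in frames, lag >= 1), averaged over time origins and trajectories.

    trajs: array of shape (n_traj, n_frames, 3) with positions of one telomere.
    """
    trajs = np.asarray(trajs, dtype=float)
    # Displacements between all frame pairs separated by `lag` frames.
    disp = trajs[:, lag:, :] - trajs[:, :-lag, :]
    return float(np.mean(np.sum(disp ** 2, axis=2)))
```

The physical time interval is the lag multiplied by the time between saved frames.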
Details of experimental data analysis
Interchromosomal contacts from DNA MERFISH data
We collected the DNA MERFISH data reported in Su et al., 2020 to construct the experimental ensemble of 5455 genome structures. For each structure, we computed the pairwise interchromosomal contacts following the procedure outlined in section ‘Computing pairwise interchromosomal contact probabilities’.
To better visualize and analyze interchromosomal contacts, we applied the Uniform Manifold Approximation and Projection (UMAP) technique as implemented in the software package umap-learn (McInnes et al., 2018; Moshtagh, 2005) with default parameters to reduce the 231 haploid contacts to two dimensions. All 5455 DNA MERFISH structures were included in this analysis.
The same transformations produced from the UMAP analysis of experimental structures were applied to in silico configurations to produce results shown in Figure 6—figure supplement 2C and D.
Computing experimental normalized chromosome radial positions
We followed the same procedure outlined in section ‘Computing simulated normalized chromosome radial positions’ to compute the experimental values. To determine the center of the nucleus from DNA MERFISH data, we used the minimum volume enclosing ellipsoid (MVEE) algorithm (Moshtagh, 2005) to fit an ellipsoid to each genome structure. The optimal ellipsoid, defined as ${\left(x-c\right)}^{T}A\left(x-c\right)\equiv 1$, is obtained by optimizing $min\left(-\mathrm{log}\left(det\left[\mathbf{A}\right]\right)\right)$ subject to the constraint that ${({x}_{i}-c)}^{T}A({x}_{i}-c)\le 1$, where ${x}_{i}$ corresponds to the list of chromatin positions determined experimentally.
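A widely used way to solve this problem is a Khachiyan-type iterative reweighting of the points, sketched below in NumPy. This is a generic illustration and not necessarily the exact routine used for the paper; the function name `mvee`, the tolerance, and the iteration cap are assumptions.

```python
import numpy as np

def mvee(points, tol=1e-3, max_iter=10000):
    """Approximate minimum volume enclosing ellipsoid of a point cloud.

    Returns (A, c) such that (x - c)^T A (x - c) <= 1 holds, up to the
    tolerance of the iteration, for every input point x.
    """
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    Q = np.vstack([points.T, np.ones(n)])     # lifted points, shape (d + 1, n)
    u = np.full(n, 1.0 / n)                   # weights on the points
    for _ in range(max_iter):
        X = (Q * u) @ Q.T                     # weighted scatter matrix
        M = np.sum(Q * np.linalg.solve(X, Q), axis=0)   # q_j^T X^{-1} q_j
        j = int(np.argmax(M))                 # point that is furthest outside
        step = (M[j] - d - 1.0) / ((d + 1.0) * (M[j] - 1.0))
        new_u = (1.0 - step) * u
        new_u[j] += step                      # shift weight toward that point
        converged = np.linalg.norm(new_u - u) < tol
        u = new_u
        if converged:
            break
    c = u @ points                            # ellipsoid center
    cov = (points.T * u) @ points - np.outer(c, c)
    A = np.linalg.inv(cov) / d
    return A, c
```

The returned center `c` plays the role of the nuclear center, and `A` can be used to locate where a ray from the center crosses the fitted boundary.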
Computing experimental radii of gyration
We computed the experimental radii of gyration using the same expression as that for analyzing simulated structures (Equation 47).
Data availability
Hi-C data (https://data.4dnucleome.org, accession number: 4DNFIB59T7NN). SON TSA-seq data (https://data.4dnucleome.org, accession numbers: pulldown data 4DNEX6U8TS3Y, control data 4DNEXI7XUWFK). Lamin B DamID data (https://data.4dnucleome.org, accession number: 4DNESXZ4FW4T). The software is available at https://github.com/ZhangGroupMITChemistry/OpenNucleome (copy archived at ZhangGroupMITChemistry, 2024).

4DN Data Portal, ID 4DNESXZ4FW4T. LaminB1 DamID of HFFc6 Tier 1 cells – cells were transduced with virus expressing Dam-LaminB1, gDNA was harvested after 4 days and processed for DamID-seq.

4DN Data Portal, ID 4DNEXI7XUWFK. Set of Input for SON Ab2 TSA-seq version 2 Reaction Condition 2 (PBS 50% Sucrose) Enhancement Condition E (1:300 tyramide-biotin, 30 minute reaction) on HFFc6 cells.

4DN Data Portal, ID 4DNEX6U8TS3Y. TSA-seq against SON protein on HFFc6 (Tier 1).

4DN Data Portal, ID 4DNFIB59T7NN. Ultrastructural Details of Mammalian Chromosome Architecture.

4DN Data Portal, ID 4DNEXFUGLVQA. DamID-seq with DAM-LMNB1 on HFFc6 (Tier 1).
References

The spatial organization of the human genome. Annual Review of Genomics and Human Genetics 14:67–84. https://doi.org/10.1146/annurevgenom091212153515

Ephemeral protein binding to DNA shapes stable nuclear bodies and chromatin domains. Biophysical Journal 112:1085–1093. https://doi.org/10.1016/j.bpj.2017.01.025

Shaping the genome via lengthwise compaction, phase separation, and lamina adhesion. Nucleic Acids Research 50:4258–4271. https://doi.org/10.1093/nar/gkac231

Loss of lamin A function increases chromatin dynamics in the nuclear interior. Nature Communications 6:8044. https://doi.org/10.1038/ncomms9044

Transient anomalous diffusion of telomeres in the nucleus of mammalian cells. Physical Review Letters 103:018102. https://doi.org/10.1103/PhysRevLett.103.018102

Surface fluctuations and coalescence of nucleolar droplets in the human cell nucleus. Physical Review Letters 121:148101. https://doi.org/10.1103/PhysRevLett.121.148101

Modelling the compartmentalization of splicing factors. Journal of Theoretical Biology 239:298–312. https://doi.org/10.1016/j.jtbi.2005.07.019

Imaging specific genomic DNA in living cells. Annual Review of Biophysics 45:1–23. https://doi.org/10.1146/annurevbiophys062215010830

Mapping 3D genome organization relative to nuclear compartments using TSA-Seq as a cytological ruler. The Journal of Cell Biology 217:4025–4048. https://doi.org/10.1083/jcb.201807108

Genome organization around nuclear speckles. Current Opinion in Genetics & Development 55:91–99. https://doi.org/10.1016/j.gde.2019.06.008

Deciphering the molecular mechanism of the cancer formation by chromosome structural dynamics. PLOS Computational Biology 17:e1009596. https://doi.org/10.1371/journal.pcbi.1009596

Exploring the three-dimensional organization of genomes: interpreting chromatin interaction data. Nature Reviews. Genetics 14:390–403. https://doi.org/10.1038/nrg3454

Chromatin domains: The unit of chromosome organization. Molecular Cell 62:668–680. https://doi.org/10.1016/j.molcel.2016.05.018

OpenMM 7: Rapid development of high performance algorithms for molecular dynamics. PLOS Computational Biology 13:e1005659. https://doi.org/10.1371/journal.pcbi.1005659

Conference: A density-based algorithm for discovering clusters in large spatial databases with noise. KDD-96 Proceedings. pp. 226–231.

Formation of chromosomal domains by loop extrusion. Cell Reports 15:2038–2049. https://doi.org/10.1016/j.celrep.2016.04.085

Nuclear speckles: molecular organization, biological function and role in disease. Nucleic Acids Research 45:10350–10368. https://doi.org/10.1093/nar/gkx759

Chromosome positioning from activity-based segregation. Nucleic Acids Research 42:4145–4159. https://doi.org/10.1093/nar/gkt1417

Cajal bodies, nucleoli, and speckles in the Xenopus oocyte nucleus have a low-density, sponge-like structure. Molecular Biology of the Cell 16:202–211. https://doi.org/10.1091/mbc.e04080742

The nuclear envelope. Cold Spring Harbor Perspectives in Biology 2:a000539. https://doi.org/10.1101/cshperspect.a000539

Chromatin organization and transcriptional regulation. Current Opinion in Genetics & Development 23:89–95. https://doi.org/10.1016/j.gde.2012.11.006

Understanding 3D genome organization by multidisciplinary methods. Nature Reviews. Molecular Cell Biology 22:511–528. https://doi.org/10.1038/s4158002100362w

Phase separation and correlated motions in motorized genome. The Journal of Physical Chemistry. B 126:5619–5628. https://doi.org/10.1021/acs.jpcb.2c03238

Modeling epigenome folding: formation and dynamics of topologically associated chromatin domains. Nucleic Acids Research 42:9553–9561. https://doi.org/10.1093/nar/gku698

Compartmentalization with nuclear landmarks yields random, yet precise, genome organization. Biophysical Journal 122:1376–1389. https://doi.org/10.1016/j.bpj.2023.03.003

The fluctuation-dissipation theorem. Reports on Progress in Physics 29:255–284. https://doi.org/10.1088/00344885/29/1/306

The nucleolus as a multiphase liquid condensate. Nature Reviews. Molecular Cell Biology 22:165–182. https://doi.org/10.1038/s4158002002726

Mesoscale liquid model of chromatin recapitulates nuclear order of eukaryotes. Biophysical Journal 118:2130–2140. https://doi.org/10.1016/j.bpj.2019.09.013

The interplay of chromatin phase separation and lamina interactions in nuclear organization. Biophysical Journal 120:5005–5017. https://doi.org/10.1016/j.bpj.2021.10.012

Nuclear speckles: a model for nuclear organelles. Nature Reviews. Molecular Cell Biology 4:605–612. https://doi.org/10.1038/nrm1172

Stressed Fibonacci spiral patterns of definite chirality. Applied Physics Letters 90:164102. https://doi.org/10.1063/1.2728578

Multiscale modeling of genome organization with maximum entropy optimization. The Journal of Chemical Physics 155:010901. https://doi.org/10.1063/5.0044150

From 1D sequence to 3D chromatin dynamics and cellular functions: a phase separation perspective. Nucleic Acids Research 46:9367–9383. https://doi.org/10.1093/nar/gky633

Chromatin structure: does the 30-nm fibre exist in vivo? Current Opinion in Cell Biology 22:291–297. https://doi.org/10.1016/j.ceb.2010.03.001

UMAP: uniform manifold approximation and projection. Journal of Open Source Software 3:861. https://doi.org/10.21105/joss.00861

A scalable computational approach for simulating complexes of multiple chromosomes. Journal of Molecular Biology 433:166700. https://doi.org/10.1016/j.jmb.2020.10.034

ChIP-seq: advantages and challenges of a maturing technology. Nature Reviews. Genetics 10:669–680. https://doi.org/10.1038/nrg2641

How the genome folds: the biophysics of four-dimensional chromatin organization. Annual Review of Biophysics 48:231–253. https://doi.org/10.1146/annurevbiophys052118115638

The nucleolus. Cold Spring Harbor Perspectives in Biology 3:a000638. https://doi.org/10.1101/cshperspect.a000638

Scikit-learn: machine learning in python. The Journal of Machine Learning Research 12:2825–2830.

Cajal Body dynamics and association with chromatin are ATP-dependent. Nature Cell Biology 4:502–508. https://doi.org/10.1038/ncb809

Predicting three-dimensional genome organization with chromatin states. PLOS Computational Biology 15:e1007024. https://doi.org/10.1371/journal.pcbi.1007024

Data-driven polymer model for mechanistic exploration of diploid genome organization. Biophysical Journal 119:1905–1916. https://doi.org/10.1016/j.bpj.2020.09.009

Chromatin network retards nucleoli coalescence. Nature Communications 12:6824. https://doi.org/10.1038/s41467021271239

Structure and dynamics of interphase chromosomes. PLOS Computational Biology 4:e1000153. https://doi.org/10.1371/journal.pcbi.1000153

On the statistical equivalence of restrained-ensemble simulations with the maximum entropy method. The Journal of Chemical Physics 138:084107. https://doi.org/10.1063/1.4792208

Genome-wide mapping and analysis of chromosome architecture. Nature Reviews. Molecular Cell Biology 17:743–755. https://doi.org/10.1038/nrm.2016.104

Cytokines and their relationship to the symptoms and outcome of cancer. Nature Reviews. Cancer 8:887–899. https://doi.org/10.1038/nrc2507

Mechanisms for active regulation of biomolecular condensates. Trends in Cell Biology 30:4–14. https://doi.org/10.1016/j.tcb.2019.10.006

Nuclear speckles. Cold Spring Harbor Perspectives in Biology 3:a000646. https://doi.org/10.1101/cshperspect.a000646

Fibonacci grids: a novel approach to global modelling. Quarterly Journal of the Royal Meteorological Society 132:1769–1793. https://doi.org/10.1256/qj.05.227

Micro-organization and viscoelasticity of the interphase nucleus revealed by particle nanotracking. Journal of Cell Science 117:2159–2167. https://doi.org/10.1242/jcs.01073

Mass spectrometric and kinetic analysis of ASF/SF2 phosphorylation by SRPK1 and Clk/Sty. The Journal of Biological Chemistry 280:41761–41768. https://doi.org/10.1074/jbc.M504156200

Learning the formation mechanism of domain-level chromatin states with epigenomics data. Biophysical Journal 116:2047–2056. https://doi.org/10.1016/j.bpj.2019.04.006

Evaluating the role of the nuclear microenvironment in gene function by population-based modeling. Nature Structural & Molecular Biology 30:1193–1206. https://doi.org/10.1038/s41594023010361

Genomic energy landscapes. Biophysical Journal 112:427–433. https://doi.org/10.1016/j.bpj.2016.08.046

Software: OpenNucleome, version swh:1:rev:380e3b5a65446081d6d4007362e121da18d8b1e9. Software Heritage.
Article and author information
Funding
National Institute of General Medical Sciences (R35GM133580)
 Bin Zhang
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
This work was supported by the National Institutes of Health (grant R35GM133580).
Cite all versions
You can cite all versions using the DOI https://doi.org/10.7554/eLife.93223. This DOI represents all versions, and will always resolve to the latest one.
Copyright
© 2024, Lao et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.