Fate of the TADs in chromosomes upon cohesin deletion.

(a) The number of TADs in all the chromosomes, identified by the TopDom method [44], in the wild type (WT) cells and the number of preserved TADs (P-TADs) after deleting the cohesin loading factor (Nipbl) in mouse liver. (b) Same as (a) except the experimental data are analyzed for HCT-116 cells before (WT) and after RAD21 deletion. (c) The total number of TADs and the number of P-TADs for each chromosome calculated using the mouse liver Hi-C data. The number above each bar is the percentage of P-TADs in each chromosome. (d) Same as (c) except the results are for chromosomes from the HCT-116 cell line. The percentage of P-TADs is greater in the HCT-116 cell line than in mouse liver for almost all the chromosomes, a feature that is more prominent in the distribution of P-TAD proportions (Right).

Identification of P-TADs from the contact map using the TopDom method.

(a) Schematic representation used to determine the P-TADs. Yellow (Blue) triangles represent the TADs identified using the TopDom method in WT (cohesin-depleted) contact maps at 50kb resolution. Each small square within a triangle represents a single locus (50kb size). A TAD detected in the WT contact map whose boundaries lie within ± one bin (50kb) of the boundaries of a TAD in cohesin-depleted cells is deemed to be a P-TAD. (b) P-TAD upon cohesin loss in HCT-116 cells. The bar plots above the contact maps show the epigenetic states. Red (Blue) color represents the active (inactive) state. The TAD between grey dashed lines is preserved upon cohesin loss. The parameter (with red square) displayed at the bottom left of each map indicates the color scale used when plotting contact maps in Juicebox [54].
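The matching rule in panel (a) can be illustrated with a minimal sketch. It assumes that TADs are given as (start bin, end bin) pairs at 50kb resolution and that both boundaries of a WT TAD must match the same TAD in the cohesin-depleted map; the caption does not spell out this detail, and the function name `preserved_tads` and the toy coordinates are hypothetical.

```python
def preserved_tads(wt_tads, depleted_tads, tol_bins=1):
    """Return WT TADs whose start and end bins both lie within
    tol_bins of the start and end bins of one depleted-map TAD."""
    preserved = []
    for start, end in wt_tads:
        for d_start, d_end in depleted_tads:
            if abs(start - d_start) <= tol_bins and abs(end - d_end) <= tol_bins:
                preserved.append((start, end))
                break  # one match is enough to call this TAD preserved
    return preserved


# Toy TAD calls in units of 50kb bins (illustrative values, not real data).
wt = [(0, 10), (10, 25), (25, 40)]
depleted = [(1, 10), (10, 30), (24, 41)]

p_tads = preserved_tads(wt, depleted)
fraction_preserved = len(p_tads) / len(wt)
```

Here the middle WT TAD is not counted because its end boundary shifts by five bins in the depleted map, which exceeds the one-bin tolerance.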

CCM simulations reveal characteristics of P-TADs.

(a) The number of TADs in the simulated Chr10 and Chr13 chromosomes for PL=1. The number of P-TADs after CTCF loop depletion (PL=0) is also shown. (b) The number of P-TADs with epigenetic switches (blue) and those identified by the peaks in the boundary probability (green). (c)-(e) Comparison between contact maps for the region of Chr13 with upper (lower) triangle with PL = 1 (PL = 0). The black circles at the corners of the TADs are the CTCF loop anchors. The bar above the contact map shows the epigenetic states with red (blue) representing A (B) loci. Arrows above the bar show the epigenetic switches. (c) After loop deletion, TAD structures disappear. (d) A TAD whose boundaries are marked by epigenetic switches is preserved. (e) A TAD lacking at least one epigenetic switch is disrupted after loop loss. (f)-(h) Comparison of the contact maps and the mean spatial-distance matrices for the 2.5Mb genomic regions (25.7-28.2Mbp, 73.3-75.8Mbp and 102-104.5Mbp, respectively) with (upper) and without (lower) loop anchors. Bottom graph shows the boundary probability, with high values indicating population-averaged TAD boundaries. Purple circles in the boundary probability graph represent the preferred boundaries. A subset of P-TAD boundaries match epigenetic switches (blue lines). P-TADs with high boundary probability are marked by green lines. The magenta line marks P-TADs that are not accounted for by an epigenetic switch or a physical boundary in 3D space but are found using the TopDom method.

Classification of P-TADs from Hi-C maps from two cell lines and link between boundary probability peak and epigenetic switch.

(a) The number of P-TADs in all the chromosomes (orange bar taken from Fig. 1a) that are accounted for by epigenetic switches (blue bar) as well as peaks in the boundary probability (green bar) after Nipbl loss in mouse liver. (b) Same as (a) except the analysis is done using experimental data for HCT-116 cells after RAD21 loss. (c) Example of a P-TAD in the WT 97.7Mb-100.2Mb region of Chr3 from the HCT-116 cell line. The mean distance matrix calculated using the 3D structures is in the middle panel. The dark-red circles at the boundaries of the TADs in the contact maps are loop anchors detected using HiCCUPS [55]. The peaks in the boundary probability (bottom panel) are shown by purple circles. The epigenetic switch coincides with a peak in the boundary probability (compare top and bottom panels). Bottom plot shows the probability for each genomic position to be a single-cell domain boundary. (d) Same as (c) except the results correspond to the absence of RAD21. Although not as sharp, there is a discernible peak in the boundary probability when there is an epigenetic switch after removal of RAD21.

Fate of TADs after ΔNipbl in mouse liver cells.

(a)-(b) Comparison between Hi-C (lower) and calculated contact maps (upper) using the 3D structures obtained from the HIPPS method for the 3Mb genomic regions (Chr6:22.6Mb-26.1Mb in WT cells and Chr7:139Mb-142.5Mb in Nipbl-depleted cells), respectively. The distance threshold for contact is adjusted to achieve the best agreement between HIPPS and experiments. Calculated contact maps are in very good agreement with Hi-C data for both WT and Nipbl-depleted cells. (c) Complete Loss (Chr6:23.55Mb-26.05Mb) of TADs in ΔNipbl. (d)-(e) P-TADs (Chr7:139.5Mb-142Mb and Chr15:89.5Mb-92Mb). The plots below the scale on top, identifying the epigenetic states [47], compare 50kb-resolution Hi-C contact maps for the genomic regions of interest with Nipbl (upper) and without Nipbl (lower). Mean spatial-distance matrices, obtained from the Hi-C contact matrices using the HIPPS method [45], are below the contact maps. The dark-red circles at the boundaries of the TADs in the contact maps are loop anchors detected using HiCCUPS [4]. ChIP-seq tracks for CTCF, RAD21 and SMC3 in the WT cells [34] illustrate the correspondence between the locations of most of the detected loop anchors and the ChIP-seq signals. Bottom plots give the probabilities that each genomic position is at a single-cell domain boundary in the specified regions. Purple circles in the boundary probability graph represent the physical boundaries. A subset of physical boundaries in P-TADs coincide with epigenetic switches (blue lines), indicating that the probabilities of contact at these boundaries are small. P-TADs in (e), demarcated by green lines, have high peaks in the boundary probability in the absence of an epigenetic switch.

Certain TADs enriched in E-P/P-P interactions at the boundary are preserved upon cohesin deletion.

(a) Comparison between 5kb Micro-C contact maps in the region (Chr8:72.24Mb-72.57Mb) for the WT (left panel) and cohesin-depleted (right panel) mESC cells [48]. Locations of cohesin loops (green squares) and E-P/P-P interactions (blue circles) plotted in the WT contact maps are from experiments [48]. Bars above the contact map show epigenetic states (Red: Active, Blue: Inactive) annotated based on ChromHMM results [56]. The cohesin-dependent (green dashed lines) and cohesin-independent (blue dashed lines) TADs are detected in the WT cells using the TopDom method with the default parameter (w=5). P-TADs (blue dashed lines) are also found in cohesin-deleted cells. (b)-(c) Comparison between 20kb Micro-C contact maps and mean distance maps spanning the regions Chr19:8.66-9.2Mb and Chr12:56.4-56.9Mb, respectively, in the presence (upper) and absence (lower) of cohesin. Bottom graph, below the distance maps, shows the boundary probability calculated from 10,000 3D structures. P-TADs between grey dashed lines are detected using the TopDom method (w=5). A P-TAD with a high boundary peak, without epigenetic switches, is enriched in E-P/P-P interactions at the boundaries.

Statistics of the TADs in chromosomes upon cohesin loss using Micro-C contact data. The number of TADs in all the chromosomes in the wild type (WT, dark blue bar), the total number of preserved TADs (P-TADs, light blue bar) after deleting RAD21, and the number of P-TADs whose boundaries coincide with enhancer-promoter/promoter-promoter (E-P/P-P) interactions (magenta bar) in mESC. About a third of the P-TADs are associated with E-P/P-P interactions.

Parameters for bonding potentials

Schematic representation used to identify the P-TADs with epigenetic switches:

Dark grey triangles represent the P-TADs in the contact map. Each small square within a triangle represents a single locus (50kb). Red (Blue) color indicates the active (inactive) state in the bar below the contact map. A transition between the A and B epigenetic states is referred to as an epigenetic switch (Green arrows). We examined whether each P-TAD has an epigenetic switch within ± 100kb of its boundaries (II). P-TADs that have only a single-locus (50kb) switch near their boundaries (I), or in which < 70% of the sequence is in an identical epigenetic state (III), are excluded. The TAD marked by the yellow star is a P-TAD with an epigenetic switch at the TAD boundary.
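The three criteria in this schematic can be sketched as follows. This is only an illustration under stated assumptions: epigenetic states are given as one "A" or "B" label per 50kb locus, a boundary index b denotes the edge between loci b-1 and b, the ± 100kb window corresponds to ± 2 bins, and a switch counts only if the state persists for at least two loci on each side (one reading of criterion I). The function names are hypothetical, and requiring a switch at both boundaries in the example is also an assumption.

```python
def has_switch_near(states, boundary, window=2):
    """True if an A/B transition that persists for at least two loci on
    each side lies within `window` bins of `boundary` (criteria I and II)."""
    n = len(states)
    first = max(2, boundary - window)
    last = min(n - 2, boundary + window)
    for p in range(first, last + 1):
        is_transition = states[p - 1] != states[p]
        left_stable = states[p - 2] == states[p - 1]
        right_stable = states[p] == states[p + 1]
        if is_transition and left_stable and right_stable:
            return True
    return False


def majority_fraction(states, start, end):
    """Fraction of loci in [start, end) sharing the dominant state (criterion III)."""
    segment = states[start:end]
    return max(segment.count("A"), segment.count("B")) / len(segment)


# Toy track: a 10-locus A domain flanked by B loci (illustrative, not real data).
states = "BBBBBB" + "AAAAAAAAAA" + "BBBBBB"
tad = (6, 16)

switch_at_both = has_switch_near(states, tad[0]) and has_switch_near(states, tad[1])
homogeneous = majority_fraction(states, *tad) >= 0.7

# A single-locus flip (AAAAABAAAAAA) is not counted as a switch.
single_flip = has_switch_near("AAAAABAAAAAA", 5)
```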

CCM simulations for chromosome 13 (Chr13) from the GM12878 cell line:

(a) In the CCM, red (blue) spheres represent active (repressive) loci. The black open circles are the CTCF loop anchor locations. (b) Comparison of the simulated (PL = 1, top half) and Hi-C (bottom half) contact maps. The bar above marks the epigenetic states with red (blue) representing active (repressive) loci. The values of the contact frequencies, converted to a log scale, are shown on the right. (c) Comparison between the Pearson correlation maps consisting of ρij for all loci pairs from simulations (top half) and experimental data (bottom half). The scale for the Pearson Correlation Coefficient (PCC) is on the right. (d) Distribution of the PCC, ρij, for all (i, j) pairs from simulations and experiment (1 is positive correlation, 0 is no correlation, and -1 corresponds to anti-correlation). The Kullback-Leibler divergence, DKL, between the CCM prediction and experiment is small. (e) First eigenvector values (PC1) from Principal Component Analysis (PCA) using the correlation matrix for CCM. The compartments A and B are defined by positive (red) and negative (blue) values. (f) Snapshot of the folded Chr13. The color corresponds to genomic distance from one end point, ranging from red to green to blue. (g) Ensemble averaged distance map obtained from simulations. (h) Comparison between the Ward Linkage Matrix (WLM) from simulations and the one computed using Hi-C data. The PCC between the two distance matrices is ∼ 0.83, indicating reasonable agreement between simulations and experiments. (i) Contact map for the 8 Mbp region ((44-52)Mb) with the upper (lower) triangle corresponding to simulations (experiments). (j) On the right is an illustration of the TADs, identified using the Multi-CD method [72]. The dark-red circles are the positions of the loop anchors detected in the Hi-C experiment, which are formed by two CTCF motifs. A subset of TADs is defined by the CTCF loops, whereas others are not associated with loops. These could arise from segregation between the chromatin states of the neighboring domains, as found in certain experimental studies [26, 73, 74]. The average sizes of the TADs detected using the Multi-CD method from Hi-C and simulated contact maps are ∼750kb and ∼700kb, respectively. (k) Snapshot of the TAD marked in (j). (m) Same as (j) except the TADs were calculated for the region ((75-83)Mb) in (l).
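The comparison in panel (d) can be illustrated with a minimal sketch of a Kullback-Leibler divergence between two histograms of PCC values. The number of bins, the small pseudocount used to avoid log(0), and the direction of the divergence (first argument relative to second) are all assumptions; the caption does not specify them, and the function names and sample values are hypothetical.

```python
import math


def histogram(values, n_bins=20, lo=-1.0, hi=1.0):
    """Count values (assumed to lie in [lo, hi]) into equal-width bins."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # put v == hi in the last bin
        counts[idx] += 1
    return counts


def kl_divergence(p_counts, q_counts, pseudo=1e-9):
    """D_KL(P || Q) in nats for two histograms on the same bins."""
    p_total = sum(p_counts) + pseudo * len(p_counts)
    q_total = sum(q_counts) + pseudo * len(q_counts)
    d = 0.0
    for pc, qc in zip(p_counts, q_counts):
        p = (pc + pseudo) / p_total
        q = (qc + pseudo) / q_total
        d += p * math.log(p / q)
    return d


# Toy PCC samples (illustrative values, not data from the paper).
pcc_sim = [-0.4, -0.1, 0.0, 0.2, 0.3, 0.5, 0.7]
pcc_exp = [-0.5, -0.2, 0.1, 0.2, 0.4, 0.5, 0.8]

d_same = kl_divergence(histogram(pcc_sim), histogram(pcc_sim))
d_diff = kl_divergence(histogram(pcc_sim), histogram(pcc_exp))
```

Identical distributions give a divergence of zero, and the value grows as the two histograms move apart.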

Organizational features of Chr10 from human cell line GM12878:

(a) Comparison between the simulated contact map (PL = 1.0, top half) and Hi-C experiments (bottom half). The bar above the contact map shows the epigenetic states with red (blue) representing active (repressive) loci. (b) Experimental (lower triangle) and simulated (upper triangle) Pearson correlation maps. (c) The distribution of the PCC, ρij, for each pair (i, j) from simulations and experiment. The value of the KL divergence at the bottom is obtained by comparing the distributions obtained in the simulations and experiments. (d) A conformation of the folded Chr10 (N=2,712) obtained using the CCM simulations. The colors correspond to genomic distance from the 5′ to the 3′ end. (e) Ensemble averaged distance map calculated using the simulated structures. (f) Experimental (lower triangle) and simulated (upper triangle) WLMs. The PCC between the two WLMs is ∼ 0.75. The agreement between simulations and experiments is fair. (g) Hi-C map for the region (19.7-26.25)Mb with the upper (lower) triangle corresponding to simulations (experiments). (h) On the right is an illustration of the TADs. The dark-red circles are the positions of the loop anchors detected in the Hi-C experiment, formed by two CTCF motifs. (i) Snapshot of the TAD marked by the black line in (h). (k) Same as (h) except the TADs were calculated for the region (90.8-97.05)Mb in (j). The diversity of TAD structures is apparent.

Clustering of A and B loci is stronger after loop (cohesin) loss:

(a) Comparison between simulated contact maps using CCM (19-34Mb, upper panel) and Pearson correlation maps (19-29Mb, lower panel) for Chr13 (GM12878 cell line). Upper triangle (lower triangle) was calculated with (without) CTCF loops. The black circles in the upper triangle are the positions of the CTCF loop anchors detected in the Hi-C experiment [4]. The bar on top marks the epigenetic states with red (blue) representing active (repressive) loci. Upon CTCF loop loss, the plaid patterns are more prominent, and finer details of the compartment organization emerge. (b) 3D snapshots of A and B clusters identified using the DBSCAN algorithm with PL = 1 (left panel) and PL = 0 (right panel) computed from simulations of Chr13 with and without loops, respectively. Five A clusters (Upper panel; red, orange, yellow, dark-green, light-green) and one B cluster (Lower panel; white) are detected in this 3D structure with PL = 1. Four A clusters and one B cluster are detected for PL = 0. The size of a locus is σ50K ≈ 243nm [11]. (c) Box plot of the number (left) and average size (right) of A (B) clusters determined using 10,000 individual 3D structures for PL = 1 and PL = 0 for simulated Chr10 and Chr13. The size of the A (B) cluster, SA (SB), is defined as (the number of A (B) loci within the cluster)/(the total number of A (B) loci within the chromosome). Boxes depict the median and quartiles. The black line with caps describes the range of values in the number and size. Loop loss results in fewer A-type clusters of larger size, reflecting an enhancement in compartment strength (Upper). A two-sided Mann-Whitney U test was performed for the statistical analysis. There is no change in the number and size of B clusters after loop deletion (Lower). (d)-(e) Same as (c) except the results were determined using 10,000 3D structures generated with the HIPPS method from the experimental Chr11 and Chr19 contact maps (Chr6 and Chr15 contact maps) from mouse liver for the WT and ΔNipbl [34] (WT and ΔRAD21 HCT-116 cells [26]), respectively. The number of A clusters decreases by 18% and 27% after Nipbl loss in Chr11 and Chr19, respectively. (f)-(g) Pearson correlation matrix derived from 3D structures for Chr11 and Chr19 of mouse liver, respectively. Two loci separated by a distance smaller than 1.75σ are in contact (σ is the mean distance between loci i and i + 1, computed separately for WT and ΔNipbl). The black circles in the upper triangle are loop anchors detected in the Hi-C map [34] using HiCCUPS [4]. (h) The percentage decrease in the number of A (B) clusters after CTCF loop or cohesin loss for some chromosomes in simulations and experiments as a function of the percentage of A (B) loci within the chromosome. When the proportion of B loci is much larger than that of A loci, there is no change in B clusters despite loop or cohesin deletion (Upper panel).
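The cluster statistics in panels (c)-(h) can be sketched from the definitions given above. The snippet assumes that a clustering step (DBSCAN in the caption) has already assigned one label to each A locus of a single 3D structure, with -1 marking unclustered loci, which is a common convention but an assumption here. The function names and the toy labels are hypothetical.

```python
from collections import Counter


def cluster_stats(labels, noise_label=-1):
    """Number of clusters and mean cluster size for one structure.

    labels holds one cluster label per locus of a given type (A or B).
    Size is (loci in the cluster) / (all loci of that type), as in panel (c).
    """
    counts = Counter(label for label in labels if label != noise_label)
    n_clusters = len(counts)
    total_loci = len(labels)
    sizes = [c / total_loci for c in counts.values()]
    mean_size = sum(sizes) / n_clusters if n_clusters else 0.0
    return n_clusters, mean_size


def percent_decrease(before, after):
    """Percentage decrease in the number of clusters, as plotted in panel (h)."""
    return 100.0 * (before - after) / before


# Toy labels for ten A loci with and without loops (illustrative, not real data).
labels_with_loops = [0, 0, 1, 1, 2, 2, 3, 3, 4, -1]
labels_without_loops = [0, 0, 0, 1, 1, 1, 2, 2, 3, 3]

n_wt, size_wt = cluster_stats(labels_with_loops)
n_del, size_del = cluster_stats(labels_without_loops)
drop = percent_decrease(n_wt, n_del)
```

In practice these quantities would be collected over all 10,000 structures before comparing the two conditions.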

Enhancement of compartmentalization upon CTCF loop loss:

Compartmentalization saddle plots are shown for (a) Chr13 and (b) Chr10 with PL = 1 (left) and PL = 0 (right). Observed/expected matrix bins are arranged based on PC1, obtained from the contact maps without loops. Numbers at the centre of the maps represent compartment strengths defined as the ratio of ((AA) and (BB) interactions) to ((AB) and (BA) interactions) using the mean values from the corners. The increase in the compartment score (4.2 to 5.2 for Chr13 and 10.9 to 13.3 for Chr10) shows that the compartment features are accentuated in PL = 0 (loop deletion) compared to PL = 1, which accords well with the conclusions in the main text that uses a different method.
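The compartment strength defined above can be written as a short sketch. It assumes the observed/expected matrix has already been reordered by increasing PC1, so that the top-left corner holds B-B interactions and the bottom-right corner holds A-A interactions, and that each corner is a k×k block. The corner size k and the function names are assumptions, since the caption does not give them.

```python
def corner_mean(matrix, rows, cols):
    """Mean of the matrix entries in the block rows × cols."""
    values = [matrix[i][j] for i in rows for j in cols]
    return sum(values) / len(values)


def compartment_strength(saddle, k):
    """(AA + BB) / (AB + BA) using the mean of each k×k corner."""
    n = len(saddle)
    low, high = range(0, k), range(n - k, n)  # most B-like and most A-like bins
    bb = corner_mean(saddle, low, low)
    aa = corner_mean(saddle, high, high)
    ab = corner_mean(saddle, low, high)
    ba = corner_mean(saddle, high, low)
    return (aa + bb) / (ab + ba)


# Toy 4×4 saddle matrix sorted by PC1 (illustrative values, not real data).
saddle = [
    [2.0, 1.5, 0.6, 0.5],
    [1.5, 1.2, 0.8, 0.6],
    [0.6, 0.8, 1.2, 1.5],
    [0.5, 0.6, 1.5, 2.0],
]

strength = compartment_strength(saddle, k=1)
```

A larger value indicates stronger segregation of A and B compartments, which is the direction of the change reported between PL = 1 and PL = 0.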

Calculation of boundary strength and boundary probability from the distance matrix at 50kb resolution:

(a) A schematic describing the chromosome model. (b) Each small square of size a (= 50 kb) represents the distance, rij, between two loci i and j. The red square is used to illustrate the idea. (c) Definition of the start-of-domain and end-of-domain boundary strengths in the N×N distance matrix. The distances between the loci are represented as arcs in various colors. (d) The distance maps in 10,000 cells are calculated from the 3D structures generated using the HIPPS method [68] with the Hi-C contact map from Schwarzer et al. [34] as input. Local maxima above a defined threshold in the start-of-domain and end-of-domain boundary strengths (yellow and green lines, respectively) are defined as domain boundaries in the WT Chr13. The start and end boundary probabilities for each locus are calculated as the proportion of cells in which the corresponding locus is a boundary location. The average of the start and end boundary probabilities, computed over 10,000 cells, is defined as the boundary probability for a given locus.
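The last two steps in panel (d) can be sketched as follows. The snippet assumes that the start-of-domain and end-of-domain boundary strengths have already been computed for each cell as described in panel (c); it only illustrates picking local maxima above a threshold and averaging the resulting boundary calls over cells. The function names, the strict local-maximum rule, and the toy numbers are assumptions.

```python
def local_maxima(strength, threshold):
    """Loci where the boundary strength is a strict local maximum above threshold."""
    return {
        i
        for i in range(1, len(strength) - 1)
        if strength[i] > threshold
        and strength[i] > strength[i - 1]
        and strength[i] > strength[i + 1]
    }


def boundary_probability(start_calls, end_calls, n_loci):
    """Average of start and end boundary probabilities per locus.

    start_calls and end_calls each hold one set of boundary loci per cell.
    """
    n_cells = len(start_calls)
    prob = []
    for locus in range(n_loci):
        p_start = sum(locus in cell for cell in start_calls) / n_cells
        p_end = sum(locus in cell for cell in end_calls) / n_cells
        prob.append(0.5 * (p_start + p_end))
    return prob


# Boundary calls from one toy strength profile (illustrative values).
calls = local_maxima([1.0, 1.2, 2.5, 1.1, 1.0, 1.9, 1.0], threshold=1.5)

# Toy boundary calls for four cells and six loci.
start_calls = [{2}, {2}, {3}, {2}]
end_calls = [{2}, {4}, {2}, set()]
prob = boundary_probability(start_calls, end_calls, n_loci=6)
```

Peaks in the resulting profile correspond to loci that act as domain boundaries in a large fraction of the single-cell structures.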

ChromHMM chromatin state annotation in HCT-116 cells

Single-cell TAD-like structures are present for both PL = 1 and PL = 0. (a) Mean spatial-distance matrix for the genomic region (25.7-28.2Mbp) in CCM Chr13 without (left) and with (right) CTCF loops. (b) Examples of single-cell spatial-distance matrices calculated from the simulated 3D structures. TAD-like structures vary from cell to cell for both PL = 1 (left) and PL = 0 (right). Schematics of the structures for the four cells under the two conditions are given below. (c) Distribution of the boundary strengths before (left) and after (right) CTCF loop loss, describing the steepness of the changes in the spatial distance across the boundaries. (d) The probability for each locus to be a single-cell domain boundary for PL = 1 (left) and PL = 0 (right).

Same as Appendix Fig. 8 except the results are for the genomic region (123.5-126Mb) in Chr4 of mouse liver [34] with (left) and without (right) cohesin loading factor Nipbl. HIPPS generated single-cell spatial-distance matrices using Hi-C contact maps as inputs.

Same as Appendix Fig. 8 except the results are for the genomic region (182.05-184.55Mb) in Chr2 of HCT116 [26] with (left) and without (right) a core component of the cohesin complex, RAD21. Single cell 3D structures were calculated from Hi-C contact maps using HIPPS.

Epigenetic states contribute to the formation of domain boundaries.

P-TADs do not always have corner dots at their boundaries in the WT cells. (a) The number of P-TADs (after ΔNipbl) whose boundaries coincide with both epigenetic switches and corner dots (CTCF loop anchors) (red color) and with only epigenetic switches (olive color) in the WT chromosomes from mouse liver. (b) Same as (a) except the results are obtained using experimental data from HCT-116 cells. (c) Chr10:57Mb-59.5Mb in mouse liver and (d) Chr1:111.8Mb-114.3Mb in HCT-116 cells. Comparison between 50kb-resolution contact maps for the 2.5Mb region with (upper) and without (lower) Nipbl (RAD21). The panels below show the mean distance maps obtained from the 3D structures. ChIP-seq tracks for CTCF, RAD21 and SMC1 in WT cells [4, 34] illustrate the correspondence between the locations of the detected loop anchors and the ChIP-seq signals. Comparison of the contact maps and boundary probabilities in (c) and (d) shows that the P-TAD boundaries (blue dotted lines) correspond well with epigenetic switches (blue lines) even without corner dots in WT cells. Purple circles in the boundary probability graph represent the preferred boundaries.

Fate of TAD structures after loss of RAD21 in HCT-116 cells.

(a) Complete Loss (Chr21:34.6Mb-37.1Mb). (b)-(c) Preserved (Chr3:97.7Mb-100.2Mb and Chr5:9Mb-11.5Mb). 50kb-resolution contact maps for the 2.5Mb genomic regions of interest with (upper) and without (lower) RAD21 are shown in the middle panels. The dark-red circles at the boundaries of the TADs in the contact maps are loop anchors detected using HiCCUPS [55]. The mean distance maps calculated using the 3D structures with and without RAD21 are compared in the top and bottom panels. ChIP-seq tracks for CTCF, RAD21 and SMC1 in WT cells [4] illustrate the correspondence between the locations of the detected loop anchors and the ChIP-seq signals. Bottom plots show the probability for each genomic position to be a single-cell domain boundary in these regions. Purple circles in the boundary probability graph represent the preferred boundaries. Some P-TAD boundaries match epigenetic switches (blue lines). Other P-TADs have only high peaks in the boundary probability (green line) without evidence for an epigenetic switch. The magenta line shows discordance between TopDom and the boundary probability.

Examples of discordance between TopDom and boundary probability predictions in mouse liver [34]. In all cases, the plots show contact maps with TopDom results, the mean spatial-distance matrix and the boundary probability for the 2.5Mb regions (a) Chr1:172Mb-174.5Mb, (b) Chr4:137.5Mb-140Mb and (c) Chr10:8.7Mb-11.2Mb with (top) and without (bottom) Nipbl. Purple circles in the boundary probability indicate the prominent physical boundaries in the 3D structures. The magenta lines represent discordance between TopDom and the boundary probability.