1. Developmental Biology and Stem Cells
  2. Genes and Chromosomes
Download icon

Global reorganisation of cis-regulatory units upon lineage commitment of human embryonic stem cells

  1. Paula Freire-Pritchett
  2. Stefan Schoenfelder
  3. Csilla Várnai
  4. Steven W Wingett
  5. Jonathan Cairns
  6. Amanda J Collier
  7. Raquel García-Vílchez
  8. Mayra Furlan-Magaril
  9. Cameron S Osborne
  10. Peter Fraser
  11. Peter J Rugg-Gunn Is a corresponding author
  12. Mikhail Spivakov Is a corresponding author
  1. Babraham Institute, United Kingdom
  2. University of Cambridge, United Kingdom
  3. King's College London School of Medicine, United Kingdom
Tools and Resources
Cited
3
Views
2,859
Comments
0
Cite as: eLife 2017;6:e21926 doi: 10.7554/eLife.21926

Abstract

Long-range cis-regulatory elements such as enhancers coordinate cell-specific transcriptional programmes by engaging in DNA looping interactions with target promoters. Deciphering the interplay between the promoter connectivity and activity of cis-regulatory elements during lineage commitment is crucial for understanding developmental transcriptional control. Here, we use Promoter Capture Hi-C to generate a high-resolution atlas of chromosomal interactions involving ~22,000 gene promoters in human pluripotent and lineage-committed cells, identifying putative target genes for known and predicted enhancer elements. We reveal extensive dynamics of cis-regulatory contacts upon lineage commitment, including the acquisition and loss of promoter interactions. This spatial rewiring occurs preferentially with predicted changes in the activity of cis-regulatory elements and is associated with changes in target gene expression. Our results provide a global and integrated view of promoter interactome dynamics during lineage commitment of human pluripotent cells.

https://doi.org/10.7554/eLife.21926.001

eLife digest

Virtually every cell in the body contains the same set of DNA, which encodes thousands of genes. The activities of these genes vary between different types of cells and at different points in time. As a result, our DNA contains a complex array of molecular switches that instruct genes to switch on and off at the right time and in the right cells. These molecular switches, termed regulatory elements, are often a long way away from the genes that they control, and this can make it difficult to find out which switch controls which genes.

DNA is made up of several different building blocks known as bases and the order of these bases encodes specific information about the gene. Every human cell contains approximately two meters of DNA, which is highly folded in the cell nucleus. This three-dimensional folding allows regions that are far apart on the DNA thread to physically contact each other. To reach the genes they control, regulatory elements form loops on the DNA that are near-impossible to predict from looking at the sequence of bases alone. Mapping the locations of these loops can reveal the hidden circuitry within our DNA and help us to understand how unwanted changes (mutations) within regulatory elements may cause disease.

Freire-Pritchett, Schoenfelder et al. set out to reveal how loops between genes and their regulatory elements change as the stem cells specialise into immature brain cells. The experiments show that the pattern of DNA loops is extensively altered after the stem cells specialise into brain cells, that is, some loops are lost and new ones form. Furthermore, the regulatory elements themselves were often toggled between “on” and “off” states. These two processes tend to occur together, so that new loops are formed at the same time as the switch is activated.

Freire-Pritchett, Schoenfelder et al. also show that individual genes are often connected to many different regulatory elements. The next step is to understand how these multiple connections work together to coordinate gene activity, and whether this information could be used to control how stem cells specialise. This knowledge may lead to the development of stem cell-based therapies for stroke, Parkinson’s disease and other conditions.

https://doi.org/10.7554/eLife.21926.002

Introduction

Cell fate decisions are associated with profound changes in chromatin organisation, which underlie the activation of lineage-specific and the silencing of lineage-inappropriate genes (Buecker and Wysocka, 2012; Bulger and Groudine, 2010; Calo and Wysocka, 2013; Hallikas et al., 2006; Ong and Corces, 2012). Cis-regulatory elements such as transcriptional enhancers play a key role in this process by integrating regulatory inputs from intrinsic and extracellular cues, and by mediating the recruitment of core activator and repressor complexes (Pennacchio et al., 2013; Shlyueva et al., 2014; Spitz and Furlong, 2012). The definition of chromatin signatures has enabled the genome-wide identification of enhancer elements across multiple human cell types (ENCODE Project Consortium, 2012; Heintzman et al., 2007, 2009; Pennacchio et al., 2006; Rada-Iglesias et al., 2011; Kundaje et al., 2015). Chromatin states can also provide a robust readout of cis-regulatory activity associated with poised and active enhancers and have been used to show that widespread changes in enhancer position and activity occur upon cell fate decisions such as the lineage commitment of pluripotent cells (Creyghton et al., 2010; Rada-Iglesias et al., 2011; Zentner et al., 2011).

Cis-regulatory elements are often considerable distances away from their target gene promoters and may not control their nearest genes (Carvajal et al., 2001; Jeong et al., 2006; Marinić et al., 2013; Pennacchio et al., 2006; Ruf et al., 2011; Sagai et al., 2005; Spitz et al., 2003). It is generally accepted that this long-range action is facilitated by DNA-looping interactions (Pennacchio et al., 2013; Shlyueva et al., 2014). However, specific determinants of chromosomal interactions are still not fully understood, which presents challenges for the high-confidence prediction of regulatory interactions from sequence and epigenetic information (Mora et al., 2016; Roy et al., 2015; Shlyueva et al., 2014; Whalen et al., 2016). As a result, the target genes of most cis-regulatory elements remain unknown. Furthermore, while it is generally accepted that many genes are controlled by multiple regulatory elements (Barolo, 2012; Miguel-Escalada et al., 2015) our understanding of multi-modular gene regulation remains limited, particularly in the context of mammalian development and stem cell differentiation.

Over the last decade, chromosome conformation capture (3C) and derived methods have enabled the biochemical mapping of looping interactions to offer new insights into their architecture across different cell types (Dekker et al., 2013; de Laat and Duboule, 2013; Schmitt et al., 2016). In particular, Hi-C has allowed genome-wide characterisation of higher order chromatin dynamics during differentiation at the level of contact domains, including A/B compartments and topologically associated domains (TADs) (Fraser et al., 2015; Dixon et al., 2015). The complexity of Hi-C samples creates challenges for a comprehensive identification of individual enhancer-promoter loops using this technology. However, analyses focusing on candidate-interacting regions or those bound by specific proteins (such as cohesin and RNA polymerase II) have made it possible to detect subsets of promoter-enhancer interactions at high resolution (Heidari et al., 2014; Li et al., 2015; Sanyal et al., 2012) and delineate their dynamics during cell differentiation and reprogramming. These studies provided evidence of interactions associated with transcriptional changes upon lineage commitment (Denholtz et al., 2013; Kieffer-Kwon et al., 2013; Phillips-Cremins et al., 2013; Zhang et al., 2013), as well as revealed interactions formed in anticipation of changes in gene activity (Apostolou et al., 2013; Ghavi-Helm et al., 2014; Wei et al., 2013). However, despite these advances, the global and unbiased high-resolution mapping of promoter cis-regulatory interactions that form and remodel during development and stem cell differentiation is still lacking. This hampers an integrated understanding of the cis-regulatory logic underlying transcriptional decisions during lineage commitment.

Recently, we developed Promoter Capture Hi-C that uses sequence capture to enrich Hi-C libraries for interactions involving the promoters of most annotated genes, providing a global view on promoter interactions that is independent of the activity of interacting regions and identity of proteins recruited to them (Mifsud et al., 2015; Schoenfelder et al., 2015a). Here, we use PCHi-C in human embryonic stem cells (ESCs) and ESC-derived neuroectodermal cells (NECs) (Bajpai et al., 2009) to create a high-resolution resource of promoter contacts and their dynamics during early lineage commitment in the context of extensive chromatin changes that occur at the interacting cis-regulatory regions as the cells differentiate (Rada-Iglesias et al., 2011). Our large-scale dataset links thousands of known and predicted enhancer elements with their putative target genes in human pluripotent and early lineage-committed cells, including those known to drive tissue-restricted reporter gene expression in transgene assays. We integrate the promoter interacting regions of each gene to define cis-regulatory units (CRUs) that provide a view of multi-modular gene regulation. We show that CRUs undergo extensive reorganisation during lineage commitment that involves both the ‘rewiring’ (acquisition or loss) of promoter interactions, as well as chromatin state changes at pre-existing interactions. Importantly, we demonstrate that this reorganisation is associated with changes in target gene expression, thereby providing insights into the transcriptional control of early human development.

Results

A high-resolution atlas of promoter interactions in human pluripotent and early lineage-committed cells

We used PCHi-C to profile the interactomes of 21,841 promoters in human ESCs and NECs (Figure 1A). We generated NECs using an established protocol (Rada-Iglesias et al., 2011) (Figure 1—figure supplement 1A) and confirmed efficient differentiation by flow cytometry analysis and RNA-sequencing (Figure 1—figure supplement 1B,C). PCHi-C data normalisation and signal detection using the CHiCAGO pipeline (Cairns et al., 2016) identified 75,795 significant cis-interactions between promoters and other genomic regions in ESCs and 75,624 in NECs. In addition, approximately 300 significant trans-interactions were detected in each cell type. As examples of this rich dataset, high-confidence interactions are shown for the SOX2 and PAX6 promoters (Figure 1B and Figure 1—figure supplement 2A). These examples illustrate the multiple promoter-contacts observed, alongside the conventional Hi-C profiles additionally generated in this study that reveal higher-order genome topology over the same region. Overall, PCHi-C samples showed an 11 to 15-fold enrichment for promoter-containing interactions over conventional Hi-C. This data resource provides a global, high-resolution atlas of chromosomal interactions in human pluripotent and lineage-committed cells. Processed datasets have been made available through Open Science Framework (http://osf.io/sdbg4), and raw sequencing reads have been deposited to Gene Expression Omnibus (accession GSE86821).

Figure 1 with 4 supplements see all
A resource of high-resolution promoter interactions in human embryonic stem cells (ESCs) and ESC-derived neuroectodermal cells (NECs).

(A) Overview of the experimental design. Human embryonic stem cells (ESCs) and ESC-derived neuroectodermal progenitors (1) were analysed with Promoter Capture Hi-C to profile interactions involving 21,841 promoter-containing HindIII fragments (2). Signal detection with the CHiCAGO pipeline revealed ~75,000 high-confidence promoter interactions in each cell type (3). These data were integrated with histone modification and gene expression profiles in the same cells (4) to study chromatin and interaction dynamics during lineage commitment. Characterisation of ESCs and NECs is shown in Figure 1—figure supplement 1. (B) Genome browser representation of the SOX2 promoter interactome in ESCs (upper) and NECs (lower). Significant interactions are shown as purple arcs, with one end of the interaction within the SOX2 promoter and the other end at a promoter-interacting region (PIR). ChIP-seq (H3K27me3, H3K27ac, H3K4me1, H3K4me3; from [Rada-Iglesias et al., 2011]) and mRNA-seq tracks are shown. Chromatin states for each genomic region were defined by ChromHMM (Ernst and Kellis, 2012) using ChIP-seq data (active chromatin, green; poised chromatin, orange; Polycomb-associated chromatin, red; intermediate, yellow; background, grey). Conventional Hi-C heatmaps of contact frequencies reveal chromatin topology over this region. As an additional example, the PAX6 promoter interactome is shown in Figure 1—figure supplement 2. Read count interaction profiles for SOX2 and PAX6 are shown in Figure 1—figure supplement 4. (C) PIRs are significantly enriched in regions that contain histone marks associated with putative regulatory functions, compared with promoter distance-matched control regions (permutation test p-value<0.01 for each mark) (ESCs, left; NECs, right). Blue bars show the number of overlaps observed in detected PIRs, and grey bars show the mean number of overlaps observed in distance-matched random regions over 100 permutations. Error bars show 95% confidence intervals across permutations. (D) Promoters and their associated PIRs show significant concordance in chromatin states. Heatmaps show the log2 odds ratios for the co-occurrence of each combination of promoter and PIR chromatin state compared with that expected at random. p-Values are from Pearson’s χ2 test on the corresponding contingency tables. Clustering of chromatin states and additional examples of promoter interactomes are shown in Figure 1—figure supplement 3.

https://doi.org/10.7554/eLife.21926.003

Identification of putative regulatory elements and their associated gene promoters

To gain insight into the chromatin properties of the promoter-containing interactions, we integrated the PCHi-C data with published genome-wide histone modification profiles in ESCs and NECs (Rada-Iglesias et al., 2011). In both cell types, promoter-interacting regions (PIRs) were significantly enriched for histone marks that are associated with regulatory functions (Figure 1C), including H3 lysine 4 monomethylation (H3K4me1) and H3 lysine 27 acetylation (H3K27ac), which can identify enhancers in human cell types, as well as H3 lysine four trimethylation (H3K4me3), which is associated with transcriptional activation, and H3 lysine 27 trimethylation (H3K27me3), which is associated with Polycomb-mediated transcriptional repression (Di Croce and Helin, 2013; Heintzman et al., 2007, 2009). PIRs were also significantly enriched for sites bound by the architectural protein CTCF (Figure 1—figure supplement 2B; based on ENCODE data available for ESCs only), consistent with previous observations in other cell types (Jin et al., 2013; Phillips-Cremins et al., 2013; Sanyal et al., 2012).

We used ChromHMM (Ernst and Kellis, 2012) to integrate these histone marks and to define four combinatorial chromatin states in both ESCs and NECs, as follows: active (characterised by H3K4me3 and/or H3K27ac); Polycomb-associated (H3K27me3); poised (H3K4 methylation and H3K27me3); and background (no detectable signal for the tested histone modifications) (Figure 1—figure supplement 3A; see Materials and methods for details). Overall, we detected just under 20,000 PIRs in each cell type that harboured either active (12,250 in ESCs and 7510 in NECs), Polycomb-associated (3505 in ESCs and 5856 in NECs) or poised (2274 in ESCs and 4262 in NECs) chromatin state signatures, connecting a large set of putative regulatory sequences in human pluripotent and lineage committed cells to their target promoters. In addition, 25,727 PIRs in ESCs and 20,016 PIRs in NECs were in the background state.

The chromatin states of several example promoters, including those for the POU5F1, PRDM14 and CHD7 genes, together with each of their respective PIRs, are shown in Figure 1—figure supplement 3B. When analysing the whole dataset, we found a significant concordance between the chromatin states at promoters and their PIRs (Figure 1D), which is in line with previous studies in other human cell types (Jin et al., 2013; Mifsud et al., 2015; Sanyal et al., 2012) and provides validation of our dataset. Notably, poised and Polycomb-associated promoters showed similar interaction preferences for PIRs in either of these two states (Figure 1D). This finding suggests that poised and Polycomb-associated regions are broadly interchangeable in terms of their interaction affinities, which is consistent with a key role for Polycomb-group proteins in mediating interactions in the poised state (Schoenfelder et al., 2015b).

Taken together, these data provide a comprehensive resource that links many thousands of known and predicted regulatory elements with their putative target genes and will enable the investigation of regulatory contacts during human lineage commitment.

Promoter-interacting regions can function as tissue-restricted developmental enhancers

The enrichment of PIRs for specific chromatin regulatory features points to the presence of functional enhancer elements at these regions that could potentially provide inputs to the promoters they contact. To assess the enhancer activity of the identified PIRs, we examined whether they were known to efficiently drive reporter gene expression in embryonic day 11.5 mouse embryos based on information from the VISTA Enhancer Browser (Visel et al., 2007). As an initial example, we focused on the 39 PIRs detected in NECs that interact with the promoter of the neural transcription factor POU3F2. Strikingly, four out of the five POU3F2 PIRs tested experimentally in VISTA transgenic assays showed reporter activity exclusively in neural tissues, and one PIR was inactive (Figure 2A). Furthermore, the mRNA expression pattern of mouse Pou3f2 broadly matched the combined tissue-restricted pattern of the tested human POU3F2 PIRs (Figure 2A). These results highlight how PCHi-C can contribute to our understanding of the cis-regulatory networks for key developmental genes.

Figure 2 with 1 supplement see all
Promoter-interacting regions can function as tissue-restricted developmental enhancers and identify associated target genes.

(A) A genome browser representation of the POU3F2 promoter interactome in NECs. Genome coordinates are shown underneath. Chromatin states are indicated (active chromatin, green; poised chromatin, orange; Polycomb-associated chromatin, red; intermediate, yellow; background, grey). Significant interactions are shown as coloured arcs. Five of the identified POU3F2 PIRs have been tested experimentally using a transgenic reporter assay as part of the VISTA Enhancer Browser (Visel et al., 2007). Of those five, four regions (indicated by blue arcs) can drive tissue-restricted LacZ expression in E11.5 mouse embryos. Representative images of X-gal stained mouse embryos are shown for each sequence. These show neural-restricted enhancer activity within the forebrain, midbrain, hindbrain and neural tube, which are tissues derived from NECs. The mRNA expression pattern of Pou3f2 in an E10.5 mouse embryo (EMAGE gene expression database; EMAGE:1689; [Richardson et al., 2014]) broadly matches the combined tissue-restricted pattern of its enhancers. One experimentally tested PIR (indicated by brown arc) is inactive at this developmental stage in mouse embryos. (B) PIRs identified in NECs are enriched for sequences that can drive reporter gene activity in neural tissues and other neuroectodermal derivatives (see also Figure 2—figure supplement 1A). The barplot shows the distribution of tissue-specific reporter expression patterns for all experimentally tested PIRs identified in ESCs (n = 219) and NECs (n = 267). Embryo reporter assays and enhancer activity patterns are from the VISTA Enhancer Browser (Visel et al., 2007). The number of PIRs active within a particular tissue is shown above each bar. PIRs with an active chromatin state in NECs showed an even more pronounced enrichment for enhancer activity in neural tissues (Figure 2—figure supplement 1B). (C) Representative images of X-gal stained mouse embryos from the VISTA Enhancer Browser (Visel et al., 2007) reveal neural-restricted reporter gene activity for six example NEC PIRs. Shown underneath is the gene promoter assignment for the associated enhancer in VISTA and in our PCHi-C dataset.

https://doi.org/10.7554/eLife.21926.008

We next examined all putative enhancer elements and their PCHi-C-identified promoter targets. Overall, 219 PIRs in ESCs and 267 PIRs in NECs overlapped VISTA-annotated human elements (Supplementary file 2). Notably, we found that NEC PIRs were strongly enriched for sequences that could drive reporter-gene activity in neural tissues and in other neuroectodermal derivatives, such as the neural tube and cranial structures (z-score = 11, Figure 2B and Figure 2—figure supplement 1A). In contrast, ESC PIRs were enriched for sequences active in neural (z-score = 6.7) and non-neural tissues (z-score = 4.5) at similar levels (Figure 2B and Figure 2—figure supplement 1A). Interestingly, the enrichment of PIRs with neural enhancer activity in NECs relative to their enrichment in ESCs was even more pronounced when we focused on PIRs in an active chromatin state (Figure 2—figure supplement 1B). Collectively, these results validate the function of several hundreds of PIRs as cell-type-specific developmental enhancers.

We next sought to link enhancers documented in the VISTA Enhancer Browser to their putative target genes on the basis of PCHi-C data. We detected the interactions of 267 VISTA-annotated human enhancers with 277 target gene promoters in NECs (Supplementary file 2). Of these, 122 PIRs (46%) interacted with their nearest gene, which is consistent with their current annotation in the VISTA Enhancer Browser. The remaining PIRs, however, did not interact with their nearest gene in NECs, but engaged with more distal promoters (Supplementary file 2). Figure 2C shows PCHi-C-based reassignment of enhancer targets for several examples of key neural regulators including SOX2, SOX4, and FZD3 (Figure 2C), and the full results are listed in Supplementary file 2.

Taken together, these findings provide a functional validation of the detected human PIRs, and identify the putative promoter targets of multiple known enhancers.

The cis-regulatory unit: an integrated view of promoter interactions

We found interacting promoters to engage a median of four PIRs (Figure 3A), consistent with findings in other human cell types (Jin et al., 2013; Sanyal et al., 2012). To obtain an integrated view of promoter interactions, we considered PIRs connected to each promoter to jointly form a ‘cis-regulatory unit’ (CRU, Figure 3B). Focusing on protein-coding genes, and considering all promoters associated with at least one PIR, we defined 9008 CRUs in ESCs and 9361 in NECs, and studied their localisation, chromatin properties and dynamics during cell lineage commitment.

Figure 3 with 1 supplement see all
Characterisation of cis-regulatory units (CRUs).

(A) Boxplot shows the distributions of the number of PIRs per interacting promoter in ESCs (n = 17955) and NECs (n = 18146). Promoters with no detected PIRs are not shown (4121 in ESCs; 3930 in NECs). The number of interactions per promoter showed only a minor dependence on transcriptional activity and promoter chromatin state (Figure 3—figure supplement 1A,B). (B) A schematic illustrating the concept of a CRU as a collection of all PIRs together with their associated promoter. Note that it cannot be ruled out that some PIRs may provide alternative rather than concurrent interactions. (C) Boxplot shows the distributions of CRU span in ESCs and NECs. We observed a moderate dependence between the span and the number of PIRs (Figure 3—figure supplement 1C). (D) CRUs are preferentially contained within an individual TAD. Line graph shows the percentage of CRUs with different proportions of interactions that reside within an individual TAD (purple) and the summary statistics (mean and 95% confidence error bars) obtained for 1000 random samples, keeping the same CRU structure (grey). There is a significant tendency for CRUs to be contained entirely within a TAD ( * denotes permutation test p-value<0.001). In addition, fewer CRUs span entirely over a TAD boundary (* denotes permutation test p-value<0.001). The 1000 random samples were generated by permutations of CRUs across all promoter fragments, retaining the same overall CRU structure. Error bars show 95% confidence intervals. Data shown are for ESCs (n = 9008 CRUs); data for NECs are shown in Figure 3—figure supplement 1D. We found that CRUs crossing TAD and IN boundaries generally contained a higher number of PIRs (Figure 3—figure supplement 1F,G). E–H) Genome browser representations of CRUs in ESCs. Examples include the SNAI2 CRU (E) and GLI2 CRU (F), which both fit entirely within a TAD and INs; PRDM1 CRU (G), which fits entirely within a TAD but extends beyond INs, and POU3F1 CRU (H), which extends over a TAD boundary and also beyond an IN. (I) CRUs are preferentially contained within INs, but interactions can extend beyond IN boundaries. The line graph shows the percentage of CRUs with different proportions of interactions that reside within an individual IN in ESCs (coordinates obtained from [Ji et al., 2016]). There is a significant tendency for CRUs (purple) to be contained entirely within an IN, compared to random (grey) (* denotes p-value<0.001 from a permutation test done with 1000 random samples). In addition, fewer CRUs span entirely beyond an IN (* denotes p-value<0.001 from a permutation test done with 1000 random samples). Error bars show 95% confidence intervals. Promoters outside of a defined IN were excluded from the analysis.

https://doi.org/10.7554/eLife.21926.010

CRUs spanned a median of ~230 kb (with a range of 1 kb-200Mb) in both cell types (Figure 3C). We assessed their localisation with respect to higher order features of chromosome architecture including TADs and Insulated Neighborhoods (INs) (Dixon et al., 2012; Ji et al., 2016; Nora et al., 2012; Sexton et al., 2012). We defined TADs in ESCs and NECs using Hi-C data for these cells generated as part of this study (see Materials and methods). Overall, ~75% of CRUs were fully contained within a TAD in ESCs and NECs, which was significantly higher than expected by random (Figure 3D–G and Figure 3—figure supplement 1D,E). In the remaining ~25% of CRUs, either some or all PIRs localised outside of the promoter-harbouring TAD (Figure 3H and Figure 3—figure supplement 1D). We found that TAD boundaries crossed by promoter interactions were generally weaker than non-crossed boundaries (Wilcoxon test p-value=1.8e-14; Figure 3—figure supplement 1H). However, the ranges of strength scores for ‘crossed’ and ‘non-crossed’ TAD boundaries were highly overlapping, and even some of the strongest boundaries were penetrable to interactions (Figure 3—figure supplement 1H).

For INs, we used the published genomic coordinates (available for ESCs only) that were defined on the basis of cohesin ChIA-PET and CTCF-binding data (Ji et al., 2016). Just under 30% of CRUs were fully contained within IN boundaries, and this proportion increased to ~45% when considering the largest span of each overlapping set of INs as a single unit (Figure 3I and Figure 3—figure supplement 1I). These numbers significantly exceeded the proportions expected at random (Figure 3I and Figure 3—figure supplement 1I), but at the same time, also provided abundant examples of IN-spanning CRUs (Figure 3G,H). Taken together, these results suggest that CRUs are partially constrained by, but not fully contained within, higher order topological structures such as TADs and INs.

ESC differentiation is associated with the dynamic reorganisation of CRUs

To investigate the potential regulatory features of CRUs, we first characterised their chromatin properties by considering the proportion of PIRs in each chromatin state within a CRU. Applying hierarchical clustering based on this property, we obtained eight distinct clusters of CRUs in both ESCs (Figure 4A) and NECs (Figure 4—figure supplement 1), corresponding to different combinations of PIR chromatin states within CRUs. We found that CRUs within three prevalent clusters contained PIRs in one predominant, non-background, chromatin state (clusters 1–3; Figure 4A and Figure 4—figure supplement 1A). In contrast to these ‘uniform’ CRUs, 18% of CRUs in ESCs and 24% in NECs contained combinations of PIRs in active, poised and Polycomb-associated chromatin states (clusters 4–7; Figure 4A and Figure 4—figure supplement 1). Finally, CRUs in cluster 8 contained PIRs exclusively in the background state (Figure 4A and Figure 4–figure supplement 1). Examples of genes in ESCs assigned to the different CRU clusters are shown in Figure 4B and Figure 4—figure supplement 2. Notably, the chromatin state of each promoter generally matched that of the most prevalent CRU chromatin state (Figure 4A and Figure 4—figure supplement 1). Overall, this classification provides a framework for exploring CRU properties.

Figure 4 with 3 supplements see all
Clustering of CRUs according to chromatin state of each PIR in ESCs.

(A) CRUs from ESCs were clustered hierarchically according to the distribution and fractions of their PIRs that correspond to each chromatin state. Boxplots show the distribution of PIR fractions for each chromatin state (Act, active; Pois, poised; PcG, Polycomb-associated; Bg, background). Heatmaps show the log2 odds ratios of observing each promoter state associated with a CRU in each cluster (p<0.001, χ2 test on the contingency table). Data for NECs are shown in Figure 4—figure supplement 1. (B) Genome browser representations of CRUs in ESCs. U2AF1 CRU from cluster 1 and TBX3 CRU from cluster 2, each exemplify cases where non-background PIRs within a CRU are associated with a uniform chromatin state. SUV39H2 CRU from cluster 6 and CDKN2B CRU from cluster 7, each exemplify cases where PIRs within a CRU are associated with multiple chromatin states. Interaction arcs are coloured according to PIR chromatin state (active, green; poised, orange; Polycomb-associated, red; background, grey). See Figure 4—figure supplement 2 for additional examples, and Figure 4—figure supplement 3 for read count interaction profiles.

https://doi.org/10.7554/eLife.21926.012

We set out to investigate CRU chromatin state transitions on ESC to NEC differentiation. For this analysis, and in each cell type, we classified CRUs into either single-state active (containing active, and possibly also background-state, PIRs), single-state repressed (containing poised and/or Polycomb-associated, and possibly also background-state PIRs), or background (containing only background-state PIRs). CRUs containing a combination of both active and repressed (Polycomb-associated/poised) PIRs were classified as dual-state. We found that 65% of the single-state CRUs in ESCs remained single-state CRUs in NECs, although approximately half of them switched their state (Figure 5A). In addition, similar proportions of CRUs lost (11%) and acquired (13%) a dual-state configuration on ESC to NEC differentiation. These findings demonstrate that considerable reorganisation of CRUs occurs during lineage commitment.

CRU state transitions occur during ESC differentiation and are associated with changes in gene transcription.

(A) Pie chart summarising CRU state transitions that occur upon ESC to NEC differentiation. The number of CRUs within each transition category are shown. Transitions that involve dual-state to single-state, and single-state to dual-state, are further subdivided into whether the single-state is classified as active or repressed (Polycomb-associated or poised). (B–C) Heatmaps show the log2 odds ratios for CRU state transitions and associated changes in gene expression. (B) Single-state transitions showing a non-random segregation withgene expression changes (p-value=0.0031, Fisher’s exact test); (C) dual-state transitions showing a non-random segregation with gene expression changes (p-value=0.0014, Fisher’s exact test). Number of CRUs within each transition category are shown. Genes differentially expressed between ESCs and NECs were identified using DESeq2 (FDR < 0.05 and a log2 fold change of >1.5). Repressed state includes Polycomb-associated and poised states. (D–F) Genome browser representations of CRU state transitions that occur upon the differentiation of ESCs (top image) to NECs (lower image). (D) RGMB provides an example of a CRU transitioning from a single repressed to a single active state and an associated increase in RGMB transcription. (E) MAP2 provides an example of a dual-state to a single active state CRU transition and an associated increase in MAP2 transcription. (F) KCNE3 provides an example of a single active state to a dual-state CRU transition and an associated decrease in KCNE3 transcription. Arcs are coloured according to PIR chromatin state (active, green; poised, orange; Polycomb-associated, red; background, grey).

https://doi.org/10.7554/eLife.21926.016

CRU state transitions associated significantly with changes in the expression of their target genes (p-value<0.005, Fisher’s exact test; Figure 5B,C). In particular, genes that were transcriptionally upregulated upon ESC differentiation preferentially gained an active single-state in NECs, either through switching the chromatin state of a single-state CRU (Figure 5B) or through resolving a dual-state CRU (Figure 5C). Examples of CRUs undergoing each scenario include RGMB and MAP2, which are transcriptionally upregulated in NECs (Figure 5D,E). Pronounced chromatin changes were also detected at the CRUs of genes downregulated upon differentiation, including a loss of the active single-state and/or a transition to the repressed single-state ( Figure 5B,C; example shown in Figure 5F). Taken together, these results suggest that the modulation of CRU chromatin state is associated with transcriptional changes upon ESC differentiation. This modulation might potentially underlie many transcriptional changes in early lineage commitment.

Rewiring and recolouring of promoter-interacting regions contribute jointly to gene expression dynamics upon lineage commitment

To investigate the underlying processes that drive changes in CRU organisation during cell lineage commitment, we studied the dynamics of promoter interactions and chromatin states at the individual PIRs. We refer to changes in PIR connectivity as ‘rewiring’, and to chromatin state changes at PIRs as ‘recolouring’, and note that they do not need to be mutually exclusive (Figure 6A). To distinguish between interactions that are rewired and retained on ESC differentiation at high confidence, we applied additional filters to the PCHi-C data, resulting in 1153 rewired (present in only one cell type) and 1258 retained (present in both ESCs and NECs) interactions (see Materials and methods for details).

Figure 6 with 1 supplement see all
Interaction Dynamics: ‘recolouring’ versus ‘rewiring’.

(A) A schematic of interaction dynamics during cell differentiation. In a ‘recolouring’ interaction (left)the PIR undergoes a change in chromatin colour (reflecting a change in chromatin state) between the two cell types. In a ‘rewiring’ interaction (right), an interaction is gained or lost upon cell differentiation. In a 'rewiring + recolouring' interaction, the loss or gain of an interaction is concomitant with changes in chromatin colour at the respective PIR. (B) Heatmap of log2 odds ratios showing the association between different PIR chromatin state transitions (recolouring) and PIR interaction dynamics (rewiring) on ESC to NEC differentiation (p-value<0.001, Fisher’s exact test). (C–D) Genome browser representations of interaction dynamics upon ESC to NEC differentiation. Note that only one interaction is shown for each example. Arcs are coloured according to PIR chromatin state (active, green; Polycomb-associated, red). (C) Rewiring and recolouring: upon differentiation, NR2F1 gains an interaction with a PIR that is active in NECs, but repressed in ESCs. (D) Rewiring: the JAG1 promoter gains an interaction with an active PIR in NECs. Additional examples are shown in Figure 6—figure supplement 1A,B. (E) Pie chart summarising the different scenarios in which an interaction is retained upon ESC to NEC differentiation. (F) Recolouring: the IRX3 promoter retains an interaction, but the PIR changes from repressed (in ESCs) to active (in NECs). (G) Boxplots revealing the transcriptional changes as a function of active PIR dynamics during recolouring (left) and rewiring (right) events. In either scenario, there was a significant association between the acquisition and loss of an active state and changes in gene expression (p-values<0.001 for both recolouring and rewiring according to one-sided Wilcoxon rank sum tests).

https://doi.org/10.7554/eLife.21926.017

Importantly, we found that the co-occurrence of rewiring and recolouring interactions on ESC differentiation was significantly more common than expected at random (Figure 6B, p-value<0.001, Fisher’s exact test). Specifically, new interactions that were gained by NECs preferentially acquired the active state, or transitioned from the background to repressed state (Figure 6B, bottom row). Interactions that were lost on ESC differentiation were enriched for PIRs that transitioned from the active to poised/Polycomb-associated states, as well as for those switching to the background state (Figure 6B, middle row). Notably, the vast majority of rewiring events (99.7%) were not associated with larger-scale A/B compartment dynamics (not shown). Together, these observations indicate that lineage commitment associates with concerted changes in the connectivity and chromatin state of regulatory regions.

Interactions at NR2F1 (Figure 6C) and ZSCAN2 (Figure 6—figure supplement 1A) exemplify the preferential co-occurrence of rewiring and recolouring events, with interactions present in the cell type in which the respective PIR is in the active state. However, we also found examples of PIR rewiring that showed unchanged chromatin states in both cell types, such as those at the JAG1 and HAPL3 genes (Figure 6D and Figure 6—figure supplement 1B). Finally, we observed that 25% of PIRs that were retained in both cell types undergo chromatin state recolouring (Figure 6E, red segment). These regions included, for example, PIRs associated with the IRX3 and RAB3B promoters (Figure 6F and Figure 6—figure supplement 1C).

We asked how rewiring and recolouring events at PIRs contribute to gene expression dynamics on ESC to NEC differentiation. We found that the loss or gain of interactions with active-state PIRs associated significantly with changes in gene expression (Figure 6G), suggesting their functional contribution to transcriptional control. Notably, gene expression changes were detected at retained and recoloured PIRs (Figure 6G, left panel), and also when active-state PIRs were lost or gained through rewiring (Figure 6G, right panel).

Taken together, these findings demonstrate that chromatin state changes and rewiring of interactions at PIRs contribute jointly to transcriptional regulation. Furthermore, our results show that promoter interaction dynamics preferentially co-occur with chromatin state dynamics upon cell lineage commitment.

Discussion

A comprehensive atlas of promoter interactions

Our study presents an atlas of promoter interactions in human pluripotent and early lineage-committed cells, and offers new insights into the association between genome organisation and gene regulation. The high resolution of PCHi-C has enabled us to detect individual promoter-associated loops at a single restriction enzyme fragment resolution. We find that promoter-interacting regions in both cell types harbour multitudes of previously known and putative enhancer elements, which we link with their physically associated target genes. While the identified connections are predictive of regulatory relationships, it is important to note that the current data are correlative and will require functional validation using targeted genetic approaches and reporter assays. In addition to active enhancers, we find extensive promoter connectivity to regions associated with Polycomb-mediated repression and poising, reinforcing the role of Polycomb-group proteins in controlling chromosomal topology at transcriptionally inactive genes (Entrevan et al., 2016; Li et al., 2015; Schoenfelder et al., 2015b; Vieux-Rochas et al., 2015). Consistent with previous observations (Sanyal et al., 2012), we also detect large numbers of interactions between promoters of both active and inactive genes, and regions devoid of chromatin features. It is possible that such interactions are structural, rather than play gene regulatory roles. However, a regulatory function for some ‘unmarked’ PIRs also cannot be ruled out as recent mutagenesis experiments have identified functional elements that lack 'classic' chromatin annotations (Pradeepa et al., 2016; Rajagopal et al., 2016).

Reconfiguration of cis-regulatory units upon lineage commitment

The high-resolution promoter-interaction information has enabled the identification of cis-regulatory units (CRUs) as sets of interactions connected to the same promoter. Taking the view of CRUs, we consider jointly the dynamics of chromatin states and interactions as ESCs differentiate, and assess the potential contribution of these processes to changes in gene expression during development. We observe that CRUs reconfigure extensively upon cell differentiation. These include the CRUs of ESC- and NEC-specific genes, for which CRU reconfiguration associates with transcriptional changes upon differentiation, as well as the CRUs of genes that are not expressed in either cell type, consistent with the model of progressive chromatin changes at lineage-inappropriate genes during lineage commitment (Spivakov and Fisher, 2007).

Previous studies on candidate loci have proposed that cell-state changes are associated with two predominant models of enhancer–promoter interaction dynamic that have been termed instructive and permissive (de Laat and Duboule, 2013). Instructive interactions are established de novo upon cell state change and are concomitant with target gene activity. In contrast, permissive interactions are already in place before the gene activation occurs and may therefore contribute to enhancer priming. Examples of instructive and permissive interactions have been described in pluripotent cell reprogramming and differentiation (Apostolou et al., 2013; Denholtz et al., 2013; Phillips-Cremins et al., 2013; Wei et al., 2013; Zhang et al., 2013), but little was known on a global scale about which model of enhancer-promoter interactions is predominant during lineage commitment. Here, we show that developmental changes at CRUs involve both the rewiring of ‘instructive’ interactions and the recolouring of the chromatin state of ‘permissive’ interacting regions. Notably, we find that these two processes tend to occur hand in hand, with the strongest association occurring between cell-type-specific promoter interactions and the active state of the respective PIRs. Importantly, CRU chromatin dynamics (at both rewired and preformed interactions) associates with consistent changes in gene expression, suggesting that both mechanisms are functionally important in mediating lineage-specific transcriptional programmes. The exact determinants of ‘permissive’ versus ‘instructive’ interactions remain to be elucidated and may depend on the identity of cis-acting factors recruited to the regulatory regions, as well as on local chromatin environments.

Implications for developmental gene regulation by multiple enhancers

The CRU view provides an opportunity to consider multi-modular gene regulation in early human development that has hitherto been studied on a limited number of genes, predominantly in model organisms (Barolo, 2012; Cannavò et al., 2016; Hong et al., 2008). The ‘single-state’ architecture that we detect at the majority of CRUs is in line with observations of ‘shadow enhancers’ with overlapping activities in Drosophila (Hong et al., 2008). It has been suggested that this cis-regulatory organisation ensures the robustness of gene regulation and can buffer the effects of deleterious sequence variation, as well as providing opportunities for evolutionary innovation (Barolo, 2012; Cannavò et al., 2016; Hong et al., 2008; Perry et al., 2010).

‘Dual-state’ CRUs, although representing a relative minority of the CRUs we analysed, offer additional insights into signal integration at promoters. Specifically, the fact that the chromatin state of the promoter largely associates with the predominant chromatin state of the connected PIRs suggests that promoters may integrate signals from remote elements based on ‘majority vote logic’. Mechanistically, this logic may be a consequence of largely independent enhancer action (potentially at both single-state and dual-state CRUs) that is consistent with the ‘hit-and-run’ model of transcriptional regulation (Schaffner, 1988; Varala et al., 2015), and provides a flexible way to fine-tune the expression of multi-enhancer genes (Guerrero et al., 2010; Lagha et al., 2012). However, this model also does not preclude the possibility that promoter chromatin states at ‘dual-state’ CRUs undergo a continuous turnover depending on the state of the PIR they contact. In this case, the observed ‘majority-vote’ promoter chromatin states would correspond to the predominant state detected at the population level.

Theoretically, the generally independent enhancer action also enables activation signals from individual elements to quantitatively ‘add up’ (at least to some extent) to promote stronger transcriptional outputs (Arnold et al., 2013; Bothma et al., 2015; Lam et al., 2015; Spivakov, 2014). Our observation that the resolution of dual-state CRUs toward a uniformly active state generally results in increased expression (and vice versa) supports this model. However, analyses in Drosophila have identified exceptions to additive enhancer activity (Bothma et al., 2015) and have provided examples of enhancers that activate more than one promoter in a coordinated fashion (Fukaya et al., 2016), which is not immediately expected from the ‘hit-and-run’ looping model. Finally, there is also a possibility that multiple enhancers are jointly engaged in ‘chromatin hubs’ with promoters, rather than acting individually (Hanscombe et al., 1991; Jiang et al., 2016; Patrinos et al., 2004; Tolhuis et al., 2002; Wijgerde et al., 1995). These mechanistic questions go beyond the capabilities of Hi-C-based analyses of cell populations, and as such it is possible that multiple promoter interactions detected within a CRU take place either concurrently or simultaneously. The emerging studies at the single-molecule level (such as [Bartman et al., 2016; Fukaya et al., 2016]) will likely shed further light on the molecular mechanisms that underpin the principles of CRU organisation.

Promoter – enhancer interactions are vitally important for gene regulation and their disruption may lead to pronounced developmental abnormalities (Epstein, 2009). The high-resolution resource of the promoter-interaction landscape in pluripotent and early lineage-committed cells presented here, therefore, provides a stepping stone to understanding the logic of gene regulation and its aberrations during human embryogenesis.

Materials and methods

Cell culture

ESCs (H9/WA09; obtained from WiCell (Madison, WI); RRID:CVCL_9773) were cultured at 37°C in 5% CO2 in air in Pluripro media and matrix (Cell Guidance Systems (Cambridge, UK)). Authentication of ESCs was achieved by confirming the expression of pluripotency genes and protein markers, and by SNP analysis of sequencing data. ESCs were routinely verified as mycoplasma-free using a PCR-based assay (Sigma (St. Louis, MO)). The H9/WA09 line is not on the list of commonly misidentified cell lines (International Cell Line Authentication Committee). ESCs were differentiated into NECs using a previously described protocol (Rada-Iglesias et al., 2011) and samples were harvested on day 7.

Flow cytometry

Following dissociation with accutase, ESCs and NECs were stained on ice for 45 min with CD326-AF647 (BioLegend (London, UK), Cat# 324212, RRID:AB_756086; 5 µL per million cells) and CD56-PE (BD Biosciences (San Jose, CA), Cat# 345812, RRID:AB_2629216; 20 µL per million cells) antibodies in 100 µl PBS containing 2% FBS. After washing, DAPI was included at a final concentration of 5 µl/mL for live/dead cell discrimination, and flow cytometry analysis was performed using a BD LSRFortessa with subsequent data analysis using FlowJo V10.1.

Hi-C and promoter capture Hi-C (CHi-C)

Hi-C and Promoter CHi-C libraries were generated essentially as described (Mifsud et al., 2015; Schoenfelder et al., 2015a), with minor modifications. 3 to 4 × 107 cells (ESCs or NECs) were fixed in 2% formaldehyde (Agar Scientific (Stansted, UK)) for 10 min, after which the reaction was quenched with ice-cold glycine (0.125 M final concentration). Cells were collected by centrifugation (400 x g for 10 min at 4°C), and washed once with PBS (50 ml). After another centrifugation step (400 x g for 10 min at 4°C), the supernatant was completely removed and the cell pellets were immediately frozen in liquid nitrogen and stored at −80°C. After thawing, the cell pellets were incubated in 50 ml ice-cold lysis buffer (10 mM Tris-HCl pH 8, 10 mM NaCl, 0.2% Igepal CA-630, protease inhibitor cocktail (Roche (Basel, Switzerland)) for 30 min on ice. After centrifugation to pellet the cell nuclei (650 x g for 5 min at 4°C), nuclei were washed once with 1.25 x NEBuffer 2. The nuclei were then resuspended in 1.25 x NEBuffer 2, SDS was added (0.3% final concentration) and the nuclei were incubated at 37°C for 1 hr with agitation (950 rpm). Triton X-100 was added to a final concentration of 1.7% and the nuclei were incubated at 37°C for 1 hr with agitation (950 rpm). Restriction digest was performed overnight at 37°C with agitation (950 rpm) with HindIII (NEB; 1500 units per 7 million cells). Using biotin-14-dATP (Life Technologies (Carlsbad, CA)), dCTP, dGTP and dTTP (all at a final concentration of 30 µM), the HindIII restriction sites were then filled in with Klenow (NEB (Ipswich, MA)) for 75 min at 37°C. After addition of SDS (1.42% final concentration) and incubation at 65°C with agitation (950 rpm) for 20 min, ligation was performed for 4 hr at 16°C (50 units T4 DNA ligase (Life Technologies) per 7 million cells starting material) in a total volume of 8.2 ml ligation buffer (50 mM Tris-HCl, 10 mM MgCl2, 1 mM ATP, 10 mM DTT, 100 µg/ml BSA, 0.9% Triton X-100) per 7 million cells starting material. After ligation, reverse crosslinking (65°C overnight in the presence of Proteinase K (Roche)) was followed by RNase A (Roche) treatment and two sequential phenol/chloroform extractions. After DNA precipitation (sodium acetate 3 M pH 5.2 (1/10 volume) and ethanol (2.5 x volumes)) overnight at −20°C, the DNA was spun down (centrifugation 3200 x g for 30 min at 4°C). The pellets were resuspended in 400 µl TLE (10 mM Tris-HCl pH 8.0; 0.1 mM EDTA), and transferred to 1.5 ml eppendorf tubes. After another phenol/chloroform extraction and DNA precipitation overnight at −20°C, the pellets were washed three times with 70% ethanol, and the DNA concentration was determined using Quant-iT Pico Green (Life Technologies). The efficiency of biotin incorporation was assayed by amplifying a 3C ligation product (primers available upon request), followed by digest with HindIII or NheI.

To remove biotin from non-ligated fragment ends, 40 µg of Hi-C library DNA were incubated with T4 DNA polymerase (NEB) for 4 hr at 20°C, followed by phenol/chloroform purification and DNA precipitation overnight at −20°C. After a wash with 70% ethanol, sonication was carried out to generate DNA fragments with a size peak around 400 bp (Covaris E220 settings: duty factor: 10%; peak incident power: 140W; cycles per burst: 200; time: 55 s). After end repair (T4 DNA polymerase, T4 DNA polynucleotide kinase, Klenow (all NEB) in the presence of dNTPs in ligation buffer (NEB)) for 30 min at room temperature, the DNA was purified (Qiagen (Hilden, Germany) PCR purification kit). dATP was added with Klenow exo- (NEB) for 30 min at 37°C, after which the enzyme was heat-inactivated (20 min at 65°C). A double size selection using AMPure XP beads (Beckman Coulter, Brea, CA) was performed: first, the ratio of AMPure XP beads solution volume to DNA sample volume was adjusted to 0.6:1. After incubation for 15 min at room temperature, the sample was transferred to a magnetic separator (DynaMag-2 magnet; Life Technologies), and the supernatant was transferred to a new eppendorf tube, while the beads were discarded. The ratio of AMPure XP beads solution volume to DNA sample volume was then adjusted to 0.9:1 final. After incubation for 15 min at room temperature, the sample was transferred to a magnet (DynaMag-2 magnet; Life Technologies). Following two washes with 70% ethanol, the DNA was eluted in 100 µl of TLE (10 mM Tris-HCl pH 8.0; 0.1 mM EDTA). Biotinylated ligation products were isolated using MyOne Streptavidin C1 Dynabeads (Life Technologies) on a DynaMag-2 magnet (Life Technologies) in binding buffer (5 mM Tris pH8, 0.5 mM EDTA, 1 M NaCl) for 30 min at room temperature. After two washes in binding buffer and one wash in ligation buffer (NEB), PE adapters (Illumina, San Diego, CA) were ligated onto Hi-C ligation products bound to streptavidin beads for 2 hr at room temperature (T4 DNA ligase NEB, in ligation buffer, slowly rotating). After washing twice with wash buffer (5 mM Tris, 0.5 mM EDTA, 1 M NaCl, 0.05% Tween-20) and then once with binding buffer, the DNA-bound beads were resuspended in a final volume of 90 µl NEBuffer 2. Bead-bound Hi-C DNA was amplified with seven PCR amplification cycles using PE PCR 1.0 and PE PCR 2.0 primers (Illumina). After PCR amplification, the Hi-C libraries were purified with AMPure XP beads (Beckman Coulter). The concentration of the Hi-C libraries was determined by Bioanalyzer profiles (Agilent Technologies, Santa Clara, CA) and qPCR (Kapa Biosystems (Wilmington, MA)), and the Hi-C libraries were paired-end sequenced (HiSeq 1000, Illumina) at the Babraham Institute Sequencing Facility.

For Promoter Capture Hi-C, 500 ng of Hi-C library DNA was resuspended in 3.6 µl H2O, and custom hybridization blockers (Agilent Technologies) were added to the Hi-C DNA. Hybridization buffers and the custom-made RNA capture bait system (Agilent Technologies; designed as previously described (Mifsud et al., 2015): 37,608 biotinylated RNAs targeting the ends of 22,076 promoter-containing HindIII restriction fragments) were prepared according to the manufacturer’s instructions (SureSelect Target Enrichment, Agilent Technologies). The Hi-C library DNA was denatured for 5 min at 95°C, and then incubated with hybridization buffer and the RNA capture bait system at 65°C. After 24 hr incubation at 65°C, biotin/streptavidin pulldown (MyOne Streptavidin T1 Dynabeads; Life Technologies) and washes were performed according to the SureSelect Target enrichment protocol (Agilent Technologies). After the final wash, the beads were resuspended in 30 µl NEBuffer 2. After a post-capture PCR (four amplification cycles using Illumina PE PCR 1.0 and PE PCR 2.0 primers), the Promoter CHi-C libraries were purified with AMPure XP beads (Beckman Coulter). The concentration of the Promoter CHi-C libraries was determined by Bioanalyzer profiles (Agilent Technologies) and qPCR (Kapa Biosystems), and the Promoter CHi-C libraries were paired-end sequenced (HiSeq 1000, Illumina) at the Babraham Institute Sequencing Facility.

Hi-C analysis and the definition of TADs, TAD boundaries and compartments

Raw sequencing reads were processed using the HiCUP pipeline (Wingett et al., 2015), which mapped sequencing read pairs against the human genome (GRCh37), filtered out experimental artifacts such as circularized reads and re-ligations, and removed all duplicate read pairs. The aligned Hi-C data were analysed using HOMER v4.7 (http://homer.salk.edu/homer/) (Heinz et al., 2010). Coverage- and distance-related correction factors of the binned data were computed at 25 kb and 250 kb resolutions, based on the iterative correction algorithm (Imakaev et al., 2012). TADs were identified based on directionality indices (Dixon et al., 2012) of Hi-C interactions 1 Mb upstream and downstream from a 25 kb sliding window every 5 kb steps, which were then smoothed using a running average over a ± 25 kb window. TADs were called between pairs of consecutive local maxima (TAD start) and minima (TAD end) of the smoothed directionality indices with a standard score difference (TAD ∆Z score) above 2.0, and the TAD ends were extended outward to the genomic bins with no directionality bias. These TAD definitions were used to compute the fraction of significant PCHi-C interactions falling inside TADs, alongside TADs reported by Dixon et al., 2015. To assess the strength of the TAD boundaries crossed by promoter interactions, we defined a TAD boundary strength score (TADB ∆Z score) as the difference between the smoothed directionality index values at the local maximum (end of the preceding TAD) and the local minimum (start of the following TAD). Defined this way, TADB ∆Z scores (unlike the TAD ∆Z scores) do not depend on the stringency of the opposite boundary of the respective TAD.

A/B compartments were called by computing the principal components of the distance- and coverage-corrected interaction profile correlation matrix at 250 kb resolution (Lieberman-Aiden et al., 2009). Positive values of the principal component were aligned with H3K4me3 ChIP-seq signals for H9 human ESCs (Rada-Iglesias et al., 2011). For chromosomes 4 and X, we used the second principal component instead of the first, as the first component described the preferential contact pattern within chromosome arms rather than compartments. The principal component values ranged from −42 to 42. To quantify the compartment changes of significant interactions, each side of the interaction was classified as A or B compartment if the principal component of its 250 kb bin was above 10 or below −10, respectively. Interactions falling within the 250 bins that had the principal component scores between −10 and 10 were considered as falling outside either compartment.

PCHi-C interaction calling

Interactions were called at the level of individual HindIII fragments using version 0.1.4 of the CHiCAGO pipeline (Cairns et al., 2016) based on two biological replicates for each cell type that were normalised and combined as part of the pipeline. CHiCAGO incorporates a convolution background model, which emcompasses the ‘Brownian’ (real, but expected interactions) and ‘technical’ (assay and sequencing artefacts) components, and a weighted multiple testing correction procedure trained on interaction distance. CHiCAGO interaction scores correspond to –log-transformed, weighted p-values for each fragment read pair. A score threshold of 12 was used (equivalent to a threshold of 5 in Chicago v1.0.0+ due to a soft-thresholding procedure introduced in this version). This threshold was chosen empirically based on balancing the enrichment for chromatin marks at PIRs with the overall number of detected interactions. Additionally, interactions with scores between 11 and 12 were included in the analysis if they scored above 12 in the other cell type.

RNA-sequencing libraries and analysis

Total RNA was extracted from ESCs and NECs using an RNeasy Mini Kit (Qiagen). Indexed mRNA-seq libraries were constructed from 500 ng total RNA using the Tru-Seq RNA Library Prep Kit v2 (Illumina). Library fragment size and concentration was determined using an Agilent Bioanalyzer 2100 and KAPA Library Quantification Kit (KAPA Biosystems). Samples were sequenced on an Illumina HiSeq as single-end libraries at the Babraham Institute Sequencing Facility. Reads were trimmed using trim galore (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) with default parameters to remove the standard Illumina adapter sequence. Reads were mapped to the GRCh37 assembly using tophat (Trapnell et al., 2009). BAM files were imported to Seqmonk (http://www.bioinformatics.babraham.ac.uk/projects/seqmonk/). Raw read counts per transcript were calculated using the RNA-seq quantitation pipeline on the Ensembl v70 gene set using non-directional counts. Differential analysis of gene expression was performed using the default settings in DESeq2 (Love et al., 2014) without independent filtering of the results. Differentially expressed genes were called at padj < 0.05 and log2 fold change above 1.5 or below −1.5.

ChIP data analysis and definition of chromatin states

The histone modification ChIP-seq data (H3K4me1, H3K4me3, H3K27ac and H3K27me3) for ESCs and NECs were from Rada-Iglesias et al., 2011, available in Gene Expression Omnibus under accession number GSE24447. Data were converted to GRCh37 using liftOver (Kent et al., 2002). CTCF ChIP-seq data were from ENCODE (ENCODE Project Consortium, 2012).

Chromatin segmentations were performed on the basis of multiple histone modification ChIP datasets using a Hidden Markov Model-based method implemented in ChromHMM (version 1.10) (Ernst and Kellis, 2012) with default settings. The segmentation was carried out jointly through providing ‘concatenated’ data for both cell types as input. The resulting 16 states were curated into four broad chromatin states based on analysing their enrichment for different histone marks (Figure 1—figure supplement 3A) as follows. States 1–6 characterised by the presence of H3K4me3 and/or H3K27ac, and the absence of H3K27me3, were labelled ‘active’; states 7–9 showing a combination of H3K4 methylation and H3K27me3 were labelled ‘poised’; state 10 showing H3K27me3 and no H3K4 methylation was labelled ‘Polycomb-associated’; states 14–16 showing no detectable signal for the four tested histone modifications were labelled ‘background’. In addition, two more curated states were defined, but not considered further: states 11–12 were characterised by a ‘mixed’ pattern of both H3K27ac and H3K27me3, which likely arose from a technical issue such as heterogeneity within the samples; state 13 characterised by H3K4me1 alone was classified as ‘intermediate enhancers’, but the fraction of PIRs bearing this signature (~1%) was too small to analyse them as an individual category. HindIII fragments in the human genome (including baits and PIRs) were then classified according to the chromatin states detected within them. When more than one chromatin state was present, classification was resolved in the following manner: (i) any functional state (e.g. active, poised, Polycomb-associated) was prioritised above background; (ii) active, poised and Polycomb-associated states were prioritised above intermediate; (iii) poised state was prioritised above the Polycomb-associated state; (iv) active state together with any inactive state (i.e. poised or Polycomb-associated) was labelled as mixed. Based on these heuristics, we assigned a single chromatin state (including the background state) to 81% of PIRs in both cell types.

Tissue-specific enhancer activity

Transgenic reporter assays for enhancer activity are described within the VISTA Enhancer Browser (Visel et al., 2007). The enhancer sequences from VISTA were overlapped with PIRs, and their putative target genes were defined according to the PCHi-C detected promoter-PIR interactions.

Integration with TADs and INs

HindIII fragments were mapped to TADs defined as described above and INs obtained from Ji et al., 2016. Baited fragments overlapping TAD boundaries, and those mapping outside INs, were excluded from respective analyses. For each CRU, the percentage of interactions that map within the same TAD or IN was calculated and these values were collected into 12 bins, The first and last bins contained the values of 0% and 100%, respectively, and the remaining bins contained all other values split into 10% intervals. These results were compared to 1000 random permutations of CRUs across all promoter fragments performed in a manner retaining the overall CRU structure.

Definition of CRU clusters

Each CRU was categorised according to the fraction of PIRs in the active, poised and Polycomb-associated state. These fractions were used for hierarchical clustering based on Euclidian distances (method='Euclidian’ in dist function in R) with the average agglomeration method (method='average’ in hclust function in R). PIRs assigned intermediate or mixed chromatin states did not contribute to the clustering procedure, in the latter case because the states of the regulatory elements interacting with target promoters within these PIRs are not identifiable. CRUs containing only mixed or intermediate PIRs were not included in the analysis.

Definition of retained and rewired interactions

False-negative rates associated with stringent signal thresholds drive down the observed overlap between conditions and may overestimate the proportion of cell-type-specific interactions. Therefore, we applied additional criteria to identify high-confidence subsets of rewired and retained interactions based on replicate-level CHiCAGO interaction calls. First, we required that rewired interactions have scores above 12 in both biological replicates of the same cell type, and below 12 in both replicates of the other cell type. We then binned the interactions satisfying these criteria into five groups of equal size according to their interaction scores in the merged samples. Interactions belonging to the top bin in one cell type and the bottom bin in the other cell type were considered as rewired. Interactions with scores above 12 in the two replicates of both cell types were considered as retained. Applying these criteria and filtering out interactions with PIRs in the mixed and intermediate states, we obtained high-confidence sets of 1258 retained and 1153 rewired interactions that were used in the analysis.

Data availability

Sequencing data have been deposited in Gene Expression Omnibus (GEO) with accession number GSE86821. Processed data including interaction peak calls in the WashU Genome Browser text format and RNA-seq raw read counts were deposited in the same GEO repository. CHiCAGO objects containing all detected interactions, ChromHMM segmentation data, DESeq2-processed RNA-seq data and the defitions of TADs have been made available through the Open Science Framework (http://osf.io/sdbg4).

References

  1. 1
  2. 2
  3. 3
  4. 4
  5. 5
  6. 6
  7. 7
  8. 8
  9. 9
  10. 10
  11. 11
  12. 12
    A BAC transgenic analysis of the Mrf4/Myf5 locus reveals interdigitated elements that control activation and maintenance of gene expression during muscle development
    1. JJ Carvajal
    2. D Cox
    3. D Summerbell
    4. PW Rigby
    (2001)
    Development 128:1857–1868.
  13. 13
  14. 14
  15. 15
  16. 16
  17. 17
  18. 18
  19. 19
  20. 20
  21. 21
  22. 22
    Cis-regulatory mutations in human disease
    1. DJ Epstein
    (2009)
    Briefings in Functional Genomics and Proteomics 8:310–316.
    https://doi.org/10.1093/bfgp/elp021
  23. 23
  24. 24
  25. 25
  26. 26
  27. 27
  28. 28
  29. 29
  30. 30
  31. 31
  32. 32
  33. 33
  34. 34
  35. 35
  36. 36
  37. 37
  38. 38
  39. 39
  40. 40
  41. 41
  42. 42
  43. 43
  44. 44
  45. 45
  46. 46
  47. 47
  48. 48
  49. 49
  50. 50
  51. 51
  52. 52
  53. 53
  54. 54
  55. 55
  56. 56
  57. 57
  58. 58
  59. 59
  60. 60
  61. 61
  62. 62
  63. 63
  64. 64
  65. 65
  66. 66
  67. 67
  68. 68
  69. 69
  70. 70
  71. 71
  72. 72
  73. 73
  74. 74
  75. 75
  76. 76
  77. 77
  78. 78
  79. 79
  80. 80
  81. 81
  82. 82
  83. 83
  84. 84
  85. 85
  86. 86
  87. 87
  88. 88

Decision letter

  1. Job Dekker
    Reviewing Editor; University of Massachusetts Medical School, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Global reorganisation of cis-regulatory units upon lineage commitment of human embryonic stem cells" for consideration by eLife. Your article has been reviewed by three peer reviewers, one of whom, Job Dekker (Reviewer#1), is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Fiona Watt as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Frank Alber (Reviewer #3).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission. In the course of discussion the reviewers and editors decided that this work would be more appropriately reconsidered as a Tools and Resources (TR) paper rather than a Research Article (RA). The work appears to be a valuable resource but the primary research results are not considered sufficiently novel to warrant publication as an RA. Nonetheless, we hope you are willing to go forward with this in the category of a TR paper.

Summary:

Freire-Pritchett et al. employ Capture Hi-C to detect chromatin interactions between promoters and distal elements in ES cells and in ESC derived neuroectodermal cells (NECs). The authors report significant interactions between many promoters and different distal elements in ESCs and NECs. Analysis of the chromatin state of these elements indicates that these are functional elements including (poised) enhancers. Further, the connectivity between promoters and distal elements differs between ESCs and NECs and this relates to the transcriptional status of the genes and chromatin status of the distal elements. The authors then define cis-regulatory units (CRUs) as promoters and their associated distal elements. Although many of the CRUs are contained within TADs and INs, as had been expected based on earlier studies, the authors report that many can extend beyond TAD and IN boundaries.

The main impact of this paper is to announce the availability of a large data set of enhancer-promoter contacts identified via capture Hi-C. Many of the experimental technologies and statistical analysis methods are established and rely on previously published methodologies. That said the reviewers raise concerns about specific aspects of the analysis. Further, many of the reported correlations are already known and the results confirm earlier studies (e.g. promoters interact with multiple distal elements, these elements coincide with cell type-specific enhancer-like elements and these connections are equally cell type-specific, and enriched for intra-TAD interactions). The main new aspect of this work is the fact that a genome-wide promoter-anchored interaction atlas is described, and this resource could be of interest to the community given that many study these cells.

Essential revisions:

Please address these main criticisms (more details are in the minor points section):

1) Many of the correlations between looping, gene expression and chromatin state are not novel. Focus on the new things, and present the data as a large dataset or resource for the community.

2) The reviewers raised issues related to the statistical analysis and computational methods to determine PIRs (and their hierarchical clustering), TADs, and CRUs (below). Please address all of them.

3) Related to the previous point, the reviewers raised concerns about the claim that CRUs are distinct from INs, TADs. The main point to address is how the methods to determine these features compare and whether lack of overlap can be due to sensitivity of the computational methods.

Minor points:

1) One concern is that the CHiCAGO method identifies significant interactions using only the general distance decay of interactions into account. It does not take into account that any pair of interactions between loci located within a TAD or IN, or between loci located in a similar compartment (A, B or subcompartment types) are generally higher than loci located in different TADs/INs/Compartments. As a result, a number of significant interactions are not necessarily "loops" (specific point-to-point interactions). This is not to say that these loci do not interact more frequently than expected given their genomic distance, but these interactions can reflect general higher order structures such as domains etc. This can explain why many interacting loci appear to not have chromatin marks: these can be "bystander" interactions that are the result of a nearby structural feature. I recommend that the authors repeat their analysis using a background model that takes domains into consideration.

2) The TAD calling procedure is quite simple. Is there any evidence that this procedure produces TAD calls of comparable quality to those produced by, for example, the HMM of Dixon et al. Nature 2012?

3) The TAD and IN comparison is unsatisfying. The main conclusion seems to be that CRUs are not particularly aligned with TADs and INs. This could be because the two phenomena are more or less independent of each other, or because the TADs, INs and CRUs are each defined in a fairly arbitrary fashion. If, for example, the simple TAD calling scheme employed here is not working well, then it would lead to the observed discordance.

4) The finding that CRUs can extend beyond TAD or IN boundaries is interesting, but the data is not sufficient to claim a new "feature of genome architecture". Further analyses are needed such as: when a CRU extends beyond a TAD boundary, is this a weak TAD boundary? It is well known that TADs are nested structures. Do CRUs remain within the larger nested domains? This seems to be the case for POUF2. Quantification of boundary strength is required, as it seems not inconsistent with the TAD/IN models of enhancer action if the authors find interactions across weak boundaries, but not strong boundaries. Also, are these interactions beyond the boundaries equally strong, or are they much lower in contact frequency

5) The main text claims that the promoter capture Hi-C data was processed to identify "significant cis-interactions." Looking at Methods, we learn that these contacts correspond to a threshold of 12, on a "log-transformed, weighted p-value" scale. This vague description makes it impossible to actually interpret the confidence associated with a threshold of 12, especially since we are told that the threshold corresponds to "a threshold of 5 in Chicago v1.0.0+ due to a soft-thresholding procedure introduced in this version." The choice of threshold must be fully justified and described. The text claims that Chicago uses an FDR control procedure, so the interactions that are presented here should be reported along with their FDR threshold.

6) The manuscript contains no details on precisely how ChromHMM was run. Details like software version numbers and any parameter settings should be given. Importantly, the text should specify whether the segmentation was carried out independently on each cell type, or jointly, and if the latter, whether the data tracks were "stacked" (i.e., 8 marks per position) or "concatenated" (4 marks per position, across two concatenated copies of the genome).

7) Details of the "curation" of the 16 inferred states should also be provided. When the 16 states were collapsed to four states, what was the basis for this collapsing? It seems strange that some states were eliminated at this stage because they contained multiple marks.

8) How are PIRs defined: at the level of individual restriction fragments, or are adjacent fragments that both score significant merged into a single PIR?

9) The authors do not show many tracks of real data (only in one of the supplemental figures). Throughout they only show arcs to indicate significant contacts. This makes it difficult to assess the quality of the data.

10) Promoters often interact with CTCF sites. Did the authors find this in their dataset as well?

11) The authors focus on cis-interactions. Can the authors describe trans interactions as well? Given that trans-regulation is considered rare, this analysis will provide context for interpretation of the cis data.

12) Did the authors detect any known enhancer-promoter pairs? Also, the authors imply that the distal elements that touch promoters are regulating these promoters, but beside correlations between interactions, transcription and chromatin state of the distal elements, no evidence for direct regulation is provided. This would require deletion of the elements, e.g. by CRISPR. The Vista analysis of the POUF2 interactions suggests that many PIRs are enhancers. That is similar in information as the observation that PIRs overlap loci with histone modifications associated with enhancers. The key thing to validate is that these enhancers indeed regulate the interacting promoter. The fact that the pattern of activity of these enhancers resembles POU3F2 expression is encouraging but not definitive. The authors should either provide direct evidence for functional relationships or clearly discuss the extent to which the current data predicts such relationships even though the data currently is correlative.

13) The authors examine the dynamic rewiring and recoloring of interactions as cells differentiate from ESCs to NECs. They report gain and loss of interactions and these are related to gain or loss of chromatin marks. This is interesting, but what needs to be tested is whether this is related compartment changes (see above). This is not to say these interactions are not relevant, but it would be important to determine how they relate to larger scale compartment changes. One interesting implication could be that in fact cell type-specific compartment changes are driven by altered interactions between functional elements that are active in a cell type-specific manner.

14) How do the authors interpret interactions that involve non-expressed genes?

15) The hierarchical clustering of PIRs by the prevalence of different chromatin state labels is fairly uninformative. The fact that these different categories exist in the data is not surprising, and the claim that these eight labels somehow have "potential implications for understanding the logic of signal integration at promoters" is not well supported.

16) In the Abstract, the sentence "Here, we generate[…]" should mention what assay was employed.

17).In the Introduction: "The extreme combinatorial complexity of Hi-C[…]" The complexity is really only quadratic, not combinatorial.

18). In the Results section: The URL for data availability should be included at the end of the sentence, "This data resource[…]"

19) Also in the Results section: "NEC PIRs were strongly enriched" This claim requires statistical support.

20) New paragraph at "We next sought[…]"

21) Two of the observations offered in the Discussion do not seem to be supported by direct evidence in Results: "we find extensive promoter connectivity to regions associated with Polycomb-associated repression and poising, and "we[…] detect large numbers of promoter interactions with regions devoid of chromatin features." These observations seem to be offered for the first time in Discussion.

22) The work does not show whether all the interactions are present simultaneously in a cell, or if the promoter state may vary between cells, depending on the probability with which the promoter is interacting with any of the potential alternative enhancers of a given chromatin state. Please discuss this issue in the main text.

23) Out of the ~21000 promoters, the authors focused their CRU analysis on only 16,000 protein-coding genes. Would the authors expect differences in the outcome if the other remaining promoters would be considered?

24) Out of the 16,000 protein-coding promoters about 9000 were defined as CRU. It may be good to know the selection criteria (i.e. is it based on a minimum number of PIRs per promoter?).

25) Figure 3B defines a CRU as a set of PIRs that seem to be concurrently interacting with the promoter in the same cell. Maybe the authors could indicate in the caption that it cannot be ruled out that some PIRs may provide alternative rather then concurrent interactions.

https://doi.org/10.7554/eLife.21926.031

Author response

Essential revisions:

Please address these main criticisms (more details are in the minor points section):

1) Many of the correlations between looping, gene expression and chromatin state are not novel. Focus on the new things, and present the data as a large dataset or resource for the community.

We thank the editors and reviewers for their constructive and thorough feedback on our study. We are happy for our work to be reconsidered as a Tools and Resources paper and we have taken on board the advice to focus more on the resource aspect of our study. We have achieved this, in part, by describing the large dataset more fully, adding further methodological details and support, and adopting a more cautious posture regarding the correlations between interactions and chromatin state.

2) The reviewers raised issues related to the statistical analysis and computational methods to determine PIRs (and their hierarchical clustering), TADs, and CRUs (below). Please address all of them.

3) Related to the previous point, the reviewers raised concerns about the claim that CRUs are distinct from INs, TADs. The main point to address is how the methods to determine these features compare and whether lack of overlap can be due to sensitivity of the computational methods.

We thank the editors and reviewers for their helpful feedback on these two points. We address these criticisms in our responses below.

Minor points:

1) One concern is that the ChICAGO method identifies significant interactions using only the general distance decay of interactions into account. It does not take into account that any pair of interactions between loci located within a TAD or IN, or between loci located in a similar compartment (A, B or subcompartment types) are generally higher than loci located in different TADs/INs/Compartments. As a result, a number of significant interactions are not necessarily "loops" (specific point-to-point interactions). This is not to say that these loci do not interact more frequently than expected given their genomic distance, but these interactions can reflect general higher order structures such as domains etc. This can explain why many interacting loci appear to not have chromatin marks: these can be "bystander" interactions that are the result of a nearby structural feature. I recommend that the authors repeat their analysis using a background model that takes domains into consideration.

The reviewers are correct to point out that CHiCAGO’s domain-unaware model somewhat overestimates the background read count (i.e. that expected at random) for TAD-crossing interactions (Author response image 1A). As a result of this, CHiCAGO is relatively more stringent (conservative) in calling TAD-crossing interactions than within-TAD interactions. In practice, the detected interactions are likely to be of comparable strength, as evidenced by the very similar read counts at the detected within-TAD and TAD-crossing interactions at a given interaction distance range (Author response image 1B). Therefore, the key observation of our study that TADs constrain, but do not fully insulate, promoter interactions is unaffected: the TAD-unaware CHiCAGO background only pushes us to err on the side of caution in detecting TAD-crossing interactions.

In the future, it would indeed be worthwhile to incorporate domain information into the CHiCAGO background model. However, this is a complex and challenging task. One of the challenges that need to be addressed, for example, is that the background depends not just on whether an interaction crosses a TAD boundary, but also on TAD boundary stringency, and does so in a way that scales non-linearly with distance (Author response image 1A). It is also likely that the background is affected by the number of crossed boundaries. Modelling all of these dependencies in the data must be performed with great care to avoid the danger of overfitting, which could lead to circular reasoning when attempting to make inferences about TADs.

Since the identified TAD-crossing interactions are at least no weaker than the within-TAD interactions, and since implementing TAD-aware interaction calling is likely a challenging project of its own, we feel that addressing this point in full goes outside the scope of this study.

Author response image 1
CHiCAGO detection of cross-TAD and within-TAD interactions.

(A) The effect of TAD boundary crossing on background levels. In line with the CHiCAGO background estimation procedure (see Additional file 1, Cairns et al., 2016), we used 20kb distance bins to tile 3Mb regions centred at each bait. For each bin and bait, we computed a normalized number of reads as the average number of reads linking fragments in this bin with the baited fragment, divided by the baited fragment’s bias factor. These values were then considered separately for bins mapping within the same TAD as the respective bait and for those separated from the bait by at least one TAD boundary. Solid lines show the median log-transformed normalised number of reads and ribbons show the interquartile range. It can be seen that the CHiCAGO background estimate (dashed line) is very close to the background estimate computed for within-TAD interactions separately, suggesting that within-TAD pairs primarily drive CHiCAGO’s background model. In contrast, bins that lie beyond a TAD boundary have lower background levels, consistent with the reviewers’ expectation. Furthermore, stronger TAD boundaries correspond to a larger decrease in background level, though this effect is nonlinear. Since the estimated background is higher than the true background, this increases the stringency of CHiCAGO calls for TAD-crossing interactions. (B) The read counts of significant interactions, stratified by whether they cross a TAD boundary or not, and by distance bin. It can be seen the read count distributions of the CHiCAGO-identified TAD-crossing and within-TAD interactions (CHiCAGO score > 12) show only minor differences.

https://doi.org/10.7554/eLife.21926.022
Author response image 2
Tuning a signal threshold for TAD detection with HOMER.

(A) Box plots show TAD length distributions for TADs called by HOMER using a range of TAD ∆Z score thresholds (blue to red boxes), as well as for the published TADs from Dixon et al., 2015 detected with individual and combined Hi-C replicates (green boxes). The dashed grey line marks the median TAD length of the combined Dixon et al., 2015 data. (B) Overlap with CTCF sites for TADs called by HOMER using a range of TAD ∆Z score thresholds (blue to red lines), and for the published TADs from Dixon et al., 2015. (green lines). Solid lines show the observed fraction of TAD boundaries mapping within given distances from CTCF sites, and dashed lines show the expected fraction using boundaries shifted by +1 Mb for each set of boundaries. (C) TAD boundary overlap between the TADs called with HOMER at ∆Z > 2.0 and the Dixon et al., 2015 et al. TADs (combined replicates). The bars show the numbers of HOMER TADs with zero, one or two TAD boundaries within 50 kb of a Dixon et al., 2015 TAD boundary, respectively. (D) A WashU browser view of the directionality index profile and the locations of TADs called in an example ~20 Mb region on chromosome 7. Top row: Directionality indices (shades of blue and red represent the negative and positive index ranges, respectively). Middle row (green): Dixon et al., 2015 TADs (combined replicates). Bottom row (blue): HOMER TADs called at TAD ∆Z > 2.0.

https://doi.org/10.7554/eLife.21926.023

2) The TAD calling procedure is quite simple. Is there any evidence that this procedure produces TAD calls of comparable quality to those produced by, for example, the HMM of Dixon et al. Nature 2012?

We agree that a comparison with Dixon et al., 2015 data is worthwhile and have now performed this analysis. This has revealed that the cutoff on HOMER boundary stringency scores that we initially used (∆Z>1) likely results in a more lenient TAD definition than the Dixon et al., 2015 HMM-based method, as evidenced by the higher numbers of detected TADs (4491 vs 3062) and smaller TAD sizes (Author response image 2A). Notably, the boundaries used in the initial submission showed a comparable enrichment for CTCF binding sites to Dixon et al., 2015 boundaries (Author response image 2B).

In the absence of a gold standard for TAD detection, it is difficult to judge which of the two methods is “more correct”. However, for consistency we now use a more stringent HOMER threshold (∆Z>2) that results in fewer TADs (2761 vs 4491 with ∆Z>1) and a comparable distribution of TAD lengths with the Dixon et al., 2015 TADs (Author response image 2A). More than 70% of our called TADs have at least one boundary that is consistently detected in the Dixon et al. study (Author response image 2C). To illustrate the boundaries called with both methods, we show their TAD calls for an example region in Author response image 2D.

The figures and the main text of the manuscript have been updated based on the new set of TAD calls. Using the more stringent set of TADs has resulted in fewer CRUs identified as crossing TAD boundaries (just over 20% vs ~40% in the initial set). However, this is primarily due to the larger sizes of the updated TADs rather than their higher ‘insulation power’. The ratio of observed-to-expected numbers of CRUs fully contained within TAD boundaries has in fact gone slightly down in the updated set (from 1.5-fold to 1.3-fold). We have also added the results for Dixon et al., 2015 TADs to Figure 3—figure supplement 1E, to enable a direct comparison. These data show that a similar proportion of CRUs cross Dixon et al., 2015 TAD boundaries (~27%) compared to those in our updated HOMER-defined set (~20%; ∆Z>2), demonstrating that the overall conclusion of this analysis is robust across at least two commonly used TAD-calling methods.

3) The TAD and IN comparison is unsatisfying. The main conclusion seems to be that CRUs are not particularly aligned with TADs and INs. This could be because the two phenomena are more or less independent of each other, or because the TADs, INs and CRUs are each defined in a fairly arbitrary fashion. If, for example, the simple TAD calling scheme employed here is not working well, then it would lead to the observed discordance.

We have now modified the TAD analysis as described above, which we believe addresses this point as far as TADs are concerned.

The IN calls have been taken verbatim from the study that introduced this concept (Ji et al., 2016). We noted, however, that the reported INs were often nested, so we have repeated the analysis using only the largest span of each overlapping set. As expected from the larger IN sizes, the numbers of IN-crossing CRUs went down somewhat, but still less than half of them (~45%) were fully contained within the extremities of IN boundaries (Figure 3—figure supplement 1I). Therefore, these results further strengthen a principal conclusion of our analysis that while INs constrain promoter interactions to some extent, they do not fully insulate them.

4) The finding that CRUs can extend beyond TAD or IN boundaries is interesting, but the data is not sufficient to claim a new "feature of genome architecture". Further analyses are needed such as: when a CRU extends beyond a TAD boundary, is this a weak TAD boundary? It is well known that TADs are nested structures. Do CRUs remain within the larger nested domains? This seems to be the case for POUF2. Quantification of boundary strength is required, as it seems not inconsistent with the TAD/IN models of enhancer action if the authors find interactions across weak boundaries, but not strong boundaries. Also, are these interactions beyond the boundaries equally strong, or are they much lower in contact frequency

To address this question, we plotted the distributions of HOMER strength scores for TAD boundaries that are crossed and not crossed by promoter interactions (Figure 3—figure supplement 1H). As can be seen, TAD boundaries crossed by interactions are indeed slightly weaker overall (Wilcoxon test p-value = 1.8e-14). However, the ranges fully overlap, and even the strongest TAD boundaries are not fully impenetrable to promoter interactions. We now also mention this analysis in the main text. Although these new findings have strengthened our analysis of CRUs, we take on board the reviewers’ concern about our interpretation of CRUs as new features of genome architecture, and have removed it from the manuscript.

As for the contact frequency of TAD-crossing and within-TAD interactions, it can be seen from Author response image 1B that it is very similar in both cases given the distance. In the short range, the identified TAD-crossing interactions even appear to have a slightly higher read count, most likely owing to the domain-unaware CHiCAGO background model, as discussed in our response to minor point 1.

Finally, we agree that empirically, TADs indeed represent nested structures, and tools for TAD calling in a hierarchical fashion are beginning to emerge (although neither the Dixon et al. method nor HOMER are among them). However, the conceptual disparity between the different levels of TAD “nesting” is still unclear. For example, from what level of the hierarchy should a sub-TAD be a considered a TAD, or a TAD be considered a “super-TAD”, etc.? It is possible to devise criteria that seem intuitively acceptable, but it will take time for the underlying assumptions of such criteria to acquire a solid structural and biophysical underpinning.

5) The main text claims that the promoter capture Hi-C data was processed to identify "significant cis-interactions." Looking at Methods, we learn that these contacts correspond to a threshold of 12, on a "log-transformed, weighted p-value" scale. This vague description makes it impossible to actually interpret the confidence associated with a threshold of 12, especially since we are told that the threshold corresponds to "a threshold of 5 in Chicago v1.0.0+ due to a soft-thresholding procedure introduced in this version." The choice of threshold must be fully justified and described. The text claims that Chicago uses an FDR control procedure, so the interactions that are presented here should be reported along with their FDR threshold.

We believe that, in part, this question has arisen because we made a confusing statement in the text. CHiCAGO’s weighted multiple testing treatment is not, strictly speaking, “a false discovery control procedure”, and we have now altered the main text to clarify this.

More generally, multiple testing correction when calling peaks in genomic data is a challenging problem, leading to the development of alternatives to standard approaches such as Bonferroni and FDR (such as, for example, IDR for ChIP-seq data). In PCHi-C data, an additional challenge is presented by undersampling, which leads to the violation of the underlying assumptions of both the IDR and conventional FDR approaches. In the case of FDR, this is because the majority of fragment pairs have only one or zero reads, leading to highly non-uniform p-value distributions. Furthermore, in PCHi-C data false discovery rates strongly vary with linear distance, partly due to the decreasing statistical power as distance increases. To account for this property, CHiCAGO adjusts the p-values using a weighting procedure, in which the weights trained on the reproducibility of interaction calls with distance (see Cairns et al., 2016 for details). Soft-thresholding was further introduced in CHiCAGO v1.0.0+ to shift the weighted -log-p-values (scores) such that the score of zero corresponds to the probability of a true interaction in the very short range given zero reads.

CHiCAGO scores are conceptually similar, albeit clearly not identical, to posterior probabilities resulting from an empirical Bayesian treatment. We view them primarily as a ranking measure, and use a heuristic approach based on integration with other types of genomic data to choose a threshold. Specifically, for a range of threshold settings, we compute the observed overlap of the identified PIRs with genomic regulatory features (such as specific histone marks) and the same overlap expected at random (estimated with random regions that are promoter distance-matched to the detected PIRs). In choosing a threshold, we balance the ‘unexpected’ overlap of PIRs with regulatory features with the total number of interactions detected at a given setting. As can be seen in Author response image 3A-B, at the selected threshold of 12 the relative difference in the overlap of PIRs with histone marks and CTCF versus random controls begins to saturate. Further increasing the threshold leads to a continued drop in the identified interactions (Author response image 3C-D) without a considerable further gain in the ‘unexpected’ overlap of PIRs with regulatory features.

We now explicitly mention the rationale for choosing the threshold in the Methods section. However, we agree that it is a crude and potentially subjective method. While varying the threshold within the reasonable range is unlikely to affect the principal conclusions of our study, we release the scores for each fragment pair as part of the Chicago objects on Open Science Framework, so the users can employ other methods of tuning the threshold, as well as apply non-threshold-based approaches to analysing our data.

Author response image 3
Properties of CHiCAGO interactions called at different score thresholds in ESCs and NECs.

(A and B) The relative difference between observed and expected numbers of PIRs overlapping with the ChIP-seq peaks of histone modifications and CTCF, detected at a range of CHiCAGO score thresholds in ESCs (A) and NECs (B). The expected numbers of PIRs overlapping with ChIP peaks were computed in the same way as in Figure 1C, as the average overlap obtained with 100 draws of promoter distance-matched random regions. (C and D) The fraction of fragment pairs with at least one PCHi-C read mapping to them that have CHiCAGO scores passing a given threshold in ESCs (C) and NECs (D).

https://doi.org/10.7554/eLife.21926.024

6) The manuscript contains no details on precisely how ChromHMM was run. Details like software version numbers and any parameter settings should be given. Importantly, the text should specify whether the segmentation was carried out independently on each cell type, or jointly, and if the latter, whether the data tracks were "stacked" (i.e., 8 marks per position) or "concatenated" (4 marks per position, across two concatenated copies of the genome).

We apologise for this omission and have now added full details on the chromHMM procedure to the Methods section. Briefly, the segmentation was carried out jointly, and the data tracks were concatenated.

7) Details of the "curation" of the 16 inferred states should also be provided. When the 16 states were collapsed to four states, what was the basis for this collapsing? It seems strange that some states were eliminated at this stage because they contained multiple marks.

We have added full details on how the chromHMM states were curated to the Methods section. Briefly, the states were curated into “Active”, “Poised”, “Polycomb-associated” and “Background” based on the presence of active (H3K4me1/H3K27ac) and repressive histone (H3K27me3) marks. Overall, we were able to unambiguously assign one of these chromatin states to 81% of PIRs in both cell types. Two more curated states were defined, but not considered further:

1) “intermediate enhancers” characterised by H3K4me1 alone; the fraction of PIRs bearing this signature (~1%) was too small to analyse them as an individual category, and yet we felt that merging them with any other category was not biologically appropriate;

2) the ambiguous (“mixed”) state that showed enrichment for both H3K27ac and H3K27me3; we felt that this conflicting pattern likely arose from a technical issue such as heterogeneity within the samples.

8) How are PIRs defined: at the level of individual restriction fragments, or are adjacent fragments that both score significant merged into a single PIR?

PIRs are defined at the level of individual restriction fragments, with no signal smoothing or pooling across multiple sites. We have now clarified this in the Methods section. We have opted against further peak processing, because at present there is no reliable way to “fine-map” interaction signals without making a priori assumptions about the “causal” regions driving the signals, which we feel still require further mechanistic support at this stage.

9) The authors do not show many tracks of real data (only in one of the supplemental figures). Throughout they only show arcs to indicate significant contacts. This makes it difficult to assess the quality of the data.

We now provide two additional figures (Figure 1—figure supplement 4 and Figure 4—figure supplement 3), in which we plot the raw data for all examples shown in this study.

10) Promoters often interact with CTCF sites. Did the authors find this in their dataset as well?

We have now incorporated this analysis. Indeed, PIRs are highly enriched for CTCF binding sites, as we now show using ENCODE CTCF ChIP-seq data for hESCs in Figure 1—figure supplement 2B, and describe in subsection “Identification of putative regulatory elements and their associated gene promoters” of the main text.

11) The authors focus on cis-interactions. Can the authors describe trans interactions as well? Given that trans-regulation is considered rare, this analysis will provide context for interpretation of the cis data.

We now describe in the main text that we detect 338 trans-interactions in ESCs and 266 in NECs (representing <0.5% of total interactions). Consistent with our previous observations in other PCHi-C datasets (Cairns et al., 2016)(, a large proportion of trans-interactions (50.3% and 64.3%, respectively) reflect large-scale compartment signal that is detectable in conventional Hi-C data at 1-Mb resolution for the same cell types (data not shown). Given the small numbers of detected trans-interactions, we felt we should focus on cis-interactions in the manuscript. However, we agree that trans-interactions may be useful for future reanalyses of our data, and report them in the data files released on Open Science Framework.

12) Did the authors detect any known enhancer-promoter pairs? Also, the authors imply that the distal elements that touch promoters are regulating these promoters, but beside correlations between interactions, transcription and chromatin state of the distal elements, no evidence for direct regulation is provided. This would require deletion of the elements, e.g. by CRISPR. The Vista analysis of the POUF2 interactions suggests that many PIRs are enhancers. That is similar in information as the observation that PIRs overlap loci with histone modifications associated with enhancers. The key thing to validate is that these enhancers indeed regulate the interacting promoter. The fact that the pattern of activity of these enhancers resembles POU3F2 expression is encouraging but not definitive. The authors should either provide direct evidence for functional relationships or clearly discuss the extent to which the current data predicts such relationships even though the data currently is correlative.

This is an important point. We have added new text in the Discussion to highlight that our data set is predictive of regulatory relationships, but that extensive functional validation is required to infer direct regulation. In the absence of a comprehensive “VISTA-style” catalog of functionally validated enhancer-promoter relationships, we hope that large-scale resources such as ours will provide a strong impetus for future validation studies.

13) The authors examine the dynamic rewiring and recoloring of interactions as cells differentiate from ESCs to NECs. They report gain and loss of interactions and these are related to gain or loss of chromatin marks. This is interesting, but what needs to be tested is whether this is related compartment changes (see above). This is not to say these interactions are not relevant, but it would be important to determine how they relate to larger scale compartment changes. One interesting implication could be that in fact cell type-specific compartment changes are driven by altered interactions between functional elements that are active in a cell type-specific manner.

We have now done this analysis. While ~2% of 1-Mb regions transition between A and B compartments upon differentiation (in either direction), nearly all high-confidence rewired interactions detected in our data (>99%) map outside of these “dynamic” regions. We now mention this observation in the main text.

14) How do the authors interpret interactions that involve non-expressed genes?

This is an interesting point, which we now present more extensively in the Discussion. There are several possibilities to interpret interactions with non-expressed gene promoters. For example, it is known that at least in mouse ESCs, inactive genes and their regulatory regions form extensive topological networks (see, e.g., Schoenfelder et al., 2015b ). However, we do not intend to imply that all promoter interactions have a gene regulatory nature: some of them are potentially structural. Furthermore, additional and not yet fully understood mechanisms of topologically-mediated gene repression and priming may also be at play, particularly given the fact that a number of regions unmarked by conventional chromatin signatures currently emerge as having roles in gene regulation (Pradeepa et al., 2016; Rajagopal et al., 2016).

15) The hierarchical clustering of PIRs by the prevalence of different chromatin state labels is fairly uninformative. The fact that these different categories exist in the data is not surprising, and the claim that these eight labels somehow have "potential implications for understanding the logic of signal integration at promoters" is not well supported.

We agree that the majority of CRUs having consistent chromatin marks between PIRs and the promoter is relatively unsurprising. However, we feel that the classification we provide will be helpful for the users of our resource to make the most out of our data, and in particular to select targets for candidate-based studies. Additionally, our hierarchical clustering approach has been instrumental for detecting the relatively large numbers of ‘inconsistent’ CRUs, leading to the observation that the chromatin states at their promoters generally follow the ‘majority rule’ with respect to the chromatin states of the PIRs. We believe that these findings lead to specific and testable mechanistic hypotheses, and are therefore also interesting in this respect. Nevertheless, we concede that our claim about “potential implications[…]” was perhaps too far-fetched, so we have removed this sentence from the manuscript.

16) In the Abstract, the sentence "Here, we generate[…]" should mention what assay was employed.

This has now been done.

17) In the Introduction: "The extreme combinatorial complexity of Hi-C[…]" The complexity is really only quadratic, not combinatorial.

We have removed the words “extreme combinatorial” from this sentence to avoid confusion.

18) In the Results section: The URL for data availability should be included at the end of the sentence, "This data resource[…]"

We have now inserted a new sentence with details on the availability of raw and processed data.

19) Also in the Results section: "NEC PIRs were strongly enriched" This claim requires statistical support.

In the initial submission, the enrichment z-scores were given in Figure 2—figure supplement 1A, but not mentioned in the main text. We have now modified the text to rectify this.

20) New paragraph at "We next sought.[…]"

We have amended this.

21) Two of the observations offered in the Discussion do not seem to be supported by direct evidence in Results: "we find extensive promoter connectivity to regions associated with Polycomb-associated repression and poising, and "we.… detect large numbers of promoter interactions with regions devoid of chromatin features." These observations seem to be offered for the first time in Discussion.

Thank you for drawing our attention to this. We have now added details on these two points into the second section of Results (“Identification of putative regulatory elements and their associated gene promoters”).

22) The work does not show whether all the interactions are present simultaneously in a cell, or if the promoter state may vary between cells, depending on the probability with which the promoter is interacting with any of the potential alternative enhancers of a given chromatin state. Please discuss this issue in the main text.

This is a very good point. Indeed, Hi-C-based methods cannot distinguish between these two possibilities, and there is currently evidence in support of both the “hit-and-run” and the “chromatin hub” models. We discuss these topics in the section “Implications for developmental gene regulation by multiple enhancers”, which we have now expanded to explicitly outline these points.

23) Out of the ~21000 promoters, the authors focused their CRU analysis on only 16,000 protein-coding genes. Would the authors expect differences in the outcome if the other remaining promoters would be considered?

We now show the results for all captured promoters in Figures 1C and D, and have updated the main text accordingly. As expected, PIR enrichment for chromatin marks and the consistency between the promoter and PIR chromatin states are nearly identical when computed for protein-coding versus all captured promoters. Because of the level of similarity, we would also not expect any meaningful differences at the CRU level for all promoters compared to the currently analysed set of protein-coding genes. We would prefer to focus our analysis on protein-coding CRUs, however, as our mRNA-seq data interrogates only coding transcripts.

24) Out of the 16,000 protein-coding promoters about 9000 were defined as CRU. It may be good to know the selection criteria (i.e. is it based on a minimum number of PIRs per promoter?).

We apologise for not explaining this clearly. The ~9,000 figure includes all CRUs that have at least one PIR per promoter. We have updated the main text to explicitly state this fact.

25) Figure 3B defines a CRU as a set of PIRs that seem to be concurrently interacting with the promoter in the same cell. Maybe the authors could indicate in the caption that it cannot be ruled out that some PIRs may provide alternative rather then concurrent interactions.

We agree that this is a good idea and have amended the legend of Figure 3B accordingly.

https://doi.org/10.7554/eLife.21926.032

Article and author information

Author details

  1. Paula Freire-Pritchett

    Nuclear Dynamics Programme, Babraham Institute, Cambridge, United Kingdom
    Contribution
    PF-P, Conceptualization, Resources, Data curation, Formal analysis, Validation, Visualization, Methodology, Writing—original draft, Writing—review and editing
    Contributed equally with
    Stefan Schoenfelder
    Competing interests
    The authors declare that no competing interests exist.
  2. Stefan Schoenfelder

    Nuclear Dynamics Programme, Babraham Institute, Cambridge, United Kingdom
    Contribution
    SS, Conceptualization, Resources, Investigation, Visualization, Writing—original draft
    Contributed equally with
    Paula Freire-Pritchett
    Competing interests
    The authors declare that no competing interests exist.
  3. Csilla Várnai

    Nuclear Dynamics Programme, Babraham Institute, Cambridge, United Kingdom
    Contribution
    CV, Resources, Formal analysis, Methodology, Writing—review and editing
    Competing interests
    The authors declare that no competing interests exist.
  4. Steven W Wingett

    Nuclear Dynamics Programme, Babraham Institute, Cambridge, United Kingdom
    Contribution
    SWW, Data curation, Methodology
    Competing interests
    The authors declare that no competing interests exist.
  5. Jonathan Cairns

    Nuclear Dynamics Programme, Babraham Institute, Cambridge, United Kingdom
    Contribution
    JC, Formal analysis, Methodology, Writing—review and editing
    Competing interests
    The authors declare that no competing interests exist.
  6. Amanda J Collier

    1. Epigenetics Programme, Babraham Institute, Cambridge, United Kingdom
    2. Wellcome Trust – Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
    Contribution
    AJC, Validation, Investigation, Visualization
    Competing interests
    The authors declare that no competing interests exist.
  7. Raquel García-Vílchez

    Epigenetics Programme, Babraham Institute, Cambridge, United Kingdom
    Contribution
    RG-V, Validation, Investigation, Visualization
    Competing interests
    The authors declare that no competing interests exist.
  8. Mayra Furlan-Magaril

    Nuclear Dynamics Programme, Babraham Institute, Cambridge, United Kingdom
    Present address
    Instituto de Fisiología Celular, Universidad Nacional Autónoma de México, Mexico City, Mexico
    Contribution
    MF-M, Investigation, Methodology
    Competing interests
    The authors declare that no competing interests exist.
  9. Cameron S Osborne

    Department of Genetics and Molecular Medicine, King's College London School of Medicine, London, United Kingdom
    Contribution
    CSO, Conceptualization, Resources, Funding acquisition, Methodology, Writing—review and editing
    Competing interests
    The authors declare that no competing interests exist.
  10. Peter Fraser

    Nuclear Dynamics Programme, Babraham Institute, Cambridge, United Kingdom
    Contribution
    PF, Resources, Supervision, Funding acquisition, Investigation, Writing—review and editing
    Competing interests
    The authors declare that no competing interests exist.
  11. Peter J Rugg-Gunn

    1. Epigenetics Programme, Babraham Institute, Cambridge, United Kingdom
    2. Wellcome Trust – Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
    Contribution
    PJR-G, Conceptualization, Resources, Supervision, Funding acquisition, Investigation, Visualization, Methodology, Writing—original draft, Project administration, Writing—review and editing
    For correspondence
    peter.rugg-gunn@babraham.ac.uk
    Competing interests
    The authors declare that no competing interests exist.
  12. Mikhail Spivakov

    Nuclear Dynamics Programme, Babraham Institute, Cambridge, United Kingdom
    Contribution
    MS, Conceptualization, Resources, Formal analysis, Supervision, Funding acquisition, Validation, Methodology, Writing—original draft, Project administration, Writing—review and editing
    For correspondence
    mikhail.spivakov@babraham.ac.uk
    Competing interests
    The authors declare that no competing interests exist.
    ORCID icon 0000-0002-0383-3943

Funding

Wellcome (WT093736)

  • Peter J Rugg-Gunn

Biotechnology and Biological Sciences Research Council (BB/J004480/1)

  • Paula Freire-Pritchett
  • Stefan Schoenfelder
  • Csilla Várnai
  • Steven W Wingett
  • Jonathan Cairns
  • Mayra Furlan-Magaril
  • Peter Fraser
  • Mikhail Spivakov

Medical Research Council (MR/J003808/1)

  • Amanda J Collier

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Kristina Tabbada at the Babraham Institute Sequencing Facility, and Sarah Elderkin, Simon Andrews, Wolf Reik and members of our groups for insightful discussions. PF, PJR–G and MS are supported by the BBSRC. CV and PF are supported by the European Research Council (Advanced Grant 111608). AJC is supported by an MRC DTG Studentship (MR/J003808/1). RG–V was supported by the ERASMUS+ Program. PJR–G was supported by the Wellcome Trust (WT093736).

Reviewing Editor

  1. Job Dekker, Reviewing Editor, University of Massachusetts Medical School, United States

Publication history

  1. Received: September 28, 2016
  2. Accepted: March 22, 2017
  3. Accepted Manuscript published: March 23, 2017 (version 1)
  4. Version of Record published: April 27, 2017 (version 2)

Copyright

© 2017, Freire-Pritchett et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 2,859
    Page views
  • 999
    Downloads
  • 3
    Citations

Article citation count generated by polling the highest count across the following sources: PubMed Central, Crossref, Scopus.

Comments

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading