1. Chromosomes and Gene Expression

Efficient chromatin accessibility mapping in situ by nucleosome-tethered tagmentation

  1. Steven Henikoff (corresponding author)
  2. Jorja G Henikoff
  3. Hatice S Kaya-Okur
  4. Kami Ahmad
  1. Basic Sciences Division, Fred Hutchinson Cancer Research Center, United States
  2. Howard Hughes Medical Institute, United States
Tools and Resources
Cite this article as: eLife 2020;9:e63274 doi: 10.7554/eLife.63274

Abstract

Chromatin accessibility mapping is a powerful approach to identify potential regulatory elements. A popular example is ATAC-seq, whereby Tn5 transposase inserts sequencing adapters into accessible DNA (‘tagmentation’). CUT&Tag is a tagmentation-based epigenomic profiling method in which antibody tethering of Tn5 to a chromatin epitope of interest profiles specific chromatin features in small samples and single cells. Here, we show that by simply modifying the tagmentation conditions for histone H3K4me2 or H3K4me3 CUT&Tag, antibody-tethered tagmentation of accessible DNA sites is redirected to produce chromatin accessibility maps that are indistinguishable from the best ATAC-seq maps. Thus, chromatin accessibility maps can be produced in parallel with CUT&Tag maps of other epitopes with all steps from nuclei to amplified sequencing-ready libraries performed in single PCR tubes in the laboratory or on a home workbench. As H3K4 methylation is produced by transcription at promoters and enhancers, our method identifies transcription-coupled accessible regulatory sites.

eLife digest

Cells keep their DNA tidy by wrapping it into structures called nucleosomes. Each of these structures contains a short section of DNA wound around a cluster of proteins called histones. Not only do nucleosomes keep the genetic code organized, they also control whether the proteins that can switch genes on or off have access to the DNA. When genes turn on, the nucleosomes unwrap, exposing sections of genetic code called 'gene regulatory elements'. These elements attract the proteins that help read and copy nearby genes so the cell can make new proteins. Determining which regulatory elements are exposed at any given time can provide useful information about what is happening inside a cell, but the procedure can be expensive.

The most popular way to map which regulatory elements are exposed is using a technique called Assay for Transposase-Accessible Chromatin using sequencing, or ATAC-seq for short. The 'transposase' in the acronym is an enzyme that cuts areas of DNA that are not wound around histones and prepares them for detection by DNA sequencing. Unfortunately, the data from ATAC-seq are often noisy (there are random factors that produce a signal that is detected but is not a ‘real’ result), so more sequencing is required to differentiate between real signal and noise, increasing the expense of ATAC-seq experiments. Furthermore, although ATAC-seq can identify unspooled sections of DNA, it cannot provide a direct connection between active genes and unwrapped DNA.

To find the link between unspooled DNA and active genes, Henikoff et al. adapted a technique called CUT&Tag. Like ATAC-seq, it also uses transposases to cut the genome, but it allows more control over where the cuts occur. When genes are switched on, the proteins reading them leave chemical marks on the histones they pass. CUT&Tag attaches a transposase to a molecule that recognizes and binds to those marks. This allowed Henikoff et al. to guide the transposases to unspooled regions of DNA bordering active genes. The maps of gene regulatory elements produced using this method were the same as the best ATAC-seq maps. And, because the transposases could only access gaps near active genes, the data provided evidence that genes switching on leads to regulatory elements in the genome unwrapping.

This new technique is simple enough that Henikoff et al. were able to perform it from home on the countertop of a laundry room. By tethering the transposases to histone marks it was possible to detect unspooled DNA that was active more efficiently than with ATAC-seq. This lowers laboratory costs by reducing the cost of DNA sequencing, and may also improve the detection of gaps between nucleosomes in single cells.

Introduction

Identification of DNA accessibility in the chromatin landscape has been used to infer active transcription ever since the seminal description of DNaseI hypersensitivity by Weintraub and Groudine more than 40 years ago (Weintraub and Groudine, 1976). Because nucleosomes occupy most of the eukaryotic chromatin landscape and regulatory elements are mostly free of nucleosomes when they are active, DNA accessibility mapping can potentially identify active regulatory elements genome-wide. Several additional strategies have been introduced to identify regulatory elements by DNA accessibility mapping, including digestion with Micrococcal Nuclease (MNase) (Reeves, 1978) or restriction enzymes (Jack and Eggert, 1990), DNA methylation (Gottschling, 1992), physical fragmentation (Schwartz et al., 2005) and transposon insertion (Bownes, 1990). With the advent of genome-scale mapping platforms, beginning with microarrays and later short-read DNA sequencing, mapping regulatory elements based on DNaseI hypersensitivity became routine (Crawford et al., 2004; Dorschner et al., 2004). Later innovations included FAIRE (Giresi et al., 2007) and Sono-Seq (Auerbach et al., 2009), based on physical fragmentation and differential recovery of cross-linked chromatin, and ATAC-seq (Buenrostro et al., 2013), based on preferential insertion of the Tn5 transposase. The speed and simplicity of ATAC-seq, in which the cut-and-paste transposition reaction inserts sequencing adapters in the most accessible genomic regions (tagmentation), has led to its widespread adoption in many laboratories for mapping presumed regulatory elements.

For all of these DNA accessibility mapping strategies, it is generally unknown what process is responsible for creating any particular accessible site within the chromatin landscape. Furthermore, accessibility is not all-or-none: the median difference between an accessible and a non-accessible site in DNA is estimated to be only ~20%, with no sites completely accessible or inaccessible in a population of cells (Chereji et al., 2019; Oberbeckmann et al., 2019). Despite these uncertainties, DNA accessibility mapping has successfully predicted the locations of active gene enhancers and promoters genome-wide, with excellent correspondence between methods based on very different strategies (Karabacak Calviello et al., 2019). This is likely because DNA accessibility mapping strategies rely on the fact that nucleosomes have evolved to repress transcription by blocking sites of pre-initiation complex formation and transcription factor binding (Kornberg and Lorch, 2020), and so creating and maintaining a nucleosome-depleted region (NDR) is a pre-requisite for promoter and enhancer function.

A popular alternative to DNA accessibility mapping for regulatory element identification is to map nucleosomes that border NDRs, typically by histone marks, including ‘active’ histone modifications, such as H3K4 methylation and H3K27 acetylation, or histone variants incorporated during transcription, such as H2A.Z and H3.3. The rationale for this mapping strategy is that the enzymes that modify histone tails and the chaperones that deposit nucleosome subunits are most active close to the sites of initiation of transcription, which typically occurs bidirectionally at both gene promoters and enhancers to produce stable mRNAs and unstable enhancer RNAs. Although the marks left behind by active transcriptional initiation ‘point back’ to the NDR, this cause-effect connection between the NDR and the histone marks is only by inference (Wang et al., 2020), and direct evidence is lacking that a histone mark is associated with an NDR.

Here, we show that a simple modification of our Cleavage Under Targets and Tagmentation (CUT&Tag) method for antibody-tethered in situ tagmentation can identify NDRs genome-wide at regulatory elements adjacent to transcription-associated histone marks in human cells. We provide evidence that reducing the ionic concentration during tagmentation preferentially attracts Tn5 tethered to the H3K4me2 histone modification via a Protein A/G fusion to the nearby NDR, shifting the site of tagmentation from nucleosomes bordering the NDR to the NDR itself. Almost all transcription-coupled accessible sites correspond to ATAC-seq sites and vice-versa, and lie upstream of paused RNA Polymerase II (RNAPII). ‘CUTAC’ (Cleavage Under Targeted Accessible Chromatin) is conveniently performed in parallel with ordinary CUT&Tag, producing accessible site maps from low cell numbers with signal-to-noise as good as or better than the best ATAC-seq datasets.

Results

Streamlined CUT&Tag produces high-quality datasets with low cell numbers

We previously introduced CUT&RUN, a modification of Laemmli’s Chromatin Immunocleavage (ChIC) method (Schmid et al., 2004), in which a fusion protein between Micrococcal Nuclease (MNase) and Protein A (pA-MNase) binds sites of antibodies bound to chromatin fragments in nuclei or permeabilized cells immobilized on magnetic beads. Activation of MNase with Ca++ results in targeted cleavage, releasing the antibody-bound fragment into the supernatant for paired-end DNA sequencing. More recently, we substituted the Tn5 transposase for MNase in a modified CUT&RUN protocol, such that addition of Mg++ results in a cut-and-paste ‘tagmentation’ reaction, in which sequencing adapters are integrated around sites of antibody binding (Kaya-Okur et al., 2019). In CUT&Tag, DNA purification is followed by PCR amplification, eliminating the end-polishing and ligation steps required for sequencing library preparation in CUT&RUN. Like CUT&RUN, CUT&Tag requires relatively little input material, and the low backgrounds permit low sequencing depths to sensitively map chromatin features.

We have developed a streamlined version of CUT&Tag that eliminates tube transfers, so that all steps can be efficiently performed in a single PCR tube (Kaya-Okur et al., 2020). However, we had not determined the suitability of the single-tube protocol for profiling low cell number samples. During the COVID-19 pandemic, we adapted this CUT&Tag-direct protocol for implementation with minimal equipment and space requirements that uses no toxic reagents, so that it can be performed conveniently and safely on a home workbench (Figure 1—figure supplement 1). To ascertain the ability of our CUT&Tag-direct protocol to produce DNA sequencing libraries at home with data quality comparable to those produced in the laboratory, we used frozen aliquots of native human K562 cell nuclei prepared in the laboratory and profiled there using the streamlined single-tube protocol. Aliquots of nuclei were thawed and serially diluted in Wash buffer from ~60,000 down to ~60 starting cells, where the average yield of nuclei was ~50%. We used antibodies to H3K4me3, which preferentially marks nucleosomes immediately downstream of active promoters, and H3K27me3, which marks nucleosomes within broad domains of polycomb-dependent silencing. Aliquots of nuclei were taken home and stored in a kitchen freezer, then thawed and diluted at home and profiled for H3K4me3 and H3K27me3. In both the laboratory and at home, we performed all steps in groups of 16 or 32 samples over the course of a single day through the post-PCR clean-up step, treating all samples the same regardless of cell numbers. Whether produced at home or in the lab, all final barcoded sample libraries underwent the same quality control, equimolar pooling, and final SPRI bead clean-up steps in the laboratory prior to DNA sequencing.

Tapestation profiles of libraries produced at home detected nucleosomal ladders down to 200 cells for H3K27me3 and nucleosomal and subnucleosomal fragments down to 2000 cells for H3K4me3 (Figure 1A–B). Sequenced fragments were aligned to the human genome using Bowtie2 and tracks were displayed using IGV. Similar results were obtained for both at-home and in-lab profiles for both histone modifications (Figure 1C–D) using pA-Tn5 produced in the laboratory, and results using commercial Protein A/Protein G-Tn5 (pAG-Tn5) were at least as good. All subsequent experiments reported here were performed at home using commercial pAG-Tn5, which provided results similar to those obtained using batches of lab-produced pA-Tn5 run in parallel.

Figure 1 (with 1 supplement)
CUT&Tag-direct produces high-quality datasets on the benchtop and at home.

Starting with a frozen human K562 cell aliquot, CUT&Tag-direct with amplification for 12 cycles yields detectable nucleosomal ladders for intermediate and low numbers of cells for both (A) H3K4me3 and (B) H3K27me3. The higher yield of smaller fragments with decreasing cell number suggests that reducing the total available binding sites increases the binding of antibody and/or pAG-Tn5 in limiting amounts. (C) Comparison of H3K4me3 CUT&Tag-direct results produced in the laboratory to those produced at home and to an ENCODE dataset (GSM733680). (D) Same as (C) for H3K27me3 comparing CUT&Tag-direct results to CUT&Tag datasets using the standard protocol (Kaya-Okur et al., 2019), and to an ENCODE dataset (GSM788088). pA-Tn5 was used except as indicated by asterisks for datasets produced at home using commercial pAG-Tn5 (Epicypher cat. no. 15–1017).

NDRs attract Tn5 tethered to nearby nucleosomes during low-salt tagmentation

Because the Tn5 domain of pA-Tn5 binds avidly to DNA, it is necessary to use elevated salt conditions to avoid tagmenting accessible DNA during CUT&Tag. High-salt buffers included 300 mM NaCl for pA-Tn5 binding, washing to remove excess protein, and tagmentation at 37°C. We have found that other protocols based on the same principle but that do not include a high-salt wash step result in chromatin profiles that are dominated by accessible site tagmentation (Kaya-Okur et al., 2020).

To better understand the mechanistic basis for the salt-suppression effect, we bound pAG-Tn5 under normal high-salt CUT&Tag incubation conditions, then tagmented in low salt. We used either rapid 20-fold dilution with a prewarmed solution of 2 mM or 5 mM MgCl2 or removal of the pAG-Tn5 incubation solution and addition of 50 µL 10 mM TAPS pH8.5, 5 mM MgCl2. All other steps in the protocol followed our CUT&Tag-direct protocol (Kaya-Okur et al., 2020; Figure 2). Tapestation capillary gel electrophoresis of the final libraries revealed that after a 20 min incubation the effect of low-salt tagmentation on H3K4me2 CUT&Tag samples was a marked reduction in the oligo-nucleosome ladder with an increase in faster migrating fragments (Figure 3A and Figure 3—figure supplement 1A–B). CUT&Tag profiles using antibodies to most chromatin epitopes in the dilution protocol showed either little change or elevated levels of non-specific background tagmentation that obscured the targeted signal (Figure 3—figure supplement 2), as expected considering that we had omitted the high-salt wash step needed to remove unbound pAG-Tn5. Strikingly, under low-salt conditions, high-resolution profiles of H3K4me3 and H3K4me2 showed that the broad nucleosomal distribution of CUT&Tag around promoters for these two modifications was mostly replaced by single narrow peaks (Figure 3B and Figure 3—figure supplement 3).
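As a rough worked example of what the dilution protocol does to the salt concentration, assume the pAG-Tn5 incubation solution contains the 300 mM NaCl described above for high-salt buffers, that the MgCl2 diluent contributes no NaCl, and that "20-fold" refers to the ratio of final to initial volume. The symbols below are illustrative and do not come from the protocol:

```latex
C_{\text{final}}
  = \frac{C_{\text{initial}}}{\text{dilution factor}}
  = \frac{300\ \text{mM NaCl}}{20}
  = 15\ \text{mM NaCl}
```

Under these assumptions, tagmentation by dilution proceeds at roughly one-twentieth of the NaCl concentration used for standard CUT&Tag tagmentation.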

Figure 2
CUT&Tag with low-salt tagmentation (CUTAC).

Steps in gray are lab-based and other steps were performed at home. Tagmentation can be performed by dilution, removal or post-wash. MEDS: mosaic end double-stranded annealed oligonucleotides.

Figure 3 (with 4 supplements)
Low-salt tagmentation of H3K4me2/3 CUT&Tag samples sharpens peaks.

(A) Tapestation gel image showing the change in size distribution from standard CUT&Tag (CnT), tagmented in the presence of 300 mM NaCl, to low-salt tagmentation using the dilution protocol. (B) Representative tracks showing the shift observed with low-salt dilution tagmentation. (C) Average plots showing the narrowing of peak distributions upon low-salt tagmentation using the dilution protocol. (D) Heatmaps showing narrowing of H3K4me2 peaks after removing pAG-Tn5 (removal), after a stringent wash (post-wash), and after a stringent wash with low-salt tagmentation including a 1% pAG-Tn5 spike-in (Add-back). MACS2 was used to call peaks and heatmaps were ordered by density over the peak summits (sites). (E) Heatmaps showing dilution tagmentation and further narrowing of H3K4me2 peak distributions upon low-salt tagmentation (after removal) for 20 min at 37°C in the presence of 10% 1,6-hexanediol (hex) and 10% dimethylformamide (DMF) or both for 1 hr at 55°C. (F) Average plots showing effects of tagmentation with hex and/or DMF over time of low-salt tagmentation (after removal). (G) Smaller fragments (≤120 bp) dominate NDRs. Comparisons of small (≤120 bp) and large (>120 bp) fragments from CUTAC hex and DMF datasets show narrowing for small fragments around their summits. For each dataset a 3.2 million fragment random sample was split into small and large fragment groups. Removal of large fragments increases the number of peaks called (sites).

To evaluate the generality of peak shifts we used MACS2 to call peaks, and plotted the occupancy over aligned peak summits. For all three H3K4 methylation marks using normal CUT&Tag high-salt tagmentation conditions we observed a bulge around the summit representing the contribution from adjacent nucleosomes on one side or the other of the peak summit (Figure 3C). In contrast, tagmentation under low-salt conditions revealed much narrower profiles for H3K4me3 and H3K4me2 (~40% peak width at half-height), less so for H3K4me1 (~60%), which suggests that the shift is from H3K4me-marked nucleosomes to an adjacent NDR.
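The ~40% and ~60% figures compare peak widths at half-height between averaged profiles. A minimal sketch of how such a width could be measured, assuming evenly spaced bins, a zero baseline and a single dominant peak; the function name and the toy profiles are illustrative and are not taken from the study's analysis pipeline:

```python
def width_at_half_height(profile, bin_size=1.0):
    """Full width at half maximum of a single-peaked profile.

    profile: list of non-negative signal values in evenly spaced bins.
    The two half-height crossings are found by walking outward from the
    summit and interpolating linearly between neighbouring bins.
    """
    peak = max(profile)
    summit = profile.index(peak)
    half = peak / 2.0

    # Walk left while the signal stays at or above half height.
    i = summit
    while i > 0 and profile[i - 1] >= half:
        i -= 1
    if i == 0:
        left = 0.0  # profile never drops below half height on the left
    else:
        lo, hi = profile[i - 1], profile[i]
        left = (i - 1) + (half - lo) / (hi - lo)

    # Walk right in the same way.
    j = summit
    last = len(profile) - 1
    while j < last and profile[j + 1] >= half:
        j += 1
    if j == last:
        right = float(last)  # never drops below half height on the right
    else:
        hi, lo = profile[j], profile[j + 1]
        right = j + (hi - half) / (hi - lo)

    return (right - left) * bin_size


# Toy profiles (not real data): a broad and a narrow peak of equal height.
broad = [0, 1, 2, 3, 4, 3, 2, 1, 0]
narrow = [0, 0, 0, 2, 4, 2, 0, 0, 0]
relative_width = width_at_half_height(narrow) / width_at_half_height(broad)
```

With these toy inputs the narrow profile has half the width of the broad one, which is the kind of ratio the percentages in the text express.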

To determine whether free pAG-Tn5 present during tagmentation contributes, we removed the pAG-Tn5 then added 5 mM MgCl2 to tagment, and again observed narrowing of the H3K4me2 peak (Figure 3D ‘Removal’ and Figure 3—figure supplement 1C-D). We also observed a narrowing if we included a stringent 300 mM washing step before low-salt tagmentation (Figure 3D, ‘Post-wash’), which indicates that peak narrowing does not require free pAG-Tn5. Inclusion of a stringent post-wash step improves consistency relative to the Dilution or Removal protocols, although it resulted in lower yields and reduced library complexity (Figure 3—figure supplement 1E-F). However, if a small amount of pAG-Tn5 was included during tagmentation we obtained higher yields with increased peak narrowing (Figure 3D ‘Add-back’). Because Tn5 is inactive once it integrates its payload of adapters, and each fragment is generated by tagmentation at both ends, it is likely that a small amount of free pA(G)-Tn5 is sufficient to generate the additional small fragments where tethered pA(G)-Tn5 is limiting, albeit with higher background.

Salt ions compete with protein-DNA binding and so we suppose that tagmentation in low salt resulted in increased binding of epitope-tethered Tn5 to a nearby NDR prior to tagmentation. As H3K4 methylation is deposited in a gradient of tri- to di- to mono-methylation downstream of the +1 nucleosome from the transcriptional start site (TSS) (Henikoff and Shilatifard, 2011; Soares et al., 2017), we reasoned that the closer proximity of di- and tri-methylated nucleosomes to the NDR than mono-methylated nucleosomes resulted in preferential proximity-dependent ‘capture’ of Tn5. Consistent with this interpretation, we observed that the shift from broad to more peaky NDR profiles and heatmaps by H3K4me2 low-salt tagmentation was enhanced by addition of 1,6-hexanediol, a strongly polar aliphatic alcohol, and by 10% dimethylformamide, a strongly polar amide, both of which enhance chromatin accessibility (Figure 3E–F). NDR-focused tagmentation persisted even in the presence of both strongly polar compounds at 55°C. Enhanced localization by chromatin-disrupting conditions suggests improved access of H3K4me2-tethered Tn5 to nearby holes in the chromatin landscape during low-salt tagmentation. Localization to NDRs is more precise for small (≤120 bp) than large (>120 bp) tagmented fragments, and by resolving more closely spaced peaks inclusion of these compounds increased the number of peaks called (Figure 3G), also for H3K4me3-tethered Tn5 (Figure 3—figure supplement 4).

CUT&Tag low-salt tagmentation fragments coincide with ATAC-seq and DNaseI hypersensitive sites

Using CUT&Tag, we previously showed that most ATAC-seq sites are flanked by H3K4me2-marked nucleosomes in K562 cells (Kaya-Okur et al., 2019). However, lining up ATAC-seq datasets over peaks called using H3K4me2 CUT&Tag data resulted in smeary heatmaps, reflecting the broad distribution of peak calls over nucleosome positions flanking NDRs (Figure 4A). In contrast, alignment of ATAC-seq datasets over peaks called using low-salt tagmented CUT&Tag data produced narrow heatmap patterns for the vast majority of peaks (Figure 4B). To reflect the close similarities between fragments released by H3K4me2-tethered low-salt tagmentation as by ATAC-seq using untethered Tn5, we will refer to low-salt H3K4me2 and H3K4me3 CUT&Tag tagmentation as Cleavage Under Targeted Accessible Chromatin (CUTAC).

Figure 4 (with 2 supplements)
H3K4me2 CUTAC peaks correspond to ATAC-seq and DNaseI hypersensitivity peaks.

(A–D) Heatmaps showing the correspondence between H3K4me2 CUTAC and ATAC-seq sites. Headings over each heatmap denote the source of fragments mapping to the indicated set of MACS2 peak summits, ordered by occupancy over the 5-kb interval centered over each site. CUT&Tag and CUTAC sites are from samples processed in parallel, where CUTAC tagmentation was performed by 20-fold dilution and 20 min 37°C incubation following pAG-Tn5 binding. (E) Correlation matrix of H3K4me2 and H3K4me3 CUTAC and ATAC-seq data for K562 cells. (F) Heatmaps showing ≤120 bp signals for H3K4me2 CUT&Tag, CUTAC and ATAC-seq at CTCF DNaseI hypersensitive sites. Arrowheads on left indicate CTCF site cutoffs.

We confirmed the similarity between CUTAC and ATAC-seq by aligning H3K4me2 CUT&Tag and CUTAC datasets over peaks called from Omni-ATAC data (Figure 4C). In a scatterplot comparison between CUTAC and Omni-ATAC we did not detect off-diagonal clusters that would indicate a subset of peaks found by one but not the other dataset (Figure 4—figure supplement 1).
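The heatmaps referred to here line up fragments from one dataset over peak summits called from another, with rows ordered by occupancy over a 5-kb interval. A simplified sketch of that kind of matrix, assuming fragments are reduced to their midpoints and counted in fixed-width bins; the function name, the bin size and the use of midpoints are assumptions made for illustration, not the authors' implementation:

```python
from bisect import bisect_left


def summit_heatmap(midpoints_by_chrom, summits, half_width=2500, bin_size=50):
    """Binned fragment counts around each summit, sorted by total occupancy.

    midpoints_by_chrom: dict mapping chromosome -> sorted list of fragment
        midpoints (integers).
    summits: list of (chromosome, position) tuples.
    Returns a list of rows, one per summit, highest total signal first.
    """
    n_bins = -(-2 * half_width // bin_size)  # ceiling division
    rows = []
    for chrom, pos in summits:
        mids = midpoints_by_chrom.get(chrom, [])
        left = pos - half_width
        lo = bisect_left(mids, left)
        hi = bisect_left(mids, pos + half_width)
        row = [0] * n_bins
        for m in mids[lo:hi]:
            row[(m - left) // bin_size] += 1
        rows.append(row)
    rows.sort(key=sum, reverse=True)
    return rows


# Toy example with a 20-bp window and 5-bp bins (not real data).
toy_midpoints = {"chr1": [100, 105, 900]}
toy_summits = [("chr1", 900), ("chr1", 100)]
toy_rows = summit_heatmap(toy_midpoints, toy_summits, half_width=10, bin_size=5)
```

A sharp accessibility signal shows up as counts concentrated in the central bins of most rows, whereas flanking-nucleosome signal spreads into bins on either side.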

To further evaluate the degree of similarity between CUTAC and ATAC-seq, we aligned the ENCODE ATAC-seq dataset over peaks called using Omni-ATAC and CUTAC, where all datasets were sampled down to 3.2 million mapped fragments with mitochondrial fragments removed. Remarkably, heatmaps produced using either Omni-ATAC or CUTAC peak calls for the same ENCODE ATAC-seq data showed occupancy of ~95% for both sets of peaks (compare right panels of Figure 4B–C). We found ~50% overlap between ENCODE ATAC-seq peaks and peaks called from either Omni-ATAC (50.0%) or CUTAC (51.3%) data (Figure 4—figure supplement 2). This equivalence between H3K4me2 CUTAC and Omni-ATAC when compared to ENCODE ATAC-seq implies that CUTAC and Omni-ATAC detect the same chromatin features. This conclusion does not hold for H3K4me3 CUTAC, because similar alignment of ENCODE ATAC-seq data resulted in only ~75% peak occupancy (Figure 4D) and lower correlations (Figure 4E), which we attribute to the greater enrichment of H3K4me3 around promoters than enhancers relative to H3K4me2.
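The comparison above rests on two operations: sampling every dataset down to the same number of non-mitochondrial fragments, and measuring how many peaks from one set overlap another. The sketch below illustrates both under stated assumptions: fragments and peaks are (chromosome, start, end) tuples with half-open coordinates, the mitochondrial chromosome is named "chrM", and any overlap of at least 1 bp counts. The text does not state the overlap criterion, so that choice and the function names are hypothetical.

```python
import random


def sample_fragments(fragments, n, seed=0, exclude=("chrM",)):
    """Drop fragments on excluded chromosomes, then sample n without replacement."""
    kept = [f for f in fragments if f[0] not in exclude]
    if len(kept) <= n:
        return kept
    return random.Random(seed).sample(kept, n)


def fraction_overlapping(query_peaks, reference_peaks):
    """Fraction of query peaks that overlap at least one reference peak."""
    by_chrom = {}
    for chrom, start, end in reference_peaks:
        by_chrom.setdefault(chrom, []).append((start, end))
    hits = 0
    for chrom, start, end in query_peaks:
        # Half-open intervals overlap when each starts before the other ends.
        if any(s < end and start < e for s, e in by_chrom.get(chrom, [])):
            hits += 1
    return hits / len(query_peaks) if query_peaks else 0.0


# Toy data (not real): 10 nuclear fragments plus 5 mitochondrial ones.
toy_fragments = [("chr1", i * 10, i * 10 + 50) for i in range(10)]
toy_fragments += [("chrM", i, i + 50) for i in range(5)]
toy_sample = sample_fragments(toy_fragments, 4, seed=1)
toy_overlap = fraction_overlapping(
    [("chr1", 0, 10), ("chr1", 100, 110)], [("chr1", 5, 20)]
)
```

The linear scan per query peak is adequate for a sketch; genome-scale peak sets would normally use sorted intervals or a dedicated interval tool.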

To evaluate whether CUTAC peaks also correspond to sites of DNaseI hypersensitivity, we aligned H3K4me2 CUT&Tag and CUTAC signals over 9403 CCCTC-binding factor (CTCF) motifs scored as peaks of DNaseI sensitivity in K562 and HeLa cells. We excluded nucleosomal fragments by using only ≤120 bp fragments. We observed that 86% of the DNaseI hypersensitive CTCF sites are occupied by CUTAC signal relative to flanking regions (Figure 4F), which suggests equivalence of CUTAC and DNaseI hypersensitive CTCF sites. We also found that the H3K4me2 CUT&Tag sample showed detectable signal at only 53% of the CTCF sites. This improvement in detection of CTCF sites by H3K4me2 CUTAC over H3K4me2 CUT&Tag illustrates the potential of using ≤120 bp CUTAC fragment data to improve the resolution and sensitivity of transcription factor binding site motif detection.

To evaluate signal-to-noise genome-wide, we called peaks using MACS2 and calculated the Fraction of Reads in Peaks (FRiP), a data quality metric introduced by the ENCODE project (Landt et al., 2012). For both ENCODE ChIP-seq and our published CUT&RUN data we measured FRiP = ~0.2 for 3.2 million fragments, whereas for CUT&Tag, FRiP = ~0.4, reflecting improved signal-to-noise relative to previous chromatin profiling methods (Kaya-Okur et al., 2019). Using CUT&Tag-direct, H3K4me2 CUT&Tag FRiP = 0.41 for 3.2 million fragments and ~16,000 peaks (n = 4 replicates), whereas tagmentation by dilution in 2 mM MgCl2 resulted in FRiP = 0.18 for 3.2 million fragments and ~15,000 peaks (n = 4), with similar values for tagmentation by removal [FRiP = 0.21, ~15,000 peaks (n = 4)]. In add-back experiments, we measured lower FRiP values after stringent washing conditions, suggesting increased background.
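FRiP is the number of fragments that fall within called peaks divided by the total number of fragments. A minimal sketch of the calculation, assuming fragments and peaks are (chromosome, start, end) tuples with half-open coordinates and that a fragment counts as "in a peak" when its midpoint lies inside one; the midpoint rule and the function names are assumptions for illustration, and other overlap rules would give slightly different values:

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def frip(fragments, peaks):
    """Fraction of fragments whose midpoint lies inside any peak."""
    by_chrom = {}
    for chrom, start, end in peaks:
        by_chrom.setdefault(chrom, []).append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        merged = merge_intervals(intervals)
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])

    in_peaks = 0
    for chrom, start, end in fragments:
        if chrom not in index:
            continue
        starts, ends = index[chrom]
        mid = (start + end) // 2
        k = bisect_right(starts, mid) - 1  # last peak starting at or before mid
        if k >= 0 and mid < ends[k]:
            in_peaks += 1
    return in_peaks / len(fragments) if fragments else 0.0


# Toy data (not real): two overlapping peaks and five fragments.
toy_peaks = [("chr1", 100, 200), ("chr1", 150, 300)]
toy_fragments = [
    ("chr1", 90, 130),   # midpoint 110, inside
    ("chr1", 250, 290),  # midpoint 270, inside
    ("chr1", 290, 330),  # midpoint 310, outside
    ("chr1", 0, 50),     # midpoint 25, outside
    ("chr2", 100, 200),  # no peaks on chr2
]
toy_frip = frip(toy_fragments, toy_peaks)
```

Because the value depends on both the sampled depth and the peak set, comparisons such as those in the text are only meaningful when every dataset is sampled to the same number of fragments before peak calling.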

We also compared the number of peaks and FRiP values for CUTAC to those for ATAC-seq for K562 cells and observed that CUTAC data quality was similar to that for the Omni-ATAC method (Corces et al., 2017), better than ENCODE ATAC-seq (Zhang et al., 2020), and much better than Fast-ATAC (Corces et al., 2016), a previous improvement over Standard ATAC-seq (Buenrostro et al., 2013; Figure 5A). CUTAC is relatively insensitive to tagmentation times, with similar numbers of peaks and similar FRiP values for samples tagmented for 5, 20 and 60 min (Figure 5A). We attribute the robustness of CUT&Tag and CUTAC to the tethering of Tn5 to specific chromatin epitopes, so that when tagmentation goes to completion there is little untethered Tn5 that would increase background levels. When we measured peak numbers and FRiP values for K562 ATAC-seq data deposited in the Gene Expression Omnibus (GEO) from multiple laboratories, we observed a wide range of data quality (Figure 5B), even from very recent submissions from expert groups (Table 1 and Figure 5—figure supplement 1). We attribute this variability to the difficulty of avoiding background tagmentation by excess free Tn5 in ATAC-seq protocols and subsequent release of non-specific nucleosomal fragments (Swanson et al., 2020).

Figure 5 (with 3 supplements)
CUTAC data quality is similar to the best available ATAC-seq K562 cell data.

Mapped fragments from the indicated datasets were sampled and peaks were called using MACS2. (A) Number of peaks (left) and fraction of reads in peaks for CUT&Tag (blue), H3K4me2 CUTAC (red) and ATAC-seq (green). Fast-ATAC is an improved version of ATAC-seq that reduces mitochondrial reads (Corces et al., 2016), and Omni-ATAC is an improved version that additionally improves signal-to-noise (Corces et al., 2017). ATAC_ENCODE is the current ENCODE standard (Moore et al., 2020). (B) Five other K562 ATAC-seq datasets from different laboratories were identified in GEO and mapped to hg19. Peak numbers and FRiP values indicate a wide range of data quality found in recent ATAC-seq datasets. (C) Small H3K4me2 CUTAC fragments improve peak-calling. Hex = 1,6-hexanediol, DMF = N,N-dimethylformamide.

Table 1
CUTAC data quality is similar to that of the best ATAC-seq datasets.

Human K562 and H1 ES cell ATAC-seq datasets were downloaded from GEO, and Bowtie2 was used to map fragments to hg19. A sample of 3.2 million mapped fragments without Chr M was used for peak-calling by MACS2 to calculate FRiP values. Year of submission to GEO or SRA databanks is shown. % Chr M is percent of hg19-mapped fragments mapped to mitochondrial DNA.

Sample | Source | Year | Read_type | Raw_reads | hg19-mapped | % Chr M | # Peaks | FRiP %
CUT&Tag-direct H1 | This study | 2020 | PE25 | 4,832,184 | 4,525,525 | 0.2 | 23,051 | 53
CUT&Tag-direct K562 | This study | 2020 | PE25 | 3,252,490 | 3,144,253 | 2 | 20,555 | 41
CUTAC H1 | This study | 2020 | PE25 | 2,770,901 | 2,734,092 | 1 | 16,848 | 25
CUTAC K562 | This study | 2020 | PE25 | 5,973,063 | 4,785,931 | 3 | 14,381 | 22
Omni-ATAC K562 SRR5657531-2 | Stanford | 2017 | PE75 | 4,407,706 | 3,181,110 | 13 | 16,737 | 20
ATAC H1 GSM3677783 | Fred Hutch | 2019 | PE25 | 4,504,812 | 4,157,800 | 13 | 19,517 | 16
ATAC K562 ENCFF123TMX | Stanford (ENCODE) | 2020 | PE100 | 43,473,266 | 23,942,024 | 9 | 14,369 | 11
ATAC K562 GSM4083680 | U. Texas-Southwestern | 2019 | SE74 | 29,193,873 | 17,612,609 | 43 | 5894 | 8
ATAC K562 GSM3452726 | Cornell U. | 2018 | PE36 | 86,907,625 | 83,038,866 | 24 | 1555 | 6.2
ATAC K562 GSM4190694 | Keio U. | 2020 | PE60 | 15,363,855 | 14,067,803 | 79 | 1837 | 4.8
ATAC H1 GSM4130883 | Stanford | 2020 | PE100 | 43,784,188 | 19,562,219 | 60 | 4289 | 3.2
Fast-ATAC K562 SRR5657533-4 | Stanford | 2017 | PE75 | 6,702,558 | 4,677,843 | 8 | 2780 | 2.4
ATAC K562 GSM4005278 | Penn State Hershey | 2020 | PE100 | 12,772,997 | 8,541,005 | 25 | 1691 | 2.1
ATAC K562 GSM4130894 | Stanford | 2020 | PE100 | 45,122,834 | 19,021,462 | 86 | 449 | 1.4

If low-salt tagmentation sharpens peaks of DNA accessibility because tethering to neighboring nucleosomes increases the probability of tagmentation in small holes in the chromatin landscape, then we would expect smaller fragments to dominate CUTAC peaks. Indeed, this is exactly what we observe for heatmaps (Figure 5—figure supplement 2), tracks (Figure 5—figure supplement 3), peak calls and FRiP values (Figure 5C). Excluding larger fragments results in better resolution, yielding more peaks and higher FRiP values, both of which approach a maximum with fewer fragments. Moreover, the addition of strongly polar compounds during tagmentation provides a substantial improvement in peak calling and FRiP values (Figure 5C, turquoise and orange curves). Excluding large fragments did not improve ATAC-seq peak calls and FRiP values, which indicates that tethering to H3K4me2 is critical for maximum sensitivity and resolution of DNA accessibility maps.
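The small versus large comparison amounts to partitioning fragments at a length cutoff before peak calling. A minimal sketch using the 120 bp threshold from the text, assuming fragments are (chromosome, start, end) tuples with half-open coordinates so that length is end minus start; the function name is illustrative:

```python
def split_by_length(fragments, cutoff=120):
    """Partition fragments into small (<= cutoff bp) and large (> cutoff bp)."""
    small, large = [], []
    for fragment in fragments:
        _, start, end = fragment
        if end - start <= cutoff:
            small.append(fragment)
        else:
            large.append(fragment)
    return small, large


# Toy fragments (not real data): lengths 80, 120, 121 and 300 bp.
toy_fragments = [
    ("chr1", 1000, 1080),
    ("chr1", 2000, 2120),
    ("chr1", 3000, 3121),
    ("chr2", 500, 800),
]
toy_small, toy_large = split_by_length(toy_fragments)
```

Each group can then be passed separately to peak calling and FRiP calculation, which is the comparison the paragraph above describes.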

CUTAC maps transcription-coupled regulatory elements

H3K4me2/3 methylation marks active transcription at promoters (Gilchrist et al., 2012), which raises the question of whether sites identified by CUTAC are also sites of RNAPII enrichment genome-wide. To test this possibility, we first aligned CUT&Tag and CUTAC data at annotated promoters, displayed as heatmaps or average plots. CUT&Tag H3K4me2 peaks flank NDRs farther downstream on either side than H3K4me3 peaks, and ENCODE ChIP-seq data confirm that these are the actual locations of the marks (Figure 6—figure supplement 1). In contrast, CUTAC peaks are located in the NDR between flanking H3K4me2-marked chromatin (Figure 6A). CUTAC sites at promoter NDRs corresponded closely to promoter ATAC-seq sites, consistent with expectation for promoter NDRs. Thus, paired CUT&Tag and CUTAC samples can replace both ChIP-seq for an active promoter mark and ATAC-seq in a single experiment with identical processing, analysis and display.

Figure 6 (with 1 supplement)
H3K4me2 CUTAC sites are coupled to transcription.

(A) H3K4me2 fragments shift from flanking nucleosomes to the NDR upon low-salt tagmentation, corresponding closely to ATAC-seq sites. (B) The Serine-5 phosphate-marked initiation form of RNAPII is highly abundant over most H3K4me2 CUT&Tag, CUTAC and ATAC-seq peaks. (C) Run-on transcription initiates from most sites corresponding to CUTAC and ATAC-seq peaks. Both plus and minus strand PRO-seq datasets downloaded from GEO (GSM3452725) were pooled and aligned over peaks called using 3.2 million fragments sampled from H3K4me2 CUT&Tag, CUTAC and Omni-ATAC datasets, and also from pooled CUT&Tag replicate datasets for K562 RNA Polymerase II Serine-5 phosphate.

To determine whether CUTAC sites are also sites of transcription initiation in general, we aligned RNA Polymerase II (RNAPII) Serine-5 phosphate (RNAPIIS5P) CUT&Tag data over H3K4me2 CUT&Tag, CUTAC and Omni-ATAC peaks ordered by RNAPIIS5P peak intensity. When displayed as heatmaps or average plots, CUTAC datasets show a conspicuous shift into the NDR from flanking nucleosomes (Figure 6B).

Mammalian transcription also initiates at many enhancers, as shown by transcriptional run-on sequencing, which identifies sites of RNAPII pausing whether or not a stable RNA product is normally produced (Kaikkonen et al., 2013). Accordingly, we aligned RNAPII-profiling PRO-seq data for K562 cells over H3K4me2 CUT&Tag, CUTAC and Omni-ATAC sites, displayed as heatmaps and ordered by PRO-seq signal intensity. The CUT&Tag sites showed broad enrichment of PRO-seq signals offset ~1 kb on either side, whereas PRO-seq signals were tightly centered around CUTAC sites, with similar results for Omni-ATAC sites (Figure 6C). Interestingly, alignment around TSSs, RNAPIIS5P or PRO-seq data resolved immediately flanking H3K4me2-marked nucleosomes in CUT&Tag data, which is not seen for the same data aligned on signal midpoints (Figures 3 and 5). Such alignment of +1 and −1 nucleosomes next to fixed NDR boundaries is consistent with nucleosome positioning based on steric exclusion (Chereji et al., 2018). Furthermore, the split in PRO-seq occupancies around NDRs defined by CUTAC and Omni-ATAC implies that the steady-state location of most engaged RNAPII is immediately downstream of the NDR from which it initiated. About 80% of the CUTAC sites showed enrichment of PRO-seq signal downstream, confirming that the large majority of CUTAC sites correspond to NDRs representing transcription-coupled regulatory elements.

Discussion

The correlation between sites of high chromatin accessibility and transcriptional regulatory elements, including enhancers and promoters, has driven the development of several distinct methods for genome-wide mapping of DNA accessibility for nearly two decades (Klein and Hainer, 2020). However, the processes that are responsible for creating gaps in the nucleosome landscape are not completely understood. In part this uncertainty is attributable to variations in nucleosome positioning within a population of mammalian cells such that there is only a ~20% median difference in absolute DNA accessibility between DNaseI hypersensitive sites and non-hypersensitive sites genome-wide (Chereji et al., 2019). This suggests that DNA accessibility is not the primary determinant of gene regulation, and contradicts the popular characterization of accessible DNA sites as ‘open’ and the lack of accessibility as ‘closed’. Moreover, there are multiple dynamic processes that can result in nucleosome depletion, including transcription, nucleosome remodeling, transcription factor binding, and replication, so that the identification of a presumed regulatory element by chromatin accessibility mapping leaves open the question as to how accessibility is established and maintained. Our CUTAC mapping method now provides a physical link between a transcription-coupled process and DNA hyperaccessibility by showing that anchoring of Tn5 to a nucleosome mark laid down by transcriptional events immediately downstream identifies presumed gene regulatory elements that are indistinguishable from those identified by ATAC-seq. The equivalence of CUTAC and ATAC at both enhancers and promoters provides support for the hypothesis that these regulatory elements are characterized by the same regulatory architecture (Andersson et al., 2015; Arnold et al., 2019).

The mechanistic basis for asserting that H3K4 methylation is a transcription-coupled event is well established (Henikoff and Shilatifard, 2011; Soares et al., 2017). In all eukaryotes, H3K4 methylation is catalyzed by COMPASS/SET1 and related enzyme complexes, which associate with the C-terminal domain (CTD) of the large subunit of RNAPII when Serine-5 of the tandemly repetitive heptad repeat of the CTD is phosphorylated following transcription initiation. The enrichment of dimethylated and trimethylated forms of H3K4 is thought to be the result of exposure of the H3 tail to COMPASS/SET1 during RNAPII stalling just downstream of the TSS, so that these modifications are coupled to the onset of transcription (Soares et al., 2017). Therefore, our demonstration that Tn5 tethered to H3K4me2 or H3K4me3 histone tail residues efficiently tagments accessible sites implies that accessibility at regulatory elements is created by events immediately following transcription initiation. This mechanistic interpretation is supported by the mapping of CUTAC sites just upstream of RNAPII, and is consistent with the recent demonstration that PRO-seq data can be used to accurately impute ‘active’ histone modifications (Wang et al., 2020). Thus, CUTAC identifies active promoters and enhancers that produce enhancer RNAs, which might help explain why ~95% of ATAC-seq peaks are detected by CUTAC and vice versa (Figure 4B–C).

CUTAC also provides practical advantages over other chromatin accessibility mapping methods. Like CUT&Tag-direct, all steps from frozen nuclei to purified sequencing-ready libraries for the data presented here were performed in a day in single PCR tubes on a home workbench. As it requires only a simple modification of one step in the CUT&Tag protocol, CUTAC can be performed in parallel with an H3K4me2 CUT&Tag positive control and other antibodies using multiple aliquots from each population of cells to be profiled. We have shown that three distinct protocol modifications (dilution, removal and post-wash tagmentation) yield high-quality results, providing flexibility that might be important for adapting CUTAC to nuclei from diverse cell types and tissues.

Although a CUT&Tag-direct experiment requires a day to perform, and ATAC-seq can be performed in a few hours, this disadvantage of CUTAC is offset by the better control of data quality with CUTAC as is evident from the large variation in ATAC-seq data quality between laboratories (Table 1). In contrast, CUT&Tag is highly reproducible using native or lightly cross-linked cells or nuclei (Kaya-Okur et al., 2020), and as shown here H3K4me2 CUTAC maps regulatory elements with sensitivity and signal-to-noise comparable to the best ATAC-seq datasets, even better when larger fragments are computationally excluded. Although datasets from H3K4me2 CUT&Tag have lower background than datasets from CUTAC run in parallel, the combination of the two provides both highest data quality (CUT&Tag) and precise mapping (CUTAC) using the same H3K4me2 antibody. Therefore, we anticipate that current CUT&Tag users and others will find the CUTAC option to be an attractive alternative to other DNA accessibility mapping methods for identifying transcription-coupled regulatory elements.

Materials and methods

Key resources table

| Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
| --- | --- | --- | --- | --- |
| Cell line (Human) | K562 | ATCC | Cat#CCL-243; RRID:CVCL_0004 | |
| Cell line (Human) | H1 embryonic stem cells | WiCell | Cat#WA01-lot#WB35186; RRID:CVCL_9771 | |
| Antibody | rabbit polyclonal anti-NPAT | Thermo Fisher Scientific | PA5-66839; RRID:AB_2663287 | Concentration: 1:100 |
| Antibody | guinea pig polyclonal anti-rabbit IgG | Antibodies Online | Cat#ABIN101961; RRID:AB_10775589 | Concentration: 1:100 |
| Antibody | rabbit polyclonal anti-mouse IgG | Abcam | Cat#46540; RRID:AB_2614925 | Concentration: 1:100 |
| Antibody | rabbit monoclonal anti-H3K27me3 | Cell Signaling | Cat#9733; RRID:AB_2616029 | Concentration: 1:100 |
| Antibody | rabbit polyclonal anti-H3K4me2 | Upstate | Cat#07–730-lot#3229364; RRID:AB_11213050 | Concentration: 1:100 |
| Antibody | rabbit monoclonal anti-H3K27ac | Millipore | Cat#MABE647 | Concentration: 1:100 |
| Antibody | rabbit polyclonal anti-H3K4me3 | Active Motif | Cat#39159; RRID:AB_2561020 | Concentration: 1:100 |
| Antibody | rabbit monoclonal anti-H3K4me2 | Epicypher | Cat#13–0027 | Concentration: 1:100 |
| Antibody | rabbit monoclonal anti-H3K4me1 | Epicypher | Cat#13–0026 | Concentration: 1:100 |
| Antibody | rabbit polyclonal anti-H3K9me3 | Abcam | Cat#ab8898; RRID:AB_306848 | Concentration: 1:100 |
| Antibody | rabbit monoclonal anti-H3K36me3 | Epicypher | Cat#13–0031 | Concentration: 1:100 |
| Peptide, recombinant protein | Protein A-Tn5 | Henikoff lab | doi:10.17504/protocols.io.8yrhxv6 | Concentration: 1:200 |
| Peptide, recombinant protein | Protein AG-Tn5 | Epicypher | 1 Cat#5–1117 | Concentration: 1:20-1:60 |

Biological materials

Human K562 cells were purchased from ATCC (CCL-243) and cultured following the supplier’s protocol. H1 ES cells were obtained from WiCell (WA01-lot#WB35186) and cultured following NIH 4D Nucleome guidelines. All tested negative for mycoplasma contamination using a MycoProbe kit.

CUT&Tag-direct and CUTAC

Log-phase human K562 or H1 embryonic stem cells were harvested and prepared for nuclei in a hypotonic buffer with 0.1% Triton-X100 essentially as described (Skene and Henikoff, 2017). A detailed, step-by-step nuclei preparation protocol can be found at protocols.io.

CUT&Tag-direct was performed as described (Kaya-Okur et al., 2020), except that all CUTAC experiments were done on a home laundry room counter (Figure 1—figure supplement 1) with 32 samples run in parallel mostly over the course of a single ~8 hour day. A detailed step-by-step protocol including the three CUTAC options used in this study can be found at protocols.io. Briefly, nuclei were thawed, mixed with activated Concanavalin A beads and magnetized to remove the liquid with a pipettor and resuspended in Wash buffer (20 mM HEPES pH 7.5, 150 mM NaCl, 0.5 mM spermidine and Roche EDTA-free protease inhibitor). After successive incubations with primary antibody (1–2 hr) and secondary antibody (0.5–1 hr) in Wash buffer, the beads were washed and resuspended in pA(G)-Tn5 at 12.5 nM in 300-Wash buffer (Wash buffer containing 300 mM NaCl) for 1 hr. Incubations were performed at room temperature either in bulk or in volumes of 25–50 µL in low-retention PCR tubes. For CUT&Tag, tagmentation was performed for 1 hr in 300-Wash buffer supplemented with 10 mM MgCl2 in a 50 µL volume. For CUTAC, tagmentation was performed in low-salt buffer with varying components, volumes and temperatures as described for each experiment in the figure legends. In ‘dilution’ tagmentation, tubes containing 25 µL of pA(G)-Tn5 incubation solution and 2 mM or 5 mM MgCl2 solutions were preheated to 37°C. Tagmentation solution (475 µL) was rapidly added to the tubes and incubated for times and temperatures as indicated. In ‘removal’ tagmentation, tubes were magnetized, liquid was removed, and 50 µL of ice-cold 10 mM TAPS pH 8.5, 5 mM MgCl2 was added, followed by incubation for times and temperatures as indicated. The ‘post-wash’ protocol is identical to the CUT&Tag-direct protocol except that tagmentation was performed in 10 mM TAPS pH 8.5, 5 mM MgCl2 at 37°C as indicated. 
In ‘add-back’ tagmentation, the post-wash protocol was used with 10 mM TAPS pH 8.5, 5 mM MgCl2 supplemented with pA(G)-Tn5 and incubated at 37°C as indicated.
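
As a rough check on what ‘dilution’ tagmentation does to the salt concentration, the arithmetic can be sketched as follows. This is a minimal sketch that assumes the added tagmentation solution contains no NaCl and that volumes are additive; neither is stated explicitly above, and the function name is illustrative only.

```python
def diluted_concentration(stock_conc, stock_volume_ul, added_volume_ul):
    """Concentration of a solute after adding a solute-free solution.

    Assumes additive volumes and that the added solution contains none
    of the solute (an assumption of this sketch, not a protocol detail).
    The result is in the same units as stock_conc.
    """
    if stock_volume_ul <= 0 or added_volume_ul < 0:
        raise ValueError("stock volume must be > 0 and added volume >= 0")
    return stock_conc * stock_volume_ul / (stock_volume_ul + added_volume_ul)


# 25 µL of pA(G)-Tn5 incubation solution in 300-Wash buffer (300 mM NaCl)
# plus 475 µL of tagmentation solution is a 20-fold dilution.
nacl_mm = diluted_concentration(300.0, 25.0, 475.0)
print(f"NaCl after dilution: {nacl_mm} mM")
```

Under these assumptions the NaCl concentration during ‘dilution’ tagmentation would be 300 × 25/500 = 15 mM, compared with 300 mM for standard CUT&Tag tagmentation.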

Following tagmentation, CUT&Tag and CUTAC samples were chilled and magnetized, liquid was removed, and beads were washed in 50 µL 10 mM TAPS pH 8.5, 0.2 mM EDTA then resuspended in 5 µL 0.1% SDS, 10 µL TAPS pH 8.5. Following incubation at 58°C, SDS was neutralized with 15 µL of 0.67% Triton-X100, and 2 µL of 10 mM indexed P5 and P7 primer solutions were added. Tubes were chilled and 25 µL of NEBNext 2x Master mix was added and vortexed. Gap-filling and 12 cycles of PCR were performed using an MJ PTC-200 Thermocycler. Clean-up was performed by addition of 65 µL SPRI bead slurry following the manufacturer’s instructions, eluted with 20 µL 1 mM Tris-HCl pH 8, 0.1 mM EDTA and 2 µL was used for Agilent 4200 Tapestation analysis. The barcoded libraries were mixed to achieve equimolar representation as desired aiming for a final concentration as recommended by the manufacturer for sequencing on an Illumina HiSeq 2500 2-lane Turbo flow cell.

Data processing and analysis

For datasets from GEO with fragment read lengths ≥60 bp we ran cutadapt 2.9 with parameters -q 20 -a AGATCGGAAGAGC -A AGATCGGAAGAGC. Paired-end reads were aligned to hg19 using Bowtie2 version 2.3.4.3 with options: --end-to-end --very-sensitive --no-unal --no-mixed --no-discordant --phred33 -I 10 -X 700. Tracks were made as bedgraph files of normalized counts, which are the fraction of total counts at each basepair scaled by the size of the hg19 genome. Peaks were called using MACS2 version 2.2.6 with callpeak -f BEDPE -g hs -p 1e-5 --keep-dup all --SPMR. Heatmaps were produced using deepTools 3.3.1.
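
The normalized-count scaling described above can be illustrated with a short sketch. It is not the pipeline used in the study: the function name and the dictionary-based input are hypothetical, and the genome size is left as a caller-supplied parameter rather than hard-coded.

```python
def normalized_counts(counts_per_bp, genome_size):
    """Scale raw per-basepair counts as (count / total counts) * genome size.

    counts_per_bp: dict mapping (chrom, position) -> raw fragment count
    genome_size:   total genome length in bp, supplied by the caller
    """
    total = sum(counts_per_bp.values())
    if total == 0:
        raise ValueError("no counts to normalize")
    return {pos: count / total * genome_size
            for pos, count in counts_per_bp.items()}


# Toy example on a hypothetical 1,000-bp genome.
toy = {("chr1", 10): 2, ("chr1", 11): 6, ("chr2", 5): 2}
norm = normalized_counts(toy, genome_size=1000)
# The position holding 6 of the 10 counts is scaled to about 0.6 * 1000.
print(norm[("chr1", 11)])
```

With this scaling the normalized values sum to the genome size, so the genome-wide average per basepair is 1 and values above 1 indicate above-average coverage.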

To produce the scatterplot (Figure 4—figure supplement 1) and correlation matrix (Figure 4E), we first removed fragments overlapping any repeat-masked region in hg19, then sampled 3.2 million fragments from each of the 11 datasets and called peaks on the merged data using MACS2. As previously described (Meers et al., 2019), we used a CUTAC IgG negative control, summing normalized counts within peaks and removing peaks above a threshold of the 99th percentile of normalized count sums (46,561 final peaks).
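
The down-sampling and control-based filtering steps can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, sampling is done without replacement with a fixed seed, and the nearest-rank percentile is one of several possible definitions because the text does not specify which was used.

```python
import random


def sample_fragments(fragments, n, seed=0):
    """Randomly sample n fragments without replacement from a list.

    Seeded for reproducibility; returns every fragment if fewer than n exist.
    """
    if len(fragments) <= n:
        return list(fragments)
    return random.Random(seed).sample(fragments, n)


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile (an assumed definition for this sketch)."""
    ordered = sorted(values)
    rank = max(1, -(-len(ordered) * pct // 100))  # ceil(len * pct / 100)
    return ordered[int(rank) - 1]


def drop_high_control_peaks(control_sums, pct=99):
    """Keep peaks whose control normalized-count sum is at or below the
    given percentile.

    control_sums: dict mapping peak id -> summed normalized counts from
    the negative control (for example IgG) within that peak.
    """
    cutoff = percentile_nearest_rank(list(control_sums.values()), pct)
    return {peak: s for peak, s in control_sums.items() if s <= cutoff}


# Toy usage: 200 hypothetical peaks with control sums 1..200.
control = {f"peak{i}": float(i) for i in range(1, 201)}
kept = drop_high_control_peaks(control)
subset = sample_fragments(list(range(100)), 10)
```

In the toy example the two peaks with the highest control signal are removed, which mirrors the idea of discarding peaks that are likely to be background.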

A detailed step-by-step Data Processing and Analysis Tutorial can be found at protocols.io.

References

Klein DC, Hainer SJ (2020) Genomic methods in profiling DNA accessibility and factor localization. Chromosome Research: An International Journal on the Molecular, Supramolecular and Evolutionary Aspects of Chromosome Biology 28:69–85. https://doi.org/10.1007/s10577-019-09619-9

Decision letter

  1. Roberto Bonasio
    Reviewing Editor; University of Pennsylvania, United States
  2. Jessica K Tyler
    Senior Editor; Weill Cornell Medicine, United States
  3. Charles G Danko
    Reviewer; Cornell University, United States
  4. Junyue Cao
    Reviewer; The Rockefeller University, United States

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Thank you for submitting your article "Efficient transcription-coupled chromatin accessibility mapping in situ" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Jessica Tyler as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Charles G Danko (Reviewer #1); Junyue Cao (Reviewer #3).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

We would like to draw your attention to changes in our revision policy that we have made in response to COVID-19 (https://elifesciences.org/articles/57162). Specifically, when editors judge that a submitted work as a whole belongs in eLife but that some conclusions require a modest amount of additional new data, as they do with your paper, we are asking that the manuscript be revised to either limit claims to those supported by data in hand, or to explicitly state that the relevant conclusions require additional supporting data.

Our expectation is that the authors will eventually carry out the additional experiments and report on how they affect the relevant conclusions either in a preprint on bioRxiv or medRxiv, or if appropriate, as a Research Advance in eLife, either of which would be linked to the original paper.

Essential revisions

1) Reviewers #1 and #3 raise questions regarding the possibility of some accessible regions not being profiled by CUTAC and reviewer #1 suggests to analyze regions bound by CTCF to explore this possibility. Please include these or other analyses to address this point.

2) Reviewers #1 and #3 wonder about the quantitative correlation in addition to the spatial overlap shown. Please include these analyses.

3) While the experiments presented indicate a correlation between transcription and chromatin accessibility, the term "transcription-coupled" in the title implies that a causal link has been demonstrated, but this would require manipulations. We advise you to change the title to better reflect the content of the manuscript.

Additional points

We also think that several of the other points made by the reviewers might help you strengthen this manuscript and encourage you to consider addressing them if possible. The full reviews are included below.

Reviewer #1:

This paper by the Henikoff lab introduces CUTAC, a molecular tool that allows users to sequence DNA inside nucleosome depleted regions accessible to transposition by a protein A (pA)-Tn5 fusion protein. CUTAC builds on the Henikoff lab's exciting new CUT&TAG method. Unlike the CUT&TAG protocol published recently, however, this new work uses low-salt conditions during tagmentation, which appears to promote Tn5 transposition in nucleosome depleted regions adjacent to the primary antibody. The data demonstrating that CUTAC favors transposition inside of nucleosome depleted regions is compelling and clearly shown. Moreover, the new method affords a substantial improvement in the resolution for active regulatory regions compared with CUT&TAG for histone modifications, comparable to that of high-quality ATAC-seq data. Compared with ATAC-seq, there are potentially several compelling advantages of CUTAC, including reproducibility, side-by-side library prep with CUT&TAG, and the possibility of being selective about which open chromatin regions are sequenced (see below). Between these advantages and the authors' past success and broad community interest in the CUT&RUN and CUT&TAG family of methods, I am in favor of publication. I have several comments for the authors:

1) The authors' model is that the primary antibody recruits pA-Tn5 fusion, which then transposes DNA in adjacent accessible regions. However, not all nuclease accessible chromatin is marked by H3K4me2/3. Several lines of evidence suggest that, at least CTCF binding sites have a high level of DNase-I accessibility, but many lack histone modifications indicative of active enhancers/promoters. Most of the previous work on this subject was done using DNase-I-seq, however presumably the same signal is true for ATAC-seq? Assuming ATAC-seq shows the same signal, I am curious to know whether CUTAC data collected using K4me2/3 antibodies shows accessibility near CTCF binding sites. An easy way to get at this would be to center on CTCF sites, and break them into classes which do/do not contain evidence of either K4me2/3 or transcription using ENCODE data. If a heatmap shows similar signal between ATAC and CUTAC at CTCF sites associated with K4me2/3, but only ATAC shows signal at CTCF sites not associated with these marks, then it implies a degree of specificity for open chromatin near the primary antibody as would be expected from the authors' model.

2) Selecting which open chromatin regions to measure could be an additional, compelling advantage of CUTAC over ATAC-seq. One could imagine, for instance, using CUTAC to find open chromatin near specific kinds of transcriptional co-activators or co-repressors, Pol II, or (possibly) transcription factors. I would imagine there are a range of applications that would benefit from something like this. Is it worth saying more about this? Or do the authors think that more exploration would be required before this could be stated with any certainty (perhaps the analysis suggested in point #1, above, will help)?

3) To what extent does CUTAC recover the quantitative amount of chromatin accessibility measured by ATAC-seq? Heatmaps suggest the two are highly correlated, as would be expected. It might be useful for readers to see scatterplots that show the correlation in integrated signal near peaks. Note that I would not necessarily expect the correlation to be perfect if there is some specificity for accessible chromatin near H3K4me2/3.

4) In some parts of the text and Abstract, I came away with the impression that CUTAC involves both H3K4me2 and H3K4me3 primary antibodies in the same sample. Based on the main text, however, I think the authors are only using one of these two marks at a time. Please clarify.

5) In Figure 5, please clarify aspects of the CUTAC experiment that were explored in earlier figures were used. Was it me2 or me3? With or without hexanediol or dimethylformamide?

Reviewer #2:

In this manuscript Henikoff et al. present a modification of CUT&Tag, a method that they developed previously to profile chromatin epitopes genome-wide. Here, they show that CUT&Tag can be applied to profile transcription-coupled chromatin accessibility sites by simply altering the salt concentration during tagmentation that changes the biochemical binding preferences of the pA-Tn5 transposase. The authors have creatively shown that this method can be performed at home to yield the same results as when performed in the lab, an interesting feature given the current restrictions on laboratory occupancy. The authors claim that while CUTAC takes longer to perform than ATAC-Seq, it gives better quality data than ATAC-seq based on the variation of ATAC-seq data quality between laboratories. The authors assume that all laboratories that perform ATAC-Seq are equally proficient in the technique and that variation is simply due to the technique itself and not the experimentalist. Therefore, it is unclear if CUTAC is indeed superior to ATAC-Seq. Together with the fact that CUTAC is a very minor modification of CUT&Tag, I am not convinced that it is a sufficient advance to warrant publication as a research tool in eLife.

Reviewer #3:

In this manuscript, Henikoff et al. developed a novel approach for in-situ mapping of transcription-coupled chromatin accessibility. Compared with conventional ATAC-seq, this method displays several unique advantages, including high sensitivity and compatibility with parallel Cut&tag profiling. I am rather enthusiastic about the release of this work. Also it is highly appreciated that the authors already uploaded the detailed protocol to protocol.io. For publication in eLife, this work only has several points to be clarified as shown below:

1) In Figure 1AB, CUT&Tag-direct with different starting nuclei numbers gave very different fragment size distributions. Is there a specific reason for this? How does the input nuclei/cell number affect the genome-wide signals?

2) For the divergent outputs from CUT&Tag and CUTAC, the manuscript implies that this is due to different Tn5-DNA binding affinities between low and high salt conditions. Is it also possible that the high salt simply broadens the space between nearby nucleosomes for more efficient tagmentation?

3) My major concern for using this approach as a substitution of ATAC-seq is that this method may introduce bias with the use of antibody linked Tn5. Are there enriched H3K4me2 signals in the peaks detected only by H3K4me2 CUTAC compared with peaks detected only by Omni-ATACseq?

4) Figure 5 is helpful for evaluating technique efficiency and qualities. However, the number of peaks per mapped fragments is affected by the input cell/nuclei number and the library's complexity. It would be great if these comparisons are made based on the same number of input nuclei. This is also helpful for comparing the efficiencies of different approaches. Also, how similar is the CUTAC dataset compared with all other ATAC-seq datasets by correlation analysis?

5) For the broad application of the technique, it would be great if the authors can compare the library preparation cost per sample between this technique and conventional ATAC-seq.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Efficient chromatin accessibility mapping in situ by nucleosome-tethered tagmentation" for further consideration by eLife. Your revised article has been evaluated by Jessica Tyler (Senior Editor) and a Reviewing Editor.

The manuscript has been improved but there are some remaining issues that need to be addressed before acceptance, as outlined below:

1) Regarding H3K4me2 and ATAC-seq, your response and the revised text state "Using an interval equal to average peak width at half-height, 51.3% of CUTAC and 50.0% of Omni-ATAC sites overlap ATAC_ENCODE peaks." As this was one of the three key requests of the reviewers, the analysis supporting these numbers should be shown, perhaps as additional panel to Figure 4 (ATAC encode peaks) or as a supplementary figure.

2) In Figure 4F, the estimate of 90% coverage of these sites with CUTAC seems generous based on the heatmap. Could you add a horizontal line to clearly indicate where you believe the cutoff is? Also, your response mentions that you aligned Omni-seq to these sites but it's not shown in the new figure. The reviewer had specifically asked to compare H3K4me2 positive and negative in Omni-seq and CUT&tag.

3) Did you submit Table 1? I could not find it.

https://doi.org/10.7554/eLife.63274.sa1

Author response

Essential revisions

1) Reviewers #1 and #3 raise questions regarding the possibility of some accessible regions not being profiled by CUTAC and reviewer #1 suggests to analyze regions bound by CTCF to explore this possibility. Please include these or other analyses to address this point.

We address these points with additional analyses detailed below. For CTCF, we show that CUTAC detects ~90% of CTCF DNaseI hypersensitive sites (new Figure 4F).

2) Reviewers #1 and #3 wonder about the quantitative correlation in addition to the spatial overlap shown. Please include these analyses.

We now provide these analyses, which are described below, including additional data (new Figure 4E and Figure 4—figure supplement 1).

3) While the experiments presented indicate a correlation between transcription and chromatin accessibility, the term "transcription-coupled" in the title implies that a causal link has been demonstrated, but this would require manipulations. We advise you to change the title to better reflect the content of the manuscript.

What we meant by the title was that the mapping is transcription-coupled, not chromatin accessibility itself. But we agree that the title can be misinterpreted in that way, and so we have changed it to “Efficient chromatin accessibility mapping in situ by nucleosome-tethered tagmentation”.

Additional points

We also think that several of the other points made by the reviewers might help you strengthen this manuscript and encourage you to consider addressing them if possible. The full reviews are included below.

We thank the reviewers for their many thoughtful comments and we have addressed each of them with new data and analyses together with textual changes as requested.

Reviewer #1:

This paper by the Henikoff lab introduces CUTAC, a molecular tool that allows users to sequence DNA inside nucleosome depleted regions accessible to transposition by a protein A (pA)-Tn5 fusion protein. CUTAC builds on the Henikoff lab's exciting new CUT&TAG method. Unlike the CUT&TAG protocol published recently, however, this new work uses low-salt conditions during tagmentation, which appears to promote Tn5 transposition in nucleosome depleted regions adjacent to the primary antibody. The data demonstrating that CUTAC favors transposition inside of nucleosome depleted regions is compelling and clearly shown. Moreover, the new method affords a substantial improvement in the resolution for active regulatory regions compared with CUT&TAG for histone modifications, comparable to that of high-quality ATAC-seq data. Compared with ATAC-seq, there are potentially several compelling advantages of CUTAC, including reproducibility, side-by-side library prep with CUT&TAG, and the possibility of being selective about which open chromatin regions are sequenced (see below). Between these advantages and the authors' past success and broad community interest in the CUT&RUN and CUT&TAG family of methods, I am in favor of publication. I have several comments for the authors:

1) The authors' model is that the primary antibody recruits pA-Tn5 fusion, which then transposes DNA in adjacent accessible regions. However, not all nuclease accessible chromatin is marked by H3K4me2/3. Several lines of evidence suggest that, at least CTCF binding sites have a high level of DNase-I accessibility, but many lack histone modifications indicative of active enhancers/ promoters.

The evidence that I am familiar with is based on H3K4me2 and H3K4me3 ChIP-seq signal, which is smeared-out relative to DNaseI or ATAC-seq signals, and so weak positives for ATAC-seq may be below the background level for ChIP-seq. The same relationship is seen when comparing CUTAC with CUT&Tag for H3K4me2, where the only difference is in the tagmentation buffer (Figure 4C compare left and middle panels). To test whether there is a class of nuclease-accessible chromatin not detected by CUTAC we aligned H3K4me2 CUTAC and Omni-ATAC data over ATAC_ENCODE peaks. Using an interval equal to average peak width at half-height, 51.3% of CUTAC and 50.0% of Omni-ATAC sites overlap ATAC_ENCODE peaks. If there were a class of ATAC_ENCODE peaks not adjacent to H3K4me2, then these would show up as a lower percentage of overlap with CUTAC than with Omni-ATAC. In light of the moderate correlation between CUTAC and Omni-ATAC (R² = 0.53, see below), it is likely that technical variation alone can account for the degree of peak overlap we observed.
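
One way to compute this kind of overlap percentage is sketched below. The exact procedure behind the reported figures is not described in the response, so this is an assumption: each query site is represented by its midpoint, a window of the chosen interval width is centered on it, and the site counts as overlapping if that window intersects any reference peak. All names are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def percent_overlapping(query_sites, reference_peaks, interval):
    """Percentage of query sites whose centered window of width `interval`
    overlaps at least one reference peak.

    query_sites:     list of (chrom, midpoint)
    reference_peaks: list of (chrom, start, end), half-open coordinates
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in reference_peaks:
        by_chrom[chrom].append((start, end))
    merged = {c: merge_intervals(iv) for c, iv in by_chrom.items()}
    ends = {c: [e for _, e in iv] for c, iv in merged.items()}

    half = interval / 2
    hits = 0
    for chrom, mid in query_sites:
        lo, hi = mid - half, mid + half
        peaks = merged.get(chrom, [])
        # Index of the first merged peak that ends after the window start.
        i = bisect_right(ends.get(chrom, []), lo)
        if i < len(peaks) and peaks[i][0] < hi:
            hits += 1
    return 100.0 * hits / len(query_sites) if query_sites else 0.0


# Toy data: two of the four query sites fall near a reference peak.
reference = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 60)]
queries = [("chr1", 90), ("chr1", 500), ("chr2", 55), ("chr3", 10)]
pct = percent_overlapping(queries, reference, interval=40)
```

Merging the reference peaks first keeps the lookup to a single binary search per query site, which matters when both peak sets contain tens of thousands of entries.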

Most of the previous work on this subject was done using DNase-I-seq, however presumably the same signal is true for ATAC-seq? Assuming ATAC-seq shows the same signal, I am curious to know whether CUTAC data collected using K4me2/3 antibodies shows accessibility near CTCF binding sites. An easy way to get at this would be to center on CTCF sites, and break them into classes which do/do not contain evidence of either K4me2/3 or transcription using ENCODE data. If a heatmap shows similar signal between ATAC and CUTAC at CTCF sites associated with K4me2/3, but only ATAC shows signal at CTCF sites not associated with these marks, then it implies a degree of specificity for open chromatin near the primary antibody as would be expected from the authors' model.

We thank Reviewer 1 for suggesting this interesting analysis. To rigorously test the assertion that there are CTCF sites that show DNaseI accessibility but lack active histone marks, we used the set of 9403 19-bp CTCF motifs with a DNaseI hypersensitive site in both K562 and HeLa cells collected in an earlier study (Skene and Henikoff, 2015: DOI: 10.7554/eLife.09225.001), and aligned Omni-ATAC and H3K4me2 CUTAC and CUT&Tag fragments in heatmaps. We excluded nucleosomal fragments by using only ≤120 bp fragments. We observed that ~90% of the DNaseI hypersensitive CTCF sites were enriched for CUTAC signal relative to flanking regions. This suggests equivalence of the CUTAC and DNaseI hypersensitive CTCF sites, and we have added a new paragraph and new figure panel (Figure 4F) to this section. We also found that the H3K4me2 CUT&Tag sample showed detectable signal at only ~50% of the CTCF sites. This improvement in detection of CTCF sites by H3K4me2 CUTAC over H3K4me2 CUT&Tag is further evidence that the H3K4me2 mark is present on nucleosomes around hypersensitive sites, but is probably too weak to be detected above background in ChIP-seq experiments.
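
The size selection mentioned above (keeping only ≤120 bp fragments) amounts to a simple length filter. The sketch below assumes fragments are stored as (chrom, start, end) tuples with half-open coordinates, so that length is end minus start; the function name and that coordinate convention are assumptions of this illustration.

```python
def short_fragments(fragments, max_length=120):
    """Keep fragments no longer than max_length bp.

    fragments: iterable of (chrom, start, end) with half-open coordinates,
    so fragment length is end - start.
    """
    return [frag for frag in fragments if frag[2] - frag[1] <= max_length]


# Toy usage: the 121-bp fragment is excluded, the other two are kept.
frags = [("chr1", 0, 120), ("chr1", 0, 121), ("chr2", 10, 60)]
subnucleosomal = short_fragments(frags)
```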

2) Selecting which open chromatin regions to measure could be an additional, compelling advantage of CUTAC over ATAC-seq. One could imagine, for instance, using CUTAC to find open chromatin near specific kinds of transcriptional co-activators or co-repressors, Pol II, or (possibly) transcription factors. I would imagine there are a range of applications that would benefit from something like this. Is it worth saying more about this? Or do the authors think that more exploration would be required before this could be stated with any certainty (perhaps the analysis suggested in point #1, above, will help)?

We agree, and now mention this possibility to close the new paragraph describing the CTCF site comparisons: “This improvement in detection of CTCF sites by H3K4me2 CUTAC over H3K4me2 CUT&Tag illustrates the potential of using ≤120-bp CUTAC fragment data to improve the resolution and sensitivity of transcription factor binding site motif detection.”

3) To what extent does CUTAC recover the quantitative amount of chromatin accessibility measured by ATAC-seq? Heatmaps suggest the two are highly correlated, as would be expected. It might be useful for readers to see scatterplots that show the correlation in integrated signal near peaks. Note that I would not necessarily expect the correlation to be perfect if there is some specificity for accessible chromatin near H3K4me2/3.

This is an excellent suggestion. Below is the requested scatterplot (R² = 0.53) on a log10 scale to capture the five-order-of-magnitude dynamic range (new Figure 4—figure supplement 1). We did not detect off-diagonal clusters that would indicate a subset of peaks found by one but not the other dataset. We now include a correlation matrix to more quantitatively illustrate the relationships between the H3K4me2 and H3K4me3 CUTAC and ATAC-seq datasets as requested by reviewer 3 (new Figure 4E).
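For readers who want to reproduce this kind of comparison on their own data, a minimal sketch follows. It assumes that R² is the squared Pearson correlation of log10-transformed per-peak signal, which the response does not spell out. The pseudocount of 1 (to avoid taking the log of zero) and the function name are also assumptions made for the example.

```python
import math


def r_squared_log10(x, y, pseudocount=1.0):
    """Squared Pearson correlation of log10(x + pseudocount) and log10(y + pseudocount).

    x, y: equal-length sequences of non-negative per-peak signal values.
    The pseudocount is an illustrative choice, not taken from the paper.
    """
    lx = [math.log10(v + pseudocount) for v in x]
    ly = [math.log10(v + pseudocount) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    sxx = sum((a - mean_x) ** 2 for a in lx)
    syy = sum((b - mean_y) ** 2 for b in ly)
    return (sxy * sxy) / (sxx * syy)
```

Working on the log scale keeps the few very strong peaks from dominating the correlation when the signal spans several orders of magnitude.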

4) In some parts of the text and Abstract, I came away with the impression that CUTAC involves both H3K4me2 and H3K4me3 primary antibodies in the same sample. Based on the main text, however, I think the authors are only using one of these two marks at a time. Please clarify.

Fixed (one at a time).

5) In Figure 5, please clarify which aspects of the CUTAC experiment that were explored in earlier figures were used. Was it me2 or me3? With or without hexanediol or dimethylformamide?

It was H3K4me2 (fixed), but we now provide equivalent data for H3K4me3 showing the peak-narrowing effect of 1,6-hexanediol and corresponding improvements in peak-calling (new Figure 3—figure supplement 1).

Reviewer #2:

In this manuscript Henikoff et al. present a modification of CUT&Tag, a method that they developed previously to profile chromatin epitopes genome-wide. Here, they show that CUT&Tag can be applied to profile transcription-coupled chromatin accessibility sites by simply altering the salt concentration during tagmentation that changes the biochemical binding preferences of the pA-Tn5 transposase. The authors have creatively shown that this method can be performed at home to yield the same results as when performed in the lab, an interesting feature given the current restrictions on laboratory occupancy. The authors claim that while CUTAC takes longer to perform than ATAC-Seq, it gives better quality data than ATAC-seq based on the variation of ATAC-seq data quality between laboratories. The authors assume that all laboratories that perform ATAC-Seq are equally proficient in the technique and that variation is simply due to the technique itself and not the experimentalist. Therefore, it is unclear if CUTAC is indeed superior to ATAC-Seq. Together with the fact that CUTAC is a very minor modification of CUT&Tag, I am not convinced that it is a sufficient advance to warrant publication as a research tool in eLife.

From a user’s perspective, the fact that a minor modification converts a method for mapping specific chromatin features into one for precisely mapping accessibility in the same experiment is a major advantage. As for the question of experimentalist proficiency, we also showed that CUTAC at home outperforms ATAC-seq generated by the ENCODE project in their recent Nature consortium publication (Moore et al., 2020). The poorest K562 ATAC-seq dataset is from another recent Nature paper (PMID: 32728247, June 2020) performed by the same lab that is listed for generating the ENCODE data (Snyder lab, Stanford). It is hard to attribute this extreme variation in ATAC-seq data quality to experimentalist skill when both the excellent and the poor datasets are from the same expert group published a month apart. To further support our assertion that nucleosome tethering of Tn5 is advantageous over free Tn5 tagmentation, our revision now includes a comparison between CUTAC, an ATAC-seq dataset from my own lab (PMID: 31253573) and one from PMID: 32728249 for human H1 ES cells. We find that CUTAC (FRiP = 0.28) is better than our own ATAC-seq dataset (FRiP = 0.16) and much better than the H1 ATAC-seq dataset from PMID: 32728249 (FRiP = 0.014). To reduce the possibility that readers will overlook our comparisons between CUTAC and ATAC-seq from expert groups, we have promoted the table reporting these statistics from part of a supplement to Table 1 in the main body of the paper.
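FRiP (fraction of reads in peaks) is the signal-to-noise statistic quoted above. A minimal sketch of one way to compute it is given below. It assumes that fragments and peaks are half-open (start, end) intervals on a single chromosome, that peaks do not overlap one another, and that a fragment counts as "in a peak" if it overlaps a peak by at least one base. Pipelines differ on these conventions, so this is not necessarily the definition used for Table 1, and the function name is hypothetical.

```python
from bisect import bisect_right


def frip(fragments, peaks):
    """Fraction of fragments overlapping any peak by at least 1 bp.

    fragments, peaks: lists of (start, end) half-open intervals on one
    chromosome. Peaks are assumed not to overlap each other.
    """
    peaks = sorted(peaks)
    starts = [p[0] for p in peaks]
    in_peaks = 0
    for s, e in fragments:
        # Index of the last peak that starts at or before the fragment's last base.
        i = bisect_right(starts, e - 1) - 1
        # That peak overlaps the fragment only if it ends after the fragment starts.
        if i >= 0 and peaks[i][1] > s:
            in_peaks += 1
    return in_peaks / len(fragments)
```

Because FRiP depends on the peak set as well as on the fragments, values are most comparable when every dataset is processed with the same peak caller and at the same sampling depth.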

Reviewer #3:

In this manuscript, Henikoff et al. developed a novel approach for in-situ mapping of transcription-coupled chromatin accessibility. Compared with conventional ATAC-seq, this method displays several unique advantages, including high sensitivity and compatibility with parallel CUT&Tag profiling. I am rather enthusiastic about the release of this work. Also, it is highly appreciated that the authors already uploaded the detailed protocol to protocols.io. For publication in eLife, this work only has several points to be clarified as shown below:

1) In Figure 1AB, CUT&Tag-direct with different starting nuclei numbers gave very different fragment size distributions. Is there a specific reason for this? How does the input nuclei/cell number affect the genome-wide signals?

To explain this difference, we added the following sentence to the Figure 1 legend: “The higher yield of smaller fragments with decreasing cell number suggests that reducing the total available binding sites increases the binding of antibody and/or pAG-Tn5 in limiting amounts.”

2) For the divergent outputs from CUT&Tag and CUTAC, the manuscript implies that this is due to different Tn5-DNA binding affinities between low and high salt conditions. Is it also possible that the high salt simply broadens the space between nearby nucleosomes for more efficient tagmentation?

No, because the CUTAC signal is centered over annotated NDRs, not over nucleosomes marked by H3K4me2. To better illustrate that the shift is due to low-salt tagmentation and not to mobility of marked nucleosomes with 300 mM NaCl, we show that the mapping of H3K4me2 by CUT&Tag is similar to mapping by ENCODE ChIP-seq where no such mobility is possible. We now make this point in the text and in the new Figure 6—figure supplement 1.

3) My major concern for using this approach as a substitution of ATAC-seq is that this method may introduce bias with the use of antibody linked Tn5. Are there enriched H3K4me2 signals in the peaks detected only by H3K4me2 CUTAC compared with peaks detected only by Omni-ATACseq?

This point was raised by reviewer 1 and clarified above, including the new analysis showing CUTAC detection of ~90% of annotated DNaseI hypersensitive CTCF sites (new Figure 4E).

4) Figure 5 is helpful for evaluating technique efficiency and qualities. However, the number of peaks per mapped fragments is affected by the input cell/nuclei number and the library's complexity. It would be great if these comparisons are made based on the same number of input nuclei. This is also helpful for comparing the efficiencies of different approaches.

This issue is addressed in the expanded Figure 3—figure supplement 1, where we have compared nuclei from 30,000 cells for CUTAC using each of the three tagmentation variations to ATAC-seq with 50,000 nuclei and show the estimated size of each CUTAC library produced.

Also, how similar is the CUTAC dataset compared with all other ATAC-seq datasets by correlation analysis?

As described in response to reviewer 1, we now include a representative scatterplot and a correlation matrix (shown above) to more quantitatively illustrate the relationships between the H3K4me2 and H3K4me3 CUTAC and ATAC-seq datasets.

5) For the broad application of the technique, it would be great if the authors can compare the library preparation cost per sample between this technique and conventional ATAC-seq.

For each CUTAC sample in a 32-sample experiment over an 8-hour day (all CUTAC@home experiments were done on this scale) we used 0.25 µL primary and 0.25 µL secondary antibody (at 1:100) and ~$10 for Epicypher pAG-Tn5 per 25 µL incubation volume, twice that for 50 µL sample volumes. Other materials, such as Concanavalin A and SPRI paramagnetic beads (~$1 per sample for both), pipette tips (~$1 per sample) and reagents increase the cost of library preparation to perhaps $15 per sample. Based on an ATAC-seq experiment from my lab (Meers et al., Mol Cell, 2019, PMID: 31253573), the cost of materials is similar, and both procedures would take about a full day if on the same scale starting with nuclei and finishing with sequencing-ready libraries (Michael Meers, personal communication). The 32-sample scale is typical for several antibodies multiplied by the number of CUT&Tag samples run in parallel with CUTAC, but is probably too high for ATAC-seq. However, the most important cost-differential is the number of sequencing reads that are required to call peaks, and this is where the number of peaks called (sensitivity) and the fraction of reads in peaks (signal-to-noise) make a big difference. For example, we sequenced to a depth of 4.5 million paired-end 25-bp reads for our published H1 ES cell ATAC-seq data (~20,000 peaks with FRiP = 0.16 for 3.2 million fragments), whereas for the H1 ES cell ATAC-seq data from PMID: 32728247 (~450 peaks with FRiP = 0.014 for 3.2 million fragments) the authors sequenced to a depth of 45 million paired-end 100-bp reads. Our in-house sequencing cost was ~$25 per sample, which exceeds the estimated cost of library preparation.
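The sequencing comparison above can be made concrete with a short calculation. The sketch below uses only the figures quoted in this response and assumes that "4.5 million" and "45 million" refer to read pairs, each contributing two reads of the stated length. That reading, and the function and variable names, are assumptions made for the example.

```python
def sequenced_bases(read_pairs, read_length):
    """Total bases sequenced, assuming two reads of read_length per pair."""
    return read_pairs * 2 * read_length


def peaks_per_million(peaks, fragments):
    """Peaks called per million mapped fragments."""
    return peaks / (fragments / 1e6)


# Figures quoted in the response; "read_pairs" is an assumed interpretation.
in_house = {"read_pairs": 4_500_000, "read_length": 25,
            "peaks": 20_000, "fragments": 3_200_000}
external = {"read_pairs": 45_000_000, "read_length": 100,
            "peaks": 450, "fragments": 3_200_000}

for name, d in (("in-house H1 ATAC-seq", in_house), ("external H1 ATAC-seq", external)):
    bases = sequenced_bases(d["read_pairs"], d["read_length"])
    yield_ = peaks_per_million(d["peaks"], d["fragments"])
    print(f"{name}: {bases / 1e9:.3f} Gb sequenced, {yield_:.1f} peaks per million fragments")

depth_ratio = (sequenced_bases(external["read_pairs"], external["read_length"])
               / sequenced_bases(in_house["read_pairs"], in_house["read_length"]))
```

Under that assumption the external dataset used 40 times as many sequenced bases as the in-house dataset, while yielding far fewer peaks per million fragments at equal sampling depth. This is why sequencing depth, rather than library preparation, tends to dominate the per-sample cost.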

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

The manuscript has been improved but there are some remaining issues that need to be addressed before acceptance, as outlined below:

1) Regarding H3K4me2 and ATAC-seq, your response and the revised text state "Using an interval equal to average peak width at half-height, 51.3% of CUTAC and 50.0% of Omni-ATAC sites overlap ATAC_ENCODE peaks." As this was one of the three key requests of the reviewers, the analysis supporting these numbers should be shown, perhaps as an additional panel to Figure 4 (ATAC_ENCODE peaks) or as a supplementary figure.

We have added Figure 4—figure supplement 2 with heatmaps centered over the ATAC_ENCODE peaks made from each of the data files used for the Figure 4 heatmaps.

2) In Figure 4F, the estimate of 90% coverage of these sites with CUTAC seems generous based on the heatmap. Could you add a horizontal line to clearly indicate where you believe the cutoff is? Also, your response mentions that you aligned Omni-seq to these sites but it's not shown in the new figure. The reviewer had specifically asked to compare H3K4me2 positive and negative in Omni-seq and CUT&Tag.

We have inserted arrowheads on the left of each heatmap to indicate the cutoffs. We have also added the corresponding ≤120 bp heatmaps for Omni-ATAC and ATAC_ENCODE in Figure 4F. The precise cutoffs are: H3K4me2 CUT&Tag (53%), CUTAC (86%), Omni-ATAC (82%) and ATAC_ENCODE (55%), and we have modified the article file and the response accordingly. To avoid confusion, we removed the >120 bp heatmaps, which had been excluded from this analysis, as pointed out in the text.

3) Did you submit Table 1? I could not find it.

We apologize for the inadvertent omission. Table 1 is now in the article file.

https://doi.org/10.7554/eLife.63274.sa2

Article and author information

Author details

  1. Steven Henikoff

    1. Basic Sciences Division Fred Hutchinson Cancer Research Center, Seattle, United States
    2. Howard Hughes Medical Institute, Seattle, United States
    Contribution
    Conceptualization, Resources, Formal analysis, Funding acquisition, Validation, Investigation, Methodology, Writing - original draft, Writing - review and editing
    For correspondence
    steveh@fhcrc.org
    Competing interests
    has filed patent applications related to this work.
    ORCID: 0000-0002-7621-8685
  2. Jorja G Henikoff

    Basic Sciences Division Fred Hutchinson Cancer Research Center, Seattle, United States
    Contribution
    Data curation, Software, Formal analysis, Writing - review and editing
    Competing interests
    No competing interests declared
  3. Hatice S Kaya-Okur

    Basic Sciences Division Fred Hutchinson Cancer Research Center, Seattle, United States
    Present address
    Altius Institute for Biomedical Sciences, Seattle, United States
    Contribution
    Investigation, Methodology, Writing - review and editing
    Competing interests
    has filed patent applications related to this work.
  4. Kami Ahmad

    Basic Sciences Division Fred Hutchinson Cancer Research Center, Seattle, United States
    Contribution
    Funding acquisition, Validation, Investigation, Methodology, Writing - review and editing
    Competing interests
    No competing interests declared

Funding

National Institutes of Health (R01 HG010492)

  • Steven Henikoff

National Institutes of Health (R01 GM108699)

  • Kami Ahmad

Chan Zuckerberg Initiative (Fred Hutch HCA Seed Network)

  • Steven Henikoff
  • Kami Ahmad

Howard Hughes Medical Institute (Henikoff)

  • Steven Henikoff

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Terri Bryson and Christine Codomo for sample processing, the Fred Hutch Genomics Shared Resource for DNA sequencing, members of our laboratory for helpful discussions and Paul Talbert for critically reading the manuscript. SH is an Investigator of the Howard Hughes Medical Institute. This work was supported by the Howard Hughes Medical Institute (SH), grants R01 HG010492 (SH) and R01 GM108699 (KA) from the National Institutes of Health, and an HCA Seed Network grant from the Chan Zuckerberg Initiative (SH).

Senior Editor

  1. Jessica K Tyler, Weill Cornell Medicine, United States

Reviewing Editor

  1. Roberto Bonasio, University of Pennsylvania, United States

Reviewers

  1. Charles G Danko, Cornell University, United States
  2. Junyue Cao, The Rockefeller University, United States

Publication history

  1. Received: September 19, 2020
  2. Accepted: November 13, 2020
  3. Accepted Manuscript published: November 16, 2020 (version 1)
  4. Version of Record published: December 7, 2020 (version 2)

Copyright

© 2020, Henikoff et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.


