Chromatin endogenous cleavage provides a global view of yeast RNA polymerase II transcription kinetics
eLife Assessment
This valuable study compares ChIP-seq and ChEC-seq2 techniques to investigate RNA polymerase II (RNAPII) binding patterns in yeast, revealing that ChEC-seq2 captures distinct regulatory events associated with active transcription missed by ChIP-seq. The authors use ChEC-seq2 data to build a stochastic model of RNAPII kinetics, providing convincing new insights into transcription regulation and the role of the nuclear pore complex. The paper highlights the importance of careful methodological comparisons in understanding RNAPII dynamics.
https://doi.org/10.7554/eLife.100764.3.sa0
Valuable: Findings that have theoretical or practical implications for a subfield
Convincing: Appropriate and validated methodology in line with current state-of-the-art
During the peer-review process the editor and reviewers write an eLife Assessment that summarises the significance of the findings reported in the article (on a scale ranging from landmark to useful) and the strength of the evidence (on a scale ranging from exceptional to inadequate).
Abstract
Chromatin immunoprecipitation (ChIP-seq) is the most common approach to observe global binding of proteins to DNA in vivo. The occupancy of transcription factors (TFs) from ChIP-seq agrees well with that from an alternative method, chromatin endogenous cleavage (ChEC-seq2). However, ChIP-seq and ChEC-seq2 reveal strikingly different patterns of enrichment of yeast RNA polymerase II (RNAPII). We hypothesized that this reflects distinct populations of RNAPII, some of which are captured by ChIP-seq and some of which are captured by ChEC-seq2. RNAPII association with enhancers and promoters, predicted from biochemical studies, is detected well by ChEC-seq2 but not by ChIP-seq. Enhancer/promoter-bound RNAPII correlates with transcription levels and matches predicted occupancy based on published rates of enhancer recruitment, preinitiation assembly, initiation, elongation, and termination. The occupancy from ChEC-seq2 allowed us to develop a stochastic model for global kinetics of RNAPII transcription which captured both the ChEC-seq2 data and changes upon chemical-genetic perturbations to transcription. Finally, RNAPII ChEC-seq2 and kinetic modeling suggest that a mutation in the Gcn4 transcription factor that blocks interaction with the nuclear pore complex (NPC) destabilizes promoter-associated RNAPII without altering its recruitment to the enhancer.
Introduction
In eukaryotes, differential expression of the genome is achieved primarily through regulated RNA polymerase II (RNAPII) transcription. Since its discovery (Roeder and Rutter, 1969), transcription by RNAPII has been the focus of intense study using a variety of methods. From biochemical, structural, and genetic studies, a consensus has emerged for the mechanism of RNAPII transcription (Figure 1; Schier and Taatjes, 2020). For genes that are dependent on enhancers, sequence-specific transcription factors (ssTFs) bind to enhancers and recruit coactivators like histone acetyltransferases and chromatin remodelers as well as Mediator (Fishburn et al., 2005; Green, 2005; Prochasson et al., 2003; Ptashne and Gann, 1997). Coactivators facilitate the removal of nucleosomes from the promoter, allowing binding of TFIID (TATA binding protein), which recruits additional general transcription factors (GTFs; TFIIA, TFIIB, TFIIF) and ultimately RNAPII (Figure 1). Last, TFIIE and TFIIH are recruited to complete the formation of the preinitiation complex (PIC). Through Mediator, ssTFs interact with RNAPII to stabilize the PIC (Abdella et al., 2021; Richter et al., 2022). TFIIH stimulates initiation both by unwinding the DNA and by phosphorylating the RNAPII carboxyl terminal domain on Serine 5 (Figure 1, inset; Cadena and Dahmus, 1987; Komarnitsky et al., 2000; Lu et al., 1991). In metazoans, regulatory factors (negative elongation factor and DRB sensitivity-inducing factor [DSIF]) cause RNAPII to pause after initiation, leading to an accumulation of RNAPII downstream of the transcription start site (TSS) (Adelman and Lis, 2012; Core and Adelman, 2019). The P-TEFb kinase releases RNAPII from pausing by phosphorylation of these factors and RNAPII on Serine 2, leading to elongation (Marshall and Price, 1995). Finally, transcription of a polyadenylation sequence both causes RNAPII to pause and stimulates cleavage and polyadenylation (Figure 1; Nag et al., 2007; Orozco et al., 2002).
To study transcription in vivo, the most common approach has been chromatin immunoprecipitation (ChIP), in which protein-DNA complexes are stabilized through formaldehyde crosslinking and recovered by immunoprecipitation (Solomon et al., 1988). Coupled with next-generation sequencing, ChIP-seq has been widely adopted to explore the genome-wide interactions of RNAPII and co-regulators (Barski et al., 2007; Mikkelsen et al., 2007; Welboren et al., 2009). The occupancy of RNAPII over transcribed regions correlates with nascent transcription. Exonuclease footprinting of RNAPII over DNA (ChIP-exo; Rhee and Pugh, 2012) or RNA (NET-seq; Churchman and Weissman, 2011) and nuclear run-on (PRO-seq; Kwak et al., 2013) have provided high-resolution maps of RNAPII binding to the genome. Together, such methods highlight paused and elongating RNAPII and suggest that very little RNAPII is associated with the promoter in the preinitiation state (Core et al., 2012).
The dynamics of RNAPII transcription in vivo has also been explored by tracking single molecules of RNAPII (or co-regulators) or individual transcripts. Such experiments offer a different view of transcription. Fluorescence recovery after photobleaching over arrays of inducible reporter genes reveals that a small fraction (~13%) of the RNAPII molecules that assemble at the promoter initiates transcription (Darzacq et al., 2007; Stasevich et al., 2014). Monitoring the production of single molecules of mRNA from either such arrays or single genes suggests that RNAPII elongation rate is ~1000–3000 bp/min and that termination is associated with a prolonged pause (50–70 s; Larson et al., 2011; Zenklusen et al., 2008). Single-molecule tracking of RNAPII and GTFs reveals that ~40% of RNAPII is chromatin-associated and that when initiation is blocked, the dwell time of RNAPII (presumably at the promoter) is ~10 s (Nguyen et al., 2021). Because these observations would predict that RNAPII levels at the promoter and terminator (as well as pausing sites) should be higher than those over the transcribed region, they are difficult to reconcile with the RNAPII enrichments observed by ChIP-seq.
Single-molecule tracking of ssTF and RNAPII binding to enhancers and promoters in vitro offers another important perspective. In yeast nuclear extracts, ssTF binding to enhancers (also called upstream activating sequences [UASs]) has been observed. Consistent with the consensus model, ssTFs stimulate RNAPII and PIC recruitment to a neighboring promoter (Rosen et al., 2020). Surprisingly, RNAPII and certain PIC components are recruited by ssTFs even in the absence of a promoter (Baek et al., 2021). This suggests that RNAPII is recruited to enhancers/UASs by ssTFs, perhaps through interactions with Mediator, which allows efficient promoter loading of PIC components. However, the association of RNAPII and PIC factors with UASs has not been observed by ChIP-seq.
An alternative to ChIP is chromatin endogenous cleavage (ChEC), in which endogenous proteins of interest are tagged with micrococcal nuclease (MNase; Schmid et al., 2004). Their association with the genome can be monitored by permeabilizing cells and adding calcium to activate MNase (Schmid et al., 2004). The cleavage events can be identified by next-generation sequencing (ChEC-seq2; VanBelzen et al., 2024; Zentner et al., 2015). To date, ChEC-seq2 has only been performed in budding yeast. For ssTFs and nuclear pore proteins, ChEC-seq2 gives results very similar to ChIP-seq or ChIP-exo (Ge et al., 2024; VanBelzen et al., 2024). Likewise, ChEC-seq2 with coactivators and Mediator resembles ChIP (Bruzzone et al., 2018; Grünberg et al., 2016; Saleh et al., 2022). However, we find that ChEC-seq2 with RNAPII gives a pattern of enrichment that is notably different from that observed using ChIP-seq. Whereas ChIP shows strong enrichment of RNAPII over the transcribed region and little enrichment over the promoter or upstream, ChEC-seq2 shows strong enrichment of RNAPII over the promoter, UAS, and 3’UTR and little signal over the transcribed region. The ChEC-seq2 enrichment of RNAPII over promoters correlates with both nascent transcription (as measured by SLAM-seq; Herzog et al., 2017) and ChIP-seq enrichment of RNAPII over coding regions, suggesting that it reflects active RNAPII. RNAPII association with UAS regions is strongest for genes that recruit coactivators and is dependent on ssTFs.
The occupancy of RNAPII over UASs and promoters from ChEC-seq2, combined with published RNAPII dynamics, allowed us to develop a stochastic model for the global kinetics of RNAPII transcription. This model and the ChEC-seq2 data offer insight into the effects of genetic perturbations that block transcription globally and suggest that the nuclear pore complex promotes transcription by stabilizing promoter-associated RNAPII. This work suggests that ChEC captures important regulatory events associated with transcription that are missed by ChIP.
Results
ChEC-seq2 and ChIP-seq in S. cerevisiae yield distinct RNAPII enrichment patterns
To assess ChEC-seq2 with RNAPII, MNase was inserted at the carboxyl terminus of the endogenous genes encoding the RNAPII subunits Rpo21 (also called Rpb1) and Rpb3 (Zentner et al., 2015). These yeast strains, along with a control strain expressing soluble, nuclear MNase (sMNase), were grown in rich medium, harvested, and permeabilized to induce MNase activity. Genomic DNA was prepared and converted into ChEC-seq2 libraries (VanBelzen et al., 2024). For comparison, we selected a high-quality RNAPII (Rpb1) ChIP-seq dataset from cells grown in rich medium (Vijjamarri et al., 2023b; GEO Accession GSE220578) that used the 8WG16 antibody (Thompson et al., 1989), which recognizes the carboxyl terminal domain of Rpb1 (Komarnitsky et al., 2000). Finally, to confirm that the MNase fusions did not affect RNAPII association with the genome, we also generated ChIP-seq data using this antibody from the yeast strains with and without Rpb1-MNase (Figure 2—figure supplement 2). Over transcriptionally active genes like ILV5, ChIP-seq gave strong enrichment of Rpb1 over the transcribed region and terminator and low enrichment over the enhancer/UAS and the promoter (Figure 2A, first row). In contrast, ChEC-seq2 with either Rpb1 or Rpb3 showed strong enrichment at the UAS, promoter, and terminator of ILV5 and a low enrichment over the transcribed region (Figure 2A, second and third rows; compare with sMNase in black). However, over the repressed GAL1-10 locus, both ChIP-seq and ChEC-seq2 show background enrichment for RNAPII (Figure 2A, right). Notably, sMNase cleavage over GAL1-10 reflects both unprotected linkers between well-positioned nucleosomes and nucleosome depletion upstream of promoters (Chereji et al., 2019; Lee et al., 2004; Figure 2A, right).
This pattern was unrelated to the trimming of mapped reads to the first base pair (compare with untrimmed tracks in Figure 2—figure supplement 1A), the normalization of transcript length used in metagene plots (enrichment over promoters and the 5’ end of genes in Figure 2—figure supplement 1B; see Methods), or the presence of the MNase fusion (Figure 2—figure supplement 2). Globally, while both ChIP-seq and ChEC-seq2 showed positive Spearman correlation with nascent transcription (measured by SLAM-seq), different regions of genes correlated best with nascent mRNA (Figure 2—figure supplement 1C). Nascent transcription correlated best with the enrichment of RNAPII over the promoter from ChEC-seq2 and the enrichment of RNAPII over the transcribed region and terminator from ChIP-seq. Thus, both ChIP-seq and ChEC-seq2 with RNAPII show enrichments that correlate with transcriptional activity, but these two methods reveal complementary interaction patterns.
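The rank-based comparison described above can be illustrated with a minimal sketch of a Spearman correlation between a per-gene enrichment vector and a per-gene nascent transcription vector. The function names and the input layout (two equal-length lists, one value per gene) are assumptions made for illustration; this is not the analysis code used in the study.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)
```

Because only ranks enter the calculation, the coefficient is insensitive to the different dynamic ranges of ChIP-seq, ChEC-seq2, and SLAM-seq signals.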
Different classes of RNAPII-transcribed yeast genes show distinct mechanisms of transcriptional regulation (Rossi et al., 2021). To more precisely define the differences between ChIP and ChEC, we compared ChIP-seq with ChEC-seq2 over three such classes: (1) genes that bind ssTFs and coactivators such as SAGA, Tup1, Mediator, SWI/SNF (STM), (2) genes bound to ssTFs but not coactivators (transcription factors only, TFO), and (3) a set of 84 genes that showed no detectable nascent transcription, based on SLAM-seq (repressed). Because these different classes of genes are expressed at different levels (Figure 2—figure supplement 1D), we focused on the most highly expressed 150 genes from the STM and TFO classes. Metagene plots of mean RNAPII ChIP-seq over each of these sets of genes reveal strong enrichment over the transcribed region for the STM genes and, to some extent, for the TFO genes, with a notable dip over the promoter (Figure 2B, left). Metagene plots of RNAPII ChEC-seq2 showed a strong enrichment over the promoter for both STM and TFO genes and over the UAS for STM genes (Figure 2B, middle and right). RNAPII was not enriched over repressed genes by either method.
To better understand the ChEC patterns upstream of TSSs, mean cleavage by RNAPII was plotted at higher resolution by aligning to 597 high-confidence TATA boxes upstream of expressed genes (based on SLAM-seq), oriented so that the TSS is 50 bp ± 39 bp to the right (Figure 2C; ±250 bp). Because sMNase cleaves the TATA boxes strongly (Figure 2—figure supplement 1E) - reflecting either increased accessibility or the T/A sequence preference of sMNase (Dingwall et al., 1981; Hörz and Altenburger, 1981) - we subtracted the sMNase cleavage from specific cleavage frequency (Figure 2C). Both Rpb1-MN and Rpb3-MN produced cleavage peaks ~17 bp upstream and ~34 bp downstream of the TATA box, although their relative intensities were different (Figure 2D). In contrast, Rpb1 ChIP-seq signal was low over the TATA and TSS (Figure 2C).
The ChEC-seq2 signal for RNAPII over the UAS region correlates with recruitment of coactivators upstream of STM genes, but not upstream of TFO genes (Figure 2B, middle and right), arguing that it is not an artifact of nearby promoters or genes. To better understand the ChEC-seq2 signal over promoters and UAS regions, we mapped proteins expected to interact with the promoter (PIC components TFIIA [Toa2] and TFIIE [Tfa1]) or the UAS (the Rap1 ssTF and Mediator). For this comparison, we selected 287 STM genes near high-confidence Rap1 sites (VanBelzen et al., 2024). While the PIC interacted strongly with the promoter region of both STM and TFO genes, Rap1 and Mediator interacted strongly with the UAS region of STM genes (Figure 2D). Rap1 and Mediator also showed a low level of enrichment upstream of the promoter region of TFO genes (Figure 2D). Thus, ChEC-seq2 showed promoter enrichment of PIC components and UAS enrichment of TFs and Mediator.
When mapped over TATA sites, TFIIA (Toa2-MN) produced a major cleavage peak ~12 bp upstream and a minor peak ~12 bp downstream from the TATA box (Figure 2E). TFIIE (Tfa1-MN) showed the strongest peak ~34 bp downstream of the TATA (Figure 2E). These data suggest that ChEC-seq2 reflects the arrangement of TFIIA, RNAPII, and TFIIE within the PIC: TFIIA interacts with DNA immediately upstream of TBP, RNAPII binds on both sides of TBP, and TFIIE binds downstream of TBP (see Video 1; Aibara et al., 2021; He et al., 2013; Schilbach et al., 2021). Also, consistent with an ordered assembly of the PIC, the peak of TFIIA cleavage 12 bp downstream of the TATA box is absent in the RNAPII and TFIIE ChEC data, suggesting that TFIIA binds before RNAPII and TFIIE during PIC assembly and that this site becomes protected when RNAPII and TFIIE join (Video 1). Together, these data suggest that ChEC-seq2 captures both UAS-associated RNAPII and the PIC.
Given the dramatic difference between ChEC-seq2 and ChIP-seq, we next asked if either pattern is consistent with the dynamics of transcription as described in the literature. Because S. cerevisiae lacks promoter-proximal pausing (Booth et al., 2016) and has few intron-containing genes that require splicing (Stajich et al., 2007), these slow elongation steps are expected to be absent. Therefore, RNAPII initiation and pausing during termination (Hyman and Moore, 1993) would represent relatively slow steps compared with the rate of elongation. Both in vivo and in vitro studies in yeast suggest promoter dwell times in the range of approximately 5–20 s (Baek et al., 2021; Nguyen et al., 2021), a termination time of up to 70 s, and an elongation rate between 1000 and 3000 bp/min (Larson et al., 2011; Zenklusen et al., 2008). Using these ranges, we calculated the predicted RNAPII occupancy over the promoter, the transcribed region, and the terminator for the typical transcribed yeast gene (see Methods; median size of transcribed region = 1.2 kb; Pelechano et al., 2013). Of 24 combinations of dwell times and elongation rates tested, 21 predicted higher occupancy at the promoter than over the transcribed region and 21 predicted higher occupancy at terminators than over the transcribed region (Figure 2F). While some combinations predicted a relatively flat distribution across the gene with lower levels in the promoter, none of the 24 predicted the strong signal over the transcribed region with promoter depletion characteristic of ChIP-seq. Only very short promoter dwell times (i.e. <1 s) produced the low promoter occupancy seen in ChIP-seq (Figure 2—figure supplement 1F). This suggests that ChIP-seq is unable to detect functionally important RNAPII interactions at the promoter and UAS that are detected by ChEC-seq2.
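The logic of this occupancy prediction can be sketched as follows. The sketch assumes that relative occupancy scales with the time a polymerase spends per base pair of each region during one transcription cycle; the function names and the promoter and terminator widths (100 bp each) are hypothetical placeholders, and the actual calculation is described in the Methods and may differ.

```python
from itertools import product


def residence_per_bp(promoter_dwell_s, elongation_bp_per_min, terminator_dwell_s,
                     gene_length_bp=1200,        # median transcribed region, from the text
                     promoter_width_bp=100,      # hypothetical region width
                     terminator_width_bp=100):   # hypothetical region width
    """Seconds spent per base pair in each region during one cycle."""
    elongation_s = gene_length_bp / (elongation_bp_per_min / 60.0)
    return {
        "promoter": promoter_dwell_s / promoter_width_bp,
        "transcribed": elongation_s / gene_length_bp,
        "terminator": terminator_dwell_s / terminator_width_bp,
    }


def scan(promoter_dwells_s, elongation_rates_bp_per_min, terminator_dwell_s):
    """Evaluate every combination of promoter dwell time and elongation rate."""
    return {
        (dwell, rate): residence_per_bp(dwell, rate, terminator_dwell_s)
        for dwell, rate in product(promoter_dwells_s, elongation_rates_bp_per_min)
    }
```

For example, `residence_per_bp(10, 2000, 70)` uses a 10 s promoter dwell, 2000 bp/min elongation, and a 70 s termination pause, all within the published ranges quoted above; under the assumed region widths, the per-bp residence is larger at the promoter and terminator than over the transcribed region.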
ChEC-seq2 detects elongating and phosphorylated RNAPII
Next, we performed ChEC-seq2 with the kinases involved in initiation and elongation, as well as the elongation factor Spt5 (part of DSIF). Phosphorylation of the carboxy terminal domain (CTD) of RNAPII regulates its activity and the association of factors involved in splicing, histone modification, and RNA processing. Initiation correlates with phosphorylation of Ser5 of the CTD by Kin28 (Cdk7/TFIIH kinase; Komarnitsky et al., 2000). Elongation is coupled with phosphorylation of Ser2 by Ctk1 (P-TEFb; CTDK-I; Cdk9; Cho et al., 2001) and Bur1 (P-TEFb; Qiu et al., 2009), and the association of Spt4/5 (DSIF; Hartzog et al., 1998).
Kin28-MN, Ctk1-MN, and Spt5-MN showed strong cleavage over active genes and little enrichment over inactive genes (Figure 3A). All three proteins showed maximum cleavage over the promoters of active genes. Kin28 showed significant enrichment over the UAS region of STM genes that was absent from TFO genes (Figure 3A, left). The elongation factor Spt5 showed enrichment over both as well as the transcribed region (Figure 3A, right). In contrast, Ctk1-MN cleavage was primarily localized to promoters (Figure 3A, middle). Higher resolution mapping aligned to TATA boxes confirmed that, while Rpb1 shows peaks of cleavage upstream and downstream of TATA, Kin28, Ctk1, and Spt5 show a single peak downstream, near the TSS (Figure 3B). Furthermore, the signal upstream of the TATA was greatest for Kin28, followed by Ctk1 and then Spt5 (Figure 3B). This suggests that, while Rpb1 shows interactions at the TSS and upstream, factors involved in initiation and elongation are more enriched at the TSS and over the transcribed region.
To confirm that the ChEC cleavage pattern by Kin28 and Ctk1 reflects their activity, we developed a method to measure RNAPII phosphorylation by ChEC-seq2. Two single chain IgG fragments that recognize phosphorylated Ser2 (Ser2p) RNAPII CTD or phosphorylated Ser5 (Ser5p) RNAPII CTD (Mintbodies) have been expressed as GFP- and SNAP-tagged fusions and shown to localize at transcriptionally active loci in mammalian cells (Ohishi et al., 2022; Uchino et al., 2022). We constructed Mintbody-MNase (Mb-MN) fusions to detect these phosphorylated forms of RNAPII (Figure 3C; α-Ser2p-MN and α-Ser5p-MN). Because binding phosphorylated CTD could compete for critical interactions with RNAPII, we tested several promoters to identify an expression level that produced the smallest growth defect (not shown). Strains expressing the Mb-MNs from the ADH1 promoter had a minimal growth defect (Figure 3D) and cleaved chromatin upon permeabilizing cells and addition of calcium (Figure 3—figure supplement 1A). Both α-Ser5p-MN and α-Ser2p-MN gave patterns very similar to those produced by their respective kinases; Ser5p was more enriched over promoters and UAS regions, while Ser2p was more evident in the transcribed region (Figure 3E and F). To better compare these patterns, we normalized the mean cleavage by each Mb-MN (or sMNase) over promoters, UAS regions, transcribed regions, and 3’UTR regions to the cleavage by Rpb1 (Figure 3G). Ser5p and Ser2p levels were lower than Rpb1 over the UAS and promoter, but higher than Rpb1 over the transcript and 3’UTR (Figure 3G). Furthermore, consistent with their patterns observed by ChIP, the levels of Ser2p were lower than those of Ser5p over the UAS and promoter and higher than those of Ser5p over the transcript (Figure 3G). Thus, ChEC-seq2 can reveal RNAPII recruitment, initiation, and elongation during transcription.
Global transcriptional changes are detected by ChEC-seq2
To further validate the biological significance of RNAPII ChEC-seq2, we examined the effects of an environmental perturbation that results in a large-scale transcriptional change. Cells exposed to 10% ethanol in growth medium show widespread changes in transcription, downregulating hundreds of genes enriched for those involved in ribosome biogenesis (GO: 0042254; blue in Figure 4A) and upregulating genes enriched for chaperones (GO: 0009266; red in Figure 4A). ChEC-seq2 using Rpb1-MN, Kin28-MN, Ctk1-MN, α-Ser5p-MN, and α-Ser2p-MN captures these changes. These proteins showed increased enrichment over the HSP104 chaperone gene following ethanol treatment (Figure 4B). Likewise, metagene plots over the top 100 induced genes showed increased cleavage by Rpb1-MN, Kin28-MN, Ctk1-MN, as well as their products Ser5p and Ser2p upon ethanol treatment (Figure 4C and D, left). Notably, over the transcribed region, enrichment was higher at the 3’ end, especially for Ser2p and Ctk1-MN (Figure 4C and D, left). In contrast, metagene plots of the average change in cleavage over the 137 ribosomal protein genes showed strong decreases in cleavage by all of these proteins (Figure 4C and D, right). The changes in sMNase cleavage were generally the opposite of what we observed with the specific proteins (Figure 4C and D, black trace/column). Thus, ChEC-seq2 can capture biologically relevant changes in RNAPII association, its regulators, and its phosphorylation states that reflect large-scale changes in global transcription.
RNAPII ChEC-seq2 upon chemical-genetic perturbations of transcription
Next, we tested the effect of blocking either PIC formation or initiation on RNAPII/PIC occupancy by ChEC-seq2. PIC formation was blocked by depleting TFIIB using auxin-induced degradation (Sua7-AID; Figure 5A) and initiation was inhibited by treating a strain carrying an analog-sensitive allele of Kin28 (kin28-is) with the ATP analog CMK (Rodríguez-Molina et al., 2016). These treatments resulted in strong downregulation of nascent transcription (SLAM-seq; Figure 5B) and inhibition of growth (Figure 5—figure supplement 1), respectively. ChEC-seq2 with Rpb1 following 20 min of depletion of TFIIB showed a clear decrease of Rpb1-MN cleavage over the promoters of the 150 most highly transcribed STM and TFO genes (Figure 5C). TFIIB depletion caused a shift in sMNase cleavage from the TSS downstream (Figure 5C). Neither Rpb1-MN nor sMNase cleavage over repressed genes was altered by TFIIB depletion (Figure 5C). This suggested that Rpb1 occupancy over the promoters of STM and TFO genes requires TFIIB.
Higher resolution mapping of RNAPII (Rpb1-MN), TFIIA (Toa2-MN), and TFIIE (Tfa1-MN) cleavage over TATA boxes revealed that, upon TFIIB depletion, TFIIA occupancy shifted from the major upstream peak to the downstream peak (Figure 5D). RNAPII and TFIIE peaks near the TATA and TSS were lost (Figure 5D). This supports the notion that the downstream peak of TFIIA is blocked by RNAPII/PIC binding. Furthermore, the cleavage by Rpb1 upstream of the TATA box was unaffected by depletion of TFIIB (Figure 5D, middle), suggesting that TFIIB is required for proper PIC formation over the promoter, but is not required for association with upstream UAS elements.
To test this hypothesis, we mapped RNAPII (Rpb1-MN) cleavage over 896 high-confidence sites for the ssTF Rap1 (VanBelzen et al., 2024). Rap1 regulates hundreds of highly expressed genes and RNAPII ChEC-seq2 showed strong enrichment flanking Rap1 sites, while sMNase did not (Figure 5E). This correlates with Mediator occupancy (Figure 5F). Depletion of TFIIB had no significant effect on RNAPII occupancy over Rap1 sites (Figure 5E). Thus, RNAPII recruitment to the promoter is dependent on TFIIB, while RNAPII recruitment to the UAS is not.
Inhibition of kin28-is with CMK also led to a strong decrease of RNAPII over the promoter, transcribed region, and 3’UTR, especially for the STM genes (Figure 5G). As expected, this was associated with a strong decrease of Ser5 phosphorylation and Ser2 phosphorylation (Figure 5G). Cleavage by α-Ser5p-MN was most strongly decreased at the promoter, while cleavage by α-Ser2p-MN was most strongly decreased at the 3’ end of the transcribed region. No changes in cleavage were observed at repressed genes. RNAPII cleavage over 597 TATA boxes near expressed genes was also decreased upon Kin28 inhibition, but this effect was not as strong as that observed upon depletion of TFIIB (Figure 5H). Thus, inhibition of Kin28 led to an apparent decrease in total RNAPII and its Ser2 and Ser5 phosphorylated forms from highly expressed genes.
Developing a kinetic model for transcription based on ChEC-seq2 RNAPII occupancy
Because ChEC-seq2 provides information about important regulatory steps that have not been evident from previous global studies, we used these data (as well as ChIP-seq data) to develop models for the global kinetics of yeast RNAPII transcription. Steady-state occupancy of RNAPII should reflect the rates of several steps: RNAPII recruitment to the UAS and/or promoter, PIC assembly, initiation, elongation, and termination. We developed a stochastic computational model for these steps (Figure 6A) by fixing rates that have been experimentally determined (k1, k-1, k3, k5, k6, k7; Table 1) and optimizing the remaining rates to fit to the RNAPII occupancy observed from either ChIP-seq or ChEC-seq2. To capture the distinct mechanisms of RNAPII recruitment, we modeled the STM and TFO gene classes separately: for the STM class, we assumed that all RNAPII is recruited first to the UAS (reflecting k1) before being transferred to the promoter (reflecting k2); for the TFO class, RNAPII is recruited directly to the promoter (reflecting k3). In genes with a UAS (i.e. STM genes), RNAPII is recruited nearly exclusively to the UAS through ssTFs and coactivators (Baek et al., 2021), and we therefore omitted RNAPII recruitment to the promoter (k3) in the STM model. We also modeled dissociation from the UAS (reflecting k-1, STM class) and promoter (reflecting k-3, both classes), as well as the possibility of reversal from promoter to UAS (reflecting k-2, STM class).
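A minimal sketch of a stochastic simulation of this kind of scheme for the STM class is given below. It follows a single polymerase through the steps named above using a Gillespie-style continuous-time simulation and reports the fraction of chromatin-bound time spent in each region. Several points are assumptions rather than the published implementation: the state names, the mapping of states to regions, the treatment of k4 as a step between promoter binding and initiation, and every rate value except k7 = 0.0325 s⁻¹, which is quoted in the text; the other rates are placeholders, not the values in Table 1.

```python
import random

# Rate constants in s^-1. Only k7 is taken from the text; all others are
# hypothetical placeholders chosen for illustration.
RATES_STM = {
    ("free", "UAS"): 0.1,                  # k1: recruitment to the UAS
    ("UAS", "free"): 0.05,                 # k-1: dissociation from the UAS
    ("UAS", "promoter"): 0.2,              # k2: transfer from UAS to promoter
    ("promoter", "UAS"): 0.05,             # k-2: reversal from promoter to UAS
    ("promoter", "free"): 0.05,            # k-3: dissociation from the promoter
    ("promoter", "PIC"): 0.2,              # k4: assumed completion of the PIC
    ("PIC", "elongating"): 0.1,            # k5: initiation
    ("elongating", "terminating"): 0.03,   # k6: elongation across the gene
    ("terminating", "free"): 0.0325,       # k7: termination
}

# Assumed mapping from model state to the genomic region where it is observed.
REGION = {
    "UAS": "UAS",
    "promoter": "promoter",
    "PIC": "promoter",
    "elongating": "transcribed",
    "terminating": "3'UTR",
}


def simulate_occupancy(rates, total_time_s=2e5, seed=0):
    """Simulate one polymerase and return the fraction of bound time per region."""
    rng = random.Random(seed)
    state, clock = "free", 0.0
    time_in_region = dict.fromkeys(REGION.values(), 0.0)
    while clock < total_time_s:
        moves = [(dst, k) for (src, dst), k in rates.items() if src == state and k > 0]
        total_rate = sum(k for _, k in moves)
        wait = rng.expovariate(total_rate)  # exponentially distributed waiting time
        if state in REGION:
            time_in_region[REGION[state]] += wait
        clock += wait
        # Choose the next state with probability proportional to its rate.
        state = rng.choices([dst for dst, _ in moves],
                            weights=[k for _, k in moves])[0]
    bound_time = sum(time_in_region.values())
    return {region: t / bound_time for region, t in time_in_region.items()}
```

A TFO-class variant of this sketch would drop the UAS transitions and add direct recruitment from `"free"` to `"promoter"` (k3). Fitting would then consist of searching the unfixed rates for values whose simulated occupancy best matches the measured occupancy.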
Fitting to the RNAPII occupancy from ChIP-seq or ChEC-seq2 over different regions (UAS, promoter, transcribed region, or 3’UTR), we identified the optimal range of values for the undefined rates (i.e. k2, k-2, k-3, and k4), producing an ensemble of best-fit models (Figure 6—figure supplement 1). Agreement between the models and the data was measured using cosine similarity (Methods). The models trained on the ChEC-seq2 occupancy for either the TFO or STM genes showed excellent agreement with the data (cosine similarity >0.995; Figure 6B, top and Figure 6—figure supplement 1A and C). Optimal agreement between the models and ChEC-seq2 data was achieved by using the lower bound for dwell time at the terminator from Zenklusen et al., 2008, and Larson et al., 2011 (30 s; k7 = 0.0325 s⁻¹; Table 1). Importantly, the rates that are shared between the STM and TFO models are identical (Table 1). Thus, modeling RNAPII occupancy data from ChEC-seq2 produced a range of plausible values for the rates of transcription that agrees well with the empirical data (Table 1).
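For reference, cosine similarity between a modeled and an observed occupancy vector can be computed as in the sketch below. The vector layout (one value per region, in the same order for both vectors) is an assumption made for illustration; the exact vectors compared are defined in the Methods.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Because the measure depends only on the direction of the vectors, it compares the relative distribution of RNAPII across regions and ignores overall scale, which suits a comparison between simulated occupancy fractions and sequencing-derived enrichment values.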
Using the published rates, neither model was able to find rates for the other steps that produced occupancy that matched that observed by ChIP-seq (i.e. there were no models with cosine similarity >0.9; Figure 6—figure supplement 1B and D). The best ChIP-seq models predicted RNAPII occupancy over all regions that was significantly different from that observed (Figure 6B, bottom). By varying the published rates, the model could produce the occupancies observed by ChIP-seq (Figure 6—figure supplement 1E and F). However, this required eliminating dissociation from the promoter (k-3), increasing the initiation rate (k5) 2-fold with instantaneous recruitment of TFIIH (k4), and increasing the termination rate ~4.3-fold above the maximum published rate (k7 = 0.14 s⁻¹; Figure 6—figure supplement 1E, inset table). Thus, although it is possible to model the RNAPII occupancy observed by ChIP-seq, the predicted rates are difficult to reconcile with the literature.
We explored which rates in the model could account for the effects of TFIIB depletion (Figure 6C) and Kin28 inhibition (Figure 6D; Methods) on mean RNAPII occupancy over UASs, promoters, transcribed regions, and 3’UTRs. Consistent with a role for TFIIB in recruiting RNAPII to the promoter, reducing the rate of RNAPII recruitment (k3) to the promoters of TFO genes produced RNAPII occupancy changes that matched the observed effects of TFIIB depletion (Figure 6C, left; Table 1).
For the STM genes, decreasing k2 alone (i.e. the rate of RNAPII transfer from the UAS to promoter) predicted an accumulation of RNAPII at the UAS and did not agree well with the data (Figure 6C, right). Instead, models that decreased k2 and either increased the rate of dissociation from the UAS (k-1) or decreased the rate of RNAPII recruitment to the UAS (k1) produced RNAPII occupancies that agreed well with the data (Figure 6C, right; Table 1). Therefore, for STM genes, the model predicts that depletion of TFIIB may both reduce RNAPII recruitment to the promoter and reduce recruitment of RNAPII to, or stimulate RNAPII dissociation from, the UAS.
Next, we asked which rates in our kinetic model could account for the effects of inhibiting Kin28. Modeling a decrease in the rate of initiation (k5) predicted an accumulation at the promoter (and UASs of STM class genes), which is not observed (Figure 6D). Instead, the effects of inhibiting Kin28 fit best with destabilizing RNAPII bound to the UAS or promoter, either by decreasing recruitment (k1 or k3, respectively) or by increasing dissociation (k-1 or k-3, respectively; Table 1). Indeed, for TFO-class genes, either an increase in promoter dissociation (k-3) or a decrease in promoter recruitment (k3) with a decrease in initiation (k5) produced occupancies that agreed with the data (Figure 6D, left; Table 1). Similarly, for STM class genes, incorporating an increase in promoter dissociation (k-3), an increase in UAS dissociation (k-1), or decrease in UAS recruitment (k1) with a decrease in initiation (k5) resulted in fits that agreed with the empirical findings (Figure 6D, right). Notably, for STM class genes, the combination of a decrease in initiation with an increase in promoter dissociation produced the best fit at the UAS. This suggests a feedback mechanism between PIC formation and initiation. Together, these findings indicate that the changes in RNAPII occupancy observed by ChEC-seq2 upon perturbation of PIC components can be explained by reasonable changes in transcriptional rates.
ChEC-seq2 suggests a role for the NPC in stabilizing promoter association of RNAPII
Hundreds of active yeast genes physically associate with the NPC and this is dependent on ssTFs (Ahmed et al., 2010; Brickner et al., 2019; Brickner et al., 2012; Casolari et al., 2005; Casolari et al., 2004; Light et al., 2010; Randise-Hinchliff et al., 2016; Van de Vosse et al., 2013). Mutations that disrupt this interaction cause a quantitative decrease in transcription (Ahmed et al., 2010; Brickner et al., 2016). For example, a mutation in the Gcn4 TF that blocks interaction with the NPC results in a quantitative decrease in transcription of Gcn4 targets (genes involved in amino acid biosynthesis; Brickner et al., 2019; Hinnebusch and Fink, 1983). This mutation replaces three amino acids within a 27 amino acid positioning domain (PDGCN4) that does not overlap the activation or DNA binding domains (Brickner et al., 2019). We confirmed this effect by measuring nascent transcription upon amino acid starvation in gcn4-pd strains or a wild-type control (Methods). Although both GCN4 and gcn4-pd mutant strains showed widespread transcriptional changes upon amino acid starvation (Figure 7A), the upregulation (and downregulation) of transcription was quantitatively stronger for the GCN4 strain (Figure 7A, right panel). We tested if this transcriptional defect is associated with a competitive fitness defect by competing GCN4 and gcn4-pd strains in the absence of histidine±3-amino triazole (3-AT, an inhibitor of the His3 enzyme, which selects for maximal expression of HIS3). The relative abundance of GCN4 and gcn4-pd strains was quantified using Sanger sequencing (Sump et al., 2022). The GCN4 strain showed greater fitness under both conditions, but this was particularly evident in the presence of 3-AT (Figure 7B).
ChEC-seq2 against Rpb1-MN was performed in GCN4, gcn4∆, and gcn4-pd mutant strains grown in the presence or absence of amino acids. This experiment identified 287 genes that showed a log2 fold-change of 1 or greater (adj. p<0.05) in the GCN4 strain upon amino acid starvation, but not in the gcn4∆ strain (Supplementary file 1). These genes were strongly enriched for genes involved in amino acid metabolism (p=3e-46; GO term 0006520) and strongly overlapped with Gcn4 targets (Bonferroni-adjusted p=1e-10 from Fisher’s exact test comparing overlap with targets defined near high-confidence Gcn4 ChEC-seq2 sites; VanBelzen et al., 2024). In the presence of amino acids, neither the gcn4∆ nor gcn4-pd mutations affected Rpb1 occupancy at the 287 Gcn4-dependent genes (Figure 7C, left column). However, upon amino acid starvation, strains lacking Gcn4 showed a stark decrease in Rpb1 recruitment upstream of the TSS that spanned both the UAS and promoter region (Figure 7C, top panel). The gcn4-pd mutation resulted in a more modest decrease in Rpb1 specifically over the promoter (Figure 7C, bottom). This suggested that the recruitment of RNAPII to the UAS region is dependent on Gcn4, but not on the PDGCN4.
Consistent with this possibility, Rpb1 cleavage adjacent to the TATA boxes near the Gcn4 target genes and over Gcn4 binding sites was strongly decreased by loss of Gcn4 (p<2e-16; Kolmogorov-Smirnov test comparing the mean cleavage pattern over 173 TATAs; Figure 7D, right panels). In the gcn4-pd strain, Rpb1 cleavage was decreased over the TATA box (p=4e-5; Figure 7D, top right), but not over the Gcn4 binding site (Figure 7D, bottom right). Thus, the PDGCN4 promotes the association of RNAPII with promoters, but does not affect RNAPII binding to the UAS.
Finally, we compared the effects of adjusting the rates of each step in our kinetic model to the effects of the Gcn4 mutations on Rpb1 occupancy (Figure 7—figure supplement 1). The effects of loss of Gcn4 agreed well with simply decreasing RNAPII recruitment to the UAS alone (k1, resulting in less RNAPII to move from the UAS to the promoter; Figure 7E). For the gcn4-pd mutant, increasing the dissociation of RNAPII from the promoter (k-3) either alone or in combination with decreasing the rate of transfer of RNAPII from the UAS to the promoter (k2) agreed well with these data (Figure 7E). This suggests that Gcn4 recruits RNAPII to the UAS through its activation domains and that its interaction with the NPC stabilizes promoter-bound RNAPII.
Discussion
Understanding complex biological mechanisms requires multipronged, multidisciplinary approaches. Each approach has strengths and weaknesses but together, they provide a more complete picture. Our current understanding of RNAPII transcription, involving the dynamic collaboration of dozens of proteins, is the product of biochemical, structural, genetic, cell biological, and genomic approaches. From decades of such work, we have an excellent working model for this critical biological process. Biochemical, structural, and cell biological approaches (and, in some cases, genetic approaches) can be biased by the particularities of the model system(s). For this reason, global approaches provide an essential perspective to assess the generality of the conclusions from more focused studies. Our current global perspective of molecular biology is dominated by a single technique: ChIP-seq and its derivatives. Indeed, ChIP-seq is the sole method used to define DNA binding and chromatin state by the ENCODE and modENCODE Consortium (Landt et al., 2012). Such a methodological monoculture is problematic if there are ways in which ChIP falters in detecting important interactions (Park et al., 2013; Teytelman et al., 2013).
ChIP vs. ChEC
For proteins that bind directly to DNA at specific sites, ChIP-seq and ChEC-seq2 generally agree. For example, high-confidence binding sites for ssTFs show excellent agreement between either ChIP-seq or ChIP-exo and ChEC-seq (Donczew et al., 2021; VanBelzen et al., 2024; Zentner et al., 2021; Zentner et al., 2015). Likewise, mapping the associations of PIC components, Mediator or the kinases associated with transcription by ChEC-seq2 was very similar to such maps produced by ChIP-seq (Saleh et al., 2021; Wong et al., 2014). However, some exceptions have been noted, as well. ChEC-seq with both Rif1 and Sfp1 reveals biologically sensible binding sites that were not evident from ChIP-seq (Bruzzone et al., 2021).
While both ChIP-seq and ChEC-seq2 with RNAPII give enrichment over genes that correlates with transcription, the patterns are complementary; ChIP highlights interactions with the transcribed region (reflecting paused or elongating RNAPII) and ChEC highlights interactions with the enhancer, promoter, and terminator (reflecting preinitiation or terminating RNAPII). We have validated the RNAPII enrichment reported by ChEC-seq2 in five ways. First, the maps produced by two different subunits of RNAPII are highly similar (Figure 2). Second, the RNAPII ChEC-seq2 signal over promoters and UAS regions correlates well with the RNAPII ChIP signal over coding sequences and with nascent transcription rates (measured by SLAM-seq; Figure 2). Third, the cleavage by either Rpb1 or Rpb3 (as well as TFIIA and TFIIE) peaks on either side of TATA boxes, which agrees well with biochemical and structural analysis of the PIC (Figure 2, Video 1). Fourth, widespread changes in transcription are captured by changes in the Rpb1 enrichment by ChEC-seq2 over all gene regions (Figure 4). Fifth, depletion of TFIIB leads to loss of Rpb1 and TFIIE, as well as an increase of TFIIA, over TSSs (Figure 5). These data, strengthened by the correlations with ChEC-seq2 using factors involved in initiation and elongation, argue that the patterns of RNAPII enrichment revealed by ChEC-seq2 are biologically meaningful and fit well with the literature.
Why is there a difference between RNAPII ChIP-seq and ChEC-seq2? While ChIP captures direct protein-DNA interactions well, it is much less able to capture indirect interactions. Additional factors that may influence ChIP enrichment include the local nucleosome occupancy, the accessibility of the epitope, and the relative sensitivity of different regions to shearing by sonication (Giresi et al., 2007). Unlike ssTFs or even PIC components that bind directly to precise genomic sites, RNAPII interacts both indirectly (through ssTFs) and directly (in the PIC and transcribing RNAPII) with different regions, each of which is associated with distinct sets of cofactors. These differences likely impact the two methods; ChEC should detect both direct and indirect interactions with DNA, whereas ChIP should strongly favor direct interactions. Likewise, ChEC will perform better in nucleosome-depleted regions while ChIP crosslinking may be enhanced by lysine-rich nucleosomes.
ChEC-seq2 detects UAS-associated RNAPII observed in single-molecule biochemical experiments (Baek et al., 2021; Rosen et al., 2020) that has not been observed by ChIP-seq. This is consistent with recruitment of RNAPII to ssTFs/Mediator bound to UASs. While the enhanced RNAPII ChEC-seq2 signal in intergenic regions may also reflect lower nucleosome occupancy, sMNase cleavage was not enriched over UASs like Rap1 binding sites (Figure 5E). Because such sites are also occupied by Mediator (Figure 5F), this supports a Mediator-dependent mechanism of RNAPII recruitment to the UAS (Baek et al., 2021; Rosen et al., 2020). Furthermore, it is important to highlight that the RNAPII ChEC-seq2 enrichment observed over promoters and UASs is consistent with that expected from known dwell times and the rate of elongation (Figure 2). The low RNAPII ChIP-seq signal at UASs and the high signal over coding sequences could reflect both its more direct interaction with DNA and its intimate association with highly crosslinkable nucleosomes during transcription (Bintu et al., 2012; Ehara et al., 2022; Ehara et al., 2019; Kujirai et al., 2018). However, it is less clear why the RNAPII ChIP-seq signal over the promoter is so low. ChIP-seq successfully captures enrichment of PIC components at promoters, indicating that promoter regions can be successfully enriched by ChIP. Future studies will resolve these differences.
ChEC-seq2 with elongation factors
We also present a novel method for observing the genome-wide location of the phosphorylated forms of RNAPII (Ser2p and Ser5p) using single chain antibodies (Mintbodies) tagged with MNase. ChEC-seq2 with these Mintbodies produces patterns that agree well with total RNAPII and with the kinases responsible for these modifications. Consistent with ChIP, Ser5p RNAPII is enriched in promoters and the 5’ end of active genes, while Ser2p is enriched over the body and 3’ end. Inactivation of the Kin28 Ser5p kinase results in dramatic loss of RNAPII, Ser5p RNAPII, and Ser2p RNAPII from active genes (Figure 5). This is consistent with an important role for Ser5p in initiation and with the observation that Ser2 phosphorylation is functionally downstream of Ser5p.
ChEC-seq2 with factors involved in elongation (Ctk1, Spt5, Ser2p-RNAPII), when normalized to total RNAPII, showed greater enrichment over the CDS (Figure 3G), as expected. However, it is surprising that we also observed clear enrichment of these factors at promoters (e.g. Figure 3A, E, and F). The association of elongation factors with the promoter seems to be biologically relevant. Changes in transcription correlate with changes in ChEC-seq2 enrichment for these factors and modifications (Figure 4C). Blocking initiation by inhibiting TFIIH kinase led to a reduction of Ser5p RNAPII and Ser2p RNAPII over both the promoter and the transcribed region (Figure 5G). This suggests either that the true signal of these factors over transcribed regions is less evident by ChEC-seq2 than by ChIP-seq or that ChEC-seq2 can reveal interactions of elongation factors at early stages of transcription that are missed by ChIP-seq. The expectations for enrichment of elongation factors and phosphorylated CTD are largely based on ChIP data. Because ChIP-seq fails to capture RNAPII enrichment at UASs and promoters, it is possible that ChIP also fails to capture promoter interaction of factors involved in elongation as well.
Factors important for elongation can also function at the promoter. For example, Ctk1 is required for the dissociation of basal TFs from RNAPII at the promoter (Ahn et al., 2009). Transcriptional induction leads to increases in Ctk1 ChEC-seq2 enrichment both over the promoter and over the 3’ end of the transcribed region (Figure 4C). Dynamics of Spt4/5 association with RNAPII from in vitro imaging (Rosen et al., 2020) indicate that the majority of Spt4/5 binding to RNAPII does not lead to elongation; Spt4/5 frequently dissociates from DNA-bound RNAPII. Association of Spt4/5 with RNAPII may represent a slow, inefficient step in the transition to productive elongation. If so, then ChEC-seq2 may capture transient Spt4/5 interactions that occur prior to productive elongation, producing enrichment of Spt5 at the promoter.
A role for interaction with the NPC in stabilizing the PIC
The NPC has been implicated in transcription in yeast and other organisms. In yeast, inactivation of DNA elements or TFs that promote interaction with the NPC leads to a quantitative defect in transcription (Ahmed et al., 2010; Brickner et al., 2012). Single-molecule RNA FISH (smRNA FISH) in strains bearing mutations that blocked the interaction of the GAL1-10 promoter with the NPC showed a decrease in the fraction of cells that exhibit transcription (Brickner et al., 2016). A mutation in the Gcn4 ssTF that blocks its ability to mediate peripheral localization and interaction with the NPC leads to a defect in expression of Gcn4 target genes (gcn4-pd; Figure 7; Brickner et al., 2019) and inactivation of nuclear pore proteins essential for chromatin interaction leads to a global transcriptional defect (Ge et al., 2024). Applying RNAPII ChEC-seq2, we have explored the phenotype of the gcn4-pd mutant. Whereas loss of Gcn4 leads to loss of RNAPII from UASs and promoters, inactivation of the PDGCN4 reduces the association of RNAPII with the promoter without affecting its recruitment to the UAS (Figure 7). This suggests that the PDGCN4 either enhances the transfer of RNAPII from the UAS to the promoter or stabilizes the association of RNAPII with the promoter. Genetic interactions between nuclear pore proteins and Mediator suggest that these two components function at the same step in transcription (Ge et al., 2024). Together with the smRNA FISH result, this suggests that nuclear pore proteins stimulate enhancer function by stabilizing RNAPII association with the PIC.
A global model for yeast RNAPII kinetics
Because ChEC-seq2 measures global occupancy of RNAPII that includes important states that are missed by ChIP-seq, it allowed us to develop a global model for the kinetics of RNAPII transcription. Building on previous work (Rossi et al., 2021), we have modeled two classes of genes: those that show RNAPII association only with promoters (TFO) and those that show association with UASs as well (STM). For the TFO model, RNAPII is recruited directly to the promoter. For STM genes, RNAPII is recruited to the UAS and then transferred to the promoter. Subsequent steps (initiation, elongation, and termination) are assumed to be the same between these two classes. Several of the rates are from the literature, while the others were fit to the experimental RNAPII enrichments over UASs, promoters, transcribed regions, and 3’UTRs. While we were unable to find rates within a reasonable range of parameters that produced RNAPII occupancies matching ChIP-seq, the model identified a large ensemble of rates that produced RNAPII occupancies matching ChEC-seq2 (Figure 6B). The RNAPII occupancy from ChEC-seq2 data over highly active genes matched models that included a short dwell time over the terminator (~30 s), at the lower bound of what was reported in Zenklusen et al., 2008 (mean = 56 ± 20 s) and Larson et al., 2011 (mean = 70 ± 41 s).
The kinetic model suggests that perturbations often have more than one effect, as expected for a dynamic, multi-step process like transcription. For example, the effects of depletion of TFIIB on RNAPII ChEC-seq2 are best modeled by both a decrease in RNAPII recruitment and an increase in non-productive dissociation of RNAPII, either from the promoter or the UAS (Figure 6C). Likewise, the effects of inhibition of Kin28 were most consistent with both a decrease in initiation and an increase in dissociation from the promoter/UAS (Figure 6D). These results suggest that the PIC is unstable and that such perturbations cause RNAPII to dissociate. This conclusion agrees with the observation that a small fraction of the polymerases that assemble at the promoter initiate transcription (Darzacq et al., 2007) and with the observation that conditional inactivation of PIC components does not preserve stable intermediates (Petrenko et al., 2019). Moreover, these results were consistent across the entire ensemble of models, showing that this is a robust effect. These models should serve as a helpful framework for future global studies of transcription.
Methods
Yeast strains
Yeast strains and tagging vectors used in this study are provided in Supplementary files 2 and 3. C-terminal MNase fusions were introduced by recombination as previously described (VanBelzen et al., 2024). Sua7 was tagged with 3xV5-IAA7 using pV5-IAA7-His3MX6, which was generated by swapping the His3MX6 marker in place of the HIS3 marker in pGZ363 (Tourigny et al., 2021). OsTir1-LEU2 was PCR amplified from pSB2271 (Miller et al., 2016) with primers that facilitated recombination at leu2∆0 and simultaneously restored the locus to LEU2. The kin28is mutations V21C and L83G (Rodríguez-Molina et al., 2016) were introduced by two subsequent rounds of CRISPR-Cas9-mediated mutagenesis as described (Anand et al., 2017). The GCN4-sm and gcn4-pd mutations were introduced by CRISPR-Cas9-mediated mutagenesis and are described (Ge et al., 2024).
Mb-MN constructs were synthesized by Integrated DNA Technologies as gBlocks. The gBlocks were flanked by a HindIII and BamHI site, which were used to clone the gBlocks into the pFA6a-NatMX6 vector (Hentges et al., 2005). The constructs were amplified from plasmid by PCR to yield amplicons flanked with homology to the his3∆1 locus, which were then transformed into yeast.
Strains were confirmed to have the desired sequence by amplifying the modified locus from genomic DNA and sequencing. Platinum SuperFi (Thermo Fisher Scientific) was used to amplify long targets by PCR.
Media and growth conditions
Media were prepared as described (Burke et al., 2000). Cells were grown at 30°C with shaking at 200 rpm in SDC media. YPD media was used in growth assays and in Figure 2A-C, Figure 2—figure supplement 2, where cells were grown in YPD to match conditions of ChIP-seq samples. Ethanol stress was induced by growing cells in media spiked with 10% ethanol for 1 hr. Sua7-IAA7 was degraded by treating cells with 0.5 mM indole-3-acetic acid for 60 min in SLAM-seq experiments or 20 min in ChEC-seq2 experiments. For Kin28 inhibition experiments, cells harboring the kin28is mutation were treated with 5 µM CMK for 60 min.
For SLAM-seq and growth competition experiments with GCN4-sm and gcn4-pd, cells were grown in SDC and then shifted into SDC or SDC-His for 1 hr. Growth competition assays were performed as described (Sump et al., 2022) and the histidine synthesis pathway was blocked through the addition of 3-AT to the media. For ChEC-seq2 experiments with GCN4-sm, gcn4-pd, and gcn4∆, cells were grown in YPD before shifting into either SDC or SD+uracil for 1 hr.
ChEC-seq2
The ChEC-seq2 method was performed as described (VanBelzen et al., 2024). Cells were permeabilized and 2 mM calcium was added to activate MNase activity. Reactions were stopped after genomic DNA was partially digested (e.g. Figure 3—figure supplement 1A), DNA was purified, DNA ends were repaired and ligated to an Illumina-compatible adapter (VanBelzen et al., 2024). A second adapter was incorporated through Tn5-based Tagmentation. Complete adapters and library indexes were incorporated through library amplification with Nextera XT Index Primers.
ChIP-seq
Cell fixation and chromatin isolation were performed as previously described (Kuo and Allis, 1999) but are briefly described here for clarity. Independent cultures of BY4741 (Rpb1) and JVY022 (Rpb1-MN) were grown in YPD at 30°C, 200 rpm until cultures reached a density between 0.6 and 0.9 (OD600). A culture volume of 100 ml was crosslinked with 1% formaldehyde for 10 min at 30°C with gentle mixing. The crosslinking reaction was quenched with 0.3 M glycine for 5 min at 30°C with gentle mixing. A volume of 50 ml was collected by centrifugation and the pellet was washed twice in ice-cold Tris-buffered saline (20 mM Tris-HCl, pH 7.5; 150 mM NaCl), snap-frozen in liquid nitrogen, and stored at –80°C for up to 2 weeks. Pellets were briefly thawed on ice and resuspended in 600 µl of ice-cold FA lysis buffer (50 mM HEPES-KOH, pH 7.5; 140 mM NaCl; 1 mM EDTA; 1% Triton X-100; 0.1% sodium deoxycholate) supplemented with protease inhibitors (1 mM PMSF; 1 µg/ml Leupeptin; 1 µg/ml Pepstatin A; 10 µg/ml Aprotinin). A volume of 600 µl zirconia beads (0.5 mm diameter) was added, and cells were lysed by bead-beating at 4°C in a Vortex Genie for 7 cycles of: 3 min on (highest setting); 1 min on ice. The lysate was separated from the beads and brought to a final volume of 600 µl with ice-cold FA lysis buffer, which was split into two 300 µl fractions for sonication. Sonication was performed on a BioRuptor Pico (Diagenode) at 4°C for 6 cycles of: 30 s on (high setting); 30 s off. Debris was pelleted by centrifugation at 17,000 × g for 15 min at 4°C, and the chromatin-containing supernatant was collected.
Immunoprecipitation was adapted from Sump et al., 2022. Dynabeads Protein G (Thermo Fisher Scientific # 10003D) were equilibrated in chilled FA lysis buffer for 2 hr at 4°C on a rotating stand. Simultaneously, 2 mg of chromatin in a 1 ml volume was incubated with 2 µl of Anti-Rpb1 (Clone 8WG16, BioLegend) at 4°C on a rotating stand. 20 µl of equilibrated Dynabeads was added to each chromatin sample and incubated overnight at 4°C on an inverting rotator. Beads were immobilized with a magnetic stand and washed four times in 1 ml of chilled Wash Buffer (50 mM HEPES-KOH, pH 7.5; 500 mM NaCl; 1 mM EDTA; 1% Triton X-100; 0.1% sodium deoxycholate) supplemented with protease inhibitors (see above). Protein of interest was eluted in 100 µl of Elution Buffer (50 mM Tris-HCl, pH 8.0; 10 mM EDTA; 1% SDS) and crosslinks were reversed by heating overnight at 65°C. RNA was degraded by adding 5 µl of RNase A (10 µg/µl) and heating at 37°C for 30 min. Next, 10 µl of Proteinase K (20 µg/µl) was added and samples were incubated at 50°C for 1 hr. DNA was purified with QIAquick spin columns according to the manufacturer’s instructions (QIAGEN # 28104). DNA fragment size was measured on a TapeStation 4150 and confirmed to be approximately 400 bp. Sequencing libraries were prepared from 0.5 ng of DNA with the MicroPlex Library Preparation Kit v3 with dual indexes (Diagenode # C05010001 and C05010004). Libraries were sequenced at NUseq on the NovaSeq X Plus (Illumina) in the paired-end, 50 bp format. Bioinformatic analysis was performed as in ChEC-seq2 (VanBelzen et al., 2024), except reads were mapped with paired-end mode of Bowtie 2 (Langmead and Salzberg, 2012) and the ChEC-specific trimming step was omitted.
SLAM-seq
SLAM-seq was performed as previously described (Herzog et al., 2017) with the following modifications. Approximately 10^8 cells were collected, resuspended in SDC-uracil + 200 µM 4-thiouracil, and incubated for 6 min at 30°C. Cells were collected by centrifugation and frozen in liquid nitrogen. RNA was extracted from cell pellets as described (Schmitt et al., 1990), and purified with the Monarch Total RNA Miniprep Kit (New England Biolabs). Alkylated RNA was purified with the Monarch RNA Cleanup Kit (New England Biolabs). RNA quality was confirmed after each purification with a TapeStation 4150 (Agilent). Sequencing libraries were prepared from 150 ng RNA using the QuantSeq 3’ mRNA-Seq Library Prep Kit FWD (Lexogen). Sequencing was performed on a HiSeq 4000 (Illumina) in the single-end, 50 bp format at the Northwestern University NUseq core facility. In the case of SLAM-seq performed with JBY555 (gcn4-pd-GFP) and JBY556 (GCN4sm-GFP) (Ge et al., 2024), cells were shifted into SDC-uracil with 2 mM 4-thiouracil for 6 min.
Reads were mapped with SlamDunk (Herzog et al., 2017) to the S288C genome (build R64-3-1) and binned into genes classified as Verified or Uncharacterized by the Saccharomyces Genome Database. This yielded counts values for 5925 genes. Counts files were analyzed in R with DESeq2 (Love et al., 2014) to identify differentially expressed genes between conditions.
Immunoblotting
Protein was isolated from cells as described (Rüegsegger et al., 2001) and quantified by BCA protein assay (#23225, Thermo Fisher Scientific). 40 µg of protein was separated on 10% surePAGE Bis-Tris gels in MOPS running buffer (#M00665, GenScript) and transferred to a nitrocellulose membrane. The membrane was blocked with 5% nonfat dry milk in TBST with 0.05% Tween 20 for 1 hr at room temperature and then probed with anti-V5 (#R960-25, Thermo Fisher Scientific) and anti-β-Actin (#GTX629630, GeneTex) primary antibodies overnight at 4°C. Membranes were washed for 5 min with TBST for a total of five washes, and then incubated with goat anti-mouse conjugated with HRP (#AP127P, MilliporeSigma) in 5% milk TBST for 1 hr at room temperature. Washes were repeated and then HRP was activated with chemiluminescence reagents (#34075, Thermo Fisher Scientific) for 5 min. Blots were imaged on a c600 imaging system (Azure Biosystems).
Computational model
We used a stochastic model to simulate the average occupancy of RNAPII along a discretized model gene (Figure 6A), assuming each step in the transcription cycle is a Poisson process. We separately modeled two classes of genes: STM genes and TFO genes. For STM genes, we assume that the association of RNAPII with the gene occurs at the UAS and is reversible, with association rate k1 and dissociation rate k-1. Next, the RNAPII transitions from the UAS to the promoter with rate k2. This rate represents an aggregate step that requires the recruitment of early GTFs such as TFIIA and TFIIB. Because these interactions are reversible, we assume RNAPII can transition back to the UAS from the promoter with rate k-2. When at the promoter, the RNAPII awaits the arrival of late GTFs such as TFIIH to form the complete PIC. This process occurs at the aggregate rate k4. While awaiting arrival of late GTFs, the RNAPII can also dissociate from the promoter with rate k-3. Once the PIC has assembled, TFIIH kinase phosphorylates the C-terminal domain of RNAPII to initiate transcription and promoter escape. This occurs with rate k5. The transcribed region is modeled as 10 identical 120 bp compartments, and the RNAPII moves to each succeeding compartment with rate k6. Finally, once the RNAPII reaches the terminator, it dissociates with rate k7. TFO genes are modeled similarly, with the omission of k1,k-1, k2, and k-2, and instead introducing k3, the rate of recruitment directly to the promoter. The UAS, promoter, and terminator regions are modeled as independent 120 bp compartments. No compartment could be occupied by more than one RNAPII.
We simulated 1000 seconds of the transcription cycle to allow the system to reach steady state. We report the RNAPII occupancy of each segment of the gene over the final 60 s to align with the experimental procedure. The simulated data was then normalized using the L2 norm and scaled to have the same magnitude as the empirical data to approximate the unit conversion to CPM or CPMn. This process was repeated across 100,000 genes and the average occupancy in each region of the gene was recorded. Simulations were performed using the Gillespie algorithm (Gillespie, 1977), a stochastic simulation method that generates statistically correct trajectories of a given system. The algorithm uses random sampling to determine the timing and sequence of state transitions that correspond to different steps in the transcription cycle. Code for the simulations is available on GitHub, copy archived at Brickner, 2024.
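As an illustration of the simulation scheme described above, the following minimal Python sketch runs a Gillespie simulation of a single STM gene with one UAS, one promoter, ten transcribed compartments, and one terminator. It is not the archived simulation code. The compartment indices, the function names (`simulate_gene`, `mean_occupancy`), the separate PIC flag, and every value in `EXAMPLE_RATES` are assumptions chosen only for demonstration, and all rates are assumed to be positive.

```python
import random

# Compartment layout for an illustrative STM gene (indices are assumptions).
UAS, PROMOTER, FIRST_BODY, N_BODY = 0, 1, 2, 10
TERMINATOR = FIRST_BODY + N_BODY   # index 12
N_SITES = TERMINATOR + 1           # 13 compartments in total

# Placeholder rates in s^-1; these are NOT the fitted values from the study.
EXAMPLE_RATES = {"k1": 0.05, "k-1": 0.1, "k2": 0.2, "k-2": 0.05, "k-3": 0.1,
                 "k4": 0.1, "k5": 0.2, "k6": 0.14, "k7": 0.02}


def simulate_gene(rates, t_end=1000.0, window=60.0, rng=random):
    """Gillespie run for one gene; returns the fraction of the final
    `window` seconds during which each compartment was occupied."""
    occ = [0] * N_SITES        # 1 if the compartment holds an RNAPII
    pic = False                # True once the promoter-bound RNAPII is in a full PIC
    dwell = [0.0] * N_SITES    # occupied time accumulated inside the window
    t, t_start = 0.0, t_end - window
    while t < t_end:
        # List every transition allowed in the current state (one RNAPII per compartment).
        events = []
        if occ[UAS]:
            events.append((rates["k-1"], ("off", UAS)))
            if not occ[PROMOTER]:
                events.append((rates["k2"], ("move", UAS, PROMOTER)))
        else:
            events.append((rates["k1"], ("on", UAS)))
        if occ[PROMOTER] and not pic:
            events.append((rates["k-3"], ("off", PROMOTER)))
            events.append((rates["k4"], ("pic",)))
            if not occ[UAS]:
                events.append((rates["k-2"], ("move", PROMOTER, UAS)))
        if pic and not occ[FIRST_BODY]:
            events.append((rates["k5"], ("move", PROMOTER, FIRST_BODY)))
        for i in range(FIRST_BODY, TERMINATOR):
            if occ[i] and not occ[i + 1]:
                events.append((rates["k6"], ("move", i, i + 1)))
        if occ[TERMINATOR]:
            events.append((rates["k7"], ("off", TERMINATOR)))

        total = sum(rate for rate, _ in events)
        dt = rng.expovariate(total)          # waiting time to the next event
        # Credit the current state for the part of [t, t + dt) inside the window.
        lo, hi = max(t, t_start), min(t + dt, t_end)
        if hi > lo:
            for i in range(N_SITES):
                dwell[i] += occ[i] * (hi - lo)
        t += dt
        if t >= t_end:
            break
        # Pick one event with probability proportional to its rate.
        r = rng.random() * total
        for rate, action in events:
            r -= rate
            if r < 0:
                break
        if action[0] == "on":
            occ[action[1]] = 1
        elif action[0] == "off":
            occ[action[1]] = 0
        elif action[0] == "pic":
            pic = True
        else:                                # "move" from action[1] to action[2]
            occ[action[1]], occ[action[2]] = 0, 1
            if action[2] == FIRST_BODY:
                pic = False                  # promoter escape clears the PIC flag
    return [d / window for d in dwell]


def mean_occupancy(rates, n_genes=200, seed=0):
    """Average the per-compartment occupancy over independent simulated genes."""
    rng = random.Random(seed)
    totals = [0.0] * N_SITES
    for _ in range(n_genes):
        for i, value in enumerate(simulate_gene(rates, rng=rng)):
            totals[i] += value
    return [x / n_genes for x in totals]


if __name__ == "__main__":
    print([round(v, 3) for v in mean_occupancy(EXAMPLE_RATES)])
```

Under the same assumptions, a TFO gene could be sketched by dropping the UAS events and adding a direct recruitment event to the promoter with rate k3.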
Model fitting
Several parameters in the model were fixed according to previously published data; k1 and k-1 were from Rosen et al., 2020; k5 was based on the residency time of TFIIH (Nguyen et al., 2021); k6 was based on an average elongation rate of 1000 bp/min (Larson et al., 2011; Zenklusen et al., 2008) and k7 was based on 56±20 s and 70±41 s termination times (Larson et al., 2011; Zenklusen et al., 2008). Other parameters in the model were free and were fit to either ChEC-seq2 or ChIP-seq data by performing a grid search.
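To make the unit conversions concrete, the short Python sketch below turns a published elongation speed and mean dwell times into per-second rates. It assumes that the rate of moving between 120 bp compartments is the elongation speed divided by the compartment length, and that a Poisson step with mean waiting time T has rate 1/T. The function names are illustrative and the printed values are not the parameters used in the study.

```python
COMPARTMENT_BP = 120  # compartment length stated in the model description


def hop_rate(elongation_bp_per_min, compartment_bp=COMPARTMENT_BP):
    """Per-compartment transition rate (s^-1) implied by an elongation speed."""
    return elongation_bp_per_min / 60.0 / compartment_bp


def rate_from_dwell(mean_dwell_s):
    """Rate (s^-1) of a Poisson step whose mean waiting time is mean_dwell_s."""
    return 1.0 / mean_dwell_s


if __name__ == "__main__":
    # 1000 bp/min over 120 bp compartments
    print(f"k6 is approximately {hop_rate(1000):.3f} per second")
    # Terminator dwell times of 30 s, 56 s, and 70 s expressed as rates
    for dwell in (30, 56, 70):
        print(f"dwell {dwell} s -> k7 of about {rate_from_dwell(dwell):.4f} per second")
```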
We evaluated each model in the grid by computing the cosine similarity between the output of the model and the empirical data. That is, we calculated the quantity
$$S = \frac{\sum_i m_i d_i}{\sqrt{\sum_i m_i^2}\,\sqrt{\sum_i d_i^2}}$$
where $m_i$ is the average occupancy of the model in the ith segment (UAS, promoter, transcript, or 3’UTR) and $d_i$ is the corresponding empirical data from the same segment. The cosine similarity ranges from –1 to 1, with 1 indicating perfect alignment, 0 indicating no correlation, and –1 indicating perfect inverse alignment. This measure allows us to quantitatively assess how well each model’s predictions align with the observed data simultaneously across gene regions. Rather than choosing the single model with the best fit, we elected to use an ensemble approach to more thoroughly interpret the data. In this approach, all models with cosine similarity greater than 0.995 were included in the ensemble (for ChEC-seq2). This ensemble approach allows us to explore the full space of models that are consistent with the data and avoid any spurious conclusions that may arise from the investigation of a single parameter set. The recovered ensemble of models was distributed across a manifold in parameter space, establishing required relationships between the unknown parameters (Figure 6—figure supplement 1, Figure 7—figure supplement 1). For ChIP-seq data, the model could not achieve a cosine similarity greater than 0.85, so instead we report the best fitting models to provide context. Genes with fewer than 50 nascent read counts were removed from the STM and TFO datasets, yielding 643 STM genes and 1143 TFO genes.
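The scoring and ensemble selection described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names (`cosine_similarity`, `rescale_to_data`, `select_ensemble`) and the dictionary layout of the grid are hypothetical, and the occupancy vectors are assumed to hold one value per segment in the same order for model and data.

```python
import math


def cosine_similarity(model, data):
    """Cosine of the angle between two equal-length occupancy vectors."""
    dot = sum(m * d for m, d in zip(model, data))
    norm_m = math.sqrt(sum(m * m for m in model))
    norm_d = math.sqrt(sum(d * d for d in data))
    if norm_m == 0 or norm_d == 0:
        raise ValueError("cosine similarity is undefined for an all-zero vector")
    return dot / (norm_m * norm_d)


def rescale_to_data(model, data):
    """L2-normalize the model vector, then give it the L2 magnitude of the data."""
    norm_m = math.sqrt(sum(m * m for m in model))
    norm_d = math.sqrt(sum(d * d for d in data))
    return [m / norm_m * norm_d for m in model]


def select_ensemble(grid_outputs, data, threshold=0.995):
    """Keep every parameter set whose simulated occupancy scores above threshold.

    grid_outputs maps a parameter tuple to its simulated occupancy vector.
    """
    ensemble = {}
    for params, occupancy in grid_outputs.items():
        score = cosine_similarity(occupancy, data)
        if score > threshold:
            ensemble[params] = score
    return ensemble
```

Because the cosine similarity does not depend on the overall magnitude of either vector, rescaling the simulated occupancy in this sketch affects only how it is displayed next to the data, not the score.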
Based on the established functions of the proteins involved (TFIIB, Kin28, or Gcn4), we identified the rate that would be most likely influenced by the experimental perturbation and simulated the effects of perturbing that rate. If altering that rate was not sufficient to match the data, the effects of changing additional rates were explored to identify the model that best matched the data. Changes to rates that did not match the empirical data are not shown. The final list of parameters used to simulate each experiment is given in Table 2.
Data analysis
Gene classifications, coordinates, and regions
The S288C genome sequence and annotations from build R64-3-1 were used for analysis and visualization (Engel et al., 2014). The STM and TFO gene classifications are from Rossi et al., 2021. TATA positions were from Rhee and Pugh, 2012. The top 150 expressed genes within each class were defined by nascent RNA counts (SLAM-seq) from the BY4741 strain grown in SDC and are listed in Supplementary file 1. Similarly, expressed gene subsets were defined as genes for which there were ≥50 nascent RNA counts on average across three biological replicates. This resulted in the following numbers of genes per expressed subset: STM, 643 genes; TFO, 1143 genes; TATA-containing, 597 genes.
TSS and transcription end site (TES) locations were defined by an RNA-seq dataset (Pelechano et al., 2013), when available. In cases where no TSS was available from RNA-seq, the TSS was instead taken from a CAGE-seq dataset (Lu and Lin, 2021). If neither dataset contained TSS or TES information, the median 5’UTR length (47 bp) or 3’UTR length (118 bp) was used to define these locations, respectively. Median UTR lengths were calculated from the most abundant transcript isoform for mRNAs (Pelechano et al., 2013). ChEC-seq2 signal was binned into gene regions defined as: UAS, –500 bp to –151 relative to TSS; promoter, –150 to +25 relative to TSS; transcript, +26 relative to TSS to –76 relative to TES; terminator, –75 to +150 relative to TES.
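The expression filters described above amount to a threshold on the replicate mean and a ranking. A minimal Python sketch follows, assuming a hypothetical dictionary that maps each gene name to its nascent RNA counts from the biological replicates; the function names are illustrative.

```python
def mean_counts(replicates):
    """Average nascent RNA count across biological replicates."""
    return sum(replicates) / len(replicates)


def expressed_genes(counts, min_mean=50):
    """Genes whose mean nascent count across replicates is at least min_mean."""
    return [gene for gene, reps in counts.items() if mean_counts(reps) >= min_mean]


def top_expressed(counts, n=150):
    """The n genes with the highest mean nascent count, highest first."""
    ranked = sorted(counts, key=lambda gene: mean_counts(counts[gene]), reverse=True)
    return ranked[:n]
```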
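The region definitions above can be expressed as a small helper that converts a TSS and TES into genomic intervals. This Python sketch is illustrative only. It assumes that offsets are plain signed distances from the TSS or TES (offset 0 is the annotated site itself), that intervals are inclusive, and that negative-strand genes are handled by mirroring the offsets. The function name `gene_regions` is hypothetical.

```python
def gene_regions(tss, tes, strand="+"):
    """Return inclusive (start, end) genomic intervals for each gene region.

    Offsets follow the definitions in the text: UAS -500..-151 and promoter
    -150..+25 relative to the TSS, transcript from TSS+26 to TES-76, and
    terminator -75..+150 relative to the TES.
    """
    sign = 1 if strand == "+" else -1

    def span(anchor_a, offset_a, anchor_b, offset_b):
        # Convert two strand-aware offsets to an ascending genomic interval.
        a = anchor_a + sign * offset_a
        b = anchor_b + sign * offset_b
        return (min(a, b), max(a, b))

    return {
        "UAS": span(tss, -500, tss, -151),
        "promoter": span(tss, -150, tss, 25),
        "transcript": span(tss, 26, tes, -76),
        "terminator": span(tes, -75, tes, 150),
    }
```

For very short genes, the transcript interval in this sketch can collapse or invert, so a real pipeline would need an explicit rule for that case.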
Individual gene plots
A region spanning 1000 bp upstream of the TSS and 1000 bp downstream of the TES is shown. Signal was smoothed with a sliding window average (window = 10, step = 5).
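A sliding window average with these parameters can be sketched as below. The sketch assumes that the window and step are counted in positions of a per-base-pair signal and that partial windows at the end of the region are dropped; neither detail is specified in the text, and the function name is hypothetical.

```python
def sliding_window_mean(signal, window=10, step=5):
    """Average `signal` over windows of `window` positions, advancing by `step`.

    Assumption: only complete windows are kept, so the tail of the signal that
    does not fill a full window is ignored.
    """
    smoothed = []
    for start in range(0, len(signal) - window + 1, step):
        chunk = signal[start:start + window]
        smoothed.append(sum(chunk) / window)
    return smoothed

# Toy per-base-pair signal of 20 positions.
toy_signal = list(range(20))
toy_smoothed = sliding_window_mean(toy_signal)
```

With a window of 10 and a step of 5, adjacent windows overlap by half, so each position away from the edges contributes to two smoothed values.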
Metasite plots
Genes were aligned by TSS or TATA sequence, as indicated in the figure. A region spanning 250 bp upstream and downstream of the aligned site was included. Signal was smoothed with a sliding window average (window = 10, step = 5).
Metagene plots
Metagene plots are composed of three regions: 1000 bp upstream of the TSS, the transcript (TSS to TES), and 1000 bp downstream of the TES. First, the average signal (or change in signal, where indicated) at each base pair from three biological replicates was calculated. Then, each region was divided into 100 bins and the average signal in each bin was calculated. The process was repeated for each gene, and then the average signal for each bin across all genes was calculated and is displayed in metagene plots.
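The averaging steps described above can be sketched as follows. This is a simplified illustration, not the code used to generate the figures: it assumes that each gene is supplied as three per-base-pair regions (upstream, transcript, downstream), each holding one signal list per replicate and already oriented 5' to 3', and that every region is at least as long as the number of bins. The function names and the data layout are hypothetical.

```python
def replicate_mean(replicates):
    """Average the signal at each base pair across biological replicates."""
    return [sum(values) / len(values) for values in zip(*replicates)]

def bin_means(values, n_bins=100):
    """Split a region into n_bins nearly equal bins and average within each bin."""
    n = len(values)
    if n < n_bins:
        raise ValueError("region is shorter than the number of bins")
    edges = [i * n // n_bins for i in range(n_bins + 1)]
    return [sum(values[a:b]) / (b - a) for a, b in zip(edges, edges[1:])]

def gene_profile(regions, n_bins=100):
    """Concatenate the binned upstream, transcript, and downstream regions of one gene."""
    profile = []
    for replicates in regions:
        profile.extend(bin_means(replicate_mean(replicates), n_bins))
    return profile

def metagene(genes, n_bins=100):
    """Average the binned profiles across all genes, bin by bin."""
    profiles = [gene_profile(regions, n_bins) for regions in genes]
    return [sum(column) / len(column) for column in zip(*profiles)]

# Toy example: two genes, three replicates each, two bins per region.
gene1 = ([[0, 0, 2, 2]] * 3, [[4, 4, 4, 4, 8, 8, 8, 8]] * 3, [[1, 1, 1, 1]] * 3)
gene2 = ([[2, 2, 4, 4]] * 3, [[0, 0]] * 3, [[3, 3, 5, 5]] * 3)
toy_metagene = metagene([gene1, gene2], n_bins=2)
```

Binning each region separately rescales transcripts of different lengths onto a common axis, which is why the two toy transcripts of 8 bp and 2 bp can be averaged bin by bin.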
Materials availability
The plasmids and strains described in Supplementary files 2 and 3 are available upon request.
Data availability
Sequencing data have been deposited in the Gene Expression Omnibus at the National Center for Biotechnology Information and can be retrieved with accession numbers GSE267843 and GSE246951. Scripts used in modeling are available on GitHub (copy archived at Brickner, 2024).
- NCBI Gene Expression Omnibus, ID GSE267843. ChEC-seq2 of RNA Polymerase II and preinitiation Complex in S. cerevisiae.
- NCBI Gene Expression Omnibus, ID GSE246951. ChEC-seq2: an improved chromatin endogenous cleavage sequencing method and bioinformatic analysis pipeline for mapping in vivo protein-DNA interactions.
References
- Promoter-proximal pausing of RNA polymerase II: emerging roles in metazoans. Nature Reviews. Genetics 13:720–731. https://doi.org/10.1038/nrg3293
- DNA zip codes control an ancient mechanism for gene targeting to the nuclear periphery. Nature Cell Biology 12:111–118. https://doi.org/10.1038/ncb2011
- Software: RNAPII_kinetics_simulation, version swh:1:rev:d967d221e645ad6e5e3fed3c0f17eac9302005b0. Software Heritage.
- Messenger RNA synthesis in mammalian cells is catalyzed by the phosphorylated form of RNA polymerase II. The Journal of Biological Chemistry 262:12468–12474.
- Defining the status of RNA polymerase at promoters. Cell Reports 2:1025–1035. https://doi.org/10.1016/j.celrep.2012.08.034
- Promoter-proximal pausing of RNA polymerase II: a nexus of gene regulation. Genes & Development 33:960–982. https://doi.org/10.1101/gad.325142.119
- In vivo dynamics of RNA polymerase II transcription. Nature Structural & Molecular Biology 14:796–806. https://doi.org/10.1038/nsmb1280
- High sequence specificity of micrococcal nuclease. Nucleic Acids Research 9:2659–2673. https://doi.org/10.1093/nar/9.12.2659
- The reference genome sequence of Saccharomyces cerevisiae: then and now. G3: Genes, Genomes, Genetics 4:389–398. https://doi.org/10.1534/g3.113.008995
- Exact stochastic simulation of coupled chemical reactions. The Journal of Physical Chemistry 81:2340–2361. https://doi.org/10.1021/j100540a008
- Eukaryotic transcription activation: right on target. Molecular Cell 18:399–402. https://doi.org/10.1016/j.molcel.2005.04.017
- Thiol-linked alkylation of RNA to assess expression dynamics. Nature Methods 14:1198–1204. https://doi.org/10.1038/nmeth.4435
- Sequence specific cleavage of DNA by micrococcal nuclease. Nucleic Acids Research 9:2643–2658. https://doi.org/10.1093/nar/9.12.2643
- Termination and pausing of RNA polymerase II downstream of yeast polyadenylation sites. Molecular and Cellular Biology 13:5159–5167. https://doi.org/10.1128/mcb.13.9.5159-5167.1993
- ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Research 22:1813–1831. https://doi.org/10.1101/gr.136184.111
- Fast gapped-read alignment with Bowtie 2. Nature Methods 9:357–359. https://doi.org/10.1038/nmeth.1923
- Evidence for nucleosome depletion at active regulatory regions genome-wide. Nature Genetics 36:900–905. https://doi.org/10.1038/ng1400
- Purification of P-TEFb, a transcription factor required for the transition into productive elongation. The Journal of Biological Chemistry 270:12335–12338. https://doi.org/10.1074/jbc.270.21.12335
- The poly(A)-dependent transcriptional pause is mediated by CPSF acting on the body of the polymerase. Nature Structural & Molecular Biology 14:662–669. https://doi.org/10.1038/nsmb1253
- The poly(A) signal, without the assistance of any downstream element, directs RNA polymerase II to pause in vivo and then to release stochastically from the template. The Journal of Biological Chemistry 277:42899–42911. https://doi.org/10.1074/jbc.M207415200
- The Mediator complex as a master regulator of transcription by RNA polymerase II. Nature Reviews. Molecular Cell Biology 23:732–749. https://doi.org/10.1038/s41580-022-00498-3
- Structure and mechanism of the RNA polymerase II transcription machinery. Genes & Development 34:465–488. https://doi.org/10.1101/gad.335679.119
- ChIC and ChEC; genomic mapping of chromatin proteins. Molecular Cell 16:147–157. https://doi.org/10.1016/j.molcel.2004.09.007
- A rapid and simple method for preparation of RNA from Saccharomyces cerevisiae. Nucleic Acids Research 18:3091–3092. https://doi.org/10.1093/nar/18.10.3091
- Inhibition of in vivo and in vitro transcription by monoclonal antibodies prepared against wheat germ RNA polymerase II that react with the heptapeptide repeat of eukaryotic RNA polymerase II. The Journal of Biological Chemistry 264:11511–11520.
- Live imaging of transcription sites using an elongating RNA polymerase II-specific probe. The Journal of Cell Biology 221:e202104134. https://doi.org/10.1083/jcb.202104134
- Single-RNA counting reveals alternative modes of gene expression in yeast. Nature Structural & Molecular Biology 15:1263–1271. https://doi.org/10.1038/nsmb.1514
Article and author information
Funding
National Institute of General Medical Sciences (R35GM136419)
- Jason H Brickner
National Science Foundation
- Jake VanBelzen
National Science Foundation (DMS-1547394)
- Bennet Sakelaris
National Institute of General Medical Sciences (T32GM008061)
- Jake VanBelzen
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
The authors thank Professors Vu Nguyen (University of California, San Diego), David Shore (University of Geneva), Yuan He (NU), Shelby Blythe (NU), Curt Horvath (NU), and Richard Morimoto (NU) for helpful feedback and support, members of the Brickner laboratory for helpful comments on the manuscript, and Gabe Zentner for yeast strains, plasmids, and technical advice. DJV was supported by a National Science Foundation Graduate Fellowship and by T32 NIGMS GM008061. BS was supported by National Science Foundation research training grant DMS-1547394. This work was supported by National Institute of General Medical Sciences grant R35GM136419 (JHB).
Cite all versions
You can cite all versions using the DOI https://doi.org/10.7554/eLife.100764. This DOI represents all versions, and will always resolve to the latest one.
Copyright
© 2024, VanBelzen et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.