Transcription regulation in metazoans often involves promoter-proximal pausing of RNA polymerase (Pol) II, which requires the 4-subunit negative elongation factor (NELF). Here we discern the functional architecture of human NELF through X-ray crystallography, protein crosslinking, biochemical assays, and RNA crosslinking in cells. We identify a NELF core subcomplex formed by conserved regions in subunits NELF-A and NELF-C, and resolve its crystal structure. The NELF-AC subcomplex binds single-stranded nucleic acids in vitro, and NELF-C associates with RNA in vivo. A positively charged face of NELF-AC is involved in RNA binding, whereas the opposite face of the NELF-AC subcomplex binds NELF-B. NELF-B is predicted to form a HEAT repeat fold, also binds RNA in vivo, and anchors the subunit NELF-E, which is confirmed to bind RNA in vivo. These results reveal the three-dimensional architecture and three RNA-binding faces of NELF.
https://doi.org/10.7554/eLife.14981.001
Transcription of eukaryotic protein-coding genes by RNA polymerase II (Pol II) is not only regulated during the initiation phase (Hahn and Young, 2011; Sainsbury et al., 2015) but also during elongation (Jonkers and Lis, 2015; Li and Gilmour, 2011; Yamaguchi et al., 2013). For many metazoan genes, elongating Pol II pauses near the promoter, about 20-60 base pairs downstream of the transcription start site (TSS) (Kwak and Lis, 2013). Such promoter-proximal pausing is a key event during post-initiation regulation of transcription (Muse et al., 2007; Zeitlinger et al., 2007). Genes involved in cellular responses, differentiation, and reprogramming are subject to regulation at the step of promoter-proximal pausing (Williams et al., 2015; Min et al., 2011).
Promoter-proximal pausing requires the DRB sensitivity-inducing factor (DSIF), a heterodimer of subunits Spt4 and Spt5 (Wada et al., 1998; Yamaguchi et al., 1999b). DSIF binds over the active site cleft of the Pol II elongation complex to encircle nucleic acids bound in the cleft (Martinez-Rucobo et al., 2011; Klein et al., 2011). Promoter-proximal pausing also employs the negative elongation factor (NELF) (Pagano et al., 2014; Yamaguchi et al., 1999a), which comprises the four subunits NELF-A, -B, -C (or its variant -D, which lacks nine N-terminal amino acid residues), and -E (Narita et al., 2003).
DSIF and NELF assemble early with transcribing Pol II, which leads to stable promoter-proximal pausing (Henriques et al., 2013; Wu et al., 2003; Yamaguchi et al., 1999a). Nucleosomes may also contribute to pausing (Gilchrist et al., 2008, 2010; Jimeno-González et al., 2015). Pol II pause release relies on a kinase complex called positive transcription elongation factor b (P-TEFb) (Chiba et al., 2010). P-TEFb phosphorylates DSIF, NELF and the Pol II C-terminal domain (CTD), which encodes 52 heptapeptide repeats (consensus sequence Y1S2P3T4S5P6S7) that are variably phosphorylated during the transcription cycle (Buratowski, 2009; Cheng and Price, 2007; Fujinaga et al., 2004; Yamada et al., 2006).
DSIF, P-TEFb and NELF are differentially conserved among eukaryotes. DSIF is the most widely conserved complex with homologs present in most eukaryotes and even prokaryotes. P-TEFb homologs are found in most eukaryotes, whereas NELF conservation is more limited. For example, NELF homologs have not been identified in model organisms such as Arabidopsis thaliana, Saccharomyces cerevisiae, and Caenorhabditis elegans (Narita et al., 2003).
NELF was identified as a complex that cooperates with DSIF to repress Pol II elongation (Yamaguchi et al., 1999a). NELF apparently requires a preformed Pol II-DSIF elongation complex for stable binding (Missra and Gilmour, 2010; Narita et al., 2003; Yamaguchi et al., 2002). NELF binding efficiency and the rate of transcription elongation define the position of Pol II pausing (Li et al., 2013). NELF is associated with chromatin (Wu et al., 2005) and represses transcription elongation (Wu et al., 2003; Yamaguchi et al., 2002). The NELF-E subunit contains an RNA recognition motif (RRM) domain that binds RNA (Pagano et al., 2014; Rao et al., 2006; 2008), and this may contribute to pausing (Yamaguchi et al., 2002; Fujinaga et al., 2004). NELF is required for the regulation of the heat stress gene hsp70 (Wu et al., 2003), immediate early genes such as junb (Aida et al., 2006), and expression of genes of the human immunodeficiency virus (Natarajan et al., 2013; Zhang et al., 2007). It was recently suggested that NELF binds to enhancer RNAs (Schaukowitch et al., 2014).
Despite the central role of promoter-proximal pausing in gene regulation, the molecular mechanisms for Pol II pausing are unknown. Elucidating this mechanism requires structural information of the NELF complex. Here we show that regions of human NELF-A and NELF-C form a highly conserved core subcomplex with a novel fold. One side of this NELF-AC subcomplex exhibits a conserved binding face for single-stranded nucleic acids. RNA binding experiments in human cells reveal that NELF-B, NELF-C, and NELF-E associate with RNA in vivo. Our data provide the first structural model of NELF and extend our understanding of its RNA binding surfaces.
In a long-standing effort to obtain structural information on the intrinsically flexible NELF complex, we delineated regions in human NELF subunits that form soluble subcomplexes amenable to structural analysis (Figure 1A, Table 1, Figure 1—figure supplement 1, Materials and methods). Bacterial co-expression of NELF subunit variants revealed that the N-terminal region of NELF-A could be co-purified with NELF-C. Limited proteolysis and co-expression analysis with truncated protein variants showed that the N-terminal residues 6–188 of human NELF-A and residues 183–590 of human NELF-C formed a stable subcomplex (‘NELF-AC’) (Figure 1—figure supplement 1B). Purified NELF-AC could be crystallized by vapor diffusion, and the X-ray structure was solved by single isomorphous replacement with anomalous scattering (SIRAS) (Figure 1—figure supplement 1C–E, Materials and methods). The structure contained one NELF-AC heterodimer in the asymmetric unit and was refined to a free R-factor of 25.6% at 2.8Å resolution (Table 2). The structure shows very good stereochemistry and lacks only the mobile NELF-A residues 183–188, and NELF-C residues 183–185, 401–402, 445–448, 523, and 564–572.
The structure of human NELF-AC reveals a novel fold and an extended interface between the two NELF subunits (Figures 1B, 2). NELF-C adopts a horseshoe-like structure (Figure 2A). NELF-C consists of 22 α-helices (α1’-α22’) and a small two-stranded β-sheet (β1’-β2’, residues 367–379) that protrudes from the surface. The C-terminal half of NELF-C (helices α14’-α19’) forms three HEAT repeats (H1-H3). The HEAT repeat region shows structural similarity (Holm and Rosenstrom, 2010) to the C-terminal repeat domain (CTD)-interacting domain (CID) (Meinhart and Cramer, 2004) and the polyadenylation factor symplekin (Xiang et al., 2010). Despite the presence of a CID-like fold, NELF-AC did not show significant binding to CTD diheptad peptides carrying phosphorylations at CTD residues serine-2, serine-5, serine-2 and serine-5, or a consensus non-phosphorylated CTD diheptad peptide (not shown). Subunit NELF-A forms a highly conserved helical ‘N-terminal domain’ (helices α1–α5, residues 6–110) that resembles (Holm and Rosenstrom, 2010) the fold of the HIV integrase-binding domain present in human PC4 and SFRS1-interacting protein (PSIP1) (Cherepanov et al., 2005) (Figure 2B). This domain is followed by an ‘extended region’ in NELF-A that forms four additional helices (helices α6-α9, residues 111–182) arrayed around the NELF-C horseshoe (Figures 1B, 2A).
Both NELF-A regions interact extensively with NELF-C through hydrophobic and polar contacts. Two invariant tryptophan side chains (W24 and W89) on the NELF-A N-terminal domain insert into largely conserved hydrophobic pockets of NELF-C (Figures 1B, 2C). The extended region of NELF-A is essential for NELF-C interaction (Narita et al., 2003) and contacts the N- and C-terminal regions of NELF-C with its helices α6 and α9, respectively. NELF-A helices α7 and α8 are buried in the NELF-C horseshoe (Figure 2D). Overall, the heterodimer interface has a large surface area (3690 Å2), explaining the stability of the complex in 2 M sodium chloride (not shown).
The crystallized regions of human NELF-AC share considerable homology among metazoans, particularly at residues forming the hydrophobic cores and the interface between NELF-A and NELF-C (Figure 1B). The extent of conservation is evident when human and Drosophila melanogaster are compared, which share 55% identity for NELF-A and 50% identity for NELF-C. Intriguingly, NELF-A and -C homologs are present in some worms such as the filarial nematode Loa loa (Figure 1—figure supplements 2, 3) and single-celled organisms such as the green alga Chlorella variabilis and the slime mold Dictyostelium discoideum (Figure 1—figure supplements 2, 3). Most regions outside of the crystallized NELF-AC core diverge between single-celled organisms and metazoans (Figure 1—figure supplements 2, 3). Such conservation suggests that NELF may have been present in early eukaryotes and was lost in certain lineages over time.
To place the NELF-AC crystal structure in context of the NELF tetramer, we determined the architecture of the four-subunit NELF complex by lysine specific crosslinking followed by mass spectrometry. We expressed the full-length four-subunit NELF complex recombinantly in insect cells from a single virus and purified it to homogeneity (Materials and methods, Figure 3A). The purified complex was crosslinked with disuccinimidyl suberate (DSS) and lysine-lysine crosslinks were detected by mass spectrometry as previously described (Herzog et al., 2012). We obtained a total of 424 unique high-confidence lysine-lysine crosslinks, including 279 intersubunit and 145 intrasubunit crosslinks (Figure 3—figure supplement 1A,B, Figure 3—source data 1).
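The distinction between intersubunit and intrasubunit crosslinks can be illustrated with a minimal Python sketch. The record layout and the four example pairs are hypothetical and chosen only for illustration; they are not a transcription of the source data. The final check simply confirms that the reported counts (279 intersubunit plus 145 intrasubunit) add up to the stated total of 424.

```python
from collections import Counter

# Hypothetical crosslink records: (subunit_1, lysine_1, subunit_2, lysine_2).
# These pairs are illustrative placeholders, not the published crosslink list.
crosslinks = [
    ("NELF-A", 55, "NELF-B", 72),
    ("NELF-C", 494, "NELF-C", 518),
    ("NELF-E", 260, "NELF-E", 332),
    ("NELF-C", 125, "NELF-B", 85),
]


def classify(link):
    """Return 'intra' when both lysines lie on the same subunit, else 'inter'."""
    subunit_1, _, subunit_2, _ = link
    return "intra" if subunit_1 == subunit_2 else "inter"


counts = Counter(classify(link) for link in crosslinks)

# Consistency check on the totals reported in the text.
reported_inter, reported_intra, reported_total = 279, 145, 424
totals_consistent = (reported_inter + reported_intra) == reported_total
```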
Our NELF-AC crystal structure explained 11 inter- and intrasubunit crosslinks, with Cα distances below the maximum allowed distance of 30 Å (Figure 3C, Figure 3—figure supplement 1C). We detected only one NELF-A and five NELF-C intrasubunit crosslinks that exceeded a Cα distance of 30 Å, and these could generally be explained by local flexibility. However, for NELF-C, the >30 Å intrasubunit crosslinks occur between α helices 12’ and 13’ (K353, K380, K384, K388) and α helices 18’ and 19’ (K494, K518) (Figures 2, 4A). These crosslinks suggest helices 12’ and 13’ may change conformation. Together these data indicate that the structure of NELF-AC is largely preserved within the complete NELF complex.
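The 30 Å criterion used above amounts to a Euclidean distance test between the Cα atoms of the two crosslinked lysines. The following Python sketch shows that test under stated assumptions: the coordinates are invented placeholders (not taken from the deposited structure), and the function names are hypothetical.

```python
import math

MAX_CA_DISTANCE = 30.0  # Å, the maximum allowed Cα-Cα distance stated in the text


def ca_distance(coord_a, coord_b):
    """Euclidean distance (Å) between two Cα positions given as (x, y, z)."""
    return math.dist(coord_a, coord_b)


def crosslink_satisfied(coord_a, coord_b, cutoff=MAX_CA_DISTANCE):
    """True when the Cα-Cα distance does not exceed the cutoff."""
    return ca_distance(coord_a, coord_b) <= cutoff


# Placeholder coordinates for illustration only.
lys_1 = (0.0, 0.0, 0.0)
lys_2 = (3.0, 4.0, 12.0)    # 13 Å away: compatible with the structure
lys_3 = (20.0, 20.0, 20.0)  # about 34.6 Å away: exceeds the cutoff
```

A crosslink that fails this test is not necessarily wrong; as noted above, it may instead report on local flexibility or an alternative conformation.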
We next used our crosslinking data to determine the topology of the complete NELF complex. NELF-B is the only subunit of the NELF complex with no available structural information. We first used our crosslinking data to determine the relative architecture of NELF-B. Intrasubunit crosslinks reveal three distinct modules that we identify as the N-terminal, middle, and C-terminal regions. The N-terminal region of NELF-B (85–291) forms extensive crosslinks within itself and with the C-terminal region (438–519). The C-terminal region also crosslinks considerably with itself. The middle region (291–438) forms intrasubunit crosslinks within but not outside of the module, suggesting that it may act as a hinge between the N- and C- terminal regions.
We next used the program I-TASSER to generate homology based models of NELF-B (Kelley et al., 2015; Yang et al., 2015). I-TASSER predicted that NELF-B forms a HEAT repeat fold (C-score = -2.31, best template structure is 1B3U – human PP2A). The model is supported by our crosslinking data, suggesting a strong curvature of the HEAT repeat fold, as observed for a HEAT repeat protein folding around its interaction partner (Cingolani et al., 1999) (Figure 3—figure supplement 1D).
NELF-B crosslinks with NELF-AC via its N- and C- terminal regions. The crosslinks primarily map to a single face of NELF-AC (Figure 4A). Two NELF-A residues present in the NELF-AC crystal structure, K55 and K166, form crosslinks with NELF-B residues K72, K278, K487 and K85 (Figure 3D). NELF-C primarily crosslinks to the N-terminal region of NELF-B (K85, K92, K126, K146) via α helices 12’ and 13’ and helices 18’ and 19’ (Figure 3E). Interestingly, one N-terminal residue of NELF-C that is not present in our crystal structure (K125) forms three crosslinks with NELF-B (K85, K278, and K497). NELF-B and -E crosslink extensively, consistent with biochemical interaction data (Narita et al., 2003) (Figure 3F). The N-terminus and the RRM domain of NELF-E (Rao et al., 2006) (residues 257–335, PDB ID: 2JX2) crosslink to both the N- and C-terminal regions of NELF-B. We also detect an intrasubunit crosslink between NELF-E residues K260 and K332, which are located on the same face of the RRM (Figure 3—figure supplement 1E), supporting the general conservation of the fold in the complex.
With respect to crosslinks between NELF-E and NELF-C, two lysines in the non-crystallized N-terminal region of NELF-C (K66 and K125) form several crosslinks with NELF-E, including its RRM domain. Regions of NELF-C in the vicinity of helices 18’ and 19’, which are also responsible for some of the detected crosslinks between NELF-B and -C, crosslink with the NELF-E N-terminal region and the RRM (Figures 3G, 4A). No crosslinks were detected between the crystallized region of NELF-A and NELF-E (Figure 3H). The region directly following the crystallized portion of NELF-A (190–255) forms multiple crosslinks with the rest of the NELF complex. Given that NELF-A (190–255) is highly susceptible to proteolysis and is predicted to be the primary region associating with Pol II, the multiple interprotein crosslinks formed by NELF-A (190–255) with NELF-B, -C, and -E are likely favored by flexibility within the molecule when Pol II is absent (Narita et al., 2003).
Our crosslinking data suggest a 3D topology for the entire NELF complex. The NELF-AC subcomplex interacts with NELF-B primarily through contacts made by NELF-C and the N-terminal region of NELF-B. The opposite face of NELF-AC remains solvent exposed in the complex. NELF-B with its predicted HEAT repeats forms a cradle around the N-terminal region of NELF-E, tethering the RRM domain to the rest of the complex. Taken together, NELF is a modular, flexible, multivalent complex with many interaction faces for both nucleic acids and protein partners.
Analysis of the surface of the NELF-AC structure showed that the face opposite of where we detected NELF-BE crosslinks contains four positively charged patches (Figure 4A–C, bottom view). Patch 1 consists of NELF-A residues R65 and R66 and NELF-C residues R291 and K315. Patch 2 encompasses NELF-C residues K371, K372, and K374, and patch 3 contains NELF-C residues K384 and K388. Patch 4 is composed of NELF-A residues K146, K161, K168, and R175, and NELF-C residues R419 and R506. These patches are well conserved among metazoans, and are partially conserved in Dictyostelium (Figure 1B). Our crosslinking MS data revealed that all positive patches, except for patch 3 (NELF-C K384, K388), were devoid of crosslinks with NELF-B or -E indicating that surface patches 1, 2, and 4 are not involved in subunit contacts. In addition to the four positive patches, the NELF-AC surface contains a conserved polar patch (patch 5) that is formed by NELF-A residues K166, R167, K170, L174, E177, K181, and S182, and residues E491, K494, D498, D526, S528, R531, Y532, T535 and E536 that protrude from NELF-C helices α18’ and α20’ (Figure 4—figure supplement 1). Interestingly, three lysines in this region (NELF-A K166, K181, NELF-C K494) crosslink to NELF-B.
The positively charged patches of NELF-AC suggested that the subcomplex may associate with nucleic acid. To investigate this, we used fluorescence anisotropy titration assays (Figure 5, Materials and methods). We first assessed NELF-AC binding to 25-nt, single-stranded (ss) DNA and ssRNA oligonucleotides bearing a 5’ FAM label. Two random sequences with either 44% or 60% GC content were employed. Interestingly, we detected moderate binding of NELF-AC to the ssDNA and ssRNA with 60% GC content. Fitting the resulting binding curves by linear regression analysis gave apparent Kd’s in the low micromolar range (Figure 5A, Table 3). The addition of competitor tRNA did not affect NELF-AC association with the 60% GC ssDNA/ssRNA indicating that the interaction is specific (Figure 5—figure supplement 1A). In contrast, we found that the 44% GC content RNA failed to associate significantly with the NELF-AC complex (Figure 5B). Additionally, NELF-AC did not associate with nucleic acid duplexes composed of the 60% GC sequence (DNA or DNA-RNA hybrids, not shown), suggesting that RNA and DNA binding by the subcomplex may be sequence and structure dependent.
To investigate whether the positively charged patches were involved in nucleic acid binding, we generated NELF-AC variants in which lysine and arginine residues in the patches were substituted with methionine and glutamine, respectively. Indeed, single-stranded nucleic acid binding to the 60% GC RNA and DNA constructs was impaired in variants with mutations in three or four of the positively charged patches (Figure 5C, Table 3). The strongest RNA binding defects are associated with mutations to patches 1 and 4, whereas mutations to patch 2 appear to have a greater impact on ssDNA association.
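The substitution scheme described above (lysine to methionine, arginine to glutamine) can be expressed as a small Python sketch that produces both the mutated sequence and labels in the usual notation (for example K315M or R291Q). The toy sequence, its numbering offset, and the function name are hypothetical; the real NELF-A and NELF-C sequences are not reproduced here.

```python
# Substitution rule stated in the text: K -> M and R -> Q.
SUBSTITUTION = {"K": "M", "R": "Q"}


def apply_patch_mutations(sequence, positions, first_residue_number=1):
    """Mutate the given residue numbers and return (mutated_sequence, labels).

    `first_residue_number` is the residue number of sequence[0], which allows
    truncated constructs to keep full-length numbering.
    """
    residues = list(sequence)
    labels = []
    for position in positions:
        index = position - first_residue_number
        wild_type = residues[index]
        if wild_type not in SUBSTITUTION:
            raise ValueError(f"Residue {position} is {wild_type}, not K or R")
        residues[index] = SUBSTITUTION[wild_type]
        labels.append(f"{wild_type}{position}{SUBSTITUTION[wild_type]}")
    return "".join(residues), labels


# Toy fragment numbered from residue 10 (illustrative only).
mutated, labels = apply_patch_mutations("AKRGK", [11, 12], first_residue_number=10)
```

Raising an error when the targeted residue is neither lysine nor arginine is a design choice that guards against numbering mistakes in truncated constructs.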
We also tested whether single-stranded nucleic acids corresponding to known Pol II in vivo pause sites could associate with NELF-AC (Figure 5—figure supplement 1C inset). A ssRNA oligonucleotide with a sequence corresponding to a promoter-proximal transcript from the junB gene bound NELF-AC with a Kd of ~8.0 ± 0.9 µM, whereas ssDNA corresponding to the non-template strand in this region bound more weakly (Aida et al., 2006) (Figure 5—figure supplement 1C). Furthermore, ssRNA and ssDNA derived from the c-fos promoter-proximal region sequences (Fivaz et al., 2000) also bound NELF-AC, albeit with a preference for DNA (Figure 5—figure supplement 1C). Taken together, these data show that NELF-AC binds single-stranded nucleic acids in vitro via positively charged patches, and suggest that both the strength of binding and the preference for RNA or DNA may be sequence-dependent.
We next addressed whether NELF-AC associates with nucleic acids while residing in the NELF tetramer. The NELF-E RRM is reported to bind RNA in the mid nanomolar to micromolar range (Pagano et al., 2014; Rao et al., 2006) and thus could mask nucleic acid interactions by other subunits in our binding assays. To aid data interpretation, we generated NELF variants that lack the NELF-E RRM or NELF-E entirely. We used our crosslinking and limited proteolysis experiments to generate a NELF-E N-terminal fragment that stably binds NELF-B, but lacks the RRM (NELF-E residues 1–138). The WT NELF tetramer, NELF ∆RRM, and NELF-ABC were overexpressed in insect cells and purified to homogeneity (Materials and methods, Figure 6A, Figure 6—figure supplement 1A).
The WT tetramer, NELF ∆RRM and NELF-ABC were subjected to fluorescence anisotropy titration assays using the labeled 25-nt 60% GC random RNA/DNA oligonucleotides employed for the NELF-AC studies. The WT protein binds both the 60% GC ssDNA and RNA; however, the resulting curves are complex and cannot be fit by a simple single site binding model (Figure 6B, Figure 6—figure supplement 1B). NELF ∆RRM and NELF-ABC also bind the 60% GC RNA, but the curves can be fit by a single site binding model (apparent Kd NELF ∆RRM 30 ± 7 nM, NELF-ABC 75 ± 14 nM), demonstrating that regions of NELF outside of the NELF-E RRM associate with RNA. To further investigate NELF’s RRM domain-independent RNA-binding behavior, a patch mutated variant of NELF-C (residues R291Q, K315M, K371M, K372M, K374M, K384M, K388M, R419Q, R506Q) was used to replace the WT NELF-C protein in the WT, ∆RRM, and ABC complexes (Figure 6—figure supplement 1C). All NELF variants containing patch-mutated NELF-C showed reduced binding to RNA (Figure 6C–E). Binding deficits to the 60% GC RNA and ssDNA were similar in magnitude to those observed with the NELF-C patch mutations present in the NELF-AC subcomplex (Figure 5, Figure 6—figure supplement 1D–F, Table 3).
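For orientation, a single site binding model of the kind referred to above can be sketched in Python as follows. This is a minimal illustration and not the fitting procedure used in this study, which is not described in this section. It assumes that the labeled oligonucleotide is present at a concentration far below the Kd, so that free protein approximates total protein; the anisotropy limits (r_free, r_bound), the titration points, and the grid-search estimator are all hypothetical.

```python
def fraction_bound(protein_nM, kd_nM):
    """Fraction of labeled probe bound for a single site: P / (Kd + P)."""
    return protein_nM / (kd_nM + protein_nM)


def anisotropy(protein_nM, kd_nM, r_free, r_bound):
    """Observed anisotropy as a weighted mix of free and bound probe signals."""
    return r_free + (r_bound - r_free) * fraction_bound(protein_nM, kd_nM)


def estimate_kd(concentrations, signals, r_free, r_bound, kd_grid):
    """Return the Kd on kd_grid that minimizes the sum of squared residuals."""
    def sum_squared_error(kd):
        return sum(
            (anisotropy(c, kd, r_free, r_bound) - s) ** 2
            for c, s in zip(concentrations, signals)
        )
    return min(kd_grid, key=sum_squared_error)


# Simulated, noise-free titration with a Kd of 30 nM (illustrative values).
concentrations = [1, 3, 10, 30, 100, 300, 1000]  # nM protein
signals = [anisotropy(c, 30.0, 0.05, 0.20) for c in concentrations]
best_kd = estimate_kd(concentrations, signals, 0.05, 0.20, range(1, 201))
```

In this model the Kd is the protein concentration at which half of the probe is bound, which is why curves that deviate from a simple hyperbola, as seen for the WT tetramer, cannot be summarized by a single apparent Kd.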
Despite containing the NELF-C patch-mutated variant, NELF ∆RRM and NELF-ABC retain the ability to bind RNA, suggesting that other regions of NELF-AC or NELF-B associate with RNA. To address whether NELF-B can associate with ss nucleic acid, NELF-B or NELF-B with an N-terminal fragment of NELF-E (1–138) were overexpressed in insect cells and purified (Figure 6—figure supplement 1A). Binding experiments performed with these NELF-B variants demonstrate that both bind the 60% GC RNA and ssDNA with affinities similar to that measured for NELF-AC (Table 3, Figure 7A). Together these data indicate that in addition to the NELF-E RRM, the NELF tetramer associates with RNA via NELF-B and -C.
The human immunodeficiency virus (HIV)-1 transactivation response (TAR) element is a hairpin shaped RNA produced +59 nucleotides after Pol II initiates transcription from the HIV-1 long terminal repeat of the integrated HIV-1 virus (Karn and Stoltzfus, 2012) (Figure 7C). The TAR RNA is used to recruit P-TEFb and other factors to promoter proximally paused Pol II (Ott et al., 2011). The NELF-E RRM domain binds the HIV-1 TAR RNA in vitro (Pagano et al., 2014; Rao et al., 2006; Yamaguchi et al., 2002; Fujinaga et al., 2004) and is postulated to regulate Pol II elongation by associating with the TAR element (Karn and Stoltzfus, 2012). To expand our understanding of NELF association with the TAR RNA stem loop, we performed fluorescence anisotropy binding experiments with our collection of NELF variants and a 5’ FAM labeled TAR stem loop. Binding experiments performed with the WT NELF complex revealed strong association between NELF and the TAR RNA stem loop (146 ± 30 nM), similar to affinities reported by the Lis group for the isolated human NELF-E RRM and the TAR stem loop (200 ± 10 nM) (Pagano et al., 2014). Interestingly, NELF complexes lacking the NELF-E RRM or NELF-E retained the ability to associate with the TAR stem loop albeit with a ≈6–10-fold reduction in binding affinity (869 ± 140 nM NELF ∆RRM, NELF-ABC 1.32 ± 0.18 µM). To determine which subcomplex of NELF is responsible for the non-RRM mediated association with the TAR stem loop, we tested binding of NELF-AC and NELF-BE (1–138) to the TAR stem loop. NELF-BE (1–138) modestly associated with the TAR stem loop (5.6 ± 1.0 µM) whereas NELF-AC showed little association (Figure 7D). This further suggests that RNA binding by NELF-AC is influenced by RNA sequence and/or structure and that NELF associates with RNA on surfaces outside of the RRM domain.
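The fold reduction quoted above follows directly from the ratios of the apparent affinities. A short Python check using only the values given in the text (the variable names are arbitrary) yields ratios of roughly 6 and 9, in line with the stated range.

```python
# Apparent affinities for the TAR stem loop, in nM, as stated in the text.
wt_kd_nM = 146.0
variant_kd_nM = {
    "NELF_delta_RRM": 869.0,
    "NELF_ABC": 1320.0,  # 1.32 µM
}

# Fold reduction in affinity relative to the WT tetramer.
fold_reduction = {name: kd / wt_kd_nM for name, kd in variant_kd_nM.items()}
```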
It is known that the NELF-E subunit binds RNA in vitro and in vivo (Pagano et al., 2014; Yamaguchi et al., 2002; Schaukowitch et al., 2014; Missra and Gilmour, 2010). Our biochemical experiments demonstrate that NELF additionally binds to single-stranded nucleic acids via NELF-B and NELF-AC in vitro. To determine whether NELF-B or NELF-AC can also associate with RNA in vivo, we performed photoactivatable-ribonucleoside-enhanced crosslinking and immunoprecipitation experiments in Jurkat and 293FT cells. Cells were treated for 16 hr with 4-thiouridine (4sU) to label RNA and enhance crosslinking efficiency. RNA was crosslinked to associated proteins using UV light at a wavelength of 365 nm prior to immunoprecipitation with subunit-specific antibodies. The immunoprecipitated material was treated with RNase, dephosphorylated, and rephosphorylated in the presence of ATP [γ-32P]. The resulting material was analyzed by SDS-polyacrylamide gel electrophoresis (SDS-PAGE).
We found that a NELF-E antibody immunoprecipitated the entire NELF complex, as determined by mass spectrometry analysis and Western blotting, in an apparently stoichiometric fashion, allowing us to assess RNA binding by each subunit (Figure 8A and Figure 8—figure supplement 1A–C, Figure 8—figure supplement 2). Bands corresponding to NELF-E and NELF-B/C were readily and reproducibly detected in the radiolabeled sample from both cell lines (Figure 8B and Figure 8—figure supplement 1C). The intensity of the band for NELF-B/C was less than that observed for the NELF-E band, indicating that NELF-B/C may associate more weakly with RNA than NELF-E. This is consistent with the reported high RNA-binding affinity of the NELF-E RRM domain and our biochemical results (Pagano et al., 2014; Rao et al., 2006). Immunoprecipitation with a NELF-A antibody produced similar results in Jurkat cells (Figure 8—figure supplement 1D). To confirm that NELF-C binds RNA, the NELF-A, -B and -C subunits were cloned into mammalian expression vectors and overexpressed in 293FT cells. Consistent with the native protein, the overexpressed NELF-B and NELF-C subunits bound RNA whereas the NELF-A subunit failed to associate with RNA (Figure 8C, Figure 8—figure supplement 3A–C). Together these results indicate that NELF-B, -C, and -E all associate with RNA in cells.
Deciphering the mechanism of promoter-proximal Pol II pausing is essential for understanding gene regulation and requires structural information of Pol II elongation complexes bound by DSIF, NELF, and P-TEFb. To this end, structures of the involved multi-protein components are required. Structural information is available for Pol II elongation complexes (Martinez-Rucobo and Cramer, 2013), DSIF (Klein et al., 2011; Martinez-Rucobo et al., 2011), and P-TEFb (Baumli et al., 2008; 2012; Schulze-Gahmen et al., 2013; 2014; Tahirov et al., 2010). However, structural information about NELF is lacking, except for the RRM domain of NELF-E (Rao et al., 2006; 2008). To close this gap, we report here the crystal structure of the conserved core NELF subcomplex NELF-AC and the architecture of the 4-subunit, complete NELF complex. We further show that NELF-B and NELF-AC bind single-stranded nucleic acids in vitro, and that NELF-B and NELF-C, in addition to NELF-E, associate with RNA in vivo. These results provide an important step in understanding NELF function and provide the basis for a mechanistic analysis of the role of NELF in promoter-proximal pausing.
From our structural, biochemical and in vivo data, we propose an architectural model for the NELF complex. In the complex, NELF-AC binds to the N-terminal region of NELF-B. The N-terminal region of NELF-E is sandwiched in between the N- and C- termini of NELF-B, anchoring the flexible NELF-E RRM domain to the rest of the complex. Strong RNA binding by the NELF-E RRM may initially recruit NELF to RNA and secondary binding events by NELF-B and NELF-C may further stabilize the complex on nucleic acid. Future structural studies are required to determine the nucleic acid binding surfaces of NELF-B and to determine how RNA snakes from the NELF-E RRM through the rest of the complex (Figure 9). It is also likely that RNA-binding involves major conformational changes.
It is known that the extent of Pol II pausing strongly differs between different genes (Muse et al., 2007). Such gene specificity may be explained by differences in promoter-proximal DNA regions. How can DNA sequence influence pausing? First, certain sequences may lead to DNA-RNA hybrids that favor Pol II pausing by slowing down the elongation rate, similar to DNA sequences that influence pausing of bacterial RNA polymerase or Drosophila Pol II (Larson et al., 2014; Vvedenskaya et al., 2014; Greive and Hippel, 2005; Nechaev et al., 2010). DNA sequence also affects binding of proteins such as GAGA factor and estrogen receptor, which both are reported to recruit NELF to specific genes to induce pausing (Aiyar et al., 2004; Li et al., 2013). Second, nascent RNA may bind to NELF with different affinities, influencing the efficiency of NELF recruitment to pause sites. Indeed, we observed that nucleic acid binding of NELF-AC may depend on the nucleic acid sequence, and it is known that the RNA-binding activity of the NELF-E RRM domain is sequence-dependent (Pagano et al., 2014). It is also known that DNA regions differ in their GC content (Ginno et al., 2012) and in Drosophila a sequence motif was reported to be associated with pausing (Hendrix et al., 2008). Third, nucleosome stabilities vary with DNA sequence and nucleosomes are known to influence Pol II elongation (Gilchrist et al., 2008; 2010; Mayer et al., 2015).
NELF association with RNA may not only be required for pausing, but may also be important for mRNA processing when NELF acts together with interaction factors such as Integrator and the cap binding complex (CBC) (Narita et al., 2007; Stadelmayer et al., 2014; Yamamoto et al., 2014). The NELF interaction with CBC is involved in the appropriate 3’ processing of histone mRNAs (Narita et al., 2007). Similarly, processing of U1, U2, U4, and U5 snRNAs is dependent on NELF and Integrator (Yamamoto et al., 2014). Future work is required to determine which RNAs associate with NELF in cells and if this binding activity is independent of NELF’s role in promoter proximal pausing.
We note that nucleic acid binding alone may explain recruitment of NELF to certain genes and its association with promoter-proximal regions, but is insufficient to explain Pol II pausing, which additionally requires a change in the elongation behavior of Pol II. This may involve a conformational switch in the polymerase that may be triggered or stabilized by NELF binding to the Pol II surface. Analysis of this intricate mechanism awaits structural studies of functional complexes comprising Pol II, DSIF, NELF and additional factors. The results reported here provide an important step towards this goal.
The borders of NELF-A and NELF-C within the NELF-AC subcomplex were determined by limited proteolysis of human full-length NELF-AC complex followed by Edman sequencing. Human NELF-A (Q9H3P2) and NELF-C (Q8IXH7) were amplified from codon optimized DNA (Mr. Gene) and cloned into pET28a and pET21b vectors, between NdeI and XhoI or NdeI and BamHI restriction sites, respectively, resulting in N-terminally His6-tagged NELF-A (6–188) and untagged NELF-C (183–590). Synthetic oligonucleotides were purchased from Thermo Fisher Scientific and Sigma Genosys.
Plasmids encoding NELF-A (6–188) and NELF-C (183–590) were co-transformed into E. coli BL21 CodonPlus (DE3) RIL cells (Stratagene). Cells were grown in LB medium at 37°C until OD600 ~0.6 and cooled on ice for 30 min. Protein expression was induced by the addition of 1 mM IPTG. After induction, cells were grown for an additional 16 hr at 18°C. All purification steps were performed at 4°C. Cells were resuspended and lysed in buffer A (150 mM NaCl, 40 mM Na-HEPES pH 7.4, 10 mM imidazole, 2 mM DTT, 0.284 µg/ml leupeptin, 1.37 µg/ml pepstatin A, 0.17 mg/ml PMSF, 0.33 mg/ml benzamidine). The lysate was applied to Ni-NTA agarose (Qiagen) beads and washed extensively with buffer A containing 20 and 40 mM imidazole. Protein was eluted from the beads with buffer A containing 200 mM imidazole. The eluted protein was mixed with 1 U thrombin/mg protein (Sigma) and dialyzed against buffer B (150 mM NaCl, 40 mM Na-HEPES pH 7.4, 2 mM DTT) for 16 hr at 4°C. The protein was applied to Ni-NTA beads equilibrated in buffer B to remove uncleaved protein. The Ni-NTA flow through was applied to an anion exchange column (HiTrap Q-HP, 1 ml, GE Healthcare) equilibrated in buffer B. Protein was eluted via a salt gradient from 100 mM to 1 M NaCl in buffer B. The protein was further purified by size exclusion chromatography with the use of a Superose 6 10/300 column (GE Healthcare) equilibrated in buffer B. Peak fractions were pooled and concentrated by centrifugation in Amicon Ultra 4 ml concentrators (10 kDa MWCO) (Millipore) to 12 mg/ml. Protein concentration was determined by absorbance at 280 nm using protein-specific parameters. Protein was aliquoted, flash frozen, and stored at -80°C.
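Determining protein concentration from absorbance at 280 nm relies on the Beer-Lambert law (A = ε · c · l). The Python sketch below shows the conversion to mg/ml under stated assumptions: the extinction coefficient, molecular weight, absorbance reading, and dilution factor are placeholders, since the protein-specific parameters used for NELF-AC are not given in this section.

```python
def concentration_mg_per_ml(a280, ext_coeff_per_M_cm, mol_weight_Da,
                            path_length_cm=1.0, dilution_factor=1.0):
    """Convert an A280 reading into a protein concentration in mg/ml.

    Beer-Lambert: molar concentration = A / (epsilon * path length).
    Multiplying mol/L by g/mol gives g/L, which is numerically equal to mg/ml.
    """
    molar = (a280 * dilution_factor) / (ext_coeff_per_M_cm * path_length_cm)
    return molar * mol_weight_Da


# Placeholder values for illustration only (not the NELF-AC parameters).
example = concentration_mg_per_ml(
    a280=0.5,
    ext_coeff_per_M_cm=50_000,
    mol_weight_Da=60_000,
    dilution_factor=20,
)
```

With these placeholder inputs the function returns approximately 12 mg/ml.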
For production of selenomethionine-labeled protein, NELF-AC (6–188 and 183–590) plasmids were co-transformed into E. coli B834 (DE3) cells. For protein expression, cells were grown in SelenoMet Medium (AthenES) supplemented with 40 µg/ml L-selenomethionine (SeMet). Selenomethionine-labeled protein was purified as above.
Native and selenomethionine-labeled NELF-AC crystals were grown by hanging-drop vapor diffusion and were obtained by mixing 1 µl NELF-AC protein (12 mg/ml) with 1 µl reservoir solution containing 14–14.5% (w/v) PEG 3350 and 200 mM sodium malonate pH 6.8–7.0. Tetrahedral NELF-AC crystals grew within 3–5 days. Crystals were cryo-protected in mother liquor containing 25% (w/v) glucose, and flash frozen in liquid nitrogen.
Diffraction data for native crystals were collected under cryo conditions (100 K) in 0.1° increments at beamline X06DA of the Swiss Light Source in Villigen (Switzerland) using a wavelength of 1.0000 Å and a Pilatus 2M-F detector (Broennimann et al., 2006). Raw data were processed and scaled with XDS (Kabsch, 2010). The structure was solved by SIRAS using diffraction data from an isomorphous crystal of SeMet-labeled protein. Location of 13 selenomethionine sites, calculation of initial phases and density modification were performed with the SHELX suite (Sheldrick, 2008). An initial model was built with Buccaneer (Cowtan, 2006). The model was iteratively built with COOT (Emsley and Cowtan, 2004) and refined with REFMAC (Vagin et al., 2004) and phenix.refine (Adams et al., 2010) until the R-factors converged. In the final model, 98.2% of residues are in preferred Ramachandran regions and 1.8% of residues are in additionally allowed regions. Figures were prepared with PyMol (PyMOL, 2002).
WT and mutant NELF-AC proteins were expressed and purified as described above. For the final size exclusion step, the column was equilibrated in buffer C (50 mM NaCl, 10 mM HEPES pH 7.4 and 2 mM DTT). Peak fractions were pooled, concentrated by centrifugation to 30 mg/ml, aliquoted, flash frozen, and stored at −80°C.
5’-/6-FAM labeled ssDNA, ssRNA and dsDNA were obtained from Integrated DNA Technologies and dissolved in water to 100 µM. Sequences for ssDNA and dsDNA were 44% GC ACCCCACAACTAAAAAATCCCAACC, and 60% GC AAGGGGAGCGGGGGAGGATAATAGG (corresponding sequences for ssRNA). Natural ssDNA sequences correspond to sequences of exposed coding (non-template) strands at the c-fos gene (bps 87–96 downstream of the TSS [Fivaz et al., 2000]) and the junB gene (bps 45–54 downstream of the TSS [Aida et al., 2006]) during promoter-proximal pausing +/−5 bps (Figure 5—figure supplement 1C, inset): AAGACTGAGCCGGCGGCCGC and AGGGAGCTGGGAGCTGGGGG, respectively. Natural ssRNA sequences correspond to 25 nt of nascent mRNA sequence predicted to be proximal to the RNA exit pore on the Pol II surface at c-fos (bps 53–77 relative to TSS) and junB (bps 13–37 relative to TSS) during promoter-proximal pausing (Figure 5—figure supplement 1C inset): CCGCAUCUGCAGCGAGCAUCUGAGA and AGCGGCCAGGCCAGCCUCGGAGCCA, respectively. The sequence corresponding to the HIV-1 TAR RNA stem loop is: CCAGAUCUGAGCCUGGGAGCUCUCUGG. The HIV-1 TAR RNA substrate was diluted to 50 µM with folding buffer (final conditions 100 mM NaCl, 20 mM Na•HEPES pH 7.5, 3 mM MgCl2, 10% (v/v) glycerol). The TAR RNA was folded by incubating the RNA at 95°C for 3 min and transferring to ice for 10 min. The TAR RNA was diluted in folding buffer instead of water for all experiments.
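The GC percentages quoted for the two synthetic 25-mers can be checked with a few lines of code. The sketch below is an illustration only; the helper names `gc_percent` and `dna_to_rna` are made up here, and the conversion to RNA simply assumes that "corresponding sequences" means the same base order with U in place of T.

```python
def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases in a nucleic acid sequence."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def dna_to_rna(seq: str) -> str:
    """Return the RNA sequence with the same base order (T replaced by U)."""
    return seq.upper().replace("T", "U")


# Synthetic sequences as given in the text.
LOW_GC = "ACCCCACAACTAAAAAATCCCAACC"
HIGH_GC = "AAGGGGAGCGGGGGAGGATAATAGG"

for label, seq in (("low GC", LOW_GC), ("high GC", HIGH_GC)):
    print(label, len(seq), "nt", f"{gc_percent(seq):.0f}% GC", dna_to_rna(seq))
```

Both sequences are 25 nt long, and the computed values agree with the 44% and 60% GC stated above.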
NELF-AC was serially diluted in two fold steps in buffer C. Nucleic acid (2.4 µl, 10 nM final concentration) and NELF-AC (12 µl, 100–0.1 µM final concentration) were mixed on ice and incubated for 10 min. The assay was brought to a final volume of 24 µl and incubated for 20 min at RT in the dark (final conditions: 30 mM NaCl, 3 mM MgCl2, 10 mM Na-HEPES pH 7.4, 2 mM DTT and 50 µg/ml BSA). To test for non-specific binding, 5 µg/ml yeast tRNA (Sigma) was added to reactions as a competitor (Figure 5—figure supplement 1). 20 µl of each solution was transferred to a Greiner 384 Flat Bottom Black Small volume plate.
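The titration described above amounts to a two-fold dilution series plus simple volume bookkeeping. The sketch below illustrates this under stated assumptions: the number of points (11) is an assumption chosen so that a series starting at 100 µM ends near 0.1 µM, and the function names are hypothetical.

```python
def twofold_series(start: float, n_points: int) -> list[float]:
    """Return n_points concentrations, each half of the previous one."""
    return [start / (2 ** i) for i in range(n_points)]


def stock_concentration(final_conc: float, added_volume: float, final_volume: float) -> float:
    """Concentration a component must have before mixing to reach final_conc."""
    return final_conc * final_volume / added_volume


# Final NELF-AC concentrations in the assay (µM); 11 points is an assumption.
final_protein_um = twofold_series(100.0, 11)

# 12 µl of protein in a 24 µl assay means each working stock is 2x the final value.
protein_stocks_um = [stock_concentration(c, 12.0, 24.0) for c in final_protein_um]

# 2.4 µl of nucleic acid in 24 µl at 10 nM final implies a 10x working stock.
nucleic_acid_stock_nm = stock_concentration(10.0, 2.4, 24.0)

print([round(c, 3) for c in final_protein_um])
print(round(nucleic_acid_stock_nm, 1), "nM nucleic acid working stock")
```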
Fluorescence anisotropy was measured at 30°C with an Infinite M1000Pro reader (Tecan) with an excitation wavelength of 470 nm (±5 nm), an emission wavelength of 518 nm (±20 nm) and a gain of 72. All experiments were done in triplicate and analyzed with GraphPad Prism Version 6. Binding curves were fit with a single site quadratic binding equation:
$$y = B_{max} \cdot \frac{[x] + [L] + K_{d,app} - \sqrt{([x] + [L] + K_{d,app})^2 - 4[x][L]}}{2[L]}$$
where Bmax is the maximum specific binding, L is the concentration of nucleic acid, x is the concentration of NELF-AC, and Kd,app is the apparent dissociation constant for NELF-AC and nucleic acid. Error bars represent the standard deviation from the mean of three experimental replicates. Experiments were performed on different days from at least two different protein preparations.
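For readers who want to explore how such a model behaves, the sketch below implements the standard textbook form of a single-site quadratic binding model, which accounts for ligand depletion, and recovers a dissociation constant from noise-free synthetic data by a coarse grid search. It is an illustration under assumptions, not the GraphPad Prism procedure used in the study: the function names, the grid, and the "true" Kd value are all hypothetical.

```python
import math


def quadratic_binding(x, ligand, kd_app, bmax):
    """Single-site quadratic binding model.

    x       protein concentration
    ligand  labeled nucleic acid concentration (same units as x, must be > 0)
    kd_app  apparent dissociation constant
    bmax    maximum specific binding
    """
    s = x + ligand + kd_app
    return bmax * (s - math.sqrt(s * s - 4.0 * x * ligand)) / (2.0 * ligand)


def fit_kd(xs, ys, ligand, bmax, kd_grid):
    """Return the grid value of Kd,app with the smallest sum of squared residuals."""
    def sse(kd):
        return sum((quadratic_binding(x, ligand, kd, bmax) - y) ** 2
                   for x, y in zip(xs, ys))
    return min(kd_grid, key=sse)


# Candidate Kd values from 0.01 to 100 µM in logarithmic steps (assumed grid).
kd_grid = [10 ** (e / 10) for e in range(-20, 21)]
kd_true = kd_grid[23]  # hypothetical value of roughly 2 µM

ligand_um = 0.01  # 10 nM labeled nucleic acid, as in the assay
xs = [100.0 / (2 ** i) for i in range(11)]  # two-fold protein series in µM
ys = [quadratic_binding(x, ligand_um, kd_true, 1.0) for x in xs]

print("recovered Kd,app (µM):", fit_kd(xs, ys, ligand_um, 1.0, kd_grid))
```

With real, noisy data a nonlinear least-squares routine that also fits Bmax would replace the grid search; the grid keeps this sketch free of external dependencies.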
Vectors encoding the full-length human cDNAs for NELF-A, -B, -D and -E were a generous gift of Hiroshi Handa (Narita et al., 2003) and were used as PCR templates for subcloning into modified pFastBac vectors via ligation independent cloning (LIC) (a gift of Scott Gradia, UC Berkeley, vectors 438-A, 438-B [Addgene: 55218, 55219]). NELF-D bears an N-terminal 6x His tag followed by a tobacco etch virus protease cleavage site. Individual subunits were combined into a single plasmid by successive rounds of ligation independent cloning. Each subunit is preceded by a PolH promoter and followed by an SV40 termination site. For simplicity, residues of the NELF-D subunit are numbered in this text according to NELF-C nomenclature. The NELF-D patch mutant was generated as a synthetic gene block (IDT) from the cDNA sequence and cloned into the 438-B vector using LIC. NELF-E (1–138) was truncated by 'round-the-horn PCR. For NELF-B or NELF-BE (1–138) expression constructs, NELF-B was cloned with an N-terminal 6x His tag followed by a tobacco etch virus protease cleavage site.
Purified plasmid DNA (0.5–1 µg) was electroporated into DH10EMBacY cells to generate bacmids (Berger et al., 2004). Bacmids were prepared from positive clones by isopropanol precipitation and transfected into Sf9/Sf21 cells grown in Sf-900 III SFM (ThermoFisher) or ESF921 (Expression Technologies), respectively, with X-tremeGENE9 transfection reagent (Sigma) to generate V0 virus. V0 virus was harvested 48–72 hr after transfection. V1 virus was produced by infecting 25 ml of Sf9 or Sf21 cells grown at 27°C, 300 rpm with V0 virus (1E6 cell/ml, 1:50 (v/v) cells:virus). V1 viruses were harvested 48 hr after proliferation arrest and stored at 4°C. For protein expression, 600 ml of Hi5 cells grown in ESF921 medium (Expression Technologies) were infected with 300 µl of V1 virus and grown for 48 hr at 27°C. Cells were harvested by centrifugation (238xg, 4°C, 30 min), resuspended in lysis buffer at 4°C (300 mM NaCl, 20 mM Na•HEPES pH 7.4, 10% glycerol (v/v), 1 mM DTT, 30 mM imidazole pH 8.0, 0.284 µg/ml leupeptin, 1.37 µg/ml pepstatin A, 0.17 mg/ml PMSF, 0.33 mg/ml benzamidine), snap frozen, and stored at −80°C.
Protein purification steps were performed at 4°C. Frozen cell pellets were thawed and lysed by sonication. Lysates were clarified by centrifugation in an A27 rotor (ThermoFisher) (26,195 xg, 4°C, 30 min), followed by ultracentrifugation in a Type 45 Ti rotor (Beckman Coulter) (235,000 xg, 4°C, 60 min). Clarified lysates were filtered through 0.8 µm syringe filters (Millipore) and applied to a 5 ml HisTrap column (GE Healthcare) equilibrated in lysis buffer. HisTrap columns were washed with 10CV of lysis buffer followed by 5CV of high salt wash buffer (800 mM NaCl, 20 mM Na•HEPES pH 7.4, 10% glycerol (v/v), 1 mM DTT, 30 mM imidazole pH 8.0, 0.284 µg/ml leupeptin, 1.37 µg/ml pepstatin A, 0.17 mg/ml PMSF, 0.33 mg/ml benzamidine) and 5CV of lysis buffer. The NELF-B construct was washed with a high salt buffer containing 1 M NaCl.
For the FL NELF tetramer, HisTrap columns were washed with 5CV of low salt buffer (150 mM NaCl, 20 mM Na•HEPES pH 7.4, 10% glycerol (v/v), 1 mM DTT, 30 mM imidazole pH 8.0, 0.284 µg/ml leupeptin, 1.37 µg/ml pepstatin A, 0.17 mg/ml PMSF, 0.33 mg/ml benzamidine) before a tandem 5 ml HiTrap Q and HiTrap S column (GE Healthcare) equilibrated in low salt buffer were directly coupled to the HisTrap column. Protein was eluted from the HisTrap column by a gradient from 0–100% nickel elution buffer (150 mM NaCl, 20 mM Na•HEPES pH 7.4, 10% glycerol (v/v), 1 mM DTT, 500 mM imidazole pH 8.0, 0.284 µg/ml leupeptin, 1.37 µg/ml pepstatin A, 0.17 mg/ml PMSF, 0.33 mg/ml benzamidine), after which the HisTrap and HiTrap S column were decoupled from the HiTrap Q column. The HiTrap Q column was washed with 5CV of low salt buffer and protein was eluted by gradient from 0–100% high salt buffer. Peak fractions were analyzed by SDS-PAGE. HiTrap Q fractions containing FL NELF were combined with 2 mg of His6-TEV protease, 416 µg lambda protein phosphatase and dialyzed overnight at 4°C in a Slide-A-Lyzer (2–12 ml 10 kDa MWCO) (ThermoFisher) against 1 L of lysis buffer containing 1 mM MnCl2. Truncation constructs were eluted directly from the HisTrap column by a gradient from 0–100% nickel elution buffer containing 300 mM NaCl. Peak fractions were analyzed by SDS-PAGE, and pooled for TEV protease and lambda phosphatase treatment overnight as described for the FL tetrameric protein.
Protein was removed from the Slide-A-Lyzer cassette and applied to a 5 ml HisTrap column to remove uncleaved protein and TEV protease. Protein was concentrated in an Amicon 15 ml centrifugal concentrator (FL tetramer 100 kDa MWCO; NELF ∆RRM and NELF-ABC 50 kDa MWCO; NELF-B and NELF-BE (1–138) 30 kDa MWCO) (Millipore) to 1.0–2.0 ml. The protein was applied to a S200 16/600 pg column (GE Healthcare) equilibrated in 150 mM NaCl, 20 mM Na•HEPES pH 7.4, 10% (v/v) glycerol, and 1 mM DTT. Peak fractions were analyzed by SDS-PAGE. Pure fractions were concentrated as described above to 500 µl, aliquoted, flash frozen, and stored at −80°C. Typical protein preparations yield 10–15 mg of FL tetrameric NELF from 1 L of insect cell culture.
Fluorescence anisotropy experiments were performed essentially as described for NELF-AC except for the following modifications. Protein was diluted in half log dilution steps in a buffer containing 150 mM NaCl, 20 mM Na•HEPES pH 7.4, 10% (v/v) glycerol, and 1 mM DTT. The final buffer contained 60 mM NaCl, 3 mM MgCl2, 10 mM Na-HEPES pH 7.4, 2 mM DTT, 50 µg/ml BSA, and 5 µg/ml baker’s yeast tRNA. 18 µl of the solution was used for measurements.
The 4-subunit human NELF complex (10 µg, 5.5 µM in 95 µl final volume) purified from insect cells was incubated with 1.1 mM disuccinimidyl suberate (DSS) H12/D12 (Creative Molecules) for 30 min at 30°C in a final buffer containing 100 mM NaCl, 30 mM Na•HEPES pH 7.4, 10% (v/v) glycerol, 1 mM DTT, and 3 mM MgCl2. The crosslinking reaction was quenched by adding ammonium bicarbonate to a final concentration of 100 mM and incubation for 10 min at 30°C. The chemical cross-links on NELF complexes were identified by mass spectrometry as described previously (Herzog et al., 2012). Briefly, cross-linked complexes were reduced with 5 mM TCEP (Thermo Scientific) at 35°C for 15 min and subsequently treated with 10 mM iodoacetamide (Sigma-Aldrich) for 30 min at room temperature in the dark. Digestion with lysyl endopeptidase (Wako) was performed at 35°C, 6 M Urea for 2 hr (at enzyme-substrate ratio of 1:50 w/w) and was followed by a second digestion with trypsin (Promega) at 35°C overnight (also at 1:50 ratio w/w). Digestion was stopped by the addition of 1% (v/v) trifluoroacetic acid (TFA). Acidified peptides were purified using C18 columns (Sep-Pak, Waters). The eluate was dried by vacuum centrifugation and reconstituted in water/acetonitrile/TFA, 75:25:0.1. Cross-linked peptides were enriched on a Superdex Peptide PC 3.2/30 column (300 × 3.2 mm) at a flow rate of 25 μl min−1 and water/acetonitrile/TFA, 75:25:0.1 as a mobile phase. Fractions of 100 μl were collected, dried, and reconstituted in 2% acetonitrile and 0.2% FA, and further analyzed by liquid chromatography coupled to tandem mass spectrometry using a hybrid LTQ Orbitrap Elite (Thermo Scientific) instrument. Cross-linked peptides were identified using xQuest (Walzthoeni et al., 2012).
False discovery rates (FDRs) were estimated by using xProphet (Walzthoeni et al., 2012) and results were filtered according to the following parameters: FDR = 0.05, min delta score = 0.90, MS1 tolerance window of −4 to 4 ppm, ld-score > 22.
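The score-based part of this filtering can be pictured as a simple predicate applied to each candidate cross-link. The sketch below is a hypothetical illustration and does not reproduce xQuest or xProphet output formats: the record type, its field names, and the example hits are invented. Only the MS1 tolerance window and the ld-score cutoff are applied; the delta score criterion is left out because its comparison direction depends on how the software defines that score, and the FDR is a property of the whole result set rather than of a single record.

```python
from dataclasses import dataclass


@dataclass
class CrosslinkHit:
    """Hypothetical record for one candidate cross-linked peptide pair."""
    peptide_pair: str
    ppm_error: float  # MS1 precursor mass error in ppm
    ld_score: float   # linear discriminant score


def passes_filters(hit, ppm_window=(-4.0, 4.0), min_ld_score=22.0):
    """Keep hits inside the MS1 tolerance window with ld-score above the cutoff."""
    low, high = ppm_window
    return low <= hit.ppm_error <= high and hit.ld_score > min_ld_score


# Invented example hits for illustration only.
hits = [
    CrosslinkHit("pair-1", ppm_error=1.2, ld_score=30.5),
    CrosslinkHit("pair-2", ppm_error=-5.1, ld_score=35.0),  # outside ppm window
    CrosslinkHit("pair-3", ppm_error=0.3, ld_score=22.0),   # ld-score not above cutoff
    CrosslinkHit("pair-4", ppm_error=-3.9, ld_score=24.1),
]

kept = [hit.peptide_pair for hit in hits if passes_filters(hit)]
print(kept)
```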
Jurkat cells were maintained in RPMI (Gibco) with 10% FBS and Glutamax. 293FT cells were maintained in DMEM (Gibco) with 10% FBS and Glutamax. Cells were routinely checked for mycoplasma contamination using the PlasmoTest Mycoplasma Detection Kit (InvivoGen, 12K06-MM). Antibodies used were anti-NELF-A (Santa Cruz, sc-23599); anti-COBRA1 (Bethyl Laboratories, A301-911A-M); anti-TH1L (Cell Signaling, D5G6W); anti-NELF-E (Millipore, ABE48); anti-GAPDH (Sigma, G8795); anti-FLAG M2 (Sigma, clone M2, F1804); and anti-c-MYC (Sigma, clone 9E10, M4439).
Photoactivatable-ribonucleoside-enhanced crosslinking and immunoprecipitation was performed as described, with a few modifications. All concentrations are final unless otherwise indicated. Jurkat and 293FT cells were incubated with 4-thiouridine (4sU) (100 µM) (Carbosynth, NT06186) for 16 hr in growth medium. Cells were then washed two times with PBS and crosslinked at 365 nm with a Bio-Link BLX 365 (PeqLab) UV lamp operated at 0.15 J/cm2 (293FT) or 0.2 J/cm2 (Jurkat). Cells were scraped from the plates with cold PBS and collected by centrifugation. Pellets were resuspended in 3 volumes of NP-40 Lysis Buffer (50 mM HEPES-KOH pH 7.5, 150 mM KCl, 2 mM EDTA-NaOH pH 8.0, 1 mM NaF, 0.5% (v/v) NP-40, 0.5 mM DTT, 1x complete EDTA-free protease inhibitor cocktail (Sigma, P8340)) and incubated on ice for 10 min. The resulting lysate was then passed through a syringe with 27G needle seven times and centrifuged at 13000 g for 15 min at 4°C. The supernatant fraction was further clarified by passing it through a 5 μm syringe filter (Pall Corporation, 4650). Total protein concentration was determined by the Bradford method. Control cells were treated with 4sU but not crosslinked and underwent the same treatment as the crosslinked samples.
For endogenous NELF-A and NELF-E immunoprecipitation experiments, 30 mg of total protein was incubated with 70 μg of anti-NELF-A antibody (Santa Cruz, sc-23599) or 50 μg of anti-NELF-E antibody (Bethyl Laboratories, A301-914A) conjugated to Protein G Dynabeads (Invitrogen, B00262). For plasmid overexpression experiments, 20 mg of total protein was incubated with 70 μl anti-FLAG M2 Magnetic Beads (Sigma, M8823) or 15 μg of anti-c-Myc antibody (Sigma, clone 9E10, M4439) conjugated to Protein G Dynabeads (Invitrogen, B00262). The protein was incubated with the antibody-conjugated beads for 2 hr at 4°C on a rotating wheel. Beads were then washed three times with 1 ml of cold IP Wash Buffer (50 mM HEPES-KOH pH 7.5, 300 mM KCl, 0.05% (v/v) NP-40, 0.5 mM DTT, 1x complete EDTA-free protease inhibitor cocktail (Sigma, P8340)). The beads were resuspended in 200 μl of IP Wash Buffer and treated with 50 U/µl RNAse T1 (Thermo Scientific, EN0542) for 15 min at 22°C and cooled on ice for 5 min. Beads were washed three times with 1 ml of cold High Salt Wash Buffer (50 mM HEPES-KOH pH 7.5, 500 mM KCl, 0.05% (v/v) NP-40, 0.5 mM DTT, 1x complete EDTA-free protease inhibitor cocktail (Sigma, P8340)), followed by one wash with 1 ml of Phosphatase Buffer pH 6.0 (50 mM Tris-HCl pH 7.0, 1 mM MgCl2, 0.1 mM ZnCl2).
RNAs were dephosphorylated in 100 μl of Phosphatase Reaction Mix (1X Antarctic Phosphatase Reaction Buffer (NEB, M0289S), 1 U/μl Antarctic Phosphatase (NEB, M0289S), and 1 U/μl RNase OUT (Invitrogen, 10777–019)) for 30 min at 37°C, 300 rpm. Beads were washed once with 1 ml of Phosphatase Wash Buffer (50 mM Tris-HCl pH 7.5, 20 mM EGTA, 0.5% (v/v) NP-40) and two times in Polynucleotide Kinase Buffer (50 mM Tris-HCl pH 7.5, 50 mM NaCl, 10 mM MgCl2). Beads were resuspended in 20 μl of Kinase Reaction Mix (1X T4 PNK Reaction Buffer (NEB, M0201S), 1 U/μl T4 PNK (NEB, M0201S), 2 U/μl RNAse OUT (Invitrogen, 10777–019), and 1 μCi/μl ATP-γ-32P (Perkin Elmer, NEG502Z)) and incubated for 1 hr at 37°C, 800 rpm. Beads were washed five times with 1 ml of Polynucleotide Kinase Buffer (50 mM Tris-HCl pH 7.5, 50 mM NaCl, 10 mM MgCl2), resuspended in 25 μl of 2X SDS-loading buffer, and incubated for 5 min at 95°C. The eluted supernatant (20 µl) was run on a Novex Bis-Tris 4–12% (Invitrogen) SDS-PAGE in 1X MOPS buffer for 1 hr at 160 V. The gel was exposed to a phosphorimager screen overnight at −20°C. The phosphorimager screen was scanned on a Typhoon FLA 9500 (GE). To determine the specificity and crossreactivity of the NELF antibodies used for immunoprecipitation experiments, samples treated with nonradioactive ATP were submitted for mass spectrometry (MS) analysis. The MS analysis confirmed that all NELF subunits are present and verified that the detected radiolabeled signal corresponds to NELF subunits.
To generate NELF-A, -B, and -C overexpression plasmids, the gene coding regions of human NELF-A, -B, and -C were cloned into pCMV-GLuc2 (NEB) between the BamHI and NotI sites. Genes were cloned with N-terminal affinity tags followed by a TEV protease cleavage site (3xMYC NELF-A, 3xFLAG NELF-B and -C). DNA for transfection was isolated from E. coli and purified by phenol chloroform extraction and ethanol precipitation. Plasmids were resuspended in water at a final concentration of ~10 µg/µL and stored at −20°C prior to transfection.
For plasmid transfection and 4-thiouridine labeling in 293FT, plasmids were transfected into 293FT cells using Lipofectamine 2000 (ThermoFisher Scientific) as directed by the manufacturer. Briefly, 25 µg of plasmid was transfected into cells growing in p145 cm2 dishes (5 dishes per condition). Thirty-two hrs after transfection, 100 µM 4sU (Carbosynth, NT06186) was added to the growth medium. Cells were incubated at 37°C for an additional 14–16 hr. Photoactivatable-ribonucleoside-enhanced crosslinking and immunoprecipitation experiments were performed 48 hr after transfection.
PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallographica. Section D, Biological Crystallography 66:213–221. https://doi.org/10.1107/S0907444909052925
Transcriptional pausing caused by NELF plays a dual role in regulating immediate-early expression of the junB gene. Molecular and Cellular Biology 26:6094–6104. https://doi.org/10.1128/MCB.02366-05
Electrostatics of nanosystems: application to microtubules and the ribosome. Proceedings of the National Academy of Sciences of the United States of America 98:10037–10041. https://doi.org/10.1073/pnas.181342398
Baculovirus expression system for heterologous multiprotein complexes. Nature Biotechnology 22:1583–1587. https://doi.org/10.1038/nbt1036
Progression through the RNA polymerase II CTD cycle. Molecular Cell 36:541–546. https://doi.org/10.1016/j.molcel.2009.10.019
MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallographica. Section D, Biological Crystallography 66:12–21. https://doi.org/10.1107/S0907444909042073
Properties of RNA polymerase II elongation complexes before and after the P-TEFb-mediated transition into productive elongation. Journal of Biological Chemistry 282:21901–21912. https://doi.org/10.1074/jbc.M702936200
Solution structure of the HIV-1 integrase-binding domain in LEDGF/p75. Nature Structural & Molecular Biology 12:526–532. https://doi.org/10.1038/nsmb937
Promoter-proximal pausing and its release: molecular mechanisms and physiological functions. Experimental Cell Research 316:2723–2730. https://doi.org/10.1016/j.yexcr.2010.05.036
xiNET: cross-link network maps with residue resolution. Molecular & Cellular Proteomics 14:1137–1147. https://doi.org/10.1074/mcp.O114.042259
The Buccaneer software for automated model building. 1. Tracing protein chains. Acta Crystallographica. Section D, Biological Crystallography 62:1002–1011. https://doi.org/10.1107/S0907444906022116
Thinking quantitatively about transcriptional regulation. Nature Reviews. Molecular Cell Biology 6:221–232. https://doi.org/10.1038/nrm1588
Promoter elements associated with RNA Pol II stalling in the Drosophila embryo. Proceedings of the National Academy of Sciences of the United States of America 105:7762–7767. https://doi.org/10.1073/pnas.0802406105
A positioned +1 nucleosome enhances promoter-proximal pausing. Nucleic Acids Research 43:3068–3078. https://doi.org/10.1093/nar/gkv149
Getting up to speed with transcription elongation by RNA polymerase II. Nature Reviews. Molecular Cell Biology 16:167–177. https://doi.org/10.1038/nrm3953
Transcriptional and posttranscriptional regulation of HIV-1 gene expression. Cold Spring Harbor Perspectives in Medicine 2:a006916. https://doi.org/10.1101/cshperspect.a006916
MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Molecular Biology and Evolution 30:772–852. https://doi.org/10.1093/molbev/mst010
The Phyre2 web portal for protein modeling, prediction and analysis. Nature Protocols 10:845–858. https://doi.org/10.1038/nprot.2015.053
RNA polymerase and transcription elongation factor Spt4/5 complex structure. Proceedings of the National Academy of Sciences of the United States of America 108:546–550. https://doi.org/10.1073/pnas.1013828108
Control of transcriptional elongation. Annual Review of Genetics 47:483–508. https://doi.org/10.1146/annurev-genet-110711-155440
Promoter proximal pausing and the control of gene expression. Current Opinion in Genetics & Development 21:231–235. https://doi.org/10.1016/j.gde.2011.01.010
Regulating RNA polymerase pausing and transcription elongation in embryonic stem cells. Genes & Development 25:742–754. https://doi.org/10.1101/gad.2005511
Interactions between DSIF (DRB sensitivity inducing factor), NELF (negative elongation factor), and the Drosophila RNA polymerase II transcription elongation complex. Proceedings of the National Academy of Sciences of the United States of America 107:11301–11306. https://doi.org/10.1073/pnas.1000681107
RNA polymerase is poised for activation across the genome. Nature Genetics 39:1507–1511. https://doi.org/10.1038/ng.2007.21
Negative elongation factor (NELF) coordinates RNA polymerase II pausing, premature termination, and chromatin remodeling to regulate HIV transcription. The Journal of Biological Chemistry 288:25995–26003. https://doi.org/10.1074/jbc.M113.496489
The control of HIV transcription: keeping RNA polymerase II on track. Cell Host & Microbe 10:426–435. https://doi.org/10.1016/j.chom.2011.11.002
The PyMOL Molecular Graphics System. Schrödinger, LLC.
Structural basis of transcription initiation by RNA polymerase II. Nature Reviews. Molecular Cell Biology 16:129–143. https://doi.org/10.1038/nrm3952
REFMAC5 dictionary: organization of prior chemical knowledge and guidelines for its use. Acta Crystallographica. Section D, Biological Crystallography 60:2184–2195. https://doi.org/10.1107/S0907444904023510
NELF and DSIF cause promoter proximal pausing on the hsp70 promoter in Drosophila. Genes & Development 17:1402–1414. https://doi.org/10.1101/gad.1091403
Transcription elongation factors DSIF and NELF: promoter-proximal pausing and beyond. Biochimica Et Biophysica Acta 1829:98–104. https://doi.org/10.1016/j.bbagrm.2012.11.007
Structure and function of the human transcription elongation factor DSIF. The Journal of Biological Chemistry 274:8085–8092. https://doi.org/10.1074/jbc.274.12.8085
Negative elongation factor NELF represses human immunodeficiency virus transcription by pausing the RNA polymerase II complex. The Journal of Biological Chemistry 282:16981–16988. https://doi.org/10.1074/jbc.M610688200
Karen Adelman, Reviewing Editor; National Institute of Environmental Health Sciences, United States
In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.
Thank you for submitting your article "NELF architecture and RNA binding" for consideration by eLife. Your article has been reviewed by three peer reviewers, one of whom is a member of our Board of Reviewing Editors and the evaluation has been overseen by Kevin Struhl as the Senior Editor.
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
Note: The authors should make their title more broadly accessible.
– Use of acronyms (like NELF) is discouraged. NELF should be described as the Negative Elongation Factor
– The authors should note that they are working with human NELF proteins
(further information is below)
The NELF complex plays a critical role in RNA polymerase II pausing, which has been shown to be important in transcriptional regulation of many genes in various organisms. The structure of the NELF complex, which has been elusive for a long time, is crucial for a mechanistic understanding of its function in Pol II pausing. In their paper, Patrick Cramer and his colleagues have solved the crystal structure of the NELF-A/C subcomplex, and modeled NELF-B and -E based on the cross-linking data. Interestingly, in addition to the predominant RNA interaction of the NELF complex via the RRM domain of the NELF-E subunit, they identified a region of the NELF-A/C complex that associates with RNA (and ssDNA) both in vitro and in vivo. Information provided in this manuscript is a first step for a detailed mechanistic study of NELF's role in Pol II pausing and its regulation.
While the study provides novel insights into the assembly structure of the NELF complex, there are several limitations (listed below), that should be addressed prior to publication.
Additionally, the reviewers noted that there is limited explanation of figures in the main text, and strongly recommend that the authors endeavor to describe the experiments and figures in more detail so that a reader needn't access supplemental methods in order to understand what has been done.
1) The authors identified novel RNA binding sites in NELF-C, but it is unclear whether this interaction actually occurs in the context of tetramer (the authors observed RNA-crosslinking to NELF-C only when over-expressing NELF-C, Figure 6—figure supplement 1F) and whether this interaction is important for the NELF function. Crosslinking data obtained with full NELF complex purified from insect cells (data not shown) is arguably the most informative about the structure of NELF complex, compared to data obtained from other incomplete NELF subcomplexes. It would be important to include this data in this manuscript, ideally as Figure 5.
2) It was unclear to the reviewers whether the authors were arguing for sequence specificity in the RNA binding of NELF-C (Figure 4). However, we felt that the data did not support specificity. The affinity of NELF-AC binding to 'random' single stranded nucleic acids in Figure 4A is quite low, but it is just as good as the presumed 'specific' targets shown in Figure 4D.
Since this RNA binding region of NELF-C is a main novel aspect of the manuscript, the authors should clarify this section. If they believe there is any specificity to the RNA ligands bound, they must expand their biochemical analysis of NELF-AC binding to nucleic acids. For example, Figure 4D should have several negative controls, such as sequences that occur in junB or fos well downstream of the pause site, sequences that won't form simple single stranded regions, such as hairpins, etc. However, if the authors concur that the RNA binding is likely non-specific charge-charge interactions, this should be clearly spelled out in the manuscript.
For example, in subsection “NELF-AC binds single-stranded nucleic acids” "both the strength of binding and the preference for RNA or DNA are sequence-dependent." In this concluding sentence of the paragraph, the authors should add "protein" before "sequence" or substitute "protein residue-dependent" to make it clear that they are referring to the protein having regions that are specific for RNA binding and not that the protein binds specific RNAs.
3) The data supporting an interaction of NELF-C with RNA in cells is not very strong. First, the bands for NELF-C/D are not well separated from NELF-B, making it totally possible that they are in fact detecting NELF-B interactions with RNA rather than NELF-C. This is a serious caveat. Further, I am concerned that the signals shown don't appear to depend much on UV irradiation (especially in 293FT cells, Figure 6—figure supplement 1C). Can the authors find a more convincing way to demonstrate NELF-C interactions with RNA in vivo? This would really strengthen the manuscript.
4) The authors should provide a final model of the NELF architecture with and without RNA. The authors may also consider performing protein-protein cross linking (as in Figure 5) in the presence of RNA to understand the architecture of the tetramer bound to RNA, and to better understand the protein-RNA cross linking result.
[Editors' note: further revisions were requested prior to acceptance, as described below.]
Thank you for resubmitting your work entitled "Architecture and RNA binding of the human Negative Elongation Factor" for further consideration at eLife. Your revised article has been favorably evaluated by Kevin Struhl (Senior editor) and a Reviewing editor. The manuscript has been improved but there are some remaining issues that need to be addressed before acceptance, as outlined below:
In the revised version of Vos et al. the authors do a nice job of clarifying the manuscript and addressing reviewer concerns. Overall, the manuscript is now acceptable for publication. The presentation of a basic structural model for NELF is highly useful for the field, as is the discovery of multiple, low-affinity RNA binding sites across the surface of the NELF complex. This work will clearly provide a nice framework for extending our mechanistic understanding of NELF activity and will stimulate considerable follow-up.
However, there is one remaining issue we would like the authors to remedy: the language about the specificity of RNA binding by NELF-AC is stronger than warranted by the data presented. NELF-AC does appear to bind RNA comprised of 60% GC more strongly than 40% GC or TAR RNA, but the nature of such selectivity is not addressed (What might be driving this preference? Is it a question of structure? Flexibility? A specific motif? These questions are all left unanswered). Thus, there is concern over definitive statements such as "the strength of binding and preference for RNA or DNA are sequence-dependent".
When one talks of specificity in an RNA-binding protein, one often is referring to a specific motif or sequence context (see work on PUF proteins, or NELF-E, etc.) that is recognized by specific interactions. That is not at all the case in this work. For this reason, we request that the authors modify their language in subsection “NELF-AC binds single-stranded nucleic acids” and elsewhere to better match the limited data presented. Rather than 'indicating' specificity, their data 'suggest' specificity, or 'provide some evidence for selectivity'. If the authors can tone down this language to better reflect the data presented, we will be happy to accept this work. https://doi.org/10.7554/eLife.14981.026
- Seychelle M Vos
- Livia Caizzi
- Franz Herzog
- Patrick Cramer
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
We thank the crystallization facility at the Max Planck Institute of Biochemistry, Martinsried, Germany. Part of this work was performed at the Swiss Light Source at the Paul Scherrer Institut, Villigen, Switzerland. We thank Monika Raabe, Annika Kühn and Henning Urlaub for help with mass spectrometry. We thank the Cramer lab for discussions. We thank Oleh Rymarenko for supporting initial cloning of NELF truncation mutants. SMV is supported by an EMBO Long-Term Postdoctoral Fellowship (ALTF 745-2014). LC is supported by an EMBO Long-Term Postdoctoral Fellowship (ALTF 1261-2014). FH received funding from the European Research Council (StG no. 638218) and the Deutsche Forschungsgemeinschaft (GRK1721). PC was supported by the Deutsche Forschungsgemeinschaft (SFB 860), the European Research Council Advanced Grant TRANSIT, and the Volkswagen Foundation.
© 2016, Vos et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.