1. Biochemistry and Chemical Biology
  2. Structural Biology and Molecular Biophysics
Download icon

The guide sRNA sequence determines the activity level of box C/D RNPs

  1. Andrea Graziadei
  2. Frank Gabel
  3. John Kirkpatrick
  4. Teresa Carlomagno  Is a corresponding author
  1. European Molecular Biology Laboratory, Structural and Computational Biology, Germany
  2. Leibniz University Hannover, Centre for Biomolecular Drug Research, Germany
  3. University Grenoble Alpes, CEA, CNRS IBS, France
  4. Institut Laue-Langevin, France
  5. Helmholtz Centre for Infection Research, Group of Structural Chemistry, Germany
Research Article
  • Cited 1
  • Views 337
  • Annotations
Cite this article as: eLife 2020;9:e50027 doi: 10.7554/eLife.50027

Abstract

2’-O-rRNA methylation, which is essential in eukaryotes and archaea, is catalysed by the Box C/D RNP complex in an RNA-guided manner. Despite the conservation of the methylation sites, the abundance of site-specific modifications shows variability across species and tissues, suggesting that rRNA methylation may provide a means of controlling gene expression. As all Box C/D RNPs are thought to adopt a similar structure, it remains unclear how the methylation efficiency is regulated. Here, we provide the first structural evidence that, in the context of the Box C/D RNP, the affinity of the catalytic module fibrillarin for the substrate–guide helix is dependent on the RNA sequence outside the methylation site, thus providing a mechanism by which both the substrate and guide RNA sequences determine the degree of methylation. To reach this result, we develop an iterative structure-calculation protocol that exploits the power of integrative structural biology to characterize conformational ensembles.

Introduction

In a wide variety of cellular processes, ranging from biosynthesis to signalling and regulation of gene expression, RNA is chemically modified both co- and post-transcriptionally. All classes of RNA are modified, and RNA processing and editing mechanisms are highly conserved, with more than 140 chemical modifications supporting RNA function in all three domains of life (Machnicka et al., 2013). In rRNA, the most abundant modification is 2’-O-methylation, which impacts pre-rRNA processing, ribosome assembly and function. Functionally, 2’-O-methylation has been shown to protect RNA from ribonucleolytic cleavage (Herschlag et al., 1993), stabilize single base-pairs, act as a chaperone (Helm, 2006; Williams et al., 2001) and influence folding at high temperatures (Kawai et al., 1992). Nonetheless, the exact role of position-specific 2’-O-ribose methylation is mostly unknown.

Recent evidence shows that, while methylation sites are largely conserved and cluster in functionally important regions of the ribosome (Decatur and Fournier, 2002), the abundance of modified nucleotides is not uniform across species, or even across tissues. In humans, one third of methylated sites show variable levels of modification according to the cell-type (Krogh et al., 2016). The heterogeneous ribosome population resulting from these different methylation levels is consistent with the notion of specialized ribosomes that translate particular genes with improved efficiency (Xue and Barna, 2012). In agreement with its putative role in regulating translation, the complexity of rRNA 2’-O-methylation has increased with evolution: in bacteria, a protein enzyme catalyses 2’-O-methylation at a handful of rRNA sites, while in yeast and humans a small nucleolar ribonucleoprotein complex (the Box C/D snoRNP) uses a set of guide RNAs to deposit methyl groups in a sequence-specific manner at ~50 and 100 rRNA sites, respectively.

Besides their role in guiding 2’-O-methylation, Box C/D RNPs are involved in a variety of other functions, ranging from rRNA processing (for example, the U3 snoRNP, Kass et al., 1990) to RNA base acetylation (Sharma et al., 2017). Furthermore, nearly half of all human snoRNPs have no predictable rRNA targets, suggesting that they may have other roles within the cell (Falaleeva et al., 2017). Some of these so-called orphan snoRNPs have been associated with cancer and other diseases (Gong et al., 2017; Williams and Farzaneh, 2012).

The varying levels of methylation measured at different sites and the involvement of the Box C/D RNPs in processes other than methylation raise the question as to how the enzymatic activity is regulated or even silenced in the various Box C/D RNPs.

The lack of an in vitro reconstitution protocol yielding an active snoRNP currently precludes mechanistic and structural studies of the eukaryotic Box C/D complex. All structural and in vitro functional work to date has focused on the archaeal Box C/D sRNP (Figure 1a). The validity of this system as a proxy for the eukaryotic enzyme is established by their architectural similarity and comparable complexity of the rRNA methylation patterns (~115 rRNA methylation sites are predicted in Pyrococcus furiosus).

Figure 1 with 6 supplements see all
Oligomeric assembly states of the archaeal Box C/D RNP.

(a) Top-left: molecular components of the archaeal Box C/D sRNP. Top-right: schematic model of the apo sRNP. Bottom-left: schematic model of the holo mono-RNP from Lin et al. (2011) Bottom-right: schematic model of the holo di-RNP from Lapinaite et al. (2013). NTD: N-terminal domain; CTD: C-terminal domain; CC: coiled-coil. (b) Two RNA sequences (st-sR26 and ssR26) were derived from the Pf sR26 RNA and used to assemble the Box C/D sRNPs either in this (st-sR26) or previous studies (ssR26, Lapinaite et al., 2013). The sequence of st-sR26 is derived from the native sR26 RNA by substitution of the apical K-loop element with the more stable K-turn element. (c) SAXS curves with Guinier plots in the inserts of the Box C/D sRNPs reconstituted with st-sR26 before (apo) and after (holo) addition of 1.25 equivalents of each of substrate D and D’ at a concentration of 2 mg/ml. The transition from an apo di-RNP to a holo mono-RNP is evident from the respective Rg values (Figure 1—figure supplement 4). The data was collected at 40°C. All curves are scaled to the same forward scattering intensity.

In archaea, Box C/D sRNPs consist of three proteins assembled around the guide sRNA (Figure 1—figure supplement 1). Within the guide RNA, the highly conserved box C/D sequence motif folds into the kink-turn (K-turn) (Kiss-László et al., 1998) structure and recruits the protein L7Ae (Snu13 and 15.5K in yeast and human, respectively) (Moore et al., 2004). By analogy, the less conserved box C’/D’ motif has been proposed to fold into the kink-loop (K-loop) structure (Nolivos et al., 2005), which also binds L7Ae (Gagnon et al., 2010). The guide RNA–L7Ae complex binds the two C-terminal domains (CTDs) of the homodimer Nop5 (heterodimer Nop58–Nop56 in yeast and humans), which then recruits two copies of the methylation enzyme fibrillarin (Nop1 and fibrillarin in yeast and human, respectively) through its N-terminal domains (NTDs). The guide sRNA recognizes the rRNA substrate sequences at spacer regions located between boxes C and D′ and between boxes C′ and D; once bound to the substrate, it directs methylation to the fifth nucleotide upstream of either box D (substrate D) or D’ (substrate D’) (Reichow et al., 2007).

In the absence of substrate RNA (apo form), the archaeal Box C/D sRNP has been found to assemble mainly as a dimeric RNP, comprising four copies of each protein and two copies of the guide sRNA (Bleichert et al., 2009) (di-RNP, Figure 1). Upon saturation of the substrate RNA binding sites (holo form), two oligomeric states have been reported (Figure 1—figure supplement 2): the monomeric RNP (mono-RNP, Lin et al., 2011), containing two copies of each protein, one guide sRNA and two substrate RNAs (Figure 1—figure supplement 2a), and the dimeric RNP (di-RNP, Lapinaite et al., 2013), containing four copies of each protein, two guide sRNAs and four substrate RNAs (Figure 1—figure supplement 2b). Whether the existence of both mono- and di-RNP forms is merely a consequence of the different experimental set-ups in vitro or has a functional relevance in vivo remains an open question (Yu et al., 2018). In any case, the monomeric sRNP is believed to be a better representation of the eukaryotic system, as snoRNPs have never been shown to assemble into dimers, and the structure of the U3 snoRNP bound to a pre-ribosomal complex displays a mono-RNP architecture (Cheng et al., 2017).

The levels of methylation catalysed by sRNP complexes in vitro vary according to the substrate sequence. In early studies the efficiency of 2’-O-methylation in vitro was proposed to depend on the stability of the substrate–guide duplex and on the formation of an ideal A-form helical geometry close to the modification site (Appel and Maxwell, 2007). Using the Pyrococcus furiosus (Pf) sR26 guide RNA, whose corresponding sRNP methylates substrate D’ more efficiently than substrate D, we demonstrated that methylation levels depend on — among other factors — the nature of the first base-paired nucleotide of the substrate (Graziadei et al., 2016). The observation that substrate D’, with a 5’-uridine, displays good turnover in all conditions, while turnover of substrate D, with a 5’-guanosine, requires binding of substrate D’ (Graziadei et al., 2016), led us to suggest that the nature of the last base-pair before the box D (or box D’) regulates product dissociation. In agreement with the hypothesis that methylation levels are not exclusively dependent on the stability of the substrate–guide duplex, a recent study, which quantified site-specific rRNA methylation in two different human cell lines (Krogh et al., 2016), revealed that methylation levels in vivo do not correlate with either the number of base-pairs or the stability of the substrate–guide helix.

Here we demonstrate that the sequence of the substrate–guide duplex influences the affinity of fibrillarin for the substrate and that the extent of fibrillarin binding correlates with the efficiency of methylation. Using nuclear magnetic resonance (NMR), small angle X-ray (SAXS) and neutron (SANS) scattering data, we demonstrate that, in the context of the sRNP complex, the affinity of fibrillarin for the substrate depends on the RNA sequence beyond the methylation site. This difference in affinity is explained by the energetics of a global conformational transition of the sRNP from an inactive to an active state and provides a further route, besides the modulation of product dissociation described previously (Graziadei et al., 2016), to tune RNA methylation levels. To derive these results we developed an ensemble structure-calculation method that exploits the ability of integrative structural biology in solution to reveal and characterize conformational equilibria.

Results

Structure determination of the half-loaded mono-RNPs

To understand the reasons for the higher efficiency of substrate D’ methylation as compared to substrate D in the Pf sR26 RNP we set out to determine the structure of the corresponding half-loaded sRNPs, bound to either substrate D or substrate D’. We used a stabilized version of the Pf sR26 guide RNA, where the apical K-loop has been substituted by a K-turn sequence (stabilized sR26, st-sR26, Figure 1b). This modification was necessary to ensure that the complex remains stably assembled over several days at 55°C, as required by the NMR experiments, and does not affect the oligomerization state of the complex (Figure 1 and Figure 1—figure supplement 3).

First, we determined the oligomerization state of the RNP complexes assembled with st-sR26 from their radius-of-gyration (Rg), measured by SAXS or SANS. To estimate the compatibility of experimentally determined Rg values with the mono- or di-RNP assembly states, we evaluated the theoretical Rg distributions of 5000 di-RNP models with randomized positions of the fibrillarin copies not bound to the RNA in both apo and holo (fully-loaded) conformations from Lapinaite et al. (2013)Figure 1—figure supplement 4). We obtained a mean Rg value of 55.9 Å with a standard deviation (SD) of 2.0 Å for the apo di-RNP and a mean Rg of 58.1 ± 3.6 Å for the holo di-RNP. The SAXS curves of the apo sRNP assembled with st-sR26 (Figure 1c) correspond to a radius-of-gyration (Rg) of 54.3 Å, which is consistent with a di-RNP architecture (Figure 1—figure supplement 4). Addition of 1.25 molar equivalents of either substrate D or D’ reduces the Rg from 54.3 Å to 50.0 or 47.3 Å, respectively, with a further reduction to 45.0 Å, upon addition of both substrates (holo state) (Graziadei et al., 2016). These radii are no longer compatible with a di-RNP, demonstrating that both the half-loaded and holo st-sR26 complexes are mono-RNPs (Figure 1—figure supplement 4). The same transition from a di-RNP to a mono-RNP occurred for the Box C/D RNP assembled with sR26 upon substrate RNA binding (Figure 1—figure supplement 3). This is different from the holo complex assembled previously in our laboratory with the ssR26 RNA (symmetric and stabilized sR26), which contains two substrate D’ RNA binding sites of the same sequence (Figure 1b and Figure 1—figure supplement 2b). The RNP assembled with ssR26 remained a di-RNP after saturation of the substrate RNA binding sites (Lapinaite et al., 2013).

Before embarking upon the structural study of the sRNPs containing st-sR26, we wanted to understand which elements are responsible for the different oligomerization states of the holo ssR26- and holo st-sR26-RNPs. The ssR26 and the st-sR26 RNAs differ only in the sequence of the guide RNA at the box D position, which in the case of ssR26 is identical to that of guide D’. Thus, we generated two additional guide RNAs with distinct D and D’ sequences, st-sR26-1 and st-sR26-2: in st-sR26-1 (st-sR26-2), guide sequence D is a chimeric sequence, formed by the 5’ half of st-sR26 guide D (st-sR26 guide D’) and the 3’ half of st-sR26 guide D’ (st-sR26 guide D) (Figure 1—figure supplement 5a). Interestingly, the Box C/D enzyme containing st-sR26-1 maintained the di-RNP architecture upon binding of either substrate RNAs, while the sRNP containing st-sR26-2 transitioned to the mono-RNP state (Figure 1—figure supplement 5b). Mutation of the last nucleotide of st-sR26-1 guide D to either C or U (A61C and A61U with complementary substrate D) did not perturb the di-RNP architecture (Figure 1—figure supplement 5c). We conclude that the guide sequence strongly influences the oligomerization state of the holo complex.

Further evidence of the monomeric state of half-loaded and holo st-sR26 complexes emerges from the P(r) distribution calculated from the SANS curve of the complexes assembled with 2H-fibrillarin in 42%:58% D2O:H2O solvent: the number and relative intensities of the maxima are compatible with the presence of two fibrillarin copies but incompatible with the presence of four (Figure 1—figure supplement 6). As monomeric complexes, the substrate-loaded st-sR26 RNPs can serve as proxies for the eukaryotic snoRNP. As we showed previously (Graziadei et al., 2016), the sRNP assembled with this RNA catalyses the methylation of the substrate D’ more efficiently than substrate D, in a similar manner to the native Pf sR26 RNP.

To investigate whether the difference in methylation efficiency of substrate D and D’ correlates with structural differences, we assembled the Box C/D RNP with the st-sR26 guide RNA and saturated either its D or D’ guide site (Figure 1b) to obtain two half-loaded mono-RNPs. We then determined their structures in solution, where the conformational dynamics of the complexes are preserved. The mono-RNPs are ~190 kDa in size and thus not amenable to standard structure determination by NMR. In this molecular-weight range, solution NMR focuses on methyl-group resonances, which have favourable relaxation properties and show strong signal intensity (Sprangers and Kay, 2007; Tugarinov et al., 2003) Thus, to solve the structure of the two half-loaded sRNPs, we used a combination of methyl-group NMR spectroscopy and small-angle scattering (see Methods and Carlomagno, 2014).

As in our earlier work on the fully-loaded di-RNP complex (Lapinaite et al., 2013), we started from the assumption that the interaction interface of the Nop5-CTD with the L7Ae–K-turn-RNA complex and that of the Nop5-NTD with fibrillarin do not change with respect to those observed in the respective crystal structures (Liu et al., 2007; Xue et al., 2010; Aittaleb et al., 2003). To validate this assumption we acquired two-dimensional 1H-13C correlation spectra of fibrillarin and L7Ae labelled specifically at the methyl groups of Ile, Val and Leu residues (Tugarinov and Kay, 2003). The chemical shift perturbations measured for L7Ae in the Box C/D mono-RNP with respect to L7Ae in the L7Ae–K-turn-sRNA complex map to the previously described interface between L7Ae and the Nop5-CTD (Xue et al., 2010; Figure 2—figure supplement 1). Similarly, the chemical-shift perturbations measured for fibrillarin in the Nop5-NTD–fibrillarin complex with respect to free fibrillarin map to the interaction interface observed in previous crystal structures (Aittaleb et al., 2003). These CSPs are conserved in the Nop5–fibrillarin complex and in the apo Box C/D mono-RNP (Figure 2—figure supplement 2), demonstrating that fibrillarin interacts exclusively with the Nop5-NTD in all complexes.

We then used the signals from the L7Ae and fibrillarin methyl groups to measure paramagnetic relaxation enhancements (PREs). In this technique, a paramagnetic tag (spin-label) carrying an unpaired electron is coupled to a unique cysteine engineered on one protein subunit within the complex. The PREs elicited on the methyl groups of a second protein subunit by the unpaired electron are translated into distance restraints (Battiste and Wagner, 2000), which define the position and relative orientation of the two subunits in the complex. For the D-loaded (D’-loaded) mono-RNP, we collected a total of 407 (442) PREs using spin-labels on L7Ae-Q45C, L7Ae-E58C/C68S, L7Ae-C68, Nop5-E196C, Nop5-D247C and Nop5-S343C while observing the methyl resonances of fibrillarin and on Nop5-E65C while observing the methyl resonances of L7Ae (Figure 2—figure supplement 3a). The PRE data were validated by means of intra-molecular PREs within the rigid fibrillarin module (Figure 2—figure supplement 4). The excellent fit between the experimental PRE intensity ratios and those predicted from the known distances confirms the reliability of the PRE-derived inter-molecular distances.

A second class of structural restraints was derived from SANS curves acquired with contrast-matching. In these experiments one or more proteins in the complex are 2H-labelled and contribute to the observed scattering signal, while the scattered intensity of the unlabelled proteins is masked by the solvent, which is prepared as a 42%:58% D2O:H2O mixture. A combination of such datasets provides sufficient information to restrain the relative position of several molecules within a multi-subunit complex. In our case we acquired SANS curves for 2H-L7Ae, 2H-Nop5, 2H-Fib, 2H-RNA, 2H-Fib/2H-RNA and 2H(70%)-Nop5/2H-RNA in 42%:58% D2O:H2O (Figure 2—figure supplement 3b). In addition, we also collected SAXS curves, which report on the shape of the entire complexes.

These data were then incorporated into a structure-calculation protocol adapted from that developed in our previous study (Lapinaite et al., 2013) (for a description of the adapted protocol, refer to Methods and Figure 3—figure supplement 1). We used the conformations of the modules L7Ae–K-turn-sRNA–Nop5-CTD and Nop5-NTD–fibrillarin observed in previous crystal structures, and restricted our conformational search to the relative orientations of the three domains of Nop5, the conformation of the sRNA in parts other than the K-turn motifs and A-form helices and the relative positions of the two copies of each protein in the mono-RNP.

Conformation of the half-loaded mono-RNPs in solution

The methyl-group NMR spectrum of fibrillarin in the apo RNP assembled with st-sR26 is identical to the spectrum of the RNP assembled with ssR26 (Figure 2a, left panel). This was expected, as in both di-RNPs all four fibrillarin copies are far from the RNA and thus their chemical shifts are independent of the RNA sequence used to assemble the complex.

Figure 2 with 4 supplements see all
NMR and SAS of the half-loaded st-sR26 RNPs.

(a) Left, overlay of ILV-methyl 1H-13C spectra of fibrillarin in the apo ssR26 (turquoise) and apo st-sR26 (blue) RNPs. In both di-RNPs, all four fibrillarin copies are distant from the RNA and the two spectra are identical. Middle, overlay of ILV-methyl 1H-13C spectra of fibrillarin in the apo st-sR26 (blue) and substrate D’-loaded st-sR26 (green) RNPs. Right, expanded view of the overlay of ILV-methyl 1H-13C spectra of fibrillarin in the apo st-sR26 (blue) and substrate D’-loaded st-sR26 (green) RNPs. (b) Left, structural snapshots of the on- (left) and off- (right) states of one fibrillarin copy in the substrate D’-loaded mono-RNP. Upon binding of fibrillarin to the substrate–guide duplex, the Nop5-E65C spin-label (red) comes close to one L7Ae copy (green), leading to PRE intensity-ratios below 0.8 for the L7Ae-ILV residues shown as yellow spheres. In contrast, when fibrillarin is in the off-state (right), the Nop5-E65C spin-label is far from L7Ae and cannot induce any PRE-mediated attenuation of peak intensities. Colour-code as in Figure 1. Right, PRE effects (Ipara/Idia, ratio of the peak intensities when the spin-label is in the paramagnetic and diamagnetic state, respectively) of the Nop5-E65C tag on the L7Ae-ILV peaks in the substrate D-bound (red) and substrate D’-bound (blue) mono-RNPs. The yellow bars indicate the residues represented as yellow spheres in the left panel. (c) Left, cartoon representation of the [on,off]-conformer of the substrate D’-loaded mono-RNP; right, cartoon representation of the conformational equilibrium between the [on,off]- and [off,off]-conformers of the same complex.

Methyl groups are rather sparse in the protein surfaces involved in recognition of the RNA backbone; in the RNA-bound form of fibrillarin, only the methyl groups of V35, I82, V110, L114, I117, V151 and V185 are expected to be within 8 Å of the RNA, while only V110 should be closer than 5 Å. Therefore, the chemical shift perturbations (CSPs) for fibrillarin upon RNA binding should be few and relatively small in magnitude. As expected, the methyl-group NMR spectrum of the substrate-bound RNPs showed only moderate CSPs; nonetheless, these were mainly localized in the spectral region containing V110, V151 and V185, thus confirming that fibrillarin recognizes the substrate D’–guide duplex (Figure 2a, right panel).

Further evidence of substrate–guide recognition by fibrillarin was provided by the PRE data. As shown in Figure 2b for substrate D’, upon fibrillarin binding to the substrate–guide duplex (on-state, upper left), the Nop5-E65C spin-label (red) comes close to one L7Ae copy and would lead to PRE intensity-ratios of less than 0.8 for the L7Ae-ILV residues shown as yellow spheres. In contrast, when fibrillarin is not bound to the substrate–guide duplex (off-state, upper right), the Nop5-E65C spin-label is far from L7Ae and cannot cause any PRE attenuation of L7Ae peaks. Thus, the low PRE intensity-ratios observed experimentally for the methyl groups of the residues marked in yellow (Figure 2b, bottom) indicates the presence of conformers in which fibrillarin is bound to the substrate–guide duplex.

In an half-loaded mono-RNP, one fibrillarin copy is necessarily in the off-state, due to the lack of the corresponding substrate; the second fibrillarin copy could be either stably bound to the substrate–guide duplex (yielding a complex in the [on,off]-conformation) or exchanging between the on- and off-states (corresponding to the RNP exchanging between the RNP [on,off]- and [off,off]-conformations, Figure 2c). The NMR data are qualitatively compatible with both scenarios, as the broad line-widths and the overlap of the fibrillarin NMR peaks that show the largest CSPs upon RNA binding preclude a quantitative analysis of the magnitude of the CSPs in terms of relative proportions of the two conformations. Thus, we decided to consider both scenarios in the interpretation of the structural data.

Structure calculations

To determine the [on,off]- and [off,off]-conformations of both the substrate D- and D’-loaded sRNPs, we adapted our previously developed structure-calculation protocol (Lapinaite et al., 2013). We initially performed two structure calculations per complex: in the first calculation, we imposed the restraint that one fibrillarin copy is in contact with the corresponding substrate–guide duplex, while the other copy is not ([on,off]-state); in the second calculation, we left both fibrillarin copies free to adopt any position compatible with the PRE data ([off,off]-state). We then recursively binned the PRE-derived distance-restraints into two sets, according to their compatibility with the the [on,off]- or [off,off]-conformations (Figure 3—figure supplement 1). The majority of restraints were found to be consistent with both states and therefore appeared in both sets. One notable exception is the set of PRE restraints derived from the methyl-groups of L7Ae in the presence of spin-labelled Nop5-E65C, which are compatible only with fibrillarin being in contact with the substrate–guide duplex (Figure 2b). In total, we performed four structure-calculation runs, two for each of the half-loaded complexes. The [on,off]-conformations were compatible with nearly all PRE-derived restraints (400 out of 407 for the substrate D-loaded and 436 out of 442 for the substrate D’-loaded RNP, respectively), while the [off,off]-conformations were compatible with 364 and 414 restraints for the substrate D- and substrate D’-loaded RNP, respectively.

Each individual structure calculation proceeded through a global and a local search stage. At each stage, the total and distance-restraint energies as well as the back-calculated fits to the SANS curves were used for structure selection. The two final structure ensembles corresponding to the [on,off]-states (Figure 3) are defined to a precision of better than 2.5 Å (root-mean-square-deviation, RMSD, of the protein Cα and RNA P atoms, excluding flexible regions). When compared to the existing structure of the holo mono-RNP from Sulfolobus solfataricus (PDB entry 3pla, Lin et al., 2011), the substrate D- and substrate D’-loaded complexes show a reasonable similarity (Figure 3—figure supplement 2). All major features of the substrate-bound site are conserved: the RNA-guide sequences lie on the coiled-coil Nop5 domain at an angle of about 70° and the C-terminal tip of L7Ae is in proximity to the short Nop5 β-sheet 77–79 and α-helix 64–73. However, the solution structures differ from the crystallographic structure in many details, demonstrating that the sRNP architecture is flexible enough to adapt to different guide- and substrate-RNAs. As expected, a significant divergence from the structure of PDB entry 3pla is observed in the substrate-unbound half of the complexes.

Figure 3 with 5 supplements see all
Ensembles of structures in agreement with the experimental data for the [on,off]- and [off,off]-states of substrate D’- and substrate D-loaded sRNPs.

The RMSD values of each ensemble (in parentheses) are calculated as the average of the RMSD values of the ensemble structures with respect to the structure closest to the mean over the Cα and P atoms of the protein and RNA structured domains, including the fibrillarin units not bound to the RNA. Colour-code as in Figure 1.

Importantly, neither the [on,off]- nor the [on,on]-ensemble are able to reproduce the combination of PRE and SAS data satisfactorily for each of the substrate D- or the substrate D’-loaded RNPs. The PRE intensity-ratios measured for the Nop5-NTD-E65C mutant on the methyl-groups of L7Ae indicate the presence of conformers in the [on,off]-state. In agreement with this, the [on,off]-structures of Figure 3 reproduce the PRE data reasonably well both for the substrate D- and substrate D’-loaded complexes (Figure 3—figure supplements 3 and 4). However, these structures are unable to fit the 2H-Fib SANS, 2H-Fib/2H-RNA SANS and SAXS curves in a satisfactory manner (Figure 3—figure supplement 5). Thus, the combination of PRE and SAS data is incompatible with a single state for each of the substrate D- or substrate D’-loaded RNPs, but rather reveals the presence of conformational ensembles.

Conformational ensembles

Because the SAS data that are in disagreement with the [on,off]-conformations of Figure 3 all report on the position of the fibrillarin copies in the complexes, we deduced that the conformational equilibria present in solution must be related to the position of fibrillarin. Different types of conformational equilibria are conceivable. In the simplest scenario, only the fibrillarin in the off-state samples multiple conformations, with the second fibrillarin remaining stably in the on-state; in a more complex scenario, the second fibrillarin copy may sample both the on- and off-states (in addition to the conformational flexibility of the fibrillarin copy in the off-state).

To represent both scenarios and obtain structural ensembles compatible with both PRE and SAS experimental data, we developed an ensemble scoring protocol (Figure 3—figure supplement 1b, Methods). For both the substrate D- and substrate D’-loaded RNPs, we used representative structures of the [on,off]- and [off,off]-state ensembles (Figure 3) — defined as the structure closest to the mean structure — as starting points to generate four sets of ~4000 conformations, in which the positions of the Nop5-NTD–fibrillarin units not bound to the substrate–guide duplex were randomized, in order to account for their flexibility. We then used a pseudo-genetic algorithm to select ensembles of either exclusively [on,off]-conformers or of both [on,off]- and [off,off]-conformers that best fit the PRE data, as well as the 2H-Fib and 2H-Nop5 SANS, 2H-Fib/2H-RNA SANS, 2H(70%)-Nop5/2H-RNA SANS and SAXS curves (Figure 3—figure supplement 1).

Conformational ensemble of the substrate D’-loaded sRNP

Despite the reasonable fit of the PRE intensity ratios of the substrate D’-loaded sRNP with the representative structure of the [on,off]-conformers of (Figure 3; Figure 3—figure supplement 3), the larger Rg of the experimental 2H-Fib SANS curve with respect to the theoretical one indicated the presence of conformers where the two copies of fibrillarin are more distant from each other than in this set of [on,off]-conformers (Figure 3—figure supplement 5).

We thus set out to improve the fit to the experimental data by deriving mixed ensembles containing both [on,off]- and [off,off]-conformers using the ensemble scoring protocol described above. The resulting best-fit ensembles contained 66 ± 8% [on,off]-conformers and showed a much improved fit to both the SAXS and 2H-Fib SANS curves (Figure 4). The agreement between experimental and predicted PREs also improved (Figure 5).

Figure 4 with 1 supplement see all
Fibrillarin binds the substrate–guide duplex more strongly in the substrate D’-loaded sRNP.

(a) The structural ensemble selected by the pseudo-genetic scoring algorithm (Methods) for the substrate D’-loaded sRNP, containing two [on,off]-state and one [off,off]-state conformers, with fibrillarin shown in shades of blue. The fits to the experimental SAS curves are shown on the right. All SANS curves were measured in 42%:58% D2O:H2O. (b) Structural ensemble selected by the pseudo-genetic scoring algorithm for the substrate D-loaded sRNP, containing three [on,off]-state and eight [off,off]-state conformers. In both a and b, the mean and standard deviation of the percentage of [on,off]-state structures in the three top-scoring ensembles across three independent scoring runs is shown in the title. The structural ensembles yield much better agreement with the SAS curves than do the individual [on,off]- and [off,off]-state structures (Figure 3—figure supplement 5).

Figure 5 with 1 supplement see all
Fit of the ensemble structures representing the substrate–loaded RNPs to the PRE data.

(a) Comparison of Ipara/Idia ratios back-calculated from the selected ensemble of conformers of the substrate D’-loaded st-sR26 RNP shown in Figure 4a (blue) with the experimental ratios (black). The reported Q-factors were calculated as recommended by Clore and Iwahara (2009). In the title of each panel the first name indicates the spin-labelled protein, the number indicates the position of the spin-label and the second name indicates the protein whose ILV methyl groups were detected. (b) Comparison of Ipara/Idia ratios back-calculated from the selected ensemble of conformers of the substrate D-loaded st-sR26 RNP shown in Figure 4b (blue) with the experimental ratios (black). The structural ensembles yield better or similar agreement with the PRE data than do the individual [on,off]- and [off,off]-state structures Figure 3—figure supplements 3 and 4).

To verify that an acceptable fit to the experimental data requires the combination of both [on,off]- and [off,off]-conformers in the structural ensemble, we repeated the ensemble scoring protocol selecting from only [on,off]- or [off,off]-conformers. The fit to the SAS curves remained unsatisfactory for both these ensembles (Figure 4—figure supplement 1), with the [on,off]-ensemble yielding a poor fit to the 2H-Fib SANS curve and the [off,off]-ensemble being unable to reproduce the SAXS curve. In addition, the fit of the [on,off]-ensemble to the PRE data (Figure 5—figure supplement 1a) remained inferior to that of the ensemble containing both [on,off]- and [off,off]-structures.

Conformational ensemble of the substrate D-loaded sRNP

The higher values of the PRE intensity-ratios measured for the L7Ae methyl-groups in the presence of the spin-labelled Nop5-NTD-E65C mutant in the substrate D-loaded mono-RNP as compared to the substrate D’-loaded mono-RNP indicated that the proportion of fibrillarin bound to the substrate–guide duplex is lower for the mono-RNP loaded with substrate D than for that loaded with substrate D’. Accordingly, the combination of PRE and SAS data could not be fit with an ensemble consisting of [on,off]-conformers only, as the χ2 value of the SAXS curve remained as poor as that obtained with a single [on,off]-conformer (>250) (Figure 4—figure supplement 1). Conversely, ensembles containing only [off,off]-conformers failed to reproduce the PRE dataset of the complex containing spin-labelled Nop5-E65C (Figure 5—figure supplement 1b).

When we fitted the PRE and SAS data with ensembles consisting of both [on,off]- and [off,off]-conformers, we could reproduce all experimental data satisfactorily with a population of [on,off]-conformers of 34 ± 10% (Figures 4 and 5).

Fibrillarin binds preferentially to substrate D’

The combination of the NMR and SAS data demonstrated the existence of a conformational equilibrium between [on,off]- and [off,off]-conformers for both substrate D- and substrate D’-loaded RNPs. The ensemble of conformations representing the substrate D’-loaded sRNP (Figure 4) contained a reproducibly higher proportion of conformers with fibrillarin in the [on,off]-state (66 ± 8%) than did the ensemble representing the substrate D-loaded sRNP (34 ± 10%), as was expected from the stronger PRE effects induced on L7Ae by the Nop5-E65C paramagnetic tag for the substrate D’-loaded sRNP (Figure 2b). Thus, despite the lack of sequence-specific interactions with the RNA, fibrillarin binds more strongly to the substrate D’–guide duplex than to the substrate D–guide duplex in the context of the Box C/D RNP.

This observation prompted us to analyse in more detail the structural differences between the [on,off]-states of the substrate D- and D’-loaded RNPs, as well as their stability in a 150-ns molecular-dynamics (MD) simulation. In the [off,off]-state, both half-loaded RNPs display a regular A-form helix of 11 base-pairs formed by the guide and substrate RNAs and positioned far from the Nop5 coiled-coil domains. The geometry of this helix was given as a restraint in the structure calculations, because of the perfect complementarity of the substrate–guide sequences over these 11 nucleotides. Binding of fibrillarin pushes the substrate–guide duplex towards the Nop5 coiled-coil domain, thereby perturbing the base-pairing at the substrate 3’ end (Figure 6). This observation is in agreement with a recent study, reporting that a substrate–guide duplex of only 10 base-pairs results in the highest level of in vitro methylation for a S. solfataricus enzyme (Yang et al., 2016). During the 150-ns MD trajectory of the D’-loaded complex, the two base-pairs at the 3’ end of substrate D’ are disrupted and the Nop5 α10 helix and its flanking loops form many electrostatic contacts with the RNA (Figure 7). In addition, W319 forms a face-to-face interaction with the no-longer base-paired G15 of the guide RNA. In contrast, in the substrate D-loaded complex, only one base-pair is melted at the 3’ end of the substrate (the second last), fewer new contacts are formed between the protein and the RNA and some other contacts are lost during the simulation (Figures 6 and 7). Furthermore, in the substrate D’-loaded complex the A–U base-pair at the 5’ end of substrate D’ iwas often disrupted during the simulation, allowing for the formation of electrostatic contacts between E289 and A25/C5 and K290 and U4 (Figure 7). Conversely, in the substrate D-loaded complex, the C–G base-pair at the 5’ end of the substrate remains stable throughout the simulation (Figure 6). The first-base paired nucleotide of substrate D is kept in place by hydrogen bonds between G22 of the unpaired guide and its sugar backbone. In agreement with our MD simulations, Yang et al. (2016) demonstrated that high levels of methylation occur for a substrate–guide duplex length of 8–10 base pairs.

Substrate–guide duplex hydrogen-bonds throughout the molecular dynamics runs.

Plots showing the hydrogen bonding pattern across substrate–guide duplex 3' and 5' ends in the substrate D'-bound (left) and substrate D-bound (right) sRNPs over two 150-ns molecular dynamics simulations. A blue line indicates the presence of at least two hydrogen bonds between the corresponding bases. The numbering is according to Figure 1.

Contacts between proteins and the 5' and 3' ends of the substrate–guide duplex in a 150-ns molecular dynamics run.

(a) Protein–RNA contacts at the 3' end of the substrate–guide duplex in the [on,off]-state of the substrate D'-bound RNP. Each line marks the presence of a contact between the two residues under consideration. Black and blue indicate amino acids of Nop5 and fibrillarin, respectively; orange and cyan indicate nucleotides of the sRNA and substrate D’, respectively. The numbering of the RNA is as in Figure 1. Contacts are H-bonds between polar amino-acid side-chains and polar atoms of the nucleotide (as defined in CPPTRAJ within Amber); hydrophobic interactions involving aromatic amino acid side chains and base rings (with a distance cut-off of 4.0 Å between the centres of the rings); electrostatic contacts between polar amino acid side chains and the RNA phosphorus atoms (with a distance cut-off of 4.0 Å between the polar group and the P atom). The interacting amino acids and nucleotides are displayed in the structural panel in the middle (starting structure) and on the right (structure towards the end of the simulation). (b) Protein–RNA contacts at the 5' end of the substrate–guide duplex in the [on,off]-state of the substrate D'-bound RNP. (c) Protein–RNA contacts at the 3' end of the substrate–guide duplex in the [on,off]-state of the substrate D-bound RNP. Only the second last base-pair melts, leading to a lower number of protein–RNA contacts as compared to the substrate D’-bound RNP. (d) Protein–RNA contacts at the 5' end of the substrate–guide duplex in the [on,off]-state substrate D-bound RNP. Both the RNA secondary structure and the position of the RNA relative to the proteins remain constant throughout the simulation, without formation of new protein–RNA contacts.

We conclude that the stability of the fibrillarin-bound form depends on a delicate balance between the loss of entropy due to fibrillarin localization, and the positive and negative enthalpy changes associated with base-pair melting and formation of new protein–RNA contacts, respectively. Given that:

(1) (Gon,offDGoff,offD)+(Gon,offDGoff,offD)=RT(lnpon,offDpoff,offDlnpon,offDpoff,offD)

where Gon,offD' Gon,offD  and Goff,offD' (Goff,offD) are the free energies of the substrate D’ (D)-loaded complex in the [on,off]- and [off,off]-states, respectively, and pon,offD'poff,offD'pon,offDpoff,offD  is the ratio of the populations of the substrate D’ (D)-loaded complex in the [on,off]- and [off,off]-states, we can calculate that the difference between the ΔG values for the [off,off]→[on,off] transition of the substrate D’- and substrate D-loaded complexes in the st-sR26 RNP is only 0.86 ± 0.55 kcal/mol. This small value suggests that fine differences in the stability of the substrate–guide helices may regulate the affinity of fibrillarin for the methylation site and thus the fractional population of active enzyme.

Discussion

2’-O-rRNA methylation is one of the most extensive modification processes occurring during ribosome synthesis and maturation. The strong conservation of the methylation sites over different species, together with the lethal effect of methylation suppression, led to the conclusion that methylation is a constitutive modification of functional ribosomes. However, rRNA methylation has recently been proposed to exert a regulatory function by generating an heterogeneous ribosome population with differential methylation levels (Erales et al., 2017).

2’-O-methylation is implemented by the Box C/D RNP enzyme through an RNA-guided catalysis. In addition to methylation, Box C/D complexes are involved in a plethora of other functions related to RNA processing. In the context of the multiple roles of Box C/D complexes, the question arises as to how Box C/D RNPs distinguish whether the RNA substrate bound to the guide sequence should be methylated and to what extent.

To address this question, we studied the structure-function relationship of Box C/D RNPs in solution through a combination of NMR and SAS data. Using the archaeal Box C/D sRNP, we determined the solution-state structures of the half-loaded substrate D- and substrate D’-bound mono-RNPs. Instead of a single, well-defined conformational state, we found that the copy of fibrillarin at the substrate-loaded site exchanges between the substrate-bound and unbound states, with the substrate D’-loaded complex displaying a higher population of the methylation-competent [on,off]-state than the substrate D-loaded RNP. Accordingly, the substrate D’-loaded RNP achieves higher levels of methylation (Graziadei et al., 2016). The existence of dynamic equilibria between substrate-bound and unbound conformers of fibrillarin has not been detected by X-ray crystallography, which instead selects for the most ordered conformation.

Our results suggest that the proportion of the methylation-competent complex is subtly tuned by the free-energy difference between the active [on,off]- and inactive [off,off]- conformations (and possibly also by the kinetics of transition, on which our structural data at equilibrium do not provide any information). Recognition of the RNA ribose by fibrillarin is accompanied by a loss in entropy at the junction between the Nop5-NTD and the Nop5 coiled-coil and between the box C/D (or box C’/D’) RNA elements and the substrate–guide duplex. In addition, upon fibrillarin binding, the substrate–guide helical structure must deviate from the ideal A-form geometry, in order to adapt to the proteins. This is particularly evident at the 3’ end of the substrate, where any base-pair beyond the tenth is melted (Yang et al., 2016; Figure 6). These energetically costly events are compensated by the formation of contacts between fibrillarin and the RNA backbone, as well as by contacts between the Nop5-CTDs and fibrillarin and the two ends of the substrate–guide duplex. MD simulations showed that the substrate D’–guide duplex is less stable at the substrate 3’ end than the substrate D–guide duplex, and that the melting of the last two base-pairs results in a large number of protein–RNA contacts. These appear to stabilize the [on,off]-state in the substrate D’-loaded RNP, suggesting that the exact sequence of the 3’ end segment of the substrate RNA influences the fractional population of the active conformation of the RNP. However, as we are unable to detect the RNA signals of the 190 kDa RNP by NMR spectroscopy in solution, we cannot exclude the possibility that in vivo other RNA elements, involving for example the overhang at the substrate 5’ and 3’ ends, could also play a role in stabilizing the [on,off]-state, as previously suggested (Appel and Maxwell, 2007).

In conclusion, methylation efficiency appears to be regulated by a complex interaction network depending on the substrate rRNA sequence beyond the methylation site. We propose that, together with substrate turnover (Graziadei et al., 2016), the ability of different substrate–guide duplexes to shift the position of the equilibrium between the [off,off]- and [on,off]-state conformers modulates the level of methylation at distinct rRNA sites. When the difference in the free energies of the active and inactive enzyme states is small, the correspondingly variable ratio between the populations of the active and inactive conformations provides a mechanism to tune the activity level. When the free-energy difference is large and positive, the population of the methylation-competent conformation becomes vanishingly small and the Box C/D RNP loses its capacity to catalyse methylation. This situation could be the basis for supporting functions of the Box C/D complexes that are unrelated to methylation and thus may not require fibrillarin to bind the RNA (for example, the U3 snoRNP, which guides the formation of the central pseudoknot in 18S rRNA).

To calculate the structural models of the substrate D- and substrate D’-loaded RNPs we developed a novel hybrid structure-calculation protocol that fits a combination of NMR and SAS data to an ensemble of conformations. The application of integrative structural biology approaches is particularly relevant to the detection of inter-domain dynamics of RNP complexes, as the different types of structural data are sensitive to conformational changes in different ways. In this case, the combination of NMR PRE data and SAS data was essential for revealing the equilibrium between RNA-bound and RNA-unbound fibrillarin states. The computational workflow developed here allows interpretation of hybrid structural data in terms of structural ensembles, rather than as individual conformations. The protocol proceeds in a step-wise fashion, where the structural ensemble becomes progressively well-defined while increasing the demand on the quality of the fit between predicted and experimental data. We anticipate the methodology developed here to be generally applicable to modular enzymes undergoing domain reorientation during catalysis.

Materials and methods

Key resources table
Reagent type
(species) or
resource
DesignationSource or
reference
IdentifiersAdditional
information
Strain, strain background (Escherichia coli)BL21 (DE3)EMBL protein expression facilityNA
Strain, strain background (Escherichia coli)BL21 Rosetta 2Merck MilliporeCat #71400–3
Recombinant DNA reagentpETM-11 Fibrillarin (plasmid)Lapinaite et al. (2013)Nterminal His6 + TEV site
Recombinant DNA reagentpETM-11 Nop5 (plasmid)Lapinaite et al. (2013)Nterminal His6 + TEV site; L113K V223E mutant. Codon-optimised synthetic gene (GeneArt)
Recombinant DNA reagentpETM-11 Nop5 E65C (plasmid)Lapinaite et al. (2013)Mutation of pETM-11 Nop5
Recombinant DNA reagentpETM-11 Nop5 E196C (plasmid)Lapinaite et al. (2013)Mutation of pETM-11 Nop5
Recombinant DNA reagentpETM-11 Nop5 D247C (plasmid)Lapinaite et al. (2013)Mutation of pETM-11 Nop5
Recombinant DNA reagentpETM-11 Nop5 S343C (plasmid)Lapinaite et al. (2013)Mutation of pETM-11 Nop5
Recombinant DNA reagentpETM-11 L7Ae (plasmid)Lapinaite et al. (2013)Nterminal His6 + TEV site
Recombinant DNA reagentpETM-11 L7Ae Q45C (plasmid)Lapinaite et al. (2013)Mutation of pETM-11 L7Ae also carrying C68S mutation
Recombinant DNA reagentpETM-11 L7Ae E58C (plasmid)Lapinaite et al. (2013)Mutation of pETM-11 L7Ae also carrying C68S mutation
Sequence-based reagentst-sR26Graziadei et al. (2016)In vitro transcribed RNA
Sequence-based reagentst-sR26-1This paperIn vitro transcribed RNAMethod section: RNA synthesis
Sequence-based reagentst-sR26-1 substrateThis paperIn vitro transcribed RNAMethod section: RNA synthesis
Sequence-based reagentst-sR26-1 A61CThis paperIn vitro transcribed RNAMethod section: RNA synthesis
Sequence-based reagentst-sR26-1 A61UThis paperIn vitro transcribed RNAMethod section: RNA synthesis
Sequence-based reagentst-sR26-2This paperIn vitro transcribed RNAMethod section: RNA synthesis
Sequence-based reagentst-sR26-2 substrateThis paperIn vitro transcribed RNAMethod section: RNA synthesis
Sequence-based reagentsR26Graziadei et al. (2016)In vitro transcribed RNA
Sequence-based reagentssR26Lapinaite et al. (2013)In vitro transcribed RNA
Commercial assay or kitTLAM-ILVproS labellingNMR-BioNA
Chemical compound, drugIodoacetoamido-PROXYLSigma-AldrichCat # 253421–25 MG
Chemical compound, drug(methyl-13C, 99%; 3,3-D2, 98%) α-ketobutyric acidCambridge Isotope LabsCDLM-7318-PK
Chemical compound, drug(3-methyl-13C, 99%; 3,4,4,4-D4, 98%) α-ketoisovaleric acidCambridge Isotope LabsCDLM-7317-PK
Chemical compound, drug[3–2 H2,4–2H, 5–13C, 5’−2 H3]-a-ketoiso-caproateLichtenecker et al. (2013)
Software, algorithmCNSThis paperMethod section: Structure calculation and selection. Adaptation of protocol from Lapinaite et al. (2013)
Software, algorithmPython-based SAS-PRE scoring algorithmThis paper
Software, algorithmATSAS 2.7.5Petoukhov et al., 2012
Software, algorithmPython-based SAS-PRE scoring algorithmThis paperMethod section: Ensemble Scoring

Protein expression, labelling and purification

Request a detailed protocol

L7Ae (UniProtKB accession code Q8U160), Nop5 (Q8U4M1) and archaeal fibrillarin (Q8U4M2) were expressed, purified and reconstituted with sRNAs as described previously (Graziadei et al., 2016). Nop5 was expressed with the L113K and V223E mutations in order to prevent the formation of aggregates. Deuterated proteins were expressed in 100% D2O M9 minimal medium using 2H-glycerol as the sole carbon source. Deuterated proteins with 1H,13C-labelled ILV methyl groups were produced following protocols developed in the Kay laboratory (Tugarinov and Kay, 2003). Stereospecific pro-S 1H,13C-labelling of valine and leucine methyl groups was obtained by expression with the appropriate metabolic precursor according to the specifications of the manufacturer (TLAM-Iδ1LVproS; NmrBio). Leucine-specific labelling was achieved using the protocol described by Lichtenecker et al. (2013). All NMR samples were assembled with 2H-Nop5, and, in the case of 1H,13C -ILV methyl-labelled L7Ae, with both 2H-Nop5 and 2H-fibrillarin. The 2H(70%)-Nop5 sample for SANS experiments was obtained by expression in 100% D2O M9 minimal medium with 1H-glucose as the sole carbon source; deuteration levels for this sample were verified by MALDI mass spectrometry.

RNA synthesis

Request a detailed protocol

Guide-RNAs were produced by in vitro transcription from double-stranded plasmid DNA templates using T7 RNA polymerase produced in-house and rNTPs (Roth). RNAs were purified by denaturing 12–20% polyacrylamide gel electrophoresis, and extracted by electro-elution. For 2H-RNA samples, RNA synthesis was performed using 2H-labelled rNTPs (Silantes).

st-sR26: 5’-GCGAGCAAUGAUGAGUGAUGGGCGAACUGAGCUCGAAAGAGCAAUGAUGACGGAGGUGAUCACUGAGCUCGC-3’ st-sR26-1: 5’-CGAGCAAUGAUGAGUGAUGGGCGAACUGAGCUCGAAAGAGCAAUGAUGACGGAGGGGCGAACUGAGCUGCG-3’

st-sR26-2: 5’-CGAGCAAUGAUGAGUGAUGGGCGAACUGAGCUCGAAAGAGCAAUGAUGAGUGAUGUGAUCACUGAGCUGCG-3’ sR26: 5'-GCGAGCAAUGAUGAGUGAUGGGCGAACUGAAAUAGUGAUGACGGAGGUGA UCUCUGAGCUCGC-3’

Substrate RNAs for st-sR26 were produced in-house using synthetic DNA oligonucleotides:

Substrate D′: 5′-GCUUCGCCCAUCAC-3’

Substrate D: 5′-GUAGAUCACCUCCG-3’

st-sR26-1 substrate D: 5’-GUAUCGCCCCUCCG-3’

st-sR26-2 substrate D: 5’-GUAGAUCACAUCAC-3’

Transfer of NMR methyl-group assignments

Request a detailed protocol

In the free state, fibrillarin methyl resonances were stereospecifically assigned by means of 3D NOESY–13C-HMQC spectra, acquired on ILV and ILVproS-labelled samples, in combination with 3D TOCSY–13C-HMQC spectra and by comparison to the NOEs expected from the fibrillarin structure. The assignment was transferred stepwise from the free fibrillarin to the Nop5-NTD–fibrillarin complex, the Nop5–fibrillarin complex and finally to the full Box C/D complex. For the ILV-labelled Nop5-NTD–fibrillarin complex, we also acquired a 3D NOESY–13C-HMQC spectrum; for all complexes we acquired 13C-HMQC spectra on ILV-labelled, ILVproS-labelled and L-labelled samples. For the ILV-labelled Nop5–fibrillarin complex, pairings of HMQC peaks from the diastereotopic methyl-groups of leucine and valine residues were verified with the assistance of a 3D experiment in which the 1H and 13C resonances of the methyl groups were correlated with the 13C resonances of the directly bonded methine carbon (Cγ and Cβ for leucine and valine residues, respectively), thereby allowing methyl-pairs to be identified from their common methine resonance. The pulse-sequence for this experiment comprises an out-and-back magnetization-transfer-pathway starting and ending on the methyl protons, using COSY-type transfers between the methyl and methine carbons and constant-time chemical-shift evolution periods for both indirect 13C dimensions.

PRE measurements

Request a detailed protocol

Mutants were generated following the QUIKCHANGE-XL protocol (Agilent Technologies) and purified in the presence of 5 mM β-mercaptoethanol in order to prevent disulfide bond formation. For L7Ae, the native C68 was mutated to serine prior to the introduction of cysteine residues at other sites. The purified protein was then buffer exchanged into 50 mM NaPi, 500 mM NaCl, pH 6.6 using a HiPrep 26/10 desalting column (GE Healthcare) and eluted directly into tubes containing a 10-fold molar excess of the 3-(2-iodoacetoamido)-PROXYL radical (Sigma-Aldrich) in the dark. The spin-labelling reaction was allowed to proceed overnight at room temperature. Spin-labelled proteins were used for complex reconstitution; the free spin-label was removed during the gel-filtration step. The final reconstitution step was carried out in 100% D2O buffer (50 mM NaPi, 500 mM NaCl, pH 6.6), prior to concentration with a 10 kDa-cutoff Amicon centrifugal concentrator (Merck Millipore).

All substrate-loaded sRNPs were obtained by addition of 1.25 molar equivalents of substrate RNA. This ratio yields full saturation of the substrate RNA-binding sites of the guide RNA. We verified this by monitoring the appearance of peaks indicative of free RNA (sharp peaks) in one-dimensional 1H spectra of the sRNP upon addition of increasing concentrations of substrate RNA. Sharp peaks began to appear after a 1:1 molar ratio of substrate:guide RNA was reached.

13C-HMQC spectra were acquired on Bruker Avance 800 and 850 MHz spectrometers, equipped with TCI cryoprobes, at 55°C with sample concentrations between 10 and 40 μM (2–8 mg/ml). Diamagnetic spectra were recorded after reduction of the spin-label by addition of ascorbic acid to a final concentration of 5 mM.

All spectra were processed using apodization with an exponential function in order to preserve Lorentzian line-shapes. Peaks were fitted with the program FUDA (http://www.ucl.ac.uk/hansen-lab/fuda/) assuming Lorentzian line-shapes. When necessary, overlapped peaks were fitted as groups. The fitted volumes and line-widths were then converted into peak-heights. The heights in the paramagnetic and diamagnetic states were used to calculate the distance between the nitroxide group of the paramagnetic tag and the respective methyl-group (see below).

The diamagnetic R2 rates corresponding to the transverse relaxation rates of 1H single-quantum coherence (R2diaH) and 1H-13C multiple-quantum coherence (R2diaHC) of each individual peak were quantified using the pulse-schemes from the Kay laboratory (Tugarinov and Kay, 2006; Tugarinov and Kay, 2013), modified to remove the fast-relaxing-component purging-element. Relaxation delays were 0, 2, 3, 4, 6, 7, 10 and 16 ms for fibrillarin, and 0, 2, 3, 4, 6, 7 and 10 ms for L7Ae. The peak-heights were fitted to a mono-exponential decay function to extract R2diaH and R2diaHC.

In order to derive the correlation-time for the electron-nucleus interaction vector, τC, we quantified paramagnetic (Ipara: oxidized, paramagnetic state of the spin-label) and diamagnetic (Idia: reduced, diamagnetic state of the spin-label) peak-heights corresponding to known distances within fibrillarin in complexes reconstituted with the Fib-R109C mutant. For L7Ae, we used known distances between the Nop5-CTD and L7Ae in complexes reconstituted with the Nop5-S343C mutant. The ratios of peak-heights were converted into PREs (Γ2), using Equation 2 and the R2diaHC and R2diaH rates measured for the respective peaks.

(2) IparaIdia=exp-Γ2tHMQCR2diaHR2diaHCR2diaH+Γ2R2diaHC+Γ2

where tHMQC represents the magnetization transfer time in the HMQC sequence (7.6 ms). As this equation is non-invertible, Γ2 was derived by plotting the simulated bleaching ratio, Ipara/Idia, as a function of Γ2 for a given set of diamagnetic rates, with the experimental errors on Ipara/Idia, R2diaH and R2diaHC used to determine the upper and lower bounds of the derived PRE. These PREs were then used as restraints in the protocol developed in the Clore Lab (Iwahara et al., 2004), which optimizes an ensemble of multiple spin-label conformations in combination with τC. For L7Ae, we used isoleucine resonances only. The minimization was run using the recommended ‘obsig’ setting for the weighting of the different PREs. After minimization of 20 structures, τC was 51.8 ± 5.7 ns for fibrillarin and 50.4 ± 9.4 ns for L7Ae.

For a given value of τC, distances r between the unpaired electron and the methyl protons were extracted from the equation:

(3) r=KΓ2(4τC+3τC1+ω2τc2)6

where K is a constant (1.23×10-23 cm6 s−2) and ω is the proton Larmor frequency in rad/s. The errors on the distances were again estimated by using the errors in τC, experimental Ipara/Idia ratios and R2 rates to yield upper and lower bounds on a calibration curve. A lower-bound of 10% was used for the errors of Ipara/Idia, as recommended by Battiste & Wagner (Battiste & Wagner, 2000). A lower-bound of 2 Å was imposed for the errors on the distances in order to account for tag flexibility. Finally, a minimum error of −4 Å was used as lower bound for the distances extracted from the PRE ratios in the calculation of the [on,off]-structures, to account for the possibility that the methyl group of only one fibrillarin copy is close the paramagnetic tag: in this case, the effective distance of the methyl group of the one fibrillarin copy to the paramagnetic tag would be smaller than the distance calculated from the sum of the two overlapping fibrillarin peaks (one with PRE intensity-ratios < 0.8 and one with PRE intensity-ratios close to 1).

In the structure calculations (CNS), distances were imposed from the nitrogen atom of the nitroxide group of the paramagnetic tag to the carbon atoms of fibrillarin methyl groups. For L7Ae, where stereospecific assignment of LV methyl groups was not available, the distance restraint was imposed to both methyl group carbons with an ‘OR’ statement. For complexes with both fibrillarin copies positioned away from the RNA, the same set of distance restraints was imposed on each fibrillarin copy; for complexes with one fibrillarin copy close to the RNA, distance restraints were imposed with an ‘OR’ statement.

Small-angle X-ray scattering (SAXS)

Request a detailed protocol

Box C/D sRNPs reconstituted in 50 mM NaPi pH 6.6, 500 mM NaCl were recorded at 40°C and concentrations varying from 0.4 to 5 mg/ml, unless otherwise specified. In most experiments a temperature of 40°C instead of 55°C was used for SAXS measurements due to the difficulty in collecting data with high salt concentrations at the higher temperature. For all measurements, 2 mM dithiothreitol (DTT) was added to mitigate radiation damage. Data collection was performed at the ESRF bioSAXS beamline BM29 with exposure of 10 frames each of 1 s duration. The curves were compared, merged, and the buffer contribution subtracted by the beamline software BsxCube (Pernot et al., 2013). Forward scattering intensity I(0) values were normalized relative to an ideal protein in an ideal solution, and were reported as 288, 194, 215 and 197 for the apo st-sR26 RNP, the substrate D’-bound st-sR26 RNP, the substrate D-bound st-sR26 RNP and the holo st-sR26 RNP, respectively, all at 5 mg/ml. The Rg and I(0) values were extracted according to the Guinier approximation using PRIMUS in ATSAS 2.7.5 (Konarev et al., 2003). All Rg values were computed using an s.Rg upper limit of 1.3 (where s is the modulus of the scattering vector), as recommended for globular particles.

To estimate the compatibility of the experimentally determined Rg values with the mono- or di-RNP assembly states, we evaluated the theoretical Rg distributions of 5000 di-RNP models in both apo and holo conformations from Lapinaite et al. (2013) and 500 half-loaded mono-RNP models generated in both [on,off] and [off,off]-states using the torsion-angle simulated-annealing protocol described below. The apo di-RNP showed a mean Rg value of 55.9 Å with a standard deviation (SD) of 2.0 Å; the holo di-RNP showed a mean Rg of 58.1 ± 3.6 Å; the [on,off]-state of the mono-RNP showed a mean Rg of 44.7 ± 1.4 Å; and the [off,off]-state of the mono-RNP showed a mean Rg of 48.5 ± 1.7 Å (Figure 1—figure supplement 3).

Small-angle neutron scattering (SANS)

Request a detailed protocol

2H-L7Ae, 2H-Nop5, 2H-fibrillarin, 2H-RNA, 2H-fibrillarin/2H-RNA and 2H(70%)-Nop5/2H-RNA samples were measured in 50 mM NaPi pH 6.6, 500 mM NaCl, 42%:58% D2O:H2O solutions, in order to mask the contribution of the 1H-proteins. The curves corresponding to 2H-L7Ae, 2H-Nop5, 2H-RNA and 2H(70%)-Nop5/2H-RNA were acquired at D22 at the Institute Laue Langevin (ILL, Grenoble, France), with a neutron wavelength of 6 Å. The 2H-fibrillarin and 2H-fibrillarin/2H-RNA curves were acquired at KWS-1 at JCNS (Munich, Germany) (Feoktystov et al., 2015) with a neutron wavelength of 5 Å. Both instruments were configured with sample-detector distances of 4 m and collimation lengths of 4 m. Data reduction and radial integration were done with standard procedures using beamline-specific software. Buffer subtraction was done in PRIMUS. Pair-wise distance-distribution functions P(r) were calculated from experimental data using GNOM in ATSAS 2.7.5 (Svergun, 1992). All SANS curves were acquired at 55°C.

Structure calculation and selection

Request a detailed protocol

Structures were calculated using an adapted version of the protocol described in Lapinaite et al. (2013); Nilges, 1995 according to the workflow described in Figure 3—figure supplement 1. The starting st-sR26 RNA structures, bound to either substrate D or substrate D’, were generated in separate calculation runs using restraints to impose an A-form helical geometry on the substrate–guide duplex, and to yield the appropriate K-turn structures. Starting protein conformations were generated from the PDB entry 3nmu and assembled into two L7Ae–Nop5–Fib protomers, in which the L7Ae–Nop5-CTD and Nop5-NTD–Fib interaction interfaces of 3nmu were preserved, but not the relative orientation of the Nop5-NTD and CTD, which were randomised. The two copies of the protomers within the sRNP were separated and randomly rotated with respect to each other. The building-blocks L7Ae–Nop5-CTD, Nop5-NTD–Fib and the Nop5 coiled-coil domain were kept rigid throughout the calculations. Structures were calculated for both the substrate D- and substrate D’-loaded sRNPs. For each sRNP the proteins and RNA were subjected to two sets of parallel torsion-angle simulated-annealing procedures; one included a set of restraints positioning one fibrillarin copy on the methylation site of the substrate–guide duplex ([on,off]-state); in another no restraints were imposed between fibrillarin and the RNA ([off,off]-state). The conformational sampling was driven by PRE-derived distance restraints, distance restraints positioning the two L7Ae–Nop5-CTD modules onto the RNA K-turns and a loose distance restraint between the centres of mass of the two L7Ae modules (90 ± 15 Å), which was derived from the P(r) curve of 2H-L7Ae in 42%:58% D2O:H2O. Restraints positioning the Nop5-α9’ helix between the two guide regions (from Nop5-K301 and K304 to the phosphate backbone of the nucleotide linking the K-turn and substrate–guide helix) were also used. With this set up, we started an iterative procedure, to generate two lists of PRE-derived distance-restraints compatible with either the [on,off]- or [off,off]-state. 500 structures were calculated per iteration. At the end of each iteration, restraint violations were evaluated: restraints violated by more than 10 Å in either set of calculations were classified, eliminated from that particular set, but kept in the other. After 5 iterations, this led to two restraint-lists per sRNP, corresponding to the [on,off]- and [off,off]-states of the sRNP.

With these four sets of restraints (two for the substrate D-loaded and two for the substrate D’-loaded sRNP), four separate runs of torsion-angle simulated-annealing calculations were performed; we generated 2500 structures per run, using the settings described in Lapinaite et al. (2013).

The fitness of the experimental SAS and PRE data with respect to the calculated structures was assessed by calculation of the χ2 statistic (Equation 4) and by visual inspection of fits between back-calculated and experimental data:

(4) χ2=1Ni=1NIexpsi-cIcalcsiσsi2

where Icalc represents the back-calculated data-point (Ipara/Idiaintesity-ratios or SAS intensities), Iexp is the corresponding experimental value, N is the number of experimental points, σ represents the experimental error and c is the scaling factor:

(5) c=i=1NIexpsiIcalcsiσsi2i=1NIcalcsiσsi2

The structures ranking in the top 2% in both total energy and restraint energy were selected. To further narrow down the selection on the basis of the SAS data, we evaluated the χ2 distribution of the 2H-Nop5, 2H-L7Ae and 2H-RNA SANS curves. The SAS curves including the contribution from fibrillarin were left out, because we expected the position of fibrillarin to be variable when it is not in contact with the RNA. SAS fitness was calculated with the programs CRYSOL and CRYSON, from the ATSAS suite, version 2.7.5 (Svergun et al., 1998). Based on the distribution of fitness for all structures in each of the runs, we set loose cut-offs, which excluded only structures beyond the smooth, linearly increasing portion of the distribution curve. For the substrate D’-loaded complex, we selected structures within the top 90% ranking by 2H-RNA fitness χ2 <1.4 χ2min in the [off,off]-state, χ2 <2.2 χ2min in the [on,off]-state), the top 50% by 2H-L7Ae fitness (χ2 < 1.3 χ2min in the [off,off]-state, χ2 <1.3 χ2min in the [on,off]-state), and the top 80% by 2H-Nop5 fitness (χ2 < 6.1 χ2min in the [off,off]-state, χ2 <6.8 χ2min in the [on,off]-state); for the substrate D-loaded complex, we selected structures within the top 90% ranking by 2H-RNA fitness (χ2 < 2.5 χ2min in the [off,off]-state, χ2 <3.4 χ2min in the [on,off]-state), the top 80% by 2H-L7Ae fitness (χ2 < 2.0 χ2min in the [off,off]-state, χ2 <1.8 χ2min in the [on,off]-state) and the top 90% by 2H-Nop5 fitness (χ2 < 6.5 χ2min in the [off,off]-state, χ2 <5.3 χ2min in the [on,off]-state). The average pair-wise RMSD of the structures of each ensemble, calculated over the Cα and P atoms of the protein and RNA structured domains, including the fibrillarin units not bound to the RNA, was below 5 and 7 Å for the [on,off] and [off,off] conformers, respectively, with a maximum RMSD value of less than 10 Å in all cases.

Among the selected structures of each of the four runs ([on,off]- and [off,off]-states of both substrate D- and substrate D’-loaded sRNPs), the one with the lowest restraint-violation energy that maintained the correct RNA topology was chosen as the starting point for refinement in Cartesian space. The four refinement runs comprised 1500 structures each spanning up to 10 Å RMSD of Cα and P atoms relative to the starting structure (number calculated for the substrate D’-loaded [on,off]-state). At the end of the refinement, we applied stringent selection criteria with respect to the SAS curves and loose criteria with respect to the energy. The cut-offs for the SAS data were set upon visual inspection of the χ2 distributions for each run and curve, whereby we allowed more structures to be selected when the χ2 distribution was flat.

For the substrate D’-loaded sRNP the cut-offs are as follows: top 33% of restraint-violation, van der Waals and total energy; top 83% for 2H-RNA (χ2 < 1.3 χ2min for the [off,off]-state, χ2 <2.0 χ2min for the [on,off]-state); top 67% for 2H-L7Ae (χ2 < 1.8 χ2min for the [off,off]-state, χ2 <1.1 χ2min for the [on,off]-state); top 33% for 2H-Nop5 (χ2 < 2.7 χ2min for the [off,off]-state, χ2 <3.4 χ2min for the [on,off]-state); top 10% for 2H(70%)-Nop5-RNA (χ2 < 6.1 χ2min for the [off,off]-state, χ2 <3.3 χ2min for the [on,off]-state). Applying these criteria we selected 1 structure for the substrate D’-loaded [off,off]-state and 12 structures for the [on,off]-state. The [on,off]-state structures displayed an average RMSD of 2.4 Å, calculated on all Cα and P atoms (Figure 3) excluding the fully flexible regions, namely the free guide region of the RNA (nucleotides 51–62), the loops connecting the Nop5-NTD to the coiled-coil domain (residues 116–122), and the loops connecting the coiled-coil domain to the Nop5-CTD (residues 249–251).

For the substrate D-loaded sRNP the cut-offs are as follows: top 33% of restraint-violation, van der Waals and total energy; top 83% for 2H-RNA (χ2 < 2.1 χ2min for the [off,off]-state, χ2 <2.2 χ2min for the [on,off]-state); 66% for 2H-L7Ae (χ2 < 2.7 χ2min for the [off,off]-state, χ2 <1.2 χ2min for the [on,off]-state); 33% for 2H-Nop5 (χ2 < 3.0 χ2min for the [off,off]-state, χ2 <1.9 χ2min for the [on,off]-state); 10% for 2H(70%)-Nop5-RNA (χ2 < 3.2 χ2min for the [off,off]-state, χ2 <2.8 χ2min for the [on,off]-state). The final ensembles for the substrate D-loaded [off,off]- and [on,off]-states consist of 3 and 20 structures, respectively, with a Cα and P RMSDs of 4.6 and 2.4 Å, respectively.

Representative structures in the final ensembles were minimized in explicit water using Amber 14 and the corresponding Amber99SB force field (Hornak et al., 2006).

Ensemble scoring

Request a detailed protocol

The PRE data and the SAS data indicated the presence of a conformational equilibrium between the [on,off]- and [off,off]-states, as discussed in the main text. The 2H-fibrillarin, 2H-fibrillarin/2H-RNA SANS and SAXS curves were therefore fitted to a mixture of structures in the [on,off]- and [off,off]-states.

In order to address the flexibility of the Nop5-NTD–fibrillarin modules not in contact with the RNA, we sought to generate ensembles containing different orientations of these modules that would improve the fit to the SAS curves. This conformational diversity is in addition to the equilibrium between the [on,off]- and [off,off]-states, resulting in a pool of structures containing both [on,off]- and [off,off]- states and multiple conformations of Nop5-NTD–fibrillarin modules in each state.

To generate these ensembles we proceeded as follows. Starting from the representative structure of each ensemble of Figure 3, corresponding to the structure closest to the mean of the ensemble, we performed a further simulated-annealing step, where the loops connecting the Nop5-NTD–fibrillarin modules to the rest of the Box C/D particle were allowed to adopt random orientations, while the rest of the particle was kept rigid. At this stage, we generated 4000 structures with randomised Nop5-NTD–fibrillarin positions, from which we removed structures containing steric clashes. The structures also contained all spin-labels, which were left flexible, in order to allow back-calculation of PREs (see below).

In a separate run comprised of 300 structures, the template structures were kept entirely rigid while the spin-label side-chains were allowed to rotate in order to generate different orientations, as multiple conformations of the spin-label have been demonstrated to fit the PRE data more accurately than a single conformation (Iwahara et al., 2004).

Ensemble scoring was carried out for substrate D’- and substrate D-loaded sRNPs via the pseudo-genetic algorithm shown in Figure 3—figure supplement 1b. First, we grouped the structures into four pools, containing 3500, 3500, 300 and 300 structures: [on,off]-state with randomised Nop5-NTD−fibrillarin positions, [off,off]-state with randomised Nop5-NTD−fibrillarin positions, [on,off]-state with randomised spin-label orientations and [off,off]-state with randomised spin-label orientations. The algorithm generated four ‘parent’ ensembles, each comprising of 2–10 conformers randomly chosen from the pools. These ensembles were merged and sub-sampled, yielding 20 ‘children’ sub-ensembles ranging from 3 to 10 conformers in size. Each sub-sampling event had a 30% probability of duplicating a conformer or replacing one with another from the main pool. The process of parent selection, sub-sampling and scoring was repeated 250 times.

The theoretical scattering curve of the ensemble was computed as the linear combination of the scattering curves of each individual conformer (scaling the populations to represent molar fractions rather than volume fractions, which is the standard ATSAS output). The χ2 value with respect to the experimental data was calculated by OLIGOMER (Konarev et al., 2003). The normalization of χ2 of all sub-sampled ensembles and across iterations was done according to Equation 6 (Karaca et al., 2017):

(6) χnorm2=χensemble2-χmin2χmax2-χmin2

where χ2ensemble is the fitness of an individual ensemble, and χ2min and χ2max are the respective minimum and maximum values across the iterations or sub-ensembles being considered. Five SAS curves were used for scoring: 2H-Nop5, 2H-Fib, 2H-Fib/2H-RNA, 2H(70%)-Nop5/2H-RNA and SAXS. The normalized χ2 values for each curve were then summed and renormalized into a single value, obtained with the same Equation 6, which then represented the overall SAS-fitness.

The calculation of the theoretical Ipara/Idia ratios from mixed [on,off]- and [off,off]-state ensembles requires an estimation of the timescale of the exchange rate kex (kex = k1 + k-1) between the [on,off]- and [off,off]-conformers. This can be easily done by inspecting the ILV-methyl 1H-13C spectra of fibrillarin: in the case of slow conformational exchange, the methyl groups in the fibrillarin copy sampling the on- and off-states should each yield two separate NMR peaks, while for fast conformational exchange these methyl groups should each show only a single peak, at a position corresponding to the population-weighted average of the positions corresponding to the on- and off-states. To investigate this, we used the spectrum of the RNP assembled with ssR26 and loaded with substrate RNA as a reference for the slow-exchange situation: in this complex, two of the four fibrillarin copies adopt a stable on-state, while the other two are in the off-state, and a subset of the fibrillarin methyl groups show separate and resolvable peaks corresponding to the two states. In the spectra of the half-loaded st-sR26 RNPs we did not detect any peak at the positions corresponding to RNA-bound fibrillarin in the holo ssR26 RNP spectra, indicating that in the half-loaded mono-RNP, either the kex is faster than the differences in the resonance frequencies of the fibrillarin methyl groups in the on- and off-states (~40–100 Hz), or the population of the on-state is too small to be detected. In the second case, one would expect no CSPs upon substrate RNA binding, which does not correspond with the observed spectra (Figure 2a, right panel), Thus, we back-calculated the PREs for the fibrillarin copy that can be in contact with the substrate–guide duplex using <r−6>ensemble averaged distances over the [on,off]- and [off,off]-states, as appropriate for the fast exchange regime.

Each methyl group of each fibrillarin or L7Ae copy is influenced by two PRE tags (SL1 and SL2). The resulting Γ2 values for the methyl groups of the two copies are given by:

Γ2Methyl1=Γ2,SL1Methyl1+Γ2,SL2Methyl1
(7) Γ2Methyl2=Γ2,SL1Methyl2+Γ2,SL2Methyl2

where Methyl1 and Methyl2 refer to the two copies of L7Ae or fibrillarin. Because Methyl1 and Methyl2 have almost indistinguishable chemical shifts, the resulting Ipara/Idia ratios for Methyl1 and Methyl2, calculated from Equations (2) and (3), were averaged before comparison to the experimental data. The PRE fitness was quantified using χ2 to all experimental PRE values using Equation (4). Distances were computed from the PDB files using the Biopython Bio.PDB module (Cock et al., 2009). The fitness of PRE data was normalized using Equation (6) and summed with the SAS-fitness score, to yield a consensus PRE-SAS score for each ensemble within the 20 sub-sampling events, and across the 250 iterations.

Three independent runs of the scoring algorithm were performed for substrate D’- and substrate D-loaded sRNPs, with the top scoring ensemble, judged by the consensus PRE-SAS score, displayed in Figure 4.

After this selection, the conformations of each individual tag were refined by generating additional 3000 conformers per tag and by using the same pseudo-genetic algorithm to select the ensembles of tag conformations that best fitted each individual PRE dataset. During this refinement step the positions of all proteins and RNA, as well as the populations of fibrillarin conformers in the ensemble, were left invariant, in order not to alter the fit to the SAS data.

Molecular dynamics

Request a detailed protocol

Molecular dynamics simulations of the substrate D’- and substrate D-bound structures representing the [on,off]-states were carried out in AMBER 2018 (Case et al., 2018). The simulations were carried out in explicit TIP3P water using a cubic box with a 14 Å water layer and the ff14SB parameter set. The system was subjected to 20,000 cycles of solvent minimization with positional restraints on the complex (NPT), followed by heating to 328 K (NVT). The complete system was subjected to an additional 20,000 cycles of energy minimisation, and then allowed to relax, keeping restraints on the proteins and heavy atoms (NPT at 328 K, 0.5 ns). Subsequently, the two structures were subjected to a 150-ns molecular dynamics. Contacts were extracted using CPPTRAJ (Roe and Cheatham, 2013).

References

  1. 1
  2. 2
  3. 3
  4. 4
  5. 5
  6. 6
    AMBER 2018
    1. DA Case
    2. IY Ben-Shalom
    3. SR Brozell
    4. DS Cerutti
    5. I Cheatham
    6. TE Cruzeiro
    (2018)
    San Francisco: University of California.
  7. 7
  8. 8
  9. 9
  10. 10
  11. 11
  12. 12
  13. 13
  14. 14
  15. 15
  16. 16
  17. 17
  18. 18
  19. 19
  20. 20
  21. 21
  22. 22
  23. 23
  24. 24
  25. 25
  26. 26
  27. 27
  28. 28
  29. 29
  30. 30
  31. 31
  32. 32
  33. 33
  34. 34
  35. 35
  36. 36
  37. 37
  38. 38
  39. 39
  40. 40
  41. 41
  42. 42
  43. 43
  44. 44
  45. 45
  46. 46
  47. 47
  48. 48
  49. 49
  50. 50
  51. 51
  52. 52
  53. 53
  54. 54
  55. 55

Decision letter

  1. Lewis E Kay
    Reviewing Editor; University of Toronto, Canada
  2. Philip A Cole
    Senior Editor; Harvard Medical School, United States

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

A relatively large macromolecular RNA protein complex, Box C/D RNP, was known to methylate ribosomal RNA at ribose 2'-OH, but how this activity is regulated has been poorly understood. The authors show here that the substrate RNA sequence outside of the methylation site can impact C/D RNP mediated interactions and rRNA methylation. This work provides a strong example of the power of using multiple biophysical methods in addressing important structural biology problems, as well as a compelling example of the role of NMR in such studies.

Decision letter after peer review:

Thank you for submitting your manuscript, "The guide sRNA sequence determines the activity level of Box C/D RNPs" for consideration in eLife. Your manuscript has been reviewed by two experts and both they and I are impressed with the range of advanced methodologies that has been used to study such a large complex by NMR. We also feel that this can be an important contribution should you and your colleagues address the concerns listed below. Please note that further extensive experiments are not required, but that conclusions must be strongly supported by experiment.

The reviewers, which have opted to remain anonymous, have discussed the reviews with one another, and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Major

1) Previous work by the group described a di-RNP complex in the presence of substrate (holo di-RNP); this work describes a mono-RNP complex in the presence of substrate (holo mono-RNP). The authors mention this but don't provide details. The questions that arise are: a) why are the previous and the present complexes different, is there a specific RNA contact that drives the formation of the di-RNP? and b) in case small differences in the guide RNA have a pronounced effect on the fully assembled complex, what stoichiometry do the authors expect for the Box C/D complex with the native guide RNA?

2) The substrate RNA that the authors use is significantly shorter than the natural substrate RNA (rRNA is long). Are the presented structures compatible with much longer substrate RNAs and is the proposed mechanism between the methylation efficiency of D' and D substrates also valid in the context of longer substrates? Can the terminal bases of the 3' and 5' ends of the substrate RNA that is used here adopt the same conformation when extra nucleotides are present?

3) The authors should be more accurate regarding the description of the RNA that they used. In the text and Materials and methods, the authors write that they use a stabilized version of the sRNA (for what reason, this is not explained, see also remark 1). In addition, Figure 1B displays two sRNA sequences (the WT and the stabilized one), the legend refers to this as "Sequence of the sRNAs used…" We assume the authors mean that the lower one of the two is the used sRNA sequence. Likewise, Figure 2 refers to a ssR26 species. We assume that this is the ss-sR26 RNA, the RNA that was used in a previous study. Such inaccuracies often cause confusion regarding the type of the RNA that is used. One suggestion is to add this info to the cartoons in Figure 1A, where cartoon of the di-RNP-apo, mono-RNP-holo and di-RNP-holo complexes are shown.

4) "In the di-RNP complex, the methyl-group NMR spectra indicated the presence of fibrillarin in two states, one close to the substrate-guide duplex (on-state) and one far from the substrate (off-state) (Figure 2A)." We don't see this. In Figure 2A, the NMR spectra of the complex assembled with ss-sR26 (previous study; cyan) and the complex assembled with st-sR26 (this study, blue) are overlaid. According to SAXS data these are both di-RNPs. However, we don't see any sign of the presence of two states in those NMR spectra. In addition, later in the manuscript, the authors conclude that the RNP is dynamic in the presence of substrate, whereas the displayed spectra are from complexes in the absence of substrate.

5) "In contrast, both half-loaded mono-RNPs (substrate D- bound and substrate D'-bound) show only one set of fibrillarin resonances; however, the peaks that are split in two in the di-RNP are noticeably elongated in the mono- RNP, suggesting a conformational equilibrium between on- and off-states interchanging at a rate that is fast compared to the corresponding difference in NMR frequencies." Again, we are not following this:

a) In Figure 2B, we see one resonance for the st-sR26 apo complex (blue, same spectrum as panel a. This contradicts with statement above, that the apo-di RNP samples two states.

b) For the st-sR26 holo complex (green, mono-RNP holo) (is this after addition of both D and D' or after either D or D'?), we see one maybe slightly broadened resonance. For the ssR26-holo (purple, di-RNP holo) complex one can imagine two resonances (but the ssR26 RNA is not really the subject to this study). This could suggest that the holo complexes (mono- or di-RNA) are actually sampling multiple states. The text, however, mentions that the half-loaded mono-RNPs only show one set of resonances.

6) We also don't fully understand the statement in the legend of Figure 2B "a single but elongated peak, characteristic of one copy of fibrillarin being in fast-exchange between the two states", refering to the green contours from the mono-RNP holo complex. In case there is fast exchange between 2 states, the green resonance should appear between the two purple resonances. In our view, this is not the case; the broad resonance appears exactly at one of the two signals that form the purple "doublet". Also, the broadening of the green resonance appears to be mainly in the carbon dimension, and not in the proton dimension. Are the spectra recorded with exactly the same experimental settings, or is it possible that differences in processing, viscosity, temperature or acquisition time is the cause of the differential linewidth? In summary, the conclusion regarding the dynamics of the complex seems not supported by the displayed NMR spectra.

7) Does Figure 2C belong to the st-sR26 holo complex in Figure 2B? If yes, this should be clearly indicated.

8) Figure 2D, top. How can Ipara/Idia be (significantly) larger than 1? For readability, please label in the figure that the blue curve is for D' bound and that the red one is for D bound and indicate that this is the mono-RNP holo complex.

9) Do the Ipara/Idia values depend on the excess of the D or D' RNA? Are the binding sites for D and D' in both cases fully saturated? We assume that less saturation results in more mobility, so it is important to ensure that one compares fully saturated complexes.

10) For Figure 2D, bottom, the deviations between the experimental and simulated P(r) values are similar for the short distances (0 – 40 A) and for the large distances (70 – 140 A). The deviations for the small r values are independent of conformational changes and these are thus intrinsic to the method. The question is, if the deviations at larger r values are thus significant enough that one can conclude from these SANS data that there are two states? If yes, does this agree with the population ratio of the [on, off] and [off, off] states that the authors determined later in the manuscript?

11) Subsection “The conformational ensembles of the half-loaded mono-RNPs in solution” paragraph three: "show a reasonable similarity". The authors need to provide at least an RMSD and an overlay of their two structures with the structure of Lin et al. to support this statement.

12) In context of sRNP complex, the affinity of fibrillarin for substrate depends on RNA sequence beyond recognition site

– It is unclear what the recognition site is. Referring to substrate sequences? Without a clearly defined recognition site, this claim is not supported by the data

– Do the authors mean methylation site rather than recognition site as specified in the Discussion? If this is the case, the Abstract/Introduction should be modified to reflect this level of detail.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for submission of a revised version of your manuscript "The guide sRNA sequence determines the activity level of Box C/D RNPs" to eLife. The paper has been re-evaluated by the reviewers. The reviewers feel that the manuscript is much improved over the initial submission but bring up a number of critical points that must be addressed properly should the manuscript be accepted for publication.

In short, there are serious questions about the NMR data. For example, it is extremely difficult to establish that a dynamic equilibrium is occurring between on/off conformations for one of the copies of fibrillarin from the CSP data. Second, it would be useful to show that the PRE data are inconsistent with the off state and presumably also with the on state. The authors attempt to say this and refer to Figure 2C but it is not possible to see much from this figure, as it is small. It would be important to show the structures of the on and off state along with the PRE values and establish that the PRE values can simply not be explained by a static entity. We suggest separating the on and off structures pictorially to make the point. Second, given that some of the PREs are > 1 it is important that they be validated. One way of doing this is to establish that intra-protein PREs (i.e., PREs from probes attached to the same protein as spin label) are consistent with expectations based on distances.

Major comments

Previous points #4, 5 and 6; Legend Figure 2B: "Despite the absence of peak splitting, peak budding is visible in the spectrum of the substrate D'-loaded st-sR26 RNP in all instances for which peak splitting or budding is observed in the spectrum of the holo ssR26 di-RNP."

1) This figure and its legend have been remade for this version of the manuscript. Nevertheless, we cannot follow what the authors see in their spectra. The only instance, where we can see a peak splitting (not sure if "budding" is a term that can be used), is in the purple (ssR26 holo) spectrum for residue L58 and, with some imagination, for I62. For all other panels and all other complexes, we fail to see any change in the resonance. In other words, for V118, L160, V141 and L200, there are effectively no chemical shift perturbations that are larger than 0.01 ppm in proton. In carbon there is no CSP whatsoever. For the open-closed conformations, one would expect many and large CSPs.

This is a very important point, as the authors have little convincing NMR data that show that the complex exists in two states with fibrillarin either in the on- or the off-state.

2) Previous point #8; Figure 2C. The PRE data with an Ipara/Idia still are very problematic. It is physically not possible that this ratio is larger than 1. In the plot, ratios that are 1.25 are shown. The authors write in their letter that there are only two values over 1, in the plot there appear to be at least 6. The remark that it is hard to extract reliable data for crowded spectral regions is correct, however, in case the data are unreliable or highly uncertain (which is not reflected in the presented error bars), it is not possible to draw solid conclusions. So how do the authors come to their conclusions based on the presented data? As an example, a ratio of 0.75 is mentioned in the text as highly relevant for the closed conformation, whereas a ratio of 1.25 is rebutted (only in the letter, not in the manuscript) as resulting from difficult spectral interpretations.

As the NMR data do not agree with SANS data, the authors involve conformational rearrangements between open and closed conformations. Clearly, in case things are not in agreement, it can always be explained by dynamics. Because of the very weak NMR evidence (2 points above), it seems a far stretch to conclude that the complex undergoing open-closed motions.

Previous point #10; Figure 2D. The figure is still hard to understand. The substrate D loaded experimental (red) and theoretical (brown) are very different. The substrate D' loaded experimental (cyan) and theoretical (blue) are also very different.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "The guide sRNA sequence determines the activity level of Box C/D RNPs" for further consideration by eLife. Your revised article has been evaluated by Philip Cole (Senior Editor) and a Reviewing Editor.

Thank you for your letter addressing the comments that were raised based on the second revision and you are invited to resubmit a revised manuscript. Let me first address whether it is worth submitting a further revised paper. I agree that it is a lot of work on all sides – but there was very significant enthusiasm from the reviewers and me, and I think that we would all like to see some final issues clarified.

First, when there are issues raised by the reviewers that seem unreasonable to the authors, this sometimes reflects the fact that the material was not explained as clearly as might be possible. One of the advantages of eLife is that space is not nearly as constrained as in some other prestigious journals. I would suggest taking advantage of this. For example, include validation of the PREs (intra) for both fibrillarin and L7Ae, as a main figure, to convince the reader that your data are good (I believe that this will placate the reviewers). Then if you fit your inter-data to 1 structural model you should be able to show that the fit is worse. We request that you do this, as then you establish that there could be more than 1 structure. We request that you also include structural models of the sort in Figure 2C but larger so that these can be better viewed. Following this up with the SAXS data will then make your thought process transparent. As the reviewers had difficulty with the SAXS please make it very clear that the disagreement is a really powerful argument for why there are dynamics.

We suggest that the NMR CSPs are less convincing because they are small. We recommend that you might emphasize this before the PREs and SAXS (your ordering seems very logical to me) but tone it down a bit as the differences in shifts are not large. One can sort of imagine that the green peaks are between blue and purple but we are talking about very small changes. If you say this as a follow-in to the other data, we think that it may sit easier with the reviewers.

Finally, regarding the errors in PREs. we can appreciate that some values could well be > 1 given statistics. One would hope that the errors would reflect this. For a couple of cases that we count, this is not the case. Perhaps if you show more PRE data this will help – and as you will hopefully be showing correlation plots assuming 1 state and that these are not as good as the intra-validation – showing more raw data would be justified.

https://doi.org/10.7554/eLife.50027.sa1

Author response

Major

1) Previous work by the group described a di-RNP complex in the presence of substrate (holo di-RNP); this work describes a mono-RNP complex in the presence of substrate (holo mono-RNP). The authors mention this but don't provide details. The questions that arise are: a) why are the previous and the present complexes different, is there a specific RNA contact that drives the formation of the di-RNP? and b) in case small differences in the guide RNA have a pronounced effect on the fully assembled complex, what stoichiometry do the authors expect for the Box C/D complex with the native guide RNA?

We thank the reviewers for this question. We have thoroughly investigated this point during the course of this work, as the discovery that the complexes assembled with the sR26 and st-sR26 RNAs form monomeric species upon substrate binding was unexpected.

With regard to point b): RNPs assembled with both sR26 and st-sR26 RNAs transition from the di-RNP to the mono-RNP form upon substrate binding. To demonstrate this, we have added Figure 1—figure supplement 4, which shows the SAXS curves of apo and holo-complexes assembled with sR26.

With regard to point a): our experiments revealed that, while the apo complex is always a di-RNP, the oligomerization state of the holo complex depends on the sequence of the guide RNA in the substrate recognition regions. In particular, the sequences directly upstream of either box D or box D’ play a fundamental role (see new Figure 1—figure supplement 5). Unfortunately, because of the lack of high-resolution data for the disordered parts of the RNA components of the complex, we do not yet know why this is the case and cannot propose a predictive set of rules. We are planning solid-state NMR experiments to answer this question. However, these experiments have never been tried before and are likely to require substantial experimental effort.

Following the question of the reviewers we have now added new experimental data (Figure 1—figure supplement 5) and an entire new paragraph to address this point.

“Before embarking upon the structural study of the sRNPs containing st-sR26, we first wanted to understand which elements are responsible for the different oligomerization states of the holo ssR26- and holo st-sR26-RNPs. The ssR26 and the st-sR26 RNAs differ only in the sequence of the guide RNA at the box D position, which in the case of ssR26 is identical to that of guide D’. Thus, we generated two additional guide RNAs with distinct D and D’ sequences, st-sR26-1 and st-sR26-2: in st-sR26-1 (st-sR26-2), guide sequence D is a chimeric sequence, formed by the 5’ half of st-sR26 guide D (st-sR26 guide D’) and the 3’ half of st-sR26 guide D’ (st-sR26 guide D) (Figure 1—figure supplement 5A). Interestingly, the Box C/D enzyme containing st-sR26-1 maintained the di-RNP architecture upon binding of either substrate RNAs, while the sRNP comprising st-sR26-2 transitioned to the mono-RNP state (Figure 1—figure supplement 5B). Mutation of the last nucleotide of st-sR26-1 guide D to either C or U (A61C and A61U with complementary substrate D) did not perturb the di-RNP architecture (Figure 1—figure supplement 5C). We conclude that the guide sequence strongly influences the oligomerization state of the holo complex. “

2) The substrate RNA that the authors use is significantly shorter than the natural substrate RNA (rRNA is long). Are the presented structures compatible with much longer substrate RNAs and is the proposed mechanism between the methylation efficiency of D' and D substrates also valid in the context of longer substrates? Can the terminal bases of the 3' and 5' ends of the substrate RNA that is used here adopt the same conformation when extra nucleotides are present?

Both the 3’ and 5’ ends of the substrates used in this work are not involved in base-pairing and can be easily elongated without perturbing the structure of the Box C/D RNP. We use substrates that are longerthan the 11 base-pairing nucleotides, in order to account for the longer natural substrates.

There are several reports in the literature stating that longer RNAs are methylated more efficiently in vitro, probably due to additional interactions within the Box C/D RNP, which however remain uncharacterized. rRNA methylation partially occurs during transcription; thus, another question one can ask is whether the sequences flanking the methylation regions at more than 5–6 nucleotides distance are involved in interactions with other proteins or other RNA stretches and thus unavailable for additional interactions within the Box C/D RNP. We do not know the answer to these questions. There is currently no good model to test the effect of the longer rRNA sequences in vivo.

3) The authors should be more accurate regarding the description of the RNA that they used. In the text and Materials and methods, the authors write that they use a stabilized version of the sRNA (for what reason, this is not explained, see also remark 1). In addition, Figure 1B displays two sRNA sequences (the WT and the stabilized one), the legend refers to this as "Sequence of the sRNAs used…" We assume the authors mean that the lower one of the two is the used sRNA sequence. Likewise, Figure 2 refers to a ssR26 species. We assume that this is the ss-sR26 RNA, the RNA that was used in a previous study. Such inaccuracies often cause confusion regarding the type of the RNA that is used. One suggestion is to add this info to the cartoons in Figure 1A, where cartoon of the di-RNP-apo, mono-RNP-holo and di-RNP-holo complexes are shown.

We apologize for this. We have now added the ssR26 RNA sequence to panel B of Figure 1. In addition, we have changed the figure caption to:

“Two RNA sequences (st-sR26 and ssR26) were derived from the Pf sR26 RNA and used to assemble the Box C/D sRNPs either in this (st-sR26) or previous studies (ssR26, (Lapinaite et al., 2013)). The sequence of st-sR26 is derived from the native sR26 RNA upon substitution of the apical K-loop element with the more stable K-turn element. This substitution does not affect the oligomerization state of the complex, as shown in Figure 1—figure supplement 4, but ensures the stability of the complex over several days.”

This text explains why we used the st-sR26 RNA instead of sR26 and also addresses the question raised by the reviewers with respect to the oligomerization state expected for the complex assembled with the sR26 RNA.

4) "In the di-RNP complex, the methyl-group NMR spectra indicated the presence of fibrillarin in two states, one close to the substrate-guide duplex (on-state) and one far from the substrate (off-state) (Figure 2A)." We don't see this. In Figure 2A, the NMR spectra of the complex assembled with ss-sR26 (previous study; cyan) and the complex assembled with st-sR26 (this study, blue) are overlaid. According to SAXS data these are both di-RNPs. However, we don't see any sign of the presence of two states in those NMR spectra. In addition, later in the manuscript, the authors conclude that the RNP is dynamic in the presence of substrate, whereas the displayed spectra are from complexes in the absence of substrate.

We apologize for having generated confusion with our typo. The sentence should have made referenceto Figure 2B and not 2A. As the reviewers and the editor correctly noticed, there is no evidence of multiple fibrillarin states in the apo form of the complexes. The apo complexes assembled with either ssR26 or st-sR26 are both di-RNPs, with all four copies of fibrillarin far from the RNA due to the absence of the substrate RNAs. The two spectra of Figure 2A (now Figure 2A, left panel) overlap well with each other and indeed are not expected to differ. We have corrected the typo, re-designed Figure 2, substantially changed the main text and the figure caption to provide answers to this as well as to the next points. With respect to this point, the figure caption now reads:

“Overlay of ILV-methyl 1H-13C spectra of fibrillarin in the apo ssR26 (turquoise) and apo st-sR26 (blue) RNPs. In both di-RNPs, all four fibrillarin copies are distant from the RNA and the two spectra are identical.”

5) "In contrast, both half-loaded mono-RNPs (substrate D- bound and substrate D'-bound) show only one set of fibrillarin resonances; however, the peaks that are split in two in the di-RNP are noticeably elongated in the mono- RNP, suggesting a conformational equilibrium between on- and off-states interchanging at a rate that is fast compared to the corresponding difference in NMR frequencies." Again, we are not following this:

a) In Figure 2B, we see one resonance for the st-sR26 apo complex (blue, same spectrum as panel a. This contradicts with statement above, that the apo-di RNP samples two states.

Please see the answer to the point above.

b) For the st-sR26 holo complex (green, mono-RNP holo) (is this after addition of both D and D' or after either D or D'?), we see one maybe slightly broadened resonance. For the ssR26-holo (purple, di-RNP holo) complex one can imagine two resonances (but the ssR26 RNA is not really the subject to this study). This could suggest that the holo complexes (mono- or di-RNA) are actually sampling multiple states. The text, however, mentions that the half-loaded mono-RNPs only show one set of resonances.

The green spectrum of Figure 2B referred to the fully loaded st-sR26 RNP, which has a complicated behavior, as both fibrillarin copies are in equilibrium between the RNA-bound and the RNA-free states. The relative populations of the RNA-bound forms of the fibrillarin at the substrate-loaded site are different for the two different half-loaded mono-RNPs; in addition, the chemical shifts of the respective RNA-bound states will also show differences, as the substrates have different sequences (D and D’). By showing the holo form of the st-sR26 complex we have generated much confusion, especially as we did not explain the spectrum. In this revised version we have re-designed the representation of the spectra in Figure 2: we now show the spectrum of the substrate D’-loaded mono-RNP, which is directly comparable to that of the holo ssR26 di-RNP, as the substrate sequences are the same. We also show the expanded view of more methyl group peaks to show that peak “budding” effects are present in both the 1H and the 13C dimensions and always appear in the same direction of the peak splitting or budding observed in the holo ssR26 di-RNP.

We have also added two paragraphs in the main text to explain our interpretation of the peak shape. However, we would like to stress that the presence of the conformational equilibrium is not proven by the minuscule peak budding in the HMQC spectra, but rather by the incompatibility of the SANS and PRE data. Thus, by comparing the HMQC spectra, we do not want to prove the existence of the conformational equilibrium, but rather verify that the peak shapes are compatible with the presence of this equilibrium. Here, we summarize the explanation that we give in the text:

1) In the apo complexes all fibrillarin copies are distant from the RNA in both the st-sR26 and ssR26 RNPs. The spectra of the two species overlap perfectly, as shown in Figure 2A, left.

2) In the holo di-RNP complex assembled with the ssR26 RNA (our previous work), all structural data are compatible with two fibrillarin copies being in stable contact with the RNA and two being far from the RNA. Two distinct NMR peaks are observed for some of the fibrillarin methyl groups, while many other peaks showed substantial asymmetric broadening (“budding”). Thus, the position of the second peak in the di-RNP HMQC spectrum is representative of fibrillarin bound to the substrate D’–guide duplex, as all substrate-binding sites in this di-RNP recognize substrate D’.

3) In the mono-RNP half-loaded with substrate D’, the spectra appear to show only one set of peaks. However, the peaks are elongated (“budding”) in the direction of the second peak seen in the di-RNP spectra. The half-loaded mono-RNP bound to substrate D’ contains two fibrillarin copies. One fibrillarin copy never binds the RNA, as its corresponding substrate (in this case substrate D) is absent. Thus, its methyl group peaks are expected to appear at the same positions as in the apo complex. The other fibrillarin copy may bind the RNA. If this fibrillarin copy were stably bound to the RNA, we would see peaks at a similar position as the second peaks in the holo ssR26 di-RNP spectrum, but this is not the case. If this fibrillarin copy were rapidly exchanging between RNA-bound and -unbound forms, we should see a peak halfway between the two peaks seen in the holo ssR26 di-RNP HMQC spectrum, assuming that the population distribution is 50%:50% and that the line-widths of the two states are similar. This peak, however, would overlap strongly with the peak of the first fibrillarin copy, which stays in the unbound form, because of the modest chemical shift differences between the RNA-bound and RNA-unbound states (see spectra of the di-RNP) as well as the large line-widths. Thus, the observed peak-shape originates from the overlap of the peak of one exchanging fibrillarin molecule with the peak of one non-exchanging fibrillarin molecule in the RNA-unbound state. We have now provided more examples of peaks in Figure 2B and simplified the spectral overlay to allow the reader to appreciate this.

6) We also don't fully understand the statement in the legend of Figure 2B "a single but elongated peak, characteristic of one copy of fibrillarin being in fast-exchange between the two states", refering to the green contours from the mono-RNP holo complex. In case there is fast exchange between 2 states, the green resonance should appear between the two purple resonances. In our view, this is not the case; the broad resonance appears exactly at one of the two signals that form the purple "doublet". Also, the broadening of the green resonance appears to be mainly in the carbon dimension, and not in the proton dimension. Are the spectra recorded with exactly the same experimental settings, or is it possible that differences in processing, viscosity, temperature or acquisition time is the cause of the differential linewidth? In summary, the conclusion regarding the dynamics of the complex seems not supported by the displayed NMR spectra.

Please see the answer to the question above, as well as the new Figure 2. The spectra of the apo and substrate D’-loaded sRNPs were acquired and processed in exactly the same way (as well as all other spectra). The display of additional peaks in Figure 2B now demonstrates that the “budding” does not occur exclusively in the 13C dimension. In addition, the clearer display of the peaks should allow better appreciation of the differences in the line-shapes of the peaks of the apo versus the substrate D’-loaded st-sR26 RNPs, which are consistent with (but do not prove) the conformational equilibrium.

The difference between “budding” peaks (those belonging to methyl groups which are split or budding in the holo di-RNP) and non-budding peaks (those belonging to methyl groups which are NOT split or budding in the holo di-RNP) can be appreciated from the lower leftmost panel of Figure 2B, where L160 (budding in the holo ssR26 di-RNP, purple spectrum) shows budding upon addition of substrate D’ to the st-sR26 RNP, while L66 does not. This comparison also demonstrates that the budding is not an artifact generated by different experimental or processing parameters. We have now commented on this in the figure caption.

7) Does Figure 2C belong to the st-sR26 holo complex in Figure 2B? If yes, this should be clearly indicated.

We have amended the figure and figure caption accordingly.

8) Figure 2D, top. How can Ipara/Idia be (significantly) larger than 1? For readability, please label in the figure that the blue curve is for D' bound and that the red one is for D bound and indicate that this is the mono-RNP holo complex.

Only two values of Ipara/Idia are larger than 1 (between 1.1 and 1.2). We often encounter these cases when quantifying PREs, due to errors in line-shape fitting as a consequence of spectral overlap and due to noise. The quantification of the Ipara/Idia values in regions of spectral overlap may be difficult. To address this problem, we exclude the more overlapped peaks from the analysis and we apply a generous error-range on the extracted distances (lower-bound of 2 Å, independent of the noise-derived error).

We have amended the figure according to the reviewers’ suggestion.

9) Do the Ipara/Idia values depend on the excess of the D or D' RNA? Are the binding sites for D and D' in both cases fully saturated? We assume that less saturation results in more mobility, so it is important to ensure that one compares fully saturated complexes.

The Ipara/Idia ratios do depend on the saturation of the D and D’ sites as they are different for the apo and holo states and for the half-loaded substrate D- and substrate D’-bound mono-RNPs. To verify that both sites are saturated we measured the 1D 1H spectra of the sRNP with increasing concentrations of substrate and followed the appearance of free substrate in the spectral region of the RNA H1’ peaks, which is devoid of protein peaks. We see no free-RNA peaks for a sub-stoichiometric ratio of substrate RNA:enzyme (see spectrum in Author response image 1), but we see free RNA peaks (sharp peaks) at a molar excess of 1.25:1 (starting from 1:1 due to small errors in calculating the concentration of the enzyme effectively present in solution). This demonstrates that at a stoichiometric ratio of 1:1 the guide RNA binding site is completely saturated with substrate RNA. To be on the safe side, we use a 1.2 molar excess of substrate RNA in all experiments.

We have now added a sentence in the Experimental Section to clarify this point.

Author response image 1
Titration of substrate RNA onto the apo Box C/D sRNP.

10) For Figure 2D, bottom, the deviations between the experimental and simulated P(r) values are similar for the short distances (0 – 40 A) and for the large distances (70 – 140 A). The deviations for the small r values are independent of conformational changes and these are thus intrinsic to the method. The question is, if the deviations at larger r values are thus significant enough that one can conclude from these SANS data that there are two states? If yes, does this agree with the population ratio of the [on, off] and [off, off] states that the authors determined later in the manuscript?

When we fit SAS data, we directly fit the scattering curves rather than the P(r) distributions. If we calculate the structures of the half-loaded sRNPs assuming only one state of each half-loaded complex, namely the [on,off]-state as required by the PRE data, the resulting structures do not fit the directly measured SANS and SAXS curves. This is the main argument to support the existence of a conformational equilibrium. We show this now in Figure 2D and in Figure 2—figure supplement 3.

The choice of displaying the P(r) distributions in Figure 2D rather than the direct SAS scattering curves is done for visual reasons, as the P(r) distribution function is easier to interpret intuitively. The discrepancy between the experimental and theoretical functions noted by the reviewers in the first version of this manuscript was due to the fact that in the old Figure 2D, instead of calculating the P(r) curves from the simulated SANS curves of the structural ensembles, we displayed the histogram of the pair-wise Cα–Cα and P–P distances scaled by the respective scattering contrast of 2H-protein and 1H-RNA in 42%:58% D2O:H2O solvent mixture. This calculation does not take into consideration the fact that the scattering of the 1H-proteins is only perfectly matched at the beginning of the SANS curve and that atoms other than the Cα or P atoms contribute to the scattering. In the new version of Figure 2D we calculate the theoretical P(r) functions from the theoretical 2H-Fib SANS curves averaged over the ensemble of [on,off]-structures that satisfy the PRE data of either the substrate D- or substrate D’-loaded sRNPs. Now, the theoretical P(r) functions at short distances perfectly match the experimental ones (as at short distances the P(r) function represents the pairwise distance distribution of atoms in one fibrillarin copy), while they differ at large distances demonstrating the incompatibility of the SANS data with a pool of [on,off]-structures only.

11) Subsection “The conformational ensembles of the half-loaded mono-RNPs in solution” paragraph three: "show a reasonable similarity". The authors need to provide at least an RMSD and an overlay of their two structures with the structure of Lin et al. to support this statement.

The overlays are shown in Figure 3—figure supplement 2. We have now added the RMSDs to the figure caption.

12) In context of sRNP complex, the affinity of fibrillarin for substrate depends on RNA sequence beyond recognition site

– It is unclear what the recognition site is. Referring to substrate sequences? Without a clearly defined recognition site, this claim is not supported by the data

– Do the authors mean methylation site rather than recognition site as specified in the Discussion? If this is the case, the Abstract/Introduction should be modified to reflect this level of detail.

We have changed “recognition site” to “methylation site” as this is what we mean.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Major comments

Previous points #4, 5 and 6; Legend Figure 2B: "Despite the absence of peak splitting, peak budding is visible in the spectrum of the substrate D'-loaded st-sR26 RNP in all instances for which peak splitting or budding is observed in the spectrum of the holo ssR26 di-RNP."

1) This figure and its legend have been remade for this version of the manuscript. Nevertheless, we cannot follow what the authors see in their spectra. The only instance, where we can see a peak splitting (not sure if "budding" is a term that can be used), is in the purple (ssR26 holo) spectrum for residue L58 and, with some imagination, for I62. For all other panels and all other complexes, we fail to see any change in the resonance. In other words, for V118, L160, V141 and L200, there are effectively no chemical shift perturbations that are larger than 0.01 ppm in proton. In carbon there is no CSP whatsoever. For the open-closed conformations, one would expect many and large CSPs.

This is a very important point, as the authors have little convincing NMR data that show that the complex exists in two states with fibrillarin either in the on- or the off-state.

As noted also in the previous answers, it is not the NMR data by itself that indicates the existence of the conformational exchange but the combination of PRE and SAS data. To make this absolutely clear we have rewritten the entire Results section and remade Figure 2. We no longer discuss chemical shifts in terms of conformational exchange but focus only on the discrepancy between PRE and SAS data when attempting to fit them to a single conformational state. To provide further evidence that the conformational exchange is needed to fit the data we now compare the PRE and SAS fits for: the representative structures of [on,off] and [off,off]-states (Figure 3—figure supplements 3, 4 and 5); ensembles of mixed [on,off]- and [off,off]-conformers (Figure 4 and new Figure 5); ensembles of [on,off]- OR [off,off]-conformers only (Figure 4—figure supplement 1 and Figure 5—figure supplement 1). While the PRE data can be fit in multiple scenarios, structural ensembles containing both [on,off]- and [off,off]-conformers (Figure 4 and new Figure 5) can properly fit the PRE and SAS data simultaneously. We are confident that this line of reasoning together with the new analysis presented in the supplements to Figure 3, supplement to Figure 4, new Figure 5 and its supplement, should be convincing.

Please also note that we do not agree with the sentence “For the open-closed conformations, one would expect many and large CSPs.” Fibrillarin interacts with the RNA backbone and this interaction is driven by arginine and lysine amino acids, as well as other polar side-chains, but definitely not by methyl-group-containing amino acids. Indeed, of all the fibrillarin ILV methyl groups, only those of V110 are expected to be within 5 Å of the RNA. Thus, we expect only a few detectable methyl-group CSPs upon RNA binding. Nonetheless, to demonstrate more clearly that there are indeed CSPs upon RNA binding, we now show a different region of the spectrum containing peaks from the residues expected to be closest to the RNA. This region was not shown in the original version of the paper as the peaks are quite overlapped. Nevertheless, the CSPs are clear, although difficult to quantify due to the overlap.

2) Previous point #8; Figure 2C. The PRE data with an Ipara/Idia still are very problematic. It is physically not possible that this ratio is larger than 1. In the plot, ratios that are 1.25 are shown. The authors write in their letter that there are only two values over 1, in the plot there appear to be at least 6. The remark that it is hard to extract reliable data for crowded spectral regions is correct, however, in case the data are unreliable or highly uncertain (which is not reflected in the presented error bars), it is not possible to draw solid conclusions. So how do the authors come to their conclusions based on the presented data? As an example, a ratio of 0.75 is mentioned in the text as highly relevant for the closed conformation, whereas a ratio of 1.25 is rebutted (only in the letter, not in the manuscript) as resulting from difficult spectral interpretations.

From the original PRE publication of Battiste and Wagner (Battiste and Wagner, 2000) onwards, all Ipara/Idia data-sets reported in the literature contain values higher than 1 and even 1.1:

Author response image 2
Examples of experimental PRE intensity-ratios in the literature.

Excerpt from the text of Battiste and Wagner: “The apparent noise or variation of intensity ratios for cross-peaks that should be far away from the spin-label nitroxide (no broadening effect) is approximately 10-15%. “

Other examples of PRE values higher than 1.1 can be found in Huang et al. Scientific Reports 6, Article number: 33690 (2016, Sjodt and Clubb, Bio Protoc. 2017 Apr 5; 7(7): e2207and many other publications.

Nevertheless, in order to address the concern regarding the values higher than 1, we have reprocessed all spectra to correct for a slight baseline offset present in three of them. We have then extracted all PREs from the newly processed spectra and confirmed that they are very similar to the old values. In addition we have increased the errors on the Ipara/Idia values to at least 10%, as recommended by Battiste and Wagner, to take into account possible changes in intensity due to sample manipulation and/or sample ageing.

For each of the D’-loaded and D-loaded complexes, we have ~15 values of (Ipara/Idia- error) larger than 1.1 among a total of 449 and 414 PREs, respectively. These numbers correspond to ~6% of the 258 (D’-loaded) and 261 (D-loaded) Ipara/Idia ratios higher than 0.9, namely those Ipara/Idia ratios corresponding to long distances. If one considers that the estimated error corresponds to the standard deviation of a Gaussian distribution, 68% of these “null PRE” values should be found between 0.9 and 1.10 (1 standard deviation), while another 15% should be found between 1.1 and 1.2. Thus the numbers of Ipara/Idia ratios higher than 1.1 are well within the statistics.

The new sets of restraints measured after re-processing all spectra were used to repeat all steps of structure calculation, selection and ensemble scoring. This process yielded structures nearly identical to those of the original submission. However, the exact numbers reported from Figure 3 onwards are slightly different from the original submission. All new structures and restraints are included in an updated deposition in Dryad.

In addition, we have validated our PRE data with intra-subunit distances, as shown now in Figure 2—figure supplement 4 for fibrillarin.

The reviewers criticize that we invoke the presence of a closed conformation on the basis of Ipara/Idia ratios < 0.75, while we ignore a value of 1.25. As explained above, the value of 1.25 in Figure 2 is an isolated outlier (as always in a distribution of experimental values, not all points are within one standard deviation from the average), while the values to which we refer when invoking the closed conformation comprise 6 residues in a row, with values of Ipara/Idia = 0.5 in the D’-loaded complex and 0.75 in the D-loaded complex. We think that this piece of evidence, among others (see above), demonstrate that these intensity ratios are significant.

As the NMR data do not agree with SANS data, the authors involve conformational rearrangements between open and closed conformations. Clearly, in case things are not in agreement, it can always be explained by dynamics. Because of the very weak NMR evidence (2 points above), it seems a far stretch to conclude that the complex undergoing open-closed motions.

Please see the evidence for the conformational exchange now provided in Figures 3, 4 and 5 and their supplements. The ability of integrative structural biology to detect conformational exchange when data from only one technique could or would fail is exactly the point that we want to make here.

Previous point #10; Figure 2D. The figure is still hard to understand. The substrate D loaded experimental (red) and theoretical (brown) are very different. The substrate D' loaded experimental (cyan) and theoretical (blue) are also very different.

The figure has been removed. Instead of P(r) we now show the direct fit to the 2H-Fib SANS curves in several figures and supplements.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

First, when there are issues raised by the reviewers that seem unreasonable to the authors, this sometimes reflects the fact that the material was not explained as clearly as might be possible. One of the advantages of eLife is that space is not nearly as constrained as in some other prestigious journals. […] Perhaps if you show more PRE data this will help – and as you will hopefully be showing correlation plots assuming 1 state and that these are not as good as the intra-validation – showing more raw data would be justified.

In brief, for this revised version, we have re-analysed PRE data after reprocessing all spectra, redid all structure calculations and ensemble scoring with the new sets of PRE data (very similar to the previous ones), added new analysis and validation figures and rewritten the Results section. In addition, we no longer discuss chemical shift perturbations to support the existence of conformational exchange. The conclusions are identical to the first version of the manuscript: the fact that we now refer to a new round of structural calculations justifies the small differences in the exact numbers reported from Figure 3 onwards.

https://doi.org/10.7554/eLife.50027.sa2

Article and author information

Author details

  1. Andrea Graziadei

    1. European Molecular Biology Laboratory, Structural and Computational Biology, Heidelberg, Germany
    2. Leibniz University Hannover, Centre for Biomolecular Drug Research, Hannover, Germany
    Contribution
    Resources, Data curation, Software, Formal analysis, Investigation, Visualization, Methodology
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-7709-6002
  2. Frank Gabel

    1. University Grenoble Alpes, CEA, CNRS IBS, Grenoble, France
    2. Institut Laue-Langevin, Grenoble, France
    Contribution
    Data curation, Formal analysis
    Competing interests
    No competing interests declared
  3. John Kirkpatrick

    1. Leibniz University Hannover, Centre for Biomolecular Drug Research, Hannover, Germany
    2. Helmholtz Centre for Infection Research, Group of Structural Chemistry, Braunschweig, Germany
    Contribution
    Data curation, Formal analysis
    Competing interests
    No competing interests declared
  4. Teresa Carlomagno

    1. Leibniz University Hannover, Centre for Biomolecular Drug Research, Hannover, Germany
    2. Helmholtz Centre for Infection Research, Group of Structural Chemistry, Braunschweig, Germany
    Contribution
    Conceptualization, Data curation, Supervision, Funding acquisition
    For correspondence
    teresa.carlomagno@oci.uni-hannover.de
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-2437-2760

Funding

European Commission (FP7 ITN project RNPnet (contract number 289007)

  • Andrea Graziadei

Deutsche Forschungsgemeinschaft (CA294/3-2)

  • Teresa Carlomagno

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

The authors thank Dr. Artem Feoktystov (MLZ Munich) for assistance with recording and processing SANS data at KWS-1; Dr. Roman Lichtenecker (University of Vienna), for kindly providing the leucine methyl labelling precursor sodium [3-2H2,4-2H, 5-13C, 5’-2 H3]-α-ketoiso-caproate; Dr. Pawel Masiewicz (EMBL Heidelberg) and Susanne Zur Lage (HZI Braunschweig) for RNA production and Dr. Bernd Simon (EMBL Heidelberg) for assistance with structure calculations.

Senior Editor

  1. Philip A Cole, Harvard Medical School, United States

Reviewing Editor

  1. Lewis E Kay, University of Toronto, Canada

Publication history

  1. Received: July 8, 2019
  2. Accepted: March 8, 2020
  3. Version of Record published: March 23, 2020 (version 1)

Copyright

© 2020, Graziadei et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 337
    Page views
  • 59
    Downloads
  • 1
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading

    1. Biochemistry and Chemical Biology
    2. Plant Biology
    Pengxiang Fan et al.
    Research Article
    1. Biochemistry and Chemical Biology
    2. Cell Biology
    Senem Aykul et al.
    Research Article Updated