Translational initiation in E. coli occurs at the correct sites genome-wide in the absence of mRNA-rRNA base-pairing
Abstract
Shine-Dalgarno (SD) motifs are thought to play an important role in translational initiation in bacteria. Paradoxically, ribosome profiling studies in E. coli show no correlation between the strength of an mRNA’s SD motif and how efficiently it is translated. Performing profiling on ribosomes with altered anti-Shine-Dalgarno sequences, we reveal a genome-wide correlation between SD strength and ribosome occupancy that was previously masked by other contributing factors. Using the antibiotic retapamulin to trap initiation complexes at start codons, we find that the mutant ribosomes select start sites correctly, arguing that start sites are hard-wired for initiation through the action of other mRNA features. We show that A-rich sequences upstream of start codons promote initiation. Taken together, our genome-wide study reveals that SD motifs are not necessary for ribosomes to determine where initiation occurs, though they do affect how efficiently initiation occurs.
Introduction
Translational initiation is a critical step in the regulation of gene expression that impacts which proteins are synthesized and to what extent. Unlike eukaryotic ribosomes, which scan from the 5’-end of messages and generally initiate at the first start codon, bacterial ribosomes can initiate at any position along an mRNA; this is a critical requirement because many bacterial mRNAs are polycistronic. Bacterial ribosomes must select the correct start codons amidst a vast excess of potential sites (AUG, GUG, and to some extent UUG) that have to be ignored. Not only does initiation determine where translation occurs (and therefore which proteins are made), in most cases the rate of initiation determines the level of protein output. In bacteria, a common strategy for regulating translation is to block ribosome recruitment to an mRNA through the action of small RNAs (Altuvia et al., 1998; Majdalani et al., 1998; Storz et al., 2004), small-molecule binding riboswitches (Winkler et al., 2002; Mandal and Breaker, 2004), and regulatory proteins (Moine et al., 1990; Babitzke et al., 2009).
Initiation rates vary in response to several mRNA features that determine how effectively an mRNA recruits 30S subunits to the start codon. Thermodynamically stable secondary structures surrounding the initiation site prevent 30S recruitment (Hall et al., 1982; de Smit and van Duin, 1990). The kinetics of RNA folding and unfolding are also critical (de Smit and van Duin, 2003; Espah Borujeni and Salis, 2016): some structures exist in an unfolded state for such a short period of time that 30S subunits cannot find the start codon quickly enough by diffusion alone. In several well-characterized examples, regions of single-stranded RNA known as standby sites are found nearby, positioning 30S subunits in close proximity so that they can efficiently capture the start codon upon unfolding of the mRNA secondary structure (de Smit and van Duin, 2003; Espah Borujeni et al., 2014). Interactions between 30S subunits and single-stranded mRNA regions (especially those that are AU-rich) can be mediated through ribosomal protein S1 (Boni et al., 1991; Komarova et al., 2005). Bound on the back of the 30S subunit, the S1 protein contains multiple RNA-binding domains that can recruit mRNA and melt secondary structures (Qu et al., 2012), facilitating hybridization of 16S rRNA with complementary mRNA sequences colloquially known as Shine-Dalgarno motifs.
Shine-Dalgarno motifs have the consensus sequence GGAGG and can base pair with as many as nine nt in the 3’-terminal sequence of 16S rRNA (ACCUCCUUA in E. coli), referred to as the anti-Shine-Dalgarno (ASD) sequence (Shine and Dalgarno, 1974). Pairing of the SD-ASD sequences can recruit 30S subunits to the start codon 5–10 nt downstream (Steitz and Jakes, 1975). SD motifs that differ significantly from the consensus or that are positioned too close to or too far from the start codon yield lower levels of initiation. Indeed, many experiments using reporter genes showed that raising the SD-ASD affinity increases protein output, demonstrating its importance for determining translation levels (Hui and de Boer, 1987; Jacob et al., 1987; de Smit and van Duin, 1990; Salis et al., 2009). In addition, the SD model serves as the foundation of practical bioengineering efforts ranging from optimizing expression of recombinant proteins to expansion of the genetic code (Rackham and Chin, 2005; Salis et al., 2009).
On the other hand, even though the ASD in 16S rRNA is almost universally conserved throughout the bacterial kingdom (Nakagawa et al., 2010), the percentage of genes with SD motifs varies widely between species. While well-characterized model species such as E. coli and B. subtilis have a high percentage of genes with SD motifs (54% and 78% respectively), there is little to no enrichment of SD motifs upstream of start codons in Bacteroidetes and Cyanobacteria (Nakagawa et al., 2010). In addition, although the majority of species in the phyla Firmicutes, Actinobacteria, and Proteobacteria have high percentages of SD-containing genes, several species have low percentages, arguing that the loss of this mechanism has occurred multiple times during evolution (Nakagawa et al., 2010; Hockenberry et al., 2017). These variations across the bacterial kingdom, despite the high conservation of the ASD element on the ribosome, raise questions as to how important the SD mechanism is for ribosome recruitment.
Ribosome profiling is a method for deep sequencing of ribosome-protected mRNA fragments that allows us to define the position and number of ribosomes bound across the transcriptome at nucleotide resolution (Ingolia et al., 2009). This information allows us to calculate the ribosome density on each mRNA as a proxy for the efficiency of translation initiation. In pioneering ribosome profiling studies in bacteria, the paradoxical observation was made that there is little or no correlation between the ribosome occupancy of a gene and the strength of its SD motif (calculated using thermodynamic algorithms for RNA pairing), contrary to what had been anticipated based on the SD model (Li et al., 2014; Schrader et al., 2014; Li, 2015; Del Campo et al., 2015). This surprising observation suggested that other mRNA features could effectively mask the effects of SD motifs at the genome-wide level.
To isolate the effects of SD motifs on the global translational landscape, we expressed 16S rRNA mutants with altered (non-functional) ASD sequences, purified mutant ribosomes, and used ribosome profiling to ask how efficiently they translate each mRNA in the cell. Unlike previous studies that vary the SD motif and other mRNA-specific features, this approach allows us to specifically eliminate the SD-ASD interaction while keeping mRNA sequences and structures intact, so that we can specifically ask questions about the role the SD-ASD interaction plays in determining mRNA translation rates. Through this analysis, we observe for the first time the effects of SD motifs at the global level, revealing a linear correlation between SD strength and ribosome occupancy. We then combined our new profiling approach with retapamulin treatment to trap ribosomes at start codons (Meydan et al., 2019; Weaver et al., 2019) in order to study the role of SD motifs in selecting start codons. To our surprise, the ASD-mutant ribosomes selectively recognize the correct initiation sites as well as wild-type ribosomes, arguing that these sites are hard-wired for initiation independent of their SD-ASD pairing strength. We show that A-rich sequences recently identified by Fredrick and co-workers (Baez et al., 2019) are enriched at annotated start sites compared to other AUG codons in the transcriptome where initiation does not take place; these A-rich sequences are also found upstream of start codons in a wide variety of species across the bacterial kingdom. In addition, mRNA structure at annotated start sites is lower than at other AUG codons, facilitating 30S binding. Together, these studies refine our understanding of the role of SD motifs and other mRNA features in defining the proteomes of bacteria.
Results
Selective profiling of ribosomes with mutant ASD sequences
Studies of the role of SD motifs in promoting translation in their native contexts have been complicated by the fact that changing the sequence of an mRNA also affects other determinants of translational regulation such as its overall structure. To perturb the function of SD motifs at the global level, we developed a new approach in which we mutate the ASD in 16S rRNA, purify the mutant ribosomes, and use ribosome profiling to ask how efficiently they translate each mRNA in the cell. This strategy provides us with a genome-wide view of the function of SD motifs in interactions with the unaltered transcriptome: all of the features of an mRNA that affect its translation are maintained, thereby isolating the effects of the ASD mutation. In this manner, we eliminate the SD-ASD interaction as a contribution to mRNA translation rates and see how translation changes across the transcriptome.
We created three 16S rRNA alleles in which the ASD is mutated (Figure 1A). Two of these mutants were described previously in the literature. The ASD in specialized (S) ribosomes was inverted from CCUCC to GGAGG in a pioneering study by de Boer who showed that although these S-ribosomes were relatively inactive on endogenous transcripts, they efficiently translate a reporter gene with a complementary SD motif (Hui and de Boer, 1987). In later studies, Cunningham and Chin used genetic selections to characterize additional SD-ASD pairs and improve their selectivity, creating orthogonal (O) ribosomes where the ASD is mutated to UGGGA (Lee et al., 1996; Rackham and Chin, 2005). Ribosomes with mutant ASD motifs (like S and O) have been used in numerous studies of protein synthesis where they selectively translate reporter genes with complementary SD motifs (Rex et al., 1994; Neumann et al., 2010; Orelle et al., 2015). In addition to these two ASD mutants, we constructed a third (A) with the ASD sequence AAAAA that we anticipated would bind mRNA more weakly than the O- or S-ribosomes (given that their ASD sequences are G-rich). The MS2 aptamer was inserted into these three ASD mutants to facilitate their purification as described below (Youngman et al., 2004; Youngman and Green, 2005); as a control, we also created an MS2-tagged 16S rRNA with the canonical ASD sequence (C).

Figure 1. Capturing the role of SD motifs by MS2RP.
(A) ASD mutations at the 3’-end of 16S rRNA are highlighted in color. (B) Schematic of MS2RP: polysomes are collapsed to monosomes by RNase T1 digestion, MS2-tagged monosomes are pulled down with the MS2 coat-protein, and mRNA is fully digested to yield ribosome footprints that are subjected to deep sequencing. (C) RT-PCR of 16S rRNA from cell lysates (L) and the eluate (E) from the MS2 coat-protein column. (D) Scatter plot of ribosome occupancy (RO), the ratio of ribosome profiling to RNA-seq reads, from MS2RP of O-ribosomes vs. C-ribosomes. The red line indicates a 10-fold enrichment and the Pearson correlation is given. (E) Ribosome footprints (in reads per million mapped reads) from MS2RP of O-ribosomes and C-ribosomes on the hemA gene. The sequence upstream of the start codon is predicted to pair with the ASD of O-ribosomes.
These four MS2-tagged 16S rRNA variants (C, S, O, and A) were expressed from plasmids in E. coli MG1655 containing the normal complement of seven wild-type rRNA operons to sustain growth. Because overexpression of ASD mutants is toxic (Jacob et al., 1987), we induced expression for only 20–25 min during which growth rates were not affected (Figure 1—figure supplement 1A). Polysome profiles from the four strains were similar (Figure 1—figure supplement 1B), suggesting that translation remains robust during the transient expression of MS2-tagged 16S rRNA whether the ASD is intact (C) or mutated (S, O, and A). A previous study of orthogonal ribosomes suggested that altering the ASD in 16S rRNA reduces rRNA processing efficiency, leading to the accumulation of processing intermediates, but that mature rRNAs containing ASD mutations have the correct 3’-end (Aleksashin et al., 2019). To look for processing defects in our system, we performed RNA-seq on affinity-captured MS2-tagged rRNA without nuclease digestion. As shown in Figure 1—figure supplement 1C, we do not observe the accumulation of precursors with 3’-extensions or other defects in the processing of the 3’-end of 16S rRNA. This result indicates that correctly processed rRNA is produced and should be able to form mature 30S subunits.
RT-PCR with primers that distinguish endogenous 16S rRNA from the MS2-tagged mutants was used to ask whether the ASD mutants are found in actively translating polysomes (Figure 1—figure supplement 1D). We observed that the signal from C-ribosomes is equally strong in the lysate, light, and heavy polysome fractions. In contrast, the signal from the three ASD mutants is present but weaker in the polysome fractions than in the lysate. These data show that although ribosomes with mutant ASDs can engage in translation, their activity is impaired, consistent with earlier studies. Keeping this in mind, we focus our analyses not on their absolute activity but on their selectivity, asking which mRNAs they translate better than other mRNAs.
To purify mutant ribosomes, we employed a method previously developed for in vitro biochemical studies of ribosomes with lethal mutations (Youngman et al., 2004; Youngman and Green, 2005): the MS2 aptamer was fused to helix 6 of 16S rRNA allowing us to capture mutant ribosomes through their interaction with the MS2 coat protein (Figure 1B). To avoid pulling down wild-type ribosomes bound to the same mRNA as mutant ribosomes, we first treated cell lysates with RNase T1 to collapse polysomes to monosomes prior to isolating MS2-tagged ribosomes. RT-PCR reveals how well this purification strategy works: although signal from the wild-type 16S rRNA predominates in cell lysates (lower band, Figure 1C), it is nearly undetectable in purified ribosome samples eluted from the MS2-coat protein column. These data show that MS2-tagged ribosomes can be isolated with high purity for ribosome profiling studies; we refer to this procedure as MS2RP.
Comparison of the translational landscapes of the canonical (C) and orthogonal (O) ribosomes confirms that the MS2RP strategy is effective. For 2217 genes with adequate coverage in each sample, we computed ribosome occupancy (RO) values by dividing the ribosome profiling density by RNA-seq density. Although we recognize that RO is not a perfect measure of initiation rates (it may also reflect differences in elongation in some cases), the number of ribosome footprints correlates strongly with protein levels in exponentially growing E. coli cultures (Li et al., 2014); RO therefore reports on the level of protein output per mRNA. We observed compelling differences in RO values for many genes in the two samples (Figure 1D). An initial straightforward expectation is that genes with SD motifs with high affinity to the orthogonal (O) ASD sequence would have high RO values in MS2RP data from O-ribosomes; indeed, we observe that a complementary SD motif (UCCCG) five nt upstream of the start codon gives the hemA gene 10-fold higher RO with the O-ribosome than with the C-ribosome (Figure 1E). The same phenomenon was observed on rbsK (7-fold higher RO) and mreB (10-fold higher RO) with the O-ribosome and on sapA (9-fold higher RO) and rsmH (4-fold higher RO) with the S-ribosome (Figure 1—figure supplement 2). In each of these examples, the increase in RO can be attributed to higher levels of translation because the mRNA level differs by less than two-fold. These examples are quite rare, however, because endogenous genes have evolved to interact with the canonical ASD and so the probability of finding a sequence with strong complementarity to the mutant ASD at just the right position is relatively low. Indeed, our data are most consistent with the conclusion that all three ASD mutants essentially act as general loss-of-function mutants.
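As a rough illustration of the RO calculation described above, the following Python sketch divides footprint density by RNA-seq density for each gene and reports the result on a log10 scale. The function name, the `min_reads` coverage filter, and the toy counts are assumptions made for illustration; the actual coverage criteria and normalization used in the study are not specified in this section.

```python
import math

def ribosome_occupancy(rpf_counts, rna_counts, min_reads=1):
    """Return log10(RO) per gene, where RO = footprint density / RNA-seq density.

    Counts are scaled to reads per million within each library. Gene length
    cancels in the ratio, so it is not needed here.
    """
    rpf_total = sum(rpf_counts.values())
    rna_total = sum(rna_counts.values())
    log_ro = {}
    for gene in rpf_counts:
        rpf = rpf_counts[gene]
        rna = rna_counts.get(gene, 0)
        # Hypothetical coverage filter: skip genes with too few reads in either library.
        if rpf < min_reads or rna < min_reads:
            continue
        rpf_rpm = rpf / rpf_total * 1e6
        rna_rpm = rna / rna_total * 1e6
        log_ro[gene] = math.log10(rpf_rpm / rna_rpm)
    return log_ro

# Toy counts, invented for illustration only.
rpf = {"geneA": 100, "geneB": 900, "geneC": 0}
rna = {"geneA": 500, "geneB": 500, "geneC": 1000}
log_ro = ribosome_occupancy(rpf, rna)
```

Comparing the values returned for two ribosome types (for example, O versus C) gene by gene gives the kind of fold difference in RO discussed for hemA.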
The global role of SD motifs on the endogenous translational landscape
We next used MS2RP to isolate the effect of SD motifs on global translation, asking to what extent they drive translation under optimal growth conditions. For each gene, we computed the SD strength as the negative of the free energy (-∆G) of pairing between the sequence −15 to −6 nt upstream of the start codon and the wild-type ASD (ACCUCCU), so that larger values indicate stronger pairing. Based on the well-known role of SD motifs in promoting translational initiation, the expectation is that genes with strong affinity should have high RO values, and conversely, genes with weak affinity should have low RO values, yielding a strong correlation. However, our analysis of data from canonical (C) ribosomes showed only a very weak correlation (Figure 2A), consistent with previous reports from ribosome profiling studies (Li et al., 2014) showing that SD strength has little power to predict ribosome occupancy in E. coli. Strikingly, the RO values from the three ASD mutants (S, O, and A) showed a robust negative correlation with SD affinity for the wild-type ASD sequence (Figure 2B and Figure 2—figure supplement 1). In other words, ASD mutant ribosomes translate genes with weak SD motifs better than genes with strong SD motifs.

Figure 2. MS2RP reveals that SD motifs enhance translation genome-wide.
Ribosome occupancy (RO) is the ratio of ribosome profiling to RNA-seq reads per gene. Log10RO values are plotted against the SD strength (-∆G of pairing to the wild-type ASD) for each gene with MS2RP data for C-ribosomes (A) and A-ribosomes (B). (C) Scatter plot of ∆logRO (C-ribosomes minus A-ribosomes) and -∆G where r values indicate Pearson correlations.
Because the ASD mutants are unlikely to participate in SD-ASD interactions, RO values in these samples reflect the contributions of all the other mRNA elements that promote initiation. The observation that these other elements yield a negative correlation with SD strength suggests that they in general counteract the positive correlation contributed by SD-ASD pairing (with wild-type ribosomes). As such, these contributions effectively mask the effect of SD motifs in Figure 2A. By calculating the difference in RO (∆logRO) for each gene between the C- and A-ribosomes, we effectively subtract all the mRNA elements that determine RO independent of SD-ASD pairing, thus isolating the effects of the SD motifs on mRNA translation rates. The ∆logRO term reflects how much better a message is translated by wild-type ribosomes than by ASD mutants. When ∆logRO values are plotted as a function of SD-ASD affinity (-∆G) using the wild-type ASD sequence, we observe a strong linear correlation with SD-ASD affinity for each of the mutants (Figure 2C and Figure 2—figure supplement 1). As expected, genes with strong SD motifs are translated better by ribosomes with the canonical ASD than by ASD-mutant ribosomes. The fact that we observe this correlation validates our calculations of SD strength; analysis of the distance of SD motifs from the start codon confirms that genes with the highest ∆logRO have the strongest SD affinity in the −15 to −6 region as shown in previous studies (Figure 2—figure supplement 2). These data obtained with MS2RP reveal for the first time the effect of SD motifs on translation genome-wide, consistent with their characterized role in promoting initiation.
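The subtraction described above can be sketched in a few lines of Python: ∆logRO is the per-gene difference between log RO for C-ribosomes and for a mutant, and its relationship to SD strength (-∆G) is summarized by a Pearson correlation. The function names and the toy values are assumptions for illustration only; they are not data from the study.

```python
import math

def delta_log_ro(log_ro_c, log_ro_mut):
    """Per-gene difference in log RO (C-ribosomes minus mutant), shared genes only."""
    shared = sorted(set(log_ro_c) & set(log_ro_mut))
    return {gene: log_ro_c[gene] - log_ro_mut[gene] for gene in shared}

def pearson_r(xs, ys):
    """Pearson correlation coefficient; assumes both inputs have nonzero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Toy values, invented for illustration: SD strength is -dG, so larger means stronger.
sd_strength = {"g1": 6.0, "g2": 3.0, "g3": 0.0}
log_ro_c = {"g1": 1.0, "g2": 0.5, "g3": 0.0}
log_ro_a = {"g1": 0.0, "g2": 0.0, "g3": 0.0}

delta = delta_log_ro(log_ro_c, log_ro_a)
genes = sorted(delta)
r = pearson_r([sd_strength[g] for g in genes], [delta[g] for g in genes])
```

Because the same mRNA is measured with both ribosome types, any gene-specific contribution that is independent of SD-ASD pairing appears in both terms and cancels in the difference.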
SD motifs are not necessary for start codon selection
SD motifs are also widely held to play a critical role in recognizing and selecting initiation sites (Steitz and Jakes, 1975). In the analyses described so far, we have used MS2RP to estimate the ribosome density on each mRNA as a proxy for initiation rates in order to address questions about how much translation is occurring on annotated genes. These data are less informative about the degree to which mutant ribosomes initiate at the wrong sites in the transcriptome. Non-canonical initiation is difficult to observe in E. coli because 5’- and 3’-untranslated regions of mRNAs are generally quite short and translation at alternate start codons within ORFs is swamped by the signal of elongating ribosomes from the canonical start site. In eukaryotes, the antibiotics harringtonine and lactimidomycin have been used with great success together with ribosome profiling to identify sites where translational initiation takes place (Ingolia et al., 2011; Lee et al., 2012). These compounds do not interfere with elongating ribosomes, allowing them to continue translation and terminate normally. In contrast, they trap newly-initiated ribosomes, providing a way of identifying initiation sites in ribosome profiling studies. Two antibiotics were recently shown to trap initiation complexes in bacteria in a similarly specific manner: Onc112 and retapamulin (Meydan et al., 2019; Weaver et al., 2019).
To study the role of SD motifs on start codon selection, we treated cells with retapamulin for 5 min and then used MS2RP to identify start sites occupied by ribosomes with the various ASD sequences. For example, elongating wild-type (C) ribosomes are found all across the lpp gene in untreated cells (Figure 3A, light grey), whereas they are highly enriched at the annotated start codon in retapamulin-treated cells (dark grey). As expected, ribosome footprints are not seen at three internal AUG codons, since these do not function as initiation sites. Strikingly, in retapamulin-treated cells, the A-ribosomes also find the correct start site, ignoring the three other AUG codons (Figure 3A, dark green). In another example, the gmk gene, both C- and A-ribosomes are enriched at the annotated start codon in retapamulin-treated cells but not at several internal AUG codons (Figure 3B). In both examples, both WT and mutant ribosomes select the correct, annotated start site while ignoring other AUG codons.

Figure 3. Loss of SD-ASD pairing has little effect on start codon selection.
(A,B) Ribosome footprints on lpp and gmk from MS2RP data obtained with and without retapamulin, an antibiotic that traps ribosomes at start codons. Annotated AUGs are indicated by a red bar, non-annotated AUGs are indicated by blue bars. (C, D) Average ribosome protected fragments (RPFs) at annotated AUGs and non-annotated AUGs (where AUG starts at 1). (E) Average RPFs at the start codon of genes whose ribosome-binding sites have little or no affinity to all three mutant ASD sequences.
To analyze the accuracy of start codon selection by the ASD variants in retapamulin-treated samples genome-wide, we computed the average number of ribosome footprints across many genes aligned at their annotated start codons or aligned at all the other AUG triplets in the transcriptome (non-annotated AUGs). Our initial expectation was that in the absence of SD-ASD base pairing, the mutant ribosomes might fail to recognize the correct start sites and bind more often to other AUG triplets in the transcriptome. Strikingly, both the C- and A-ribosomes show strong initiation peaks at annotated AUGs (Figure 3C), whereas these peaks are absent in both samples at non-annotated AUGs (Figure 3D). These results provide initial evidence that ribosomes correctly select annotated start sites genome-wide in the absence of the SD-ASD interaction.
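The average-gene analysis described above amounts to aligning footprint coverage at a set of AUG positions and averaging position by position. A minimal Python sketch follows; it assumes a single per-nucleotide coverage list on one strand, applies no per-gene normalization, and uses a hypothetical function name and window sizes, so it should be read as an illustration of the idea and not as the pipeline used in the study.

```python
def metagene_average(coverage, sites, upstream=50, downstream=50):
    """Average footprint counts in a window around each site.

    `coverage` is a per-nucleotide list of read counts and `sites` holds the
    index of the first nucleotide of each AUG. Index `upstream` of the
    returned list corresponds to that first nucleotide.
    """
    width = upstream + downstream + 1
    totals = [0.0] * width
    used = 0
    for pos in sites:
        start = pos - upstream
        end = pos + downstream + 1
        # Skip windows that run off either end of the coverage track.
        if start < 0 or end > len(coverage):
            continue
        for i, value in enumerate(coverage[start:end]):
            totals[i] += value
        used += 1
    if used == 0:
        return totals
    return [total / used for total in totals]

# Toy coverage track with peaks at two hypothetical start sites.
coverage = [0] * 20
coverage[5] = 10
coverage[12] = 4
profile = metagene_average(coverage, [5, 12], upstream=2, downstream=2)
```

Running the same function once with annotated start codons and once with all other AUG triplets gives the two profiles that are compared in the text.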
To further explore this surprising finding, we next asked how the affinity of mRNA-rRNA base pairing influences initiation at annotated start codons. We assumed that for the mutant ribosomes, base pairing would play little or no role in initiation because they would likely have low affinity for annotated start sites that evolved to bind the wild-type ASD. To test this assumption, we calculated the affinity of each mutant ASD for the sequence upstream of the start codon of each gene. We grouped genes into different sets based on these affinities and plotted the average number of ribosome footprints at the annotated start sites as in Figure 3C. In the subset of genes with no predicted affinity for any of the three ASD mutants (ΔG > −1), we still see robust enrichment of A, O, and S ribosomes at the annotated start sites (Figure 3E). Since all three ASD variants initiate at annotated start sites, these data argue against the possibility that serendipitous base-pairing between the mRNA and the mutant ASD sequences explains this enrichment.
We also analyzed a set of annotated start sites with strong calculated affinity to the wild-type ASD. These sites are expected to be dependent on the SD-ASD interaction. Yet we again observed robust start peaks for each ASD variant ribosome, indicating that SD-ASD pairing is dispensable for initiation even in genes with strong SD motifs (Figure 3—figure supplement 1A). Furthermore, we found that in a set of sites with predicted high affinity to the ASD of the O-ribosome, there was strong enrichment of A- and S-ribosomes at start codons, despite the differences in the ASD sequence (Figure 3—figure supplement 1B). Likewise, in a set of genes with predicted high affinity to the ASD of the S-ribosome, there was strong enrichment of O- and A-ribosomes at start codons (Figure 3—figure supplement 1C). (There were too few genes with high affinity to the A-rich ASD sequence to perform an equivalent analysis for A-ribosomes). Taken together, these analyses show that annotated initiation sites are hard-wired for initiation independent of their potential for base pairing between the mRNA and rRNA.
SD motifs are not necessary for initiation at non-canonical sites
We next asked what role mRNA-rRNA pairing plays in initiation at AUG triplets in the transcriptome that are not normally used for initiation (non-annotated AUGs). For this purpose, we used data from retapamulin-treated cells to calculate an initiation score (IS) for each AUG triplet, defined as the average number of reads mapped within 3 to 21 nt downstream of an AUG (to capture footprints of various sizes) divided by the average number of reads mapped over a wider spacing (100 nt, Figure 4A). The first and most general finding is that the log2IS values from the C- and A-ribosomes have a similar distribution with medians close to 0 (Figure 4B), indicating that footprints from the A-ribosomes are not enriched at non-annotated AUG codons. This result is consistent with the average gene plot shown in Figure 3D and with the fact that most of these AUG codons do not serve as initiation sites. To better characterize the difference between C- and A-ribosomes in initiation at non-annotated AUG codons, we selected a subset of sites that effectively recruit C-ribosomes and yield strong initiation peaks. These sites have log2IS values > 1.5 and are highlighted in black in Figure 4B. Surprisingly, this same subset of AUG codons also shows high IS values for A-ribosomes (Figure 4C), arguing that SD-ASD pairing is not the feature that explains why initiation takes place at these specific AUG triplets and not at others.
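The initiation score defined above can be sketched as follows. The text gives the 3 to 21 nt downstream window and the 100 nt wider window, but not how the wider window is positioned or how zero counts are handled, so this sketch assumes a window centered on the AUG and returns `None` when the score is undefined. The function name and toy coverage are likewise hypothetical.

```python
import math

def log2_initiation_score(coverage, aug_pos, near=(3, 21), wide=100):
    """log2 of (mean reads 3-21 nt downstream of the AUG) / (mean reads in a wide window).

    Assumption: the wide window is centered on the AUG. Returns None if a
    window falls outside the coverage track or either mean is zero.
    """
    near_start = aug_pos + near[0]
    near_end = aug_pos + near[1] + 1
    wide_start = aug_pos - wide // 2
    wide_end = wide_start + wide
    if wide_start < 0 or wide_end > len(coverage) or near_end > len(coverage):
        return None
    near_mean = sum(coverage[near_start:near_end]) / (near_end - near_start)
    wide_mean = sum(coverage[wide_start:wide_end]) / wide
    if near_mean == 0 or wide_mean == 0:
        return None
    return math.log2(near_mean / wide_mean)

# Uniform coverage gives a score of 0; a peak just downstream of the AUG gives a high score.
flat = [1] * 200
peaked = [0] * 200
for i in range(103, 122):
    peaked[i] = 10

flat_score = log2_initiation_score(flat, 100)
peak_score = log2_initiation_score(peaked, 100)
```

With this definition, a median log2IS near 0 across many AUGs means footprints are no denser just downstream of those AUGs than in the surrounding region.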

Figure 4. The effects of SD-ASD pairing on initiation at non-canonical sites.
(A) Evaluation of initiation score, IS. (B) Initiation scores on non-annotated AUG triplets. For C-ribosomes, the fraction with IS >1.5 is colored black. (C) Initiation scores for C- and A-ribosomes for the set of sites with IS >1.5 for C-ribosomes (High, colored black in B) and those with IS <1.5 (Low). (D,E) IS values for all four ribosome types on the subset of sites with high affinity for the ASD of the S-ribosome (CCUCC). Average RPFs at the AUG triplets with high IS scores (F) or low IS scores (G) from the S-ribosome data.
To further characterize how SD-ASD pairing affects initiation at non-annotated AUG triplets, we grouped potential initiation sites by their affinity for wild-type or mutant ASDs as described above for annotated start sites. For sites with high affinity to the ASD of the S-ribosome, for example, the distribution of IS values for S-ribosomes closely resembled those of the other three ribosomes (Figure 4E), with median values near zero. These data show that the presence of a complementary Shine-Dalgarno-like sequence near an AUG codon is not sufficient to recruit S-ribosomes and generate a robust start codon peak. We selected the subset of AUGs with high affinity to S-ribosomes where initiation occurs with S-ribosomes (log2IS >1.5, dark red in Figure 4E). As expected, these high-IS sites show strong start peaks with S-ribosomes; however, the other ribosomes with different ASD sequences show robust start peaks as well (Figure 4F). Similarly, low-IS sites that are not translated by S-ribosomes (light red in Figure 4E) are also not translated by the other ribosomes (Figure 4G). The observation that SD-ASD pairing does not contribute to initiation at these sites with high affinity to the S-ribosomes also holds true for non-annotated AUGs with high affinity to the wild-type ASD (Figure 4—figure supplement 1). Once again, these data argue that AUGs that recruit ribosomes and lead to initiation are hard-wired for this purpose irrespective of the strength of the mRNA-rRNA base pairing interaction. Taken together, these data on initiating ribosomes show that mRNA-rRNA base pairing is neither necessary nor sufficient for translational initiation.
A-rich sequences upstream of start codons promote initiation
To provide insight into mRNA features other than SD strength that might contribute to ribosome recruitment, we asked which features are enriched at annotated start sites. To avoid interference from SD motifs, we selected only annotated start sites with low affinity to the wild-type ASD (∆G > 0) and compared them to non-annotated AUG codons, most of which do not lead to initiation. We observed enrichment of adenosines (A) at many sites within 15 nt upstream of the start codon and 5 nt downstream (Figure 5A).
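One way to picture the enrichment analysis described above is a position-by-position comparison of base frequencies in the sequences around annotated start sites (foreground) against those around non-annotated AUGs (background), scored with a binomial tail probability. The Python sketch below is an assumption-laden illustration: the function names and toy sequences are hypothetical, it tests only over-representation of a single base, and the exact binomial sum is practical only for small sets (a genome-scale analysis would need a log-space or library implementation).

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed by direct summation."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def base_enrichment(foreground, background, base="A"):
    """Per-position enrichment of `base` in foreground relative to background.

    All sequences must be aligned and of equal length. Returns a list of
    (position, foreground frequency, background frequency, P-value) tuples.
    """
    length = len(foreground[0])
    results = []
    for pos in range(length):
        fg_hits = sum(seq[pos] == base for seq in foreground)
        bg_freq = sum(seq[pos] == base for seq in background) / len(background)
        p_value = binomial_upper_tail(fg_hits, len(foreground), bg_freq)
        results.append((pos, fg_hits / len(foreground), bg_freq, p_value))
    return results

# Toy aligned sequences, invented for illustration only.
foreground = ["AAG", "AAC", "AUG", "ACG"]
background = ["AUG", "CUG", "GUG", "UUG"]
enrichment = base_enrichment(foreground, background)
```

A small P-value at a position indicates that the base occurs there more often in the foreground set than expected from its background frequency.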

Figure 5. A-rich sequences as a signal for start codon selection.
(A) Probability logo of the region surrounding annotated AUGs with low affinity to the wild-type ASD sequence (∆G > 0) as compared with all non-annotated AUGs in the transcriptome. Enriched nucleotides are shown above the axis and depleted nucleotides below the axis. The height of the letter represents the binomial P-value. (B) Design of the reporter assay. The reporter plasmid encodes mCherry with a strong ribosome binding site (RBS) and separately GFP downstream of a region containing a start site of interest (30 nt upstream of AUG and 42 nt downstream). (C) Initiation sites used in the reporter assay; the number indicates the genomic position of AUG. In the T- and C-mutants, the A’s upstream of AUG (highlighted in green) were substituted by T or C. (D) Results of the reporter assay. Each dot is the median of GFP/mCherry from an independent run of flow cytometry. The bar graph indicates the mean and standard deviation from four independent tests. NoGFP (a plasmid that encodes mCherry but not GFP) serves as a control showing the baseline signal from cellular autofluorescence; the other data are normalized to this ratio. (E) Median (solid line) and interquartile range (shaded) of mRNA structure in SHAPE-MaPseq data for 365 annotated start sites (red) and 7310 non-annotated AUGs within coding sequences (blue).
To test whether these A’s promote translation, we selected four mRNAs with A-rich initiation sites (and weak SD motifs) and established a GFP reporter assay to follow their activity (Figure 5B). Of these four mRNAs (Figure 5C), two contain annotated initiation sites with low ASD-affinity, the start codons from yhbY and gsk. We also selected two representative non-annotated AUG codons found within the creA and yeiR genes; these sites have high IS values in both the C-ribosome and O-ribosome MS2RP data from retapamulin-treated cells. The sequences surrounding these four AUG codons (from 30 nt upstream to 45 nt downstream) were fused in frame to GFP such that GFP fluorescence reports on the activity of the AUG of interest. In addition to the wild-type sequence, we constructed mutants in which all of the A’s within the 15 nucleotides upstream of AUG were changed to either U’s or C’s (G’s were avoided because they have high affinity for the ASD). The reference protein mCherry was also expressed from the same plasmid with a standard ribosome binding site. The GFP/mCherry ratio was then normalized to a control lacking the GFP sequence (measuring only cellular autofluorescence).
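The normalization described above can be sketched as follows. The sketch assumes that each run is summarized by the median of per-cell GFP/mCherry ratios and that sample runs are divided by the mean ratio of the control lacking GFP; the text does not spell out either detail, and the function names and numbers here are hypothetical.

```python
from statistics import mean, median, stdev

def run_ratio(gfp, mcherry):
    """Median per-cell GFP/mCherry ratio for one flow cytometry run."""
    return median(g / m for g, m in zip(gfp, mcherry))

def normalized_summary(sample_runs, control_runs):
    """Mean and standard deviation of run medians, scaled to the no-GFP control."""
    baseline = mean(control_runs)
    scaled = [ratio / baseline for ratio in sample_runs]
    return mean(scaled), stdev(scaled)

# Toy fluorescence values, invented for illustration only.
one_run = run_ratio([2.0, 4.0, 6.0], [1.0, 1.0, 2.0])
avg, sd = normalized_summary([4.0, 6.0], [2.0, 2.0])
```

After this scaling, a value of 1 corresponds to the autofluorescence baseline, so values above 1 indicate GFP expression driven by the AUG of interest.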
We observed that the GFP/mCherry ratio was higher than background for all four AUG codons, showing that all are capable of driving GFP expression (Figure 5D). The two annotated start sites from yhbY and gsk induced stronger GFP expression than the non-annotated start sites, creA* and yeiR*. Importantly, however, the fact that fluorescence was observed from these latter examples confirms the results from the MS2RP data from retapamulin-treated cells showing that they are translated to some extent by wild-type ribosomes. We observed that replacement of the A’s with U’s lowered GFP expression in all cases except for yeiR* which showed the weakest GFP expression. A stronger effect was observed by changing the A’s to C’s, which led to complete loss of GFP fluorescence from all four AUG contexts tested. These results support our hypothesis that A-rich sequences upstream of start codons contribute to the identification of translational start sites.
The ability of A-rich sequences to promote initiation is likely not limited to E. coli: when we compared the local context of AUG codons in annotated start sites vs. non-annotated AUG codons for a set of diverse bacteria, we again saw that A-rich sequences were enriched (Figure 5—figure supplement 1). For E. coli and most other species examined, the enrichment of A’s was weaker than the enrichment of G’s corresponding to the SD sequence, but for Mycoplasma pneumoniae and Flavobacterium johnsoniae, the SD signal is not observed and there the enrichment of A’s is particularly striking. A-rich sequences are highly conserved and may serve as an important mechanism for start site selection in these species, while contributing broadly to more diverse species.
mRNA structure is lower at annotated start sites than at non-annotated AUG codons
In bacteria, mRNA structure surrounding the start codon has been shown in mechanistic studies to reduce ribosomal occupancy (Lodish, 1970; de Smit and van Duin, 1990; de Smit and van Duin, 2003; Espah Borujeni and Salis, 2016). Moreover, several transcriptome-wide analyses of mRNA structure in E. coli show lower levels of structure surrounding initiation sites (Del Campo et al., 2015; Burkhardt et al., 2017). We asked how mRNA structure differs between annotated start sites and internal AUG codons that are not annotated as start sites. We used data from a recent study of the structure of mRNAs in vivo using SHAPE and deep sequencing (Mustoe et al., 2018). From transcripts with sufficient coverage, we calculated the median SHAPE reactivity over a 120 nt window surrounding 365 annotated start sites and compared it to 7310 non-annotated AUGs (Figure 5E). For annotated initiation sites, the level of mRNA structure is significantly lower for a region 30 nt in length on both sides of the AUG codon (shown in red) as previously reported (Del Campo et al., 2015; Burkhardt et al., 2017). In contrast, except for a sharp dip in reactivity at the aligned AUG codon due to sequence bias, we see that mRNA structure is consistently high across this window for the set of non-annotated AUGs (shown in blue). These differences may be due in part to the ability of ribosomes to melt RNA structure during translation; indeed, initiation leads to the unfolding of RNA, which facilitates initiation by another 30S subunit (Espah Borujeni and Salis, 2016; Andreeva et al., 2018). But, given that SHAPE and DMS reactivity of mRNAs in vivo and in vitro are strongly correlated (Burkhardt et al., 2017; Mustoe et al., 2018), it is also likely that mRNA structure plays a causal role in setting initiation rates.
Discussion
In this study, we performed ribosome profiling on mutant ribosomes purified using an RNA tag, the MS2 aptamer, a strategy we call MS2RP (Figure 1). Originally developed for in vitro studies of ribosomes containing lethal rRNA mutations (Youngman et al., 2004; Youngman and Green, 2005), MS2-tagged ribosomes also have potential to yield insights into the function of key rRNA sequences in vivo. In addition to the studies of the ASD sequence in 16S rRNA reported here, MS2RP could be employed to characterize the functions of rRNA domains on initiation, elongation, termination, and recycling at a genome-wide level in vivo. Because MS2RP can be performed on rRNA mutants expressed from plasmids, the method can be easily transferred to other bacteria or to eukaryotes without altering rDNA in the genome. Of particular interest are rRNA variants in bacterial genomes that are expressed differentially in response to changes in the environment and are proposed to have different specificities or functions (Kurylo et al., 2018; Song et al., 2019). Variant rRNA alleles have also been reported for eukaryotic cells (Parks et al., 2018); for example, different small subunit rRNA alleles are expressed in various developmental stages in Plasmodium (Gunderson et al., 1987). In addition, the functions of the highly variable rRNA expansion segments in eukaryotes are poorly understood (Spahn et al., 2001; Anger et al., 2013). MS2RP could be a powerful tool to elucidate the activities of various subpopulations of variant or mutated rRNAs.
Previous genome-wide studies in bacteria have shown little or no correlation between SD strength and ribosome occupancy (Li et al., 2014; Schrader et al., 2014; Del Campo et al., 2015). Using MS2RP, we are able for the first time to reveal the role of SD motifs in promoting initiation across the transcriptome. In our approach, we mutated the ASD on the ribosomes rather than the SD motifs on the mRNAs, leaving mRNA sequence and structure intact and allowing us to isolate the effects of the SD:ASD interaction on translation. In the absence of SD:ASD pairing, we observed a strong negative correlation between ribosome occupancy and the SD strength (calculated by pairing with the wild-type ASD sequence). In other words, the mutant ribosomes translate genes with strong SD motifs worse than those with weak SD motifs (Figure 2B). There are two possible explanations for this negative correlation. It may be that the binding of wild-type ribosomes to mRNAs with strong SD motifs occludes their ribosome-binding sites, preventing mutant ribosomes from initiating and efficiently translating these genes. Alternatively, mRNA structure and other features may outweigh the impact of SD motifs, masking their effects, which would explain why conventional ribosome profiling studies failed to observe correlations between SD strength and ribosome occupancy. Regardless of which of these explanations is correct, the MS2RP strategy allows us to subtract the cumulative contribution of all such other mRNA features to ribosome occupancy, and thus to focus exclusively on the contribution of the SD:ASD interaction genome-wide. In this analysis, we are now able to see a linear correlation between the SD strength of an mRNA and protein output (Figure 2C).
Given that the SD motif functions through a well-defined mechanism and is widely conserved throughout bacteria, it has been thought to provide an important mechanism for start codon selection and translational output. Consistent with such a view, SD motifs are underrepresented within ORFs in order to avoid spurious initiation at internal start codons (Hockenberry et al., 2018). Strikingly, however, we find that ribosomes with altered ASDs still find the correct start codons about as efficiently as wild-type ribosomes (Figure 3). Start peaks for all four ribosome types are observed at annotated start sites regardless of the affinity of the ribosome binding site for the ASD. This shows that initiation sites are hard-wired for initiation based on mRNA features separate from the potential for SD-ASD pairing. These observations also hold true at the occasional non-annotated AUG codons where some initiation occurs (Figure 4). These data are consistent with the conclusion that SD motifs are not essential for determining where translation starts on mRNAs genome-wide.
What, then, are other mechanisms that could be used for start codon selection? Local mRNA structure and RNA folding kinetics clearly must play a critical role in allowing ribosomes to find the start codon. A number of mechanistic studies have demonstrated that RNA structure around the start codon lowers translation levels (Hall et al., 1982; de Smit and van Duin, 1990; Osterman et al., 2013; Espah Borujeni et al., 2014). Studies of factors that alter the expression of simplified reporter genes (involving randomization of the 5’-UTR or coding sequences) show that lack of secondary structure surrounding the initiation site has the most significant correlation with protein output (Salis et al., 2009; Kudla et al., 2009; Goodman et al., 2013). Recent transcriptome-wide analyses of mRNA structure in E. coli confirm that annotated start sites have lower levels of mRNA structure, as seen by PARS on purified mRNA and DMS-seq in vivo (Del Campo et al., 2015; Burkhardt et al., 2017). mRNA structure is likely an important factor in start site selection: using high-resolution SHAPE-MaPseq data (Mustoe et al., 2018), we showed that annotated AUGs have lower levels of RNA structure 30 nt upstream and downstream, whereas internal AUGs are not surrounded by regions of lower structure (Figure 5E).
Interestingly, in comparing the sequence context of AUG codons that are annotated as initiation sites with those that are not, we found that A’s are enriched both upstream and downstream of annotated initiation sites (Figure 5A) and we confirmed their importance in reporter assays (Figure 5B–D). These results from endogenous initiation sites are reminiscent of observations of the over-representation of A’s in 5’-UTR sequences selected for strong affinity to the ribosome in vitro (Gao et al., 2016) and in 5’-UTRs selected from random sequences upstream of a reporter gene for high levels of translation in vivo (Evfratov et al., 2017). Comparison of annotated start sites and non-annotated AUGs across several bacterial genomes shows that this mechanism is widespread (Figure 5—figure supplement 1). Although enrichment of A’s is more subtle than enrichment of G’s in E. coli and B. subtilis, in organisms that lack SD motifs, such as Mycoplasma pneumoniae and Flavobacterium johnsoniae, A-rich motifs may play an important role in initiation. Indeed, in a recent study, Fredrick and co-workers used ribosome profiling in F. johnsoniae and observed enrichment of A’s upstream of start codons in mRNAs with high ribosome occupancy in comparison to genes that are translated less efficiently (Baez et al., 2019). We envision that this sequence, like the Shine-Dalgarno motif, acts as a translational enhancer, fine-tuning the efficiency of initiation.
The mechanism by which A-rich sequences enhance initiation is not clear. The prevalence of A’s may alter the mRNA dynamics; A-rich sequences tend to have less secondary structure than GC-rich sequences. We note however that replacing A’s with U’s in several reporters reduced translation levels even though the U’s are similarly not expected to yield strong structures. A second possibility is that ribosomal components may interact specifically with A’s close to the start codon that are bound inside the ribosome during initiation. Fredrick and co-workers used reporter assays to show that mutation of a particular A at the −3 position reduces expression; this result is intriguing because the classic Kozak sequence (GCC(A/G)CCAUG) that promotes high levels of translation in eukaryotes also contains a purine at position −3. A-rich sequences have been reported to enhance translation in a variety of eukaryotic contexts including Drosophila and wheat germ and reticulocyte lysates (Ranjan and Hasnain, 1995; Sano et al., 2002; Suzuki et al., 2006; Pfeiffer et al., 2012). It may be that A-rich sequences interact with conserved elements of the ribosome across the domains of life. A’s further from the start codon (10–20 nt upstream) may interact with bacteria-specific ribosomal protein S1. bS1 preferably binds to A/U-rich sequence elements upstream of SD sequences (Boni et al., 1991; Komarova et al., 2005) and is thought to unwind mRNA structure to induce initiation (Qu et al., 2012; Duval et al., 2013).
Our findings have broad implications for the evolution of translational mechanisms in bacteria. Not all bacteria utilize SD motifs to promote translational initiation—SD motifs are notably lacking in Bacteroidetes and Cyanobacteria. Because the prevalence of SD motifs is a feature of the genome in general and not of a single gene, it makes sense that evolutionary selective pressure for or against SD usage would act at the level of the transcriptome. The nature of these selective pressures remains unclear, although Hockenberry recently argued that bacteria with high levels of SD usage tend to have higher maximal growth rates (Hockenberry et al., 2017). Future studies will clarify the evolutionary relationship between the growth environment, levels of SD usage among bacterial species, and their transcriptome-wide effects.
Materials and methods
Growth conditions
Unless otherwise specified, cells were cultured at 37°C in 500 mL of LB + ampicillin (50 mg/L). IPTG was added (0.3 mM final) when the culture reached OD600 = 0.3 and cells were harvested by filtration at OD600 = 0.5. For profiling with retapamulin, cells were grown at 37°C in 500 mL of LB + ampicillin to OD600 = 0.3, induced with IPTG, grown to OD600 = 0.45, and then harvested by filtration 5 min after the addition of retapamulin (100 µg/mL final).
Cell harvest and lysis
Cells were harvested by filtration using a Kontes 99 mm filtration apparatus and 0.45 µm nitrocellulose filter (Whatman) and then flash frozen in liquid nitrogen. Cells were lysed in lysis buffer (20 mM Tris pH 8.0, 10 mM MgCl2, 100 mM NH4Cl, 5 mM CaCl2, 100 U/mL DNase I, and 1 mM chloramphenicol) using a Spex 6870 freezer mill with 5 cycles of 1 min grinding at 5 Hz and 1 min cooling. Lysates were centrifuged at 20,000 g for 30 min at 4°C to pellet cell debris.
Overexpression and purification of MBP-MS2-His protein
BL21(DE3) cells were transformed with the plasmid pMal-c2G-MBP-MS2-His, cultured at 37°C in LB + ampicillin (50 mg/L) to OD600 = 0.7, and induced with 0.3 mM final IPTG for 4 hr at 37°C. Cells were harvested by centrifugation and lysed on a French press in the binding buffer (50 mM NaH2PO4 pH 8.0, 300 mM NaCl, 10 mM imidazole, 6 mM BME). The MBP-MS2 protein was purified by FPLC (ÄKTA, GE); after washes with the binding buffer, it was eluted with the binding buffer supplemented with 200 mM imidazole.
Affinity purification of MS2-tagged ribosomes
3 mL of amylose resin (NEB) were transferred to a Poly-Prep Chromatography Column (Bio-Rad) and washed 3 times with 10 mL of lysis buffer. 2.5 mg of MBP-MS2-His protein were loaded onto the amylose resin, incubated at 4°C for 1 hr, and washed twice with 10 mL of lysis buffer. For MS2RP, 1.5 mL of cell lysate and 15 µL of RNase T1 (1000 U/µL, Thermo) were loaded onto the MBP-MS2 resin, incubated at 4°C for 2 hr, and washed 3 times with 10 mL of lysis buffer. The resin was re-suspended in 1 mL of lysis buffer and 360 µg MNase was added to digest mRNA and remove the MS2 hairpin in rRNA, releasing the ribosomes from the column. Following a 2 hr incubation at 25°C, the flow-through was collected. Another 2 mL of lysis buffer was passed through the resin and collected. The flow-through fractions were then combined.
Sucrose density gradient centrifugation
10–54% sucrose density gradients were prepared using the Gradient Master 108 (Biocomp) in the gradient buffer (20 mM Tris pH 8.0, 10 mM MgCl2, 100 mM NH4Cl, 2 mM DTT). 5–20 AU of E. coli lysate was loaded on top of the sucrose gradient and centrifuged in a SW41 rotor at 35,000 rpm for 2.5 hr at 4°C. Fractionation was performed on a Piston Gradient Fractionator (Biocomp).
Library preparation
Libraries for MS2RP and standard ribosome profiling were prepared as in Woolstenhulme et al. (2015) and Mohammad et al. (2016). At least two biological replicates were performed for each MS2RP library as detailed in the GEO database entry. RNA-seq libraries were prepared with TruSeq Stranded Total RNA Gold from 250 ng of total RNA following depletion of rRNA by RiboZero rRNA Removal Kit for bacteria (Illumina). Libraries were analyzed by BioAnalyzer high sensitivity DNA kit (Agilent) and then sequenced on the HiSeq2500 (Illumina).
Analysis of rRNA purity by RT-PCR
RNA was purified by hot-phenol extraction. The first strand synthesis was performed with 500 ng of total RNA, primer MS2check_R (5’-AGACATTACTCACCCGTCCGCCACTC-3’) and SuperScript III (Invitrogen). 15 cycles of PCR amplification were performed with primer MS2check_F70 (5’-TGCAAGTCGAACGGTAACAGGAAG-3’), primer MS2check_R, and Phusion polymerase (NEB). PCR products were resolved by 8% TEB gel and analyzed by Typhoon FLA 9500 (GE).
GFP/mCherry assay
MG1655 cells carrying the reporter plasmid were cultured in LB + ampicillin (50 mg/L) to early log phase. Cells were diluted 50-fold in TBS. GFP and mCherry fluorescence were measured on a Guava easyCyte flow cytometer (Millipore Sigma).
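The Results and the Figure 5D legend describe summarizing each flow cytometry run by its median GFP/mCherry ratio and normalizing to the NoGFP control. A minimal Python sketch of that arithmetic follows. The function names are hypothetical, and two details are assumptions rather than the authors' stated procedure: the ratio is taken per cell before the median, and the baseline is the mean of the control runs.

```python
from statistics import mean, median, stdev

def run_ratio(gfp, mcherry):
    """Median per-cell GFP/mCherry ratio for one flow cytometry run.

    Assumption: the ratio is computed per cell and then summarized by the
    median; cells with non-positive mCherry signal are skipped.
    """
    return median(g / m for g, m in zip(gfp, mcherry) if m > 0)

def fold_over_background(sample_runs, control_runs):
    """Normalize per-run ratios to the NoGFP control and summarize them.

    sample_runs, control_runs: lists of per-run median ratios.
    Assumption: the baseline is the mean of the control runs.
    Returns (mean, sample standard deviation) of the normalized ratios.
    """
    baseline = mean(control_runs)
    normalized = [r / baseline for r in sample_runs]
    return mean(normalized), stdev(normalized)
```

With this normalization, a value near 1 would indicate signal indistinguishable from cellular autofluorescence.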
General processing of sequencing data
For libraries prepared with a UMI-containing linker (rAppNNNNNNCACTCGGGCACCAAGGAC), perfectly matching reads (including the 5’-end and 3’-end UMIs) were collapsed to a single read by Tally (Davis et al., 2013). 3’-linker sequences were removed by Skewer (Jiang et al., 2014). The 5’-end UMI added by the RT primer was removed by seqtk. Reads were aligned using bowtie version 1.1.2 (Langmead et al., 2009), first to the tRNAs, rRNAs, and the ssrA, ssrS, lacI and ffs genes. Reads that failed to align to those sequences were then aligned to the E. coli MG1655 genome (NC_000913.2). Ribosome position was assigned by the 3’-end of aligned reads. RNA-seq data were assigned by the 5’-end of aligned reads.
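The 3’-end assignment step can be illustrated with a short Python sketch. This is not the authors' script: the input format (a list of `(start, length, strand)` tuples with 0-based leftmost coordinates) and the function names are assumptions made for illustration, and the reads-per-million scaling assumes normalization to the total number of reads supplied.

```python
from collections import Counter

def three_prime_end(start, length, strand):
    """Genomic coordinate of the 3' end of an aligned read.

    start: 0-based leftmost coordinate of the alignment (assumed format).
    On the minus strand the 3' end is the leftmost aligned base.
    """
    return start + length - 1 if strand == "+" else start

def ribosome_density_rpm(alignments):
    """Count reads by 3'-end position and strand, scaled to reads per million."""
    counts = Counter()
    for start, length, strand in alignments:
        counts[(three_prime_end(start, length, strand), strand)] += 1
    total = sum(counts.values())
    return {key: n * 1e6 / total for key, n in counts.items()}
```

For RNA-seq data, the analogous function would return `start` on the plus strand and `start + length - 1` on the minus strand, since those reads are assigned by their 5’ ends.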
Calculation of ∆G
The affinity (∆G) between the ASD and the sequence upstream of the start codon was calculated for each mRNA using free_scan with the ‘-l 0 -b 0’ options to disallow internal loops and internal bulges (Nakagawa et al., 2010). The input sequences were the region from −15 to −6 nt relative to the AUG and the reverse sequence of the wild-type ASD (UCCUCCA) or the mutant ASD where appropriate.
Analyses of genome-wide mRNA structural data
Average SHAPE reactivity was based on the SHAPE-MaP data (Mustoe et al., 2018). The median SHAPE reactivity in the region from −25 to +25 nt relative to the start codon was used as a measure of the degree of RNA structure.
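As a rough illustration of the windowed median described here, the following Python sketch assumes that reactivities are available as a dictionary keyed by transcript position, with `None` marking positions that have no data. Both the data layout and the function name are hypothetical.

```python
from statistics import median

def window_median_reactivity(reactivity, start_pos, flank=25):
    """Median SHAPE reactivity from -flank to +flank around a start codon.

    reactivity: dict mapping position -> reactivity, or None where no data
    exist (assumed layout). start_pos: position of the A of the AUG.
    Returns None if the window contains no usable values.
    """
    values = [
        reactivity[pos]
        for pos in range(start_pos - flank, start_pos + flank + 1)
        if reactivity.get(pos) is not None
    ]
    return median(values) if values else None
```

Because SHAPE reagents react preferentially with flexible, unpaired nucleotides, a higher median in this window corresponds to less structure around the start codon.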
Analyses of initiation peaks in samples treated with retapamulin
AUG codons were only included in the analysis of average ribosome density and initiation scores if they had more than 10 mapped reads in the window of −50 upstream and +50 downstream of the AUG. To calculate average ribosome density, for each AUG we took the rpm at each position across this window, divided it by the total rpm in the window, and then computed the mean of these values for all AUGs included in the calculation. Initiation scores were computed by taking the mean of reads mapped within +3 to +21 nt downstream of the A in AUG and dividing it by the mean of reads mapped on the region −50 to +50 of the AUG.
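The two calculations above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' code: read counts are held in a dictionary keyed by genomic position, only the forward strand is handled, offset 0 is the A of the AUG, and raw counts stand in for rpm (both quantities are ratios within a window, so a library-wide scaling factor cancels).

```python
def window_counts(reads, aug_pos, upstream=50, downstream=50):
    """Read counts at each offset from -upstream to +downstream of the AUG."""
    return [reads.get(aug_pos + off, 0) for off in range(-upstream, downstream + 1)]

def initiation_score(reads, aug_pos, min_reads=10):
    """Mean signal at +3..+21 divided by mean signal over -50..+50.

    Returns None when the window has 10 or fewer mapped reads.
    """
    window = window_counts(reads, aug_pos)
    if sum(window) <= min_reads:
        return None
    peak = [reads.get(aug_pos + off, 0) for off in range(3, 22)]
    return (sum(peak) / len(peak)) / (sum(window) / len(window))

def average_density(reads, aug_positions, min_reads=10):
    """Mean of window-normalized profiles across all AUGs passing the filter."""
    profiles = []
    for pos in aug_positions:
        window = window_counts(reads, pos)
        total = sum(window)
        if total > min_reads:
            profiles.append([count / total for count in window])
    # Average column-wise so each offset gets one value across AUGs.
    return [sum(col) / len(col) for col in zip(*profiles)]
```

Dividing each window by its own total before averaging keeps highly expressed genes from dominating the metagene profile.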
Probability logo
Probability logos were generated by kpLogo (Wu and Bartel, 2017) using its default settings. For Figure 5A, input and background sequences are described in the figure legend. For Figure 5—figure supplement 1 the set of input sequences consisted of annotated AUGs from the GFF file available at NCBI and the set of background sequences consisted of all AUGs in the genome that were not annotated as initiation sites.
Data availability
The sequencing data are available in processed WIG format at the GEO using accession number GSE135906 and as the raw FASTQ files at the SRA. Custom Python scripts used to analyze the sequencing data are freely available at https://github.com/greenlabjhmi/2019_SDASD (Saito, 2020; copy archived at https://github.com/elifesciences-publications/2019_SDASD).
References
- Assembly and functionality of the ribosome with tethered subunits. Nature Communications 10:1–13. https://doi.org/10.1038/s41467-019-08892-w
- Regulation of translation initiation by RNA binding proteins. Annual Review of Microbiology 63:27–44. https://doi.org/10.1146/annurev.micro.091208.073514
- Ribosome-messenger recognition: mRNA target sites for ribosomal protein S1. Nucleic Acids Research 19:155–162. https://doi.org/10.1093/nar/19.1.155
- Translational standby sites: how ribosomes may deal with the rapid folding kinetics of mRNA. Journal of Molecular Biology 331:737–743. https://doi.org/10.1016/S0022-2836(03)00809-X
- Translation initiation is controlled by RNA folding kinetics via a ribosome drafting mechanism. Journal of the American Chemical Society 138:7016–7023. https://doi.org/10.1021/jacs.6b01453
- Diversity of translation initiation mechanisms across bacterial species is driven by environmental conditions and growth demands. Molecular Biology and Evolution 35:582–592. https://doi.org/10.1093/molbev/msx310
- Within-Gene Shine-Dalgarno sequences are not selected for function. Molecular Biology and Evolution 35:2487–2498. https://doi.org/10.1093/molbev/msy150
- Genetic analysis of the Shine-Dalgarno interaction: selection of alternative functional mRNA-rRNA combinations. RNA 2:1270–1285.
- How do Bacteria tune translation efficiency? Current Opinion in Microbiology 24:66–71. https://doi.org/10.1016/j.mib.2015.01.001
- Secondary structure of bacteriophage f2 ribonucleic acid and the initiation of in vitro protein biosynthesis. Journal of Molecular Biology 50:689–702. https://doi.org/10.1016/0022-2836(70)90093-8
- Gene regulation by riboswitches. Nature Reviews Molecular Cell Biology 5:451–463. https://doi.org/10.1038/nrm1403
- Comparison of mRNA features affecting translation initiation and reinitiation. Nucleic Acids Research 41:474–486. https://doi.org/10.1093/nar/gks989
- A network of orthogonal ribosome x mRNA pairs. Nature Chemical Biology 1:159–166. https://doi.org/10.1038/nchembio719
- The mechanism of translational coupling in Escherichia coli. Higher order structure in the atpHA mRNA acts as a conformational switch regulating the access of de novo initiating ribosomes. The Journal of Biological Chemistry 269:18118–18127.
- Automated design of synthetic ribosome binding sites to control protein expression. Nature Biotechnology 27:946–950. https://doi.org/10.1038/nbt.1568
- Divergent rRNAs as regulators of gene expression at the ribosome level. Nature Microbiology 4:515–526. https://doi.org/10.1038/s41564-018-0341-1
- Controlling mRNA stability and translation with small, noncoding RNAs. Current Opinion in Microbiology 7:140–144. https://doi.org/10.1016/j.mib.2004.02.015
- Performance of expression vector, pTD1, in insect cell-free translation system. Journal of Bioscience and Bioengineering 102:69–71. https://doi.org/10.1263/jbb.102.69
- kpLogo: positional k-mer analysis reveals hidden specificity in biological sequences. Nucleic Acids Research 45:W534–W538. https://doi.org/10.1093/nar/gkx323
Decision letter
- Joseph T Wade, Reviewing Editor; Wadsworth Center, New York State Department of Health, United States
- James L Manley, Senior Editor; Columbia University, United States
- Joseph T Wade, Reviewer; Wadsworth Center, New York State Department of Health, United States
- Alexander Mankin, Reviewer; University of Illinois, Chicago, United States
In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.
Acceptance summary:
This paper uses an innovative twist on ribosome profiling to investigate the importance of Shine-Dalgarno sequences in bacterial translation initiation. Surprisingly, the data show that strong base-pairing between the 16S ribosomal RNA and the mRNA Shine-Dalgarno sequence is neither necessary nor sufficient for translation initiation. This suggests that start-codons are "hard-wired" into the genome, largely independent of Shine-Dalgarno sequence.
Decision letter after peer review:
[Editors’ note: the authors submitted for reconsideration following the decision after peer review. What follows is the decision letter after the first round of review.]
Thank you for submitting your work entitled "Shine-Dalgarno sequences fine-tune translation genome-wide but are not the primary determinants of start-site selection" for consideration by eLife. Your article has been reviewed by three peer reviewers, including Joe Wade as the Reviewing Editor and Reviewer #1, and the evaluation has been overseen by Jim Manley as the Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Shura Mankin (Reviewer #2).
Our decision has been reached after an extensive discussion involving the three reviewers. The reviewers were enthusiastic about parts of the manuscript, in particular the method itself; however, there was some disagreement as to the significance of the work. Much of the discussion focused on the data in Figure 6, which the reviewers considered to present potentially the most important result. We felt that further analysis is needed to fully address the key questions of (i) whether annotated start codons are inherently good at binding ribosomes, independent of the SD, and (ii) whether an SD alone (e.g. next to an ATG in the middle of an ORF) is insufficient to bind a ribosome. Put more simply, we felt that further analysis is required to show that start-codons are "hard-wired" into the genome, largely independent of SD sequence. We are therefore rejecting the paper because the outcome of the new analysis is unclear. Nonetheless, we would be willing to consider a revised version if the analyses suggested below, or something equivalent, provide stronger support for the idea that start-codons are hard-wired, independent of the SD sequence.
The main concerns with Figure 6 are that (i) the control set of ORF-internal ATGs is not the best control because many (most?) of the ATGs don't have a good SD sequence for the modified ribosomes; and (ii) ribosome density at annotated start codons for the modified ribosomes could be due to a subset of start codons that have decent matches to the modified ASD sequence. We suggest that a more appropriate control set of ATGs would be those with a good predicted match to the altered ASD sequence. We also suggest limiting the analysis in Figure 6C to start codons that have a poor match to the modified ASD. Another way to look at this would be to compare which annotated start codons are recognized by the different modified ribosomes; if all three types of ribosome recognize the same subset of start codons, it's safe to conclude that this is occurring independent of the SD. If these (or other) analyses can provide stronger support for the "hard-wired" model, that would likely be sufficient for publication. In addition to the re-analysis of data from Figure 6, it's important to improve the clarity of the paper, which was at times confusing (see the detailed reviews for more information on this). Additionally, reviewer #3 makes some important points about the calculation of hybridization energies, such as considering a full, 9 nt SD sequence with variable spacing. Lastly, the manuscript would benefit from a clearer description of what is already known about features other than SD sequence that contribute to translation initiation (see comments from reviewer 3).
Reviewer #1:
This paper describes an innovative approach to probe the importance of Shine-Dalgarno (S-D) sequences in translation initiation in Escherichia coli. By performing ribosome profiling on modified ribosomes, the authors are able to observe translation by ribosomes with altered anti-S-D sequences. This method reveals that despite no correlation between S-D strength and translation levels, there is a contribution of S-D strength that is apparent when all confounding factors have been controlled for. Interestingly, this effect of S-Ds is lost during other growth conditions, although for cold shock that is largely consistent with previous work, and it is unclear what the mechanism is in stationary phase. While I think the topic is interesting and the primary method is ingenious, I'm not convinced that the authors have learned much about the relative importance of S-Ds in translation initiation. As they acknowledge, previous studies have failed to see a correlation between S-D strength and translation initiation levels, and the importance of secondary structure and of A-rich sequences has been described previously. The fact that predictions of S-D strength correlate with translation initiation levels once factors other than S-D have been accounted for indicates that these predictions are fairly accurate. This is important, since it accounts for the possibility that the lack of correlation between predicted S-D strength and translation initiation is because of our inability to predict S-D strength. However, the impact of this advance is small. I also have concerns about the interpretation of Figures 6 and 7 that impact the overall conclusions.
- The presentation of the cold shock data is confusing. The overall conclusion is that S-D-dependence is lost at almost all genes during cold shock. However, a small subset of genes appears to depend strongly on the S-D. The distinction between the effect on the majority of genes and the effect on a small subset should be explained more clearly. The simplest interpretation of these data is that most start codons are highly structured during cold shock, but those that are do not rely on their S-Ds. This model is largely consistent with previous work.
- I disagree with the interpretation of Figure 6. The data show that for the altered ribosomes, annotated start codons are used far more efficiently than the collection of all other ATG sequences within ORFs. However, there are many more ATGs within ORFs than annotated start codons, and even if translation relies heavily on S-D sequences, you would expect that most ATGs within ORFs would not be selected by alternative ribosomes because only a small subset will have appropriate S-D sequences, and many may be weakly expressed. My interpretation of these data is that alternative ribosomes do use annotated start codons, but there is no way to tell how selectively they do this. A more appropriate comparison would be of (i) annotated start codons to (ii) ATGs within ORFs where the ATG is associated with a sequence that is predicted to function as a good S-D for the alternative ribosome.
- Another concern I have with Figure 6 is that presumably some, and perhaps many of the annotated start codons will have good SD matches for the alternative ribosomes. Figure 2C suggests that the number with good matches will be fairly high. Is the ribosome density at annotated start codons simply due to the subset of start codons that have reasonable SD matches to the altered ASD? Another way to think about this is to ask whether the start codons contributing to the signal in Figure 6C are the same start codons that contribute to the signal in Supplementary Figure 5A-B.
- Figure 7E shows the importance of an A-rich sequence in the context of start codons lacking a good S-D. Similar to Figure 6, these data highlight the contribution of non-SD sequences to translation initiation, but they do not provide any information about the relative importance of the different sequence elements.
Reviewer #2:
Major findings:
The paper of Saito et al. examines the contribution of Shine-Dalgarno sequence (SD) to the translation efficiency in bacteria. Using a clever approach, the authors use ribosome profiling to compare mRNA occupancy by wt ribosomes and ribosomes with the altered anti-SD sequence (ASD). In confirmation of previous findings from the Weissman lab, they find that the general translation efficiency does not correlate with the predicted strength of SD-ASD interactions. However, when all the other factors are masked, they observe a strong dependence of the initiation rate on the strength of SD-ASD pairing. They also noted that a subset of genes expressed in the stressed cells depend heavily on recognition of the SD sequence by the ribosomes. One of the unexpected, but highly important findings is the observation that the ribosomes with the altered ASD can nevertheless correctly and selectively initiate translation at the known start sites, underscoring the importance of factors other than SD-ASD interactions in the start codon selection. Importantly, the reported work reveals the prevalence of A-rich motifs in the ribosome binding sites of the genes with weak SD sequences in E. coli and other bacteria. This trend becomes especially prominent in the bacterial species that do not rely on SD-ASD interactions for translation initiation.
Critique:
This is an interesting, intriguing and important study. The results are nice and clean and the implications are important for unraveling the fundamental mechanism of translation initiation in bacteria. Although the paper is generally well written, it was hard at times to follow the authors' logic and I strongly encourage the authors to try to clarify the message, which often was hard to extract.
1) Here are several examples:
- Abstract: The statement "We reveal a genome-wide correlation between the SD strength and translational efficiency" is followed by "this global correlation is lost and a subset of genes […] becomes [dependent] on SD motifs for translation". This is hard to digest.
- Figure 4C legend ("the strength of the SD motifs determines whether wild-type or ASD mutants are recruited to messages") is supposed to contrast Figure 4F legend ("the unstructured SD motifs can recruit wild type ribosomes more effectively than they recruit ASD mutants"). However, they sound nearly identical and thus, do not accurately communicate the point the authors apparently are trying to make.
- "genes with strong SD motif are translated better by ribosomes with canonical ASD": better in comparison with the ASD-mutant ribosomes or better in comparison with the genes with weak SD?
2) Aleksashin et al., 2019, have shown that altering ASD in 16S rRNA compromises rRNA maturation. Although the presence of unprocessed sequences at the 5' and 3' end of the ASD-mutant 16S rRNA would not likely change the general conclusions of the paper, hypothetically it could affect the functionality and elongation rate of the mutant ribosomes. I am wondering whether authors have checked how well their mutant 16S rRNAs are processed. Irrespectively, I believe a more detailed discussion of the general functionality of the ribosomes with altered ASD, especially in relation to the elongation rate, would be beneficial.
3) Subsection “Gene-specific roles of SD motifs under stress”. The readers need a better explanation why the authors switched from ΔlogTE to ΔlogRPKM metrics when they move to the experiments in the stressed cells.
4) The influence of the competition between wt and mutant 30S subunits for the translation start sites on the conclusions drawn from ribosome profiling should be discussed.
Reviewer #3:
The authors introduce mutated 16S ribosomal RNAs into E. coli strains, altering their anti-Shine Dalgarno sequences, to investigate how these perturbations affect translation rates across the E. coli genome. To do this, they carry out ribosome profiling experiments to measure genome-wide ribosome densities on aSD-modified strains, including during exponential, stationary, and cold shock growth phases. Overall, they find that changing the last 9 nucleotides of the 16S rRNA has a significant effect on the transcriptomes' translation rates.
Overall, the collected measurements are interesting and potentially useful. However, the analysis suffers from a terribly incomplete knowledge of what controls a mRNA's translation rate. The statistics applied are tailored for a 1-factor problem, when in fact, there are many factors that control translation rate. There are also inconsistencies and errors in the authors' calculations that should be corrected. The authors' conclusions are not well supported by their analysis. The manuscript requires significant work for it to productively add to our knowledge of what controls translation rate in bacteria.
1) The authors focus primarily on the importance of the sequence colloquially known as the Shine-Dalgarno in controlling a mRNA's translation initiation rate. The authors write that "Initiation rates vary depending on how well an mRNA recruits 30S subunits to the start codon, and in bacteria, the working model is that this is accomplished primarily by Shine-Dalgarno (SD) motifs." This is incorrect. The current working model is that a mRNA's translation initiation rate is controlled by at least five important molecular interactions, only one of which is the hybridization between the last 9 nucleotides of the 16S rRNA and the mRNA. They include:
a) the hybridization between the last 9 nucleotides of the 16S rRNA and the mRNA;
b) the unfolding of mRNA structures that overlap with the ribosome footprint;
c) the differences in spacing (physical distance) between the 16S rRNA binding site and the start codon;
d) the standby site's accessibility, as determined by the length of available single-stranded RNA;
e) the start codon and its hybridization to the tRNA.
The free energy needed to unfold inhibitory mRNA structures is also affected by the dynamics of RNA folding (RNA folding kinetics) as well as the rate of ribosome binding.
If the authors better understood how translation rate was controlled, they could more productively use their measurements to push the real state-of-the-art forward. Their current conclusions are already subsumed within the state-of-the-art (i.e., nothing new).
2) Any discussion of "which translation rate interaction is most important" or "which translation rate interaction is responsible for X whereas the interaction Y only fine-tunes Z" is not productive and can easily be contradicted by selecting a real counter example. Overall, it is the binding free energy of the 30S ribosome to the mRNA that determines its translation initiation rate. Each of these interactions contributes free energy to this process and the magnitude of the contributed free energies can be roughly equal across a selection of real mRNA examples. There are unstructured mRNAs where there is little penalty for unfolding inhibitory mRNA structures. There are highly structured mRNAs that have consensus SD sequences. There are mRNAs that have consensus SD sequences far away from the start codon. All of these mRNAs could have the same translation rate. Which interaction is most important? That's not the right question to ask, because it's meaningless.
3) The manuscript's main topic is the Shine-Dalgarno sequence, but the authors should be made aware that at least the last 9 nucleotides of the 16S rRNA can contact the mRNA and hybridize to it. In E. coli, the anti-Shine Dalgarno sequence is 5'-ACCUCCUUA-3' and the "consensus" Shine-Dalgarno sequence is therefore 5'-TAAGGAGGT-3'. The manuscript text and the authors' calculations should reflect this.
4) The authors are mis-using the ribosome profiling measurements in their analysis. Ribosome profiling measurements do not directly measure translation rates. They measure mRNA-bound ribosome densities. A mRNA's ribosome density will depend on both its translation initiation rate AND its translation elongation rate. Specifically, in steady-state conditions, the ribosome density will be the ratio between these two quantities (initiation rate over elongation rate). In the initial applications of ribosome profiling, researchers assumed that all mRNAs have the same translation elongation rate in order to conclude that ribosome density measurements were proportional to translation initiation rates. This is not true. Coding sequences in mRNAs have very different translation elongation rates, due to differences in synonymous codon usage. Unless each mRNAs' translation elongation rates are predicted or directly measured, ribosome density measurements cannot be used to infer their translation initiation rates. Therefore, when the authors write "In pioneering ribosome profiling studies in bacteria, the paradoxical observation was made that there is little or no correlation between the translational efficiency of a gene and the strength of its SD motif (calculated using thermodynamic algorithms for RNA pairing), as had been anticipated based on the SD model." there is no actual paradox. The ribosome profiling measurements were not used correctly to test how mRNA sequences control translation rate.
5) Getting to the authors' main conclusions, they write that "These data indicate that the ASD mutant ribosomes translate genes with weak SD motifs better than genes with strong SD motifs, exactly the opposite of what wild-type ribosomes are expected to do." This statement is confusing given the real conclusion of the authors, that all other factors being equal a "strong" SD motif does result in higher translation than a "weak" SD motif. It's only because of other confounding factors that the initial analysis did not yield a positive correlation. An incorrect analysis (excluding confounding variables) cannot lead to a correct conclusion.
6) Figure 2C shows a very interesting and productive result, that the difference in translation efficiency between the wild-type "C" ribosomes and the A-ribosomes correlates to some degree with the hybridization free energy between the mRNA and (a portion of) the anti-SD sequence. This is a productive approach towards eliminating key confounding variables because, in principle, the strengths of the four other interactions that control translation initiation rate should not change when the 16S rRNA aSD sequences are changed. However, it's not apparent in the manuscript text, but the authors are using the modified 16S rRNAs to “eliminate the SD-aSD interaction as a contribution to the mRNA's translation rate”. So when they subtract the contribution from the modified A-ribosome's translation rates from the C-ribosome's translation rate, they are observing more directly the contribution from the SD-aSD interaction. The manuscript text should more clearly explain this experimental design. This is a creative and valid way of using ribosome profiling measurements.
7) However, the hybridization free energy calculations could be improved. First, as mentioned previously, the wild-type aSD sequence in E. coli is ACCUCCUUA. Second, the hybridization free energy calculation was only performed on the region from 15 to 6 nucleotides upstream of the start codon, but the aSD sequence can hybridize at other locations. Third, the hybridization between the mRNA and aSD can accommodate 1 or 2-nucleotide bulges or internal loops.
8) The measurements in cold shock are greatly confounded by the higher expression levels of RNA chaperones that are unfolding mRNA structures “at specific mRNAs” where the RNA chaperones recognize binding motifs. The conclusion here should be that RNA chaperones bind specific mRNAs, unfold their inhibitory mRNA structures, and increase their translation rates during cold shock. This is all independent of the Shine-Dalgarno sequence. This process also does not depend on many other uninteresting factors.
9) The use of ORF-wide GINI values is odd because it's generally only the region surrounding the start codon that affects its translation initiation rate, and not the structure of the entire ORF (which this coefficient is quantifying). Also, using the SHAPE reactivity around a start codon as a proxy for RNA structure is a bit misleading as ribosomes actively unfold RNA structures during translation initiation. A highly structured mRNA with a consensus SD sequence will have a high SHAPE reactivity (i.e., low RNA structure) because the ribosomes can rapidly bind to the mRNA and unfold the mRNA structure. SHAPE reactivity is measuring the effect of rapid ribosome binding and not the cause of it. Rapid ribosome binding can also be facilitated by slow RNA refolding kinetics, called "Ribosome Drafting" in the literature.
10) The data in Figure 6 just say that A-ribosomes can initiate translation at other start codons because they now have more negative binding free energies to those start codons, compared to the annotated ones. The authors could perform hybridization calculations using the A-ribosome's aSD sequence to investigate whether these "new start codons" have a nearby "SD" sequence that is complementary to the A-ribosome's aSD. That would be interesting.
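The relationship stated in Reviewer #3's point 4 (steady-state ribosome density as the ratio of initiation rate to elongation rate) can be illustrated with a minimal sketch. The function name and the numerical rates below are hypothetical and chosen only for illustration; the sketch also assumes no ribosome queuing or drop-off, which the review does not discuss.

```python
import math

def steady_state_density(initiation_rate, elongation_rate):
    """Ribosomes per codon at steady state.

    initiation_rate: ribosomes loaded per second (hypothetical value)
    elongation_rate: codons translated per second (hypothetical value)
    Simplifying assumption: no queuing and no drop-off along the ORF.
    """
    return initiation_rate / elongation_rate

# Two hypothetical mRNAs with the SAME initiation rate but different
# elongation rates end up with different ribosome densities.
fast_orf = steady_state_density(initiation_rate=0.2, elongation_rate=20.0)
slow_orf = steady_state_density(initiation_rate=0.2, elongation_rate=10.0)

# The slower ORF carries twice the density despite identical initiation.
print(math.isclose(slow_orf / fast_orf, 2.0))
```

Under these assumptions, a density difference between two genes cannot by itself be attributed to initiation unless their elongation rates are comparable, which is the reviewer's concern.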
[Editors’ note: further revisions were suggested prior to acceptance, as described below.]
Thank you for resubmitting your work entitled "Translational initiation in E. coli occurs at the correct sites genome-wide in the absence of mRNA-rRNA base-pairing" for further consideration by eLife. Your revised article has been evaluated by James Manley as the Senior Editor, and three reviewers, including Joe Wade as the Reviewing Editor and Reviewer #1.
The reviewers and editors agree that the revised manuscript is greatly improved, and we are pleased to provisionally accept the manuscript. We ask that you make a few small changes in response to the reviewers' comments. First, reviewer 2 has two minor concerns that are easily addressed. Second, based on reviewer 3's comments, the conclusions regarding the importance of A-rich sequences should be softened a little. The reviewers' comments are listed below:
Reviewer #1:
The authors have done an excellent job improving the manuscript. Removing the data on cold-shock has improved the focus and readability. Moreover, the new analyses in Figures 3 and 4 make a more compelling case that Shine-Dalgarno sequences are neither necessary nor sufficient for start site selection.
Reviewer #2:
The streamlined paper of Saito et al. reads much better than the original version and delivers a clear and impactful message.
I believe it can be published after authors address two remaining issues:
The authors refer to "the number of elongating ribosomes per mRNA as a proxy of initiation rates". This is incorrect: there would be twice as many ribosomes on an mRNA that is twice as long as another one, even if those two would have the same initiation rate. The correct metric is not the number of ribosomes per mRNA but the ribosome density (their number normalized by mRNA length). This does not affect conclusions of the paper because authors normalize RiboSeq reads by RNASeq reads. Yet, I would try to avoid this confusion.
The authors write: "Interestingly, in comparing internal AUG codons that support initiation in our ribosome profiling data to those that do not, we found that A's are enriched both upstream and downstream of initiation sites (Figure 5A). […] This results from endogenous initiation sites […]". However, Figure 5A does not deal with the internal initiation sites, but with the annotated sites lacking SD.
Reviewer #3:
With the revisions, the authors have greatly improved the manuscript's introductory description of translation and the overall analysis of their dataset, leading to a more laser-focused and well-supported set of conclusions. These results provide an excellent and clarifying view of the sequence determinants and interactions that control translation initiation rate within natural (highly evolved) mRNAs by cleanly separating the role of the SD:aSD interaction from other factors, including the presence/absence of inhibitory mRNA structures.
The Introduction provides a more comprehensive description of the several sequence determinants and factors that control translation initiation rate, which is essential towards understanding the authors' excellent dataset. The analysis clearly explains how their dataset provides comparative measurements with vs. without the SD:aSD interaction and how those measurements quantify its effect on translation rate. Start codon selection is a property of all the factors that control translation initiation rate, and is also likely a property of ribosome-ribosome dynamics along the mRNA.
https://doi.org/10.7554/eLife.55002.sa1
Author response
[Editors’ note: the authors resubmitted a revised version of the paper for consideration. What follows is the authors’ response to the first round of review.]
Reviewer #1:
This paper describes an innovative approach to probe the importance of Shine-Dalgarno (S-D) sequences in translation initiation in Escherichia coli. […] I also have concerns about the interpretation of Figures 6 and 7 that impact the overall conclusions.
- The presentation of the cold shock data is confusing. The overall conclusion is that S-D-dependence is lost at almost all genes during cold shock. However, a small subset of genes appears to depend strongly on the S-D. The distinction between the effect on the majority of genes and the effect on a small subset should be explained more clearly. The simplest interpretation of these data is that most start codons are highly structured during cold shock, but those that are do not rely on their S-Ds. This model is largely consistent with previous work.
We have removed the figures dealing with cold shock and other stresses. We agree that they are largely consistent with previous work. The confusion raised by the cold shock story appears to have taken away from the main story.
- I disagree with the interpretation of Figure 6. The data show that for the altered ribosomes, annotated start codons are used far more efficiently than the collection of all other ATG sequences within ORFs. However, there are many more ATGs within ORFs than annotated start codons, and even if translation relies heavily on S-D sequences, you would expect that most ATGs within ORFs would not be selected by alternative ribosomes because only a small subset will have appropriate S-D sequences, and many may be weakly expressed. My interpretation of these data is that alternative ribosomes do use annotated start codons, but there is no way to tell how selectively they do this. A more appropriate comparison would be of (i) annotated start codons to (ii) ATGs within ORFs where the ATG is associated with a sequence that is predicted to function as a good S-D for the alternative ribosome.
We have added analyses to the new Figure 4 and Figure 4—figure supplement 1 showing initiation at internal AUG codons predicted to have high affinity for the mutant ASD sequences. These data show that initiation occurs with all four ribosome types regardless of the SD strength or specificity, but that initiation is most efficient when the SD and ASD are complementary.
- Another concern I have with Figure 6 is that presumably some, and perhaps many of the annotated start codons will have good SD matches for the alternative ribosomes. Figure 2C suggests that the number with good matches will be fairly high. Is the ribosome density at annotated start codons simply due to the subset of start codons that have reasonable SD matches to the altered ASD? Another way to think about this is to ask whether the start codons contributing to the signal in Figure 6C are the same start codons that contribute to the signal in Supplementary Figure 5A-B.
We added analyses to the new Figure 3 showing that initiation occurs with all three mutant ribosomes at annotated start sites that have no affinity for the mutant ASD sequences.
- Figure 7E shows the importance of an A-rich sequence in the context of start codons lacking a good S-D. Similar to Figure 6, these data highlight the contribution of non-SD sequences to translation initiation, but they do not provide any information about the relative importance of the different sequence elements.
We have removed claims about the relative importance of different sequence elements.
Reviewer #2:
Critique:
This is an interesting, intriguing and important study. The results are nice and clean and the implications are important for unraveling the fundamental mechanism of translation initiation in bacteria. Although the paper is generally well written, it was hard at times to follow the authors' logic and I strongly encourage the authors to try to clarify the message, which often was hard to extract.
1) Here are several examples:
- Abstract: The statement "We reveal a genome-wide correlation between the SD strength and translational efficiency" is followed by "this global correlation is lost and a subset of genes […] becomes [dependent] on SD motifs for translation". This is hard to digest.
- Figure 4C legend ("the strength of the SD motifs determines whether wild-type or ASD mutants are recruited to messages") is supposed to contrast Figure 4F legend ("the unstructured SD motifs can recruit wild type ribosomes more effectively than they recruit ASD mutants"). However, they sound nearly identical and thus, do not accurately communicate the point the authors apparently are trying to make.
- "genes with strong SD motif are translated better by ribosomes with canonical ASD": better in comparison with the ASD-mutant ribosomes or better in comparison with the genes with weak SD?
We removed the section on stress conditions, which was hard to follow and took away from the main point of the manuscript.
2) Aleksashin et al., 2019, have shown that altering ASD in 16S rRNA compromises rRNA maturation. Although the presence of unprocessed sequences at the 5' and 3' end of the ASD-mutant 16S rRNA would not likely change the general conclusions of the paper, hypothetically it could affect the functionality and elongation rate of the mutant ribosomes. I am wondering whether authors have checked how well their mutant 16S rRNAs are processed. Irrespectively, I believe a more detailed discussion of the general functionality of the ribosomes with altered ASD, especially in relation to the elongation rate, would be beneficial.
RNA-seq analyses of rRNA (prior to nuclease treatment) are now shown in Figure 1—figure supplement 1 and discussed early in the Results section (subsection “Selective profiling of ribosomes with mutant ASD sequences”).
3) Subsection “Gene-specific roles of SD motifs under stress”. The readers need a better explanation why the authors switched from ΔlogTE to ΔlogRPKM metrics when they move to the experiments in the stressed cells.
This section was removed.
4) The influence of the competition between wt and mutant 30S subunits for the translation start sites on the conclusions drawn from ribosome profiling should be discussed.
This possibility was added to the Discussion.
Reviewer #3:
1) The authors focus primarily on the importance of the sequence colloquially known as the Shine-Dalgarno in controlling a mRNA's translation initiation rate. The authors write that "Initiation rates vary depending on how well an mRNA recruits 30S subunits to the start codon, and in bacteria, the working model is that this is accomplished primarily by Shine-Dalgarno (SD) motifs." This is incorrect. […] Their current conclusions are already subsumed within the state-of-the-art (i.e., nothing new).
We added a more detailed description of the factors that affect initiation rates to the Introduction, including the points listed above.
2) Any discussion of "which translation rate interaction is most important" or "which translation rate interaction is responsible for X whereas the interaction Y only fine-tunes Z" is not productive and can easily be contradicted by selecting a real counter example. Overall, it is the binding free energy of the 30S ribosome to the mRNA that determines its translation initiation rate. Each of these interactions contributes free energy to this process and the magnitude of the contributed free energies can be roughly equal across a selection of real mRNA examples. There are unstructured mRNAs where there is little penalty for unfolding inhibitory mRNA structures. There are highly structured mRNAs that have consensus SD sequences. There are mRNAs that have consensus SD sequences far away from the start codon. All of these mRNAs could have the same translation rate. Which interaction is most important? That's not the right question to ask, because it's meaningless.
We have removed language that focuses on the relative contribution of the individual factors that affect translational initiation. We agree that our analyses do not allow us to determine their relative contributions.
3) The manuscript's main topic is the Shine-Dalgarno sequence, but the authors should be made aware that at least the last 9 nucleotides of the 16S rRNA can contact the mRNA and hybridize to it. In E. coli, the anti-Shine Dalgarno sequence is 5'-ACCUCCUUA-3' and the "consensus" Shine-Dalgarno sequence is therefore 5'-TAAGGAGGT-3'. The manuscript text and the authors' calculations should reflect this.
We revised the Introduction to explicitly state that up to 9 bp can form. Drawing on previous work, we refer to the “consensus” as GGAGG because it is the G’s that are overrepresented upstream of start codons (see the data for E. coli in Figure 5—figure supplement 1).
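The relationship between the 9-nt ASD quoted by Reviewer #3 and the shorter GGAGG "consensus" can be checked with a minimal sketch. The helper name `reverse_complement_rna` is hypothetical; the only inputs taken from the text are the two sequences themselves, and the reviewer's T-containing form is written here as RNA (U).

```python
# Watson-Crick partners for RNA; both strands are written 5'->3'.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement_rna(seq):
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq.upper()))

asd = "ACCUCCUUA"                      # ASD as quoted by Reviewer #3
full_partner = reverse_complement_rna(asd)

print(full_partner)                    # UAAGGAGGU
print("GGAGG" in full_partner)         # True
```

The fully complementary 9-nt partner contains the GGAGG core, so the two descriptions differ in length rather than in register.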
4) The authors are mis-using the ribosome profiling measurements in their analysis. Ribosome profiling measurements do not directly measure translation rates. They measure mRNA-bound ribosome densities. A mRNA's ribosome density will depend on both its translation initiation rate AND its translation elongation rate. Specifically, in steady-state conditions, the ribosome density will be the ratio between these two quantities (initiation rate over elongation rate). In the initial applications of ribosome profiling, researchers assumed that all mRNAs have the same translation elongation rate in order to conclude that ribosome density measurements were proportional to translation initiation rates. This is not true. Coding sequences in mRNAs have very different translation elongation rates, due to differences in synonymous codon usage. Unless each mRNAs' translation elongation rates are predicted or directly measured, ribosome density measurements cannot be used to infer their translation initiation rates. Therefore, when the authors write "In pioneering ribosome profiling studies in bacteria, the paradoxical observation was made that there is little or no correlation between the translational efficiency of a gene and the strength of its SD motif (calculated using thermodynamic algorithms for RNA pairing), as had been anticipated based on the SD model." there is no actual paradox. The ribosome profiling measurements were not used correctly to test how mRNA sequences control translation rate.
5) Getting to the authors' main conclusions, they write that "These data indicate that the ASD mutant ribosomes translate genes with weak SD motifs better than genes with strong SD motifs, exactly the opposite of what wild-type ribosomes are expected to do." This statement is confusing given the real conclusion of the authors, that all other factors being equal a "strong" SD motif does result in higher translation than a "weak" SD motif. It's only because of other confounding factors that the initial analysis did not yield a positive correlation. An incorrect analysis (excluding confounding variables) cannot lead to a correct conclusion.
The sentence, “These data indicate that the ASD mutant ribosomes translate genes with weak SD motifs better than genes with strong SD motifs” describes the observations in Figure 2B, the result of all the other factors except for SD-ASD pairing. We removed the phrase “exactly the opposite of what wild-type ribosomes are expected to do” that seems to have caused the confusion.
6) Figure 2C shows a very interesting and productive result, that the difference in translation efficiency between the wild-type "C" ribosomes and the A-ribosomes correlates to some degree with the hybridization free energy between the mRNA and (a portion of) the anti-SD sequence. This is a productive approach towards eliminating key confounding variables because, in principle, the strengths of the four other interactions that control translation initiation rate should not change when the 16S rRNA aSD sequences are changed. However, it's not apparent in the manuscript text, but the authors are using the modified 16S rRNAs to “eliminate the SD-aSD interaction as a contribution to the mRNA's translation rate”. So when they subtract the contribution from the modified A-ribosome's translation rates from the C-ribosome's translation rate, they are observing more directly the contribution from the SD-aSD interaction. The manuscript text should more clearly explain this experimental design. This is a creative and valid way of using ribosome profiling measurements.
We revised the language at the end of the Introduction and the beginning of the Results section to better explain our experimental design. The fact that we can isolate the effects of SD-ASD interactions from the other factors that set initiation rates explains why we use statistics for single variables and focus primarily on the SD mechanism of initiation.
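The subtraction logic described in point 6 can be sketched as follows, under the simplifying assumption that translational efficiency (TE) is a product of an SD-ASD term and a term for all other factors, so that the other factors cancel in a difference of logs. The function names and all read-density values are invented for illustration and are not data from the study.

```python
import math

def log2_te(ribo_rpkm, rna_rpkm):
    """log2 translational efficiency: ribosome footprints over mRNA abundance."""
    return math.log2(ribo_rpkm / rna_rpkm)

def delta_log_te(wt, mutant):
    """log2 TE with wild-type ribosomes minus log2 TE with ASD-mutant ribosomes.

    wt, mutant: (ribo_rpkm, rna_rpkm) tuples for the same gene.
    """
    return log2_te(*wt) - log2_te(*mutant)

# Two hypothetical genes whose overall TE differs ten-fold (other factors),
# but whose SD-ASD pairing gives the same four-fold boost with WT ribosomes.
gene_a = delta_log_te(wt=(80.0, 10.0), mutant=(20.0, 10.0))
gene_b = delta_log_te(wt=(8.0, 10.0), mutant=(2.0, 10.0))

# Both differences equal log2(4) = 2: the shared factors drop out.
print(math.isclose(gene_a, 2.0), math.isclose(gene_b, 2.0))
```

In this toy setting the per-gene difference depends only on the SD-ASD term, which is why a single-variable comparison against SD strength becomes reasonable.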
7) However, the hybridization free energy calculations could be improved. First, as mentioned previously, the wild-type aSD sequence in E. coli is ACCUCCUUA. Second, the hybridization free energy calculation was only performed on the region from 15 to 6 nucleotides upstream of the start codon, but the aSD sequence can hybridize at other locations. Third, the hybridization between the mRNA and aSD can accommodate 1 or 2-nucleotide bulges or internal loops.
The fact that we see a strong correlation between our calculated SD affinities and differences in ribosome occupancy (WT – mutant) argues that the calculations are basically reliable. We see the highest correlation when affinities are calculated using the 10 nt between -15 and -6 from the AUG and we show the data for various windows with different SD distances in Figure 2—figure supplement 2. The calculations are quite robust to changes in parameters: we see little or no differences in SDRO correlations if we use 9 nt of ASD sequence to calculate affinities instead of 7 nt, or if we allow the ASD to pair anywhere between -20 to 0 upstream of AUG, or if we use the RBS calculator to generate the ΔG values.
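The idea of scanning a window upstream of the start codon for ASD pairing can be sketched with a deliberately crude stand-in. This is not the thermodynamic calculation described above: it only counts Watson-Crick matches and ignores G-U wobble pairs, bulges, internal loops and stacking energies. The function name `best_sd_match`, the default ASD portion `CCUCC` (the complement of the GGAGG core) and the example sequence are assumptions made for illustration.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement_rna(seq):
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq))

def best_sd_match(upstream, asd="CCUCC", window=(-15, -6)):
    """Slide the ASD's ideal partner across a window upstream of the start codon.

    upstream: mRNA 5'->3', ending immediately before the start codon,
              so upstream[-1] is position -1.
    Returns (number of matching bases, position of the 5'-most paired base).
    """
    first, last = window
    if len(upstream) < -first:
        raise ValueError("upstream sequence is shorter than the window")
    target = reverse_complement_rna(asd)        # ideal mRNA partner, 5'->3'
    end = last + 1 if last < -1 else None       # slice end; None reaches -1
    region = upstream[first:end]
    best = (0, None)
    for offset in range(len(region) - len(target) + 1):
        site = region[offset:offset + len(target)]
        matches = sum(m == t for m, t in zip(site, target))
        if matches > best[0]:
            best = (matches, first + offset)
    return best

# Invented 20-nt upstream region with GGAGG placed at positions -12 to -8.
example = "AUCAUUAAGGAGGUAUACAU"
print(best_sd_match(example))                    # (5, -12)
print(best_sd_match(example, window=(-20, -1)))  # (5, -12)
```

Widening the window or lengthening the ASD changes only the arguments, which mirrors the kind of parameter variation the response describes, though a real analysis would use free energies rather than match counts.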
8) The measurements in cold shock are greatly confounded by the higher expression levels of RNA chaperones that are unfolding mRNA structures “at specific mRNAs” where the RNA chaperones recognize binding motifs. The conclusion here should be that RNA chaperones bind specific mRNAs, unfold their inhibitory mRNA structures, and increase their translation rates during cold shock. This is all independent of the Shine-Dalgarno sequence. This process also does not depend on many other uninteresting factors.
We removed the section on the role of SD motifs under stress because it generated confusion without strengthening the main point of our manuscript.
9) The use of ORF-wide GINI values is odd because it's generally only the region surrounding the start codon that affects its translation initiation rate, and not the structure of the entire ORF (which this coefficient is quantifying). Also, using the SHAPE reactivity around a start codon as a proxy for RNA structure is a bit misleading as ribosomes actively unfold RNA structures during translation initiation. A highly structured mRNA with a consensus SD sequence will have a high SHAPE reactivity (i.e., low RNA structure) because the ribosomes can rapidly bind to the mRNA and unfold the mRNA structure. SHAPE reactivity is measuring the effect of rapid ribosome binding and not the cause of it. Rapid ribosome binding can also be facilitated by slow RNA refolding kinetics, called "Ribosome Drafting" in the literature.
Carol Gross and colleagues showed that ORF-wide GINI values are highly correlated with translational efficiency genome-wide. This is true whether the DMS probing is done in vivo (where ribosomes could affect structure by unwinding the RNA) or to a lesser extent with purified RNA in vitro. Kevin Weeks also showed that RNA structures are correlated in vivo and in vitro using the SHAPE reagent. These data argue that at least to some extent mRNA structure is driving translation rates. This was clarified in the Results section in the discussion of Figure 5E.
10) The data in Figure 6 just says that A-ribosomes can initiate translation at other start codons because they now have more negative binding free energies to those start codons, compared to the annotated ones. The authors could perform hybridization calculations using the A-ribosome's aSD sequence to investigate whether these "new start codons" have a nearby "SD" sequence that is complementary to the A-ribosome's aSD. That would be interesting.
The new Figures 3 and 4 now include analyses of sets of initiation sites with various affinities for the WT or mutant ASD sequences showing more clearly that translation occurs at start codons even without strong SD-ASD pairing.
[Editors’ note: what follows is the authors’ response to the second round of review.]
Reviewer #2:
The streamlined paper of Saito et al. reads much better than the original version and delivers a clear and impactful message.
I believe it can be published after authors address two remaining issues:
The authors refer to "the number of elongating ribosomes per mRNA as a proxy of initiation rates". This is incorrect: there would be twice as many ribosomes on an mRNA that is twice as long as another one, even if those two would have the same initiation rate. The correct metric is not the number of ribosomes per mRNA but the ribosome density (their number normalized by mRNA length). This does not affect conclusions of the paper because the authors normalize RiboSeq reads by RNASeq reads. Yet, I would try to avoid this confusion.
The language was changed to “ribosome density” instead of “the number of ribosomes.”
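The distinction the reviewer draws can be shown with a short Python sketch. The read counts and ORF lengths are made-up illustrative values and the function names are hypothetical; this is not the analysis pipeline used in the paper.

```python
import math


def ribosome_density(footprint_reads: float, orf_length_nt: int) -> float:
    """Ribosome footprint reads per nucleotide of ORF."""
    if orf_length_nt <= 0:
        raise ValueError("ORF length must be positive")
    return footprint_reads / orf_length_nt


def translational_efficiency(ribo_reads: float, rna_reads: float) -> float:
    """RiboSeq reads divided by RNASeq reads for the same ORF.

    Both counts scale with ORF length, so the length cancels in the ratio.
    """
    if rna_reads <= 0:
        raise ValueError("RNASeq read count must be positive")
    return ribo_reads / rna_reads


# Two hypothetical mRNAs with the same initiation rate; the second ORF is
# twice as long, so it carries twice as many ribosomes and RNASeq reads.
short_orf = {"ribo": 100, "rna": 50, "length": 300}
long_orf = {"ribo": 200, "rna": 100, "length": 600}

same_density = math.isclose(
    ribosome_density(short_orf["ribo"], short_orf["length"]),
    ribosome_density(long_orf["ribo"], long_orf["length"]),
)
same_te = math.isclose(
    translational_efficiency(short_orf["ribo"], short_orf["rna"]),
    translational_efficiency(long_orf["ribo"], long_orf["rna"]),
)
```

The raw ribosome counts differ twofold, while the density and the RiboSeq/RNASeq ratio are the same for both mRNAs. This is why the change in wording does not alter the conclusions.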
The authors write: "Interestingly, in comparing internal AUG codons that support initiation in our ribosome profiling data to those that do not, we found that A's are enriched both upstream and downstream of initiation sites (Figure 5A).…. This results from endogenous initiation sites… ". However, Figure 5A does not deal with the internal initiation sites, but with the annotated sites lacking SD.
Thanks for catching this mistake; the Discussion was updated to reflect this.
https://doi.org/10.7554/eLife.55002.sa2
Article and author information
Author details
Funding
National Institute of General Medical Sciences (GM110113)
- Allen R Buskirk
Howard Hughes Medical Institute
- Rachel Green
Japan Society for the Promotion of Science
- Kazuki Saito
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
The authors thank Daniel Goldman, Colin Wu, and Boris Zinshteyn for critical reading of the manuscript, as well as David Mohr at the Genetics Resources Core Facility, Johns Hopkins Institute of Genetic Medicine, for sequencing assistance. This study was funded by a JSPS fellowship (KS), NIH grant GM110113 (ARB), and HHMI (RG).
Senior Editor
- James L Manley, Columbia University, United States
Reviewing Editor
- Joseph T Wade, Wadsworth Center, New York State Department of Health, United States
Reviewers
- Joseph T Wade, Wadsworth Center, New York State Department of Health, United States
- Alexander Mankin, University of Illinois, Chicago, United States
Version history
- Received: January 9, 2020
- Accepted: February 14, 2020
- Accepted Manuscript published: February 17, 2020 (version 1)
- Version of Record published: February 26, 2020 (version 2)
Copyright
© 2020, Saito et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.