Kinetic modeling predicts a stimulatory role for ribosome collisions at elongation stall sites in bacteria
Abstract
Ribosome stalling on mRNAs can decrease protein expression. To decipher ribosome kinetics at stall sites, we induced ribosome stalling at specific codons by starving the bacterium Escherichia coli for the cognate amino acid. We measured protein synthesis rates from a reporter library of over 100 variants that encoded systematic perturbations of translation initiation rate, the number of stall sites, and the distance between stall sites. Our measurements are quantitatively inconsistent with two widely-used kinetic models for stalled ribosomes: ribosome traffic jams that block initiation, and abortive (premature) termination of stalled ribosomes. Rather, our measurements support a model in which collision with a trailing ribosome causes abortive termination of the stalled ribosome. In our computational analysis, ribosome collisions selectively stimulate abortive termination without fine-tuning of kinetic rate parameters at ribosome stall sites. We propose that ribosome collisions serve as a robust timer for translational quality control pathways to recognize stalled ribosomes.
https://doi.org/10.7554/eLife.23629.001
Introduction
Ribosomes move at an average speed of 3–20 codons per second during translation elongation in vivo (Dalbow and Young, 1975; Bonven and Gulløv, 1979; Yan et al., 2016). Since this rate is higher than the typical initiation rate of ribosomes on mRNAs [less than 1 s⁻¹ (Yan et al., 2016; Kennell and Riezman, 1977)], elongation is often assumed not to affect the expression level of most proteins. Nevertheless, the elongation rate of ribosomes can decrease significantly at specific locations on an mRNA due to low abundance of aminoacyl-tRNAs, inhibitory codon pairs or amino acid pairs, nascent peptides interacting strongly with the ribosome exit tunnel, or the presence of RNA-binding proteins (Richter and Coller, 2015). Ribosome profiling, the deep sequencing of ribosome-protected mRNA fragments, has enabled the identification of additional factors that induce slowing or stalling of ribosomes during elongation (Ingolia et al., 2009; Ingolia, 2014). An important question emerging from these studies is the extent to which ribosome stalling affects the expression of the encoded protein, since initiation might still be the slowest step during translation.
Several mechanistic models have been proposed to explain how ribosome stalling during elongation might affect the expression of the encoded protein. In the widely used traffic jam model (MacDonald et al., 1968), the duration of ribosome stalling is sufficiently long to induce a queue of trailing ribosomes extending to the start codon, thus decreasing the translation initiation rate. Evidence supporting this model has been found in the context of EF-P dependent polyproline stalls in E. coli (Hersch et al., 2014; Woolstenhulme et al., 2015), and rare-codon induced pausing in E. coli and yeast (Mitarai et al., 2008; Chu et al., 2014). In an alternate abortive termination model, ribosome stalling causes premature termination without synthesis of the full-length protein. This model is thought to underlie the action of various ribosome rescue factors in E. coli and yeast (Subramaniam et al., 2014; Choe et al., 2016). Finally, ribosome stalling can also affect protein expression indirectly by altering mRNA stability (Presnyak et al., 2015; Radhakrishnan et al., 2016), co-translational protein folding (Chaney and Clark, 2015), or stress-response signaling (Ishimura et al., 2016).
Despite the experimental evidence supporting the above models, predicting the effect of ribosome stalling on protein levels has been challenging because of uncertainty in our knowledge of in vivo kinetic parameters such as the duration of ribosome stalling and the rate of abortive termination. Further, while we have a detailed understanding of the kinetic steps and structural changes that occur during the normal elongation cycle of the ribosome (Wintermeyer et al., 2004; Voorhees and Ramakrishnan, 2013; Blanchard et al., 2004), the ‘off-pathway’ events that occur at stalled ribosomes have been elucidated in only a few specific cases (Neubauer et al., 2012; Muto et al., 2006; Shao et al., 2015). Thus, development of complementary approaches, which can quantitatively constrain the in vivo kinetics of stalled ribosomes without precise knowledge of rate parameters, will be useful for bridging the gap between the growing list of ribosome stall sequences (Ingolia, 2014; Woolstenhulme et al., 2013; Gamble et al., 2016) and their effect on protein expression.
Here, we investigated the effect of ribosome stalling on protein expression using amino acid starvation in E. coli as an experimental model. In this system, we previously found that both ribosome traffic jams and abortive termination occur at a subset of codons cognate to the limiting amino acid (Subramaniam et al., 2013). Motivated by these observations, here we computationally modeled ribosome traffic jams and abortive termination with the goal of predicting their effect on protein expression. Even without precise knowledge of in vivo kinetic parameters, we found that these two processes give qualitatively different trends in protein expression when the initiation rate, the number of stall sites, and the distance between stall sites are systematically varied. Surprisingly, experimental measurements support a model in which traffic jams and abortive termination do not occur independent of one another; rather, collisions by trailing ribosomes stimulate abortive termination of the stalled ribosome. We find that this model is consistent with the absence of long ribosome queues in ribosome profiling measurements, and it naturally provides a mechanistic basis for the selectivity of abortive termination towards stalled ribosomes. While these conclusions are limited to the specific context of amino acid starvation in E. coli, the integrated approach developed in this work should be generally applicable to investigate other ribosome stalls in both bacteria and eukaryotes.
Results
Effect of ribosome stalling on measured protein level, mRNA level, and polysome occupancy
During starvation for single amino acids in E. coli, certain codons that are cognate to the limiting amino acid decrease protein expression, while the same codons have little or no effect during nutrient-rich growth (Subramaniam et al., 2013). For example, synonymously mutating seven CTG leucine codons in the yellow fluorescent protein gene (yfp) to CTA, CTC, or CTT reduced the synthesis rate of YFP 10–100 fold during leucine starvation (Subramaniam et al., 2013). Genome-wide ribosome profiling showed that ribosomes stall at CTA, CTC, and CTT codons during leucine starvation, which leads to a traffic jam of trailing ribosomes and abortive termination of translation (Subramaniam et al., 2014). These observations led us to ask whether ribosome traffic jams and abortive termination can quantitatively account for the decrease in protein synthesis rate (number of full proteins produced per unit time) caused by ribosome stalling during leucine starvation in E. coli.
To measure the effect of ribosome stalling on protein synthesis during leucine starvation, we constructed fluorescent reporter genes which have a stall-inducing CTA codon at one or two different locations along yfp (Figure 1A, blue bars). We induced these reporter variants from very low copy vectors (SC*101 ori, 3–4 copies per cell) either during leucine starvation or during leucine-rich growth. While YFP expression was similar across all yfp variants during leucine-rich growth, a single CTA codon at two different locations reduced YFP expression during leucine starvation by 3–4 fold relative to a control yfp without CTA codons (Figure 1B). Introducing both CTA codons reduced YFP expression by ~6 fold (Figure 1B), and a stretch of 7 CTA codons reduced YFP expression close to background level as observed in our earlier work (Subramaniam et al., 2013). Thus, YFP expression can serve as a quantitative readout of the effect of ribosome stalling on protein synthesis.
We then sought biochemical evidence supporting a role for either ribosome traffic jams or abortive termination in the reduction of YFP expression caused by stall-inducing CTA codons. We reasoned that ribosome traffic jams that reduce protein expression by blocking initiation should increase the number of ribosomes on an mRNA when the stall site is far from the initiation region. However, polysome fractionation of leucine-starved E. coli did not indicate an unambiguous shift of the yfp mRNA to higher polysome fractions when two stall-inducing CTA codons were introduced 475 nt from the start codon (Figure 1C, top vs bottom panels). This observation agrees with previous ribosome density measurements that detected traffic jams of only 1–2 ribosomes behind stalled ribosomes (Subramaniam et al., 2014).
We detected truncated YFP fragments consistent with abortive termination at stall-inducing CTA codons during leucine starvation (Figure 1D). Previous studies suggested that abortive termination of stalled ribosomes requires cleavage of mRNA near the stall site as an obligatory step (Keiler, 2015; Hayes and Sauer, 2003; Ivanova et al., 2004). Therefore, we tested whether changes in mRNA levels could account for the 3–4 fold decrease in YFP expression caused by single CTA codons during leucine starvation (Figure 1B). However, we found that yfp mRNA levels, as measured by quantitative RT-PCR spanning the single CTA codons, did not decrease significantly during leucine starvation (Figure 1E). Similarly, introducing two CTA codons resulted in <2 fold decrease in yfp mRNA levels despite ~6 fold decrease in YFP expression (Figure 1E vs 1B). These observations are consistent with earlier measurements using ribosome profiling and Northern blotting that did not find evidence for significant mRNA cleavage or decay upon ribosome stalling at CTA codons during leucine starvation (Subramaniam et al., 2014).
Computational modeling of ribosome kinetics at stall sites
Since the above reporter-based experiments were qualitative and could miss subtle effects, we formulated an alternate approach using computational modeling to quantitatively test the role of ribosome traffic jams and abortive termination at stall sites during amino acid starvation in E. coli. To this end, we defined a minimal set of five kinetic states at ribosome stall sites and the rate constants for transition between these kinetic states (Figure 2A).
In our modeling (Figure 2A), ribosomes stalled during amino acid starvation are represented by the A-site empty state ae. Once the aminoacyl-tRNA is accommodated, A-site empty ribosomes transition to the A-site occupied state ao. Ribosomes transition back from the A-site occupied state ao to the A-site empty state ae upon peptide-bond formation and translocation. Beyond the ae and ao states, we did not consider additional kinetic states in the normal elongation cycle of the ribosome (Wintermeyer et al., 2004; Blanchard et al., 2004), since these states cannot be resolved using our measurements of protein synthesis rates during amino acid starvation. Ribosomes that have dissociated from mRNA, due to either normal termination at stop codons or abortive termination at stall sites, transition to the free state f. Finally, collision between a stalled ribosome with an empty A-site and a trailing ribosome with an occupied A-site transitions the stalled ribosome to the 5’ hit state 5h and the trailing ribosome to the 3’ hit state 3h.
To model ribosome traffic jams, we chose the rate constant for abortive termination of all elongating ribosomes to be zero. Hence if the duration of ribosome stalling is sufficiently long, a queue of trailing ribosomes forms behind the stalled ribosome and ultimately reduces protein synthesis rate by blocking the initiation region. We designate this as the traffic jam (TJ) model (Figure 2B, top).
To model abortive termination, we set the transition rate constant from stalled ribosomes to free ribosomes to be non-zero. Abortive termination occurs selectively at stalled ribosomes, and not at normally elongating ribosomes (Roche and Sauer, 1999). Even though the mechanistic basis for this selectivity is poorly understood (Janssen et al., 2013; Miller and Buskirk, 2014), we can account for the selectivity in our modeling by simply setting the abortive termination rate to be zero at all codons except at the stall site (Subramaniam et al., 2014). We designate this as the simple abortive termination (SAT) model (Figure 2B, middle).
While abortive termination and traffic jams are usually considered as independent molecular processes (Kurland, 1992; Andersson and Kurland, 1990), our definition of kinetic states (Figure 2A) suggests a more general model in which these processes are coupled. Specifically, we considered a model in which the rate of abortive termination is non-zero only when stalled ribosomes have undergone a collision with a trailing ribosome, i.e. when they are in the 5h state. We designate this as the collision-stimulated abortive termination (CSAT) model (Figure 2B, bottom). As shown below, the CSAT model is closer to experimental measurements of protein synthesis rate than the TJ and SAT models, and it also suggests a mechanistic basis for the selectivity of abortive termination.
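The distinction between the three models can be illustrated with a minimal stochastic (Gillespie-type) simulation. The sketch below is not the simulation code used for the results reported here. It rests on simplifying assumptions that are not taken from the text: each codon is traversed in a single step (the ae and ao states are collapsed into one), a ribosome occupies 10 codons, and the function name simulate, its arguments, and all default rate values are hypothetical and chosen only for illustration.

```python
import random

FOOTPRINT = 10  # codons occupied by one ribosome (assumed value)


def simulate(model, stall_sites, k_init=0.3, k_elong=20.0, k_stall=0.1,
             k_abort=1.0, length=240, t_end=2000.0, seed=0):
    """Full-length proteins per second for model 'TJ', 'SAT' or 'CSAT'.

    All rate constants (per second) are illustrative, not fitted values.
    """
    rng = random.Random(seed)
    stalls = set(stall_sites)
    ribosomes = []  # A-site codon of each ribosome, leading (3') one first
    t, proteins = 0.0, 0
    while True:
        events = []  # (rate, kind, ribosome index)
        # Initiation requires the first footprint of the mRNA to be clear.
        if not ribosomes or ribosomes[-1] >= FOOTPRINT:
            events.append((k_init, "init", -1))
        for i, pos in enumerate(ribosomes):
            # A ribosome cannot step into the footprint of the one ahead.
            blocked = i > 0 and ribosomes[i - 1] - pos <= FOOTPRINT
            if not blocked:
                rate = k_stall if pos in stalls else k_elong
                events.append((rate, "step", i))
            if pos in stalls and model != "TJ":
                # 'hit': a trailing ribosome abuts the stalled one (5h state).
                hit = (i + 1 < len(ribosomes)
                       and pos - ribosomes[i + 1] <= FOOTPRINT)
                if model == "SAT" or hit:
                    events.append((k_abort, "abort", i))
        total = sum(rate for rate, _, _ in events)
        t += rng.expovariate(total)  # waiting time to the next event
        if t > t_end:
            return proteins / t_end
        # Choose one event with probability proportional to its rate.
        pick = rng.uniform(0.0, total)
        for rate, kind, i in events:
            pick -= rate
            if pick <= 0.0:
                break
        if kind == "init":
            ribosomes.append(0)
        elif kind == "abort":
            ribosomes.pop(i)  # stalled ribosome leaves without finishing
        else:
            ribosomes[i] += 1
            if ribosomes[i] >= length:  # reached the stop codon
                ribosomes.pop(i)
                proteins += 1
```

For example, simulate("CSAT", [200]) returns the simulated synthesis rate for an mRNA with one stall site at codon 200, and simulate("TJ", []) gives the corresponding rate without stall sites. The only difference between the three models in this sketch is the condition under which the "abort" event is allowed.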
Experimental variables for distinguishing kinetic models of ribosome stalling
Predicting the effect of ribosome stalling on YFP expression in our three kinetic models (Figure 2B) requires knowledge of the elongation rate and the abortive termination rate of ribosomes at stall-inducing codons during amino acid starvation in E. coli. In principle, these rate constants can be estimated using the ribosome profiling method (Subramaniam et al., 2014; Shah et al., 2013), but sequence-specific and protocol-related biases in ribosome profiling (Woolstenhulme et al., 2015; Mohammad et al., 2016; Lareau et al., 2014) will introduce a large uncertainty in this estimation. Therefore, we sought to identify experimental variables that would enable us to discriminate between the different kinetic models of ribosome stalling without precise knowledge of the underlying rate constants.
First, we examined the effect of varying the initiation rate of an mRNA with a single stall site in our three kinetic models (Figure 3A). We used stochastic simulations to predict the protein synthesis rate from a yfp mRNA under this perturbation (Materials and methods). We chose the elongation and abortive termination rate constants at the stall site so that an mRNA with an initiation rate of 0.3 s⁻¹, a typical value for E. coli mRNAs (Kennell and Riezman, 1977; Subramaniam et al., 2014), had the same protein synthesis rate (number of full proteins produced per unit time) in all three models. In the SAT model, varying the initiation rate does not modulate the effect of the stall site on protein synthesis rate (Figure 3A, blue squares). By contrast, in the TJ and CSAT models, the effect of the stall site on protein synthesis rate is reduced at lower initiation rates (Figure 3A, green circles and red diamonds). This reduction is more pronounced in the TJ model because, at low initiation rates, ribosome queues do not block the initiation region in the TJ model, while they still lead to collision-stimulated abortive termination in the CSAT model.
Second, we examined the effect of systematically varying the number of stall sites on an mRNA in our three kinetic models (Figure 3B). We chose the elongation rate and abortive termination rate constants at stall sites so that the effect of a single stall site on protein synthesis rate was identical between the three models (Supplementary file 1). With no further parameter adjustments, we introduced additional identical stall sites, with each stall site separated by at least two ribosome footprints (>60 nt) from other stall sites. In the traffic jam (TJ) model, additional stall sites had very little effect on protein synthesis rate (Figure 3B, green circles). In the simple abortive termination (SAT) model, protein synthesis rate decreased exponentially with the number of stall sites (Figure 3B, blue squares). In the collision-stimulated abortive termination (CSAT) model, the effect of additional stall sites was intermediate between the TJ and SAT models (Figure 3B, red diamonds). The differential effect of multiple stall sites in the three models can be intuitively understood as follows: In the TJ model, extended queues of ribosomes occur only at the first stall site because the average rate at which ribosomes arrive at subsequent stall sites is limited by the rate at which they elongate past the first stall site. In the CSAT model, ribosome collisions occur at a greater rate at the first stall site, but are not completely prevented at subsequent stall sites due to stochastic ribosome elongation past the first stall site. In the SAT model, abortive termination rate at each stall site does not depend on the presence of other stall sites on the mRNA.
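The exponential dependence in the SAT model can be made explicit with a short calculation. This is a sketch under the simplifying assumption that ribosome queues are negligible, so that each ribosome reaching a stall site either elongates past it with rate constant k_e or terminates abortively with rate constant k_a. The symbols k_e, k_a, p, J_0 and J_n are introduced here for illustration and do not appear elsewhere in the text. With p the probability of passing one stall site, J_0 the synthesis rate without stall sites, and J_n the rate with n identical, well-separated stall sites:

```latex
\begin{align}
  p   &= \frac{k_e}{k_e + k_a}, \\
  J_n &= J_0 \, p^{\,n}
       = J_0 \exp\!\left[-n \ln\!\left(1 + \frac{k_a}{k_e}\right)\right].
\end{align}
```

Under this approximation, a single stall site that reduces synthesis 3–4 fold (p between 1/3 and 1/4) would give a 9–16 fold reduction for two such sites.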
Finally, we considered the effect of varying the distance between two identical stall sites in our kinetic models. In the SAT model, varying the distance between two stall sites does not modulate the effect of the stall sites on protein synthesis rate (Figure 3C, blue). In the CSAT model, when the two stall sites are separated by less than a ribosome footprint, then the frequency of collisions at the stall sites increases, thus resulting in the lower protein synthesis rate in this regime (Figure 3C, red). In the TJ model, the length of ribosome queues at the first stall site is modulated by the formation of shorter ribosome queues at the second stall site when it is within a few ribosome footprints. This interaction results in a lower protein synthesis rate when the stall sites are separated by a few ribosome footprints (Figure 3C, green).
Measured protein synthesis rates support a collision-stimulated abortive termination model
We tested the predictions from our kinetic models using yfp reporters with stall-inducing codons during starvation for single amino acids in E. coli. First, we measured the effect of varying the initiation rate on the synthesis rate of YFP either by mutating the ATG start codon to a near-cognate codon, or by mutating the Shine-Dalgarno sequence (Figure 4, inset). We fitted the ribosome elongation rate at stall-inducing codons in the three kinetic models using the measured YFP synthesis rate for the yfp variant with the non-mutated initiation region (variant four in Figure 4), and used this fit to predict the YFP synthesis rate of the other initiation mutants with no remaining free parameters (Materials and methods, Supplementary file 2). The effect of a single CTA codon on YFP synthesis rate decreased as the initiation rate of the yfp variants was reduced (Figure 4, black triangles). Both the TJ and CSAT models predicted the decreasing effect of the CTA codon with lower initiation rate (Figure 4, green circles and red diamonds). By contrast, the predicted YFP synthesis rate from the SAT model was independent of initiation rate (Figure 4, blue squares). This difference between the SAT model, and the TJ and CSAT models was also observed upon introducing CTA, CTC, or CTT codons at other locations in yfp, as well as the stall-inducing codon TCG during serine starvation (Figure 4—figure supplement 1).
Second, we tested the effect of multiple stall sites on YFP synthesis rate (Figure 5). We introduced a single CTA codon at one of five locations among the twenty-two leucine codons in yfp (Figure 5, inset), and we then combined the single mutations to generate ten yfp variants with two CTA codons, two yfp variants with three CTA codons, and one yfp variant with four CTA codons. We then used the measured YFP synthesis rates (Figure 5, black triangles) of the five single CTA variants to fit the ribosome elongation rate at each of the five CTA codon locations in our three kinetic models (Materials and methods, Supplementary file 3). These fits, with no remaining free parameters, were used to predict YFP synthesis rates of the multiple-CTA variants during leucine starvation. We found that the TJ model systematically overestimated the YFP synthesis rate for 12 of 13 multiple-CTA variants (Figure 5, green circles), while the SAT model systematically underestimated the YFP synthesis rate for all 13 multiple-CTA variants during leucine starvation (Figure 5, blue squares). By contrast, the predicted YFP synthesis rates from the CSAT model (Figure 5, red diamonds) were closest to the measured YFP synthesis rates with approximately half the average error of the TJ and SAT models. Similarly, the CSAT model prediction was more accurate when we introduced CTC, CTT, or TCG stall-inducing codons into yfp (Figure 5—figure supplement 1).
Third, we measured the effect of varying the distance between two stall sites on YFP synthesis rate (Figure 6, black triangles). We made pairwise combinations of seven CTA mutations to generate eight variants with a range of distances d between the two CTA codons (Figure 6, inset). As before, we fitted our three models to the measured YFP synthesis rate of the single CTA variants and used these fits to predict the YFP synthesis rate of the double CTA variants (Materials and methods, Supplementary file 4). We found that two CTA codons separated by less than a ribosome footprint (d < 10 codons) resulted in lower protein synthesis rate than two CTA codons separated by several ribosome footprints (d > 50 codons) (Figure 6, black triangles). This observation was in line with the predictions from the TJ and CSAT models (Figure 3C), with the CSAT model providing a better fit than either the TJ or SAT models overall. Similarly, the CSAT model was more accurate when we varied the distance between two CTC codons in yfp (Figure 6—figure supplement 1).
Combining the results from all the yfp mutants (N = 94) and assuming independent and normal distribution of residual errors, we conclude that the TJ model systematically overestimates the measured YFP synthesis rate (p < 10⁻¹⁵, one-sided Student’s t-test), while the SAT model systematically underestimates the measured YFP synthesis rate (p < 10⁻⁸, one-sided Student’s t-test). The CSAT model shows no such bias (p > 0.05, two-sided Student’s t-test). Under the same assumption of normal distribution of residual errors and using the Akaike Information Criterion (Burnham and Anderson, 2013), we find an Akaike weight >0.999 in favor of the CSAT model over the TJ and SAT models. Thus we conclude that the CSAT model provides a better fit to the measured YFP synthesis rates from the yfp mutants than either the TJ or the SAT models when the initiation rate, the number of stall sites, and the distance between stall sites are systematically varied during starvation for single amino acids in E. coli.
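For readers unfamiliar with Akaike weights, the following sketch shows one common way to compute them from model residuals. It assumes the least-squares form of the criterion for independent, normally distributed errors, AIC = n ln(RSS/n) + 2k, where RSS is the residual sum of squares and k the number of fitted parameters. The function name akaike_weights and the residual values in the usage note are hypothetical; they are not the measurements or the analysis code from this work.

```python
import math


def akaike_weights(residuals_by_model, n_params_by_model):
    """Akaike weights from residuals, assuming i.i.d. normal errors.

    residuals_by_model: dict mapping model name -> list of residuals
    n_params_by_model:  dict mapping model name -> number of fitted parameters
    """
    aic = {}
    for name, residuals in residuals_by_model.items():
        n = len(residuals)
        rss = sum(r * r for r in residuals)  # residual sum of squares
        aic[name] = n * math.log(rss / n) + 2 * n_params_by_model[name]
    best = min(aic.values())
    # Relative likelihood of each model given the data.
    relative = {name: math.exp(-0.5 * (a - best)) for name, a in aic.items()}
    total = sum(relative.values())
    return {name: r / total for name, r in relative.items()}
```

As a usage example with made-up residuals, akaike_weights({"TJ": [0.5, 0.4, 0.6, 0.5], "SAT": [-0.5, -0.6, -0.4, -0.5], "CSAT": [0.1, -0.1, 0.05, -0.05]}, {"TJ": 1, "SAT": 1, "CSAT": 1}) assigns nearly all of the weight to the model with the smallest residuals.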
Selectivity, robustness, and ribosome density in the collision-stimulated abortive termination model
The ability of the CSAT model to account for measured YFP synthesis rates from our reporters led us to examine whether this model is consistent with other expected features of ribosome stalling during amino acid starvation in E. coli. Specifically, we used our simulations to examine how varying the abortive termination rate affects protein synthesis from mRNAs with and without stall sites, as well as the predicted ribosome density near stall sites in the three kinetic models.
First, only a small fraction of ribosomes are expected to prematurely terminate from mRNAs without stall sites (Subramaniam et al., 2014; Zhang et al., 2010; Sin et al., 2016). Consistent with this expectation, predicted protein synthesis rates from reporters without stall sites did not decrease when the abortive termination rate was increased in the CSAT model (Figure 7A, top panel, red). This selectivity towards stalled ribosomes naturally arises in the CSAT model from the requirement for ribosome collisions to cause abortive termination. By contrast, protein synthesis rates from reporters without stall sites decreased with increasing abortive termination rate in a SAT model in which abortive termination was not explicitly specified to be selective for stalled ribosomes (Figure 7A, top panel, blue vs. pink).
Second, the frequency of abortive termination is known to be robust to over-expression of factors that rescue stalled ribosomes (Moore and Sauer, 2005). Consistent with this observation, we found that increasing the abortive termination rate in the CSAT model predicted only a minor effect on protein synthesis rate from an mRNA with a single stall site (Figure 7A, bottom panel, red). By contrast, in both the selective and non-selective SAT models, protein synthesis rate from an mRNA with a single stall site continuously decreased as the abortive termination rate was increased (Figure 7A, bottom panel, blue and pink). The robustness of the CSAT model to varying abortive termination rates arises because the frequency of ribosome collisions limits the actual rate of abortive termination at stall sites.
Finally, previous ribosome profiling measurements have detected a queue of only a few ribosomes at CTA codons during leucine starvation in E. coli (Subramaniam et al., 2014). Consistent with this observation, the length of ribosome queues at the stall site predicted by the CSAT model is limited to a few ribosomes even when the stall site is located ~200 codons from the start codon (Figure 7B, red). A similar queue of few ribosomes is also observed in the SAT model (Figure 7B, blue and pink). By contrast, the TJ model predicts a queue of ~20 ribosomes when stall sites are located ~200 codons from the start codon (Figure 7B, green).
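The queue length in the TJ model follows from simple packing. As a rough estimate, assuming a ribosome footprint of about 10 codons (the symbols d, ℓ and N_queue are introduced here for illustration), a queue that extends from a stall site at distance d back to the start codon contains approximately:

```latex
N_{\text{queue}} \approx \frac{d}{\ell}
  = \frac{200\ \text{codons}}{10\ \text{codons per ribosome}} = 20
```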
Discussion
In this work, we used a combination of computational modeling and reporter-based measurements of protein synthesis rate to constrain ribosome kinetics at stall sites during single amino acid starvation in E. coli. Our approach allowed us to test two previously proposed models for how ribosome stalling decreases protein expression, namely, ribosome traffic jams that block initiation (TJ model) and simple abortive termination of stalled ribosomes (SAT model). We also considered a novel model in which ribosome collisions stimulate abortive termination of stalled ribosomes (CSAT model). Our integrated approach allowed us to infer the extent to which each of these three kinetic models quantitatively accounted for the measured protein synthesis rate from a library of yfp variants during starvation for single amino acids.
The TJ model has been considered theoretically in several studies (MacDonald et al., 1968; Mitarai et al., 2008; Zhang et al., 1994). While queues of ~7 ribosomes have been detected in vitro (Wolin and Walter, 1988), ribosome profiling studies have revealed a queue of only a few ribosomes at stall sites in vivo (Woolstenhulme et al., 2015; Subramaniam et al., 2014; Guydosh and Green, 2014). These smaller queues can modulate protein expression only if the stall site is within a few ribosome footprints from the start codon (Mitarai et al., 2008; Liljenström and von Heijne, 1987; Tuller et al., 2010). Nevertheless, recent studies on EF-P dependent pauses in bacteria and rare-codon dependent pauses in yeast suggested that the TJ model underlies the decreased protein expression when stall sites are over 100 codons away from the start codon (Hersch et al., 2014; Chu et al., 2014). These conclusions were based on observations that decreasing initiation rate of ribosomes on reporters reduced the effect of stall sites on protein expression (Hersch et al., 2014; Chu et al., 2014). This regulatory effect of initiation rate was also observed in our experiments (Figure 4). However, we find that both the TJ and CSAT models predict this regulatory effect of initiation rate (Figure 3A), while only the CSAT model predicts a queue of few ribosomes (Figure 7B) that is observed experimentally. Thus, collision-stimulated abortive termination is a plausible alternative mechanism to the traffic jam model proposed in previous studies (Hersch et al., 2014; Chu et al., 2014).
Simple kinetic partitioning between normal elongation and abortive termination has been proposed as a possible mechanism for how ribosome rescue factors might act at ribosomes that are stalled within an mRNA (Brandman and Hegde, 2016; Shao and Hegde, 2016). However, our modeling indicates that this non-selective mechanism of abortive termination will result in decreased protein expression from mRNAs that do not have stall sites (Figure 7A, top panel). This observation can be intuitively understood from the fact that even a small probability of abortive termination during each elongation cycle will be exponentially amplified over the course of translating a typical E. coli protein with 300 amino acid residues.
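The size of this amplification can be seen from a short worked example. If abortive termination occurs with probability p at each elongation cycle, independently at every codon, the fraction of ribosomes that complete a protein of L residues is given below. The symbols p, L and P_full and the two values of p are hypothetical and chosen only to illustrate the scaling:

```latex
\begin{align}
  P_{\text{full}} &= (1 - p)^{L} \approx e^{-pL}, \\
  p = 0.01,\ L = 300 &\;\Rightarrow\; P_{\text{full}} \approx 0.05, \\
  p = 0.001,\ L = 300 &\;\Rightarrow\; P_{\text{full}} \approx 0.74 .
\end{align}
```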
Despite the better fit provided by the CSAT model to our measured YFP synthesis rates, there still remains a residual error in its prediction (Figures 4, 5 and 6). This error might arise from several simplifying assumptions in our definition of the CSAT model, which we made in order to emphasize its qualitative difference with the TJ and SAT models (Figure 2B). First, we assumed the rate of abortive termination to be zero in the absence of ribosome collisions. Relaxing this assumption is likely to provide a better fit to our measurements, but it will introduce an extra free parameter while not providing additional mechanistic insight into the kinetics of abortive termination. Second, we assumed the rate of abortive termination to be zero for the trailing ribosome in the collided state (3h in Figure 2B), since there is no biochemical evidence for such a process. This assumption could be relaxed based on evidence from future biochemical studies of ribosome queues formed at stall sites. The inverse approach used in our work relied on model predictions that did not depend sensitively on underlying kinetic parameters such as the elongation rate and the abortive termination rate at stall sites. Hence, our work cannot be used to infer the exact values of these kinetic parameters in vivo. Finally, we studied the CSAT model solely in the context of ribosome stalls caused by amino acid starvation in E. coli. Hence, the validity of this model at ribosome stalls in exponentially growing bacterial cells remains to be tested.
Ribosome collisions during amino acid starvation could stimulate abortive termination through several mechanisms. Specifically, ribosome collisions could either stimulate spontaneous drop-off of stalled ribosomes, or they could stimulate the activity of quality control pathways such as the tmRNA and the ArfA systems that rescue stalled ribosomes (Keiler, 2015; Shoemaker and Green, 2012). In the latter case, ribosome collisions might allow the quality control pathway to selectively recognize ribosomes that have been stalled for an extended duration over ribosomes that are transiently stalled due to the stochasticity of normal elongation (Miller and Buskirk, 2014). In this sense, the frequency of ribosome collisions can provide a natural timer for achieving selectivity of quality control pathways towards stalled ribosomes (Shao and Hegde, 2016). Further, the robustness of the CSAT model to changes in the abortive termination rate (Figure 7A, bottom panel) can buffer against cell-to-cell variation in the concentration of quality control factors that mediate abortive termination. Finally, ribosome collisions might also have a role in stimulating the activity of eukaryotic translational quality control pathways (C. Simms and H. Zaher, personal communication) such as No-Go mRNA decay (Doma and Parker, 2006), where the kinetic events leading to recognition of stalled ribosomes remain poorly defined (Shoemaker and Green, 2012). This general role for ribosome collisions in translational quality control could have arisen during evolution to minimize the idling of translation-competent ribosomes on mRNAs.
Materials and methods
Bacterial strains and plasmids
All leucine starvation experiments in this study were performed using an E. coli strain (Subramaniam et al., 2013) that is auxotrophic for leucine and contains the tet repressor gene for inducible control of reporter genes (ecMF1). Serine starvation experiments were performed using a similar strain that is auxotrophic for serine instead of leucine (ecMF403). All fluorescent reporters in this study were cloned into a very low copy expression vector (SC*101 ori, 3–4 copies per cell) used in our previous work (Subramaniam et al., 2013) (pASEC1, Addgene plasmid #53241). The fluorescent reporter genes used in leucine starvation experiments were based on a yellow fluorescent protein sequence (yfp0) present in pASEC1, which encodes a fast-maturing ‘Venus’ variant of YFP. All 22 leucine codons in yfp0 were chosen as CTG. All yfp reporters used for serine starvation experiments were constructed from a yfp variant that had the AGC codon at all eight serine positions. For constructing yfp reporters with single stall sites during leucine starvation, the corresponding CTG codon in yfp0 was mutated to CTA, CTC, or CTT by encoding these mutations in oligos and using Gibson assembly (Gibson et al., 2009). The single stall reporters for serine starvation were similarly constructed by mutating a single AGC codon to a TCG codon. A yfp variant with seven leucine codons mutated to CTA was used in all plate reader experiments as a control for the lower limit of detection of YFP fluorescence under Leu starvation (ecMF112). Variants of yfp with multiple CTA, CTC, CTT, or TCG codons were constructed by Gibson assembly of PCR fragments from the corresponding single codon variants of yfp. The start codon and Shine-Dalgarno sequence variants of yfp were generated by encoding these mutations in one of the PCR oligos for yfp.
The 3xflag-yfp variants were generated by the addition of a 22 codon sequence at the 5’ end that encoded a 3X-FLAG peptide used in our previous work (Subramaniam et al., 2013). All strains and plasmids used in this study are available upon request (see Supplementary file 7 for the list of strains and plasmids).
Growth and fluorescence measurements
Overnight cultures were inoculated in biological triplicates from freshly grown single colonies or patched colonies from glycerol stocks. Overnight cultures were grown in a modified MOPS rich defined medium (Subramaniam et al., 2013; Neidhardt et al., 1974) made with the following recipe: 10X MOPS rich buffer, 10X ACGU nucleobase stock, and 100X 0.132 M K2HPO4 were used at 1X final concentration as in the original recipe. In addition, the overnight growth medium contained 0.5% glucose as the carbon source, 800 µM of 19 amino acids, and 10 mM of serine. The pH was adjusted to 7.4 using 1 M NaOH, and the appropriate selective antibiotic (100 µg/ml carbenicillin) was added. 200 ng/ml of anhydro-tetracycline (aTc) was also added to induce the PLtetO-1 promoter (Lutz and Bujard, 1997). 1 ml overnight cultures were grown in 2 ml deep 96-well plates (AB0932, Fisher) at 30°C with shaking at 1200 rpm (Titramax 100 shaker) for 12 to 16 hr.
For amino acid starvation time course experiments, overnight cultures were diluted 1:100 into 150 µl of the same MOPS rich-defined medium as the overnight cultures. However, leucine was added at 100 µM and supplemented with its methyl ester analog at 160 µM (AC125130250, Fisher) for leucine starvation experiments. Similarly, serine was added at 5 mM and supplemented with its methyl ester analog at 800 µM (412201, Sigma) for serine starvation experiments. Addition of each methyl ester results in a steady but limiting supply of the amino acid due to slow hydrolysis of the ester, and this enables extended and accurate measurements of protein synthesis rate under the amino acid starvation condition (Subramaniam et al., 2013). Except for the limiting amino acid, the remaining 19 amino acids were present at the overnight culture concentrations during the amino acid starvation experiments.
Diluted overnight cultures were grown in 96-well plates (3595, Costar) at 30°C with shaking at 1200 rpm (Titramax 100 shaker). A 96-well plate reader (Infinite M1000 PRO, Tecan) was used to monitor cell density (absorbance at 600 nm) and YFP synthesis (fluorescence, excitation 504 nm and emission 540 nm). Each plate was read every 15 min and shaken in between readings for a total period of 6–10 hr.
For experiments in Figure 1, overnight cultures were grown without aTc and diluted 1:1000 into the same medium. Then when the OD600 reached 0.5, the cells were spun down at 3000 g for 5 min and then re-suspended in the same medium, but either with or without leucine, and with aTc for reporter induction. Fluorescence, Western blots, and qRT-PCR measurements in Figure 1 were performed from these cultures after shaking at 37°C, 200 rpm for 20 min with leucine or 60 min without leucine.
Polysome profiling
Overnight cultures were diluted 1:200 into 400 ml MOPS rich defined medium and grown at 37°C to an OD600 of 0.2. Cells were harvested by vacuum filtration on a 0.2 µm nitrocellulose membrane (BA83, GE), and the membrane was then cut in half. One half was added to 200 ml MOPS rich defined medium, the other to 200 ml of the same medium but without leucine. After growth at 37°C for either 20 min (Leu-rich cultures) or 1 hr (Leu starvation cultures), cells were harvested by vacuum filtration again. Cells were scraped from the membrane using a plastic spatula before the membrane became dry, and then immediately submerged in liquid nitrogen and stored at –80°C. Frozen cells were then re-suspended in 0.7 ml bacterial lysis buffer (20 mM Tris pH 8.0, 10 mM MgCl2, 100 mM NH4Cl, 2 mM DTT, 0.1% NP‐40, 0.4% Triton X‐100, 100 U/ml DNase I, and 1 mM chloramphenicol) and lysed using glass beads (G1277, Sigma) by vortexing 4 × 30 s at 4°C with 60 s cooling on ice in between. The lysate was clarified by centrifugation at 21,000 g, 4°C for 10 min and the supernatant was transferred to a fresh tube.
Lysate RNA concentration was quantified by A260 (Thermo Scientific Nanodrop) and 100–200 µl of lysate containing 0.5 mg RNA was loaded onto a 10–50% sucrose gradient made with 20 mM Tris pH 8.0, 10 mM MgCl2, 100 mM NH4Cl, and 2 mM DTT. Polysomes were separated by centrifugation in an SW41 rotor at 35,000 rpm for 3 hr at 4°C. Gradients were then fractionated into 15 fractions containing 25.6 ng spike-in control firefly luciferase mRNA. RNA from each fraction was column-purified along with in-column DNase I digestion (Quick-RNA Miniprep, Zymo Research, Irvine, CA).
Total RNA extraction
Total RNA was obtained by phenol-chloroform extraction. 10 ml of cells were quickly chilled in an ice water bath and harvested by centrifugation at 3000 g for 5 min. Cell pellets were re-suspended in 500 µl of 0.3 M sodium acetate and 10 mM EDTA pH 4.5. Re-suspended cells were mixed with 500 µl of acetate-saturated phenol-chloroform pH 4.5 and 500 µl of acid-washed glass beads (G1277, Sigma). The mixture was shaken in a vortexer for 3 min and then clarified by centrifugation at 21,000 g for 10 min. The samples were maintained at 4°C through this step. The aqueous layer was extracted twice with acetate-saturated phenol-chloroform pH 4.5 and once with chloroform. Total RNA was precipitated with an equal volume of isopropanol, washed with 70% ethanol, and finally re-suspended in 200 µl of RNase-free 10 mM Tris pH 7.0. 200 ng of the total RNA was treated with DNase I (M0303S, NEB) according to the manufacturer’s instructions to remove residual DNA contamination. The DNA-free RNA was column-purified (Quick-RNA Miniprep, Zymo Research, Irvine, CA).
Reverse transcription and quantitative PCR
Reverse transcription (RT) was performed using 10–20 ng of DNA-free RNA and Maxima reverse transcriptase (EP0741, Thermo), used according to the manufacturer’s instructions. Random hexamer primers were used for priming the RT reaction. At the end of the RT reaction, the 10 µl RT reaction was diluted 20-fold and 5 µl of this diluted sample was used as template for qPCR in the next step. qPCR was performed using Maxima SYBR Green/ROX qPCR Master Mix (FERK0221, Thermo) according to the manufacturer’s instructions. qPCR was performed in triplicate for each RT reaction and appropriate negative RT controls were used to confirm the absence of DNA contamination. gapA mRNA was used as the internal reference to normalize all other mRNA levels. Primers for qPCR were from our previous work (Subramaniam et al., 2013). The ΔCt method was used to obtain relative mRNA levels. Analysis was implemented using Python 2.7 libraries. Code for analysis and plotting of figures starting from raw qPCR data is publicly available at http://github.com/rasilab/ferrin_elife_2017 (Subramaniam, 2017) as Jupyter notebooks (Perez and Granger, 2007).
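The ΔCt normalization described above can be illustrated with a minimal sketch. This is not the analysis code from the repository: the function name and the Ct values are hypothetical, and the sketch assumes an amplification efficiency of 100% (a two-fold change per cycle).

```python
from statistics import mean

def relative_mrna_level(target_cts, reference_cts):
    """Relative mRNA level by the delta-Ct method.

    Assumes (hypothetically) perfect amplification efficiency, so that
    one PCR cycle corresponds to a two-fold change in template amount.
    """
    # Average the technical replicates, then subtract the reference Ct.
    delta_ct = mean(target_cts) - mean(reference_cts)
    return 2.0 ** (-delta_ct)

# Made-up Ct values for a yfp reporter and the gapA internal reference.
yfp_cts = [20.0, 20.0, 20.0]
gapa_cts = [18.0, 18.0, 18.0]
yfp_relative = relative_mrna_level(yfp_cts, gapa_cts)  # 2 ** -2 = 0.25
```

A reporter that crosses threshold two cycles later than gapA is thus reported at one quarter of the gapA level.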
Western blotting
Cells were harvested by centrifugation and protein was precipitated by adding trichloroacetic acid to a final concentration of 10%. The mixture was incubated on ice for 15 min and the supernatant was removed. Protein pellets were re-suspended in 100 µl 1X Laemmli Buffer (Biorad), boiled at 99°C for 5 min, and loaded onto a 4–20% polyacrylamide gel (Biorad). SDS-PAGE was carried out at 200 V for 50 min. Proteins were transferred to a nitrocellulose membrane at 500 mA for 60 min using a wet-transfer apparatus (Biorad). The membrane was cut along the 50 kD marker and both halves were blocked in Odyssey PBS Blocking Buffer (Li-cor) for 60 min. The lower-MW half was incubated with a 1:6000 dilution of an anti-FLAG antibody (F3165, Sigma), and the higher-MW half in the same dilution of an anti-σ70 antibody (WP004, Neoclone), each in 15 ml of Odyssey PBS Blocking Buffer with shaking at 4°C overnight. After washing 4 × 5 min with TBST, the membrane was incubated with a 1:10,000 dilution of a secondary dye-conjugated antibody (925–68072, Li-cor) in 15 ml of Odyssey PBS Blocking Buffer with shaking at room temperature for 60 min. After washing 4 × 5 min with PBS, the membrane was imaged using a laser-based fluorescence imager.
Growth and fluorescence data analysis
OD600 and YFP fluorescence were recorded as time series for each well of a 96-well plate. Background values for OD600 and YFP fluorescence were subtracted based on measurements from a well containing only growth medium. Time points corresponding to Leu-rich growth and Leu starvation were identified by manual inspection of OD600 curves. The onset time of starvation was automatically identified as the time point at which YFP/OD600 reached a minimum value. YFP synthesis rate during Leu-rich exponential growth was defined as the average of YFP/OD600 values for the three points around the onset time of starvation. YFP synthesis rate during Leu starvation was defined as the slope of a linear fit to the fluorescence time series in the Leu starvation regime. YFP synthesis rates for individual wells were averaged over biological replicate wells for calculation of mean and standard error. Analysis was implemented using Python 2.7 libraries. Code for analysis and plotting of figures starting from raw plate reader data is publicly available at http://github.com/rasilab/ferrin_elife_2017 (Subramaniam, 2017) as Jupyter notebooks (Perez and Granger, 2007).
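The three definitions above (onset time, Leu-rich rate, and Leu starvation rate) can be sketched for a single well as follows. The function names and the data are hypothetical, the inputs are assumed to be background-subtracted already, and the sketch starts the starvation regime at the onset index, whereas in this study the starvation time points were chosen by manual inspection.

```python
def starvation_onset_index(yfp, od):
    """Index of the time point at which YFP/OD600 is minimal."""
    ratios = [f / d for f, d in zip(yfp, od)]
    return ratios.index(min(ratios))

def rich_synthesis_rate(yfp, od, onset):
    """Mean YFP/OD600 over the three points centered on the onset index."""
    ratios = [f / d for f, d in zip(yfp, od)]
    window = ratios[onset - 1:onset + 2]
    return sum(window) / len(window)

def starvation_synthesis_rate(time_h, yfp, start, stop):
    """Least-squares slope of fluorescence versus time over [start, stop)."""
    t, y = time_h[start:stop], yfp[start:stop]
    t_mean, y_mean = sum(t) / len(t), sum(y) / len(y)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    return sxy / sxx

# Made-up, background-subtracted readings taken every 15 min.
time_h = [0.25 * i for i in range(8)]
od = [0.1, 0.2, 0.4, 0.5, 0.5, 0.5, 0.5, 0.5]
yfp = [50.0, 80.0, 120.0, 100.0, 110.0, 120.0, 130.0, 140.0]

onset = starvation_onset_index(yfp, od)
rich_rate = rich_synthesis_rate(yfp, od, onset)
starved_rate = starvation_synthesis_rate(time_h, yfp, onset, len(yfp))
```

Per-well rates computed this way would then be averaged over biological replicate wells to obtain the mean and standard error.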
Simulation
The kinetic models in Figure 2 were implemented as stochastic simulations in the object-oriented programming language C++. Separate classes were defined to represent ribosomes, mRNA transcripts, gene sequences, tRNAs, and codons. Each elongating ribosome was represented as an instance of the Ribosome class. The four distinct states of the elongating ribosome (ae, ao, 5h, 3h in Figure 2A) were tracked using three bool properties of the Ribosome class: AsiteEmpty, hitFrom5Prime, and hitFrom3Prime. The identities of the tRNAs occupying the A-site and P-site of the elongating ribosome were tracked. Only the aggregate number of ribosomes in the free state (f in Figure 2A) was tracked. Instances of the transcript class were used to track the number of proteins produced from each transcript. The gene, tRNA and codon classes were used as data structures and their properties did not change during the course of the simulation.
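As an illustration of how three boolean flags can encode the four ribosome states, here is a minimal Python sketch. It is not the C++ class from the repository: the attribute names are Python-style renderings of the flags named above, and the mapping from flag combinations to state labels, including the precedence of the collision flags, is an assumption made for this example.

```python
from dataclasses import dataclass

@dataclass
class Ribosome:
    """Hypothetical analogue of the Ribosome class described in the text."""
    a_site_empty: bool = True       # AsiteEmpty
    hit_from_5prime: bool = False   # hitFrom5Prime: a trailing ribosome has collided
    hit_from_3prime: bool = False   # hitFrom3Prime: blocked by the ribosome ahead

    @property
    def state(self):
        # Assumed precedence: collision flags take priority over A-site status.
        if self.hit_from_5prime:
            return "5h"
        if self.hit_from_3prime:
            return "3h"
        return "ae" if self.a_site_empty else "ao"

stalled = Ribosome(a_site_empty=True)
collided = Ribosome(a_site_empty=True, hit_from_5prime=True)
```

In this sketch a stalled ribosome reports the state ae until a trailing ribosome reaches it, at which point it reports 5h.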
Since our reporters were expressed from very low copy number plasmids, translation of the reporter mRNAs is not expected to perturb the native translation machinery in the cell. Therefore we assumed that both the translation rate of native mRNAs, as well as the pool of free ribosomes and aminoacyl-tRNAs remain constant across all reporters used in this study. Hence, each simulation considered a minimal set of two mRNA molecules that both encoded YFP. The first mRNA molecule was a control yfp sequence without any CTA, CTC or CTT codon. The second mRNA molecule was the test yfp sequence with the CTA, CTC or CTT codon as specified for individual simulations. The simultaneous translation of the two mRNA molecules was simply to ensure that we used exactly the same set of parameters for our test and control reporters during simulation runs and subsequent analyses.
We simulated four different molecular processes during translation: initiation, elongation, aminoacylation and abortive termination. All other steps in translation, such as termination and ribosome recycling, were treated as instantaneous.
The initiation rate of all mRNA sequences was set as 0.3 s⁻¹ [a typical value for E. coli mRNAs (Kennell and Riezman, 1977; Subramaniam et al., 2014)] except when this rate was explicitly varied, either to demonstrate its effect in our kinetic models (Figure 3A) or for experimental fits (Figure 4, Figure 4–figure supplement 1). For the experimental fits in Figure 4 and Figure 4–figure supplement 1, the measured YFP synthesis rate of the initiation region mutants during Leu-rich growth relative to the starting sequence (four in Figure 4) was used to scale the default initiation rate of 0.3 s⁻¹.
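The scaling of the initiation rate for the initiation region mutants amounts to simple proportionality, sketched below. The function name and the synthesis rate values are hypothetical; only the default rate of 0.3 per second comes from the text.

```python
DEFAULT_INITIATION_RATE = 0.3  # s^-1, default value quoted in the text

def scaled_initiation_rate(mutant_rich_rate, starting_rich_rate,
                           default_rate=DEFAULT_INITIATION_RATE):
    """Scale the default initiation rate by the Leu-rich YFP synthesis
    rate of a mutant relative to the starting sequence."""
    return default_rate * mutant_rich_rate / starting_rich_rate

# Made-up example: a mutant with half the Leu-rich YFP synthesis rate
# of the starting sequence is assigned half the default initiation rate.
mutant_rate = scaled_initiation_rate(mutant_rich_rate=50.0,
                                     starting_rich_rate=100.0)
```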
Elongation cycle of ribosomes at each codon was divided into two steps:
In the first elongation step, the cognate tRNA is accommodated into the A-site. The rate of tRNA accommodation was chosen to be non-zero only when ribosomes are in the ae state. The tRNA accommodation rate for all codons was calculated as the product of a pseudo first-order rate constant (2 × 10⁷ M⁻¹ s⁻¹), the concentration of individual tRNAs, and a weight factor to account for codon-anticodon pairing strength. The concentration of tRNAs and the weight factors were based on measured concentration of E. coli tRNAs (Dong et al., 1996) and known wobble-pairing rules (Subramaniam et al., 2014; Shah et al., 2013). Leucine starvation was simulated using a previous whole cell model of translation (Subramaniam et al., 2014). The steady-state charged fraction of all tRNAs from this whole-cell model during leucine starvation was used for our yfp reporter simulation as the default values. To fit the measured YFP synthesis rate of single stall-site variants (Figures 4, 5 and 6, Figure 4–figure supplement 1, Figure 5–figure supplement 1, Figure 6–figure supplement 1), the tRNA accommodation rate at CTA, CTC and CTT codons was systematically varied in the three kinetic models. These fit values were used for illustrating the predictions from the kinetic models in Figure 3 and Figure 7.
In the second elongation step, the peptide bond is formed and ribosomes translocate to the next codon. This rate was set to 22 s⁻¹, equal to the maximum measured rate of in vivo elongation (Bremer and Dennis, 1996).
The aminoacylation rate for all tRNAs was calculated as the product of a pseudo first-order rate constant (2 × 10¹⁰ M⁻¹ s⁻¹) and the concentration of individual tRNAs. Even though we simulated this process explicitly, we did not lower this rate for leucine tRNAs to simulate leucine starvation; instead, we accounted for leucine starvation by using the steady-state charged fraction of leucine tRNAs from our whole-cell model as mentioned above in our discussion of tRNA elongation rate. This modified procedure enabled us to simulate the translation of just the yfp reporters without considering all the endogenous mRNAs in the cell.
The abortive termination rate was set to a value of 1 s⁻¹ in the SAT and CSAT models and 0 s⁻¹ in the TJ model, except when this rate was explicitly varied (Figure 7A). We chose this rate to be of the same approximate value as in our ribosome profiling studies (Subramaniam et al., 2014). The exact value of this rate is not critical in our SAT and CSAT models since the fitted value of the elongation rate varies accordingly to reproduce the measured protein synthesis rate from our YFP reporters with single stall sites.
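The two-step elongation cycle can be made concrete with a short sketch of the rate calculation. The two rate constants are the values quoted above; the function names, the tRNA concentration, and the weight factor are hypothetical, and the dwell time formula assumes two sequential exponential steps with no collision-induced blocking.

```python
K_ACCOMMODATION = 2e7    # M^-1 s^-1, rate constant quoted in the text
K_TRANSLOCATION = 22.0   # s^-1, peptide bond formation plus translocation

def accommodation_rate(trna_conc_molar, pairing_weight):
    """Step 1: rate constant x tRNA concentration x codon-anticodon weight."""
    return K_ACCOMMODATION * trna_conc_molar * pairing_weight

def mean_codon_dwell_time(trna_conc_molar, pairing_weight):
    """Mean time per codon for two sequential exponential steps
    (assumes the ribosome is never blocked by a ribosome ahead)."""
    step1 = 1.0 / accommodation_rate(trna_conc_molar, pairing_weight)
    step2 = 1.0 / K_TRANSLOCATION
    return step1 + step2

# Made-up example: 1 uM cognate tRNA with a weight factor of 1.
rate = accommodation_rate(1e-6, 1.0)        # about 20 s^-1
dwell = mean_codon_dwell_time(1e-6, 1.0)    # about 0.095 s
```

Lowering the tRNA concentration or the weight factor lengthens only the first step, which is how a stall codon is represented in this picture.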
The simulations used a stochastic Gillespie algorithm that was implemented in earlier studies (Subramaniam et al., 2014; Shah et al., 2013). Each simulation was run until 10,000 full-length YFP molecules were produced from the control yfp mRNA without stall-inducing codons. The number of full-length YFP molecules produced in the same duration from the second yfp mRNA with stall-inducing codons was used to calculate the YFP synthesis rate (in Figures 3, 4, 5, 6 and 7, Figure 4—figure supplement 1, Figure 5—figure supplement 1, Figure 6—figure supplement 1) after normalizing by 10,000. Time-averaged ribosome density on each mRNA was also tracked during the simulation run after 100 YFP molecules were produced from the first yfp mRNA, and this density was median-normalized for plotting in Figure 7B.
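To illustrate the Gillespie approach, the following is a heavily simplified sketch and not the C++ simulation from the repository. It assumes a single elongation step per codon, a hypothetical 60-codon mRNA, a ribosome footprint of 10 codons, and a collision-stimulated abortive termination rule under which a ribosome at a stall codon aborts only while a trailing ribosome is directly behind it. It also simulates the control and test mRNAs separately for a fixed duration, rather than together until 10,000 control proteins are made. All function and parameter names are hypothetical.

```python
import random

def simulate_mrna(duration, stall_sites=(), n_codons=60, k_init=0.3,
                  k_elong=22.0, k_stall=0.1, k_abort=1.0, footprint=10,
                  seed=0):
    """Gillespie simulation of one mRNA; returns full-length proteins made."""
    rng = random.Random(seed)
    stall_sites = set(stall_sites)
    ribosomes = []   # codon positions, ordered from 5' end to 3' end
    t, proteins = 0.0, 0
    while True:
        # Enumerate every event that is currently possible, with its rate.
        events = []
        if not ribosomes or ribosomes[0] >= footprint:
            events.append((k_init, "init", None))
        for i, pos in enumerate(ribosomes):
            blocked = (i + 1 < len(ribosomes)
                       and ribosomes[i + 1] - pos <= footprint)
            if not blocked:
                rate = k_stall if pos in stall_sites else k_elong
                events.append((rate, "elong", i))
            hit_from_5prime = i > 0 and pos - ribosomes[i - 1] <= footprint
            if pos in stall_sites and hit_from_5prime:
                events.append((k_abort, "abort", i))
        # Draw the waiting time from the total rate, then pick one event
        # with probability proportional to its rate.
        total = sum(rate for rate, _, _ in events)
        t += rng.expovariate(total)
        if t > duration:
            return proteins
        r = rng.random() * total
        for rate, kind, i in events:
            r -= rate
            if r < 0:
                break
        if kind == "init":
            ribosomes.insert(0, 0)
        elif kind == "abort":
            ribosomes.pop(i)
        elif ribosomes[i] == n_codons - 1:
            ribosomes.pop(i)       # termination releases a full-length protein
            proteins += 1
        else:
            ribosomes[i] += 1

control = simulate_mrna(duration=500.0)
test = simulate_mrna(duration=500.0, stall_sites=(30,))
relative_synthesis_rate = test / control
```

Dividing the protein count of the stall-containing mRNA by that of the control over the same duration mirrors the normalization described above, in which the test count is divided by the 10,000 proteins made from the control.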
Code for creating simulation input files, running the simulation, and plotting of figures starting from simulation results is publicly available at http://github.com/rasilab/ferrin_elife_2017 (Subramaniam, 2017) as Jupyter notebooks (Perez and Granger, 2007). Parameters common to all simulations are listed in Supplementary file 6. Parameters specific to simulations in individual figures are listed in Supplementary files 1–5.
Data accession
Raw data and programming code for reproducing all figures in this paper are publicly available at: http://github.com/rasilab/ferrin_elife_2017 (Subramaniam, 2017, with a copy archived at https://github.com/elifesciences-publications/ferrin_elife_2017).
References
- Codon preferences in free-living microorganisms. Microbiological Reviews 54:198–210.
- tRNA selection and kinetic proofreading in translation. Nature Structural & Molecular Biology 11:1008–1014. https://doi.org/10.1038/nsmb831
- Peptide chain elongation rate and ribosomal activity in Saccharomyces cerevisiae as a function of the growth rate. MGG Molecular & General Genetics 170:225–230. https://doi.org/10.1007/BF00337800
- Ribosome-associated protein quality control. Nature Structural & Molecular Biology 23:7–15. https://doi.org/10.1038/nsmb.3147
- Modulation of Chemical Composition and Other Parameters of the Cell by Growth Rate. In: … Salmonella, pp. 1553–1569.
- Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach (book). New York, NY: Springer.
- Roles for synonymous codon usage in protein biogenesis. Annual Review of Biophysics 44:143–166. https://doi.org/10.1146/annurev-biophys-060414-034333
- Co-variation of tRNA abundance and Codon usage in Escherichia coli at different growth rates. Journal of Molecular Biology 260:649–663. https://doi.org/10.1006/jmbi.1996.0428
- Translation initiation rate determines the impact of ribosome stalling on bacterial protein synthesis. Journal of Biological Chemistry 289:28160–28171. https://doi.org/10.1074/jbc.M114.593277
- Ribosome profiling: new views of translation, from single codons to genome scale. Nature Reviews Genetics 15:205–213. https://doi.org/10.1038/nrg3645
- Ribosome rescue by tmRNA requires truncated mRNAs. Journal of Molecular Biology 338:33–41. https://doi.org/10.1016/j.jmb.2004.02.043
- Mechanisms of ribosome rescue in bacteria. Nature Reviews Microbiology 13:285–297. https://doi.org/10.1038/nrmicro3438
- Transcription and translation initiation frequencies of the Escherichia coli lac operon. Journal of Molecular Biology 114:1–21. https://doi.org/10.1016/0022-2836(77)90279-0
- Translational accuracy and the fitness of bacteria. Annual Review of Genetics 26:29–50. https://doi.org/10.1146/annurev.ge.26.120192.000333
- Translation rate modification by preferential codon usage: intragenic position effects. Journal of Theoretical Biology 124:43–55. https://doi.org/10.1016/S0022-5193(87)80251-5
- The SmpB C-terminal tail helps tmRNA to recognize and enter stalled ribosomes. Frontiers in Microbiology 5:462. https://doi.org/10.3389/fmicb.2014.00462
- Ribosome collisions and translation efficiency: optimization by Codon usage and mRNA destabilization. Journal of Molecular Biology 382:236–245. https://doi.org/10.1016/j.jmb.2008.06.068
- Ribosome rescue: tmRNA tagging activity and capacity in Escherichia coli. Molecular Microbiology 58:456–466. https://doi.org/10.1111/j.1365-2958.2005.04832.x
- IPython: a system for Interactive Scientific Computing. Computing in Science & Engineering 9:21–29. https://doi.org/10.1109/MCSE.2007.53
- SsrA-mediated peptide tagging caused by rare codons and tRNA scarcity. The EMBO Journal 18:4579–4589. https://doi.org/10.1093/emboj/18.16.4579
- Target selection during protein Quality Control. Trends in Biochemical Sciences 41:124–137. https://doi.org/10.1016/j.tibs.2015.10.007
- Translation drives mRNA quality control. Nature Structural & Molecular Biology 19:594–601. https://doi.org/10.1038/nsmb.2301
- Quantitative assessment of ribosome drop-off in E. coli. Nucleic Acids Research 44:2528–2537. https://doi.org/10.1093/nar/gkw137
- Structural basis of the translational elongation cycle. Annual Review of Biochemistry 82:203–236. https://doi.org/10.1146/annurev-biochem-113009-092313
- Mechanisms of elongation on the ribosome: dynamics of a macromolecular machine. Biochemical Society Transactions 32:733. https://doi.org/10.1042/BST0320733
- Ribosome pausing and stacking during translation of a eukaryotic mRNA. The EMBO Journal 7:3559–3569.
- Clustering of low usage codons and ribosome movement. Journal of Theoretical Biology 170:339–354. https://doi.org/10.1006/jtbi.1994.1196
- Global and local depletion of ternary complex limits translational elongation. Nucleic Acids Research 38:4778–4787. https://doi.org/10.1093/nar/gkq196
Article and author information
Funding
National Institute of General Medical Sciences (R35 GM119835)
- Michael A Ferrin
- Arvind R Subramaniam
Fred Hutchinson Cancer Research Center
- Michael A Ferrin
- Arvind R Subramaniam
National Institute of General Medical Sciences (R00 GM107113)
- Michael A Ferrin
- Arvind R Subramaniam
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
We thank Robert Bradley, Allen Buskirk, Erick Matsen, Premal Shah, Kevin Wood, Hani Zaher, and Brian Zid for discussions. Funding for this work was provided by NIH grant R35 GM119835, NIH grant R00 GM107113, and startup funds from the Fred Hutchinson Cancer Research Center. The computations in this paper were run on the Gizmo cluster supported by the Scientific Computing group at the Fred Hutchinson Cancer Research Center. ARS dedicates this work to his late father, Perinkulam Ramnathan.
Copyright
© 2017, Ferrin et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.