Estimating the protein burden limit of yeast cells by measuring the expression limits of glycolytic proteins
Abstract
The ultimate overexpression of a protein could cause growth defects, which are known as the protein burden. However, the expression limit at which the protein-burden effect is triggered is still unclear. To estimate this limit, we systematically measured the overexpression limits of glycolytic proteins in Saccharomyces cerevisiae. The limits of some glycolytic proteins were up to 15% of the total cellular protein. These limits were independent of the proteins’ catalytic activities, a finding that was supported by an in silico analysis. Some proteins had low expression limits that were explained by their localization and metabolic perturbations. The codon usage should be highly optimized to trigger the protein-burden effect, even under strong transcriptional induction. The S–S-bond-connected aggregation mediated by the cysteine residues of a protein might affect its expression limit. Theoretically, only non-harmful proteins could be expressed up to the protein-burden limit. Therefore, we established a framework to distinguish proteins that are harmful and non-harmful upon overexpression.
https://doi.org/10.7554/eLife.34595.001
eLife digest
If a cell makes too much of a given protein, it can sometimes cause problems and impair the cell’s growth. Overproducing some proteins may deplete the cell’s limited resources, meaning it does not have enough to make other more essential proteins. This phenomenon is known as the protein burden effect. Theoretically, only harmless proteins can be overproduced up to a level where growth would be impaired in this way. Conversely, if an overproduced protein causes harm before it becomes a burden on resources, scientists must consider other mechanisms to explain the cell’s problems, namely that the protein itself is harmful.
Knowing the ultimate level of protein production that could cause the protein burden effect – the protein burden limit – would allow scientists to distinguish between harmful and non-harmful proteins. However, to date, this limit had not been defined for any cell.
Eguchi et al. have now tried to estimate the protein burden limit for budding yeast – one of the best-studied experimental organisms. The experiments first focused on enzymes involved in alcoholic fermentation because they were expected to be non-harmful. Some of these enzymes were overproduced to the level where they made up 15% of all the cell’s proteins before they started to cause growth defects. The same results were seen with versions of the enzymes that had been mutated to be less active, leading Eguchi et al. to conclude that this level is the protein burden limit.
In other experiments, harmful enzymes could only be overproduced to levels that were far less than this proposed protein burden limit. These enzymes caused problems for the yeast in several ways, including interfering with biochemical reactions and forming large aggregates in the cell. Lastly, Eguchi et al. looked at the yeast’s genetic code and saw that most of its genes seemed to have evolved to specifically limit the production of proteins to a level that would avoid the unwanted protein burden effect.
Together these findings establish a framework to clearly distinguish between harmful and non-harmful proteins. This framework will be useful to understand the different reasons why the overproduction of certain proteins, which is seen in neurodegenerative diseases and cancer cells, can cause problems for cells.
https://doi.org/10.7554/eLife.34595.002
Introduction
Protein overexpression is sometimes harmful to cellular growth (Makanae et al., 2013; Sopko et al., 2006), and a few mechanisms that could result in overexpression-triggered growth defects have been proposed (Moriya, 2015). Resource overload, stoichiometric imbalance, promiscuous interaction, and pathway modulation are triggered upon overexpression of, respectively, (i) a protein that has a high demand of cellular resources (Dong et al., 1995; Kintaka et al., 2016; Stoebel et al., 2008), (ii) a protein that is part of a protein complex (Kaizu et al., 2010; Makanae et al., 2013; Papp et al., 2003), (iii) a protein with a nonspecific interaction domain (Ma et al., 2010; Vavouri et al., 2009), and (iv) a protein that catalyzes a pathway (Prelich, 2012; Youn et al., 2017). The mechanism of protein overexpression-triggered growth defects depends on the protein’s structural and functional characteristics, which are not always fully understood yet. Therefore, it is still difficult to predict whether the overexpression of a particular protein will be harmful to cellular growth and which mechanisms cause the harmful effect.
The ultimate overexpression of a protein could be harmful for cellular growth, because it monopolizes and depletes limited resources that are involved in protein production, such as ribosomes and aminoacyl-tRNAs (Gong et al., 2006; Shachrai et al., 2010; Vind et al., 1993). This phenomenon is known as the protein burden/cost effect (Kafri et al., 2016; Snoep et al., 1995). Proteins that have no harmful effects on cellular functions can be overexpressed up to a level that causes protein-burden–triggered growth defects. Conversely, if a protein cannot be overexpressed up to that level because it adversely affects cellular functioning, then overexpression of that protein will cause growth defects at relatively low expression levels, and we should consider mechanisms causing the defects.
We previously developed a genetic tug-of-war (gTOW) method that can be used to estimate the expression limit of a target protein that triggers growth defects in the yeast Saccharomyces cerevisiae (Makanae et al., 2013; Moriya et al., 2006, 2012). We estimated that the expression limit of a green fluorescent protein (GFP) was about 15% of the total cellular protein in S. cerevisiae (Kintaka et al., 2016). Because GFP is a highly structured cytoplasmic protein unrelated to the cellular functions of yeast and thus harmless, this level could be considered the expression limit for any protein that causes growth defects triggered by the protein-burden effect.
We predicted that the expression limits of some native highly expressed glycolytic proteins would be similar (>15% of the total cellular protein) (Moriya, 2015), suggesting that overexpression of these proteins would be harmless even though they have metabolic functions in yeast. The prediction was performed by the calculation of the proteins’ native expression levels (Kulak et al., 2014) and their gene copy number limits as determined by gTOW analysis, and has not yet been experimentally validated (Moriya, 2015). In this study, therefore, we tried to measure the expression limits of 29 glycolytic proteins to assess whether they are expressed up to levels that cause growth defects triggered by the protein-burden effect. There are five reasons why we chose glycolytic proteins: (1) because they are generally highly expressed and thus considered non-harmful upon high-level expression, they are excellent targets for examining whether they are expressed up to the protein-burden limit; (2) because they have been intensely studied, we have information that can allow us to manipulate their catalytic activities; (3) because the glycolytic pathway is one of the best-known metabolic pathways, we can predict and measure metabolic changes upon overexpression of these proteins; (4) they include a heteromer (Pfks), a mitochondrially localized protein (Adh3), and membrane proteins (Hxts), so we can assess how these characteristics affect expression limits; and (5) they include paralogs whose expressions are differently regulated, so that we can test how their differences affect their expression limits.
We found that the expression limits of most of the 29 proteins were comparable to that of GFP and were independent of their catalytic activities, as suggested by a kinetic model of yeast glycolysis, confirming that their overexpression was harmless. Also, some of the proteins had far lower expression limits than those that would create a protein burden (the expression limit of GFP), and their harmful effects were derived from their localization and metabolic perturbations. Owing to their low codon optimality, native poorly expressed isozymes were not produced at levels sufficient to cause growth defects, even when they were expressed from the strong TDH3 promoter on the multicopy plasmids. Some glycolytic proteins formed S–S-bond-mediated aggregates when overexpressed, and this aggregation also seemed to restrict their expression limits.
Results
Overexpression of glycolytic proteins from a strong promoter on a multicopy plasmid causes growth defects
Figure 1A shows the experimental system (plasmid) used to express glycolytic proteins to limits that cause growth defects. The target glycolytic proteins analyzed in this study and their characteristics are summarized in Supplementary file 1. We cloned each target gene on the gTOW plasmid (pTOW40836) (Moriya et al., 2012), such that the target protein was expressed under the control of the strong TDH3 promoter. As exceptions, PFK1 and PFK2 were expressed from the weaker PYK1/CDC19 promoter because their expression from the TDH3 promoter was too strong and consequently the growth of the transformants was very poor (data not shown). The plasmids were used to transform the S. cerevisiae strain BY4741 (ura3Δ leu2Δ). Copy numbers of the plasmid within the cell were controlled by changing growth conditions: up to 35 copies per cell in +leucine (–uracil) conditions (low-copy conditions) and up to 150 copies per cell in –leucine conditions (high-copy conditions) due to the biases of the 2 µm ORI and leu2-89 (Moriya et al., 2012). In this experimental system, the maximum growth rates of the cells with the vector in +leucine conditions are much greater than those in –leucine conditions (see Figure 1B–C), probably because the copy number of leu2-89 is not sufficient to fully support the leucine requirement in –leucine conditions. We measured the expression limits of most of the 29 target proteins in low-copy conditions because the expression levels produced under these conditions were already sufficient to cause growth defects.

Overexpression of most glycolytic proteins using a strong promoter and a multicopy plasmid causes growth defects.
(A) The plasmid used in this study. Each glycolytic gene was cloned into a 2-µm-based multicopy plasmid (pTOW40836) and expressed from the TDH3 promoter (TDH3pro) (with the exception of PFK1 and PFK2, which were expressed from the PYK1 promoter [PYK1pro], represented in the figure as (p)). In +leucine conditions, the copy number of the plasmid is relatively low (~30). In –leucine conditions, the copy number goes up to 150 copies per cell due to the bias of leu2-89 (Moriya et al., 2012). Here, we designate these conditions low- and high-copy conditions, respectively. (B and C) Maximum growth rate of yeast cells harboring the plasmid overexpressing each glycolytic protein in the indicated growth conditions. The unit is min−1 × 10−4. (D and E) Copy number of the plasmid overexpressing each glycolytic protein in the indicated growth conditions. The unit is copy number per haploid genome. The error bars show the standard deviation of at least three independent biological measurements.
Figure 1—source data 1
This spreadsheet contains all data and statistical values associated with the figure.
https://doi.org/10.7554/eLife.34595.004
We first measured the growth rates of cells harboring gTOW plasmids. As shown in Figure 1B, all cells expressing glycolytic proteins, with the exceptions of those expressing HXT1, HXT3, and HXT4, showed significant growth retardation compared to the vector control cells in low-copy conditions (p<0.01, Welch's t-test, Figure 1—source data 1), indicating that expression of most of the glycolytic proteins caused growth defects. This observation was confirmed by the growth measurement in the high-copy conditions shown in Figure 1C, as cells expressing most of the glycolytic proteins did not grow in these conditions. Cells expressing GLK1, FBA1, GPM1, PYK2, PDC6, and ADH4 could grow in high-copy conditions, although their growth rate was significantly lower than that of the vector control (p<0.01, Welch's t-test, Figure 1—source data 1).
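The comparisons above rely on Welch's t-test, which does not assume equal variances between the two groups. The following is a minimal sketch of the test statistic and its Welch–Satterthwaite degrees of freedom; the growth-rate values are hypothetical and are not taken from Figure 1—source data 1, and the function name `welch_t` is introduced here only for illustration.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical maximum growth rates (min^-1 x 10^-4), three replicates each.
vector_control = [60.0, 62.0, 61.0]
overexpressor = [40.0, 45.0, 42.0]
t_stat, dof = welch_t(vector_control, overexpressor)
print(f"t = {t_stat:.2f}, df = {dof:.2f}")
```

The p-value would then be read from a t distribution with `dof` degrees of freedom; that step is omitted here to keep the sketch dependency-free.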
As previously reported, the copy number of the gTOW plasmid inside the cell inversely reflects the deleterious effect of protein expression from the plasmid due to the gTOW effect: the plasmid copy number is low if expression of the target protein is harmful to cellular growth, and high if expression of the target protein is less harmful (Kintaka et al., 2016; Makanae et al., 2013; Moriya et al., 2006, 2012). Figure 1D and E show the copy numbers of gTOW plasmids in low- and high-copy conditions. In high-copy conditions, the copy numbers of gTOW plasmids expressing only GLK1, FBA1, GPM1, PYK2, PDC6, and ADH4 were determined because yeast containing plasmids expressing the other protein-coding genes failed to grow. The copy numbers of all gTOW plasmids containing target genes were significantly lower than those containing the empty vector (p<0.05, Welch's t-test, Figure 1—source data 1), confirming that they were expressed up to levels that caused growth defects in this experimental system. Because the copy numbers of plasmids expressing most of the glycolytic proteins tested here, other than GLK1, PYK2, and ADH4, were not greater than that of the plasmid expressing GFP, the expression of most glycolytic proteins in this experimental system seemed to be at least as deleterious as that of GFP.
We concluded that most glycolytic proteins were expressed close to their upper limits, even in low-copy conditions, and that a copy number increase in high-copy conditions was required to express GLK1, FBA1, GPM1, PYK2, PDC6, and ADH4 to their limits.
Measurement of the expression limits of glycolytic proteins
Next, we measured the expression levels of proteins within the cells, overexpressing them from the gTOW plasmid. Figure 2A shows how protein abundance was estimated. As reported previously, when GFP is expressed up to its expression limit (and probably to the level required to trigger the protein-burden effect), the protein is visible within whole cellular proteins separated by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) (Kintaka et al., 2016). Because most glycolytic proteins were also expressed to similar levels by our experimental system, we measured the expression levels in arbitrary units (AU) as the relative intensities of target protein bands within the total protein separated by SDS-PAGE. The AU is considered to reflect the total number of amino acids within the band, and the relative number of protein molecules can be estimated by dividing AU by the protein length. When two proteins of different sizes give bands of the same number of AUs, the molecule number of the larger protein in the band should be lower than that of the smaller protein. The relationship between the AU and the percentage of total protein that we previously reported (Kintaka et al., 2016) was estimated, as shown in Figure 2—figure supplement 1, as % total protein = 5.5 × AU. Representative images of SDS-PAGE-separated total proteins from cells harboring gTOW plasmids containing the target glycolytic protein genes are shown in Figure 2—figure supplement 2. As shown in Figure 2B, most proteins were expressed at levels high enough to make them visible within the SDS-PAGE–separated whole cellular proteins, and the expression levels of Pgk1, Gpm1, Eno2, and Eno1 were higher than that of GFP. By contrast, the expression levels of Pfk1, Adh3, and Hxts were almost undetectable with this experimental system. The x-fold increase in the expression of each target protein over its native level is shown in Figure 2—figure supplement 3.
The expression of some proteins was increased more than 10,000-fold in this experimental system. The expression of Tdh3 and Gpm1 further increased in –leucine conditions (Figure 2C), but the cells in this condition had stunted growth (Figure 1C).
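The two conversions described above (AU to percentage of total protein, and AU to a relative molecule number) can be written out as a short sketch. The factor 5.5 is the one quoted in the text; the function names, AU values, and protein lengths below are hypothetical and chosen only to illustrate the arithmetic.

```python
# Conversion quoted in the text: % total protein = 5.5 x AU.
PERCENT_PER_AU = 5.5


def percent_of_total_protein(au):
    """Convert a band intensity in AU to a percentage of total protein."""
    return PERCENT_PER_AU * au


def relative_molecule_number(au, protein_length_aa):
    """AU reflects total amino acids in the band, so dividing by the
    protein length gives a relative (unitless) number of molecules."""
    return au / protein_length_aa


# Two hypothetical proteins giving bands of the same intensity (2.0 AU).
small = relative_molecule_number(2.0, 250)  # 250-residue protein
large = relative_molecule_number(2.0, 500)  # 500-residue protein
print(percent_of_total_protein(2.0))  # same share of protein mass for both
print(small / large)                  # the smaller protein has more molecules
```

In this example both bands correspond to the same share of total protein, yet the shorter protein is present at twice the relative molecule number, which is the point made in the paragraph above.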

Expression limits of glycolytic proteins.
(A) Measurement of the expression level of an overexpressed glycolytic protein. Whole cellular proteins were stained by a fluorescent dye and separated by SDS-PAGE. The overexpression of the target protein (TBS), estimated from the intensity of its band on the gel and its molecular weight, was compared with the corresponding expression of the vector control (TBV), after normalization using control bands (CBS and CBV) to calculate protein expression (AU). Measurement of Gpm1 expression level is shown as an example. (B and C) The expression level of glycolytic proteins overexpressed using the experimental system shown in Figure 1 in the indicated conditions. The TDH3 promoter was used for the expression of all genes except where (p) indicates the use of the PYK1 promoter. (D) Relationship between copy number and protein level in low-copy conditions. The copy number data are the same as in Figure 1D, and the protein level data are the same as in Figure 2B. The error bars show the standard deviation of at least three independent biological measurements.
Figure 2—source data 1
This spreadsheet contains all data and statistical values associated with the figure.
https://doi.org/10.7554/eLife.34595.009
There could be two reasons that explain why the expression level of a protein is low: (i) its strong overexpression is harmful to cellular growth and (ii) its expression is repressed. We can distinguish these two possibilities by comparing the copy numbers and protein abundance as shown in Figure 2D because the copy number of the plasmid inversely reflects the deleterious effect of protein expression as described above. Overexpression of Pfk1, Adh3, and Hxts seemed harmful because their copy numbers were lower than those of the other proteins (red circles in Figure 2D). By contrast, the expression of Glk1, Pyk2, and Pdc6 seemed to be repressed because their copy numbers were higher than those of the other proteins (blue circles in Figure 2D). The relationship between protein expression levels and copy numbers (shown in Figure 2D) also suggested that expression levels are not solely determined by the promoter, because there was no significant correlation between the expression levels and plasmid copy numbers (Pearson r=0.28, p=0.12).
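The reasoning in this paragraph can be sketched in a few lines: a Pearson correlation between copy number and protein level, and a simple reading of a poorly expressed protein based on its plasmid copy number. This is only an illustration under stated assumptions; the reference copy number, the function names, and all values are hypothetical, and the rule is a heuristic restatement of the argument rather than the procedure used to draw Figure 2D.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def interpret_low_expression(copy_number, reference_copy_number):
    """Heuristic for a protein whose expression level is low.

    A low plasmid copy number suggests that expression is harmful (the
    gTOW effect); a high copy number suggests that expression is
    repressed. `reference_copy_number` is a hypothetical cut-off.
    """
    if copy_number < reference_copy_number:
        return "likely harmful"
    return "likely repressed"


# Hypothetical copy numbers and protein levels (AU) for five proteins.
copy_numbers = [5.0, 12.0, 15.0, 14.0, 30.0]
protein_levels = [0.1, 2.5, 3.0, 1.8, 0.4]
print(round(pearson_r(copy_numbers, protein_levels), 2))
print(interpret_low_expression(5.0, reference_copy_number=14.0))
print(interpret_low_expression(30.0, reference_copy_number=14.0))
```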
Mutations in catalytic centers do not affect the expression limits of most glycolytic proteins
Next, we tried to reveal the factors causing harmful effects that restrict expression limits, and the mechanisms that repress protein expression. Overproduction of glycolytic proteins might cause metabolic perturbations by accelerating the reactions that these proteins catalyze. To test whether growth inhibition caused by metabolic perturbations limits the expression levels of glycolytic proteins, we analyzed the expression limits of mutant proteins with reduced enzymatic activities by introducing mutations into the catalytic centers (here we call the mutant a ‘CC mutant’). The mutations introduced into the glycolytic proteins are summarized in Supplementary file 1. Figure 3A shows the expression levels of wild-type and mutant proteins in low-copy conditions. The expression levels of all proteins except Pfk1, Fba1, Tdh3, and Eno1 were not significantly changed by introducing mutations. The expression levels of mutant Pfk1 and Tdh3 were significantly higher than those of wild-type proteins, and the expression levels of mutant Fba1 and Eno1 were significantly lower than those of wild-type proteins (p<0.05, Welch's t-test, Figure 3—source data 1). For Pfk1, Tdh3, and Pfk2 (which catalyzes the same reaction as Pfk1), we further analyzed the expression levels in high-copy conditions (Figure 3B). The expression level of mutant Pfk2 significantly increased when compared with that of wild-type Pfk2 (p=0.046, Welch's t-test). The expression levels of both wild-type and mutant Pfk1 were almost undetectable in these conditions, probably because their high-level expression was too toxic to the yeast cells. Because the expression level of wild-type Tdh3 was greater than that of mutant Tdh3, the enzymatic activity of Tdh3 probably did not restrict its protein expression limit.
We concluded that the expression limits of most of the glycolytic proteins studied here are not restricted by metabolic perturbations triggered by their overproduction, whereas the expression limits of Pfk1 and Pfk2 are exceptionally restricted by metabolic perturbations. The expression levels of mutant Pfk1 and Pfk2, however, remained markedly lower than those of other glycolytic proteins, suggesting that other factors also influence their expression limits.

Effects of mutations on the expression limits of glycolytic proteins.
(A and B) Expression levels of wild-type and CC mutant glycolytic proteins in the indicated conditions. Each CC mutant has a mutation in the position shown in Supplementary file 1. (C) SDS-PAGE gel images of whole cellular proteins overexpressing Adh3 and ΔMTS-Adh3 in low-copy conditions. Red dots indicate the expected sizes of the target proteins. (D) Protein expression levels of Adh3 and ΔMTS-Adh3 in low-copy conditions. The error bars indicate the standard deviation of the mean. *p<0.05; **p<0.01 in Welch's t-test.
Figure 3—source data 1
This spreadsheet contains all data and statistical values associated with the figure.
https://doi.org/10.7554/eLife.34595.012
Mitochondrial localization restricts the expression limit of Adh3
Next, we focused on Adh3, whose expression level was lower than those of the other glycolytic proteins, probably because high-level expression of Adh3 is harmful (Figure 2D). This harmful effect, however, is not triggered by metabolic perturbations, because the expression level of the mutant Adh3 with reduced enzymatic activity was almost the same as that of wild-type Adh3 (Figure 3A). Among the glycolytic proteins tested in this study, Adh3 alone is a mitochondrial protein (Young and Pilgrim, 1985). To test whether the mitochondrial localization of Adh3 restricts its protein expression limit, we constructed a mutant without the mitochondrial targeting sequence (ΔMTS-Adh3, Figure 3—figure supplement 1) and compared its expression level to that of wild-type Adh3. As shown in Figure 3C and D, the expression level of ΔMTS-Adh3 was about three times higher than that of wild-type Adh3. We concluded that the mitochondrial localization of Adh3 restricts its expression limit, probably because the high-level expression of this mitochondrial protein causes growth defects due to overloading of mitochondrial transport resources (Kintaka et al., 2016).
Metabolic perturbations triggered upon overexpression of glycolytic proteins
The results suggested that the overexpression of most glycolytic proteins does not cause serious metabolic perturbations. To test whether this speculation is theoretically supported, we used a kinetic model of the yeast glycolytic pathway (Smallbone et al., 2013), a schematic diagram of which is shown in Figure 4—figure supplement 1. Figure 4A–D shows the x-fold change of glycolytic metabolites in simulations in which each glycolytic protein is overexpressed up to 128-fold compared with the wild-type simulation. Overproduction of 14 of 20 glycolytic proteins did not cause more than a two-fold metabolic change (gray lines in Figure 4A), indicating that overexpression of most glycolytic proteins does not cause serious metabolic perturbations. By contrast, the overproduction of Hxk1 and Hxk2 affected glycolytic metabolism throughout, and the overproduction of Pdc1 and Cdc19 affected metabolism locally (Figure 4B–C). Because the experimental results using CC mutants suggested that their overexpression did not trigger metabolic perturbations leading to the growth defects, unknown mechanisms to explain the discrepancy might exist. Overproduction of Pfk1 or Pfk2 did not cause a metabolic change, because, in the model, these individual enzymes did not catalyze the Pfk reaction whereas the Pfk1–Pfk2 complex did. Simultaneous overproduction of both Pfk1 and Pfk2 caused severe metabolic changes (Figure 4D), whose pattern was quite similar to the changes caused by Hxk1 and Hxk2 overexpression (Figure 4—source data 1) (except that G6P and F6P levels did not change, as these metabolites are upstream of the Pfk reaction). Although metabolic changes upon overexpression of Pfks and Hxks showed a similar pattern, overexpression of Pfks but not Hxks caused growth defects (Figures 1 and 2), and catalytic mutations of only Pfks increased the expression limits of these proteins (Figure 3).
Hence, the metabolic changes observed in the simulation do not by themselves explain the growth defects triggered by the overexpression of Pfks.
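The two-fold criterion used to read the simulation results can be made explicit with a small sketch. It assumes that a "two-fold change" counts in either direction (a doubling or a halving), which is an interpretation and not stated in the text; the metabolite names and concentrations are hypothetical and are not outputs of the Smallbone et al. model.

```python
import math


def log10_fold_change(overexpressed, wild_type):
    """Log10 fold-change of a metabolite level relative to wild type."""
    return math.log10(overexpressed / wild_type)


def exceeds_two_fold(overexpressed, wild_type):
    """True if the level changed more than two-fold in either direction
    (assumption: increases and decreases are treated symmetrically)."""
    return abs(log10_fold_change(overexpressed, wild_type)) > math.log10(2)


# Hypothetical steady-state levels (arbitrary units) in a wild-type
# simulation and in a simulation with one enzyme overexpressed.
wild_type_levels = {"G6P": 1.0, "F16bP": 2.0, "PYR": 4.0}
overexpressed_levels = {"G6P": 1.1, "F16bP": 7.0, "PYR": 3.5}

perturbed = [
    name
    for name in wild_type_levels
    if exceeds_two_fold(overexpressed_levels[name], wild_type_levels[name])
]
print(perturbed)
```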

Metabolic perturbations triggered by overexpression of glycolytic proteins in silico.
(A–D) Metabolic change triggered by overexpression of the indicated glycolytic protein in a kinetic model of glycolytic metabolism (Smallbone et al., 2013). Log10 fold-change in each metabolite level in a simulation with 128-fold overexpression of each glycolytic protein compared to that in the wild-type is shown. In (D), Pfk1 and Pfk2 are simultaneously overexpressed.
Figure 4—source data 1
This spreadsheet contains all data and statistical values associated with the figure.
https://doi.org/10.7554/eLife.34595.015
To further characterize physiological conditions that are triggered by the overexpression of Pfks, we next analyzed metabolic changes in yeast cells overexpressing wild-type and CC mutant Pfk2 over the vector control by measuring 35 metabolites (Figure 5—source data 1), because the CC mutants showed increased expression limits (Figure 3B). Figure 5A shows changes in the levels of nine glycolytic metabolites. Overexpression of both wild-type and CC mutant Pfk2 triggered significant reductions in some metabolites (p<0.05, Welch’s t-test, Figure 5—source data 1). Moreover, the patterns of metabolic changes were inconsistent with those predicted by the model (Figure 5—figure supplement 1). These metabolic reductions were thus not triggered by the catalytic activity of Pfk2. We noticed, however, that the level of F16bP in the cells overexpressing wild-type Pfk2 was >3-fold higher than that in the CC mutant Pfk2 (Figure 5A, p<0.05, Welch’s t-test, Figure 5—source data 1). F16bP is the product of Pfk catalysis and the simulation predicted an increase in the F16bP level upon overexpression of Pfks (Figure 4D), suggesting that the catalytic activity of Pfk2 triggers this metabolic difference.

Metabolic changes triggered by overexpression of glycolytic proteins in vivo.
(A–C) The bar graph shows the log2-fold change in each metabolite in the cells overexpressing wild-type and mutant Pfk2, Pfk1, and Tdh3 over the vector controls. The red circle shows the log2 fold-difference in each metabolite between the wild-type and the CC mutant measurements. The metabolites were measured in exponentially growing cells cultured in low-copy conditions. The error bars indicate the standard deviations of the mean for three (Pfk2) and two (Pfk1, Tdh3) biological replicates. *p<0.05 in Welch's t-test.
Figure 5—source data 1
This spreadsheet contains all data and statistical values associated with the figure.
https://doi.org/10.7554/eLife.34595.018
We next measured metabolic changes in 29 metabolites in cells overexpressing wild-type Pfk1 and Tdh3 and their CC mutants because these CC mutants also showed increased expression limits (Figure 3A). As shown in Figure 5B and C, levels of glycolytic metabolites in the cells overexpressing wild-type Pfk1 and Tdh3 were not changed more than three-fold over the vector control. We did not observe any reproducible increase in F16bP level in the cells overexpressing wild-type Pfk1 over levels in its CC mutant. Moreover, overall metabolic changes were higher in the cells overexpressing CC mutant than in those expressing wild-type Pfk1 (Figure 5B). We did not observe any reproducible difference in the metabolic changes between the cells overexpressing wild-type Tdh3 and its CC mutant (Figure 5C). We thus concluded that overexpression of Pfk1 and Tdh3 did not trigger significant metabolic changes through their catalytic activities, at least in the detected glycolytic metabolites.
Codon optimality explains the lower expression of non-harmful glycolytic proteins
We next focused on Glk1, Pyk2, and Pdc6, as their expression levels were lower than those of other glycolytic proteins in low-copy conditions, while they did not seem to be harmful (Figure 2D). Moreover, the expression levels of Glk1 and Pyk2 were significantly elevated in high-copy conditions (Figure 6A). These results raised the possibility that expressed protein levels per single gene copy are lower than those for other genes either because protein synthesis rates are low or because protein degradation rates are high. Codon optimality strongly contributes to translational elongation rate and mRNA stability (Presnyak et al., 2015). Therefore, we analyzed the tRNA adaptation index of a gene (tAIg) (Tuller et al., 2010) for the glycolytic genes studied here (Figure 6B and Figure 6—figure supplement 1) and noticed that GLK1, PYK2, and PDC6 had a much lower tAIg than the other glycolytic genes. To test whether the codon optimality of GLK1 affects the protein expression level, we constructed codon-optimized GLK1 (CoGLK1) and measured its protein expression level (Figure 6A). Glk1 expressed from CoGLK1 was present at levels 3.6 and 4.7 times higher than that expressed from native GLK1 in low- and high-copy conditions, respectively. We concluded that Glk1 expression was low due to its low codon optimality.
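For readers unfamiliar with the index, the tRNA adaptation index of a gene is the geometric mean of per-codon adaptiveness weights. The following is a minimal sketch under stated assumptions: the codon weights and the sequence are hypothetical (real weights are derived from tRNA gene copy numbers and wobble-pairing efficiencies), the function name is illustrative, and the sketch scores every codon it is given, leaving the handling of start and stop codons to the caller.

```python
import math


def tai_of_gene(coding_sequence, codon_weights):
    """Geometric mean of the per-codon weights of a coding sequence."""
    usable_length = len(coding_sequence) - len(coding_sequence) % 3
    codons = [coding_sequence[i:i + 3] for i in range(0, usable_length, 3)]
    # Average in log space to avoid underflow for long genes.
    log_sum = sum(math.log(codon_weights[codon]) for codon in codons)
    return math.exp(log_sum / len(codons))


# Hypothetical weights for two synonymous alanine codons.
weights = {"GCT": 1.0, "GCC": 0.25}
print(tai_of_gene("GCTGCT", weights))  # only the well-adapted codon
print(tai_of_gene("GCTGCC", weights))  # one poorly adapted codon lowers the index
```

Because the index is a geometric mean, a gene enriched in codons with low weights receives a lower score than one using well-adapted synonymous codons throughout, which is the sense in which GLK1, PYK2, and PDC6 are described as having low codon optimality.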

Codon usage affects the expression level, but not the synthesis timing, of Glk1.
(A) Expression levels of Glk1, Pyk2, Pdc6, and codon-optimized GLK1 (CoGlk1) in the indicated conditions. (B) Relationship between the tAIg and the expression level of each glycolytic protein in low-copy conditions. Protein level data are the same as those shown in Figure 2B. (C) Growth curves and GFP fluorescence of cells expressing codon-optimized GFPs. oG-GFP (tAIg = 0.40): a GFP gene whose codons were optimized for the GLK1 codon usage. oT-GFP (tAIg = 0.64): a GFP gene whose codons were optimized for the TDH3 codon usage. (D) Lag time between the timings with the maximum GFP fluorescence and the maximum growth rate. Timings of the maximum GFP fluorescence and the maximum growth rate are the time points with maximum second derivatives of GFP fluorescence and growth curves. The error bars indicate the standard deviations of the means. **p<0.01; ***p<0.001 in Welch's t-test.
Figure 6—source data 1
This spreadsheet contains all data and statistical values associated with the figure.
https://doi.org/10.7554/eLife.34595.021
Glk1 expression increases after a diauxic shift, a growth-phase shift triggered by the carbon source alteration from glucose to ethanol (Zampar et al., 2013). We speculated that GLK1 might have a codon usage that is optimized for the tRNA pool after a diauxic shift and its translational rate might be higher after the shift. To investigate this possibility, we monitored the expression levels of GFPs with different codon usages under different growth conditions. We constructed two GFP genes whose codons were differently optimized: (i) oG-GFP, whose codons were selected at random with probabilities obtained from the codon usage table of GLK1, and (ii) oT-GFP, whose codons were substituted by the synonymous codon used most frequently in TDH3. We added the ornithine decarboxylase degron (Jungbluth et al., 2010) to the C-terminus of these GFP genes to allow accurate monitoring of the timings of their syntheses. Figure 6C shows the GFP fluorescence and the growth of cells expressing the GFP genes. The GFP fluorescence of both genes peaked during their exponential growth phases. Next, we measured the lag time between the inflection points of the GFP fluorescence curve and the growth curve (where the diauxic shift is supposed to happen), as shown in Figure 6D. Because the lag times were not significantly different (p=0.44), we concluded that the codon usage of GLK1 was not optimized to maximize its translation after the diauxic shift.
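The two recoding strategies described for oG-GFP and oT-GFP can be illustrated with a short sketch. The codon usage tables below are hypothetical, cover only two amino acids, and do not reproduce the real GLK1 or TDH3 tables; the function names and the test peptide are likewise introduced here for illustration only.

```python
import random

# Hypothetical codon counts per amino acid (not the real GLK1/TDH3 tables).
GLK1_LIKE_USAGE = {"K": {"AAA": 6, "AAG": 4}, "E": {"GAA": 7, "GAG": 3}}
TDH3_LIKE_USAGE = {"K": {"AAA": 1, "AAG": 9}, "E": {"GAA": 9, "GAG": 1}}


def sample_codons(protein, usage, rng):
    """oG-GFP-style recoding: draw each codon at random with
    probabilities proportional to the reference gene's codon usage."""
    chosen = []
    for amino_acid in protein:
        codons = list(usage[amino_acid])
        weights = [usage[amino_acid][codon] for codon in codons]
        chosen.append(rng.choices(codons, weights=weights, k=1)[0])
    return "".join(chosen)


def most_frequent_codons(protein, usage):
    """oT-GFP-style recoding: always use the synonymous codon that the
    reference gene uses most frequently."""
    return "".join(
        max(usage[amino_acid], key=usage[amino_acid].get)
        for amino_acid in protein
    )


rng = random.Random(0)  # fixed seed so the sampled design is reproducible
peptide = "KEKE"        # hypothetical four-residue test peptide
sampled = sample_codons(peptide, GLK1_LIKE_USAGE, rng)
deterministic = most_frequent_codons(peptide, TDH3_LIKE_USAGE)
print(sampled, deterministic)
```

The first strategy reproduces the overall codon bias of the reference gene, including its rarer codons, whereas the second collapses each amino acid onto a single preferred codon.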
Overexpression-triggered protein aggregation through S–S bonds restricts the expression limits of Tpi1
When we measured the expression levels of Eno2 and Pgk1 proteins, we unexpectedly observed high-molecular-weight bands whose sizes (~125 and 100 kDa) were different from the sizes of the monomers or dimers of Eno2 and Pgk1 (45 and 90 kDa, respectively) (Figure 7A). The band formation was independent of the catalytic activities of Eno2 because the bands were also observed in the experiment with the Eno2 CC mutant (Figure 7A). The band in the Eno2 experiment seemed to be S–S-bond-connected protein aggregates because it disappeared after treatment with the reducing agent dithiothreitol (DTT) (Figure 7B). We confirmed that cysteines were responsible for creating these bands because they disappeared when cysteine residues were removed from Pgk1 and Eno2 (Figure 7—figure supplement 1). To identify the protein species in the bands, we analyzed them by liquid chromatography-tandem mass spectrometry (LC-MS/MS). As shown in Figure 7C and D, we mainly detected glycolytic proteins, translational elongation factors, and translation initiation factors, in addition to each overexpressed protein. Most of the detected proteins were also detected in the CC mutant experiment (Figure 7C). This aggregation did not seem to affect the expression limits of Eno2 and Pgk1 because the expression limits of wild-type proteins and cysteine-less mutants (Eno2-C248S and Pgk1-C98S) were indistinguishable (Figure 7—figure supplement 2).

Overexpressed Eno2 and Pgk1 form protein aggregates.
(A) SDS-PAGE-separated total cellular proteins from cells overexpressing the indicated proteins. (B) SDS-PAGE-separated total cellular proteins from cells overexpressing Eno2 after treatment with (+) or without (–) the reducing agent DTT. (C and D) Enriched proteins in the high molecular bands from cells overexpressing the indicated proteins. Proteins enriched in the high molecular bands > 1.5-fold over the vector control are shown. Proteins were analyzed in low-copy conditions. The red point indicates the expected molecular weight of overexpressed proteins. The asterisk indicates the high-molecular-weight band specifically observed upon overexpression of each glycolytic protein. Gel images were contrasted so that high-molecular-weight bands were visible. CC mut.: catalytic center mutant.
-
Figure 7—source data 1
This spreadsheet contains all data and statistical values associated with the figure.
- https://doi.org/10.7554/eLife.34595.025
Next, we focused on Tpi1 because it was detected in both Pgk1 and Eno2 aggregates (Figure 7C–D) and because its expression limit (1.7 AU) was lower than that of the highest-limit proteins such as Pgk1 and Gpm1 (>2.0 AU) (Figure 2B). As shown in Figure 8A, Tpi1 formed many aggregation bands upon its overexpression. The majority of these bands disappeared when cysteine residues were removed from Tpi1 (C41S, C126S), or after DTT treatment. These results suggested that non-specific S–S-bond-connected aggregation occurred upon overexpression of Tpi1. To test whether the aggregation restricts the Tpi1 expression limit, we measured the expression limits of cysteine-less Tpi1. As shown in Figure 8B, the expression levels of cysteine-less Tpi1 significantly increased above those of wild-type Tpi1. Because mutant Tpi1 levels were higher than wild-type Tpi1 levels, even in +DTT conditions, the removal of cysteine residues would not only prevent the formation of aggregates but would also increase the expression limit of Tpi1.

Overexpressed Tpi1 forms protein aggregates.
(A) SDS-PAGE-separated total proteins from cells overexpressing indicated proteins, and their Western blot imaging using anti-Tpi1 antibodies. The red points indicate the expected molecular weight of overexpressed proteins. (B) Effect of cysteine substitutions on the expression level of Tpi1. *p<0.05; **p<0.01; ***p<0.001 in Welch's t-test. C41S: substitution of cysteine 41 to serine; C126S: substitution of cysteine 126 to serine. Proteins were analyzed in low-copy conditions.
-
Figure 8—source data 1
This spreadsheet contains all data and statistical values associated with the figure.
- https://doi.org/10.7554/eLife.34595.027
Discussion
According to the protein-burden concept (Dong et al., 1995; Kafri et al., 2016; Shah et al., 2013; Snoep et al., 1995; Stoebel et al., 2008), the ultimate overexpression of any protein could cause growth defects by overloading basic protein production resources. But only non-harmful proteins can be overexpressed up to the ultimate level, or the protein-burden limit, because the expression limit of harmful proteins should be restricted by their harmful effects. Knowing the protein-burden limit itself is thus essential when seeking to determine whether the overexpression of a protein is harmful to cellular functions. We previously estimated the protein-burden limit of S. cerevisiae cells by measuring the expression level of GFP that causes growth defects. This was 15% of the total cellular protein (Kintaka et al., 2016).
In this study, we first tried to measure the expression limits of yeast glycolytic proteins in order to confirm whether the protein-burden limit measured using GFP applies to endogenous proteins. Most of the glycolytic proteins studied here caused growth defects when they were expressed from a strong promoter on a multicopy plasmid (Figure 1). The expression levels of some glycolytic proteins in these conditions were, indeed, comparable to or even higher than that of GFP (Figure 2). Also, their expression levels were not increased by mutations in their catalytic centers (Figure 3A). These results suggest that the protein-burden effect largely determines the expression limit, and that the limit is around 15% of the total cellular protein. Among the glycolytic proteins studied here, Pgk1, Gpm1, and Eno2 had the highest expression limits. Although Pgk1 (44.7 kDa) and Eno2 (46.9 kDa) are more than 1.5-fold larger than Gpm1 (27.6 kDa), their expression limits were similar to that of Gpm1 (Figure 2B). These results suggest that a protein's size does not affect its expression limit, at least for proteins in this molecular weight range. These data also suggest that the expression limits of proteins are not determined by the molar concentrations of those proteins but by the cost of the protein production.
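The distinction between a limit set by molar concentration and one set by production cost can be made concrete with a small calculation. It uses only the molecular weights given above and the expression limits quoted elsewhere in this Discussion for Pgk1 and Gpm1 (2.26 AU and 2.63 AU). The key assumption, which is ours for illustration, is that the AU scale is proportional to protein mass, so that dividing by molecular weight gives a quantity proportional to the number of molecules.

```python
# Molecular weights (kDa) and expression limits (AU) as quoted in the Discussion.
proteins = {
    "Pgk1": {"mw_kda": 44.7, "limit_au": 2.26},
    "Gpm1": {"mw_kda": 27.6, "limit_au": 2.63},
}


def relative_molar_amount(limit_au, mw_kda):
    # Assumption: AU scales with protein mass, so mass / molecular weight
    # is proportional to the number of molecules (arbitrary units).
    return limit_au / mw_kda


# Ratio of the limits on the (assumed) mass scale.
mass_ratio = proteins["Gpm1"]["limit_au"] / proteins["Pgk1"]["limit_au"]

# Ratio of the limits after conversion to relative molar amounts.
molar_ratio = (
    relative_molar_amount(**proteins["Gpm1"])
    / relative_molar_amount(**proteins["Pgk1"])
)

print(f"mass-scale ratio Gpm1/Pgk1:  {mass_ratio:.2f}")
print(f"molar-scale ratio Gpm1/Pgk1: {molar_ratio:.2f}")
```

Under this assumption the two limits differ by less than 20% on the mass scale but by almost two-fold in molecule number, which is the pattern expected if the amount of protein synthesized, rather than the number of molecules, sets the limit.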
Some other glycolytic proteins, such as Pfk1, Pfk2, Adh3, and Hxts, showed expression limits far below the protein-burden limit of 15% (Figure 2), suggesting that overexpression of these proteins is harmful. Of the 18 glycolytic proteins studied, Pfk1 and Pfk2 were the only ones whose expression limits were significantly increased by mutations in their catalytic centers (Figure 3A–B), suggesting that their metabolic functions restrict their expression limits. We think, however, that the metabolic perturbations triggered by the overexpression only partially affect the expression limits because the expression limits of the mutant proteins were still far below those of other glycolytic proteins (Figure 3A–B). Pfk1 and Pfk2 form a hetero-octameric complex, and their stoichiometric imbalance leads to the formation of filamentous Pfk1 structures in the cytosol (Schwock et al., 2004). This stoichiometry-imbalance-triggered protein aggregate might cause growth defects upon overexpression of Pfk1 (and Pfk2), although we could not confirm this hypothesis because simultaneous overexpression of Pfk1 and Pfk2 did not increase the expression limits of these proteins (our unpublished observation).
The CC mutants of Fba1 and Pgk1 showed lower expression limits than their wild-type proteins (Figure 3A). We currently do not have a substantial and consistent explanation of why these CC mutants have lower expression limits. We can assume some general mechanisms: CC mutant proteins sequester the wild-type enzymes into inactive complexes; CC mutant proteins sequester the substrate molecules for the wild-type enzymes; or mutation in the catalytic center destabilizes the structure of the enzyme. For example, Fba1 is an essential homodimeric enzyme (UniProtKB: P14540). Overexpression of CC mutant Fba1 molecules might sequester active wild-type Fba1 molecules into inactive complexes. The limit of CC mutant Tdh3 was higher than that of the wild-type in low-copy conditions whereas it was lower in high-copy conditions (Figure 3A–B). This unexpected behavior might be related to its moonlighting function. The catalytic activity of Tdh3 did not seem to explain the difference in the expression limits of wild-type and CC mutant Tdh3 (Figure 5C). Besides its metabolic function, Tdh3 directly binds to Sir2 protein to promote transcriptional silencing, and a mutation in the catalytic center (C150G) reduces the silencing (Ringel et al., 2013). It is thus possible that the CC mutant Tdh3 (C150S) causes silencing in a dose-dependent manner by competing with wild-type Tdh3 for binding with Sir2.
We speculated that the localization of Adh3 to the mitochondria and of Hxts to the plasma membrane restricted their expression limits because localized proteins overload more-limited localization resources (Kintaka et al., 2016). This hypothesis was confirmed because the removal of the mitochondrial signal from Adh3 increased its expression limit (Figure 3C,D). We also speculated that the expression limits of membrane proteins such as Hxts should be restricted by their localization, although there is no experimental evidence to support this hypothesis yet.
The fact that the expression limits of most glycolytic proteins were not affected by mutations in their catalytic centers (Figure 3A) suggests that their overexpression does not cause metabolic perturbations. This finding was theoretically confirmed by simulations using a kinetic model of glycolytic metabolism (Figure 4). The reason why their overexpression does not cause metabolic perturbations is probably that they are bidirectional enzymes: the metabolic flux should be determined only by the availability of substrates when the concentrations of these enzymes are more than a certain level. To support this idea, the overexpression of 14 bidirectional enzymes showed minor metabolic changes, whereas the overexpression of 6 unidirectional enzymes (including Hxks, Pfks, Cdc19, and Pdc1) showed strong metabolic changes in the simulation (Figure 4). The expression limits of Hxks in the cells, however, were close to the protein burden limit (Figure 2B) and were not affected by mutations in the catalytic center (Figure 3A). These results suggest an additional mechanism that is not implemented into the model that allows cells to avoid the effects of big metabolic changes upon overexpression of Hxks: a mechanism that prevents these metabolic perturbations from occurring, or a mechanism that prevents these metabolic perturbations from causing growth defects.
Through the metabolic analysis, we realized that we currently do not have any systematic way to identify metabolic changes that are directly triggered by the overexpression of an enzyme, because metabolism is interconnected and the overexpression of a protein could cause non-specific perturbations that ultimately affect metabolism. Moreover, we know very little about how much change in which metabolite triggers a growth defect. Comparison of the metabolic changes in cells overexpressing wild-type and CC-mutant enzymes could be one solution for this. In fact, we observed a three-fold difference between cells expressing wild-type and CC mutant Pfk2 (Figure 5A). Nevertheless, once again, we cannot conclude from our current knowledge that this difference causes the difference in the expression limits of these two forms of Pfk2. By using a mathematical model, we tried to predict the potential metabolic changes that would be triggered by overexpression of an enzyme without considering unknown effects other than the enzyme's metabolic activity. In the simulations, overexpression of Pfks and Hxks triggered divergent and almost catastrophic metabolic changes (~1000-fold increase in some metabolites, Figure 4B,D), suggesting that their overexpression would cause growth defects due to these strong metabolic perturbations. We thus expected to obtain similar metabolic changes upon overexpression of Pfks, whose CC mutants had higher expression limits. We did not, however, observe such great changes (Figure 5A–B and Figure 5—figure supplement 1). To answer these issues precisely, we need a much deeper understanding of the connections between metabolite levels and cellular growth.
The translational rate of some glycolytic proteins, including Glk1, seemed low because of their lower codon optimality (Figure 6). Actually, the codon optimality of Glk1 (tAIg = 0.38) is close to the average for all the yeast genes (tAIg = 0.37), and the codon optimality of other glycolytic proteins studied here is exceptionally high (Figure 6—figure supplement 1). These observations suggest that the codon optimality of most yeast genes is not high enough to allow expression of their proteins up to the protein-burden limit, even if they are expressed from a strong promoter on a multicopy plasmid.
Overexpression of Eno2, Pgk1, and Tpi1 triggered S–S-bond-connected aggregation (Figures 7 and 8), and the aggregates that are formed contain other glycolytic proteins and translational factors (Figure 7C–D). We think that this aggregation is triggered by spontaneous non-specific S–S bond formation among proteins existing in high concentrations. Interestingly, we also detected the same proteins within the gel of the corresponding molecular weight in the vector control, although the amounts estimated by LC-MS/MS were lower and could not be identified as visible protein bands (Figure 7—source data 1). Therefore, we speculated that the S–S-bond-mediated protein aggregation occurs even in normal physiological conditions, but it is accelerated by an increase in the concentration of cytoplasmic proteins upon overexpression of glycolytic proteins. This aggregation might affect the expression limits of cysteine-containing glycolytic proteins, because changing the cysteine residues of Tpi1 into serine residues increases the protein's expression limit (Figure 8B). As the amount of protein corresponding to the Tpi1 monomer was not changed by DTT treatment, the expression level of Tpi1 should not be reduced simply by aggregation but by the harmful effect of spontaneous S–S bond formation. This hypothesis is supported by the fact that the most highly expressed glycolytic protein Gpm1, which has a molecular weight similar to that of Tpi1, does not have a cysteine residue. The deleterious effect of this aggregation, however, seems protein-specific because the expression limits of Pgk1 and Eno2 were among the highest measured (Figure 2A), and removal of their cysteine did not increase their expression limits (Figure 7—figure supplement 2).
As described above, we revealed mechanisms that restrict the expression limits of some glycolytic proteins. We do not think, however, that these mechanisms are the sole factors restricting the expression limits of these proteins. The expression limits of ΔMTS-Adh3 (0.45 AU, Figure 3D) and CoGlk1 (1.07 AU, Figure 6A) are still lower than those of other high-limit proteins such as Pgk1 and Gpm1 (2.26 AU and 2.63 AU, respectively, Figure 2B). It is thus likely that multiple mechanisms restrict the expression limits of these proteins.
Protein misfolding or misinteraction is considered to cause toxicity upon high-level expression of a protein with low translational robustness, low folding stability, or a high propensity for misinteraction (Drummond and Wilke, 2009; Zhang and Yang, 2015). In general, highly expressed proteins such as glycolytic proteins have thus evolved to avoid these characteristics (Zhang and Yang, 2015), and that should be a requirement for a protein to be expressed up to the protein-burden limit. Cdc19, one of the glycolytic proteins studied here, aggregates in a stress-induced and reversible manner through a region of low compositional complexity (Saad et al., 2017). This aggregation capacity of Cdc19 might explain why its expression limit (0.42 AU) is lower than the protein-burden limit (>2.0 AU) (Figure 2B). Our finding in Figure 8 suggested that the high-level expression of a cysteine-containing protein could also cause a misinteraction-triggered toxic effect; hence unimportant cysteines should be avoided in highly expressed proteins. Concentration-dependent liquid phase separation is also considered to cause toxicity upon overexpression of structurally disordered and nucleic-acid-binding proteins (Bolognesi et al., 2016). We do not think that this mechanism caused growth defects upon overexpression of the glycolytic proteins studied here because they are less structurally disordered (Moriya, 2015) and are not nucleic-acid-binding proteins.
We summarize our analysis in Supplementary file 1. In conclusion, we established the ultimate expression level that causes cellular growth defects due to the protein-burden effect as around 15% of the total cellular protein. The next interesting theme is to identify characteristics of proteins that can be overexpressed up to the protein-burden limit because such proteins are considered non-harmful to cellular functions. Those characteristics should conversely imply the properties of proteins that are harmful when they are overexpressed.
Materials and methods
Strains, growth conditions, and yeast transformation
BY4741 (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0) (Brachmann et al., 1998) was used as the host strain for the experiments. Yeast culture and transformation were performed as previously described (Amberg et al., 2005). A synthetic complete (SC) medium without uracil (Ura) or leucine (Leu), as indicated, was used for yeast culture.
Plasmids used in the study
The plasmids used in the study are listed in the Key Resources Table (Supplementary file 2). The plasmids were constructed by the homologous recombination activity of yeast cells (Oldenburg et al., 1997), and their sequences were verified by DNA sequencing.
Measurement of the plasmid copy number
The plasmid copy number was measured by real-time polymerase chain reaction, as previously described (Moriya et al., 2006), using a LightCycler480 system (Roche). The LEU2 (LEU2-2F and LEU2-2R) and LEU3 primer sets (LEU3-3F and LEU3-3R) were used to amplify DNA fragments of the pTOW40836 plasmid and genomic DNAs, respectively. Mean values, standard deviations (SD), and p-values of Welch's t-test were calculated from biological triplicates.
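Welch's t-test on biological triplicates, used throughout this study, can be sketched with the standard library alone. The sketch below returns the t statistic and the Welch–Satterthwaite degrees of freedom; the two triplicates are hypothetical values, not measurements from this study. To obtain a p-value in practice, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same unequal-variance test.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's unequal-variance t statistic and degrees of freedom."""
    # Squared standard error of each sample mean (sample variance / n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical triplicates, for illustration only.
control = [1.0, 2.0, 3.0]
treated = [2.0, 4.0, 6.0]
t_stat, dof = welch_t(control, treated)
```

With only three replicates per group the degrees of freedom are small (below 4 here), so a given difference in means needs a large t statistic to reach the significance thresholds used in the figures.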
Protein analysis
The total protein was extracted from log-phase cells with a NuPAGE LDS sample buffer (ThermoFisher) after 0.2 N NaOH treatment (Kushnirov, 2000). For each analysis, the total protein extracted from two optical density (OD600) units of cells was used. For total protein visualization, the extracted total protein was labeled with Ezlabel FluoroNeo (ATTO), as described in the manufacturer’s protocol, and separated by 4–12% SDS-PAGE. Proteins were detected and measured using the LAS-4000 image analyzer (GE Healthcare) in SYBR Green fluorescence detection mode and Image Quant TL software (GE Healthcare). The expression of each target protein (AU) was calculated, as shown in Figure 2. Average values, SD, and p-values of Welch's t-test were calculated from biological triplicates. For detection of Tpi1, the SDS-PAGE-separated proteins were transferred to a PVDF membrane (ThermoFisher). Tpi1 was detected using an anti-Tpi1 antibody (RRID:AB_11130951), a peroxidase-conjugated secondary antibody (Nichirei Biosciences), and a chemiluminescent reagent (ThermoFisher). The chemiluminescent image was acquired with an LAS-4000 image analyzer in chemiluminescence detection mode.
Measuring growth rate and GFP fluorescence
Cellular growth and GFP fluorescence were measured by monitoring OD595 and Ex485 nm/Em 535 nm, respectively, every 30 min using an Infinite F200 microplate reader (Tecan). The maximum growth rate (MGR) was calculated as described previously (Moriya et al., 2006). Average values, SD, and p-values of Welch's t-test were calculated from biological triplicates. We define a growth defect as a significant reduction in the maximum growth rate of the cells overexpressing a target protein compared with that of cells carrying the control vector (p<0.01, Welch’s t-test).
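The lag time reported in Figure 6D is described in the figure legend as the interval between the time points with the maximum second derivatives of the GFP fluorescence and growth curves. A minimal sketch of that calculation on evenly spaced plate-reader readings is given below. The function names, the sign convention (growth time minus GFP time), and the toy curves are assumptions for illustration; real, noisy curves would likely need smoothing before differentiation, which is not shown.

```python
def second_derivative(y, dt):
    """Central finite-difference second derivative at the interior points of y."""
    return [(y[i - 1] - 2 * y[i] + y[i + 1]) / dt ** 2 for i in range(1, len(y) - 1)]


def time_of_max_second_derivative(times, y):
    """Time point at which the second derivative of y is largest."""
    dt = times[1] - times[0]  # assumes evenly spaced readings
    d2 = second_derivative(y, dt)
    best = max(range(len(d2)), key=d2.__getitem__)
    return times[best + 1]  # +1 because d2 starts at the first interior point


def lag_time(times, gfp, od):
    # Assumed sign convention: positive when the growth curve bends after the GFP curve.
    return time_of_max_second_derivative(times, od) - time_of_max_second_derivative(times, gfp)


# Hypothetical readings every 0.5 hr (the 30 min interval used above).
times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
gfp = [0, 0, 0, 1, 2, 3, 4]  # starts rising after t = 1.0
od = [0, 0, 0, 0, 0, 1, 2]   # starts rising after t = 2.0
lag = lag_time(times, gfp, od)
```

In this toy example the GFP curve bends upward at 1.0 hr and the growth curve at 2.0 hr, so the lag is 1.0 hr.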
In silico analysis of overexpression of glycolytic proteins
We used a kinetic model of the yeast glycolytic pathway developed previously (Smallbone et al., 2013). To predict metabolic changes upon overexpression of glycolytic proteins, we changed the initial concentration of each target protein 128-fold over the original concentration, and calculated the concentration of each metabolite at the steady state. We did not analyze the metabolism for the overproduction of Pyk2, Adh2, Adh3, Adh4, and Adh5, because they were not included or because their turnover ratios were set to 0 in the model. We also did not analyze Hxts overexpression, because its concentration was not changeable in the model.
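The procedure of raising an enzyme concentration 128-fold and comparing steady-state metabolite levels can be illustrated with a deliberately simple toy system. This is not the Smallbone et al. (2013) model: it is a hypothetical single-enzyme system with a constant substrate supply and irreversible Michaelis–Menten consumption, and all parameter values are arbitrary.

```python
def steady_state_substrate(enzyme, v_in=1.0, kcat=1.0, km=1.0,
                           s0=1.0, dt=1e-3, t_end=50.0):
    """Integrate dS/dt = v_in - kcat*E*S/(Km + S) by forward Euler until t_end.

    Toy model with arbitrary parameters; requires kcat*enzyme > v_in so that
    a steady state exists.
    """
    s = s0
    for _ in range(int(t_end / dt)):
        consumption = kcat * enzyme * s / (km + s)
        s += dt * (v_in - consumption)
    return s


baseline = steady_state_substrate(enzyme=2.0)
overexpressed = steady_state_substrate(enzyme=2.0 * 128)  # 128-fold, as in the Methods
fold_change = overexpressed / baseline
```

For this toy system the steady state can also be written analytically as S* = Km·v_in/(kcat·E − v_in), which gives 1 for the baseline and 1/255 after the 128-fold increase. An irreversible step therefore depletes its substrate strongly in this sketch, in line with the large simulated changes reported for unidirectional enzymes.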
Metabolite analysis
Yeast cells were aerobically cultivated at 30°C for 24–48 hr in an SC–Ura medium. The cells were inoculated into 200 mL of the medium at an OD600 of 0.5 and then aerobically cultured at 30°C for 3 hr. 1.0 mL of culture containing cells with an OD600 of 50 was mixed with 1.4 mL of methanol solution pre-cooled at –80°C. The sample was centrifuged at 5,000 g at –20°C for 5 min. After the removal of the supernatant, 1.0 mL of 75% ethanol pre-heated at 95°C was added to the sample, which was then incubated for 3 min at 95°C. 10 µL of 17 µM D-camphor sulfonic acid was added to the sample as an internal standard for liquid chromatography triple-stage quadrupole mass spectrometry (LC-QqQ-MS) analysis. After placing on ice for 5 min, the sample was centrifuged at 5,000 g at 4°C for 5 min to remove cell debris. 950 µL of the supernatant was transferred to a new tube and centrifuged at 15,000 rpm at 4°C for 5 min. 300 µL of the supernatant collected as cell extract was dried under vacuum, and then stored at –80°C until the mass spectrometry analysis. All metabolites were measured using LC-QqQ-MS. LC-QqQ-MS analysis was performed according to the method given by Kato et al. (2012). We calculated the peak area of each metabolite normalized to that of the internal standard. Samples from three independent cultures were analyzed for the cells overexpressing Pfk2, Pfk2 CC mutant, and the vector control. Samples from two independent cultures were analyzed for the cells overexpressing Pfk1, Pfk1 CC mutant, Tdh3, Tdh3 CC mutant, and the vector control.
Identification of aggregated protein species
The total protein extracts from cells overexpressing Eno2, Eno2 CC mutant, and Pgk1 were separated by SDS-PAGE and stained with Coomassie staining solution (ThermoFisher). Proteins of interest were excised from the gels and digested using trypsin. The tryptic peptides were analyzed by LC-MS/MS consisting of an LTQ-Orbitrap mass spectrometer (ThermoFisher) and a DiNa nano LC (KYA Technologies) system according to the method described previously (Kito et al., 2016). The peptide mixture was separated with reverse-phase chromatography. Mobile phase A contained 0.1% formic acid, and mobile phase B contained 0.1% formic acid/80% acetonitrile. Peptides were eluted at a flow rate of 200 nL/minute using a 55 min gradient as follows: from 0% to 32% solvent B over 45 min, from 32% to 40% solvent B over 5 min, and from 40% to 80% solvent B over 5 min. The acquired MS/MS spectra were subjected to a database search against the protein sequences of S. cerevisiae. The aggregating protein species in Figure 7 are those for which the number of peptide hits in the database search was five or more and was 1.5-fold more than that of the vector control.
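The selection criterion in the last sentence can be written as a short filter. The sketch below is illustrative only: the function name and the peptide-hit counts are hypothetical, and it assumes that "1.5-fold more" means strictly greater than 1.5 times the vector-control count, with proteins absent from the control treated as having zero hits.

```python
def enriched_proteins(sample_hits, control_hits, min_hits=5, min_fold=1.5):
    """Return proteins with >= min_hits peptide hits and > min_fold enrichment over control."""
    enriched = []
    for protein, hits in sample_hits.items():
        control = control_hits.get(protein, 0)  # assumption: missing in control means 0 hits
        if hits >= min_hits and hits > min_fold * control:
            enriched.append(protein)
    return sorted(enriched)


# Hypothetical peptide-hit counts, for illustration only.
band_hits = {"Tpi1": 12, "Tef1": 6, "Fba1": 4, "Pgk1": 9}
vector_hits = {"Tpi1": 3, "Tef1": 5, "Pgk1": 2}
selected = enriched_proteins(band_hits, vector_hits)
```

Here Tef1 is excluded because its count is less than 1.5-fold above the control, and Fba1 is excluded because it has fewer than five peptide hits.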
Data availability
All data generated or analysed during this study are included in the manuscript and supporting files.
References
-
Methods in Yeast Genetics: A Cold Spring Harbor Laboratory Course Manual. Cold Spring Harbor Laboratory Press.
-
The evolutionary consequences of erroneous protein synthesis. Nature Reviews Genetics 10:715–724. https://doi.org/10.1038/nrg2662
-
Overexpression of tnaC of Escherichia coli inhibits growth by depleting tRNA2Pro availability. Journal of Bacteriology 188:1892–1898. https://doi.org/10.1128/JB.188.5.1892-1898.2006
-
Widely targeted metabolic profiling analysis of yeast central metabolites. Journal of Bioscience and Bioengineering 113:665–673. https://doi.org/10.1016/j.jbiosc.2011.12.013
-
The systems biology graphical notation. Nature Biotechnology 27:735–741. https://doi.org/10.1038/nbt.1558
-
Robustness analysis of cellular systems using the genetic tug-of-war method. Molecular BioSystems 8:2513–2522. https://doi.org/10.1039/c2mb25100k
-
Quantitative nature of overexpression experiments. Molecular Biology of the Cell 26:3932–3939. https://doi.org/10.1091/mbc.e15-07-0512
-
Recombination-mediated PCR-directed plasmid construction in vivo in yeast. Nucleic Acids Research 25:451–452. https://doi.org/10.1093/nar/25.2.451
-
Determinants of the rate of protein sequence evolution. Nature Reviews Genetics 16:409–420. https://doi.org/10.1038/nrg3950
Decision letter
-
Detlef Weigel, Senior Editor; Max Planck Institute for Developmental Biology, Germany
-
Nir Ben-Tal, Reviewing Editor; Tel Aviv University, Israel
-
Claus O Wilke, Reviewer; The University of Texas at Austin, United States
In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.
Thank you for submitting your article "Estimating the Protein Burden Limit of Yeast Cells by Measuring Expression Limits of Glycolytic Proteins" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Detlef Weigel as the Senior Editor. The following individual involved in review of your submission has agreed to reveal his identity: Claus O Wilke (Reviewer #3).
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
Summary:
Moriya et al. attempt to determine the protein expression limits of glycolytic proteins. They measure the growth rate, protein expression and copy number of glycolytic proteins using both high copy and low copy plasmids. They go on to estimate the contributions of various factors that might determine the upper limit of protein expression such as metabolic activity, codon usage, membrane localization and disulfide bond formation. The authors conclude that while metabolic activity has no role in determining the expression limit, disulfide bond formation, membrane localization and sub-optimal codon usage limit the extent of protein expression. The authors assert the ability of their pipeline to differentiate proteins that can be overexpressed from those that will be harmful upon overexpression.
Essential revisions:
1) One of the main findings of the paper is that protein expression limits are not determined by the metabolic activity. The authors base this primarily on the lack of a difference between expression levels of proteins and their catalytic mutants (proteins with mutations in their catalytic sites). The authors need to verify that metabolism is indeed altered to unequivocally comment on the link between metabolic activity and protein expression levels. Instances that highlight the need to perform metabolic measurements in the manuscript are:
A) The authors measured metabolic activity for only one protein-mutant pair (Pfk2, Figure 4), which did not show any difference in protein expression levels (Figure 3A). The authors need to measure the metabolic levels for pairs of proteins that show a significant difference in their expression levels like Pfk1 or Tdh3 to be able to comment on the link between metabolism and protein levels.
B) In addition, there are several contradictions in the effect of these catalytic mutants on protein expression levels. Figure 3A shows that in two cases the mutant has higher expression (Pfk1 and Tdh3) but in other two cases, the wild-type has higher expression (Fba1, Eno1). How do the authors explain such differences? In addition, Tdh3 mutant has lower expression level compared to wild-type when expressed from a high copy plasmid. How do the authors explain this flip?
C) Discussion, fifth paragraph: The authors claim that one of the reasons why they don't see any association between metabolic activity and expression is that the majority of these enzymes are bidirectional. This is not true for all the enzymes as some glycolytic enzymes are unidirectional. In addition, the authors need to show a control example where unidirectional enzyme has a higher protein expression in order to make any claim between enzymatic directionality and protein expression.
2) The authors show that three mechanisms, namely codon optimization, membrane localization and disulfide bond formation, determine the limit of expression of several proteins. However, the differences observed by the authors are significant but really small. While it is likely that multiple mechanisms would contribute to determining the upper limit of protein expression, the authors need to be cautious about claiming them as the sole factors limiting expression levels (subsection “Mitochondrial localization restricting the expression limit of Adh3” and subsection “Lower expression of nonharmful glycolytic proteins explained by their codon optimality”, first paragraph). In addition, the authors claim that disulfide bonds limit the expression of Eno2 and Pgk1 by triggering aggregation. They show that addition of DTT removes the bond formation in Eno2, and changing cysteine to serine in a third protein (Tpi1) increases its expression levels. Both these pieces of evidence are incomplete independently. Instead of performing the estimations in two different proteins, the authors need to alter cysteine to serine in Eno2 and Pgk1 and then show that the bands disappear in addition to an increase in expression. Alternatively, the authors need to show that Tpi1 also forms bonds which disappear upon treatment with DTT. The fact that expression of Tpi1 is independent of DTT contradicts the role of disulfide bonds in limiting expression.
3) Overall the authors play a bit fast and loose with their statistics. First, p values should be accurately reported. Don't write "p<0.05", write "p=0.032". Second, whenever p values are stated it should also be stated what test was used (see e.g. subsection “Mutations in catalytic centers not affecting expression limits of most glycolytic proteins”). Third, correlations should be reported with p values (e.g. subsection “Metabolic perturbations triggered upon overexpression of glycolytic proteins”, last paragraph).
4) While the authors describe an interesting system to estimate the limits of protein expression within a cell, there are several discrepancies between vector copy number and measured expression levels, which raise the concern that the results may reflect the technical experimental setting rather than true limitations. In addition, the authors use GFP, an exogenous protein with a high expression, as control. Endogenous proteins, preferably unidirectional and bidirectional enzymes, that show high and low expression levels, will make for better controls to the glycolytic enzymes. The following are specific examples of such discrepancies:
A) In Figure 1B and C, why is the maximum growth rate between high and low copy number vector control so different?
B) Subsection “Measurement of expression limits of glycolytic proteins”, second paragraph: The two explanations mentioned by the authors are not mutually exclusive. The authors argue that proteins with low expression and low copy number are harmful for the cell and the ones with low expression and high copy number are repressed due to their high copy number. This is a circular argument and doesn't explain why the copy numbers are high in the first place. Finally, is the Pearson correlation of 0.3 significant? The authors have dismissed it but they need to show that it is statistically not significant.
C) The authors claim that 15 percent of the total cellular protein is the limit for overexpression of protein. However, the authors do not observe any correlation between molecular weight and protein expression levels. How do the authors explain this lack of correlation?
5) The results depend on how exactly one defines "growth defect" and how accurately one measures it. The paper does not discuss this issue. "growth defect" needs to be defined precisely, and the authors also need to argue that they can measure it with sufficient accuracy. One way by which one could get the result that there are no growth defects even at high levels of overexpression is by using a very insensitive assay.
6) A simple summary table presenting the expression limit and proposed mechanism of toxicity (or not) and the evidence for this for all 29 proteins would be very helpful. This could replace some of the information in Table 1 which could be moved to the supplement.
7) In places it is not entirely clear why the authors are only performing mechanistic experiments on a specific subset of the proteins. Again a summary table might help to better communicate what has been tested for which proteins and why.
8) 'Repression' implies an active mechanism to lower protein concentration whereas it is just that these proteins are not using optimised codons that increase translation like the other enzymes. It is better to avoid this word and simply talk about 'lower expression'.
9) The metabolic profiling is rather inconclusive. Is the conclusion simply that changes in the quantified metabolites cannot be causing the growth defect? There also doesn't seem to be much of a connection between the results of the computational simulations and the metabolic profiling, so it's not at all clear how useful the simulations are.
10) There is an obvious other source of potential growth defects that have been discussed widely in the literature but that aren't mentioned at all: Toxic effects due to protein misfolding or misinteractions. For example, Geiler-Samerotte et al. measured the effect of overexpressed, misfolded GFP on yeast growth and found an effect in proportion to the amount of misfolded protein (https://doi.org/10.1073/pnas.1017570108). Also concentration-dependent liquid demixing e.g. Bolognesi et al., 2016. Similar topics have been discussed in the literature for a long time, see e.g. this review: https://www.nature.com/articles/nrg2662. No additional work required, but discussion is needed.
https://doi.org/10.7554/eLife.34595.032
Author response
Essential revisions:
1) One of the main findings of the paper is that protein expression limits are not determined by the metabolic activity. The authors base this primarily on the lack of a difference between expression levels of proteins and their catalytic mutants (proteins with mutations in their catalytic sites). The authors need to verify that metabolism is indeed altered to unequivocally comment on the link between metabolic activity and protein expression levels. Instances that highlight the need to perform metabolic measurements in the manuscript are:
A) The authors measured metabolic activity for only one protein-mutant pair (Pfk2, Figure 4), which did not show any difference in protein expression levels (Figure 3A). The authors need to measure the metabolic levels for pairs of proteins that show a significant difference in their expression levels like Pfk1 or Tdh3 to be able to comment on the link between metabolism and protein levels.
We agree with the reviewer’s comment that we need a positive example, that is, a protein whose expression limit is mainly determined by metabolic perturbation. Overexpression of such a protein should cause strong growth defects, so its expression limit should be low, and its CC mutation should dramatically increase the expression limit up to 15% of total protein (the protein burden limit). We expected to find this type of protein during our analysis of glycolytic proteins because, as far as we know, no such enzyme had been identified. However, we could not find one. We need to survey further whether our finding, “catalytic activity does not determine the expression limit of a metabolic enzyme”, is generally applicable to other metabolic enzymes.
From the comparison of expression limits between wild-type proteins and catalytic center (CC) mutant proteins (Figure 3A and 3B), we currently believe that metabolic perturbations triggered by the overexpression of Pfk1 and Pfk2 (but not Tdh3) partially restrict their expression limits. We agree that the difference in expression limits between the wild-type and the CC mutant is not a direct way to show whether metabolic perturbation is a significant determinant of the expression limit, and that measuring metabolite levels would strengthen our arguments.
We first carefully re-examined our metabolite measurements of the cells overexpressing wild-type and CC mutant Pfk2. The levels of some glycolytic metabolites were significantly lower than those of the vector control (new Figure 5A, p < 0.05, Welch’s t-test). We further found that the level of F16bP in the cells overexpressing wild-type Pfk2 was >3-fold higher than in those overexpressing CC mutant Pfk2 (p < 0.05, Welch’s t-test). Because F16bP is the product of Pfk catalysis and the simulation predicted a dramatic increase in the F16bP level upon overexpression of Pfks, the catalytic activity of Pfk2 might trigger this metabolic difference. However, as discussed later, it is not easy to conclude whether this minor F16bP change causes growth defects.
We then measured metabolic changes in the cells overexpressing Pfk1 and Tdh3 (and their CC mutants) in low-copy conditions, where the CC mutants showed increased expression limits. As shown in the new Figure 5B, C, the levels of glycolytic metabolites in the cells overexpressing wild-type Pfk1 and Tdh3 did not change more than threefold relative to the vector control. We did not observe a reproducibly higher F16bP level in the cells overexpressing wild-type Pfk1 than in those overexpressing CC mutant Pfk1. Moreover, the metabolic changes were larger in the cells overexpressing CC mutant Pfk1 than in those overexpressing wild-type Pfk1 (Figure 5B). We did not observe any reproducible difference in the metabolic changes between the cells overexpressing wild-type Tdh3 and those overexpressing CC mutant Tdh3 (Figure 5C). We thus concluded that overexpression of Pfk1 and Tdh3 did not trigger significant metabolic changes through their catalytic activities, at least in the detected glycolytic metabolites.
We added these results in new Figure 5 and Figure 5—figure supplement 1 and thoroughly rewrote relevant descriptions in Results as follows:
“To further characterize physiological conditions triggered by the overexpression of Pfks, we next analyzed metabolic changes in yeast cells overexpressing wild-type and CC mutant Pfk2 over the vector control by measuring 35 metabolites (Figure 5–source data 1), because the CC mutants showed increased expression limits (Figure 3B). […] We thus concluded that overexpression of Pfk1 and Tdh3 did not trigger significant metabolic changes through their catalytic activities at least in the detected glycolytic metabolites.”
B) In addition, there are several contradictions in the effect of these catalytic mutants on protein expression levels. Figure 3A shows that in two cases the mutant has higher expression (Pfk1 and Tdh3) but in other two cases, the wild-type has higher expression (Fba1, Eno1). How do the authors explain such differences? In addition, Tdh3 mutant has lower expression level compared to wild-type when expressed from a high copy plasmid. How do the authors explain this flip?
Higher expression limits in CC mutants could be explained by their causing less perturbation of metabolism, which is the reason why we constructed the mutants as described.
During the revision process, we found a mistake in the construction of the CC mutant of PGK1 (we had introduced a synonymous mutation (115A>C) which did not change the target amino acid). We thus reconstructed a correct CC mutant of PGK1 (R39A, 115A>G, 116G>C). We are very sorry for this mistake (we checked and confirmed that all other mutants were constructed as intended). We measured the expression limit of the corrected mutant and found that it also showed a significantly lower expression limit than the wild-type (Figure 3A). We also repeated the statistical analysis more strictly (Welch’s t-test instead of Student’s t-test) and found that the difference between the wild-type and the CC mutant of Eno1 was not significant. The CC mutants of Fba1 and Pgk1 thus showed significantly lower expression limits than the wild-types.
We currently do not have any substantial and consistent explanation for why these CC mutants have lower expression limits. We can suggest some general mechanisms: CC mutant proteins sequester the wild-type enzymes into inactive complexes; CC mutant proteins sequester substrate molecules from the wild-type enzymes; or mutations in the catalytic center destabilize the structure of the enzyme. For example, Fba1 is an essential homodimeric enzyme (UniProtKB: P14540). Overexpressed Fba1 CC mutant molecules might sequester active wild-type Fba1 molecules into inactive complexes.
The strange behavior of the Tdh3 CC mutant might be related to its moonlighting function. As described above, the difference in expression limits between wild-type and CC mutant Tdh3 did not seem to be explained by its catalytic activity. Besides its metabolic function, Tdh3 directly binds to the Sir2 protein to promote transcriptional silencing (Ringel AE., et al., 2013). A mutation in the catalytic center (C150G) reduces the silencing. It is thus possible that our CC mutant of Tdh3 (C150S) affects the silencing in a dose-dependent manner by competing with the wild-type Tdh3 for binding to Sir2.
We added this discussion to the Discussion section as follows:
“The CC mutants of Fba1 and Pgk1 showed lower expression limits than their wild-types (Figure 3A). […] It is thus possible that the CC mutant Tdh3 (C150S) affect the silencing in a dose-dependent manner by competing with the wild-type Tdh3 for binding with Sir2.”
C) Discussion, fifth paragraph: The authors claim that one of the reasons why they don't see any association between metabolic activity and expression is that the majority of these enzymes are bidirectional. This is not true for all the enzymes as some glycolytic enzymes are unidirectional. In addition, the authors need to show a control example where unidirectional enzyme has a higher protein expression in order to make any claim between enzymatic directionality and protein expression.
Our claim is the opposite. We are sorry for our confusing descriptions. We tried to claim that overexpression of a bidirectional enzyme does not strongly affect the metabolite levels because the substrate/product levels by themselves should determine the flux. Conversely, overexpression of a unidirectional enzyme should strongly affect the metabolite levels because the enzymatic activity should determine the flux. We thus think that unidirectional enzymes (Hxks, Pfks, and Cdc19) should instead have lower expression limits. The idea is based on the results of the simulation shown in Figure 4. In the simulation, overexpression of 14 bidirectional enzymes showed minor metabolic changes (Figure 4A), while all six unidirectional enzymes showed strong metabolic changes (Figures 4B–D). At least at the theoretical level, the above hypothesis that overexpression of a bidirectional enzyme does not strongly affect the metabolite levels seemed right. This hypothesis, however, was not supported by the experiment, because both the wild-type and the CC mutants of Hxks had high limits and the CC mutation of Cdc19 did not increase its expression limit (Figure 3A). This issue needs further analysis, but that is beyond the scope of this study.
We thus added descriptions about this issue in Discussion:
“To support this idea, overexpression of 14 bidirectional enzymes showed minor metabolic changes, while overexpression of 6 unidirectional enzymes (Hxks, Pfks, Cdc19, and Pdc1) showed strong metabolic changes in the simulation (Figure 4). […] These results suggest an additional mechanism that is not implemented into the model to avoid big metabolic changes upon overexpression of Hxks; a mechanism that prevents these metabolic perturbations from occurring, or a mechanism that prevents these metabolic perturbations from causing growth defects.”
2) The authors show that three mechanisms namely codon optimization, membrane localization and disulfide bond formation, determine the limit of expressions of several proteins. However, the differences observed by the authors are significant but really small. While it is likely that multiple mechanisms would contribute to determining the upper limit of protein expression, the authors need to be cautious about claiming them as the sole factors limiting expression levels (subsection “Mitochondrial localization restricting the expression limit of Adh3” and subsection “Lower expression of nonharmful glycolytic proteins explained by their codon optimality”, first paragraph).
We agree that multiple mechanisms would contribute to determining expression limit of a protein, and in this study, we tried to reveal them one by one using glycolytic proteins as model proteins. We are sorry that our descriptions gave an impression that we are claiming our findings as the sole factors limiting expression levels.
We thus added some arguments about this in Discussion section as follows:
“As described above, we revealed mechanisms restricting the expression limits of some glycolytic proteins. We, however, do not think that these mechanisms are the sole factors restricting expression limits of these proteins. The expression limits of ΔMTS-Adh3 (0.45 AU, Figure 3D) and CoGlk1 (1.07 AU, Figure 6A) are still lower than the other high limit proteins such as Pgk1 and Gpm1 (2.26 AU and 2.63 AU, Figure 2B). It is thus likely that multiple mechanisms would restrict the expression limits of these proteins.”
In addition, the authors claim that disulfide bonds limit the expression of Eno2 and Pgk1 by triggering aggregation. They show that addition of DTT removes the bond formation in Eno2, and changing cysteine to serine in a third protein (Tpi1) reduces its expression levels. Both these pieces of evidence are incomplete independently. Instead of performing the estimations in two different proteins, the authors need to alter cysteine to serine in Eno2 and Pgk1 and then show that the bands disappear in addition to increase in expression. Alternately, the authors need to show that Tpi1 also forms bonds which disappear upon treatment with DTT. The fact that expression of Tpi1 is independent of DTT contradicts that role of disulfide bonds in limiting expression.
As described above, we found a mistake in the construction of the CC mutant of Pgk1. We thus withdrew the identification of proteins within the aggregate observed when Pgk1 CC mutant was overexpressed from Figure 6D (now Figure 7D) as it was not the mutant protein. We are sorry for our mistake, but our finding is not primarily affected by this.
We constructed mutants of Eno2 (C248S) and Pgk1 (C98S) in which the cysteine was changed to serine, and analyzed their effects. We confirmed that the aggregation bands disappeared, as shown in Figure 7—figure supplement 1. We did not, however, observe any increase in the expression limits of Eno2 and Pgk1 even when their cysteines were removed. We thus concluded that S-S bond-connected protein aggregation does not restrict the expression limits of Eno2 and Pgk1. We are not surprised by this result, because only a small part of the overexpressed Eno2 and Pgk1 constitutes the aggregation bands (the bands are only observed with a long exposure), and the expression limits of wild-type Eno2 and Pgk1 are the highest among the glycolytic proteins (Figure 2B); hence their expression limits seem not to be restricted by mechanisms other than the protein burden effect. We thus concluded that the deleterious effect of this aggregation might be protein-specific.
Detection of the S-S bond-connected protein aggregation was just a hint as to why the expression limit of Tpi1 is far lower than that of Gpm1, which does not contain any cysteine (Figure 2B). Tpi1 was detected in both the Eno2 and the Pgk1 overexpression-triggered aggregates although Tpi1 itself was not overexpressed in that situation, suggesting that Tpi1 is a naturally aggregative protein. As the reviewer suggested, we tried to detect an aggregation band upon overexpression of Tpi1, and to test whether the band disappears when the cysteines are substituted.
We did not identify any aggregation band upon overexpression of Tpi1 by total protein staining (new Figure 8A). We want to emphasize that the critical finding from our identification of the proteins in the aggregates is that the aggregation bands contain many different proteins in addition to the overexpressed proteins themselves. Hence, the aggregation bands visible in the gel are formed by chance, and other, invisible aggregation bands of different sizes could exist in the gel. If Tpi1 forms aggregates with many different proteins, it is no wonder that no visible aggregation band is observed. We thus performed Western blotting using Tpi1-specific antibodies to detect aggregation bands that are invisible upon total protein staining, and confirmed the existence of many aggregation bands that disappeared upon removal of the cysteines from Tpi1 or upon DTT treatment (Figure 8A).
We added these results in Figure 7—figure supplement 1 and 2, and Figure 8 and added relevant descriptions in Results and Discussion as follows:
In Results:
“We confirmed that cysteines were responsible for creating these bands because they were disappeared when cysteine residues were removed from Pgk1 and Eno2 (Figure 7—figure supplementary 1).”
“This aggregation seemed not to affect their expression limits because expression limits of wild-types and cysteine-less mutants (Eno2-C248S and Pgk1-C98S) were indistinguishable (Figure 7—figure supplementary 2). […] To test whether the aggregation restricts the Tpi1 expression limit, we measured expression limits of cysteine-less Tpi1. As shown in Figure 8B, the expression levels of cysteine-less Tpi1 significantly increased.”
In Discussion:
“Overexpression of Eno2, Pgk1, and Tpi1 triggered S–S-bond-connected aggregation (Figure 7 and 8).”
“The deleterious effect of this aggregation, however, seems protein-specific because expression limits of Pgk1 and Eno1 were among highest (Figure 2A), and removal of their cysteine did not increase expression limits of them (Figure 7—figure supplement 1).”
3) Overall the authors play a bit fast and loose with their statistics. First, p values should be accurately reported. Don't write "p<0.05", write "p=0.032". Second, whenever p values are stated it should also be stated what test was used (see e.g. subsection “Mutations in catalytic centers not affecting expression limits of most glycolytic proteins”). Third, correlations should be reported with p values (e.g. subsection “Metabolic perturbations triggered upon overexpression of glycolytic proteins”, last paragraph).
We added statistics as the reviewer suggested. We initially used Student’s t-test to calculate p-values for most of our measurements, but we recalculated the p-values using Welch’s t-test for a stricter evaluation of significance. All p-values are listed in the source data of each figure.
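As a rough illustration of the difference between the two tests mentioned above, the following Python sketch computes the Welch t statistic and the Welch–Satterthwaite degrees of freedom for two groups of replicates. The function name `welch_t` and the replicate values are hypothetical and are not taken from the study; the sketch only shows how unequal variances lower the effective degrees of freedom, which is what makes Welch's test more conservative than Student's test in that situation.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test.

    a, b: sequences of replicate measurements (at least two values each).
    """
    na, nb = len(a), len(b)
    # Sample variances (n - 1 denominator) scaled by group size.
    sa = variance(a) / na
    sb = variance(b) / nb
    se2 = sa + sb
    t = (mean(a) - mean(b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = se2 ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1))
    return t, df


# Hypothetical triplicates (illustrative numbers only, not measured data).
equal_var = welch_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
unequal_var = welch_t([1.0, 2.0, 3.0], [0.0, 5.0, 10.0])

print("equal variances:   t = %.3f, df = %.2f" % equal_var)
print("unequal variances: t = %.3f, df = %.2f" % unequal_var)
```

With equal variances and equal group sizes the degrees of freedom equal the Student value (n1 + n2 - 2); with unequal variances they drop below it, so the same t statistic yields a larger p-value.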
4) While the authors describe an interesting system to estimate the limits of protein expression within a cell, there are several discrepancies between vector copy number and measured expression levels, which raises the concern the results can be reflective of the technical experimental setting instead of true limitations.
We respond to this comment under the specific examples below.
In addition, the authors use GFP, an exogenous protein with a high expression, as control. Endogenous proteins, preferably unidirectional and bidirectional enzymes, that show high and low expression levels, will make for better controls to the glycolytic enzymes.
As we mentioned in the Introduction, we previously found that GFP can be overexpressed up to the protein burden level, which is 15% of the total cellular protein (Kintaka et al., 2016). As far as we know, there has been no report of any endogenous or exogenous protein being expressed up to this level in S. cerevisiae. In other words, we did not know of any endogenous protein whose expression limit is this high, so it was impossible to use an endogenous protein as a control in this study. This study can instead be considered the first identification of endogenous unidirectional and bidirectional enzymes showing high and low expression levels that can be used in future research.
The following are specific examples of such discrepancies:
A) In Figure 1B and C, why is the maximum growth rate between high and low copy number vector control so different?
We are sorry that the experimental system used here is a bit unusual and difficult to understand. A detailed explanation of the method was omitted because it has been published previously (Moriya et al., 2006 and 2012). In this experiment, we used pTOW40836, which carries URA3 and leu2-89 (LEU2 with a truncated promoter) as selection markers. The level of the Ura3 protein expressed from the plasmid is sufficient to fully restore the growth of the host strain (ura3 and leu2 deletion) in –uracil (+leucine) conditions even if the plasmid copy number is low, because URA3 has a full-strength promoter. In contrast, the level of the Leu2 protein is insufficient to restore growth in –leucine conditions if the plasmid copy number is low, because leu2-89 has a large deletion in its promoter. This creates a bias that increases the plasmid copy number up to 150 copies per cell in –leucine conditions (more precisely, a bias that selects cells with higher plasmid copy numbers). However, even in this condition, the growth rate is not fully restored compared with +leucine conditions, probably because the level of the Leu2 protein is still not sufficient.
We added a description below in Results:
“In this experimental system, maximum growth rates of the cells with the vector in +leucine conditions is much higher than those in –leucine conditions (see Figures 1B–C), probably because the copy number of leu2-89 is not sufficient to fully support the leucine requirement in –leucine conditions.”
B) Subsection “Measurement of expression limits of glycolytic proteins”, second paragraph: The two explanations mentioned by the authors are not mutually exclusive. The authors argue that proteins with low expression and low copy number are harmful for the cell and the ones with low expression and high copy number are repressed due to their high copy number. This is a circular argument and doesn't explain why the copy numbers are high in the first place.
We agree that those two explanations are not mutually exclusive. We thus removed ‘independent’ from the sentence. For the latter part, we are sorry for our confusing descriptions; we used ‘because’ to indicate the evidence, not the mechanistic explanation. To avoid confusion, we added exact citations to the data points on the graph in the sentences as follows:
“There could be two reasons that the expression level of a protein is low: (i) its strong overexpression is harmful to cellular growth and (ii) its expression is repressed. […] In contrast, the expression of Glk1, Pyk2, and Pdc6 seemed to be repressed because their copy numbers were higher than the others’ (blue circles in Figure 2D).”
Finally, is the Pearson correlation of 0.3 significant? The authors have dismissed it but they need to show that it is statistically not significant.
We added the p-value (p = 0.11), which shows that the correlation is not significant.
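For readers who want to see how a p-value is attached to a Pearson correlation, the sketch below converts r into a t statistic with n - 2 degrees of freedom and integrates the Student t density numerically to obtain a two-sided p-value. The number of data points is not stated in this response, so the value n = 29 used below is purely an illustrative assumption, and the helper names (`t_pdf`, `two_sided_p`, `pearson_p`) are hypothetical.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the t density (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


def pearson_p(r, n):
    """Return (t, p) for testing a Pearson correlation r from n points against zero."""
    df = n - 2
    t = r * math.sqrt(df / (1 - r * r))
    return t, two_sided_p(t, df)


# r = 0.3 is the correlation discussed above; n = 29 is an assumed sample size.
t_stat, p_value = pearson_p(0.3, 29)
print("t = %.3f, two-sided p = %.3f" % (t_stat, p_value))
```

The result depends strongly on n, so the p-value reported with the figure source data should be taken as authoritative.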
C) The authors claim that 15 percent of the total cellular protein is the limit for overexpression of protein. However, the authors do not observe any correlation between molecular weight and protein expression levels. How do the authors explain this lack of correlation?
In this study, we calculate the unit for the expression level of a protein (AU) from the total intensity of the fluorescently stained protein band, which reflects the total number of amino acids within the band (not the number of protein molecules). Therefore, if two proteins give the same expression level in AU, the number of molecules in the band of the larger protein is lower, as the reviewer points out. For example, the expression levels of Gpm1 and Pgk1 were 2.63 AU and 2.27 AU. When these values are normalized by protein length (248 a.a. and 415 a.a.), the expression levels are 0.010 AU/a.a. and 0.005 AU/a.a., so the larger protein is expressed at a lower number of molecules.
We think the reason why we did not see any correlation between the expression levels (AU) and protein lengths is that the expression limit is determined by the cost of protein production, not by the number of protein molecules. Fifteen percent of the total cellular protein thus means ‘15% of the total amino acids incorporated into cellular protein’, not ‘15% of the total protein molecules’. We also want to emphasize that a protein can be expressed up to 15% of total protein only if its overexpression causes no harmful effect other than the protein burden. The expression limit of a harmful protein should therefore be determined independently of its size.
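The length normalization in this example can be reproduced with a few lines of Python. The AU values and protein lengths are the ones quoted above; the variable names are only illustrative.

```python
# Expression limits (AU) and protein lengths (amino acids) quoted in the response.
proteins = {
    "Gpm1": {"au": 2.63, "length_aa": 248},
    "Pgk1": {"au": 2.27, "length_aa": 415},
}

# AU reflects total amino acids in the band, so AU / length approximates
# the relative number of protein molecules.
per_residue = {
    name: values["au"] / values["length_aa"] for name, values in proteins.items()
}

for name, value in per_residue.items():
    print("%s: %.4f AU/a.a." % (name, value))

# Relative molecule number of the smaller protein versus the larger one.
ratio = per_residue["Gpm1"] / per_residue["Pgk1"]
print("Gpm1 / Pgk1 relative molecule number: %.2f" % ratio)
```

Although the two AU values differ by less than 20%, the per-residue values differ by roughly twofold, which is why similar AU limits imply different molar amounts.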
We added these explanations in Results and Discussion.
In Results:
“The AU is considered to reflect the total amino acid number within the band, and the relative number of the protein molecule can be estimated by dividing AU by the protein length. When two proteins with different sizes give the same AUs, the molecule number of the larger protein in the band should be lower.”
In Discussion:
“Among the glycolytic proteins studied here, Pgk1, Gpm1, and Eno2 gave highest expression limits. […] It also suggests that their expression limits are not determined by the molar concentration of the proteins but by the cost of the protein production.”
5) The results depend on how exactly one defines "growth defect" and how accurately one measures it. The paper does not discuss this issue. "growth defect" needs to be defined precisely, and the authors also need to argue that they can measure it with sufficient accuracy. One way by which one could get the result that there are no growth defects even at high levels of overexpression is by using a very insensitive assay.
We define a growth defect based on the significance of the difference in maximum growth rate (MGR) between the cells overexpressing the target protein and the cells with the control vector. We also use the copy numbers of the plasmids expressing target proteins as indicators of growth defects upon overexpression of those proteins. It is not easy to argue how accurate our measurements are, because we did not compare them with more sensitive measurements such as a competitive growth assay. However, these measurements are sufficiently reproducible to establish growth defects with statistical significance. We observed significant reductions in the MGRs of the cells overexpressing most of the glycolytic proteins (p < 0.01, Welch’s t-test, Figure 1B). Plasmid copy numbers seemed to be more sensitive indicators of growth defects, because the copy numbers of all plasmids overexpressing glycolytic proteins except Pyk2 were less than half that of the vector control, with greater statistical significance (p < 0.001, Welch’s t-test, Figure 1D). We thus think our experimental system is sufficiently sensitive to discuss growth defects triggered by the overexpression of glycolytic proteins.
We added the explanation for our definition of the growth defect in Materials and methods as follows:
“The maximum growth rate (MGR) was calculated as described previously (Moriya et al., 2006). Average values, SD, and p-values of Welch's t-test were calculated from biological triplicates. We define growth defect based on the significance in the reduction of maximum growth rate of the cells overexpressing a target protein compared with that with the control vector (p < 0.01, Welch’s t-test).”
6) A simple summary table presenting the expression limit and proposed mechanism of toxicity (or not) and the evidence for this for all 29 proteins would be very helpful. This could replace some of the information in Table 1 which could be moved to the supplement.
We are thankful for the reviewer’s suggestion. We made a summary table (Supplementary file 1).
7) In places it is not entirely clear why the authors are only performing mechanistic experiments on a specific subset of the proteins. Again a summary table might help to better communicate what has been tested for which proteins and why.
We made a summary table (Supplementary file 1).
8) 'Repression' implies an active mechanism to lower protein concentration whereas it is just that these proteins are not using optimised codons that increase translation like the other enzymes. It is better to avoid this word and simply talk about 'lower expression'.
In the earlier part of Results, we used ‘repress’ to imply an active mechanism that lowers protein concentration because we did not know the mechanism that lowers the expression of Glk1, Pyk2, and Pdc6. The mechanism later turned out to be low codon optimality. Although we think that lowering codon optimality (deoptimizing) could be an evolutionarily active mechanism to lower protein concentration, we agree that this word is confusing. We thus changed the description in Discussion as follows:
“The translational rate of some glycolytic proteins, including Glk1, seemed low due to their lower codon optimality (Figure 6).”
9) The metabolic profiling is rather inconclusive. Is the conclusion simply that changes in the quantified metabolites cannot be causing the growth defect? There also doesn't seem to be much of a connection between the results of the computational simulations and the metabolic profiling, so it's not at all clear how useful the simulations are.
We agree that the metabolic profiling is inconclusive. A major issue we recognized through the metabolic analysis is that we hardly know how large a change in which metabolite triggers a growth defect. For example, does a threefold change in the F16bP level cause a growth defect? As far as we know, this issue has never been assessed. Moreover, we recognized that we currently do not have any systematic way to identify metabolic changes directly triggered by the overexpression of an enzyme, because metabolism is interconnected and overexpression of a protein could cause non-specific perturbations that ultimately affect metabolism. To resolve this issue, we have to wait until a large-scale survey connecting metabolic changes to growth defects is performed, preferably with systematic perturbations such as deletion and overexpression. As described above, we observed a threefold difference between wild-type and CC mutant Pfk2 (Figure 5A). However, with our current knowledge, we cannot conclude that this causes the difference in their expression limits. For the same reason, we do not want to simply conclude that changes in the quantified metabolites cannot be causing the growth defects.
These reasons are why we used a mathematical model as a reference. Using a mathematical model, we can predict potential metabolic changes triggered by the overexpression of an enzyme without considering unknown effects other than its metabolic activity. As in the case of Pfks and Hxks overexpression in the simulation (Figures 4B, D), we can predict the pattern of metabolic changes. Again, we cannot define how large a change in these metabolites would trigger growth defects in the simulation (or in vivo). However, if the metabolic changes are divergent and almost catastrophic, as observed in the Pfks and Hxks simulations (~1,000-fold increase in some metabolites, Figures 4B, D), we could claim that the metabolic changes triggered by the overexpression of these enzymes should cause growth defects. Hence, we expected to observe similar metabolic changes upon overexpression of Pfks, whose catalytic activities seemed to trigger growth defects. However, we did not observe such large changes, and the pattern was not consistent (Figure 5, Figure 5—figure supplement 1). We also agree that the usefulness of the simulations is unclear given these results, but we could at least show one approach to tackling this unresolved biological issue.
We agree that this is a critical argument that needs to be included in the Discussion, and we described it as follows:
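To illustrate why a simulated metabolite level can diverge when an enzyme is overexpressed, the following is a minimal toy sketch and not the model used in the study. It assumes a hypothetical two-step pathway S -> X -> P in which the upstream flux scales linearly with the amount of the overexpressed enzyme and the downstream enzyme follows Michaelis-Menten kinetics. All function names and parameter values are illustrative assumptions.

```python
import math


def steady_state_intermediate(v_in, vmax_out, km_out):
    """Steady-state level of intermediate X in a toy pathway S -> X -> P.

    X is produced at a constant rate v_in and consumed by a
    Michaelis-Menten enzyme: v_out = vmax_out * X / (km_out + X).
    Setting v_in == v_out gives X = km_out * v_in / (vmax_out - v_in).
    If v_in >= vmax_out, no steady state exists and X accumulates
    without bound, which is reported here as math.inf.
    """
    if v_in >= vmax_out:
        return math.inf
    return km_out * v_in / (vmax_out - v_in)


def fold_change_on_overexpression(fold_enzyme, v_in_base=1.0,
                                  vmax_out=10.0, km_out=1.0):
    """Fold change in X when the upstream enzyme is raised fold_enzyme-fold.

    Assumption (hypothetical): the upstream flux is proportional to the
    amount of upstream enzyme, so overexpression simply multiplies v_in.
    """
    base = steady_state_intermediate(v_in_base, vmax_out, km_out)
    over = steady_state_intermediate(v_in_base * fold_enzyme, vmax_out, km_out)
    return over / base


if __name__ == "__main__":
    for fold in (2, 5, 9, 9.9, 10):
        print(fold, fold_change_on_overexpression(fold))
```

In this toy setting, the intermediate rises slowly at first and then sharply as the upstream flux approaches the capacity of the downstream enzyme, beyond which no steady state exists. This is only meant to show how a purely kinetic model can produce very large, divergent changes; it says nothing about which fold change would impair growth in vivo.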
“Through the metabolic analysis, we realized that we currently do not have any systematic way to identify metabolic changes directly triggered by the overexpression of an enzyme, because metabolism is interconnected and overexpression of a protein could cause non-specific perturbations that ultimately affect metabolism. […] To precisely answer these issues, we need a much deeper understanding of the connection between metabolite levels and the cellular growth.”
10) There is an obvious other source of potential growth defects that have been discussed widely in the literature but that aren't mentioned at all: Toxic effects due to protein misfolding or misinteractions. For example, Geiler-Samerotte et al. measured the effect of overexpressed, misfolded GFP on yeast growth and found an effect in proportion to the amount of misfolded protein (https://doi.org/10.1073/pnas.1017570108). Also concentration-dependent liquid demixing e.g. Bolognesi et al., 2016. Similar topics have been discussed in the literature for a long time, see e.g. this review: https://www.nature.com/articles/nrg2662. No additional work required, but discussion is needed.
We thank the reviewer for bringing this critical argument to our attention. We had intended to cover these mechanisms under promiscuous interaction in the Introduction, but we recognized that this was not sufficient and that they needed to be discussed further.
We thus added the following description to the Discussion:
“Protein misfolding or misinteraction is considered to cause toxicity upon high-level expression of a protein with low translational robustness, low folding stability, or a high propensity for misinteraction (Drummond and Wilke, 2009; Zhang and Yang, 2015). […] We do not think this mechanism caused growth defects upon overexpression of glycolytic proteins studied here because they are less structurally-disordered (Moriya, 2015) and not nucleic-acid-binding proteins.”
https://doi.org/10.7554/eLife.34595.033
Article and author information
Author details
Funding
New Energy and Industrial Technology Development Organization (Development of Production Techniques for Highly Functional Biomaterials Using Smart Cell, P16009)
- Hisao Moriya
Japan Society for the Promotion of Science (KAKENHI 17H03618)
- Hisao Moriya
Japan Society for the Promotion of Science (KAKENHI 15KK0258)
- Hisao Moriya
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
We thank the members of the Moriya laboratories for advice and helpful discussions, Mr. Katsuhiro Yamamoto and Ms. Yoshimi Hori for experimental support, and Dr. Kei Takahashi for providing experimental materials.
Senior Editor
- Detlef Weigel, Max Planck Institute for Developmental Biology, Germany
Reviewing Editor
- Nir Ben-Tal, Tel Aviv University, Israel
Reviewer
- Claus O Wilke, The University of Texas at Austin, United States
Publication history
- Received: December 22, 2017
- Accepted: July 1, 2018
- Version of Record published: August 10, 2018 (version 1)
Copyright
© 2018, Eguchi et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- Page views: 5,892
- Downloads: 690
- Citations: 33
Article citation count generated by polling the highest count across the following sources: Crossref, Scopus, PubMed Central.