Resource allocation accounts for the large variability of rate-yield phenotypes across bacterial strains
Abstract
Different strains of a microorganism growing in the same environment display a wide variety of growth rates and growth yields. We developed a coarse-grained model to test the hypothesis that different resource allocation strategies, corresponding to different compositions of the proteome, can account for the observed rate-yield variability. The model predictions were verified by means of a database of hundreds of published rate-yield and uptake-secretion phenotypes of Escherichia coli strains grown in standard laboratory conditions. We found a very good quantitative agreement between the range of predicted and observed growth rates, growth yields, and glucose uptake and acetate secretion rates. These results support the hypothesis that resource allocation is a major explanatory factor of the observed variability of growth rates and growth yields across different bacterial strains. An interesting prediction of our model, supported by the experimental data, is that high growth rates are not necessarily accompanied by low growth yields. The resource allocation strategies enabling high-rate, high-yield growth of E. coli lead to a higher saturation of enzymes and ribosomes, and thus to a more efficient utilization of proteomic resources. Our model thus contributes to a fundamental understanding of the quantitative relationship between rate and yield in E. coli and other microorganisms. It may also be useful for the rapid screening of strains in metabolic engineering and synthetic biology.
Editor's evaluation
This study develops a rigorous resource allocation model for E. coli growing under steady-state conditions. Validated by comparison with a compiled data set, the model highlights the complex nature of the relationship between metabolites, growth rate, and yield, which is significantly more complex than the one-to-one relationship that has generally been assumed. The work will be of interest not only to investigators interested in basic questions of bacterial physiology but also to those working on applied problems in biotechnology.
https://doi.org/10.7554/eLife.79815.sa0
Introduction
Microbial growth consists of the conversion of nutrients from the environment into biomass. This flux of material is coupled with a flux of energy from the substrate to small energy cofactors (ATP, NADH, NADPH, etc.) driving biomass synthesis forward and releasing energy in the process (Schaechter et al., 2006). The growth of microorganisms has been profitably analyzed from the perspective of resource allocation, that is, the assignment of limiting cellular resources to the different biochemical processes underlying growth (Scott et al., 2010; Scott et al., 2014; Molenaar et al., 2009; Giordano et al., 2016; Weiße et al., 2015; Reimers et al., 2017; Bosdriesz et al., 2015; Towbin et al., 2017; Maitra and Dill, 2015; Dourado and Lercher, 2020; Metzl-Raz et al., 2017). It is often considered that proteins, the main component of biomass, are also the bottleneck resource for growth. Proteins function as enzymes in carbon and energy metabolism and they constitute the molecular machines responsible for the synthesis of macromolecules, in particular proteins themselves. The composition of the proteome in a given growth condition can therefore be interpreted as the resource allocation strategy adopted by the cells to exploit available nutrients.
Two macroscopic criteria for characterizing microbial growth are growth rate and growth yield. The former refers to the rate of conversion of substrate into biomass, and the latter to the efficiency of the process, that is, the fraction of substrate taken up by the cells that is converted into biomass. Several empirical relations between proteome composition on the one hand, and growth rate and growth yield on the other, have been established. A linear relation between growth rate and the ribosomal protein fraction of the proteome holds over a large range of growth rates and for a variety of microbial species (Scott et al., 2010; Neidhardt and Magasanik, 1960; Forchhammer and Lindahl, 1971; Bremer and Dennis, 1996). Variants of this so-called growth law have been found for cases of reduced translation capacities (Scott et al., 2010) or different temperatures (Herendeen et al., 1979; Mairet et al., 2021). While the ribosomal protein fraction increases with the growth rate, the proteome fraction allocated to energy metabolism decreases (Basan et al., 2015a; Schmidt et al., 2016). Moreover, within this decreasing fraction, Escherichia coli and other microorganisms move resources from respiration to fermentation pathways (Basan et al., 2015a). Simple mathematical models have been proposed to account for the above relations in terms of the requirements of self-replication of the proteome and the relative protein costs and ATP yields of respiration and fermentation (Scott et al., 2010; Molenaar et al., 2009; Giordano et al., 2016; Weiße et al., 2015; Bosdriesz et al., 2015; Dourado and Lercher, 2020; Mairet et al., 2021; Basan et al., 2015a; Mori et al., 2019).
Most of these relations have been studied in experiments in which the same strain exhibits a range of growth rates in different environments, with different carbon sources. Even for a fixed environment, however, different strains of the same species may grow at very different rates and yields. For example, in a comparative study of seven E. coli strains, growth rates ranging from 0.61 to 0.97 hr^{-1}, and (carbon) growth yields between 0.52 and 0.66, were observed during aerobic growth on glucose (Monk et al., 2016). Since the genes encoding enzymes in central carbon and energy metabolism are largely shared across the strains (Monk et al., 2016), the yield differences are not due to different metabolic capacities but rather to different regulatory strategies, that is, different usages of the metabolic pathways of the cell. As another example, evolution experiments with E. coli have given rise to evolved strains that grow more than 40% faster, sometimes with higher growth yields, than the ancestor strain in the same environment (LaCroix et al., 2015). Analysis of the underlying mutations reveals that the higher rates and yields of the evolved strains are not due to new metabolic capacities, but rather to modified regulatory strategies (LaCroix et al., 2015; Utrilla et al., 2016).
Can the large variability of rate-yield phenotypes observed across different strains of the same species be explained by different resource allocation strategies, that is, different compositions of the proteome? In order to answer this question, we developed a coarse-grained resource allocation model that couples the fluxes of carbon and energy underlying microbial growth. The model was calibrated by means of existing data in the literature, without any parameter fitting, and its predictions were compared with a database of several hundred pairs of rates and yields of E. coli strains reported in the literature. The database includes wild-type strains as well as mutant strains obtained through directed mutagenesis or adaptive laboratory evolution (ALE).
We found that, in different growth conditions, the predicted variability of rate-yield phenotypes corresponds very well with the observed range of phenotypes. This also holds for the variability of substrate uptake and acetate secretion rates. Whereas in the literature, a high rate is often associated with a low yield, due to a shift of resources from respiration to fermentation, many of the E. coli strains in our database grow at a high rate and a high yield. The model predicts that strains with a high-rate, high-yield phenotype require resource allocation strategies that increase metabolite concentrations in order to allow for the more efficient utilization of proteomic resources, in particular enzymes in metabolism and ribosomes in protein synthesis. This prediction is confirmed by experimental data for a high-rate, high-yield strain. A resource allocation strategy matching the observed strategy could only be found, however, when taking into account enzyme activities in addition to enzyme concentrations.
These results are interesting for both fundamental research and biotechnological applications. They show that coarse-grained models can be used to predict multivariate phenotypes, without making any assumptions on optimality criteria, and can reveal unexpected relations confirmed by the experimental data. The model is capable of predicting quantitative bounds on growth rates and yields within a specific environment, which can be exploited for rapidly screening performance limits of strains developed in synthetic biology and metabolic engineering.
Results
Coarse-grained model with coupled carbon and energy fluxes
Coarse-grained resource allocation models describe microbial growth by means of a limited number of macroreactions converting nutrients from the environment into proteins and other macromolecules. Several such models have been proposed, usually focusing on either carbon or energy fluxes (Scott et al., 2010; Molenaar et al., 2009; Giordano et al., 2016; Weiße et al., 2015; Maitra and Dill, 2015; Bosdriesz et al., 2015; Towbin et al., 2017; Mairet et al., 2021). Few models have taken into account both, that is, the use of substrate as a carbon source for macromolecules and as a source of free energy to fuel the synthesis of macromolecules. This coupling of carbon and energy fluxes is essential, however, for understanding the relation between growth rate and growth yield. Among the notable exceptions, we cite the model of Basan et al., 2015a (see also Mori et al., 2019), which couples carbon and energy fluxes while abstracting from the reaction kinetics, and the model of Zavřel et al., 2019, which does provide such a kinetic view but ignores macromolecules other than proteins and focuses on photosynthetic growth (see Appendix 1 for a discussion of existing coarse-grained resource allocation models).
Figure 1 presents a coarse-grained kinetic model that takes inspiration from and generalizes this previous work. While the model is generic, it has been instantiated for aerobic growth of E. coli in minimal medium with glucose or glycerol as the limiting carbon source. The model variables are intensive quantities corresponding to cellular concentrations of proteins ($p$) and other macromolecules (DNA, RNA, and lipids forming cell membranes) ($u$), as well as central carbon metabolites ($c$) and ATP (${a}^{*}$). The central carbon metabolites notably comprise the 13 precursor metabolites from which the building blocks for macromolecules (amino acids, nucleotides, etc.) are produced (Schaechter et al., 2006). All concentrations have units Cmmol gDW^{-1}, except for ATP [mmol gDW^{-1}]. Five macroreactions are responsible for carbohydrate uptake and metabolism, ATP production by aerobic respiration and fermentation, and the synthesis of proteins and other macromolecules. The rates of the reactions, denoted by ${v}_{mc}$, ${v}_{mer}$, ${v}_{mef}$, ${v}_{r}$, and ${v}_{mu}$ [Cmmol gDW^{-1} hr^{-1}], respectively, are defined by kinetic expressions involving protein, precursor metabolite, and ATP concentrations. Details of the rate equations and the derivation of the model from basic assumptions on microbial growth can be found in Appendix 1. Appendix 1—table 1 summarizes the definition of variables, reaction rates, and parameters.
The carbon entering the cell is included in the different biomass components or released in the form of CO_{2} and acetate. CO_{2} is produced by respiration and macromolecular synthesis, while acetate overflow is due to aerobic fermentation (Basan et al., 2015a; Gottschalk, 1986). The carbon balance also includes the turnover of macromolecules, which is responsible for a large part of cellular maintenance costs (van Bodegom, 2007 and Appendix 1).
The energy balance is expressed in terms of the production and consumption of ATP. While energy metabolism also involves other energy cofactors (NADH, NADPH, etc.), the latter can be converted into ATP during aerobic growth (Basan et al., 2015a; Gottschalk, 1986). We call the ATP fraction ${a}^{*}/({a}^{*}+a)$, where ${a}^{*}$ and $a$ denote the ATP and ADP concentrations, respectively, the energy charge of the cell, by analogy with the concept of adenylate energy charge (Atkinson, 1968). The ATP yields of respiration and fermentation (${n}_{mer}$ and ${n}_{mef}$) as well as the ATP costs of the synthesis of proteins and other macromolecules (${n}_{r}$ and ${n}_{mu}$) are determined by the stoichiometry of the underlying metabolic pathways and the biomass composition (Basan et al., 2015a; Kaleta et al., 2013 and Appendix 2). When total ATP production and consumption in growing microbial cells are computed from ${n}_{mer}{v}_{mer}+{n}_{mef}{v}_{mef}$ and ${n}_{r}{v}_{r}+{n}_{mu}{v}_{mu}$, respectively, the former usually largely exceeds the latter (Feist et al., 2007; Russell and Cook, 1995). This so-called uncoupling phenomenon is explicitly accounted for by an energy dissipation term ${v}_{d}$ in the energy balance (Appendix 1).
As in other resource allocation models, the proteome is subdivided into categories (Scott et al., 2010; Basan et al., 2015a). We distinguish ribosomes and other translation-affiliated proteins, enzymes in central carbon metabolism, enzymes in respiration and fermentation metabolism, and a residual category of other proteins, with concentrations $r$, ${m}_{c}$, ${m}_{er}$, ${m}_{ef}$, and ${m}_{u}$, respectively. The latter category includes proteins involved in the synthesis of RNA and DNA as well as in a variety of housekeeping functions. Each category of protein catalyzes a different macroreaction in Figure 1: ribosomes are responsible for protein synthesis, enzymes for carbon and energy metabolism, and residual proteins for the synthesis of macromolecules other than proteins. Note that the proteins in the residual category may thus catalyze a macroreaction, contrary to what is assumed in other models in the literature (Appendix 1).
The protein synthesis capacity of the cell, given by the total protein synthesis rate ${v}_{r}$, is distributed over the protein categories using five fractional resource allocation parameters that sum to 1: ${\chi}_{u}$, ${\chi}_{r}$, ${\chi}_{c}$, ${\chi}_{er}$, and ${\chi}_{ef}$. Fixing the resource allocation parameters determines the model dynamics and therefore the growth phenotype (Dourado and Lercher, 2020; Zavřel et al., 2019; de Groot et al., 2020). During balanced growth, when the system is at steady state, the resource allocation parameters equal the corresponding protein fractions, for example, ${\chi}_{r}^{*}={r}^{*}/{p}^{*}$, where the asterisk (${}^{*}$) denotes the steady-state value (Appendix 1 and Erickson et al., 2017).
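The equality between allocation parameters and steady-state protein fractions can be illustrated with a minimal sketch. It assumes, for illustration only, that each protein category obeys $dm_i/dt = \chi_i v_r - \ell\, m_i$ with a constant synthesis rate $v_r$ and a constant combined dilution and degradation rate $\ell$; in the full model both quantities depend on the state of the cell (Appendix 1). All numerical values and the function name `simulate_fractions` are hypothetical.

```python
# Minimal sketch: constant allocation fractions chi_i drive the protein
# fractions m_i / p toward chi_i. Simplified dynamics, hypothetical values.

def simulate_fractions(chi, m0, v_r=10.0, loss=0.8, dt=0.001, t_end=30.0):
    """Euler integration of dm_i/dt = chi_i * v_r - loss * m_i.

    chi  : allocation fractions, summing to 1
    m0   : initial protein concentrations per category
    v_r  : total protein synthesis rate, held constant here (assumption)
    loss : combined dilution and degradation rate constant (assumption)
    """
    m = list(m0)
    for _ in range(int(t_end / dt)):
        m = [mi + dt * (ci * v_r - loss * mi) for mi, ci in zip(m, chi)]
    p = sum(m)  # total protein concentration
    return [mi / p for mi in m]

chi = [0.30, 0.35, 0.20, 0.10, 0.05]  # (chi_u, chi_r, chi_c, chi_er, chi_ef)
m0 = [5.0, 1.0, 1.0, 1.0, 1.0]        # arbitrary starting composition
fractions = simulate_fractions(chi, m0)
print([round(f, 3) for f in fractions])
```

Whatever the starting composition, the simulated fractions relax to the allocation parameters, because every category is produced in proportion to $\chi_i$ and lost at the same specific rate.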
Contrary to most models of microbial growth, the biomass includes other cellular components (DNA, RNA, metabolites, etc.) in addition to proteins (Appendix 1). The growth rate μ [hr^{-1}] directly follows from the biomass definition, under the assumption that the total biomass concentration $1/\beta $ is constant (Appendix 1 and de Jong et al., 2017). The growth rate captures the specific accumulation of biomass corrected for degradation:
where ${\rho}_{mef}$ and ${\rho}_{ru}-1$ denote the fractional loss of carbon by fermentation and macromolecular synthesis, respectively. More precisely, ${\rho}_{mef}$ and ${\rho}_{ru}$, both greater than 1, express that CO_{2} is a by-product of the synthesis of acetate and of proteins and other macromolecules, respectively, adding to the total flux of carbon through these macroreactions (Basan et al., 2015a; Gottschalk, 1986). In the growth rate definition of Equation 1, the total macromolecular synthesis rate ${v}_{r}+{v}_{mu}$ is multiplied by ${\rho}_{ru}-1$, because only the associated CO_{2} flux is lost to biomass production (Appendix 1).
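As a reading aid, the carbon balance described here can be written out term by term. The expression below is a sketch assembled from the definitions in the text, not a quotation of Equation 1: the symbol $\gamma$ for the degradation rate constant, and its placement outside the factor $\beta$, are assumptions, and the exact form is given in Appendix 1.

```latex
\mu \;=\; \beta \,\bigl( v_{mc} \;-\; v_{mer} \;-\; \rho_{mef}\, v_{mef} \;-\; (\rho_{ru}-1)\,(v_{r}+v_{mu}) \bigr) \;-\; \gamma
```

Under this reading, the carbon taken up ($v_{mc}$) is reduced by the carbon respired ($v_{mer}$), the carbon leaving through fermentation as acetate and CO_{2} ($\rho_{mef} v_{mef}$), and the CO_{2} released during macromolecular synthesis ($(\rho_{ru}-1)(v_{r}+v_{mu})$). Multiplying by $\beta$ converts the net carbon flux into a specific rate [hr^{-1}].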
The growth yield is defined as the ratio of the net biomass synthesis rate ($\mu /\beta $) to the substrate uptake rate ${v}_{mc}$:
$$Y=\frac{\mu }{\beta \,{v}_{mc}}.\qquad (2)$$
Yields are dimensionless and vary between 0 and 1. They express the fraction of carbon taken up by the cells that is included in the biomass, a definition often used in ecology and biotechnology (Morin et al., 2016; Roller and Schmidt, 2015). The definitions of Equations 1 and 2 provide a rigorous statement of the carbon balance and thus enable the comparison of different resource allocation strategies.
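The arithmetic behind these two definitions can be made concrete with a short sketch. The yield function follows the definition in the text (net biomass synthesis rate $\mu/\beta$ divided by uptake rate). The growth rate function encodes one assumed reading of the carbon balance, with a degradation constant `gamma` subtracted outside the factor `beta`. All numerical values are hypothetical and are not taken from the calibration in Appendix 2.

```python
def growth_rate(beta, v_mc, v_mer, v_mef, v_r, v_mu, rho_mef, rho_ru, gamma):
    """Growth rate [1/hr] from the carbon balance (assumed form of Equation 1)."""
    # Carbon lost as CO2 (respiration, synthesis) and as acetate plus CO2 (fermentation).
    carbon_lost = v_mer + rho_mef * v_mef + (rho_ru - 1.0) * (v_r + v_mu)
    return beta * (v_mc - carbon_lost) - gamma

def growth_yield(mu, beta, v_mc):
    """Growth yield: net biomass synthesis rate (mu / beta) over uptake rate."""
    return (mu / beta) / v_mc

# Hypothetical values, for illustration only (fluxes in Cmmol/gDW/hr).
beta = 0.025  # gDW/Cmmol, inverse of the total biomass concentration
mu = growth_rate(beta, v_mc=50.0, v_mer=10.0, v_mef=4.0, v_r=20.0,
                 v_mu=8.0, rho_mef=1.5, rho_ru=1.1, gamma=0.03)
Y = growth_yield(mu, beta, v_mc=50.0)
print(f"mu = {mu:.2f} 1/hr, Y = {Y:.2f}")
```

With these illustrative numbers, the growth rate is about 0.75 hr^{-1} and the yield about 0.60, meaning that 60% of the carbon taken up ends up in biomass.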
The model in Figure 1 was calibrated using data from the literature for batch or continuous growth of E. coli in minimal medium with glucose or glycerol. In brief, for the E. coli reference strain BW25113, we collected for each growth medium the growth rate and metabolite uptake and secretion rates (Peebo et al., 2015; Haverkorn van Rijsewijk et al., 2011; Gerosa et al., 2015), as well as protein and metabolite concentrations (Schmidt et al., 2016; Gerosa et al., 2015). Using additional assumptions based on literature data (Bennett et al., 2009; Dourado et al., 2021), we fixed a unique set of parameters for each condition (batch vs. continuous growth, glucose vs. glycerol), without parameter fitting (Appendix 2). The resulting set of quantitative models provides a concise but comprehensive representation of the growth of E. coli in different environments.
Predicted rate-yield phenotypes for E. coli
The reference strain used for calibrating the model has, for each of the conditions considered, a specific resource allocation strategy defined by the values of the resource allocation parameters: $({\chi}_{u},{\chi}_{r},{\chi}_{c},{\chi}_{er},{\chi}_{ef})$. We ask how the growth rate and growth yield change, during balanced growth, when the resource allocation strategy is different from the one adopted by the reference strain. In other words, we consider the range of possible rate-yield phenotypes for strains with the same metabolic capacities as the reference strain, but different regulation of the allocation of protein resources to the macroreactions of Figure 1. The same parameter values for the kinetic constants are used as for the reference strain. This allows us to focus on differences in growth rate and growth yield that can be unambiguously attributed to differences in resource allocation.
In order to predict the variability of rate-yield phenotypes, we uniformly sampled the space of possible resource allocation strategies. Except for the parameter ${\chi}_{u}$, expressing the fraction of resources attributed to housekeeping and other proteins, the parameters defining a resource allocation strategy were allowed to vary over the entire range from 0 to 1, subject to the constraint that they sum to 1 (Figure 1). The allowed range of values for ${\chi}_{u}$ was limited to the observed variation in the reference strain over a large variety of growth conditions (different limiting carbon sources, different stresses, etc.) (Schmidt et al., 2016 and Figure 2—figure supplement 1). For every resource allocation strategy, we numerically simulated the system until a steady state was reached, corresponding to balanced growth of the culture (Materials and methods). From the steady-state values of the fluxes and concentrations, the growth rate and growth yield can then be computed by means of Equations 1 and 2 (Figure 2—figure supplement 3).
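One possible way to draw allocation strategies under these constraints is sketched below. It is not the procedure of Materials and methods, only an illustration: ${\chi}_{u}$ is drawn uniformly from a bounded interval, and the remaining four parameters are drawn uniformly on the simplex (normalized exponential variates) and rescaled to $1-{\chi}_{u}$. The function name `sample_strategy` and the interval bounds 0.3 and 0.5 are hypothetical.

```python
import random

def sample_strategy(chi_u_min, chi_u_max, rng):
    """Draw one strategy (chi_u, chi_r, chi_c, chi_er, chi_ef) summing to 1."""
    chi_u = rng.uniform(chi_u_min, chi_u_max)
    # Normalized exponential draws are uniformly distributed on the simplex.
    draws = [rng.expovariate(1.0) for _ in range(4)]
    total = sum(draws)
    rest = [(1.0 - chi_u) * d / total for d in draws]
    return (chi_u, *rest)

rng = random.Random(0)  # fixed seed for reproducibility
# The bounds on chi_u are placeholders, not the range used in the study.
strategies = [sample_strategy(0.3, 0.5, rng) for _ in range(1000)]
print(strategies[0])
```

Each sampled strategy would then be passed to a simulation of the model until steady state, from which a rate-yield pair is computed.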
Figure 2 shows the cloud of predicted rate-yield phenotypes for batch growth on glucose. A first observation is that the possible combinations of rate and yield are bounded. The growth rate does not exceed 1.1 hr^{-1}, and for all but the lowest growth rates, the growth yield is larger than 0.3. The existence of an upper bound on the growth rate can be intuitively understood from Equation 1. The maximum growth rate is limited by the substrate uptake rate, which provides the carbon included in the biomass. In turn, the uptake rate is bounded by the concentration of enzymes responsible for substrate uptake and metabolism, a concentration that is ultimately limited by the total biomass concentration. The existence of a lower bound on the biomass yield is a direct consequence of the autocatalytic nature of microbial growth: the different growth-supporting functions are sustained by enzymes and ribosomes, which need to be continually produced to counter the effect of growth dilution and degradation.
A second observation is that, for low growth rates, the maximum growth yield increases with the rate, whereas it decreases for high growth rates, above 0.4 hr^{-1}. The initial maximum yield increase can be attributed to the proportionally lower burden of the maintenance costs (Pirt, 1965). In particular, considering that a higher growth rate comes with a higher substrate uptake rate (Equation 1), the term $\gamma /{v}_{mc}$ appearing in the definition of the yield when substituting the growth rate expression (Equation 2) rapidly diminishes in importance when the growth rate increases (Figure 4—figure supplement 1A). The decrease of the maximum yield at higher growth rates reflects a trade-off that has been much investigated in microbial physiology and ecology (Lipson, 2015; Beardmore et al., 2011) and to which we return below.
Every point within the cloud of rate-yield phenotypes corresponds to a specific underlying resource allocation strategy. The mapping from resource allocation strategies to rate-yield phenotypes is far from straightforward due to the feedback loops in the model, which entail strong mutual dependencies between carbon and energy metabolism, protein synthesis, and growth. Useful insights into the nature of this mapping can be gained by visualizing the physiological consequences of a strategy in the form of a pictogram showing (i) the biomass composition, (ii) the flux map, and (iii) the energy charge. The pictogram summarizes how the incoming carbon flux is distributed over the biosynthesis, respiration, and fermentation fluxes, and how the concentrations of proteins, metabolites, and energy cofactors sustain these fluxes (Figure 2).
Due to model calibration, the fluxes, concentrations, and energy charge for the point corresponding to the growth of the reference strain, labeled BW in Figure 2, agree with the experimental data. At steady state, the resource allocation parameters coincide with the protein fractions (Erickson et al., 2017 and Appendix 1), so that the relative sizes of the protein concentrations in the pictogram correspond to the resource allocation strategy adopted by the cells. As can be seen, the reference strain invests heavily in ribosomal and other translation-oriented proteins, which take up almost 50% of the proteome. The pictogram also shows that the reference strain generates ATP by a combination of respiration and fermentation: both ${v}_{mer}$ and ${v}_{mef}$ are nonzero, and so are the corresponding enzyme concentrations ${m}_{er}$ and ${m}_{ef}$. Although proteins dominate the biomass, a non-negligible proportion of the latter consists of other macromolecules (25%) and central metabolites (1%) (Appendix 2).
How does the reference point compare with other notable points in the cloud of predicted rate-yield phenotypes, in particular the points at which the growth rate and growth yield are maximal, denoted by ${\mu}_{\text{max}}$ and ${Y}_{\text{max}}$? While the physiology of ${\mu}_{\text{max}}$ is not radically different from that for the reference strain, it does have a number of distinctive features. The higher growth rate comes with a higher glucose uptake rate and a higher protein synthesis rate. The total protein concentration is lower though, due to increased growth dilution at the higher growth rate. Investment in energy metabolism has shifted from fermentation to respiration, in order to allow for more efficient ATP production at a lower enzyme concentration. The energy charge is slightly lower than in the reference strain. This is compensated for by a higher metabolite concentration, however, which leads to a higher saturation of ribosomes and allows protein synthesis to increase even at a lower ribosome concentration. In other words, bearing in mind the kinetic expression for protein synthesis from Appendix 1,
where ${k}_{r}$ is a catalytic constant corresponding to the maximum protein synthesis rate and ${K}_{r}$ and ${K}_{ar}$ are half-saturation constants, ${v}_{r}$ can increase at ${\mu}_{\text{max}}$ despite the decrease of $r$ and ${a}^{*}$, thanks to the increase of $c$.
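To make the argument explicit, a rate law consistent with this description (one catalytic constant and two half-saturation constants, one for the metabolites $c$ and one for ATP ${a}^{*}$) can be sketched as follows. The product form of the two saturation terms is an assumption made here for illustration; the exact expression is the one given in Appendix 1.

```latex
v_{r} \;=\; k_{r}\; r\; \frac{c}{c + K_{r}}\; \frac{a^{*}}{a^{*} + K_{ar}}
```

In such an expression, a lower ribosome concentration $r$ and a lower ATP concentration $a^{*}$ both reduce $v_{r}$, but a sufficiently large increase of $c$ raises the saturation term $c/(c+K_{r})$ toward 1 and can more than offset these reductions.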
The rate-yield phenotype corresponding to ${Y}_{\text{max}}$ has a predicted physiology that is strikingly different from the reference strain. The high yield is obtained by a strong reduction of protein synthesis and therefore lower concentrations of enzymes and ribosomes (Figure 2). Protein synthesis is the principal ATP-consuming process in microbial growth, so its reduction diminishes the need for ATP synthesis and decreases the associated loss of carbon (Figure 1). The net effect is a decrease of the growth rate, but an increase of the growth yield (Equations 1 and 2).
The strong reduction of the concentration of proteins and other macromolecules at ${Y}_{\text{max}}$ implies, by the assumption of constant biomass density (Appendix 1), that the metabolite concentration increases. This may correspond to the formation of glycogen, a glucose storage compound, which occurs when excess glucose cannot be used for macromolecular synthesis due to other limiting factors. Glycogen concentrations in wild-type E. coli cells are low, but there exist mutants which accumulate large amounts of glycogen, on the order of 25–30% of biomass (Morin et al., 2016). The biomass percentage of carbohydrates and lipids in other microorganisms, such as microalgae, reaches even higher levels (Finkel et al., 2016; Reitan et al., 2021).
The upper boundary of the cloud of predicted rate-yield phenotypes in Figure 2, between ${Y}_{\text{max}}$ and ${\mu}_{\text{max}}$, is a Pareto frontier. It corresponds to a trade-off between growth rate and growth yield, which cannot be simultaneously increased in this region. How can this trade-off be explained? By making appropriate assumptions, the model can be simplified along the Pareto frontier, which allows the decrease in growth yield with the increase in growth rate to be traced back to changes in the resource allocation strategy (Appendix 1 and Figure 2—figure supplement 4). In summary, the analysis shows that an increase in growth rate requires protein synthesis to be increased, which comes with a higher loss of carbon, and therefore a lower (maximum) yield. The increase in protein synthesis leads to a higher protein concentration, reflected in a resource allocation strategy shifting resources to the synthesis of enzymes in energy metabolism and ribosomes, and a correspondingly lower concentration of central carbon metabolites. That is, on the physiological level, the trade-off between growth rate and growth yield corresponds to a trade-off between protein and metabolite concentrations.
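For a finite cloud of simulated phenotypes, the Pareto frontier can be extracted numerically. The sketch below is a generic illustration of that idea, not the analysis of Appendix 1: it keeps every (rate, yield) point for which no other point has both a higher rate and a higher yield. The function name `pareto_front` and the data points are hypothetical.

```python
def pareto_front(points):
    """Return the (rate, yield) points not dominated in both coordinates."""
    front = []
    best_yield = float("-inf")
    # Scan from the highest rate downward; keep points that improve the yield.
    for rate, yld in sorted(points, key=lambda p: (-p[0], -p[1])):
        if yld > best_yield:
            front.append((rate, yld))
            best_yield = yld
    return front[::-1]  # order by increasing growth rate

# Hypothetical (growth rate [1/hr], growth yield) pairs.
cloud = [(0.2, 0.55), (0.4, 0.80), (0.6, 0.70), (0.6, 0.75),
         (0.9, 0.60), (1.1, 0.50), (0.8, 0.45)]
front = pareto_front(cloud)
print(front)
```

In this toy cloud, the frontier runs from the maximum-yield point (0.4, 0.80) to the maximum-rate point (1.1, 0.50), and the yield decreases monotonically with the rate along it.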
Some caution should be exercised in the biological interpretation of the points ${\mu}_{\text{max}}$ and ${Y}_{\text{max}}$, as they are located on the upper boundary of the cloud of predicted rate-yield phenotypes. They represent extreme phenotypes that may be counterselected in the environment in which E. coli evolves or that may violate basic biophysical constraints not included in the model. Nevertheless, the bounds do put a quantitative limit on the variability of rate-yield phenotypes that can be confronted with the available experimental data.
Comparison of predicted and observed rate-yield phenotypes for E. coli
We predicted the variability of rate-yield phenotypes of E. coli during batch growth in minimal medium with glucose or glycerol, and during continuous growth at different dilution rates in minimal medium with glucose. The resource allocation strategies were varied in each condition with respect to the strategy observed for the BW25113 strain used for model calibration (Figure 3A). In order to compare the predicted variability of rate-yield phenotypes with experimental data, we compiled a database of measured rates and yields reported in the literature (Supplementary files 1 and 2), and plotted the measurements in the phenotype spaces (Figure 3B–D). The database includes the reference wild-type strain, other E. coli wild-type strains, strains with mutations in regulatory genes, and strains obtained from ALE experiments. Apart from the rate and yield of the reference strain (Haverkorn van Rijsewijk et al., 2011), none of the data points plotted in Figure 3 were used for calibration.
The variability of the measured rates and yields during batch growth on glucose corresponds very well with the predicted variability: all data points fall inside the predicted cloud of phenotypes and much of the cloud is covered by the data points (Figure 3B). Interestingly, the highest growth rates on glucose attained in ALE experiments, just above 1 hr^{-1} (LaCroix et al., 2015; Monk et al., 2017), approach the highest predicted growth rates (1.1 hr^{-1}). The range of high growth rates is enriched in data points, which may reflect the bias that E. coli wild-type and mutant strains grow relatively fast on glucose and glycerol, and that in most ALE experiments the selection pressure is tilted toward growth rate.
The BW25113 strain has a low growth yield on glucose (equal to 0.50, Haverkorn van Rijsewijk et al., 2011). Many mutants of this strain with deletions of regulatory genes have a somewhat higher yield (Haverkorn van Rijsewijk et al., 2011), but still fall well below the maximum predicted yield. The growth yield of some other wild-type strains is significantly higher; for example, the W strain achieves a yield of 0.66 at a growth rate of 0.97 hr^{-1} (Monk et al., 2016). The highest growth yield is achieved by an evolved strain (0.81, Schuetz et al., 2012), agreeing quite well with the maximum predicted growth yield for that growth rate. The latter strain does not secrete any acetate while growing on glucose (Schuetz et al., 2012), which contributes to the higher yield.
Similar observations can be made for growth of E. coli on glycerol, although in this case fewer experimental data points are available (Figure 3D). The model predicts that the highest growth rate on glycerol is similar to the highest growth rate on glucose, which is confirmed by experimental data (Andersen and von Meyenburg, 1980). In addition to batch growth, we also considered continuous growth in a chemostat. This required a recalibration of the model, since the environment is not the same as for batch growth (Appendix 2). Figure 3C shows the predicted rate-yield phenotype space for dilution rates around 0.2, 0.35, and 0.5 hr^{-1}, as well as the observed rates and yields. Again, there is good correspondence between the predicted and observed variability of growth yield. Most chemostat experiments reported in the literature have been carried out with the BW25113 and MG1655 wild-type strains. This absence of mutants and evolved strains may lead to an underestimation of the range of observed growth yields.
In the above comparisons of the model with the data, we made the assumption that the strains considered have the same metabolic capacities as the reference strain. This assumption was satisfied by restricting the database to wildtype strains with essentially the same central carbon and energy metabolism (Monk et al., 2016), mutant strains with deletions of genes encoding regulators instead of enzymes (Haverkorn van Rijsewijk et al., 2011), and shortterm ALE mutants which have not had the time to develop new metabolic capacities (Monk et al., 2017). We also made the assumption that the parameter values are the same for all strains, so that differences in resource allocation strategies are the only explanatory variable. It is remarkable that, despite these strong assumptions, the model predicts very well the observed variability of rateyield phenotypes in E. coli.
Predicted and observed uptake-secretion phenotypes for E. coli
Growth rate and growth yield are defined in terms of carbon and energy fluxes through the population (Equations 1 and 2). Like rate and yield, some of these fluxes, in particular uptake and secretion rates, have been found to vary substantially across E. coli strains growing in minimal medium with glucose (Monk et al., 2016; LaCroix et al., 2015). Can our model also reproduce the observed variability of uptake-secretion phenotypes? We projected the model predictions onto the space of uptake-secretion phenotypes, and crossed the latter with rate-yield phenotypes. Moreover, we compared the predicted variability with measurements from studies in which not only growth rate and growth yield, but also uptake and secretion rates were measured (Supplementary file 1).
Figures 4A and B relate the predicted range of glucose uptake rates to the growth rates and growth yields, respectively. The model predicts an overall positive correlation between growth rate and glucose uptake rate, which is an obvious consequence of the fact that glucose provides the carbon included in the biomass. The glucose uptake rate does not unambiguously determine the growth rate, though. Depending on the resource allocation strategy, the bacteria can grow at different yields for a given glucose uptake rate (Equation 2 and Figure 4—figure supplement 1B). Note that the trade-off between growth rate and maximum growth yield previously observed in Figure 3 reappears here in the form of a trade-off between glucose uptake rate and maximum growth yield, for uptake rates above 20 Cmmol gDW^{-1} hr^{-1}.
The predicted variability of glucose uptake rates vs growth rates and growth yields corresponds to the observed variability. Almost all data points fall within the predicted cloud of phenotypes and the data points cover much of the cloud. The strains resulting from ALE experiments cluster along the predicted upper bound of not only rate but also yield, suggesting that part of the increase in growth rate of ALE strains is obtained through the more efficient utilization of glucose.
Another observable flux is the acetate secretion rate, which is an indicator of the functioning of energy metabolism. In aerobic conditions, E. coli has two different modes of ATP production: respiration and fermentation. Glucose and glycerol are taken up by the cells and degraded in the glycolysis pathway, eventually producing acetyl-CoA. Whereas acetyl-CoA enters the tricarboxylic acid (TCA) cycle in the case of respiration, it is secreted in the form of acetate during fermentation. In both cases, NADH and other reduced compounds are produced along the way and their recycling is coupled with the generation of a proton gradient across the membrane, enabling the production of ATP. Respiration is the more efficient of the two ATP production modes: in E. coli, respiration yields 26 ATP molecules per molecule of glucose and fermentation only 12 (Basan et al., 2015a).
Figures 4C and D show the predicted relation between acetate secretion rates and growth rates and growth yields. The plots reveal a clear trade-off between maximum growth yield and acetate secretion rate, due to the fact that fermentation is less efficient than respiration in producing ATP. The model predicts no apparent relation between growth rate and acetate secretion. In particular, high growth rates can be attained with a continuum of ATP production modes: from pure respiration to combinations of respiration and fermentation. Similar conclusions can be drawn when plotting the acetate secretion rate relative to the glucose uptake rate (${v}_{mef}/{v}_{mc}$), that is, when considering the fraction of carbon taken up that is secreted as acetate (Figure 4—figure supplement 1C–D). Maximum yield requires respiration without fermentation, whereas minimum yield is attained for maximum fermentation, where more than 50% of the carbon entering the cell is lost due to acetate overflow.
The measured combinations of acetate secretion rate vs growth rate or growth yield entirely fall within the bounds predicted by the model (Figure 4C–D). The data notably show that as the growth yield increases, fermentation phenotypes give way to respiration phenotypes. The measurements further confirm that it is possible for E. coli to grow fast without acetate secretion. In particular, some of the fastest growing E. coli wildtype strains have no acetate overflow, like the W strain (Monk et al., 2016), and some of the evolved strains grow very fast but with little acetate overflow as compared to their ancestors (Schuetz et al., 2012). The observed relative acetate secretion rates also fall almost entirely within the predicted bounds (Figure 4—figure supplement 1C–D).
Another view on the uptake-secretion data is obtained when plotting, for each resource allocation strategy, the predicted glucose uptake rate against the predicted acetate secretion rate (Figure 4E). Not surprisingly, the maximum acetate secretion rate increases with the glucose uptake rate, since acetate is a by-product of glucose metabolism. The plot also emphasizes, however, that the increase of acetate secretion with glucose uptake is not a necessary constraint of the underlying growth physiology: E. coli is predicted to be able to grow without acetate overflow over almost the entire range of glucose uptake rates, from 0 to 65 Cmmol gDW^{-1} hr^{-1}.
Again, the observed variability of uptake-secretion phenotypes falls well within the predicted bounds, although a few outliers occur. In particular, the Crooks strain has a phenotype that deviates significantly from the predicted combinations of acetate secretion and glucose uptake rates (Monk et al., 2017). This suggests that resource allocation alone cannot fully explain the observed phenotype and that other regulatory effects need to be taken into account in this case. High acetate secretion rates, above 20 Cmmol gDW^{-1} hr^{-1}, are mostly absent from the database of observed uptake-secretion phenotypes. This is another manifestation of the overrepresentation of strains with a high growth rate on glucose (Figure 3B): the secretion of a large fraction of the glucose taken up in the form of acetate does not make it possible to attain high growth rates (Equation 1).
Given the higher ATP yield of respiration, it is not surprising that the highest growth yields are attained when respiration is preferred to fermentation. What might not have been expected, however, is that some strains achieve a growth rate on glucose close to the predicted maximum without resorting to fermentation. It is well known that when growing an E. coli strain in minimal medium with glucose at increasingly higher growth rates, the contribution of fermentation to ATP production increases at the expense of respiration, as witnessed by the increase of acetate secretion (Basan et al., 2015a; Nanchen et al., 2006; Peebo et al., 2015; Valgepea et al., 2010 and Figure 4—figure supplement 2). This shift of resources from respiration to fermentation has been explained in terms of constraints on available protein resources, trading costly but efficient respiration enzymes against cheap but inefficient fermentation enzymes. The existence of strains capable of attaining the highest growth rates without fermentation suggests that this proteome constraint can be bypassed and raises the question which resource allocation strategies allow the bacteria to do so.
Strategies enabling fast and efficient growth of E. coli
The analysis of the model predictions in Figure 2, notably the point ${\mu}_{\text{max}}$, provided some indications of the strategies enabling high-rate, high-yield growth of E. coli. Unfortunately, no data for ${\mu}_{\text{max}}$ are available. However, the NCM3722 strain (Brown and Jun, 2015) attains a growth rate approaching the maximally observed rate for E. coli in minimal medium with glucose (0.97 hr^{-1}), and has a significantly higher growth yield than the BW25113 reference strain (0.6) (Schmidt et al., 2016; Cheng et al., 2019). The glucose uptake and acetate secretion rates of NCM have been measured in the growth conditions considered here (Basan et al., 2015a; Cheng et al., 2019) and proteomics data are available from the same experiment as used for calibration of the model (Schmidt et al., 2016, Figure 5A). How does the observed resource allocation strategy for NCM compare with the strategies that, according to the model, predict the rate-yield and uptake-secretion phenotypes of NCM? And how do these strategies enable fast and efficient growth of this strain?
Whereas every resource allocation strategy gives rise to a unique rateyield phenotype, the inverse is not true: several strategies can in principle predict an observed combination of growth rate, growth yield, glucose uptake rate, and acetate secretion rate (Materials and methods and Figure 2—figure supplement 2). The boxplots in Figure 5C show the resource allocation strategies that, according to the model, give rise to a growth physiology consistent with that observed for NCM. That is, every individual strategy predicts a growth rate, growth yield, glucose uptake rate, and acetate secretion rate within 5% of the observed value. The same figure also shows the observed resource allocation strategy for NCM, consisting of the values of ${\chi}_{u}$, ${\chi}_{r}$, ${\chi}_{c}$, and ${\chi}_{e}={\chi}_{er}+{\chi}_{ef}$ during balanced growth on glucose, derived from the proteomics data (Materials and methods).
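The selection criterion described above (every predicted quantity within 5% of the observed value) can be sketched as a simple filter. This is a minimal illustration, not the authors' code: the function names, the dictionary keys, and all numbers are hypothetical, and the simulated phenotypes are assumed to have been computed beforehand from the model.

```python
def within_tolerance(predicted, observed, rel_tol=0.05):
    """True if every predicted quantity lies within rel_tol of the observed one."""
    # Note: a relative tolerance is degenerate for an observed value of exactly 0.
    return all(
        abs(predicted[name] - value) <= rel_tol * abs(value)
        for name, value in observed.items()
    )


def consistent_strategies(simulated, observed, rel_tol=0.05):
    """Keep the strategies whose predicted phenotype matches the observation.

    `simulated` is a list of (strategy, phenotype) pairs; `phenotype` maps
    quantity names to steady-state values predicted by the model.
    """
    return [s for s, ph in simulated if within_tolerance(ph, observed, rel_tol)]


# Toy illustration with made-up numbers (not data from the paper).
observed = {"rate": 1.0, "yield": 0.6, "uptake": 60.0, "acetate": 5.0}
simulated = [
    ({"chi_c": 0.10}, {"rate": 1.02, "yield": 0.59, "uptake": 61.0, "acetate": 5.1}),
    ({"chi_c": 0.12}, {"rate": 0.80, "yield": 0.60, "uptake": 60.0, "acetate": 5.0}),
]
matches = consistent_strategies(simulated, observed)
```

In this toy case only the first strategy is retained, because the second one misses the observed growth rate by 20%. Several distinct strategies can pass the filter for the same observation, which is why the results are summarized as distributions rather than a single point.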
Whereas the strategies reproducing the rate-yield and uptake-secretion phenotypes of NCM partially overlap with the measured strategy, the predicted ${\chi}_{c}$ values are significantly higher than those observed. In other words, the model requires a higher protein fraction for enzymes in central carbon metabolism (${m}_{c}/p$) than observed in the proteomics data. The underlying problem is that in our model the carbon uptake and metabolization rate is directly proportional to the enzyme concentration (Appendix 1):
$${v}_{mc}={k}_{mc}\,{m}_{c}\,\frac{S}{S+{K}_{mc}}\approx {k}_{mc}\,{m}_{c},$$
where $S\gg {K}_{mc}$ during balanced growth in batch and ${k}_{mc}$ [hr^{-1}] is an apparent catalytic constant (Appendix 1). Therefore, the high glucose uptake rate necessary for the high growth rate of NCM requires a high enzyme concentration, and thus a high protein fraction ${m}_{c}/p$. This is contradicted by the measured protein fraction for NCM, which is slightly lower than the one observed for BW (0.07 as compared to 0.09 for BW), for a glucose uptake rate that is much higher (66.0 Cmmol gDW^{-1} hr^{-1} as compared to 49.6 Cmmol gDW^{-1} hr^{-1} for BW). Note that a less pronounced, but opposite divergence of model and data is seen in the case of the protein fractions of ribosomal proteins and enzymes in energy metabolism (Figure 5C). That is, the predicted overinvestment in central metabolism comes with a corresponding underinvestment in protein synthesis and energy metabolism.
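The size of this discrepancy can be gauged with a back-of-the-envelope calculation. If the uptake rate is proportional to the product of the apparent catalytic constant ${k}_{mc}$ and the enzyme concentration ${m}_{c}$, and if one additionally assumes that the total protein concentration $p$ is the same in both strains (an assumption made here only for illustration, not a statement from the paper), then the measured uptake rates and protein fractions imply a ratio of catalytic constants. The function name below is hypothetical.

```python
# Values quoted in the text for growth on glucose.
UPTAKE_NCM, UPTAKE_BW = 66.0, 49.6      # Cmmol gDW^-1 hr^-1
FRACTION_NCM, FRACTION_BW = 0.07, 0.09  # measured protein fraction m_c / p


def implied_kmc_ratio(uptake_a, fraction_a, uptake_b, fraction_b):
    """Ratio k_mc(a) / k_mc(b) implied by v_mc ~ k_mc * m_c.

    Assumes the same total protein concentration p in both strains, so that
    m_c is proportional to the protein fraction m_c / p.
    """
    return (uptake_a / fraction_a) / (uptake_b / fraction_b)


ratio = implied_kmc_ratio(UPTAKE_NCM, FRACTION_NCM, UPTAKE_BW, FRACTION_BW)
print(f"implied k_mc(NCM) / k_mc(BW) = {ratio:.2f}")
```

Under these assumptions the ratio comes out at about 1.7, that is, the glycolytic enzymes of NCM would need to be roughly 70% more active than those of BW. This is a rough consistency check only; it does not replace the model-based analysis.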
The discrepancies between predicted and observed resource allocation strategies suggest that bacteria exploit additional regulatory factors to achieve high-rate, high-yield growth. This conclusion agrees with the view that the regulation of fluxes in central metabolism involves not only enzyme concentrations, but also regulation of enzyme activity (Davidi and Milo, 2017; Donati et al., 2018). While little is known about the mechanisms allowing NCM to grow much faster than BW, genomic changes and their physiological impact have been identified for ALE strains (LaCroix et al., 2015; Utrilla et al., 2016; Cheng et al., 2014). In an ALE mutant evolved in glycerol, the change in growth rate was attributed to a change in activity of the GlpK enzyme (Cheng et al., 2014), leading to higher glycerol uptake rates. In the model, the latter mutation would translate to an increase in the catalytic constant ${k}_{mc}$ (Appendix 1).
In order to verify the hypothesis that an additional layer of regulation, acting upon enzyme activity, plays a role in highrate, highyield growth, we modified the analysis of the model. Instead of varying only resource allocation parameters $({\chi}_{u},{\chi}_{r},{\chi}_{c},{\chi}_{er},{\chi}_{ef})$, we also allowed the catalytic constants $({k}_{mc},{k}_{mer},{k}_{mef})$, representing the (apparent) enzyme turnover rates in central carbon and energy metabolism (Appendix 1), to increase or decrease by at most a factor of 2. The results of the simulations are shown in Figure 5C. They reveal that there now exist resource allocation strategies capable of reproducing the observed NCM growth phenotypes within a 5% margin. Most notably, these strategies require an increased value of ${k}_{mc}$ (Figure 5—figure supplement 1). That is, the model predicts that glycolytic enzymes are more active in NCM as compared to BW during growth on glucose. This allows resources to be shifted from glycolytic enzymes to other growthsupporting functions. Whereas no experimental data exist to specifically test the above prediction, it is known that the activity of pyruvate kinase, regulated by fructose1,6bisphosphate (Valentini et al., 2000), increases with a higher glycolytic flux and therefore higher growth rate (Kochanowski et al., 2013; Kremling et al., 2007).
Our model thus allows the accurate reconstruction of resource allocation strategies underlying highrate, highyield growth of the E. coli NCM strain on glucose, when the repertoire of available strategies is enlarged from resource allocation to the regulation of enzyme activity. In addition to the rateyield and uptakesecretion phenotypes, the strategies also reproduce the total protein and metabolite concentrations (Figure 5D and Basan et al., 2015b; Park et al., 2016). Importantly for the question how the strategies enable highrate, highyield growth, NCM is seen to maintain a higher metabolite concentration than BW (Figure 5B). As a consequence, the estimated ratio of central metabolites and halfsaturation constants rises from 1.2 for BW to 3.0 for NCM (Appendix 2). The resulting increased saturation of enzymes and ribosomes sustains higher metabolic fluxes, without an additional investment in proteins (Figure 5B). This observation, together with the higher activity of enzymes in central carbon metabolism, suggests that the more efficient utilization of proteomic resources is key to highrate, highyield growth of E. coli. This strategy is reminiscent of the proposed existence of a tradeoff between enzyme and metabolite concentrations in central carbon metabolism in other recent studies (Dourado et al., 2021; Fendt et al., 2010; O’Brien et al., 2016).
Discussion
Analysis of the resource allocation strategies adopted by microbial cells can explain a number of phenomenological relations between growth rate, growth yield, and macromolecular composition (Scott et al., 2010; Scott et al., 2014; Molenaar et al., 2009; Giordano et al., 2016; Weiße et al., 2015; Reimers et al., 2017; Bosdriesz et al., 2015; Towbin et al., 2017; Maitra and Dill, 2015; Dourado and Lercher, 2020; MetzlRaz et al., 2017). We have generalized this perspective to account for a striking observation: the large variability of rateyield phenotypes across different strains of a bacterial species grown in the same environment. We constructed a coarsegrained resource allocation model (Figure 1), which was calibrated using literature data on batch and continuous growth of the E. coli BW25113 strain in minimal medium with glucose or glycerol. In each of the conditions, we considered the rateyield phenotypes predicted by the model when allowing resource allocation to vary over the entire range of possible strategies, while keeping the kinetic parameters constant.
This approach is based on a number of strong assumptions. The coarsegrained nature of the model reduces microbial metabolism and protein synthesis to a few macroreactions, instead of accounting for the hundreds of enzymecatalyzed reactions involved in these processes (Cheng et al., 2019; Adadi et al., 2012; Mori et al., 2016; Reimers et al., 2017; Wortel et al., 2018). Resource allocation is reduced to constraints on protein synthesis capacity, whereas other constraints such as limited solvent capacity and membrane space may also play a role (Adadi et al., 2012; Beg et al., 2007; Zhuang et al., 2011; Szenk et al., 2017). All possible combinations of resource allocation parameters were considered, limited only by the constraint that they must sum to 1. Observed variations in protein abundance are less drastic (Schmidt et al., 2016; Hui et al., 2015), and coupled through shared regulatory mechanisms (Scott et al., 2014; Chubukov et al., 2014). The kinetic parameters in the model have apparent values absorbing unknown regulatory effects, specific to each growth condition. This contrasts with strainspecific kinetic models with an explicit representation of the underlying regulatory mechanisms (Weiße et al., 2015; Erickson et al., 2017; Millard et al., 2017), and does not allow our model as such to be used for transitions between growth conditions.
Despite these limitations, we observed a very good quantitative correspondence between the predicted and observed variability of rateyield phenotypes of different E. coli strains grown in the same environment (Figure 3). This correspondence also holds when the comparison with the experimental data is extended to glucose uptake and acetate secretion rates associated with the measured growth rates and growth yields (Figure 4). The results suggest that differences in resource allocation are a major explanatory factor for the observed rateyield variability. We verified the robustness of this conclusion by testing alternative ways to calibrate the model (Appendix 1 and Appendix 2). In particular, we used data for another commonly used laboratory strain, MG1655, to determine the kinetic parameters, and we interpreted the proteomics data differently by introducing an additional category of growthrateindependent proteins that do not carry a flux (Scott et al., 2010; Hui et al., 2015). In both cases, the predicted rateyield variability largely overlaps with that obtained for the reference model (Figure 3—figure supplement 1).
Many studies of microbial growth have provided evidence for a tradeoff between growth rate and growth yield (see Lipson, 2015; Beardmore et al., 2011, for reviews). One particularly telling manifestation of this tradeoff is the relative increase of acetate overflow, and thus decrease of the growth yield, when an E. coli strain is grown on glucose at increasingly higher growth rates, by setting the dilution rate in a chemostat or by genetically modifying the glucose uptake rate (Figure 4—figure supplement 2). This shift of resources from respiration to fermentation has been explained in terms of a tradeoff between energy efficiency and protein cost (Molenaar et al., 2009; Basan et al., 2015a; Pfeiffer et al., 2001). In the experimental condition considered here, batch growth on glucose of different E. coli strains with the same metabolic capacities, we found no straightforward relation between growth rate and growth yield. Neither the model nor the data show a correlation between growth rate and acetate overflow (Figure 4C and Figure 4—figure supplement 1), as was also previously observed by Cheng et al., 2019, for a selection of ALE mutant strains. In particular, the data show that some of the fastest growing strains secrete little or no acetate and therefore have a high growth yield.
These findings raise the question which resource allocation strategies allow E. coli to grow on glucose both rapidly and efficiently. Our model predicts that a highrate, highyield phenotype, as exemplified by ${\mu}_{\text{max}}$ in Figure 2, can be obtained by increasing the concentration of central carbon metabolites in comparison with the concentration observed for the BW25113 strain used for calibration. While no data are available for the ${\mu}_{\text{max}}$ phenotype, a higher concentration of central carbon metabolites is indeed observed for the wellcharacterized NCM3722 strain, which also exhibits highrate, highyield growth (Figure 5B). The increased concentration of metabolites leads to a higher saturation of enzymes and ribosomes, and allows an increase of biosynthetic fluxes without a higher investment in proteins. When comparing the resource allocation strategies that predict the NCM phenotype with experimental data (Figure 5), we found some discrepancies that cannot be solely attributed to the uncertainty in the proteomics data. We therefore allowed the apparent catalytic constants of the macroreactions to vary as well, contrary to the initial model assumption, in order to account for genetic differences between strains or for regulatory mechanisms responding to physiological changes. This finetuning of the adaptation repertoire made it possible to quantitatively reproduce the highrate, highyield phenotype of NCM by means of resource allocation strategies consistent with the proteomics data (Figure 5). In comparison with the BW reference strain, a higher value of the catalytic constant corresponding to glucose uptake and metabolism was required, that is, a higher activity of glycolytic enzymes (Figure 5—figure supplement 1). Both higher enzyme saturation and higher enzyme activity point at a more efficient utilization of proteomic resources as a requirement for high rate, highyield growth.
A strategy consisting of the more efficient utilization of enzymes and ribosomes cannot be predicted by most existing models. For example, with constant metabolite concentrations and some additional simplifying assumptions, our model reduces to the wellknown model of Basan et al., 2015a, which predicts that high growth rates can only be attained at the expense of low growth yields (Appendix 1). In other words, in the absence of the possibility of a tradeoff between proteins and metabolites, our simplified model also predicts that an increase in growth rate requires a shift from energyefficient but costly respiration to energyinefficient but cheap fermentation. The model presented in this work is thus general enough to accommodate different strategies to increase the growth rate, some of which lead to a decrease in growth yield whereas others may afford an increase in growth yield by exploiting available degrees of freedom in the space of resource allocation strategies.
The main finding of this study is that the observed variability of growth rates and growth yields across different strains of a bacterial species can, to a large extent, be accounted for by a coarse-grained resource allocation model. The capability to predict the range of rates and yields achievable by a microbial species, and the possibility to relate these to underlying resource allocation strategies, is of great interest for a fundamental understanding of microbial growth. In addition, when extended with a macroreaction for the production of a protein or a metabolite of interest (Yegorov et al., 2019), the model provides rapidly exploitable guidelines for metabolic engineering and synthetic biology, by pointing at performance limits of specific strains and suggesting improvements. While instantiated for growth of E. coli, the model equations are sufficiently generic to apply to other microorganisms. The calibration of such model variants can benefit from the same hierarchical procedure as developed here, exploiting widely available proteomics and metabolomics datasets.
Materials and methods
Simulation studies
The resource allocation models were derived from a limited number of assumptions on the processes underlying microbial growth, as explained in Appendix 1. The parameters in the models were determined from literature data, as described in Appendix 2. In order to produce the plots with rate, yield, uptake, and secretion phenotypes (Figures 2–4), we uniformly sampled combinations of resource allocation parameters ${\chi}_{r}$, ${\chi}_{c}$, ${\chi}_{er}$, and ${\chi}_{ef}$ such that their sum equals $1-{\chi}_{u}$, where ${\chi}_{u}$ was sampled from a reduced interval determined from the data (Figure 2—figure supplement 1). Starting from initial conditions, the system was simulated for each combination of resource allocation parameters until a steady state was reached, and rate and yield were computed from the fluxes and concentrations at steady state (Figure 2—figure supplement 3).
When sampling the space of initial conditions for a given resource allocation strategy, the system was found to always reach the same steady state. Whereas every strategy thus gives rise to a unique rate-yield phenotype, the inverse is not true: different strategies can account for a given growth rate and growth yield. An intuitive explanation can be obtained from inspection of Equations 1 and 2. A given rate-yield phenotype fixes the substrate uptake rate ${v}_{mc}$ and the sum ${v}_{mer}+{\rho}_{mef}\,{v}_{mef}+({\rho}_{ru}-1)\,({v}_{r}+{v}_{mu})$, representing the loss of carbon due to CO_{2} outflow and acetate secretion. Different resource allocation strategies, and hence different protein and metabolite concentrations, can lead to fluxes that add up to the latter sum, and thus enable the cells to grow at the specified rate and yield (Figure 2—figure supplement 3). The same argument generalizes to combined rate-yield and uptake-secretion phenotypes.
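The sampling step can be sketched as follows. This is a minimal illustration in Python rather than the Matlab code used for the paper: the function name and the interval for ${\chi}_{u}$ are placeholders, and the uniform sampling of the four remaining fractions is implemented here by normalizing exponential draws, one standard way of sampling uniformly on a simplex that is assumed rather than taken from the source.

```python
import random


def sample_strategy(chi_u_range, rng=random):
    """Draw one strategy (chi_u, chi_r, chi_c, chi_er, chi_ef) summing to 1."""
    # chi_u is drawn from a reduced interval (placeholder bounds supplied by caller).
    chi_u = rng.uniform(*chi_u_range)
    # Normalized exponential draws are uniformly distributed on the simplex,
    # i.e. they follow a Dirichlet(1, 1, 1, 1) distribution.
    draws = [rng.expovariate(1.0) for _ in range(4)]
    total = sum(draws)
    chi_r, chi_c, chi_er, chi_ef = [(1.0 - chi_u) * d / total for d in draws]
    return {"chi_u": chi_u, "chi_r": chi_r, "chi_c": chi_c,
            "chi_er": chi_er, "chi_ef": chi_ef}


# Hypothetical interval for chi_u; the actual interval is determined from data.
random.seed(0)
strategies = [sample_strategy((0.5, 0.7)) for _ in range(1000)]
```

Each sampled strategy would then be passed to the model, which is integrated to steady state to obtain the corresponding rate, yield, uptake, and secretion values.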
All simulations were carried out by means of Matlab R2020b. The models and the simulation code used for generating all figures in the paper are available at https://gitlab.inria.fr/baldazzi/coliallocation.
Computation of rates and yields from published experimental data
The rate-yield database was compiled from the experimental literature (Supplementary files 1 and 2). Growth rates have unit hr^{-1} and growth yields were converted to the dimensionless quantity $\mathrm{Cmmol}_{\text{biomass}}\ \mathrm{Cmmol}_{\text{substrate}}^{-1}$ by means of appropriate conversion constants. Most publications report yields with unit gDW $\mathrm{mmol}_{\text{substrate}}^{-1}$, that is, as the ratio of the growth rate with unit hr^{-1} and the substrate uptake rate with unit $\mathrm{mmol}_{\text{substrate}}$ gDW^{-1} hr^{-1}. If yields were not explicitly reported, they were computed in this way from the reported growth rate and substrate uptake rate. In order to convert $\mathrm{mmol}_{\text{substrate}}$ to $\mathrm{Cmmol}_{\text{substrate}}$, we multiplied the former by the number of carbon atoms in the substrate molecule (six for glucose, three for glycerol). In order to convert gDW to $\mathrm{Cmmol}_{\text{biomass}}$, we used the consensus value for the biomass density $1/\beta $, 40.65 $\mathrm{Cmmol}_{\text{biomass}}$ gDW^{-1} (Appendix 2). Some substrate uptake rates, in particular for the NCM3722 strain, were expressed in units $\mathrm{mM}_{\text{substrate}}$ OD^{-1} hr^{-1}. We used strain-specific and, when possible, laboratory-specific conversion constants from optical density (OD) to gDW L^{-1}, notably the value 0.49 gDW L^{-1} OD^{-1} for NCM3722 (Basan et al., 2015a). Acetate secretion rates reported in $\mathrm{mmol}_{\text{acetate}}$ gDW^{-1} hr^{-1} or $\mathrm{mM}_{\text{acetate}}$ OD^{-1} hr^{-1} were converted to unit Cmmol gDW^{-1} hr^{-1} using the same procedure.
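The unit conversions described above amount to a few multiplications, sketched below. The constants 40.65 and 0.49 and the carbon counts for glucose and glycerol are taken from the text; the carbon count of two for acetate, the function names, and the example input values are additions made here for illustration.

```python
CMMOL_BIOMASS_PER_GDW = 40.65  # biomass density 1/beta quoted in the text
CARBON_ATOMS = {"glucose": 6, "glycerol": 3, "acetate": 2}


def rate_to_cmmol(rate_mmol_per_gdw_hr, compound):
    """Convert mmol gDW^-1 hr^-1 to Cmmol gDW^-1 hr^-1."""
    return rate_mmol_per_gdw_hr * CARBON_ATOMS[compound]


def rate_od_to_cmmol(rate_mM_per_od_hr, gdw_per_l_per_od, compound):
    """Convert mM OD^-1 hr^-1 to Cmmol gDW^-1 hr^-1.

    mM is mmol L^-1, so dividing by the OD-to-biomass constant
    (gDW L^-1 OD^-1) gives mmol gDW^-1 hr^-1.
    """
    return rate_to_cmmol(rate_mM_per_od_hr / gdw_per_l_per_od, compound)


def carbon_yield(growth_rate_per_hr, uptake_mmol_per_gdw_hr, substrate):
    """Dimensionless yield in Cmmol biomass per Cmmol substrate."""
    yield_gdw_per_mmol = growth_rate_per_hr / uptake_mmol_per_gdw_hr
    return yield_gdw_per_mmol * CMMOL_BIOMASS_PER_GDW / CARBON_ATOMS[substrate]


# Illustrative inputs: a growth rate of 0.97 hr^-1 with a glucose uptake of
# 11 mmol gDW^-1 hr^-1 (equivalent to 66 Cmmol gDW^-1 hr^-1).
example_yield = carbon_yield(0.97, 11.0, "glucose")
# Hypothetical OD-based uptake rate, converted with the 0.49 gDW L^-1 OD^-1 constant.
example_uptake = rate_od_to_cmmol(5.39, 0.49, "glucose")
```

With these inputs the yield evaluates to about 0.60 and the uptake rate to about 66 Cmmol gDW^{-1} hr^{-1}, in line with the values quoted for the NCM strain in the main text.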
Computation of resource allocation strategies from proteomics data
The observed resource allocation strategies for the BW25113, MG1655, and NCM3722 strains were computed by means of the proteomics data in Table S11 of Schmidt et al., 2016. We computed the mass fraction for each protein category distinguished in the model by associating the latter with specific COG groups ($r/p\to $ amino acid transport and metabolism and translation; ${m}_{c}/p\to $ carbohydrate transport and metabolism; $({m}_{er}+{m}_{ef})/p\to $ energy production and conversion; ${m}_{u}/p\to $ all other COG groups). The mass fraction of enzymes in energy metabolism was further subdivided into fractions attributed to respiration and fermentation, ${m}_{er}/p$ and ${m}_{ef}/p$, in the same way as for model calibration, by distinguishing enzymes specific to fermentation, enzymes specific to respiration, and enzymes shared between respiration and fermentation (Basan et al., 2015a, and Supplementary file 4). The resource allocation strategy during balanced growth $({\chi}_{u},{\chi}_{r},{\chi}_{c},{\chi}_{er},{\chi}_{ef})$ was equated with the corresponding mass fractions.
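The mapping from COG groups to model categories can be sketched as a simple aggregation. The code below is an illustration under stated assumptions: the input is taken to be a list of (COG group, protein mass) pairs, the group labels follow the shorthand used in the text rather than the exact names in the proteomics table, and the further split of energy metabolism into respiration and fermentation is omitted.

```python
# COG group (shorthand from the text) -> protein category in the model.
COG_TO_CATEGORY = {
    "amino acid transport and metabolism": "r",
    "translation": "r",
    "carbohydrate transport and metabolism": "mc",
    "energy production and conversion": "me",  # m_er + m_ef, not split here
}


def mass_fractions(records):
    """Aggregate (cog_group, mass) pairs into mass fractions per category.

    Proteins in any COG group not listed above are assigned to 'mu'.
    """
    totals = {"r": 0.0, "mc": 0.0, "me": 0.0, "mu": 0.0}
    for cog_group, mass in records:
        category = COG_TO_CATEGORY.get(cog_group.strip().lower(), "mu")
        totals[category] += mass
    total_protein = sum(totals.values())
    return {category: m / total_protein for category, m in totals.items()}


# Made-up masses in arbitrary units, for illustration only.
toy_records = [
    ("Translation", 30.0),
    ("Amino acid transport and metabolism", 10.0),
    ("Carbohydrate transport and metabolism", 8.0),
    ("Energy production and conversion", 12.0),
    ("Cell motility", 40.0),
]
fractions = mass_fractions(toy_records)
```

For the toy input the fractions are 0.40 for `r`, 0.08 for `mc`, 0.12 for `me`, and 0.40 for `mu`; by construction they sum to 1, as required of a resource allocation strategy.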
Appendix 1
Model equations
Modeling assumptions
The coarse-grained resource allocation model of coupled carbon and energy fluxes generalizes and elaborates upon previous models of microbial growth (Scott et al., 2010; Giordano et al., 2016; Basan et al., 2015a; Zavřel et al., 2019). It is based on a partitioning of the cellular proteome into five major categories:
Ribosomes and translation-affiliated proteins, including enzymes in amino acid metabolism, that are necessary for protein synthesis.
Enzymes in central carbon metabolism that are responsible for carbohydrate uptake and metabolism, leading to central carbon metabolites that fuel biosynthesis and ATP production pathways.
Enzymes in energy metabolism that are responsible for transferring (free) energy from carbohydrate substrates to small energy cofactors like ATP, NADH, and NADPH. This category is further subdivided into enzymes for aerobic respiration and fermentation, respectively.
Other proteins that do not fall within one of the above-mentioned categories. This category includes, for example, proteins involved in the synthesis of RNA and DNA, cell-cycle proteins, and a variety of housekeeping functions.
The partitioning is different from that found in some other coarse-grained models of microbial growth, as discussed in the section Model variant with an additional growth-rate-independent protein category below.
In addition to the above proteins, we distinguish two intracellular metabolite categories:
Central carbon metabolites, that is, catabolic products of the carbohydrate substrate (glucose, glycerol, etc.) taken up from the medium. Central carbon metabolites include intermediates of the glycolysis pathway, the TCA cycle, and the pentose phosphate pathway, notably the 13 precursor metabolites from which the building blocks for macromolecules (amino acids, nucleotides, etc.) are produced (Schaechter et al., 2006). Central carbon metabolites can be stored in the form of glycogen or other storage compounds.
Energy cofactors driving the synthesis of proteins and other macromolecules, occurring in both their higherenergy form (ATP, NADH, NADPH, etc.) and lowerenergy form (ADP, NAD+, NADP+, etc.). Here, we restrict ourselves to the principal energy cofactors ATP and ADP, exploiting the fact that in aerobic conditions NADH and NADPH can be converted to ATP (Basan et al., 2015a; Gottschalk, 1986).
In addition to proteins and metabolites, we have:
Other macromolecules, notably including RNA, DNA, and lipids forming the cell membrane.
The cellular biomass consists of the sum of the above categories, that is, it includes proteins, metabolites, and other macromolecules, contrary to most other models which equate biomass with proteins. For reasons of simplicity, energy cofactors are not included as a separate category in the biomass. This is motivated by the fact that the total biomass fraction of ATP, ADP, NADH, NAD+, etc. is negligible (<1%, Appendix 2). As a consequence, the model does not explicitly account for their synthesis from central carbon intermediates, but only represents their role in the flow of energy through the different macroreactions.
The following macroreactions interconverting the above biomass categories are distinguished in the model:
Carbon uptake and central carbon metabolism, responsible for the uptake of the carbohydrate substrate from the medium and its conversion into metabolic precursors for amino acid biosynthesis and energy metabolism.
Energy metabolism for the regeneration of energy cofactors (conversion of ADP into ATP) through the respiration or fermentation of central carbon intermediates. In the former case, carbon leaves the cell in the form of CO_{2}, whereas both acetate and CO_{2} are produced in the second case.
Protein synthesis involving the biosynthesis and polymerization of amino acids, a process driven by ATP and releasing CO_{2}.
Synthesis of other macromolecules, like RNA and DNA, which consumes precursors from central metabolism and ATP, and releases CO_{2}.
The total protein synthesis rate is divided over the different protein categories enumerated above, according to fractional resource allocation parameters. Together, these parameters define the resource allocation strategy of the cell and determine the growth rate and growth yield in a given environmental condition.
The model includes two macroreactions producing ATP (respiration and fermentation) and two macroreactions consuming ATP (synthesis of proteins and other macromolecules). The ATP produced and consumed in central carbon metabolism is accounted for in the ATP balance of the other macroreactions. For example, the net ATP consumption attributed to protein synthesis does not only include the ATP costs of amino acid polymerization, but also ATP consumption and production required for amino acid synthesis (Kaleta et al., 2013). The same holds for the production of ATP by energy metabolism (Basan et al., 2015a).
Much of the carbon taken up and the ATP produced by microbial cells does not directly contribute to growth but is used for maintenance. Maintenance is a broad concept that includes, among other things, the turnover of macromolecules, osmoregulation, motility, and energy spilling (van Bodegom, 2007). The first type of maintenance costs distinguished in the model are the resources needed to compensate for the degradation of biomass, in particular macromolecules. As a consequence of biomass degradation, cells require a minimal substrate uptake rate above which net growth of the population starts. In Appendix 2, we show that biomass degradation in our model is structurally equivalent to the so-called maintenance coefficient in the Pirt model (Pirt, 1965). The second form of maintenance considered is energy dissipation. This refers to the sizable fraction of ATP that is not consumed for macromolecular synthesis but invested in other cellular processes that are not explicitly modeled, such as motility and the regulation of osmotic pressure, or that is apparently spilled (Russell and Cook, 1995).
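As a minimal sketch of how a maintenance coefficient produces a minimal substrate uptake rate, the classical Pirt relation can be written as q_s = μ/Y_max + m_s. The function names and the numerical values used in the checks are hypothetical and are not parameters of the model in this paper.

```python
def pirt_uptake_rate(mu, y_max, m_s):
    """Substrate uptake rate in the Pirt relation: q_s = mu / y_max + m_s.

    mu    : specific growth rate
    y_max : maximum (maintenance-free) growth yield
    m_s   : maintenance coefficient (uptake rate needed at zero growth)
    """
    return mu / y_max + m_s


def apparent_yield(mu, y_max, m_s):
    """Observed yield mu / q_s, which stays below y_max when m_s > 0."""
    return mu / pirt_uptake_rate(mu, y_max, m_s)
```

At zero growth the uptake rate equals the maintenance coefficient, and the apparent yield approaches Y_max only when the growth rate is large relative to the maintenance demand.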
Derivation of model equations
A schematic representation of microbial growth is shown in Appendix 1—figure 1, illustrating the modeling assumptions discussed above. Here, we derive a mathematical model from these assumptions following a number of basic steps outlined previously (de Jong et al., 2017). We first define extensive variables for quantities and rates, then normalize these with respect to the mass of the growing microbial population, assuming that the biomass density is constant (Basan et al., 2015a). This will lead to intensive variables denoting concentrations and specific reaction rates, as well as matching expressions of growth rate and growth yield in terms of these rates.
Carbohydrates in the medium are taken up and metabolized by the cellular population at a rate ${V}_{mc}$, a macroreaction that is controlled by enzymes with a total quantity equal to ${M}_{c}$. The resulting central carbon metabolites having a quantity $C$ are used to produce ATP and synthesize proteins and other macromolecules. More specifically, two alternative ATP-producing pathways are considered: respiration at a rate ${V}_{mer}$, catalyzed by enzymes with a quantity ${M}_{er}$, and fermentation at a rate ${V}_{mef}$, catalyzed by enzymes with a quantity ${M}_{ef}$. Synthesis of proteins and other macromolecules occurs at rates ${V}_{r}$ and ${V}_{mu}$, respectively, and is catalyzed by ribosomes and other proteins with quantities $R$ and ${M}_{u}$, respectively. The protein and metabolite quantities are expressed in units mmol of carbon (Cmmol) and the rates in units Cmmol hr^{-1}.
ADP and ATP, at total quantities $A$ and ${A}^{*}$ [mmol], respectively, are permanently recycled through the ATP production and the biosynthesis pathways. CO_{2} is released by the cell through respiration, but also as a byproduct of the biosynthetic reactions and fermentation. The latter CO_{2} outflux is accounted for in the carbon balance through the (dimensionless) correction factors ${\rho}_{ru}$ and ${\rho}_{mef}$, respectively. The correction factors express that CO_{2} is a byproduct of the synthesis of proteins and other macromolecules (${\rho}_{ru}$) and acetate (${\rho}_{mef}$). The loss of CO_{2} adds to the total flux of carbon through these macroreactions, which makes ${\rho}_{ru}>1$ and ${\rho}_{mef}>1$. All biomass components are subject to degradation at a rate $\gamma $ [hr^{-1}].
The time evolution of the total quantity of each biomass component in the growing population can now be written as follows:
where ${\chi}_{u},{\chi}_{r},{\chi}_{c},{\chi}_{er},{\chi}_{ef}$ are dimensionless resource allocation parameters, such that ${\chi}_{u}+{\chi}_{r}+{\chi}_{c}+{\chi}_{er}+{\chi}_{ef}=1$.
The time evolution of the total quantity of protein $P={M}_{u}+R+{M}_{c}+{M}_{er}+{M}_{ef}$ is obtained by summing the differential equations for the different protein categories:
We define the total cellular biomass $B$ [gDW] as $B=\beta \,({M}_{u}+R+{M}_{c}+{M}_{er}+{M}_{ef}+C+U)$,
where $1/\beta $ is the biomass carbon content [Cmmol gDW^{-1}]. Recall that ATP and ADP are not included in the biomass.
Assuming that the volume of the growing microbial population is proportional to the biomass (Basan et al., 2015a), we transform the above quantities into concentrations by dividing by the total biomass $B:{m}_{u}={M}_{u}/B,\phantom{\rule{thinmathspace}{0ex}}{m}_{c}={M}_{c}/B,\phantom{\rule{thinmathspace}{0ex}}{m}_{er}={M}_{er}/B,\phantom{\rule{thinmathspace}{0ex}}{m}_{ef}={M}_{ef}/B,\phantom{\rule{thinmathspace}{0ex}}r=R/B,\phantom{\rule{thinmathspace}{0ex}}c=C/B,\phantom{\rule{thinmathspace}{0ex}}u=U/B$. Accordingly, the concentration variables have units Cmmol gDW^{-1} and the total biomass concentration is given by $1/\beta $.
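The normalization step can be illustrated with a short sketch. It assumes that the biomass is β times the summed carbon content of proteins, central carbon metabolites, and other macromolecules, as described in the text. The function name and the quantities used in the checks are hypothetical.

```python
def to_concentrations(quantities_cmmol, beta):
    """Convert extensive quantities [Cmmol] into concentrations [Cmmol per gDW].

    quantities_cmmol : dict mapping a biomass component name to its quantity
    beta             : inverse of the biomass carbon content [gDW per Cmmol]
    Returns (biomass in gDW, dict of concentrations).
    """
    total_carbon = sum(quantities_cmmol.values())  # Cmmol
    biomass = beta * total_carbon                  # gDW (assumed definition)
    concentrations = {name: q / biomass for name, q in quantities_cmmol.items()}
    return biomass, concentrations
```

Whatever the composition, the concentrations returned by this sketch sum to 1/β, which is the constant total biomass concentration referred to above.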
The dynamics of the concentration variables is described by the following system of differential equations:
The (specific) growth rate μ [hr^{-1}] is defined as the relative biomass increase of the cell, $\mu =(1/B)\,dB/dt$,
so that the last term in the preceding equations describes dilution by growth. Furthermore, defining ${v}_{mc}={V}_{mc}/B$, ${v}_{me}={V}_{me}/B$, ${v}_{r}={V}_{r}/B$, and ${v}_{mu}={V}_{mu}/B$ as the reaction rates per unit of biomass (volume) [Cmmol hr^{-1} gDW^{-1}], we obtain
In addition to the flow of carbon through the system, two equations describe energy transfer due to the production and consumption of ATP. We define, analogously to the other concentration variables, ${a}^{*}={A}^{*}/B$ and $a=A/B$, with units mmol gDW^{-1}. The energy and mass flows are coupled via the following balance equations
where ${n}_{mer}$ and ${n}_{mef}$ represent the ATP yield of the two ATP production pathways (with ${n}_{mer}>{n}_{mef}$, i.e. respiration has a higher yield than fermentation), and ${n}_{mu}$ and n_{r} the ATP costs of biomass and protein synthesis, respectively. The reaction rate v_{d} accounts for energy dissipation, that is, the fact that around half of the ATP produced is not utilized for macromolecular synthesis but dissipated in other cellular processes (Russell and Cook, 1995; Feist et al., 2007).
Since $d{a}^{*}/dt=-da/dt$, the total concentration of the energy cofactors (pool of $a$ and ${a}^{*}$) is equal to some constant a_{0} [mmol gDW^{-1}], that is, $a+{a}^{*}={a}_{0}$,
in agreement with experiments in which usually little variation in the concentration of energy cofactors is observed (Petersen and Møller, 2000; Schneider and Gourse, 2004). Given the dependency between ${a}^{*}$ and $a$, we omit the differential equation of the latter.
The model variables and rates are summarized in Appendix 1—table 1.
Using the definition of total biomass (Equation 14), we can express the growth rate μ as a function of the reaction rates as follows:
Note that the total macromolecular synthesis rate is multiplied by ${\rho}_{ru}-1$ rather than ${\rho}_{ru}$, expressing that only the additional CO_{2} outflux is lost to biomass synthesis.
The nondimensional growth yield is defined as the ratio between the net biomass synthesis rate ($\mu /\beta $) and the carbon uptake rate ${v}_{mc}$, which leads to the following expression:
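The verbal definition of the yield can be sketched as a one-line function. Only the ratio stated in the text is implemented. The expansion of μ in terms of the individual reaction rates is not reproduced here, and the numerical values in the checks are hypothetical.

```python
def growth_yield(mu, beta, v_mc):
    """Dimensionless growth yield: net biomass synthesis rate over carbon uptake rate.

    mu   : growth rate [1/hr]
    beta : inverse of the biomass carbon content [gDW per Cmmol]
    v_mc : carbon uptake rate [Cmmol per hr per gDW]
    """
    if v_mc <= 0:
        raise ValueError("the carbon uptake rate must be positive")
    return (mu / beta) / v_mc
```

With a hypothetical carbon content of 40 Cmmol per gDW, a growth rate of 0.6 per hour and an uptake rate of 48 Cmmol per hour per gDW give a yield of 0.5, meaning that half of the carbon taken up ends up in biomass.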
We use Michaelis-Menten kinetics to define the rates of the macroreactions:
where $S$ denotes the concentration of the substrate in the medium [Cmmol L^{-1}], ${K}_{mc}$, ${K}_{r}$, ${K}_{ar}$, ${K}_{mu}$, ${K}_{amu}$, ${K}_{mer},{K}_{amer},{K}_{mef},{K}_{amef}$ half-saturation constants [Cmmol gDW^{-1}] and [mmol gDW^{-1}], and ${k}_{mc}$, k_{r}, ${k}_{mu}$, ${k}_{mer}$, ${k}_{mef}$ maximum catalytic rate constants [hr^{-1}]. As can be seen, rates are proportional to enzyme concentrations, but depend nonlinearly on metabolite concentrations. During balanced growth in batch, the external substrate concentration $S$ is much higher than the half-saturation constant ${K}_{mc}$ ($S\gg {K}_{mc}$), so that Equation 35 can be approximated by ${v}_{mc}({m}_{c})={m}_{c}{e}_{s}$, where ${e}_{s}={k}_{mc}$ [hr^{-1}]. During continuous growth, the external substrate concentration $S$ is approximately constant, with the parameter e_{s} now defined as
The energy dissipation rate is defined by first-order mass-action kinetics: ${v}_{d}={k}_{d}\,{a}^{*}$,
where k_{d} [hr^{-1}] is a catalytic rate constant.
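The structure of these rate laws can be sketched as follows. The sketch assumes that each rate is the product of a catalytic constant, an enzyme concentration, and one saturation term per substrate, and that dissipation is first order in the ATP concentration. The function and argument names, and the exact product form of the protein synthesis rate, are assumptions made for illustration.

```python
def saturation(x, half_sat):
    """Michaelis-Menten saturation term x / (x + K)."""
    return x / (x + half_sat)


def uptake_rate(m_c, s, k_mc, half_sat_mc):
    """Carbon uptake rate: proportional to enzyme m_c, saturating in substrate s."""
    return k_mc * m_c * saturation(s, half_sat_mc)


def protein_synthesis_rate(r, c, a_star, k_r, half_sat_c, half_sat_a):
    """Assumed form: ribosomes r times saturation terms in precursors c and ATP a_star."""
    return k_r * r * saturation(c, half_sat_c) * saturation(a_star, half_sat_a)


def dissipation_rate(a_star, k_d):
    """First-order mass-action energy dissipation."""
    return k_d * a_star
```

When the substrate concentration is much larger than the half-saturation constant, `uptake_rate` reduces to approximately `k_mc * m_c`, which mirrors the batch approximation with e_s equal to k_mc.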
The resource allocation model of microbial growth thus becomes
with
Since it holds by Equation 14 that
we can omit the differential equations for one of the variables in the right-hand side. Given that $u$ is not playing a role in any of the kinetic rates, we usually eliminate Equation 42.
Note that in the above model, like in other resource allocation models (Erickson et al., 2017), resource allocation parameters and proteome fractions coincide at steady state. For example, from the steady-state equation for ribosomes, ${\chi}_{r}{v}_{r}=(\mu +\gamma )r$, and the steady-state equation for total proteins, ${v}_{r}=(\mu +\gamma )p$, it follows that ${\chi}_{r}=r/p$.
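The coincidence of allocation parameters and proteome fractions at steady state can be checked numerically. This sketch uses only the two steady-state relations quoted above. The function name and the values in the checks are hypothetical.

```python
def steady_state_protein_concentrations(chi, v_r, mu, gamma):
    """Steady-state concentration of each protein category.

    chi   : dict of resource allocation parameters, which must sum to 1
    v_r   : total protein synthesis rate
    mu    : growth rate
    gamma : degradation rate
    Each category obeys chi_i * v_r = (mu + gamma) * m_i at steady state.
    """
    if abs(sum(chi.values()) - 1.0) > 1e-9:
        raise ValueError("resource allocation parameters must sum to 1")
    return {name: frac * v_r / (mu + gamma) for name, frac in chi.items()}
```

Summing the returned concentrations gives the total protein concentration p = v_r/(μ + γ), so dividing any entry by that sum returns the corresponding allocation parameter.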
Model variant with an additional growth-rate-independent protein category
The model described above includes a residual category of proteins, consisting of proteins other than ribosomes and translation-affiliated proteins ($R$), enzymes in central carbon metabolism (${M}_{c}$), or enzymes in energy metabolism (${M}_{er}$ and ${M}_{ef}$). This category ${M}_{u}$ carries a flux, because it includes the machinery for the synthesis of macromolecules other than proteins, in particular RNA and DNA. Moreover, we allow the fraction of the proteome occupied by this category to vary with the particular resource allocation strategy adopted, and therefore with the growth rate.
The fact that the proteome fraction of ${M}_{u}$ may change with the growth rate and that it carries a flux distinguishes it from a residual category of housekeeping proteins that is found in other models of microbial growth (Scott et al., 2010; Mori et al., 2016). The latter protein category (usually indicated by $Q$) is not accessible to growth-rate-dependent proteome adjustments and carries no flux. Its size can be determined in different ways, most rigorously as the sum of the offsets of the linear relation between growth rate and proteome fraction of the individual protein categories (Hui et al., 2015).
We developed a variant of the model used in this study that includes such a growth-rate-independent category $Q$. First of all, for each of the other protein categories, we distinguished a growth-rate-independent and a growth-rate-dependent part, indicated by the superscripts 0 and μ, respectively. For example, for ribosomes and translation-affiliated proteins, we have $R={R}^{0}+{R}^{\mu}$. Second, we defined $Q$ as consisting of the growth-rate-independent parts of the other protein categories:
Following these notations, the total cellular biomass $B$ [gDW] is now defined as
where in what follows we drop the superscripts for the growth-rate-dependent parts of the protein categories. Notice that, like in the reference model, ATP and ADP are not included in the biomass.
Following the same steps as for the reference model, a system of ordinary differential equations can be derived. The only differences with Equations 41–49 are that an additional equation for the category $Q$ is added:
Moreover, the sum of biomass components is given by
and the sum of resource allocation parameters is extended with ${\chi}_{q}$:
Note that, while the model has a structure very similar to that of the reference model of Equations 41–49, the interpretation of the protein concentrations m_{c}, $r$, ${m}_{er}$, ${m}_{ef}$, and m_{u} has changed: instead of denoting the total enzyme and ribosome concentrations, they now refer to the growth-rate-dependent part of these concentrations.
Comparison with other coarse-grained resource allocation models
The model of Figure 1 differs in several assumptions from previously proposed resource allocation models of microbial growth. We summarize these differences below, focusing the comparison on coarse-grained models. That is, we do not consider fine-grained models on the genome scale used in constraint-based analysis (Cheng et al., 2019; Adadi et al., 2012; Mori et al., 2016; Reimers et al., 2017; Wortel et al., 2018).
A first class of models takes into account either the carbon or energy balance, but not both (Molenaar et al., 2009; Scott et al., 2010; Scott et al., 2014; Maitra and Dill, 2015; Giordano et al., 2016; Weiße et al., 2015; Bosdriesz et al., 2015; Erickson et al., 2017; Towbin et al., 2017; Dourado and Lercher, 2020; Mairet et al., 2021). Typical examples are the classical model of Scott et al., 2010, which describes mass flow from substrate to different categories of proteins, and the model of Maitra and Dill, 2015, which provides a balance of ATP produced from the substrate and ATP consumed for protein synthesis. These models have successfully reproduced the ribosomal growth law, that is, the linear relation between growth rate and the ribosomal protein fraction, and other empirical regularities. However, apart from the presence of an occasional dissipation term, all substrate is used for biomass synthesis. Therefore, the growth yield as defined by Equation 2 does not vary with resource allocation. For our purpose, we need to be able to take into account that the use of substrate for ATP production is accompanied by the outflow of CO_{2} and the secretion of acetate, thus lowering the growth yield.
A second class of models takes into account the coupling of the carbon and energy balances, but describes the latter as fluxes of carbon and energy without specifying the underlying reaction kinetics (Basan et al., 2015a; Mori et al., 2019). For example, in the model of Basan et al., 2015a, fluxes in energy metabolism are modeled as the product of the proteome fraction of enzymes in respiration or fermentation multiplied by a corresponding efficiency coefficient. The energy coefficients express the ATP yield per unit of protein in the respiration and fermentation pathways, respectively. The coefficients are constant and therefore cannot express differences in the utilization of enzymes depending on the concentrations of central carbon metabolites and energy cofactors. These concentrations may change with the resource allocation strategy and lead to a higher saturation of enzymes, which we hypothesized as an explanation for high-rate, high-yield growth of E. coli. In addition, this category of models equates biomass with proteins, like the other models cited above. This does not allow the total protein concentration to vary and a tradeoff between protein and metabolite concentrations to occur. In the next section, we precisely define the additional modeling assumptions that allow our model to be reduced to the model of Basan et al., 2015a.
A third class of models does provide a kinetic description of all fluxes in the model and does include metabolites in the biomass definition, although ignoring other macromolecules (Zavřel et al., 2019; Faizi et al., 2018). The model of Zavřel et al., 2019, is closest to our model, but since it describes growth of cyanobacteria, it does not include alternative ATP production pathways and therefore does not account for differences in growth yield depending on the investment of cellular resources in respiration or fermentation. Moreover, the analysis of this model is focused on accounting for the experimentally observed growth rate of cyanobacteria under different light intensities. This has motivated the choice to look for resource allocation strategies optimizing the growth rate for each light intensity rather than scanning the space of possible resource allocation strategies in order to predict the variability of rate-yield phenotypes.
The model presented in this work could be further extended by taking into account additional features of some of the models cited above. For example, instead of treating resource allocation strategies as an input to the model (Figure 2—figure supplement 2), they could be defined as a function of the bacterial physiology, for example, translation activity (Scott et al., 2014; Maitra and Dill, 2015; Giordano et al., 2016; Weiße et al., 2015; Bosdriesz et al., 2015; Erickson et al., 2017; Towbin et al., 2017). This would allow, among other things, to account for the adaptation of resource allocation during dynamic transitions between states of balanced growth. As another example, our model could be extended to allow the uptake of alternative carbon sources (Erickson et al., 2017; Towbin et al., 2017), which would allow the modeling of diauxic growth behavior. The short summary in this section describes the main differences between the model of Figure 1 and some major previous work, but cannot do complete justice to the rich diversity of results in the literature. We refer to article-length reviews on coarse-grained resource allocation models and microbial growth for more extensive information (Scott et al., 2014; Kafri et al., 2016; de Jong et al., 2017; Bruggeman et al., 2020).
Simplified coarse-grained resource allocation models
In this section, we discuss two simplifications of the model that (i) allow its predictions to be analyzed along the Pareto frontier of growth rate and growth yield in Figure 2, in order to explore the relation between resource allocation and growth, and (ii) allow the predictions to be compared with previous work, in particular the model of Basan et al., 2015a.
Model simplification and analysis along the Pareto frontier
We analyze the model of Equations 41–50, with the reaction rates given by Equations 33–40, at steady state, after making a number of simplifying assumptions that are valid along the Pareto frontier of growth rate and growth yield shown in Figure 2. Using this simplified model, the decrease of the maximum yield with increasing growth rate can be traced back to qualitative changes in resource allocation parameters.
First, we exploit the fact that the contribution to the carbon balance of CO_{2} loss during macromolecular synthesis is negligible along the Pareto frontier, that is, ${v}_{mc}\gg ({\rho}_{ru}-1)({v}_{r}+{v}_{mu})$ (Figure 2—figure supplement 4C). Second, we exploit the fact that the degradation of macromolecules is negligible at high growth rates, that is, $\gamma \ll \mu $ (Figure 4—figure supplement 1A). Third, for the maximum yields along the Pareto frontier, the contribution of fermentation to energy production is negligible (${v}_{mef}\approx 0$, Figure 2—figure supplement 4). This leads to a simplified definition of growth rate (Equation 33):
Fourth, over most of the rateyield phenotype space, and a fortiori along the Pareto frontier, the rate of synthesis of other macromolecules is strongly coupled to the rate of protein synthesis (Figure 2—figure supplement 4C). In other words, ${v}_{mu}\approx {\alpha}_{1}{v}_{r}$, where ${\alpha}_{1}<1$ is a positive constant. Fifth, in a similar way, the rate of ATP spilling is strongly coupled to the rate of ATP production, that is, ${v}_{d}\approx {\alpha}_{2}{v}_{mer}$, with ${\alpha}_{2}<1$ a positive constant. This leads to the following simplified energy balance (Equation 30):
Moreover, the assumptions lead to the following simplification of the biomass composition (Equation 50):
and the resource allocation constraint:
Sixth, we note that ${\chi}_{u}$, and therefore ${m}_{u}/p$, are approximately constant at their minimal possible value (Figure 2—figure supplement 4D and Figure 2—figure supplement 1). Finally, seventh, while the concentrations of central metabolites and ADP vary along the Pareto frontier, we observe that the Michaelian terms in the rate equations in which $c$ and $a$ occur are approximately constant, contrary to the term for ${a}^{*}$ (Figure 2—figure supplement 4B). This leads to simplified expressions for the rate equations of ATP production and consumption (Equation 36 and Equation 38):
where ${k}_{r}^{\prime},{k}_{mer}^{\prime}$ are lumped constants incorporating the effect of the metabolite and energy cofactor concentrations on the reaction rates.
With the above simplifications, it becomes possible to explicitly relate the observed increase in growth rate ($\mu \uparrow $) and decrease in growth yield ($Y\downarrow $) to underlying changes in the resource allocation parameters, due to the constraints on carbon and energy fluxes, biomass composition, and resource allocation. We first note that, by Equation 1, a decrease in growth yield ($Y\downarrow $) must be accompanied by a decrease of the ratio of the growth rate and the substrate uptake rate: $(\mu /{v}_{mc})\downarrow $. Because $\mu \uparrow $ this necessarily implies ${v}_{mc}\uparrow $, that is, the substrate uptake rate must increase along the Pareto frontier. Furthermore, by substituting the simplified growth rate expression of Equation 56 into the yield definition, we obtain the expression
where $Y\downarrow $ implies $({v}_{mer}/{v}_{mc})\uparrow $, that is, the fraction of substrate dedicated to ATP production increases for higher growth rates along the Pareto frontier. Because ${v}_{mc}\uparrow $, it must also hold that ${v}_{mer}\uparrow $. With the simplified energy balance of Equation 57, ${v}_{mer}\uparrow $ implies ${v}_{r}\uparrow $. Moreover, from the proportionality of ${v}_{mer}$ and v_{d}, it follows that ${v}_{d}\uparrow $ too. In summary, the flux of carbon underlying microbial growth increases with higher growth rate along the Pareto frontier, as verified in Figure 2—figure supplement 4C.
Does the increase in protein synthesis rate lead to a higher (total) protein concentration? The answer is less straightforward than might be thought, because under conditions of balanced growth the protein synthesis rate equals growth dilution of proteins, that is, ${v}_{r}=\mu p$. Both v_{r} and μ increase, so the direction of increase of $p$ is not obvious from this equation. However, note that with the simplified energy balance of Equation 57, the growth yield equation of Equation 62 can be rewritten as
which with $Y\downarrow $ implies $({v}_{r}/{v}_{mc})\uparrow $. Now, because $({v}_{mc}/\mu )\uparrow $, and $({v}_{r}/{v}_{mc})({v}_{mc}/\mu )={v}_{r}/\mu $, it follows that $({v}_{r}/\mu )\uparrow $, and therefore $p\uparrow $. That is, in order to facilitate the higher flux of carbon through the bacteria, a higher protein concentration is needed. By the constant total biomass concentration (Equation 58), this directly implies that the concentration of central carbon metabolites must decrease ($c\downarrow $). In other words, the tradeoff between rate and yield along the Pareto frontier is accompanied by a tradeoff between proteins and metabolites. Because the concentration of central carbon metabolites remains largely saturating (Figure 2—figure supplement 4B), however, the decrease of the concentration does not much affect its driving force in the reactions of energy metabolism and macromolecular synthesis.
How do the concentrations of the individual protein classes change when the growth rate increases along the Pareto frontier? With the definition of the substrate uptake rate, ${v}_{mc}={m}_{c}{e}_{s}$, we immediately find that ${v}_{mc}\uparrow $ implies ${m}_{c}\uparrow $. From the simplified rate equation for energy metabolism (Equation 61) it also follows that ${m}_{er}\uparrow $. Determining the direction of change of the ribosome concentration is less straightforward. Note that the simplified rate equation can be rewritten as follows:
Because ${v}_{r}\sim {v}_{mer}\sim {v}_{d}\sim {a}^{*}$, the ratio ${v}_{r}/{a}^{*}$ remains constant for increasing μ along the Pareto frontier. Because ${v}_{d}\uparrow $, we have ${a}^{*}\uparrow $, so that it follows that $r\uparrow $. In conclusion, not only the total protein concentration, but the concentrations of all enzymes and the ribosomes increase (Figure 2—figure supplement 4A).
The fact that the steady-state concentration of a protein category increases does not imply that the corresponding resource allocation parameter also increases. Since the total protein concentration $p$ increases, even for constant resource allocation, the concentration of the protein category increases. This is the case for the category of other proteins: ${\chi}_{u}$ is constant, so that with ${m}_{u}={\chi}_{u}p$, it follows that ${m}_{u}\uparrow $. Dividing Equations 60 and 61 by $p$, we obtain the following expressions:
From the energy balance, we find that ${v}_{mer}/p$ must change in the same direction as ${v}_{r}/p=\mu $, that is, increase along the Pareto frontier. As a consequence, ${\chi}_{er}\uparrow $. Since both $\mu \uparrow $ and ${a}^{*}\uparrow $, the second equation does not unambiguously fix the direction of change of ${\chi}_{r}$, which depends on the ratio of μ and ${a}^{*}/({a}^{*}+{K}_{ar})$. In particular, if this ratio remains constant, then ${\chi}_{r}$ also remains constant, whereas if the ratio increases, then ${\chi}_{r}\uparrow $. Figure 2—figure supplement 4B shows that these two cases both occur along the Pareto frontier. ${\chi}_{r}$ remains constant for a large range of growth rates: the ribosome concentration nevertheless increases due to the higher total protein concentration. This is not sufficient for the highest growth rates, however, where ${\chi}_{r}$ needs to increase as well to sustain the higher flux of carbon through the bacteria. In both cases, however, the resource allocation constraint of Equation 59 forces ${\chi}_{c}$ to decrease (Figure 2—figure supplement 4D). That is, whereas the concentration of m_{c} increases, the fraction of resources devoted to the uptake and metabolism of the carbon source decreases, so as to free resources for energy metabolism and protein synthesis at the higher growth rate. The higher investment in protein synthesis, and the corresponding higher energy demand and CO_{2} loss through respiration, explain the lower growth yield.
The above analysis thus explicitly relates the observed change in rate and yield along the Pareto frontier with the changes in fluxes, concentrations, and resource allocation parameters shown in Figure 2—figure supplement 4. We emphasize that some of the assumptions underlying the model simplifications are specific to the Pareto frontier, such as the restriction of energy metabolism to respiration. As a consequence, accounting for a change in rate and yield in terms of changes in resource allocation may be different in other regions of the rate-yield phenotype space.
Reduction to resource allocation model of Basan et al.
We simplify the model of Equations 41–50, with the reaction rates given by Equations 33–40, to the model of Basan et al., 2015a, by making a number of additional assumptions.
First, assume that the concentrations of central carbon metabolites, energy cofactors, and other macromolecules are constant and that their contribution to the biomass balance can be ignored. This leads to the revised rate equations
where the constants ${k}_{r}^{\prime},{k}_{mer}^{\prime},{k}_{mef}^{\prime},\mathrm{\dots}$ lump the effects of the catalytic efficiency of the enzymes and the concentrations of central carbon metabolites and energy cofactors. Moreover, the assumption reduces the biomass to total protein mass:
and consequently,
where $1/{\beta}^{\prime}$ is the total protein concentration. Note that, with this simplification, the total protein concentration is constant, independently of the resource allocation strategy adopted by the cell.
A second assumption is that energy dissipation and the degradation of macromolecules can be neglected, which means that $\gamma ={k}_{d}=0$. The absence of protein degradation, together with the revised biomass definition, leads to the proportionality of growth rate and protein synthesis rate:
The absence of energy dissipation, in combination with the omission of other macromolecules, leads to a revised energy balance:
which with Equation 70 gives
and hence
Third, assume that in the mass balance for the central carbon metabolites the contributions from growth dilution and spontaneous degradation can be neglected in comparison with the utilization of these metabolites for protein synthesis. Then Equation 41 reduces to
which with the energy balance of Equation 71 can be rewritten as
bearing in mind that both ${\rho}_{mef}$ and ${n}_{mef}/{n}_{mer}$ assume values in the range 1–2. That is, the substrate uptake rate is approximately proportional to the protein synthesis rate.
Now, using the protein mass balance of Equation 69, we can express the total concentration of energy proteins as follows:
which with Equation 75 and the rate equations for the glucose uptake and protein synthesis rates give
where ${m}_{e}^{max}=1/{\beta}^{\prime}-{m}_{u}$, making the further assumption that ${m}_{u}={\chi}_{u}/{\beta}^{\prime}$ is constant, and $\alpha =(1/{e}_{s})({\rho}_{ru}{n}_{r}/{n}_{mer}+{e}_{s}/{k}_{r}^{\prime})$. Equation 78 expresses that the concentration (or equivalently for constant $1/{\beta}^{\prime}$, the fraction) of proteins involved in energy metabolism linearly decreases with the growth rate. Basan et al., 2015a, posit the same linear relationship, based on proteomics data for the NCM3722 strain (Hui et al., 2015).
When combining Equations 72 and 78, we can solve for the two unknowns ${m}_{ef}$ and ${m}_{er}$ as a function of μ:
The model is only valid in the range of growth rates where both concentrations are positive. By means of the simplified expressions for the respiration and fermentation fluxes (Equation 67), we can compute the total ATP production rate ${n}_{mer}{k}_{mer}^{\prime}{m}_{er}+{n}_{mef}{k}_{mef}^{\prime}{m}_{ef}$ using the above expressions. The ATP production rates of Basan et al., 2015a, are rescaled by using protein fractions instead of protein concentrations, which gives rise to ${J}_{E,f}\equiv {n}_{mef}{k}_{mef}^{\prime}{\beta}^{\prime}{m}_{ef}$ and ${J}_{E,r}\equiv {n}_{mer}{k}_{mer}^{\prime}{\beta}^{\prime}{m}_{er}$. Developing the expressions for ${J}_{E,f}$ and ${J}_{E,r}$ by means of Equations 79 and 80 yields equations that are equivalent to Eqs S12 and S13, respectively, of Basan et al., 2015a, after appropriately renaming the parameters (${\epsilon}_{f}={n}_{mef}{k}_{mef}^{\prime}$, ${\epsilon}_{r}={n}_{mer}{k}_{mer}^{\prime}$, $\sigma ={n}_{r}$, $b=\alpha $, and ${\varphi}_{E,max}={\beta}^{\prime}{m}_{e}^{max}$).
The model of Basan et al., 2015a, predicts a tradeoff between respiration and fermentation when the growth rate increases, because the protein cost of fermentation is lower than the protein cost of respiration, that is, ${n}_{mef}\,{k}_{mef}^{\prime}>{n}_{mer}\,{k}_{mer}^{\prime}$. This relation, which is preserved for the parameter values for growth on glucose in our model (Appendix 2—table 2), implies that when the growth rate increases, the concentration of fermentation enzymes increases at the expense of the concentration of respiration enzymes. Due to the lower protein cost of fermentation, however, the total ATP production rate increases.
As explained in the Discussion section, our model makes less stringent assumptions, which notably allows metabolite and total protein concentrations to vary with different resource allocation strategies. As a consequence, there are ways to increase the total ATP production rate without shifting resources from energyefficient but costly respiration (high ${n}_{mer}$ but low ${n}_{mer}{k}_{mer}^{\prime}$) to energyinefficient but cheap fermentation (low ${n}_{mef}$ but high ${n}_{mef}{k}_{mef}^{\prime}$). In particular, in our model, growth rate and growth yield can be simultaneously increased, by trading off proteins against metabolites, thus enabling a more efficient use of proteomic resources.
Appendix 2
Model calibration
Reference datasets and model calibration strategy
Model calibration was performed using published reference datasets with measurements of growth rates and fluxes (Haverkorn van Rijsewijk et al., 2011; Gerosa et al., 2015; Peebo et al., 2015), protein concentrations (Schmidt et al., 2016), and metabolite concentrations (Gerosa et al., 2015; Bennett et al., 2009; Park et al., 2016). The datasets concern the E. coli BW25113 strain: either batch growth in minimal medium with glucose or glycerol, or continuous growth in minimal medium with glucose. We also used auxiliary data for other strains at comparable growth rates, when necessary. Moreover, we adopted a top-down model calibration procedure, in order to enforce consistency across different data types.
Step 1
We used the total biomass density and measured biomass proportions of proteins and metabolites to derive total protein and metabolite concentrations.
Step 2
We used proteomics and metabolomics data to derive the concentrations of the different protein and metabolite categories distinguished in the model.
Step 3
We used published data to reconstruct the biomass degradation rate for growth on glucose and glycerol.
Step 4
We used the measured substrate uptake and acetate secretion rates, the growth rate, and the derived protein and metabolite concentrations to reconstruct the other metabolic fluxes from the carbon mass balance.
Step 5
We derived the kinetic parameters from literature data and from the fluxes and the concentrations obtained in the previous steps.
The above procedure does not require computational parameter fitting, since all parameters are unambiguously fixed by the data, literature information, and suitable hypotheses motivated by experimental results. We explain the procedure in detail for batch growth of the reference strain, and then summarize the results for continuous growth and for an alternative strain. In what follows, observed fluxes, growth rates, and concentrations, as well as kinetic parameters derived from this information, are denoted by a hat $\widehat{.}$ symbol.
Reconstruction of concentrations, rates, and fluxes for batch growth
Total biomass concentration $1/\beta $
The total concentration of biomass in the cell, in units Cmmol gDW^{-1}, is referred to in our model as $1/\beta $. Using the definition of yield (Equation 2 in the main text), we have $1/\beta =Y{v}_{mc}/\mu $. With the values reported by Morin et al., 2016, for the MG1655 strain, we estimate $1/\widehat{\beta}=40.65$ Cmmol gDW^{-1}.
This value is close to the theoretical value obtained from the fact that the carbon mass fraction of biomass is approximately 0.5 (Folsom and Carlson, 2015):
where CgDW refers to Cgram dry weight and the molecular weight of C equals 12.01 g mol^{-1}. Another way to determine the total biomass concentration is to use the estimated elementary biomass composition of E. coli. von Stockar and Liu, 1999, report CH_{1.77}O_{0.49}N_{0.24}, which with the molecular weights of H, O, and N yields an estimate of 40.03 Cmmol gDW^{-1}, again close to the value proposed above.
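The two theoretical estimates above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the published calibration code: the function names are illustrative, and the atomic weights are standard values that may differ slightly from those used to obtain the 40.03 Cmmol gDW^{-1} quoted in the text.

```python
# Standard atomic weights in g/mol (assumed here; the text quotes 12.01 for C).
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "O": 15.999, "N": 14.007}


def cmmol_per_gdw_from_carbon_fraction(carbon_mass_fraction: float) -> float:
    """Carbon content (Cmmol) of one gram of dry weight, given its carbon mass fraction."""
    return carbon_mass_fraction / ATOMIC_WEIGHT["C"] * 1000.0


def cmmol_per_gdw_from_formula(h: float, o: float, n: float) -> float:
    """Carbon content (Cmmol per gDW) of biomass with elemental formula C H_h O_o N_n."""
    grams_per_cmol = (
        ATOMIC_WEIGHT["C"]
        + h * ATOMIC_WEIGHT["H"]
        + o * ATOMIC_WEIGHT["O"]
        + n * ATOMIC_WEIGHT["N"]
    )
    return 1000.0 / grams_per_cmol


# Carbon mass fraction of 0.5 (Folsom and Carlson, 2015).
from_fraction = cmmol_per_gdw_from_carbon_fraction(0.5)
# Elemental composition CH1.77 O0.49 N0.24 (von Stockar and Liu, 1999).
from_formula = cmmol_per_gdw_from_formula(h=1.77, o=0.49, n=0.24)

print(f"from carbon fraction: {from_fraction:.1f} Cmmol/gDW")
print(f"from elemental formula: {from_formula:.1f} Cmmol/gDW")
```

Both routes give values of roughly 40 to 42 Cmmol gDW^{-1}, which is the sense in which the theoretical estimates are close to each other.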
Metabolite concentrations $c$, $a$, ${a}^{*}$, and a_{0}
A recent quantification of 43 abundant metabolites in the E. coli BW25113 strain growing in minimal medium with glucose or glycerol shows that these metabolites sum up to a concentration of 0.89 Cmmol gDW^{-1} and 0.69 Cmmol gDW^{-1}, respectively (Gerosa et al., 2015). When comparing the metabolites quantified by Gerosa et al. with those measured in a broader screen carried out by Park et al., 2016, we conclude that 56% of the metabolite mass is covered by the study of Gerosa et al. As a consequence, we estimate the total metabolite concentrations in growth on glucose and glycerol to be 1.6 Cmmol gDW^{-1} and 1.2 Cmmol gDW^{-1}, respectively. With the biomass density value of $1/\widehat{\beta}$, these concentrations correspond to 3.9% and 3.0% of the total biomass. The estimates correspond well to the older estimate that metabolites constitute 3.5% of the total biomass, obtained for the E. coli B/r strain growing at a rate of around 1 hr^{-1} (Neidhardt, 1996), and a more recent estimate of 2.9% (Feist et al., 2007).
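The scaling from the measured metabolite subset to the total metabolite pool can be reproduced as follows. This is a sketch under one stated assumption: the total biomass concentration is taken to be 40.65 Cmmol gDW^{-1}, the value quoted later in this appendix. The variable names are illustrative.

```python
# Assumed value of 1/beta-hat (Cmmol gDW^-1), as quoted later in this appendix.
BIOMASS_CMMOL_PER_GDW = 40.65
# Fraction of metabolite mass covered by the 43 metabolites of Gerosa et al., 2015.
COVERAGE = 0.56

# Summed concentrations of the quantified metabolites (Cmmol gDW^-1).
measured = {"glucose": 0.89, "glycerol": 0.69}

# Scale up to the total metabolite pool, then express as a biomass fraction.
total = {substrate: value / COVERAGE for substrate, value in measured.items()}
fraction = {substrate: value / BIOMASS_CMMOL_PER_GDW for substrate, value in total.items()}

for substrate in measured:
    print(
        f"{substrate}: {total[substrate]:.1f} Cmmol/gDW, "
        f"{100 * fraction[substrate]:.1f}% of biomass"
    )
```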
Analysis of the data of Gerosa et al., 2015, shows that central carbon metabolites account for 22% of the total free metabolite concentration during growth in minimal medium with glucose. We therefore estimate the concentration of the pool of central metabolites in this condition as
For growth on glycerol, the fraction of central metabolites is 17%, so that
As explained in Appendix 1, we consider pools of charged and discharged energy cofactors expressed as ATP equivalents. Following the arguments of Basan et al., 2015b, 1 NADH or 1 NADPH molecule can be converted into 2 ATP molecules. With these conversion factors, we obtain from the ATP/ADP, NADH/NAD+, NADPH/NADP+ concentrations reported by Gerosa et al., 2015, the following estimates of the concentrations of energy cofactors during growth on glucose:
The values for growth on glycerol are
Accordingly, ${\widehat{a}}_{0}=0.020\ \mathrm{mmol}\ {\mathrm{gDW}}^{-1}$ for growth on glucose, and ${\widehat{a}}_{0}=0.015\ \mathrm{mmol}\ {\mathrm{gDW}}^{-1}$ for growth on glycerol. Recall that ATP and ADP are not included in the mass balance (Appendix 1).
Protein concentrations m_{u}, $r$, m_{c}, and ${m}_{er}+{m}_{ef}$
Estimates of the total protein concentration of E. coli reported in the literature vary significantly (Milo, 2013). For example, older values for the B/r strain indicate a mass fraction of 0.55 (Neidhardt, 1996), for cells growing with a doubling time of 40 min ($\mu =1.04$ hr^{-1}). In their quantification of the NCM3722 strain, Basan et al., 2015b, report a value of 0.67 for the protein fraction of dry biomass of cells growing in batch in minimal medium with glucose at a rate of 0.99 hr^{-1}. For growth on other carbon sources at rates of 0.42–0.43 hr^{-1}, this fraction increases to 0.73–0.76. Valgepea et al., 2013, find that for glucose-limited growth in a bioreactor at a rate of 0.4 hr^{-1}, the MG1655 strain, another K-12 descendant, has a protein dry biomass fraction equal to 0.53. Milo, 2013, cites an old reference value of 0.24 g mL^{-1}, which with an estimated total (dry) biomass concentration of 0.33 g mL^{-1} yields a protein mass fraction of 0.73, in agreement with the values of Basan et al.
We based our estimates on the data from Basan et al., 2015b, who report protein dry mass fractions for batch growth in different media at different growth rates. From within the range of reported values, we chose the dry mass fractions for growth rates corresponding to the observed growth rates of the BW25113 strain in minimal medium with glucose or glycerol (Appendix 2—figure 1). This resulted in protein dry mass fractions of 0.72 (glucose) and 0.73 (glycerol). Like the carbon mass fraction of biomass, the carbon mass fraction of protein is approximately 0.5 (Supplementary table 3 in Feist et al., 2007). As a consequence, the above protein dry mass fractions also denote the protein fractions of the total biomass concentration expressed in units Cmmol gDW^{-1}.
In our model, the process of protein synthesis includes the synthesis of amino acids from central metabolites (Appendix 1). For reasons of consistency, we therefore add the concentrations of free amino acids to the total protein concentration. Given that amino acids account for around 50% of metabolites (Bennett et al., 2009), and the total metabolite concentrations were estimated to take up 3.9% and 3.0% of the total biomass during growth on glucose and glycerol, respectively, the total protein concentrations amount to a fraction of 0.74 of the total biomass density, for both glucose and glycerol.
The proteomics data of Schmidt et al., 2016, provide information on the mass fractions of each of the protein categories distinguished in the model. This information, together with the total protein concentration established above, allows us to compute the concentrations m_{u}, $r$, m_{c}, and ${m}_{er}+{m}_{ef}$ (in units Cmmol gDW^{-1}). The use of mass fractions, instead of the absolute values also reported by Schmidt et al., has the advantage of ensuring the consistency of the protein concentrations with the uptake, secretion, and growth rates reconstructed below. In the case of growth in minimal medium with glucose, we thus estimate that
while for minimal medium with glycerol we obtain
The above mass fractions correspond to the following resource allocation parameters for growth on glucose:
and growth on glycerol:
We will discuss in a later section how to distribute the total concentration ${\widehat{m}}_{er}+{\widehat{m}}_{ef}$ over the respiration and fermentation protein classes (and thus determine the resource allocation parameters ${\chi}_{er}$ and ${\chi}_{ef}$).
Concentration of other macromolecules $u$
The biomass definition in the model enforces the concentration $u$ of other macromolecules (RNA, DNA, lipids in the cell membrane) to equal the difference between the total biomass concentration and the sum of the total protein and metabolite concentrations. For growth on glucose, we thus find that
whereas for growth on glycerol, we also obtain
The estimated values, and all other concentration values derived above, are summarized in Appendix 2—table 1.
Degradation rate $\gamma $
The model includes a degradation constant $\gamma $ that accounts for one of the main causes of so-called maintenance costs of the cell, the turnover of macromolecules and other biomass components. We show that the biomass degradation constant can be determined by means of the well-known Pirt model for maintenance, defined by
where ${v}_{mc}$ [Cmmol gDW^{-1} hr^{-1}] is the substrate uptake rate, ${Y}^{max}$ [gDW Cmmol^{-1}] the maximum biomass yield without maintenance, and k_{m} [Cmmol gDW^{-1} hr^{-1}] the so-called maintenance coefficient (Pirt, 1965).
By substituting expressions for ${Y}^{max}$ and μ from our model (Appendix 1) into Equation 99, we obtain
or
Data for growth of the E. coli MG1655 strain in minimal medium with glucose, by Esquerré et al., 2014, indicate a maintenance coefficient of ${k}_{m}=0.35$ ${\mathrm{mmol}}_{\text{glc}}$ gDW^{-1} hr^{-1} and a maximal yield ${Y}^{max}=76.2$ gDW ${\mathrm{mol}}_{\text{glc}}^{-1}$, practically identical to the values reported for the same strain in the same medium by Nanchen et al., 2006 (${k}_{m}=0.37$ ${\mathrm{mmol}}_{\text{glc}}$ gDW^{-1} hr^{-1}, ${Y}^{max}=76$ gDW ${\mathrm{mol}}_{\text{glc}}^{-1}$). Using the values from Esquerré et al., 2014, we find $\widehat{\gamma}=0.027$ hr^{-1}. By the same reasoning, the maintenance rate for growth in minimal medium with glycerol can be obtained. Classical experiments indicate that the rate is 1.2 times the rate for glucose (Farmer and Jones, 1976), so $\widehat{\gamma}=0.032$ hr^{-1}.
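The reported degradation constants are consistent with the product of the maintenance coefficient and the maximal yield. The sketch below assumes the standard Pirt form $v_{mc}=\mu /Y^{max}+k_{m}$ and a model growth rate of the form $\mu =Y^{max}v_{mc}-\gamma $, so that $\gamma =k_{m}Y^{max}$. This relation is inferred from the numerical values in the text, and the function name is illustrative.

```python
def degradation_constant(k_m_mmol_per_gdw_h: float, y_max_gdw_per_mol: float) -> float:
    """Degradation constant (hr^-1), assuming gamma = k_m * Y_max.

    k_m is given in mmol substrate per gDW per hour and Y_max in gDW per mol
    substrate, so Y_max is first converted to gDW per mmol.
    """
    y_max_gdw_per_mmol = y_max_gdw_per_mol / 1000.0
    return k_m_mmol_per_gdw_h * y_max_gdw_per_mmol


# Glucose: values of Esquerre et al., 2014.
gamma_glucose = degradation_constant(0.35, 76.2)
# Glycerol: 1.2 times the glucose value (Farmer and Jones, 1976).
gamma_glycerol = 1.2 * gamma_glucose

print(f"gamma (glucose)  = {gamma_glucose:.3f} per hour")
print(f"gamma (glycerol) = {gamma_glycerol:.3f} per hour")
```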
Substrate uptake flux ${v}_{mc}$, fermentation flux ${v}_{mef}$, and biosynthesis fluxes ${v}_{mu}$, v_{r}
The datasets used from Haverkorn van Rijsewijk et al., 2011, and Gerosa et al., 2015, consist of measured fluxes and the growth rate of the E. coli BW25113 strain, during exponential growth in minimal medium with glucose and glycerol, respectively. In particular, the glucose or glycerol uptake rate ${v}_{mc}$ [${\mathrm{mmol}}_{\text{glc/gly}}$ gDW^{-1} hr^{-1}], the acetate secretion rate ${v}_{mef}$ [${\mathrm{mmol}}_{\text{ace}}$ gDW^{-1} hr^{-1}], and the growth rate μ [hr^{-1}] were measured. The values for glucose are ${\widehat{v}}_{mc}=8.26$ ${\mathrm{mmol}}_{\text{glc}}$ gDW^{-1} hr^{-1}, ${\widehat{v}}_{mef}=4.89$ ${\mathrm{mmol}}_{\text{ace}}$ gDW^{-1} hr^{-1}, and $\widehat{\mu}=0.61$ hr^{-1}. These values are very close to those reported by Morin et al., 2016, for the MG1655 strain. In the case of growth on glycerol, we have ${\widehat{v}}_{mc}=11.3$ ${\mathrm{mmol}}_{\text{gly}}$ gDW^{-1} hr^{-1} and $\widehat{\mu}=0.49$ hr^{-1}, while the acetate secretion rate was found to be small: ${\widehat{v}}_{mef}=0.60$ ${\mathrm{mmol}}_{\text{ace}}$ gDW^{-1} hr^{-1}. (Gerosa et al., 2015, actually report a glycerol uptake rate of 10.14 ${\mathrm{mmol}}_{\text{gly}}$ gDW^{-1} hr^{-1}, but explain that uptake rates were computed by dividing the measured growth rates by the measured biomass yields [see Extended Experimental Procedures]. In the case of glycerol, the growth rate and the biomass yield were found to be 0.49 hr^{-1} and 0.47 gDW g^{-1}, respectively (Data S1), which with a molecular weight of 92.09 g mol^{-1} gives a value of $0.49/(0.47\cdot 92.09\cdot 0.001)=11.3$ mmol gDW^{-1} hr^{-1} for the glycerol uptake rate.)
In agreement with the biomass concentration units, we express mass fluxes in terms of the amount of carbon flowing through the system [Cmmol gDW^{-1} hr^{-1}]. Bearing in mind that the carbon content of glucose is 6 C and that of acetate 2 C, we obtain the following rates:
Similarly, for growth on glycerol we have
where we have used the fact that the carbon content of glycerol is 3 C.
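The unit conversions described above amount to multiplying each molar rate by the carbon content of the compound. The following sketch recomputes the glycerol uptake rate from the growth rate and yield, and converts the measured rates to carbon fluxes. The helper names are illustrative, and the printed values are outputs of this sketch rather than figures quoted from the text.

```python
# Number of carbon atoms per molecule.
CARBONS = {"glucose": 6, "glycerol": 3, "acetate": 2}


def carbon_flux(rate_mmol_per_gdw_h: float, compound: str) -> float:
    """Convert a molar rate (mmol gDW^-1 hr^-1) to a carbon flux (Cmmol gDW^-1 hr^-1)."""
    return CARBONS[compound] * rate_mmol_per_gdw_h


def uptake_from_yield(mu: float, yield_gdw_per_g: float, mol_weight_g_per_mol: float) -> float:
    """Uptake rate (mmol gDW^-1 hr^-1) as growth rate divided by biomass yield."""
    yield_gdw_per_mmol = yield_gdw_per_g * mol_weight_g_per_mol * 0.001
    return mu / yield_gdw_per_mmol


# Glycerol uptake recomputed from growth rate 0.49 hr^-1 and yield 0.47 gDW g^-1.
glycerol_uptake = uptake_from_yield(0.49, 0.47, 92.09)

v_mc_glucose = carbon_flux(8.26, "glucose")
v_mef_glucose = carbon_flux(4.89, "acetate")
v_mc_glycerol = carbon_flux(11.3, "glycerol")
v_mef_glycerol = carbon_flux(0.60, "acetate")

print(f"glycerol uptake: {glycerol_uptake:.1f} mmol/gDW/h")
print(f"glucose:  v_mc = {v_mc_glucose:.2f}, v_mef = {v_mef_glucose:.2f} Cmmol/gDW/h")
print(f"glycerol: v_mc = {v_mc_glycerol:.1f}, v_mef = {v_mef_glycerol:.1f} Cmmol/gDW/h")
```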
The measured fluxes, together with the growth and degradation rates and the total biomass concentration, fix the biosynthesis fluxes in the model. This can be shown by rewriting the equations in the model in the following way:
Values for ${v}_{mu}$ and v_{r} can be directly computed from the values for the concentrations and rates in the right-hand sides of Equations 106 and 107 that were derived above. This yields for growth on glucose:
and for growth on glycerol:
Respiration flux ${v}_{mer}$ and CO_{2} correction factors ${\rho}_{ru}$ and ${\rho}_{mef}$
In the flux datasets mentioned above, CO_{2} released by the cells was not directly measured. The CO_{2} flux can be derived from the carbon mass balance, bearing in mind that almost all of the carbon not integrated into biomass leaves the cells as CO_{2} or acetate (Gerosa et al., 2015; Gottschalk, 1986). The carbon mass balance is given by the definition of the growth rate, which provides an expression for the total CO_{2} outflux ${v}_{C{O}_{2}}$. We have
where ${\rho}_{ru}-1>0$ is the correction factor accounting for the release of CO_{2} during the synthesis of amino acids, proteins, and other biomass components and ${\rho}_{mef}-1>0$ the correction factor accounting for the CO_{2} released during the conversion of glucose to acetate (Appendix 1). That is, the total CO_{2} flux is composed of the CO_{2} released during respiration (${v}_{mer}$), fermentation ($({\rho}_{mef}-1){v}_{mef}$), and the CO_{2} released during macromolecular synthesis ($({\rho}_{ru}-1)({v}_{r}+{v}_{mu})$). Basan et al., 2015a, argue that the latter CO_{2} outflux is proportional to the growth rate over a wide range of conditions, with a proportionality constant $\eta $:
The value of $\eta $ is estimated at 7.2 Cmmol gDW^{-1} (Basan et al., 2015a), so that for a growth rate of 0.61 hr^{-1} in the case of minimal medium with glucose, the CO_{2} outflux associated with biosynthesis equals 4.4 Cmmol gDW^{-1} hr^{-1}. Moreover, with the values for v_{r} and ${v}_{mu}$ derived above, we find
That is, 17% of the carbon flux toward macromolecular synthesis is lost as CO_{2}. The total CO_{2} outflux can be directly computed from Equation 112, giving
For each acetate molecule, one CO_{2} is produced (Basan et al., 2015a), so that ${\widehat{\rho}}_{mef}=1.5$. The respiration-associated CO_{2} outflux can now be reconstructed as
In the case of growth on glycerol, we find ${\widehat{v}}_{C{O}_{2}}=11.5$ Cmmol gDW^{-1} hr^{-1} and ${\widehat{v}}_{mer}=7.3$ Cmmol gDW^{-1} hr^{-1}, while the value for ${\rho}_{ru}$ is the same as for glucose (1.17). The reconstructed flux measurements are summarized in Appendix 2—table 1, whereas the flux correction factors for CO_{2} release are included in Appendix 2—table 2.
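The reconstruction of the CO_{2} fluxes can be sketched numerically. The code below rests on two assumptions that are not spelled out in the surviving text: the total biosynthesis flux is approximated by $({\mu}+{\gamma})/{\beta}$ (neglecting the small metabolite pools), and the carbon balance is taken to be $v_{CO_2}=v_{mc}-v_{mef}-(\mu +\gamma )/\beta $. With $1/\widehat{\beta}=40.65$ Cmmol gDW^{-1}, these assumptions reproduce the correction factor of 1.17 and the glycerol values of 11.5 and 7.3 Cmmol gDW^{-1} hr^{-1} quoted above. All function names are illustrative.

```python
BIOMASS = 40.65  # assumed 1/beta-hat, Cmmol gDW^-1
ETA = 7.2        # biosynthesis CO2 per unit growth rate, Cmmol gDW^-1 (Basan et al., 2015a)
RHO_MEF = 1.5    # one CO2 released per acetate (2 C) secreted


def biosynthesis_flux(mu: float, gamma: float) -> float:
    """Approximate v_r + v_mu as (mu + gamma)/beta (assumption: metabolite pools neglected)."""
    return (mu + gamma) * BIOMASS


def total_co2(v_mc: float, v_mef: float, mu: float, gamma: float) -> float:
    """Total CO2 outflux from the assumed carbon balance (uptake minus acetate minus biomass)."""
    return v_mc - v_mef - biosynthesis_flux(mu, gamma)


def respiration_flux(v_co2: float, v_mef: float, rho_ru: float, mu: float, gamma: float) -> float:
    """Respiration CO2 = total CO2 minus fermentation and biosynthesis contributions."""
    fermentation_co2 = (RHO_MEF - 1.0) * v_mef
    biosynthesis_co2 = (rho_ru - 1.0) * biosynthesis_flux(mu, gamma)
    return v_co2 - fermentation_co2 - biosynthesis_co2


# Glucose: derive rho_ru from the biosynthesis CO2 outflux eta * mu.
mu_glc, gamma_glc = 0.61, 0.027
rho_ru = 1.0 + ETA * mu_glc / biosynthesis_flux(mu_glc, gamma_glc)

# Glycerol: reuse rho_ru, as stated in the text.
mu_gly, gamma_gly = 0.49, 0.032
v_mc_gly, v_mef_gly = 3 * 11.3, 2 * 0.60  # carbon fluxes, Cmmol gDW^-1 hr^-1
v_co2_gly = total_co2(v_mc_gly, v_mef_gly, mu_gly, gamma_gly)
v_mer_gly = respiration_flux(v_co2_gly, v_mef_gly, rho_ru, mu_gly, gamma_gly)

print(f"rho_ru = {rho_ru:.2f}")
print(f"glycerol: v_CO2 = {v_co2_gly:.1f}, v_mer = {v_mer_gly:.1f} Cmmol/gDW/h")
```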
Estimation of parameter values for batch growth
The model contains 20 kinetic parameters. Estimation of all of these values from the data in Appendix 2—table 1 would lead to identifiability problems. However, as shown below, making appropriate assumptions based on experimental observations allows all parameters to be unambiguously fixed.
Parameters in energy balance equation ${n}_{me}$, ${n}_{mer}$, ${n}_{mef}$, n_{r}, ${n}_{mu}$, k_{a}
Recall that the energy cofactor rate equation at steady state, or energy balance, is given by
where ${v}_{d}={k}_{a}{a}^{*}$.
The ATP yield coefficients ${n}_{mer}$ and ${n}_{mef}$ describe how many energy cofactor molecules (ATP) can be regenerated from a molecule of substrate (glucose or glycerol), in units ${\mathrm{mmol}}_{\mathrm{ATP}}$ Cmmol^{-1}. Basan et al., 2015b, describe a procedure for deriving the yield coefficients ${n}_{mer}$ and ${n}_{mef}$ from the reaction stoichiometry of the metabolic pathways used during growth on glucose. Aerobic respiration generates 4 ATP, 8 NADH, 2 NADPH, and 2 FADH_{2} from one molecule of glucose, equivalent to 26 ATP, whereas aerobic fermentation (acetate overflow) leads to 4 ATP and 4 NADH, equivalent to 12 ATP. As a consequence,
bearing in mind that glucose contains 6 C atoms. Restricting central metabolism to the glycolysis and TCA pathways, like Basan et al., 2015b, and focusing on the main flux of glycerol catabolism through the lower part of the glycolysis pathway, the ATP yield of glycerol respiration can be determined as 2 ATP, 4 NADH, 1 NADPH, and 2 FADH_{2}, equivalent to 14 ATP. Similarly, for aerobic fermentation we find 2 ATP, 2 NADH, and 1 FADH_{2}, equivalent to 7 ATP. This yields
given that glycerol contains 3 C atoms.
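The ATP equivalents and per-carbon yield coefficients can be tallied as follows. The conversion of 2 ATP per NADH or NADPH is stated in the text. The conversion of 1 ATP per FADH_{2} is an assumption, chosen because it reproduces the totals of 26, 12, 14, and 7 ATP given above. The dictionary and function names are illustrative.

```python
# ATP equivalents per cofactor; the FADH2 value is an assumption inferred from the totals.
ATP_EQUIVALENT = {"ATP": 1, "NADH": 2, "NADPH": 2, "FADH2": 1}

# Cofactors generated per molecule of substrate, as listed in the text.
PATHWAYS = {
    ("glucose", "respiration"): {"ATP": 4, "NADH": 8, "NADPH": 2, "FADH2": 2},
    ("glucose", "fermentation"): {"ATP": 4, "NADH": 4},
    ("glycerol", "respiration"): {"ATP": 2, "NADH": 4, "NADPH": 1, "FADH2": 2},
    ("glycerol", "fermentation"): {"ATP": 2, "NADH": 2, "FADH2": 1},
}
CARBONS = {"glucose": 6, "glycerol": 3}


def atp_total(cofactors: dict) -> int:
    """Total ATP equivalents generated per molecule of substrate."""
    return sum(ATP_EQUIVALENT[name] * count for name, count in cofactors.items())


def yield_per_carbon(substrate: str, pathway: str) -> float:
    """ATP yield coefficient in mmol ATP per Cmmol of substrate."""
    return atp_total(PATHWAYS[(substrate, pathway)]) / CARBONS[substrate]


for (substrate, pathway), cofactors in PATHWAYS.items():
    print(
        f"{substrate:8s} {pathway:12s} {atp_total(cofactors):2d} ATP, "
        f"{yield_per_carbon(substrate, pathway):.2f} mmol ATP/Cmmol"
    )
```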
The coefficient n_{r} describes the ATP costs of protein synthesis. Kaleta et al., 2013, compute the amount of ATP needed for the elongation of a protein by one amino acid, including the net ATP costs of the synthesis of the amino acids from central metabolites and mRNA synthesis. They find that the ATP costs of the synthesis of many amino acids are negative (i.e. their synthesis yields ATP), while the ATP costs of mRNA synthesis are negligible in comparison with the translation costs. For glucose, the median total ATP costs are 3.7 ATP/amino acid. This equals 3.7/4.8=0.77 ${\mathrm{mmol}}_{\mathrm{ATP}}$ ${\mathrm{Cmmol}}_{\mathrm{aa}}^{-1}$, where the mean C content of amino acids, weighted for the amino acid composition of biomass, is estimated at 4.8 (data from Feist et al., 2007). That is,
These theoretical costs are close to the value of 0.94 ${\mathrm{mmol}}_{\mathrm{ATP}}$ ${\mathrm{Cmmol}}_{\mathrm{aa}}^{-1}$ obtained from the review of Russell and Cook, who base their estimate on calculations by Stouthamer (Russell and Cook, 1995). (The value of 0.94 ${\mathrm{mmol}}_{\mathrm{ATP}}$ ${\mathrm{Cmmol}}_{\mathrm{aa}}^{-1}$ is obtained by converting the value given in Table 1 of Russell and Cook, 1995, bearing in mind that the calculations were done for a protein fraction of biomass equal to 0.52 and using a carbon mass fraction of protein equal to 0.5; Feist et al., 2007.) For glycerol, where the synthesis of many amino acids is energetically favorable (Kaleta et al., 2013), the median total ATP costs are much lower: 0.44 ATP/amino acid. This amounts to 0.44/4.8=0.09 ${\mathrm{mmol}}_{\mathrm{ATP}}$ ${\mathrm{Cmmol}}_{\mathrm{aa}}^{-1}$, and hence
The coefficient ${n}_{mu}$ describes the ATP costs of the synthesis of other macromolecules (RNA, DNA, etc.). From the review of Russell and Cook, 1995, under the assumption that the average carbon mass fraction of other macromolecules is also equal to 0.5, we find that these ATP costs equal 0.65 ${\mathrm{mmol}}_{\mathrm{ATP}}$ ${\mathrm{Cmmol}}_{\mathrm{macromolecule}}^{-1}$, so that
This value applies to growth on glucose, but in the absence of information specific to growth on glycerol, we use the same value for the latter condition.
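The conversion of the protein synthesis costs from ATP per amino acid to ATP per unit of carbon is a single division by the mean carbon content of an amino acid. A minimal sketch, with illustrative names, is given below.

```python
# Mean number of C atoms per amino acid, weighted by biomass composition (Feist et al., 2007).
MEAN_CARBONS_PER_AMINO_ACID = 4.8

# Median total ATP cost per amino acid added to a protein (Kaleta et al., 2013).
ATP_PER_AMINO_ACID = {"glucose": 3.7, "glycerol": 0.44}

# Protein synthesis cost n_r in mmol ATP per Cmmol of amino acids.
n_r = {
    substrate: cost / MEAN_CARBONS_PER_AMINO_ACID
    for substrate, cost in ATP_PER_AMINO_ACID.items()
}

for substrate, value in n_r.items():
    print(f"n_r ({substrate}) = {value:.2f} mmol ATP per Cmmol")
```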
It has been well established that the estimated ATP production exceeds the estimated ATP consumption for macromolecular synthesis by a factor of 2–3 in the case of growth on minimal medium with glucose (Feist et al., 2007; Russell and Cook, 1995). This suggests a dissipation of energy which is also observed in our case: the ratio of ${\widehat{n}}_{mer}{\widehat{v}}_{mer}+{\widehat{n}}_{mef}{\widehat{v}}_{mef}$ and ${\widehat{n}}_{r}{\widehat{v}}_{r}+{\widehat{n}}_{mu}{\widehat{v}}_{mu}$ equals 2.1 in the case of glucose, and increases to 7.5 in the case of glycerol. The difference is due to the costs of osmoregulation, motility, and other maintenance processes (van Bodegom, 2007), but also to energy spilling, a factor that remains little understood (Russell and Cook, 1995). As explained in Appendix 1, we model all of the above forms of energy dissipation by a first-order reaction with constant k_{a} whose value can be computed by closing the energy balance (Equation 117):
In the case of batch growth on glucose, we thus find an approximate value
and for glycerol,
Parameter in rate equation for central carbon metabolism e_{s}
As explained in Appendix 1, the macroreaction for central carbon metabolism simplifies to the following simple rate equation:
With the value for m_{c} derived in the previous section (Appendix 2—table 1), we obtain the following estimates for glucose:
and for glycerol:
Parameters in the rate equations for the synthesis of proteins and other biomass components ${K}_{r}$, ${K}_{mu}$, ${K}_{ar}$, ${K}_{amu}$, k_{r}, and ${k}_{mu}$
The rate equations for the macroreactions corresponding to protein synthesis and the synthesis of other macromolecules are restated as a reminder:
The above reactions consume central metabolites ($c$) and charged energy cofactors (ATP) (${a}^{*}$).
Very little information is available on the in vivo values of half-saturation constants occurring in the kinetic expressions of the macroreactions. However, previous metabolomics assays have yielded general observations on enzyme saturation (the ratio of reaction substrates and half-saturation constants) that will be exploited here (Bennett et al., 2009). These will be refined by combining available measurements with a recent compilation of ${K}_{m}$ values for E. coli (Dourado et al., 2021; Park et al., 2016).
First, in the case of central carbon metabolism, ‘substrate concentrations are close to ${K}_{m}$ for many reactions’ (Bennett et al., 2009). We have computed, for metabolites in central carbon metabolism of E. coli quantified by Gerosa et al., 2015, the ratio of metabolite concentrations and values of the half-saturation constants of the reactions in which the metabolites participate (Dourado et al., 2021). Taking the geometric mean of the ratios, we found an average value of substrate saturation of 1.2 for glucose and 0.72 for glycerol (Supplementary file 3). Assuming that this value is approximately valid for all reactions consuming central carbon metabolites in our model, we estimate for glucose
and for glycerol
Note that we deal with apparent half-saturation constants that account for possible metabolic regulation.
Second, ATP and NAD+ were found to saturate their enzymes with ‘cofactor concentration typically exceeding their ${K}_{m}$ value by more than 10-fold’ (Bennett et al., 2009). This motivates the following approximate values for the half-saturation constants occurring in the energy terms of the biosynthesis rate equations:
with different values for growth on glucose and glycerol (0.0009 vs 0.0005 mmol gDW^{-1}).
Together with the values for the fluxes and enzyme concentrations, we can now derive values for the unknown catalytic constants k_{r} and ${k}_{mu}$ from Equations 131 and 132. In the case of growth on glucose, we have
whereas for growth on glycerol we find
Note that the estimates for k_{r} are comparable to values used for the maximum translation capacity in previous work (5.9 hr^{-1} in Scott et al., 2010; 3.6 hr^{-1} in Giordano et al., 2016).
Parameters in the rate equations for energy metabolism ${K}_{mer}$, ${K}_{mef}$, ${K}_{amer}$, ${K}_{amef}$, ${k}_{mer}$, and ${k}_{mef}$
We repeat the rate equations for energy metabolism, for the two macroreactions (respiration and fermentation):
The arguments given in the previous section for fixing the values of the half-saturation constants also apply in this case, so that we obtain
for growth on glucose, and
for growth on glycerol.
In the previous section, we were only able to reconstruct the total concentration of enzymes involved in energy metabolism (Appendix 2—table 1), but not the fractions involved in aerobic respiration or fermentation. Let ${\widehat{m}}_{e}={\widehat{m}}_{er}+{\widehat{m}}_{ef}$. In order to derive the concentrations ${m}_{er}$ and ${m}_{ef}$, we follow approximately the same procedure as Basan et al., 2015b, but for the proteomics data of Schmidt et al., 2016. We divide the proteins labeled as taking part in energy metabolism into enzymes only playing a role in respiration (pyruvate decarboxylation, TCA cycle), enzymes only playing a role in fermentation (acetate pathway), and other enzymes, notably those constituting the electron transport chain and ATP synthases using the proton gradient for ATP production. The latter category is involved in both (aerobic) respiration and fermentation, and we divide the protein mass according to the ratio of the respiration and fermentation fluxes. For growth on glucose, we find fractions 0.45, 0.01, and 0.54 for the three protein categories, whereas for glycerol we find 0.37, 0.01, and 0.62, respectively (Supplementary file 4). This gives rise to the following estimates for glucose,
and for glycerol
Together with the values for the fluxes and metabolite concentrations, we can now estimate values for the unknown apparent catalytic constants ${k}_{mer}$ and ${k}_{mef}$ from Equations 138 and 139. In the case of growth on glucose, we have
and for growth on glycerol,
All parameter values derived in this and the previous sections are summarized in Appendix 2—table 2.
Data and parameter estimates for continuous growth
The model calibration procedure for the other conditions considered, continuous growth in a chemostat, in minimal medium with glucose at dilution rates of 0.2 hr^{-1}, 0.35 hr^{-1}, and 0.5 hr^{-1}, is the same as for batch growth. Not all source data used above are available for continuous growth. In their absence, we use the corresponding data for batch growth as a proxy. In particular, total protein and metabolite concentrations were obtained from Gerosa et al., 2015, and Basan et al., 2015b, by selecting the (interpolated) values for batch growth at rates corresponding to the dilution rates (Appendix 2—figure 1). In addition, for the case of growth at a dilution rate of 0.2 hr^{-1}, where no significant acetate overflow is detected, we set the acetate secretion rate to 5% of the acetate secretion rate during continuous growth at 0.35 hr^{-1}, that is, a value below the detection limit. This allows the same model with respiration and fermentation to be used over all conditions.
The data used for calibration are shown in Appendix 2—table 3 and the values for the parameters obtained after calibration are listed in Appendix 2—table 4.
Data and parameter estimates for MG1655 and NCM3722 strains
In order to test the robustness of our results with respect to the calibration procedure, we calibrated the model for a different E. coli strain, MG1655, in the same way as for the reference strain. To this aim, we used published measurements on batch growth of MG1655 in minimal medium with glucose, including metabolite concentrations (McCloskey et al., 2018), proteomics data (Schmidt et al., 2016), and metabolic fluxes (Monk et al., 2017).
The total biomass concentration is the same as for the reference strain (Equation 81). The total metabolite concentration is taken from McCloskey et al., 2018, who reported a value of 3.7 Cmmol gDW^{-1}, equivalent to 9.1% of the total cellular biomass. The fraction of central metabolites is estimated to be 14% of the total metabolite concentration. The total protein concentration is obtained from Basan et al., 2015b, who report a protein fraction of 0.71 for the MG1655 strain, to which we add the fraction of free amino acids, estimated as 50% of the total metabolite concentration (Bennett et al., 2009). This gives a total protein biomass fraction of 0.76.
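The MG1655 fractions above follow from the same bookkeeping as for the reference strain. The sketch below assumes a total biomass concentration of 40.65 Cmmol gDW^{-1}, as quoted elsewhere in this appendix. The variable names are illustrative, and the central metabolite concentration printed at the end is an output of this sketch, not a figure quoted from the text.

```python
BIOMASS_CMMOL_PER_GDW = 40.65  # assumed 1/beta-hat, Cmmol gDW^-1

total_metabolites = 3.7        # Cmmol gDW^-1 (McCloskey et al., 2018)
central_share = 0.14           # share of central metabolites in the metabolite pool
amino_acid_share = 0.5         # share of free amino acids (Bennett et al., 2009)
protein_fraction = 0.71        # protein fraction of biomass (Basan et al., 2015b)

# Metabolites as a fraction of total biomass.
metabolite_fraction = total_metabolites / BIOMASS_CMMOL_PER_GDW
# Free amino acids are counted with proteins, as for the reference strain.
total_protein_fraction = protein_fraction + amino_acid_share * metabolite_fraction
# Concentration of the central metabolite pool.
central_metabolites = central_share * total_metabolites

print(f"metabolite fraction:    {100 * metabolite_fraction:.1f}%")
print(f"total protein fraction: {total_protein_fraction:.2f}")
print(f"central metabolites:    {central_metabolites:.2f} Cmmol/gDW")
```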
Proteins are then distributed over our protein categories, following the mass fraction values reported by Schmidt et al., 2016, for the MG1655 strain. Accordingly, we estimate
Uptake and secretion rates were taken from Monk et al., 2017. Comparison of metabolite concentration measurements of McCloskey et al., 2018, with ${K}_{m}$ values collected by Dourado et al., 2021, shows that reactions in central carbon metabolism are more saturated in MG1655 than in the reference strain (2.2 vs 1.2), in agreement with its higher growth rate (Supplementary file 3). Accordingly, the half-saturation constants of reactions consuming central metabolites are estimated as
The data used for calibration are summarized in Appendix 2—table 5 and the values for the parameters obtained after calibration are listed in Appendix 2—table 6.
We also collect in Appendix 2—table 5 the data for batch growth of the NCM3722 strain in minimal medium with glucose, used in the Results section of the main paper. The data concern the growth rate and growth yield (Cheng et al., 2019), the glucose uptake, and acetate secretion rates reported by Cheng et al., 2019, from experiments carried out by Basan et al., 2015a, the total protein concentration (Basan et al., 2015a), and the total metabolite concentration (Park et al., 2016).
Calibration of model variant with an additional growthrateindependent protein category
In Appendix 1, we introduced a model variant with an additional growthrateindependent protein category, referred to as $Q$ (Scott et al., 2010). Estimation of the parameters for this model variant requires the estimation, for every protein category, of the offset of the linear relation between growth rate and proteome fraction (Hui et al., 2015). In order to obtain results comparable to those for the reference model, we have used proteomics data for the BW25113 strain (Schmidt et al., 2016). We considered 22 different growth conditions, excluding stationary phase (no balanced growth) and LB medium (addition of amino acids).
For the $R$ category, the proteome fraction increases with the growth rate and the offset can be computed as ${\chi}_{r}^{0}=0.23$ (Appendix 2—figure 2). Unfortunately, in the case of ${M}_{c}$, ${M}_{e}$, and ${M}_{u}$, the data show a decreasing or constant pattern with growth rate, which makes it impossible to determine the offset fraction for these protein categories (Appendix 2—figure 2, panels B–D). We therefore followed a different approach to estimate the growth-rate-independent protein fraction. Assuming a total fraction of growth-rate-independent proteins ${\chi}_{q}=0.52$, as reported for the MG1655 strain by Mori et al., 2016, we split the remaining fraction ${\chi}_{q}-{\chi}_{r}^{0}=0.29$ over the ${M}_{c}$, ${M}_{u}$, and ${M}_{e}$ categories proportionally to their size:
Notice that the above partitioning is equivalent to assuming that all enzyme categories have the same proportion of growthrateindependent proteins.
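The proportional split can be sketched in a few lines. The values 0.52 and 0.23 are taken from the text; the category sizes in `sizes` are hypothetical placeholders (the actual total proteome fractions come from Schmidt et al., 2016), and `split_proportionally` is an illustrative name rather than a function from the authors' code.

```python
CHI_Q = 0.52    # total growth-rate-independent fraction (from the text)
CHI_R0 = 0.23   # offset of the R category (from the text)


def split_proportionally(remainder, sizes):
    """Distribute `remainder` over categories in proportion to their size."""
    total = sum(sizes.values())
    return {name: remainder * size / total for name, size in sizes.items()}


# HYPOTHETICAL total proteome fractions of the enzyme categories,
# chosen only to illustrate the arithmetic.
sizes = {"Mc": 0.10, "Me": 0.20, "Mu": 0.40}

remainder = CHI_Q - CHI_R0                       # fraction left to distribute
independent = split_proportionally(remainder, sizes)

# Each category ends up with the same proportion of
# growth-rate-independent protein.
ratios = {name: independent[name] / sizes[name] for name in sizes}
print(independent)
print(ratios)
```

The equal entries in `ratios` illustrate the remark above: a size-proportional split is the same as assigning every enzyme category an identical growth-rate-independent proportion.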
The growth-rate-dependent fractions of the protein categories are then simply obtained from the difference between the total proteome fractions (Schmidt et al., 2016) and the growth-rate-independent fractions:
Further calibration of the model is then identical to the calibration of the reference model, using published data for batch growth of BW25113 in glucose minimal medium (Appendix 2—table 1). In particular, from the total biomass concentration (40.65 Cmmol gDW^{−1}) and the protein mass fraction (0.74), we can estimate the following growth-rate-dependent protein concentrations:
Parameter values derived for this model are summarized in Appendix 2—table 6.
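The conversion from biomass concentration to protein concentrations described above can be illustrated as follows. The biomass concentration and protein mass fraction are the values quoted in the text. The growth-rate-dependent fractions in `dependent_fractions` are hypothetical placeholders, and multiplying them by the total protein concentration is an assumption about how the step is carried out, not a transcription of the authors' calculation.

```python
TOTAL_BIOMASS = 40.65         # Cmmol gDW^-1 (from the text)
PROTEIN_MASS_FRACTION = 0.74  # fraction of biomass that is protein (from the text)

# Total protein concentration in Cmmol gDW^-1.
total_protein = TOTAL_BIOMASS * PROTEIN_MASS_FRACTION

# HYPOTHETICAL growth-rate-dependent proteome fractions per category,
# used only to show the arithmetic.
dependent_fractions = {"R": 0.10, "Mc": 0.05, "Me": 0.12, "Mu": 0.21}

# Assumed conversion: concentration = proteome fraction * total protein.
concentrations = {
    name: fraction * total_protein
    for name, fraction in dependent_fractions.items()
}

print(f"total protein: {total_protein:.3f} Cmmol gDW^-1")
for name, value in concentrations.items():
    print(f"{name}: {value:.3f} Cmmol gDW^-1")
```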
Data availability
The current manuscript is a computational study, so no data have been generated for this manuscript. Models and simulation code are available at https://gitlab.inria.fr/baldazzi/coliallocation (copy archived at Baldazzi and de Jong, 2023). Literature data used for model calibration and validation are included in the manuscript as Supplementary files 1–4.
References

Prediction of microbial growth rate versus biomass yield by a metabolic network with kinetic parameters. PLOS Computational Biology 8:e1002575. https://doi.org/10.1371/journal.pcbi.1002575

Are growth rates of Escherichia coli in batch cultures limited by respiration? Journal of Bacteriology 144:114–123. https://doi.org/10.1128/jb.144.1.114123.1980

The energy charge of the adenylate pool as a regulatory parameter. Interaction with feedback modifiers. Biochemistry 7:4030–4034. https://doi.org/10.1021/bi00851a033

Software: Coliallocation, version swh:1:rev:b1be76f8c748bb26462977b00b13caf86e653f83. Software Heritage.

Inflating bacterial cells by increased protein synthesis. Molecular Systems Biology 11:836. https://doi.org/10.15252/msb.20156178

Absolute metabolite concentrations and implied enzyme active site occupancy in Escherichia coli. Nature Chemical Biology 5:593–599. https://doi.org/10.1038/nchembio.186

Book: Modulation of chemical composition and other parameters of the cell by growth rate. In: Neidhardt FC, Curtiss R, Ingraham JL, editors. Escherichia coli and Salmonella: Cellular and Molecular Biology (2nd ed). Washington, DC: ASM Press. pp. 1553–1569.

Complete genome sequence of Escherichia coli NCM3722. Genome Announcements 3:e0087915. https://doi.org/10.1128/genomeA.0087915

Searching for principles of microbial physiology. FEMS Microbiology Reviews 44:821–844. https://doi.org/10.1093/femsre/fuaa034

Laboratory evolution reveals a two-dimensional rate-yield tradeoff in microbial metabolism. PLOS Computational Biology 15:e1007066. https://doi.org/10.1371/journal.pcbi.1007066

Coordination of microbial metabolism. Nature Reviews. Microbiology 12:327–340. https://doi.org/10.1038/nrmicro3238

Lessons on enzyme kinetics from quantitative proteomics. Current Opinion in Biotechnology 46:81–89. https://doi.org/10.1016/j.copbio.2017.02.007

Elementary growth modes provide a molecular description of cellular self-fabrication. PLOS Computational Biology 16:e1007559. https://doi.org/10.1371/journal.pcbi.1007559

Mathematical modelling of microbes: metabolism, gene expression and growth. Journal of the Royal Society, Interface 14:136. https://doi.org/10.1098/rsif.2017.0502

Crosstalk between transcription and metabolism: how much enzyme is enough for a cell? Wiley Interdisciplinary Reviews. Systems Biology and Medicine 10:1396. https://doi.org/10.1002/wsbm.1396

An analytical theory of balanced cellular growth. Nature Communications 11:1226. https://doi.org/10.1038/s4146702014751w

The energetics of Escherichia coli during aerobic growth in continuous culture. European Journal of Biochemistry 67:115–122. https://doi.org/10.1111/j.14321033.1976.tb10639.x

Growth rate of polypeptide chains as a function of the cell growth rate in a mutant of Escherichia coli 15. Journal of Molecular Biology 55:563–568. https://doi.org/10.1016/00222836(71)903378

Levels of major proteins of Escherichia coli during growth at different temperatures. Journal of Bacteriology 139:185–194. https://doi.org/10.1128/jb.139.1.185194.1979

Flux analysis and control of the central metabolic pathways in Escherichia coli. FEMS Microbiology Reviews 19:85–116. https://doi.org/10.1111/j.15746976.1996.tb00255.x

Metabolic costs of amino acid and protein production in Escherichia coli. Biotechnology Journal 8:1105–1114. https://doi.org/10.1002/biot.201200267

Use of adaptive laboratory evolution to discover key mutations enabling rapid growth of Escherichia coli K-12 MG1655 on glucose minimal medium. Applied and Environmental Microbiology 81:17–30. https://doi.org/10.1128/AEM.0224614

Optimal proteome allocation and the temperature dependence of microbial growth laws. NPJ Systems Biology and Applications 7:14. https://doi.org/10.1038/s4154002100172y

RapidRIP quantifies the intracellular metabolome of 7 industrial strains of E. Metabolic Engineering 47:383–392. https://doi.org/10.1016/j.ymben.2018.04.009

Shifts in growth strategies reflect tradeoffs in cellular economics. Molecular Systems Biology 5:323. https://doi.org/10.1038/msb.2009.82

iML1515, a knowledgebase that computes Escherichia coli traits. Nature Biotechnology 35:904–908. https://doi.org/10.1038/nbt.3956

Constrained allocation flux balance analysis. PLOS Computational Biology 12:e1004913. https://doi.org/10.1371/journal.pcbi.1004913

A yield-cost tradeoff governs Escherichia coli’s decision between fermentation and respiration in carbon-limited growth. NPJ Systems Biology and Applications 5:16. https://doi.org/10.1038/s4154001900934

Nonlinear dependency of intracellular fluxes on growth rate in miniaturized continuous cultures of Escherichia coli. Applied and Environmental Microbiology 72:1164–1172. https://doi.org/10.1128/AEM.72.2.11641172.2006

Studies on the role of ribonucleic acid in the growth of bacteria. Biochimica et Biophysica Acta 42:99–116. https://doi.org/10.1016/00063002(60)907575

Book: Chemical composition of Escherichia coli. In: Umbarger HE, Neidhardt FC, editors. Escherichia coli and Salmonella: Cellular and Molecular Biology. Washington, DC: ASM Press. pp. 1–6.

Metabolite concentrations, fluxes and free energies imply efficient enzyme usage. Nature Chemical Biology 12:482–489. https://doi.org/10.1038/nchembio.2077

Proteome reallocation in Escherichia coli with increasing specific growth rate. Molecular BioSystems 11:1184–1193. https://doi.org/10.1039/c4mb00721b

Invariance of the nucleoside triphosphate pools of Escherichia coli with growth rate. The Journal of Biological Chemistry 275:3931–3935. https://doi.org/10.1074/jbc.275.6.3931

The maintenance energy of bacteria in growing cultures. Proceedings of the Royal Society of London. Series B, Biological Sciences 163:224–231. https://doi.org/10.1098/rspb.1965.0069

The physiology and ecological implications of efficient growth. The ISME Journal 9:1481–1487. https://doi.org/10.1038/ismej.2014.235

Energetics of bacterial growth: balance of anabolic and catabolic reactions. Microbiological Reviews 59:48–62. https://doi.org/10.1128/mr.59.1.4862.1995

The quantitative and condition-dependent Escherichia coli proteome. Nature Biotechnology 34:104–110. https://doi.org/10.1038/nbt.3418

Relationship between growth rate and ATP concentration in Escherichia coli: a bioassay for available cellular ATP. The Journal of Biological Chemistry 279:8262–8268. https://doi.org/10.1074/jbc.M311996200

Emergence of robust growth laws from optimal regulation of ribosome synthesis. Molecular Systems Biology 10:747. https://doi.org/10.15252/msb.20145379

Optimality and suboptimality in a bacterial growth law. Nature Communications 8:14123. https://doi.org/10.1038/ncomms14123

The allosteric regulation of pyruvate kinase. The Journal of Biological Chemistry 275:18145–18152. https://doi.org/10.1074/jbc.M001870200

Microbial maintenance: a critical review on its quantification. Microbial Ecology 53:513–523. https://doi.org/10.1007/s0024800690495

Does microbial life always feed on negative entropy? Thermodynamic analysis of microbial growth. Biochimica et Biophysica Acta - Bioenergetics 1412:191–211. https://doi.org/10.1016/S00052728(99)000651

Metabolic enzyme cost explains variable tradeoffs between microbial growth rate and yield. PLOS Computational Biology 14:e1006010. https://doi.org/10.1371/journal.pcbi.1006010

Optimal control of bacterial growth for the maximization of metabolite production. Journal of Mathematical Biology 78:985–1032. https://doi.org/10.1007/s0028501812996

Economics of membrane occupancy and respiro-fermentation. Molecular Systems Biology 7:500. https://doi.org/10.1038/msb.2011.34
Decision letter

Petra Anne Levin, Reviewing Editor; Washington University in St. Louis, United States

Michael B Eisen, Senior Editor; University of California, Berkeley, United States
Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.
Decision letter after peer review:
Thank you for submitting your article "Resource allocation accounts for the large variability of rate-yield phenotypes across bacterial strains" for consideration by eLife. Our sincere apologies for the delay in returning the decision.
Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Naama Barkai as the Senior Editor. The reviewers have opted to remain anonymous.
The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.
Essential revisions:
While other reviewer concerns as detailed below should be considered, the authors need to address different assumptions for ϕq in order for the study to be complete.
Reviewer #1 (Recommendations for the authors):
Proposed points of improvement.
Rationalize bounds. In Figure 2 I would like to better understand what are the reasons for the bounds. What gives the "P" shape? What are the trends with different allocation tradeoffs? etc. Possibly some analytical insight is possible here [maybe with a simplified version of the model], leading to more transparent theoretical insight.
Rationalize (and show) trends. Figure 3 seems particularly uninformative. It would be much more instructive to see trends of uptake and secretion rates vs the other variables as 2D plots, compared with the model predictions (particularly for panels A–B; panels C–D show a complex trend, but this is precisely what I would like to get more insight on). It is not clear how the prediction of Figure 3C is produced by the model since the parameters are not fixed as the allocation changes.
Figure 4. I got lost with this figure, which was particularly uninformative to me (and graphically lacks proper labeling). Looking at the plots, I only see different degrees of agreement between the model and data. Reading the connected Results paragraph, there are a lot of qualitative considerations "under the hood" that seem very interesting but are not accessible/transparent. This could be my own limitation, and it's possible that this paragraph is accessible to a different audience (e.g. more expert than me on metabolic models). However, my impression was that this paragraph/figure could be made more accessible, although I did not gain enough access to give specific recommendations, other than giving the reader some insight on the model predicted trends that we are discussing here.
Where are the optima? In Figure 2 one can explore what the model gives if one tries to optimize (1) growth rate at fixed yield (2) yield at fixed growth rate or (3) looks at Pareto optima of both. I agree that optimization of one or both quantities may not be the goal, but still, it is important to understand where optimization would bring theoretically, and how the data points cluster with respect to these theoretical optima.
Comparison with other frameworks.
A more detailed comparison with other "reference" frameworks would be useful here.
I would propose: Erickson 2017, Basan 2015, Maitra 2015 [but other choices are possible]
[see below]
The definition of yield should be explained much more clearly in the main text, both in the model and in the data. Model: Explain why Equation 2 represents the fraction of carbon going into biomass. Data: explain how the quantity is measured and how the measured quantity relates to the model.
I am confused by some sort of implicit identification that the authors make between allocation (e.g. the fraction of ribosome making ribosomes) and partitioning (e.g. the fraction of proteins or total mass that is ribosomes). In particular, for ribosomes, I am not sure that their equations (e.g. Eq (6) in SI regarding ribosomes) are equivalent to the framework of Erickson 2017 (which I use as a reference). At steady state (the condition that is relevant for this study), this might be irrelevant, since allocation and partitioning coincide (Scott 2010), but then for clarity, it might be better to present the framework as steady-state relations (as in Scott 2010) and not by ODE.
Related to this point, or this may be the same point, I think the notation is confusing for the parameters v_x, m_x. These are extensive quantities and I am not clear how they are set. For example, v_r ~ R, which is also a necessary condition to get exponential growth (see above). I found this mentioned only on line 642 of the appendix.
Another related point, I did not understand if the model makes more or less implicit flux-balance assumptions (or more in general whether at some point it assumes relationships between fluxes). It should not but at some point, I had this impression. In general, it would be interesting to have some insight into the relationships between the different fluxes (in particular those in consecutive chains) for different values of the resource allocation vector.
Around line 240, the authors discuss that the trend in Figure 3A–B is a consequence (through Equation 2) of (population average) density homeostasis (in this case across different strains growing in the same conditions, which is perhaps not the usual way this parameter is considered). Do we then need to think that the model prediction is trivial in this case [as pointed out above, seeing this section of the data and the model prediction would be very instructive here]?
Figure S1 could be presented with Figure 3 (although, see above, probably more is needed). Here one sees the points that do not agree with the model and the authors can comment on those. In particular, those outliers lying near the x-axis of Figure S1B seem potentially interesting to explain/rationalize.
Technical point: How do the predictions depend on the data point used for calibration of the model?
Other points raised after discussion with our group
It seems that the interpretation of the C sector might be different from the canonical one: c → "Central carbon metabolites", that is, catabolic products of the carbon source substrate (glucose, glycerol, …) taken up from the medium. What about catabolic enzymes? Also, enzymes in amino acid metabolism that are necessary for protein synthesis seem to end up in the R sector (?).
Not clear what the ρ are in dc/dt, and why they must be > 1.
The main result statements of the study are either quite generic or cannot be understood from the main figures. This can probably be improved by re-engineering both figures and statements (from abstract):
- very good quantitative agreement between the predicted and observed variability in rates and yields, acetate flow does not correlate with the growth rate.
- resource allocation is a major explanatory factor of the observed variety of growth rates and growth yields across different bacterial strains.
- differences in enzyme activity need to be taken into account to explain variations in protein abundance.
Cmmol seems like a very unintuitive and non-standard unit. Has this been used before? Can a better solution be proposed? Does this hide something related to protein length in the different sectors?
[Editors' note: further revisions were suggested prior to acceptance, as described below.]
Thank you for resubmitting your work entitled "Resource allocation accounts for the large variability of rate-yield phenotypes across bacterial strains" for further consideration by eLife. Your revised article has been evaluated by Michael Eisen (Senior Editor) and a Reviewing Editor.
The manuscript has been improved but there are some remaining issues that need to be addressed. In particular:
1. Please address the reviewer's request that some predictions based on the model be added to the text so the reader can better understand how the model works. (i.e. make it less of a black box).
2. To clarify the model and its results, please revise the text to address in detail how they differ from those of the Basan, 2015 study. (The reviewer included a list of questions that should ideally be answered in any such comparison in their review below.)
Reviewer #1 (Recommendations for the authors):
The authors made considerable revisions and provided a detailed and clear reply to all the points raised. I maintain my opinion on the fact that the work is timely and the theoretical framework is very interesting.
Having said this, I also have to say that the results remain somewhat non-transparent, as the authors were not able to derive a mathematical or qualitative rationale for the main results or analyze the model in terms of simpler one-dimensional relationships, and the comparisons with data remain non-stringent. However, they have provided additional figures and analyses that do contribute towards clarity, as well as clarifying many of the model assumptions and definitions.
I think the manuscript should appear on eLife, in view of the contribution towards rationalizing a more complex relationship between growth and yield than the simple tradeoff assumed by most. If this could be the central point of this study (I find it interesting and I think it might have some impact), then I have some remarks to clarify the message.
First, the message could emerge more clearly in the abstract.
Second, it might be possible to characterize (without comparing to data, making some simple assumptions on the parameters) the variation of mu_max, y_max (and maybe also some "central values") across conditions, to study and visualize their relationship with resource allocation parameters. Perhaps these predictions are not verified or directly comparable to data (and I am not asking to perform any comparison), but they might help the reader understand how the model works (as I said I think the weakest point of this study remains the "black box" feeling about all the main results).
Third, a more stringent comparison with the data/model of Basan et al. 2015 seems important to clarify the results. What brings those authors to conclude towards a tradeoff between protein cost and energy efficiency? Would the model in this study describe the Basan et al. data and how? Would this comparison lead to different conclusions? Are there crucial differences in the modeling choices of the two studies? Are there (according to the authors' model) regimes with/without strong tradeoffs and how can they be characterized? These seem like questions worth addressing.
https://doi.org/10.7554/eLife.79815.sa1
Author response
Essential revisions:
While other reviewer concerns as detailed below should be considered, the authors need to address different assumptions for ϕq in order for the study to be complete.
We have developed model variants based on different assumptions for φ_{q}. Depending on the case considered, the resource allocation parameter for housekeeping proteins is called χ_{q} or χ_{u} in the revised manuscript. We have analyzed the models for fixed values of χ_{q} or χ_{u}, or for values varying within bounds given by the proteomics data. We show, as explained in detail below, that the predictions of the variability of rate-yield phenotypes are robust with respect to the different assumptions. In addition, we have addressed all other points raised by the reviewers.
Reviewer #1 (Recommendations for the authors):
Proposed points of improvement.
Rationalize bounds. In Figure 2 I would like to better understand what are the reasons for the bounds. What gives the "P" shape? What are the trends with different allocation tradeoffs? etc. Possibly some analytical insight is possible here [maybe with a simplified version of the model], leading to more transparent theoretical insight.
In the revised version of the paper, we have better explained the mapping from resource allocation strategies to rate-yield phenotypes, which defines the bounds in Figure 2, using additional plots. We notably argue that insights into the physiological consequences of a strategy can be gained by means of a pictogram showing (i) the biomass composition over (different categories of) proteins, other macromolecules, and metabolites, (ii) the flux map, and (iii) the energy charge. The pictogram is used, for example, to compare growth of the reference strain used for calibration with growth at maximum rate or maximum yield. This provides a qualitative understanding of the origin of the tradeoff between growth rate and maximum growth yield, which is one of the striking features of the shape of the cloud of predicted rate-yield phenotypes.
The mapping from resource allocation strategies to rateyield phenotypes defined by the model is complex, due to the multiple feedback loops between metabolism, protein synthesis, and growth. Finding simplified models that facilitate mathematical analysis and, at the same time, preserve the main features of the mapping is a nontrivial challenge and raises new questions. We therefore think that a full mathematical analysis of the model is beyond the scope of this study.
We have added a new Results section, Predicted rate-yield phenotypes for Escherichia coli, to better explain how the rate-yield phenotypes follow from the resource allocation strategies and the macro reactions included in the model. The arguments in this section are supported by the new Figure 2.
Rationalize (and show) trends. Figure 3 seems particularly uninformative. It would be much more instructive to see trends of uptake and secretion rates vs the other variables as 2D plots, compared with the model predictions (particularly for panels A–B; panels C–D show a complex trend, but this is precisely what I would like to get more insight on). It is not clear how the prediction of Figure 3C is produced by the model since the parameters are not fixed as the allocation changes.
Following the suggestion of the reviewer, we have plotted the glucose uptake and acetate secretion rates against the growth rate and the growth yield in separate 2D plots. The predicted bounds of rate-yield and uptake-secretion phenotypes capture the observed variability very well. This new representation allows a clearer statement of a number of conclusions from the analysis, such as the correlation between glucose uptake rate and growth rate, the absence of correlation between growth rate and acetate secretion rate, and the inverse correlation between growth yield and acetate secretion rate.
We have replaced Figure 3 in the original manuscript by a new figure with the plots suggested by the reviewer (Figure 4 in the revised manuscript). We have accordingly revised the discussion in the section Predicted and observed uptake-secretion phenotypes for Escherichia coli. The new Figure 4 makes Supplementary Figures 1 and 2 in the previous version of the manuscript redundant, so these have been removed.
Figure 4. I got lost with this figure, which was particularly uninformative to me (and graphically lacks proper labeling). Looking at the plots, I only see different degrees of agreement between the model and data. Reading the connected Results paragraph, there are a lot of qualitative considerations "under the hood" that seem very interesting but are not accessible/transparent. This could be my own limitation, and it's possible that this paragraph is accessible to a different audience (e.g. more expert than me on metabolic models). However, my impression was that this paragraph/figure could be made more accessible, although I did not gain enough access to give specific recommendations, other than giving the reader some insight on the model predicted trends that we are discussing here.
The main question we want to answer in this section is how E. coli can grow both fast and efficiently on glucose. We agree with the reviewer that in the previous version of the manuscript our answer to this question was not clearly formulated and that Figure 4 can be improved. In the revised manuscript, we have streamlined the argument by explaining (i) that we focus on the well-characterized NCM3722 strain as a prototype for high-rate, high-yield growth, (ii) that we need to revise the model assumption of fixed catalytic constants for glycolytic enzymes to quantitatively account for the NCM growth phenotype by means of the observed resource allocation strategy, and (iii) that after doing this, we can attribute high-rate, high-yield growth to the more efficient utilization of proteomic resources. Figure 4 (Figure 5 in the revised manuscript) has been completely revised to better bring out the comparison of the predicted and observed resource allocation strategies and growth phenotypes of NCM3722.
The section Predicted and observed strategies enabling fast and efficient growth of Escherichia coli has been rewritten and Figure 4 (Figure 5 in the revised manuscript) has been changed accordingly.
Where are the optima? In Figure 2 one can explore what the model gives if one tries to optimize (1) growth rate at fixed yield (2) yield at fixed growth rate or (3) looks at Pareto optima of both. I agree that optimization of one or both quantities may not be the goal, but still, it is important to understand where optimization would bring theoretically, and how the data points cluster with respect to these theoretical optima.
Figure 2 indeed shows that the global optima of the growth rate and the growth yield are located on both ends of a Pareto front. In the revised version of the manuscript, we discuss the resource allocation strategies underlying the points of the predicted maximum growth rate and maximum growth yield, and the corresponding growth physiology. We also intuitively explain, as mentioned in response to the first comment of this reviewer, which resource allocation tradeoff underlies the tradeoff between growth rate and growth yield along the Pareto front. Whereas no experimental data points are located in the vicinity of the maximum yield, some strains grow at a rate approaching the predicted maximum.
In the revised manuscript, we discuss the points of maximum rate and yield in the new Results section, Predicted rate-yield phenotypes for Escherichia coli, supported by the new Figure 2. In the next section, we explain that no experimental data are available for comparison with the point of maximum yield, but that some strains have a high-rate, high-yield phenotype not far from the point of maximum growth rate. The analysis of high-rate, high-yield growth is pursued in the section Predicted and observed strategies enabling fast and efficient growth of Escherichia coli.
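As a reading aid for the notion of a Pareto front used in this response, the sketch below extracts the non-dominated points from a cloud of (growth rate, growth yield) pairs, treating higher values of both as better. The function name `pareto_front` and the sample points are hypothetical; this is a generic filter and is not taken from the authors' simulation code.

```python
def pareto_front(points):
    """Return the (rate, yield) pairs that no other point dominates.

    A point is dominated if another point is at least as good in both
    coordinates and strictly better in at least one.
    """
    front = []
    for i, (rate, yld) in enumerate(points):
        dominated = any(
            r2 >= rate and y2 >= yld and (r2 > rate or y2 > yld)
            for j, (r2, y2) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append((rate, yld))
    return sorted(front)


# HYPOTHETICAL rate-yield phenotypes, for illustration only.
phenotypes = [
    (0.4, 0.70), (0.6, 0.65), (0.9, 0.55),
    (1.1, 0.40), (0.5, 0.50), (0.8, 0.45),
]

front = pareto_front(phenotypes)
print(front)
```

In this toy example the first and last points of `front` are the maximum-yield and maximum-rate phenotypes, respectively.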
Comparison with other frameworks.
A more detailed comparison with other "reference" frameworks would be useful here.
I would propose: Erickson 2017, Basan 2015, Maitra 2015 [but other choices are possible]
[see below]
We have added a new section to Appendix 1 to compare our model with other existing “reference” models. This section essentially develops the short discussion in the beginning of the section Coarse-grained model with coupled carbon and energy fluxes.
The definition of yield should be explained much more clearly in the main text, both in the model and in the data. Model: Explain why Equation 2 represents the fraction of carbon going into biomass. Data: explain how the quantity is measured and how the measured quantity relates to the model.
The definition of yield was motivated in Appendix 1 in the discussion leading up to Equation 34 of the revised manuscript. We have repeated this explanation in the main text, as suggested by the reviewer. We have also written a new Methods subsection on the measurement of growth yields and the conversion of measured values to the dimensionless unit adopted in this work. We refer to this new subsection in the Results section Predicted and observed rate-yield phenotypes for Escherichia coli. The previous version of the manuscript had a subsection in Appendix 1 called Consistency with empirical calculations of rate and yield. This subsection has become redundant after the above modifications and has been removed from the revised manuscript.
I am confused by some sort of implicit identification that the authors make between allocation (e.g. the fraction of ribosome making ribosomes) and partitioning (e.g. the fraction of proteins or total mass that is ribosomes). In particular, for ribosomes, I am not sure that their equations (e.g. Eq (6) in SI regarding ribosomes) are equivalent to the framework of Erickson 2017 (which I use as a reference). At steady state (the condition that is relevant for this study), this might be irrelevant, since allocation and partitioning coincide (Scott 2010), but then for clarity, it might be better to present the framework as steady-state relations (as in Scott 2010) and not by ODE.
Our Equation 6 in Appendix 1 of the original manuscript, dR/dt = φ_{r} V_{r} − γ R, corresponds to Equation 3A of Erickson et al. [3], dM_{Rb}/dt = χ_{Rb}(t)J_{R}, with two differences. First, Erickson et al. do not take into account biomass degradation. Second, and more important for the question of the reviewer, Erickson et al. denote the ribosomal resource allocation parameter by χ_{Rb} (and allow it to be time-varying), whereas we call the ribosomal resource allocation parameter φ_{r} (and set it to a constant value defined by resource allocation at steady state). Erickson et al. also use symbols φ, but these denote proteome fractions rather than resource allocation strategies. In particular, φ_{Rb}(t) is defined as the ribosomal proteome fraction M_{Rb}(t)/M_{P}(t), which at steady state (denoted by ^{∗}) equals the resource allocation parameter: φ^{∗}_{Rb} = χ^{∗}_{Rb} ([3], p. 16 of SI). In our framework, the resource allocation parameter φ_{r} and proteome fraction r/p also coincide at steady state. From the steady-state equation for ribosomes, φ_{r} v_{r} = (µ + γ)r, and the steady-state equation for total proteins, v_{r} = (µ + γ)p, it follows that φ_{r} = r/p.
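The last step can be written out explicitly. The sketch below uses only the two steady-state relations quoted above and assumes that $v_r > 0$ and $\mu + \gamma > 0$, so that both sides can be divided:

```latex
\begin{align*}
\varphi_r \, v_r &= (\mu + \gamma)\, r, \\
v_r &= (\mu + \gamma)\, p, \\[4pt]
\frac{\varphi_r \, v_r}{v_r} &= \frac{(\mu + \gamma)\, r}{(\mu + \gamma)\, p}
\quad\Longrightarrow\quad
\varphi_r = \frac{r}{p}.
\end{align*}
```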
In conclusion, our ribosomal resource allocation parameter φ_{r} has the same symbol as the ribosomal proteome fraction φ_{Rb} of Erickson et al. In order to remove this source of confusion between resource allocation and proteome partitioning, we relabeled our resource allocation strategies to χ instead of φ, following Erickson et al. Even though we study the system at steady state, we prefer to keep the ODE representation of the model. First, the ODE representation is used for finding the steady state. Second, the left-hand side of the ODEs indicates the variable for which the mass or energy balance holds, and thus establishes an explicit correspondence with the graphical representation of the model in Figure 1.
We changed the symbols φ to χ throughout the text, and made a more explicit distinction between the notions of allocation and partitioning when necessary. The comparison of our model with the model of Erickson et al. is carried out in a new subsection of Appendix 1 (Comparison with other coarsegrained resource allocation models).
Related to this point, or this may be the same point, I think the notation is confusing for the parameters v_x, m_x. These are extensive quantities and I am not clear how they are set. For example, v_r ~ R, which is also a necessary condition to get exponential growth (see above). I found this mentioned only on line 642 of the appendix.
v_{x} and m_{x} are intensive and not extensive quantities; they have units Cmmol gDW^{−1} h^{−1} and Cmmol gDW^{−1}, respectively (Appendix 1, after Equations 14 and 22). Their corresponding extensive quantities are denoted as V_{x} and M_{x}, with units Cmmol h^{−1} and Cmmol, respectively (Appendix 1, at the point indicated by the reviewer). Accordingly, it does not hold that v_{r} ∼ R, but rather V_{r} ∼ R and v_{r} ∼ r. The model in Figure 1 does not include extensive quantities, only intensive quantities. The extensive quantities are used to construct the model in a principled way from basic assumptions in Appendix 1. The shift from extensive to intensive quantities introduces the growth dilution term in the model. It also allows the definition of the growth rate and growth yield in terms of reaction rates.
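As an illustration of how the shift from extensive to intensive quantities introduces the growth dilution term, consider the ribosome equation dR/dt = φ_{r} V_{r} − γ R. The sketch below assumes, for illustration only, that B denotes the total biomass in gDW (a symbol not used in the text above), that r = R/B and v_{r} = V_{r}/B, and that the growth rate is µ = (1/B) dB/dt; the normalization actually used in Appendix 1 may differ.

```latex
\begin{align}
\frac{dr}{dt} = \frac{d}{dt}\!\left(\frac{R}{B}\right)
 &= \frac{1}{B}\,\frac{dR}{dt} - \frac{R}{B}\cdot\frac{1}{B}\,\frac{dB}{dt} \\
 &= \frac{\varphi_r V_r - \gamma R}{B} - \mu\, r \\
 &= \varphi_r\, v_r - (\mu + \gamma)\, r .
\end{align}
```

Under these assumptions, setting dr/dt = 0 gives φ_{r} v_{r} = (µ + γ)r, the steady-state relation for ribosomes used in the response to the previous point.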
As an aside, and as a follow-up to the previous point, note that the model of Erickson et al. consists of extensive quantities, contrary to our model. The rate equations used in our model, expressing the dependency of a reaction rate on ATP and metabolite concentrations with the help of a half-saturation constant, require the use of intensive variables.
We have explicitly stated in the section Coarse-grained model with coupled carbon and energy fluxes that the model consists of intensive variables.
Another related point, I did not understand if the model makes more or less implicit flux-balance assumptions (or more in general whether at some point it assumes relationships between fluxes). It should not but at some point, I had this impression. In general, it would be interesting to have some insight into the relationships between the different fluxes (in particular those in consecutive chains) for different values of the resource allocation vector.
At steady state, the different fluxes in the model including growth dilution must be balanced, in the sense that the right-hand side of the equations in Figure 1 must equal 0. We do not make any implicit or explicit assumptions on relations between fluxes though: every flux is defined by a separate rate equation, given by Equations 35–40 in Appendix 1.
In order to get a better insight into the relations between the fluxes, we have developed a visual representation of the fluxes at steady state for a given resource allocation strategy (Figure 2 in the revised manuscript, see also the first comment of the reviewer). With the help of this pictogram, we explain how the incoming carbon flux is distributed over the other fluxes (protein synthesis, respiration, fermentation, …).
We added a new Results section, Predicted rate-yield phenotypes for Escherichia coli, to better explain the working of the model, including the flux balance at steady state.
Around line 240, the authors discuss that the trend in Figure 3AB is a consequence (through Equation 2) of (population average) density homeostasis (in this case across different strains growing in the same conditions, which is perhaps not the usual way this parameter is considered). Do we then need to think that the model prediction is trivial in this case [as pointed out above, seeing this section of the data and the model prediction would be very instructive here]?
The expression Y = µ/(β v_{mc}) (Equation 2) relates the growth yield to the growth rate and the glucose uptake rate, as explained in Appendix 1 in the discussion leading up to Equation 34. The point we wanted to make is that this relation between the three quantities holds by definition, and that we also expect it to hold for experimental measurements of Y, µ, and v_{mc}. As a consequence, for similar values of 1/β in the model and in experiments, a given pair of growth rate and growth yield returns a similar value of v_{mc}.
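Equation 2 can be evaluated directly. The sketch below is illustrative only: the function names and the numerical values of µ, v_{mc}, and β are placeholders chosen for the example, not calibrated parameters of the model, and the units of β are inferred from dimensional consistency.

```python
def growth_yield(mu, v_mc, beta):
    """Growth yield Y = mu / (beta * v_mc) (Equation 2).

    mu   : growth rate [h^-1]
    v_mc : glucose uptake rate [Cmmol gDW^-1 h^-1]
    beta : conversion factor [gDW Cmmol^-1] (units assumed, so that Y is dimensionless)
    """
    if beta <= 0 or v_mc <= 0:
        raise ValueError("beta and v_mc must be positive")
    return mu / (beta * v_mc)


def uptake_rate(mu, y, beta):
    """Invert Equation 2: the uptake rate implied by a rate-yield pair."""
    if beta <= 0 or y <= 0:
        raise ValueError("beta and y must be positive")
    return mu / (beta * y)


# Hypothetical values, for illustration only.
mu, v_mc, beta = 0.6, 48.0, 0.025
y = growth_yield(mu, v_mc, beta)
print(f"Y = {y:.3f}, implied v_mc = {uptake_rate(mu, y, beta):.1f}")
```

Because the relation holds by definition, the second function simply recovers the uptake rate from a given pair of growth rate and growth yield, which is the sense in which similar values of 1/β lead to similar values of v_{mc}.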
In the new discussion of the comparison of the predicted and observed variability of uptake-secretion phenotypes, structured around the new Figure 4, the relation of Equation 2 is used somewhat differently. We explain that, given a glucose uptake rate, the bacteria can grow at different growth rates depending on the growth yield, which is a consequence of the resource allocation strategy adopted by the cell (see also a comment of Reviewer 2). The determination of growth rate and growth yield by resource allocation is not trivial though.
This point has been reformulated in the revised version of the manuscript (Predicted and observed uptake-secretion phenotypes for Escherichia coli).
Figure S1 could be presented with Figure 3 (although, see above, probably more is needed). Here one sees the points that do not agree with the model and the authors can comment on those. In particular, those outliers lying near the x-axis of Figure S1B seem potentially interesting to explain/rationalize.
In the revised manuscript, following a suggestion of this reviewer, we have structured the discussion of the measured and predicted uptake and secretion rates in a different way. In particular, we compare the predicted and observed variability of uptake and secretion rates as related to the growth rate and growth yield by a series of 2D plots (Figure 4 in the revised manuscript). Some outliers occur in these new projections of the model predictions and are discussed in the text.
Supplementary Figure 1 has been removed from the revised manuscript. We discuss the outliers in the section Predicted and observed uptake-secretion phenotypes for Escherichia coli.
Technical point: How do the predictions depend on the data point used for calibration of the model?
We also calibrated the model for another commonly used E. coli laboratory strain, MG1655, instead of BW25113. The clouds of predicted rate-yield phenotypes for the BW and MG strains are quantitatively very similar. This shows the robustness of the rate-yield predictions for calibration with an alternative dataset.
We discuss the results of the calibration of the model for another E. coli strain in the Discussion section and in Figure 3—figure supplement 1. The details of the calibration are included in a new subsection of Appendix 2, Data and parameter estimates for an alternative E. coli strain.
Other points raised after discussion with our group
It seems that the interpretation of the C sector might be different from the canonical one. c → Central carbon metabolites, that is, catabolic products of the carbon source substrate (glucose, glycerol, …) taken up from the medium. What about catabolic enzymes? Also, enzymes in amino acid metabolism, that are necessary for protein synthesis seem to end up in the R sector (?).
Indeed, central carbon metabolites (variable c in the model) are defined as consisting of the catabolic products of the carbon source (glucose, glycerol, …) taken up from the medium. They include intermediates of the glycolysis pathway, the tricarboxylic acid cycle, and the pentose phosphate pathway, notably the thirteen precursor metabolites from which the building blocks for macromolecules (amino acids, nucleotides, …) are produced ([11], Ch. 5). The catabolic enzymes are included in the protein category “Enzymes in central carbon metabolism”, which take up substrates from the environment and break them down to central carbon metabolites (variable m_{c} in the model).
In other models, the precursor pool is often defined as consisting of amino acids, e.g., the models of Erickson et al. [3] and Giordano et al. [6]. Here we needed a different definition, because our model includes macro reactions for the production of other macromolecules (RNA, DNA, …) and the secretion of acetate. The synthesis of other macromolecules and the secretion of acetate can also be traced back to the precursor metabolites mentioned above, which has motivated our definition of c as the pool of central carbon metabolites. As an aside, this conceptualization corresponds to the core model of E. coli metabolism in flux balance analysis, where the biomass function is defined in terms of a dozen precursor metabolites [4].
Following the above logic, the enzymes in amino acid metabolism are included with the ribosomes in the category “Ribosomes and translation-affiliated proteins, including enzymes in amino acid metabolism” (variable r in the model, Appendix 1). Enzymes in amino acid metabolism convert central metabolites into amino acids, which ribosomes assemble into proteins. Another reason for including the enzymes in amino acid metabolism with ribosomal proteins is that their proteome fractions have the same linear dependence on the growth rate, contrary to the enzymes in central carbon metabolism (Figure S11 in [12]). In previous stages of this work, we developed model versions including an amino acid pool as a separate variable, but given that this complicated the model without substantially changing its predictions, we have preferred the more parsimonious model presented here.
We briefly summarize the arguments above in the section Coarse-grained model with coupled carbon and energy fluxes and in Appendix 1.
Not clear what the ρ are in dc/dt, and why they must be > 1.
The ρ factors account for the (additional) loss of carbon during the synthesis of macromolecules and the secretion of acetate. For example, when converting central metabolites into proteins during growth on glucose, for every 1 Cmmol of protein produced, 0.17 Cmmol of CO_{2} is generated ([1] and Appendix 2). Therefore, in order to preserve the carbon balance, when writing the rate of consumption of central carbon metabolites during protein synthesis, we need to multiply v_{r} by the factor 1.17. This factor, accounting for the loss of CO_{2}, is expressed as the constant ρ_{ru} in the model. Because it expresses an additional consumption of carbon, beyond that included in the proteins, ρ_{ru} > 1 by definition. A similar explanation can be given for the loss of CO_{2} associated with the synthesis of acetate, giving rise to the term ρ_{mef} v_{mef}.
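The carbon bookkeeping behind ρ_{ru} can be sketched numerically. The example uses the value of 0.17 Cmmol of CO_{2} per Cmmol of protein quoted above; the function name and the protein synthesis rate are hypothetical and serve only to illustrate the balance.

```python
CO2_PER_PROTEIN = 0.17           # Cmmol CO2 per Cmmol protein (value quoted in the text)
RHO_RU = 1.0 + CO2_PER_PROTEIN   # correction factor, larger than 1 by construction


def carbon_drain(v_r, rho=RHO_RU):
    """Return (carbon drawn from central metabolites, carbon lost as CO2)
    for a protein synthesis rate v_r [Cmmol gDW^-1 h^-1]."""
    consumed = rho * v_r          # term rho_ru * v_r in the balance for c
    lost_as_co2 = consumed - v_r  # part of the drain not ending up in protein
    return consumed, lost_as_co2


v_r = 10.0  # hypothetical synthesis rate, for illustration only
consumed, lost = carbon_drain(v_r)
# Carbon balance: carbon consumed = carbon in protein + carbon lost as CO2
assert abs(consumed - (v_r + lost)) < 1e-12
print(f"consumed = {consumed:.2f}, lost as CO2 = {lost:.2f}")
```

The same pattern would apply to the acetate term ρ_{mef} v_{mef}, with its own CO_{2} loss per Cmmol of acetate.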
We have explained the origin of the ρ factors in more detail in the main text (in the section Coarse-grained model with coupled carbon and energy fluxes) and in Appendix 1 (in the section Derivation of model equations).
The main result statements of the study are either quite generic or cannot be understood from the main figures. This can probably be improved by re-engineering both figures and statements (from abstract):
– very good quantitative agreement between the predicted and observed variability in rates and yields, acetate flow does not correlate with the growth rate.
– resource allocation is a major explanatory factor of the observed variety of growth rates and growth yields across different bacterial strains.
– differences in enzyme activity need to be taken into account to explain variations in protein abundance.
We have verified that the statements of the main results in the abstract are explicitly matched by statements in the text, in the context of the discussion of Figures 3–5 and in the Discussion section.
Cmmol seems like a very unintuitive and nonstandard unit. Has this been used before? Can a better solution be proposed? Does this hide something related to protein length in the different sectors?
Cmmol is a unit often used in biotechnology (e.g., [8]) and in ecology (e.g., [10]). It is notably used for expressing yields: Cmmol_{biomass}/Cmmol_{glucose} indicates the fraction of carbon taken up by the cells that is included in the biomass [8]. The advantage of adopting this unit in our coarse-grained model is that it enables a rigorous statement of the carbon balance, by making explicit the carbon contents of the cellular components and the fluxes. This is critical for estimating the variations in growth rate and growth yield when changing the resource allocation strategy.
We have provided an explicit motivation for the use of Cmmol units in the section Coarse-grained model with coupled carbon and energy fluxes.
[Editors' note: further revisions were suggested prior to acceptance, as described below.]
The manuscript has been improved but there are some remaining issues that need to be addressed. In particular:
1. Please address the reviewer's request that some predictions based on the model be added to the text so the reader can better understand how the model works. (i.e. make it less of a black box).
2. To clarify the model and results and how they differ from those of Basan, 2015, please revise the text to address the differences between the results of this study and those of the Basan study in detail. (The reviewer included a list of questions that should ideally be answered in any such comparison in their review below.)
In order to address the first point, we have added a new paragraph in the section Predicted rate-yield phenotypes for Escherichia coli. The text is based on the analysis of a simplified version of the model described in a new subsection of Appendix 1, Simplified coarse-grained resource allocation models. This simplified model allows us to trace back the decrease in growth yield and increase in growth rate occurring along the Pareto frontier in Figure 2 to underlying changes in the resource allocation strategy. This opens up the “black box” of the model, as Reviewer 1 suggests, for a particularly striking prediction of our model. This extension comes with an additional supplementary figure, Figure 2—figure supplement 4.
The second point is addressed in a new paragraph in the Discussion section, which answers the questions posed by the reviewer. This paragraph is based on the new subsection Simplified coarse-grained resource allocation models in Appendix 1, where it is shown that the model of this manuscript can be reduced to the model of Basan et al. [1] when a number of additional assumptions are made. In particular, the possibility of a trade-off between investment in proteins and metabolites, which our model admits contrary to the model of Basan et al., is shown to be critical for the differences in predictions of the models.
Reviewer #1 (Recommendations for the authors):
The authors made considerable revisions and provided a detailed and clear reply to all the points raised. I maintain my opinion on the fact that the work is timely and the theoretical framework is very interesting.
Having said this, I also have to say that the results remain somewhat non-transparent, as the authors were not able to derive a mathematical or qualitative rationale for the main results or analyze the model in terms of simpler one-dimensional relationships, and the comparisons with data remain non-stringent. However, they have provided additional figures and analyses that do contribute towards clarity, as well as clarifying many of the model assumptions and definitions.
I think the manuscript should appear on eLife, in view of the contribution towards rationalizing a more complex relationship between growth and yield than the simple trade-off assumed by most. If this could be the central point of this study (I find it interesting and I think it might have some impact), then I have some remarks to clarify the message.
First, the message could emerge more clearly in the abstract.
We have reformulated the second part of the abstract, and added a phrase at the end of the Introduction, to better bring out this message.
Second, it might be possible to characterize (without comparing to data, making some simple assumptions on the parameters) the variation of mu_max, y_max (and maybe also some "central values") across conditions, to study and visualize their relationship with resource allocation parameters. Perhaps these predictions are not verified or directly comparable to data (and I am not asking to perform any comparison), but they might help the reader understand how the model works (as I said I think the weakest point of this study remains the "black box" feeling about all the main results).
We have found a way to mathematically analyze the model along the predicted Pareto frontier running from Y_{max} to µ_{max} in the rate-yield phenotype space. This has required making a number of simplifications that are justified in this specific region. The analysis of the simplified model allows the decrease in growth yield with the increase in growth rate along the Pareto frontier to be traced back to qualitative changes in the underlying resource allocation strategy, by taking into account the constraints on carbon and energy flows, biomass composition, and resource allocation. In particular, this analysis supports the observation made in the main text that the rate-yield trade-off corresponds to a trade-off between investment in proteins vs metabolites on the physiological level. The analysis thus opens the “black box” for this particularly striking prediction of a trade-off between growth rate and (maximum) growth yield.
We have added a new paragraph in the section Predicted rate-yield phenotypes for Escherichia coli. The text is based on the analysis of a simplified version of the model described in a new subsection in Appendix 1, Simplified coarse-grained resource allocation models, which is accompanied by an additional supplementary figure (Figure 2—figure supplement 4).
Third, a more stringent comparison with the data/model of Basan et al. 2015 seems important to clarify the results. What brings those authors to conclude towards a trade-off between protein cost and energy efficiency? Would the model in this study describe the Basan et al. data and how? Would this comparison lead to different conclusions? Are there crucial differences in the modeling choices of the two studies? Are there (according to the authors' model) regimes with/without strong trade-offs and how can they be characterized? These seem like questions worth addressing.
We have shown how the model of Basan et al. [1] can be derived from our model when making a number of simplifying assumptions. That is, under these additional assumptions, our model and the model of Basan et al. make the same predictions. The major simplifying assumption is that the concentrations of central carbon metabolites, energy cofactors, and other macromolecules are constant and that their contribution to the mass balance can be ignored. As a consequence, a trade-off between investment in proteins and metabolites is no longer possible. This notably rules out the strategy of more efficiently utilizing available proteomic resources, which underlies high-rate, high-yield growth predicted by our model and observed in the data.
The comparison with the model of Basan et al. is addressed in a new paragraph in the Discussion section, which explicitly answers the questions posed by the reviewer. This paragraph is based on a new subsection in Appendix 1, Simplified coarse-grained resource allocation models, which shows under which conditions the model of this manuscript reduces to the model of Basan et al.
References
[1] M. Basan, S. Hui, H. Okano, Z. Zhang, Y. Shen, J.R. Williamson, and T. Hwa. Overflow metabolism in Escherichia coli results from efficient proteome allocation. Nature, 528(7580):99–104, 2015.
https://doi.org/10.7554/eLife.79815.sa2
Article and author information
Author details
Funding
French National Research Agency (Maximic project (ANR-17-CE40-0024))
 Delphine Ropers
 Jean-Luc Gouzé
 Hidde de Jong
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
This work was supported by the ANR project Maximic (ANR-17-CE40-0024). The authors would like to thank Francis Mairet and Antrea Pavlou for comments on a previous version of the manuscript, and Achille Fraisse for help with the simulation studies.
Senior Editor
 Michael B Eisen, University of California, Berkeley, United States
Reviewing Editor
 Petra Anne Levin, Washington University in St. Louis, United States
Version history
 Preprint posted: April 27, 2022
 Received: April 27, 2022
 Accepted: May 30, 2023
 Accepted Manuscript published: May 31, 2023 (version 1)
 Version of Record published: June 15, 2023 (version 2)
Copyright
© 2023, Baldazzi et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.