Evolution of cell size control is canalized towards adders or sizers by cell cycle structure and selective pressures
Abstract
Cell size is controlled to be within a specific range to support physiological function. To control their size, cells use diverse mechanisms ranging from ‘sizers’, in which differences in cell size are compensated for in a single cell division cycle, to ‘adders’, in which a constant amount of cell growth occurs in each cell cycle. This diversity raises the question why a particular cell would implement one rather than another mechanism? To address this question, we performed a series of simulations evolving cell size control networks. The size control mechanism that evolved was influenced by both cell cycle structure and specific selection pressures. Moreover, evolved networks recapitulated known size control properties of naturally occurring networks. If the mechanism is based on a G1 size control and an S/G2/M timer, as found for budding yeast and some human cells, adders likely evolve. But, if the G1 phase is significantly longer than the S/G2/M phase, as is often the case in mammalian cells in vivo, sizers become more likely. Sizers also evolve when the cell cycle structure is inverted so that G1 is a timer, while S/G2/M performs size control, as is the case for the fission yeast S. pombe. For some size control networks, cell size consistently decreases in each cycle until a burst of cell cycle inhibitor drives an extended G1 phase much like the cell division cycle of the green algae Chlamydomonas. That these size control networks evolved such self-organized criticality shows how the evolution of complex systems can drive the emergence of critical processes.
Editor's evaluation
This paper develops evolutionary simulations to identify the type of molecular networks that can give rise to size control. The authors propose an evolutionary framework to find which factors select for particular mechanisms in cell size control. They show that the evolution of a specific cell size control mechanism is dependent on the cell cycle structure.
https://doi.org/10.7554/eLife.79919.sa0Introduction
Cell size is fundamental to cell physiology and function because it sets the scale of subcellular compartments, cellular biosynthetic capacity, metabolism, mechanical properties, surface-to-volume ratios, and molecular transport (Chan and Marshall, 2010; Ginzberg et al., 2015; Neurohr et al., 2019; Zatulovskiy and Skotheim, 2020). While different types of cells vary enormously in size to perform their functions, cells within a particular type are generally uniform in size indicating that cell growth may be accurately coupled to division and differentiation processes. On a phenomenological level, there are many commonalities in how cells regulate their size even though the molecules controlling cell division vary across the tree of life with the most striking differences separating eukaryotes and bacteria. This extreme molecular diversity in the regulatory proteins raises the question as to what are the common features of the control systems that evolved to implement size control.
Most generally, cell size control can be viewed as a return map where the division size is a function of the cell size at birth. The examination of proliferating cells in laboratory conditions has revealed a variety of size control phenomena that can be characterized quantitatively by plotting the size of a cell at birth against the amount of mass added before it divides (Amir, 2014; Facchetti et al., 2017; Jun and Taheri-Araghi, 2015). A ‘sizer’ has a slope of –1 so that all variation in cell mass at birth is compensated for in one cell cycle, whereas an ‘adder’ has a slope of 0 so that each cell adds the same amount of mass during the cell cycle regardless of initial size. In the case of an adder, control is weaker so that multiple cell cycles are required for a particularly large or small cell to return to the average cell size. Importantly, the slope relating size at birth with the amount of growth in the cell cycle is a metric that quantifies the amount of size control occurring in a particular condition.
Studies of cell size control have revealed a diverse set of phenomena. Fission yeast and mouse epidermal stem cells exhibit ‘sizers’, and most bacteria, archaea, and cultured human cell lines exhibit behavior closer to adders (Cadart et al., 2018; Eun et al., 2018; Jun et al., 2018; Sveiczer et al., 1996; Westfall and Levin, 2017; Willis and Huang, 2017; Xie and Skotheim, 2020). Thus, while diverse size control behaviors have been observed, adders have been observed more often than sizers. This raises the question of why adders are more frequently observed if sizers, by definition, are more effective at controlling cell size (Barber et al., 2017; Willis et al., 2020).
To address the question of why adders are the most often observed form of cell size control, we used evolutionary algorithms (Holland, 1992) to identify commonalities between networks evolved to control cell size. Evolutionary algorithms are a class of machine learning techniques aiming at mimicking evolutionary processes (Crombach, 2021; François, 2014; François and Hakim, 2004; Xiong et al., 2019). Because of the nature of evolution, results of evolutionary computations are often more efficient and more creative than expected (Lehman et al., 2020). Furthermore, solutions found by evolutionary algorithms are constrained by their evolutionary paths followed and present similar characteristics to biologically evolved systems (Schaerli et al., 2018). Cell size is regulated through the cell cycle control network that governs transitions from one phase of the cell cycle to the next. The division cycle can be broken up into distinct phases that are characterized by different molecular activities (Morgan, 2007). While it is typically considered that there are 4 phases of the cell cycle (G1, S, G2, and M), we here consider a two phase model based on a G1 phase and a composite S/G2/M phase. This is because size control in general has been associated with either the G1/S transition or mitosis at the end of the cell cycle.
We start with a simple model of the cell cycle with a timer for an S/G2/M phase (Figure 1) that we evolve to optimize homeostatic cell size control (Figure 1B). We discovered that different control mechanisms could perform cell size control based on protein quantity or concentration. Simulations in which size control takes place in G1 phase converge toward an adder mechanism for the entire cell cycle and identified an active quantity sensing mechanism similar to dilution-based mechanisms previously identified experimentally (Chen et al., 2020; Qu et al., 2019; Schmoller et al., 2015). The relative durations of G1 and S/G2/M were important in determining size control properties. A relatively shorter S/G2/M phase favors sizer mechanisms, while longer S/G2/M phases favor adders. Moreover, inverting the model so that cell size controls S/G2/M and G1 is a timer, like in fission yeast, results in more sizer-like control. Thus, we anticipate adders arise when cell size is controlled at a point intermediate in the cell cycle, like the G1/S transition, while sizers will appear when cell size regulates a point later in the cell cycle, as is the case when G1 is proportionally longer, or control takes place at the transition to mitosis. We finally identify a self-organized mechanism based on fluctuation sensing where size control occurs on average over multiple cycles. While there is no one-to-one correspondence between a specific size control mechanism and a given evolutionary pressure, our work identifies clear evolutionary principles that shed light on the diverse cell size control phenomena previously observed experimentally.
Results
Initial cell-cycle model
In general, there are two classes of mechanisms that cells use to control their size that can be separated in terms of whether cell division or cell growth per se is regulated by cell size. Note that in this work, we will use mass, size, and volume interchangeably. In the first class, it is crucial that the growth rate per unit mass of a cell depends on cell size so that cells that are significantly larger than the optimum cell size grow slower (Cadart et al., 2019; Ginzberg et al., 2018; Miettinen and Björklund, 2016; Nordholt et al., 2020; Tzur et al., 2009). Such slower growing cells are then outcompeted by cells closer to the optimum size even when divisions occur purely by chance (Conlon and Raff, 2003). While size-dependent growth mechanisms exist and do support size homeostasis, such mechanisms rely on inefficient growth in all the cells away from the optimum size (Ginzberg et al., 2018; Miettinen and Björklund, 2016; Nordholt et al., 2020). To avoid such inefficient growth, many types of cells use active size control mechanisms to accelerate progression through the cell cycle in larger cells (Zatulovskiy and Skotheim, 2020). In our simulations, we keep cell growth rates constant over a physiological range of cell sizes. This allows us to focus on the common features of the molecular networks in which increasing cell size drives changes in molecular activities to trigger cell division. We assume that cell volume grows at a rate , so that growth is exponential when is a constant. Volume is divided by 2 at each division after which we follow one of the two daughter cells. The growth rate sets the time scale for the system dynamics as it defines the doubling time . Any interdivision time shorter than will see the cell volume shrink at the next generation while any interdivision time larger than will see the cell volume grow. We also use the chemistry square bracket convention such that any protein X’s concentration is denoted by . Correspondingly, its quantity is denoted by only and is defined as .
We initialize our network evolution simulations with a very simplified model of the cell-cycle (Figure 1A). We model two independent phases of a symmetrically dividing cell, G1 and S/G2/M, separated by a commitment point at the end of G1 and division at the end of S/G2/M (Figure 1A, Chandler-Brown et al., 2017). We encode this cell cycle state information via a binary switch variable we call ‘S/G2/M Switch’ that is 0 in G1 and 1 in S/G2/M. In all simulations, we follow an inhibitor model (Heldt et al., 2018; Schmoller et al., 2015; Zatulovskiy et al., 2020) and assume that the probability of passing the G1/S transition is controlled by the quantity of a transcription regulator . One way the quantity rather than the concentration of a molecule could be sensed is through its titration against a fixed cellular quantity such as the genome, which is part of a general class of titration-based cell size sensing mechanisms (Amodeo et al., 2015; Heldt et al., 2018; Si et al., 2019; Wang et al., 2009). A lower quantity of this inhibitor means a higher probability of a G1/S transition at the current time step of the simulation (Appendix 1—figure 2). Like all other proteins, the quantity is produced with a rate proportional to volume, degraded at a constant rate, diluted by cell growth, and equally partitioned between mother and daughter cells at division (see Materials and Methods). We found that due to the volume scaling assumption, ’s concentration alone was largely independent of volume and could not trigger a size-dependent G1/S transition, which is why we opted for the quantity of instead (Appendix 1—figure 3). Upon passing the G1/S transition, we assume cells are committed to division and there is a fixed time delay before they divide thus modeling S/G2/M as a timer (Chandler-Brown et al., 2017; Doncic et al., 2015). We initially fixed the timer duration to be roughly equal to 50% of the doubling time with some uniform noise such that G1 and S/G2/M durations would be the same at equilibrium. Regulation of the quantity during the cell cycle thus controls the precise timing of the G1/S transition, but it is not always perfect since the transition is probabilistic. This, along with the noise in S/G2/M timer duration, creates natural cell to cell variability in volume that needs to be compensated for by the evolved mechanism. We note that we initialized most of our simulations with one added interaction in which production of is activated by the S/G2/M Switch variable to reset its concentration to a higher level before the next generation. We initially ran simulations without this specific interaction but found that it systematically appears in the initial stages of evolution simulations. We therefore included it in the initial network to speed up our simulations. We refer the reader to the Appendix 1 for more details. The models used in this study are publicly available (Proulx-Giraldeau and François, 2022).
Typical dynamics of this simple cell-cycle model are represented in Figure 1. These dynamics are similar to models of cell cycles based on relaxation oscillators (Cross, 2003; Tsai et al., 2008). The left and right slow branches correspond to G1 and S/G2/M, respectively, and the fast horizontal branches represent G1/S and division. An intermediate fictitious nullcline is shown as a line that connects the average concentration at G1/S and at division. Starting with cell birth, the system goes down the left-G1 branch because of degradation, then jumps to the right-S/G2/M branch below the threshold for the G1/S transition, stays there while moving up due to production by the S/G2/M Switch, until it jumps back to the left branch at the end of the timer phase. We note that there is no explicit volume control in this initial model since the only control comes from the quantity of which does not initially depend in any way on the volume. This initial quantity sensing oscillator does not perform size control and instead results in unstable growth where size deviations are amplified at each generation instead of being corrected (Appendix 1—figure 3) as had been previously described for a size scaling inhibitor dilution model (Barber et al., 2017; Willis et al., 2020). Thus, the network needs to evolve some other interactions and/or parameters to go beyond a simple G1 inhibitor driven by production in S/G2/M to create a viable cell lineage.
Evolution of quantity-based size control mechanisms
To examine how networks controlling cell size could evolve, we ran evolutionary simulations that optimize both the number of divisions and the coefficient of variation of the size distribution at birth (see Materials and Methods for algorithm details). Figure 2A illustrates the behavior and results of a typical evolutionary run, with axes defined by both fitnesses used. Simulations successfully evolving size control mechanisms typically follow the same pattern. Networks initially cluster in two regions: region [ii] where cells have low but grow too small and die after a few divisions, and region [i] where cells grow too big and reach our cut-off for fitness 2 (y-axis). Notice that our Pareto evolutionary algorithm maximizes network diversity, so that those two clusters are at first maintained during evolution (rank 1 Pareto networks Warmflash et al., 2012). As the number of epochs increases, networks in cluster [ii] have more and more divisions, but still grow too big, so that those cells are therefore penalized (see details in Appendix 1). At some point in evolution (around epoch 700 for this particular simulation), some weak control mechanism suddenly evolves, preventing cells from becoming too big without imposing a tight control on the average volume (see also Figure 5 for an explicit example of how this is done). Thus, fitness 2 collapses and the number of divisions is optimized simultaneously. Cells later optimize the control to give a lower . The optimal networks, at the right most end of this line, both maximize and minimize .
Evolution simulations are in part reproducible and most often lead to similar network topologies. The evolution trajectory leading to Model A1 is a typical example (Figure 2B, see also variations of this network in Model A2 in Figure 2H and models A3-6 in Appendix 1—figures 8–11). The minimal network common to all those models is very simple. One gene, , is added to the seed network and is both repressed and titrated by forming the network motif known as a Mixed Feedback Loop (François and Hakim, 2005). Size control can then be understood intuitively as follows. represses , which is thus only produced in the narrow window of the cycle when is low, i.e., when the cell is close to the G1/S transition and in early S/G2/M. But, since the quantity is fixed at G1/S by design, the concentration is inversely proportional to the volume of the cell at the G1/S transition (VG1/S) as shown in our simulations (Figure 2C). Because of this, the dependent synthesis rate of and therefore its subsequent concentration are (linear) functions of the volume of VG1/S (Figure 2C), allowing for the cell to keep a memory of its volume at G1/S via the variable (this holds even once is constantly degraded for the remainder of the cycle). This has two effects. First, during S/G2/M, synthesis rate is proportional to volume by hypothesis (and thus to VG1/S), and is titrated by , also proportional to VG1/S. Both effects even out so that cells are born with a fixed quantity of inhibitor that is independent of volume (Figure 2D). Second, after division, production of is 0 by hypothesis, but still is titrated in G1 by the remaining (still proportional to VG1/S). Because quantity at the beginning of G1 is size independent, this ensures that daughter cells reach the quantity threshold of G1/S earlier if they were born larger, thus ensuring size control. To confirm this, we examine the change in quantity of as a function of time spent in G1 and of VG1/S, and we see the slope of these two quantities is volume-dependent (Figure 2E). We notice that this evolved size mechanism likely is the simplest possible allowed by our formalism: on the one hand, it entirely captures the volume dependency in a single variable , and on the other hand, it ensures proper scaling of both at birth and at G1/S for size control with the help of a single titration. Notice that such simple control also explains the sudden evolutionary “jump” of Pareto front around epoch 700 on Figure 2A, which corresponds to when the Mixed Feedback Loop motif first appears.
These evolved size control networks, while relying on quantity sensing, are conceptually similar to the budding yeast network relying on concentration sensing of the cell cycle inhibitor Whi5 since there is a constant quantity of present right after division (just like Whi5). In budding yeast, the Whi5 protein is passively diluted in G1 to increase the stochastic rate of progression through the G1/S transition. The time spent in G1 depends on the initial concentration of Whi5 at birth, which scales as , to promote a sizer mechanism. Here, the concentration of at birth scales as , but so does the threshold concentration of regulating the G1/S transition. This is precisely why an active titration mechanism is required to obtain G1 size control in our setup. Such homeostatic control ensures cell size returns to its steady state distribution following an artificial perturbation as soon as a volume deviation is detected at G1/S (Figure 2F). When we plot the amount of volume added for different phases of the cell cycle as a function of the initial volume at the beginning of these phases, we find an approximate adder over the whole cycle that results from weak sizer in G1 followed by a timer in S/G2/M as has been found in budding yeast (Chandler-Brown et al., 2017; Di Talia et al., 2007; Soifer et al., 2016, Figure 2G).
While we chose one simple model to illustrate the control mechanism common to our set of evolved networks (Model A1 shown in Figure 2B), other evolved networks were more elaborate but illustrated a similar principle. For example, Model A2 contains extra interactions for the volume sensing gene , where is repressed by the S/G2/M Switch (meaning its production is completely shut down in S/G2/M leading to sawtooth-like dynamics). Furthermore, represses the synthesis of , adding another layer of repression to promote size control beyond the previously described titration by (Figure 2H). If we perturb cell size to examine the dynamics of the return to steady state and look at the added volume during the cell cycle, we again see overall a weak adder behavior similar to that found in Model A1 (Figure 2I–J). We give additional examples of similarly evolved networks in Appendix 1—figures 8–11 where we can see the sensing and the feedback mechanism being implemented in slightly different ways. Yet, despite these mechanistic differences in feedback regulation the resulting function of the evolved networks were similar as indicated by their (Appendix 1—table 1).
Quantifying size control
To study the mechanisms implicated in cell size control, we modify the control at G1/S and introduce the control volume . This control variable is independent from the biochemical network and is maintained fixed allowing us to disconnect the actual cell volume from the biochemical network and by forcing the G1/S transition to be triggered once is low enough. We then numerically integrate the differential equations of the model and measure the period of the simulated cell-cycle for this control volume. Use of the control volume allows us to break the size feedback system and distinguish its input, , from its output, the induced cycle period T (Angeli et al., 2004). We compute for Models A1 and A2 and compare their responses with the analytical curves of the archetypical timer, adder, and sizer (Figure 3A and C). Those curves intersect at the point where the induced period is exactly equal to the population doubling time (), which defines the equilibrium volume achieved by our cell size control network corresponding to . Examination of the size control in different cell cycle phases indicates the contributions of G1 and S/G2/M to the overall system behavior (Figure 3B and D). We note that we later examine statistics of ensembles of evolved models but that Models A1 and A2 are both typical examples of evolved feedback-based models.
Quantifying precisely how the cell cycle period depends on the control volume at G1/S allows us to see that the for Model A1 overlaps with the theoretical adder curve over a broad range of volumes. In contrast, Model A2 behaves as a sizer for volumes smaller than the equilibrium volume. However, at higher volumes it behaves more like an adder/timer similar to Model A1. Model A2’s equilibrium is thus ‘tuned’ by evolution to be in a regime corresponding to the minimum of the curve extrapolated from the as shown in Figure 3D, precisely when the system transits from a sizer at small volumes to an adder-like behavior at higher volumes. Similar behavior was observed experimentally in budding yeast (Chandler-Brown et al., 2017; Delarue et al., 2017). Thus, both Models A1 and A2 approximate an adder near the equilibrium size, but their behavior differs further from equilibrium where for smaller volumes, Model A1 is still an adder while Model A2 is a sizer.
Modulating cell cycle structural constraints selects for adders or sizers
So far, our evolutionary algorithm selects networks that implement adders rather than sizers near the equilibrium size. This is surprising because sizers are in principle better than adders at controlling cell size and reducing the CV of the size distribution at birth, which is one of our fitness functions. That adders are more frequently observed in nature than sizers (Cadart et al., 2018; Eun et al., 2018; Jun et al., 2018; Westfall and Levin, 2017; Willis and Huang, 2017; Zatulovskiy and Skotheim, 2020), is consistent with our evolution simulations, but adds to the mystery as to why this takes place.
To gain insight into the underlying reason for the prevalence of adders, we considered what might be exceptional in the cases where sizers occur. The best studied, and highly accurate sizer, is found in the fission yeast S. pombe (Fantes, 1977; Sveiczer et al., 1996). In contrast to budding yeast and human cells, where the size control takes place largely in G1 phase, fission yeast exerts size control later in the cell cycle at the G2/M transition. That size control took place later in the cell cycle in fission yeast suggested that this structural feature of its cell cycle might be responsible for its stronger size control. To test this, we inverted our seed network so that the G1 phase was a timer, while cell size control could evolve in S/G2/M (see Materials and Methods). We compare these results to the evolutionary simulations starting with the ‘control’ seed network with G1 size control and an S/G2/M timer (Figure 2). We note that S. pombe growth rates have been reported to deviate from exponential (Wood and Nurse, 2015). While slower than exponential growth would aid cell size control, our analysis here is restricted to exponential growth.
To determine how the seed network structure influences the subsequent evolution of cell size control, we performed 120 independent evolutionary simulations for the two network structures initialized with the Model A1 topology and parameters (Figure 4). Sixty simulations were performed using Pareto optimization and another 60 simulations were performed using individual fitness optimization based on the number of cell divisions . For each simulation’s most fit model after 500 epochs, we calculate the CV of the volume distribution at birth, , and the slope of the linear fit of the volume added in the entire cell cycle as a function of the cell size at birth, Slope (we remind the reader that a slope of –1 corresponds to a sizer, 0 to an adder, and 1 to a timer). We chose to use 500 epochs in our simulations because in our previous experience this was sufficient for networks to evolve to be near the optimum, but not so much that they were forced to extensively explore the effects of neutral mutations near the optimum. Model A1’s initial and Slope before evolution are indicated by the dashed black line. In the control experiment where G1 is a sizer and S/G2/M is a timer, most evolutionary simulations with two fitness functions (Pareto) yield models close to the adder regime with a low . However, when only the number of divisions () is used as a fitness, the evolutionary simulations are closer to the sizer regime, albeit with a slightly higher (Figure 4A, Welch’s t-test p<10–4). When the cell cycle structure is inverted so that G1 is a timer and S/G2/M is a sizer like it is in the fission yeast S. pombe, we found that more sizer-like networks evolve than in the control experiment (Figure 4B, Welch’s t-test on agglomerated data p=0.03). This shows that having a network structure like the fission yeast S. pombe promotes sizers, while having the size control portion of the cell cycle earlier, as in the budding yeast S. cerevisiae, promotes adders. Thus, performing simulations using cell cycle network structures of these two yeasts results in the evolution of size control mechanisms that reflect those that are naturally occurring.
The general notion that having cell size control in G1 results in adder-like mechanisms, while control later in the cell cycle results in more sizer-like mechanisms fits most observations of human cell lines grown in culture, budding yeast, and fission yeast. However, a recent study examining mouse epidermal stem cells growing and dividing in the skin found both a strong sizer and that this sizer was largely due to the size-dependent regulation of G1 (Mesa et al., 2018; Xie and Skotheim, 2020). This raised the question as to how a network performing size control in G1 could result in a sizer for the entire cell cycle. One important difference between mammalian cells grown in culture and the mouse epidermal stem cells growing in an animal (in vivo) is the change in the relative durations of the G1 and S/G2/M phases of the cell cycle. While the S/G2/M phase of the cell cycle is similar in duration in cultured cells and the epidermal stem cells in vivo at ~12 hr, the G1 phase extends ~fivefold from ~10 hr in culture to ~50 hr in vivo (Cadart et al., 2018; Xie and Skotheim, 2020). This suggests the hypothesis that the overall size control behavior can be dominated by the relatively longer cell cycle phase, as is likely the case for the G1 phase of epidermal stem cells. To test this hypothesis, we performed evolutionary simulations with size control in G1 and a timer in S/G2/M but where we changed the duration of the S/G2/M timer phases of the cell cycle to be significantly shorter or longer than the G1 phase at equilibrium. When a timer in S/G2/M is relatively shorter compared to G1, we generally see more sizer-like behavior can evolve (Figure 4C, Welch’s t-test on agglomerated data p=4 x 10–3), while when it is relatively longer, we see more adder-like or even timer-like behavior (Figure 4D, Welch’s t-test on agglomerated data p<10–4).
We next considered the effect of changing the amount of noise in the timer phase of the cell cycle. To do this, we examined the evolution of networks performing size control in G1 and where the S/G2/M phase with an increasing amount of noise. Increasing the noise in the timer progressively reduced the amount of size control done by the network (Appendix 1—figure 5). This is likely because the fixed duration of S/G2/M allows the system to accurately reset protein concentrations for the subsequent cell cycle to promote accurate G1 control (Willis et al., 2020). We also examined the effects of adding noise to the cellular growth rate and to volume partitioning at division and found similar results (Appendix 1—figures 6–7).
Taken together, our simulations show how the structural features of the cell cycle are important for determining what type of size control ultimately evolves. G1 control is more conducive to the evolution of adders, while S/G2/M control is more conducive to sizers. Moreover, size control can be dominated by the cell cycle phase of longest duration and is modulated by the specific selection criteria.
A two-step evolutionary pathway for cell size control
From our series of evolution simulations, we found a somewhat paradoxical inverse correlation between the added volume slope quantifying the degree of size control (sizer vs. adder) and the (Figure 4B–C). This was surprising because it means that there is a broader distribution of volumes in the sizer regime where control should be more effective in theory. To better understand this inverted correlation between Slope and , we revisited our evolutionary simulations to examine the evolutionary pathways through which the networks progressed through the simulated epochs.
A typical evolutionary pathway for a Pareto simulation of the S. pombe-like network structure presented in Figure 4B is shown in detail in Figure 5. Here, Model A1 topology is conserved throughout evolution although individual parameter values change. In the early stages of the evolution (epoch 650), we typically see dynamics where small cell size triggers an overshoot to a large cell size, which is then reduced through a series of rapid divisions (Figure 5A). Because of this small-size-triggered overshoot, the system behaves more like a sizer when the average behavior is analyzed (Figure 5B). However, the high degree of variability in the cell cycles also leads to a broad distribution of volumes (Figure 5C). The variability in volume is attenuated in later epochs where there are fewer and smaller volume overshoots triggered by small cell size (Figure 5D–I). Since small cell size no longer triggers a dramatic amount of cell growth, the slope of is increased and the system converges towards a weak adder in which the distribution of volume at birth is more Gaussian and the is lower (Figure 5H–I). Taken together, these analyses suggest a two-step evolutionary pathway, consistent with the evolutionary dynamics first seen in Figure 2A. First, a strong but imprecise sizer mechanism evolves where, because of noise in the system, small variations in volume lead to a dramatic overcorrection and overshoot of the target volume. The variability in volume produced by this overshoot is then reduced by attenuating the strength of the size control response. Indeed, the overall weaker size control allows the system to respond more mildly to size deviations, thus yielding a lower overall which we select for. Thus, selecting for a smaller , i.e. better size control, can end up selecting for adders rather than sizers. This paradoxical result is consistent with the fact that when we select only for the number of cell divisions (), one sees that more sizer-like behavior can evolve (Figure 4A–C). The typical behavior before optimization of is a strong sizer as illustrated in Figure 5A.
Fluctuation sensing and the evolution of self-organized criticality
One of the main features of smaller cells is that they have fewer proteins and mRNA. If some aspects of protein synthesis and degradation are subject to Poisson fluctuations, we expect such fluctuations to produce larger concentration fluctuations in smaller cells. For example, let us assume that the balance of synthesis and degradation of a generic protein results into a Poisson distribution with parameter , where is the synthesis rate in number of proteins per unit of time for a reference volume of 1, is the volume, and is the degradation rate. The average concentration of this protein in an exponentially growing cell will be , which is independent of the cell volume as expected from the production rate scaling. However, following the Bienaymé formula, the variance in the concentration is (Figure 6A), which decreases with volume. This result makes intuitive sense because bigger cells have to produce more proteins to keep concentrations constant, so that the fluctuations in the relative number of proteins (and thus concentration) are smaller (see Jia et al., 2021 for a complete analytical study of how in general variance scales differently from mean when volume varies). Thus, if the cell could sense the size of concentration fluctuations in some way, it would be able to harness the cell size-dependence of such stochastic fluctuations to regulate cell division and control cell size.
To test if we can evolve networks that control cell size through sensing Poissonian fluctuations, we first initialized our simulation with a network similar to that shown in Figure 1C, but with an added self-activating gene that can activate the production of the inhibitor. We then ran evolutionary simulations using the cell cycle structure of a sizer controlling G1 and timer in S/G2/M, but where the G1/S transition is regulated by the concentration instead of its quantity. We also used Pareto fitness optimization of and of . Importantly, we use the stochastic version of our equations with the molecular noise modeled using a Langevin noise term with a variance inversely proportional to the volume as explained above. We then extracted the most fit network and optimized it further using Pareto optimization of and of the slope to push the models towards the sizer regime. The most fit network of those combined evolutionary simulations is presented in Figure 6B and demonstrates that we can indeed evolve fluctuation-based cell size control (Model B).
The mechanism for size control that evolved based on size-dependent fluctuation sensing is remarkably similar to what we observed for models without size-dependent fluctuations (Figure 5A). For large volumes, the cycle has a constant period which corresponds to approximately 85% of the doubling time . This ensures that in the high-volume regime, the system shrinks over time. When the volume is small however, fluctuations allow the concentration to cross the threshold of the highly non-linear transcriptional activation of by . This results in a massive increase of that needs to be degraded to progress further into the cell cycle. Thus, the low volume regime occasionally leads to a considerable increase in G1 length and a correspondingly very large cell at division. These very large cells then reliably and deterministically re-enter multiple, rapid cell cycles with short G1 until the cell is small again and the concentration fluctuations again become large enough to trigger the activation of by . This mechanism thus appears very similar to the early sizer mechanism observed in other quantity sensing simulations shown in Figure 5. However, here the mechanism is based only on size-dependent fluctuations in protein concentration and the overall behavior is closer to an adder (Figure 6E).
The system dynamics that evolved to perform fluctuation-based cell size control produce volume distributions that are long-tailed due to the stochastic occurrence of occasional exceptionally long G1 phases. Interestingly, the probability distribution of cell cycle durations follows a power law (Figure 6F), which is due to the very broad distributions of G1 duration at lower cell volumes. A more controlled analysis specifying the initial conditions showed that cell cycles get increasingly long, and their distributions widen with decreasing control volume (Figure 6G). Such non-Gaussianity is the hallmark of critical behavior, suggesting that the evolution of fluctuation-based cell size control is based on self-organized criticality (SOC). SOC is defined as a system where an order parameter feeds back on a control parameter (Sornette et al., 1995; Vidiella et al., 2021). The canonical example of SOC is the sandpile to which grains of sand are added on top. As the sand accumulates, the slope steepens, and the angle of the pile (control parameter) increases. Eventually, this triggers avalanches (order parameter) that feedback to dramatically reduce the angle of the pile. This ensures that the system dynamically tunes itself at the critical value of the angle of the pile where avalanches can occur.
We conclude that our evolved size control network exhibits SOC based on several observations. Starting from a high volume, multiple divisions at a rate faster than it takes to double the biomass reduce cell volume just like the addition of grains of sand gradually increases the slope of the pile. Then, for small enough volumes, bursts of drive an extended G1 that greatly increases cell size, which, like the sandpile avalanches, resets the system’s control parameter (volume of the cell or angle of the sandpile). Interestingly, evolution tuned the system to be near a bifurcation (Figure 6I). If we consider the deterministic regime, in which the fluctuations in are small, the cycle is unperturbed and oscillates with a period roughly equal to 85% of the doubling time (Figure 6J). In contrast, if we consider the noisy regime, in which the fluctuations in are large, the cycle disappears, and the system stays locked in a prolonged G1 state with a high value of which is akin to a bifurcation destroying the cycle (Figure 6K). This bifurcation takes place because the position of the G1 attractor for becomes larger with increasing and eventually overcomes the concentration required to induce the stochastic G1/S transition (Figure 6H). Then, the system remains stuck in a state where G1/S cannot be triggered, and cells effectively exit the cycle. As growth occurs, noise dies down and so does the position of the G1 attractor, eventually becoming smaller than the concentration required to induce the G1/S transition which allows cells to re-enter the cell cycle. Thus, the system is critical from a dynamical systems standpoint and also fits the general observation that SOC systems tune themselves to be right at the point where the order parameter is non zero, but infinitesimal (Sornette et al., 1995). In our case, the bifurcation corresponds exactly to the point where can sufficiently activate the production of to prevent the G1/S transition.
Discussion
The last decade saw an explosion of time lapse microscopy studies measuring how cells control their size. These studies revealed diverse phenomena that are characterized by the correlation between cell size at birth and cell size at division. Size control ranged from sizers, where the size at division is uncorrelated from the size at birth, to adders, which add a constant volume in each cell division cycle, to timers, whose cell cycle duration is size-independent (Cadart et al., 2018; Eun et al., 2018; Jun et al., 2018; Willis and Huang, 2017; Wood and Nurse, 2015; Zatulovskiy and Skotheim, 2020). The presence of these diverse phenomena raises the question as to why the underlying control networks evolve one rather than another type of cell size control?
To explore the evolution of cell size control networks subject to distinct selection pressures, we used computational evolution simulations. We initially examined the evolution of a seed cell cycle model consisting of G1 and S/G2/M phases of similar duration, where the G1 phase was free to evolve size-dependence, but the S/G2/M phase was constrained as a timer. Our simulations reliably evolved a control mechanism based on a Mixed Feedback Loop (François and Hakim, 2005, Figure 2). This network is centered on a cell cycle regulator () that inhibits the G1/S transition in proportion to its quantity. is titrated away into an inactive complex by an increasing amount of another protein that is synthesized in proportion to cell size. This results in a size-dependent decrease in the effective cell cycle inhibitor (free ). Thus, our evolved network implements an effective dilution of a cell cycle inhibitor that is conceptually similar to the well-described inhibitor dilution models of budding yeast, human cells, and Arabidopsis plants (D’Ario et al., 2021; Schmoller et al., 2015; Xie and Skotheim, 2020; Zatulovskiy et al., 2020). We note that we did not allow the synthesis of our proteins, such as , to be size-independent as has been found for budding yeast (Chen et al., 2020; Schmoller et al., 2015; Swaffer et al., 2021) as this could result in a one-step implementation of cell size control through the pure dilution of a cell cycle inhibitor. It is therefore interesting that given the constraint that all proteins be made in proportion to cell size, the network still evolved an effective ‘dilution’ of the active form of the cell cycle inhibitor molecule .
Our evolution simulations gave insight into factors that bias evolution towards sizer or adder type control mechanisms (Figure 4). First, it is worth noting that our evolution simulations were not deterministic. There was no one-to-one correspondence between a given evolutionary pressure and any one specific cell size control mechanism. Rather, our claims represent an average behavior observed over the course of many simulations. Size control, as measured by the CV at a particular point in the cell cycle, has contribution both from the slope of the correlation between cell size and the amount of cell growth, and from the amount of noise characterizing the differences between cells that are initially the same size (Di Talia et al., 2007). It is therefore possible that a low noise adder can produce a lower CV than a higher noise sizer. This is reflected in the evolutionary paths of some of our simulations, which traverse from a noisy sizer to a less noisy adder (Figure 5). However, we anticipate even noisy sizers will be better than adders at controlling cell size in response to large deviations away from the steady state distribution. This is because sizers will always return the cell size to be within the steady state distribution within a cell cycle. We note that these generic results of how sizers and adders can govern cell size homeostasis can be derived from more traditional analytical methods (Barber et al., 2017; Willis et al., 2020). However, our evolution simulations are particularly useful because the molecular networks that evolved give non-trivial insights into how the observed size homeostasis dynamics can be regulated (e.g. via a Mixed Feedback Loop or using a system close to criticality). They are also suggestive of evolutionary pathways: despite different evolutionary modalities and control types, a natural step in many of our simulated evolutions is a system with strong sizers at very small volume only (Figures 5 and 6). This is practically reflected in a strongly negative slope on the very left side of the plots, and a positive slope at higher volume corresponding to timers (Figures 5B and 6E). Similar non-monotonicity of has been identified in models of various realism and complexity (Chandler-Brown et al., 2017; Delarue et al., 2017) and we provide here an evolutionary explanation for such an effect. We thus predict that this will be observed in systems where CV at birth does not need to be tightly controlled.
In the selection of a size controlling G1 network followed by a timer in S/G2/M, we observed a prevalence of adders that is consistent with the prevalence of adders reported in the literature. While fewer in number, sizers have also been observed. That the most accurate sizers have been observed in the fission yeast S. pombe (Fantes, 1977; Sveiczer et al., 1996; Wood and Nurse, 2015), and that this organism performs cell size control at G2/M rather than at G1/S led us to explore the effect of cell cycle structure on the evolution of cell size control. We found that controlling cell size later in the cycle in S/G2/M biases evolution away from adders and towards sizers. In retrospect, this result can be rationalized since any size deviations incurred earlier during the timer period can be compensated for by the end of the cycle with the sizer. However, when the order is inverted, any size deviations escaping a G1 control mechanism would only be amplified by exponential volume growth during the S/G2/M timer period. A second recent case exhibiting sizer control was found in mouse epidermal stem cells, which exhibit a greatly elongated G1 phase and a relatively short S/G2/M phase (Mesa et al., 2018; Xie and Skotheim, 2020). We found that if we increased the relative duration of G1 in our simulations by shortening the S/G2/M timer, we also see a bias towards sizer control. In essence, by extending G1 to a larger and larger fraction of the cell cycle the control system is gradually approaching a size control taking place at the end of the cell cycle, that is, an S/G2/M size control. Taken together, these simulations suggest the principle that having size-dependent transitions later in the cell cycle selects for sizers, while having such transitions earlier selects for adders.
In addition to identifying cell cycle structural features that canalize evolution towards sizers and adders as described above, we also observed an intriguing mechanism relying on molecular fluctuations. In this case, small cells would trigger an abnormally long G1 that would result in very large cells that decrease in size through a series of rapid cell divisions. This type of size control is reminiscent of that found in the green algae Chlamydomonas where a series of rapid, size-reducing cell divisions cease when cells go below a target size (Heldt et al., 2020). In our case, small size results in larger concentration fluctuations due to Poisson noise in the number of molecules. These concentration fluctuations, when large enough, are then able to trigger a burst of G1/S inhibitor that leads to an extended G1 phase and massive cell size growth before another series of rapid cell divisions is initiated (Figure 6). Interestingly, the system thus performs statistical size control over many generations. Intriguingly, this size control mechanism exhibits hallmarks of self-organized criticality (SOC). Just like adding grains of sand to a pile eventually triggers avalanches, the consistent decrease of cell size in the rapid division cycles eventually triggers a greatly extended G1 phase. To our knowledge, this is the first example where self-organized criticality is obtained in artificially evolved models of gene networks performing a well-defined function and is consistent with the idea that evolution of complex systems can favor the emergence of critical processes.
Materials and methods
Mathematical formalism
Request a detailed protocolTo model gene networks, we follow a standard ODE based formalism, where we simulate dynamics of the concentrations of proteins. We use Hill functions for transcriptional interactions, and standard mass action kinetics for protein-protein interactions. We also assume that all proteins are degraded at a constant rate. For most of the simulations presented in the paper, we use deterministic ODEs for simplicity. Importantly, cell-to-cell variability arises from the precise timing of cell cycle progression events. This allows for a natural way to generate noise on cell volume that should then be compensated for by the evolved network. In the last part of the paper, we explicitly include Langevin noise for biochemical reactions that are modeled using a classical tau-leaping formalism (Gillespie, 2007). Thus, each biochemical reaction takes place with a rate that corresponds to the deterministic rate, to which we add one white Gaussian noise with a variance equal to that rate. For example, given a deterministic biochemical rate and a time interval of size , we consider a tau-leaping change of where is a random gaussian variable of mean 0 and variance .
Volume influences protein dynamics in three ways. First, protein production rates are generally proportional to cell volume so that proteins reach and maintain a constant concentration that is independent of the cell volume (Chen et al., 2020; Elliott and McLaughlin, 1978; Newman et al., 2006; Swaffer et al., 2021). We note that we are not allowing the cell to employ proteins such as Whi5 in budding yeast whose production is independent of cell size so that its concentration is a direct readout of cell size (Schmoller et al., 2015; Swaffer et al., 2021). We chose to do this because we want to explore how cell size control can be done by a network with multiple feedbacks rather than just the concentration of a single protein with a special dedicated synthesis mechanism. Thus, the only deterministic influence of volume on concentration dynamics is on the dilution rate, which is proportional to the cell growth rate (see details in Appendix 1). At cell division, we also assume that proteins are equally partitioned between the daughter cells, that is, the concentration is the same before and after division. Note that we scale all our variables so that a concentration of one arbitrary unit corresponds roughly to 1000 proteins in a 100fL cell (Milo et al., 2010). Additionally, we scale the time variable so that 1 arbitrary time unit corresponds roughly to 30 min (Di Talia et al., 2007).
In this study, we chose a hierarchical way of introducing noise in the system, starting with the biggest contributing factor and incrementally adding additional sources of noise in subsequent analyses. All simulations presented include noise (stochastic control of G1/S transition and timing of S/G2/M, see below) in the cell cycle phases, whose CV has been found to be as high as 50% (Di Talia et al., 2007). Then, we introduced protein production noise via Langevin noise because the CV of regulatory protein concentrations is typically 20–30% (Newman et al., 2006). Importantly, the cell volume also contributes to stochastic effects, which are larger in smaller cells with fewer molecules. Thus, for stochastic simulations, we include a multiplicative contribution to the added Gaussian noise term (see more complete description in the Appendix 1).
We also checked that our results are largely invariant when adding other sources of noise (see Appendix 1—figures 5–7). In these simulations, we also included noise in cell growth rate (CV ~15%; e.g. Di Talia et al., 2007), and in mass partitioning at cytokinesis (CV ~10%; e.g. Zatulovskiy et al., 2020).
Evolutionary procedure
Request a detailed protocolTo evolve networks regulating cell size, we use the -evo software (Henry et al., 2018) with a modified numerical integrator accounting for volume dynamics and volume dependencies as described above (Figure 1B). -evo simulates the Darwinian evolution of a population of gene networks. A network is encoded with the help of a bipartite graph connecting biochemical species (typically proteins) and interactions between them (we use a custom-made Python library). Networks are converted into an ensemble of stochastic differential equations using a Python to C interpreter. This code is then compiled and integrated on the fly to compute the behavior of the networks.
Each selection step in the algorithm is referred to as an epoch rather than the more commonly used term generation because we use the term generation to refer to cell divisions in the simulations. At each epoch of the algorithm, each gene network is simulated, and its fitness computed. Based on the fitness function(s) (see below), half of the networks are selected and duplicated, while the other half is discarded to maintain a constant population size. The duplicated networks are then randomly mutated. From the most to least probable, mutations consist in random changes of parameters of the network, random removal of interactions, and random additions of interactions or new proteins. Absolute mutation rates are adjusted as a function of the number of evolutionary epochs so that all networks in a population are mutated on average once per epoch. This implements a numerical equivalent of the biological Drake’s rule that mutation rates adjust with genome size (Lynch, 2007). Practically, this prevents the known phenomenon of code-bloating in evolutionary simulations (Foster, 2001) and also means that the total number of epochs is a good proxy of the number of mutations (in random directions) needed to evolve the best networks. All of this is easily made with our customized Python library encoding networks. For more details on technical aspects and implementations of the -evo software, we refer the reader to Henry et al., 2018.
Realistic evolutionary processes select for multiple phenotypes in parallel. While trade-offs between those phenotypes are non-trivial, it has been observed that phenotypes typically define an evolutionarily Pareto front (Shoval et al., 2012; Warmflash et al., 2012). We thus perform network selection using a Pareto mode (Warmflash et al., 2012), in which two distinct fitness functions are computed. During the selection step, networks are first Pareto ranked. For example, consider two networks A and B and two fitness functions f1 and f2. f1A refers to the fitness of network A calculated with function f1. Assuming fitness functions are to be maximized, we say network A Pareto-dominates network B if both f1A > f1B and f2A ≥ f2B. Rank 1 networks are networks which are not dominated by any other networks, Rank 2 networks are networks dominated only by Rank 1 networks, and Rank 3 networks are only dominated by Rank 1 and Rank 2 networks and so on. The algorithm then selects half of the population of Rank 1 networks using a fitness sharing algorithm to maximize population diversity (see details in Warmflash et al., 2012). One advantage of Pareto selection is the increased flexibility of the evolutionary process. Multiple fitness functions can provide different optimization paths in parameter space, which prevents the selection process from getting stuck in a local optimum of a single fitness function. We also perform a few simulations with only one fitness function, in which case networks are simply ranked based on their fitness.
We impose two evolutionary selection pressures in the form of two fitness functions. The first fitness function is simply the number of cell divisions during a long period, which we call . This is consistent with the classical definition of fitness as optimizing the number of offspring and is to be maximized by the algorithm. The second fitness function is the coefficient of variation of the volume distribution at birth for those generations, which we call and is to be minimized by the algorithm. This penalizes broad distributions of volume at birth, which are detrimental to cell size homeostasis, which is what we aim to examine here. We further imposed fitness penalties to prevent way too small or too big cells, see Appendix 1. There, we also study alternative fitness functions, such as least-square residual function to minimize volume variation about a target size, and the fitted slope of the amount of volume added at each cycle to be minimized to drive models toward being a sizer.
Varying cell cycle structure
Request a detailed protocolWe also ran evolutionary simulations with different cell cycle structures. For evolutionary simulations where the G1/S transition was controlled by the concentration of , we simply change the probability to pass the G1/S transition to depend on concentration instead of its quantity . For evolutionary simulations with a cell cycle structure similar to that found in the fission yeast S. pombe, we invert the cell cycle network structure. In this case, quantity controls division and the Switch is turned on for a fixed amount of time in G1. In terms of the relaxation oscillator, this means that the left branch is now S/G2/M and the right branch is G1.
Appendix 1
In section Mathematical implementation, we first describe how to modify deterministic and stochastic ODEs to account for a exponentially growing cell volume. In section Size control, we describe the three size control archetypes, namely the timer, the adder, and the sizer, as well as our implementation of the initial seed cell cycle model. Then, in section Evolutionary algorithm, we give details on the -evo evolutionary algorithm. In section Analysis of sources of noise, we present evolutionary simulations investigating the effects of noise on the evolved networks. Finally, in sections Model descriptions and Additional models, we provide parameter values and equations for the models presented in the main text and for additional models produced by our evolutionary simulations.
Mathematical implementation
Deterministic
All biochemical concentrations can be described as a quantity of molecules of a biochemical species divided by the volume that contains it. The fundamental equation describing all concentrations is thus:
Here, is the quantity of molecules of an arbitrary biochemical species as a function of time and is the volume of the cell containing said species as a function of time. We will use the bracket notation for concentrations. In this project, we will model all biochemical species directly at the concentration level and assume proteins are uniformly distributed in an exponentially growing cell volume. The absolute growth rate of the cell is chosen to be constant over a large viable range of cell volume and is otherwise 0. Consequently, volume grows over time with a fixed absolute growth rate over a viable volume range following the equation:
Deterministic rate equations describing the dynamics of the biochemical species at the concentration level have to be adjusted to take into consideration a time-varying volume. From Equation 1 and the derivative chain rule, we get:
which we can combine with Equation 2 to give:
The first term in the Equation 3, , corresponds to the usual biochemical reaction rates that occur when the volume of the cell is fixed. The second term, , is a dilution term that we can interpret as an effective degradation of the concentration due to the exponential growth of the cell volume over time.
To further simplify the expression for , we make an assumption about the production rates of all biochemical species in our models. Let us consider the rate equation describing the dynamics of the quantity of an arbitrary protein with generic production rate and degradation rate contained in a fixed cell volume .
Here, is given by:
We assume that all the proteins in our models are constitutively expressed by the cell. In other words, the production rates are linear functions of the cell volume . This ensures that the protein production rates scale with the volume such that concentrations stay constant over time which is a general feature of most proteins in S. cerevisiae (Chen et al., 2020; Newman et al., 2006; Swaffer et al., 2021). This yields:
Taken altogether with Equation 3, the deterministic dynamics of a constitutively expressed arbitrary protein concentration contained in an exponentially growing cell volume are given by:
Stochastic
To simulate molecular noise, we follow a classical tau-leaping formalism (Gillespie, 2007). Specifically, we choose the Euler-Maruyama implementation to generate approximate solutions to stochastic differential equations (Kloeden and Platen, 1992).
As we have done before in the deterministic case, let us first consider the quantity of an arbitrary protein in order to extract the equation for the concentration . Let’s assume is changing via a single biochemical reaction rate over the time interval given . We will show later how this approach can be generalized to include multiple reaction rates. We begin by partitioning the time interval in equal segments of length such that with , .
The Euler-Murayama approximate solution to the stochastic differential equation at the discrete time points tn is then recursively given by the following equation for where the single biochemical reaction is assumed to happen with a Poisson rate (corresponding to the deterministic rate), which adds one white Gaussian noise to the differential equations with a variance equal to that rate:
Here, , is the drift term, is the diffusion term and is a random Gaussian variable of mean 0 and variance 1. In other words, is a random Gaussian variable of mean and variance . Since this describes the quantity of proteins, we also have to consider the change in volume over time to recover the equation for the concentration of proteins, . Thus, let’s consider the volume of the cell at the discrete time points tn, which is given recursively by the equation:
Here, we define and recover the term from Equation 2. We assume that the volume time evolution is noiseless for simplicity. To recover, the differential equation describing the protein concentration, we evaluate the expression . Thus, combining Equations 5 and 6, we get:
where we identify as the rate equation describing the concentration of the consitutively expressed protein when the volume is fixed as described in the deterministic case. Then, we compute the derivative as:
Finally, we recover the approximate full differential equation for the protein concentration by expanding the prefactor in the last equation to the order in assuming it to be small to give:
So far, we have assumed that there is only a single biochemical reaction rate . We can however easily generalize our approach to include additional reaction rates by summing up the contribution of each rate to the total differential equation. Given independent reaction rates and the properties of random Gaussian variables, we can easily generalize:
Importantly, the are Gaussian vectors accounting for the noise correlations associated with single reactions. For instance, imagine one protein turns into another protein , then the corresponding Gaussian vector for this interaction takes the form where is vector of length corresponding with the number of variables in the system whose -th component is equal to 1 with 0s elsewhere. This indicates that the molecular fluctuation due to this reaction should have opposite signs for and as expected.
The term is a dilution term that corresponds to an effective degradation of protein concentration as seen in the deterministic case. Interestingly, we highlight the dependency in the noise term. We can understand this dependency intuitively by considering a protein production process with a Poisson parameter . In this scenario, the mean and the variance of the protein quantity distribution is given by the parameter . Going back to concentration space, there are an infinite number of combinations of protein quantity and volume that can give the same concentration . Thus, we need to specify both the protein number and the volume to correctly model the molecular noise contributing to fluctuations in concentrations.
Size control
Initial seed network
To guide the evolutionary process, we begin with an initial seed network. We base our first seed network on the phenomenology of the budding yeast S. cerevisiae’s cell cycle, where cell size primarily regulates the timing of the START transition in late G1 (Schmoller et al., 2015). This regulation allows small daughter cells to delay the G1/S transition allowing them to catch-up in size by extending the G1 phase. S/G2/M duration on the other hand is largely independent of cell size. We note that while budding yeast divide asymmetrically, our simulated cells divide symmetrically. In our simple initial seed network, the cell cycle consists of two phases, G1 and S/G2/M, which respectively denote the pre-G1/S and post-G1/S phases of the cell cycle. The transition between these two phases is controlled by the level of a transcription regulator we call . Like the Whi5 protein in S. cerevisiae, is an inhibitor of the G1/S transition such that the lower its level, the higher the chances of cell cycle progression. Since protein production rates were assumed to be dependent on volume (as described in section Mathematical implementation), we found that ’s concentration alone was largely independent of volume and could not trigger a size-dependent G1/S transition as Whi5 does in budding yeast. Thus, we chose the quantity of defined as as the control variable for this transition. We chose to model the probability of the G1/S transition occurring at the next time point of the simulation with a sigmoid-shaped curve given by Equation 9 that can be visualized in Appendix 1—figure 1. We maintain and fixed throughout this project. We chose these values because they give a similar amount of noise in the G1/S transition as observed experimentally (Di Talia et al., 2007; Chandler-Brown et al., 2017).
In our seed model, we encode cell cycle state using a binary variable S/G2/M Switch, which is 0 in G1 and turns to 1 in S/G2/M once the G1/S transition takes place. Following S. cerevisiae’s cell cycle structure where S/G2/M duration is independent of cell size, we fix S/G2/M duration to be of the doubling time with uniform noise unless stated otherwise. This way, cells can tune the length of their cell cycle by adjusting G1 length while being constrained by the incompressible length of the timer in S/G2/M, .
Following S/G2/M, cells divide such that with the division fraction. The exponent here is referencing the -th generation in the cell lineage. We choose for all simulations performed in this study unless explicitly mentioned otherwise. We assume perfect partitioning of all proteins between the two daughter cells such that the proteins’ concentrations remain the same before and after division. After division, we follow one of the two daughter cells during their own subsequent cycle. If we simulate the cell lineage for a long time, ergodicity guarantees that all volume states will be visited given stable growth and we can extract population statistics from the lineage data itself. Here, cell growth is exponential on the single cell level since we were assessing size control mechanisms that take size as an input to cell cycle control. We are not exploring the very interesting case where growth deviates from the exponential. In that case, size homoeostasis would have a contribution from some cells in the population outcompeting others in terms of their growth and we would have to simulate the entire cell population and not disregard one of the daughter cells as we do here.
Inspired by the dynamics of Whi5, which is produced in S/G2/M and diluted in G1, we chose an initial seed network where S/G2/M Switch activates the transcription of the inhibitor. This ensures that the concentration is ‘reset’ to a higher value following S/G2/M and prevents cells from skipping entirely the G1 phase of the subsequent cycle which would quickly send the volume of the cell converging quickly towards 0. We note that this interaction systematically appeared anyway in our early evolution simulation so we chose to include it in the initial seed network to accelerate the evolutionary process.
Size control archetypes
Size control mechanisms are often compared to three well-characterized models or archetypes in order to quantify the strength of the size control mechanism under study. Specifically, there are timers, adders and sizers. In this subsection, we will define each archetype and show that we can summarize them via a control volume response curve as described in the main text.
First, let’s consider the timescale of growth. Given , the solution of the volume Equation 2 is . From this equation, we can easily recover the doubling time defined as the time required to double a cell’s volume () which yields:
The doubling time is only a function of the growth rate . Since we are only considering symmetrical division events, fixed interdivision times shorter than the doubling time will yield progressively smaller daughter cells. Similarly, fixed interdivision times longer than the doubling time will yield progressively larger daughter cells. Thus, the absolute growth rate sets the timescale for cell cycle dynamics if we want to simulate a stable cell lineage.
With the quantity sensing of the inhibitor I at the G1/S transition (see Equation 9), we find that the instantaneous volume at G1/S sets the concentration of the biochemical species for the rest of the cycle and until the next G1/S transition in the daughter’s cell cycle. Consequently, it regulates the timing of the G1 phase of the daughter cell and thus creates a return map for the volume at G1/S. We find that the volume at G1/S at the -th generation is given recursively by:
To study the mechanisms of cell size control, we choose to define a useful new variable: the control volume . This control variable is independent from the biochemical network and maintained fixed allowing us to break the size feedback, and distinguish its input, the volume , from its output, the induced cycle period . With this new variable, we can modify the control at G1/S by forcing the transitions to trigger once is low enough. We can then extract the response curve of the system, that is, the cell cycle period induced from sensing this control volume at G1/S . We represent this process schematically in Appendix 1—figure 2A.
The control volume at which the response curve is equal to the doubling time corresponds to the equilibrium volume, , for this network. Indeed, if , then the cell cycle length will ensure that this cell exactly doubles its volume during its cell cycle and returns to the same at the next generation. This volume is a fixed point of the volume return map and can be either stable or unstable. Theoretically, there could be size control mechanisms with multiple fixed points of the response curve, but practically we have not seen this emerge from any of our evolution experiments and therefore assume that is unique. In the main text, we have substituted by the average value of the volume at the time where volume is sensed as both of these values are essentially identical. This corresponds to for models with a sizer in G1 and a timer in S/G2/M and for networks with a timer in G1 and a sizer in S/G2/M.
Size variation naturally occurs in our models due to the precise timing of G1 and S/G2/M cycle phases which are both noisy, so the volume does not stay at for very long. The stability of growth around this equilibrium volume however will depend on the sign of the local derivative with respect to control volume of the response curve and we will consider the following three cases:
is strictly increasing with . In this case, small deviations around are amplified over successive generations and the volume quickly shrinks to 0 or explodes to . In this case, we say that is an unstable fixed point of the response curve.
is constant with . In this special case and assuming exponential growth of the volume, the only stable mode of growth corresponds to the response curve . This corresponds to the only stable timer archetype. In this particular scenario, is not well defined as there are an infinite number of volumes where the response curve intersects the doubling time.
is strictly decreasing with . In this case, cells correct for size deviations over successive generations and perform size control. In this case, we say that is a stable fixed point of the response curve.
Here, we found the curves that were selected by the evolutionary algorithm were all decreasing with as expected for stable size control mechanisms. Thus, for the remainder of this document, we will assume that is uniquely defined and corresponds to a stable fixed point of the control volume response curve.
Timer
The timer archetype describes mechanisms that monitor time rather than size. If the cycle duration of the timer is tuned precisely to the doubling time , cells will double their mass over the course the cell cycle to ensure that newborn daughter cells have the same volume at birth as their mothers did when they were born. Consequently, this category of mechanisms is notoriously bad at correcting for size deviations given exponential cell volume. If growth was linear however, this mechanism would allow for size control to take place.
Adder
The adder archetype describes mechanisms where cells add a constant amount of cell volume during each cycle. We define the increment of added volume between birth and division, that is . For adder mechanisms, the added volume at each cycle is constant and does not depend on cell size. In this case, initial size deviations are reduced by a factor of 2 at each division such that the volume at birth geometrically converges to the added volume over successive generations. To recover the adder response curve , we consider that by definition, . From this equation and the definition of the adder, we can recover the cycle period :
We would like write this equation as a function of the control volume at G1/S and the equilibrium volume alone. Assuming that the G1/S transition is followed by a timer in S/G2/M, we can write . Similarly, since we know by definition that , we can recover that the added volume increment is . Finally, we can combine these two expressions with to recover the final expression of the adder response curve:
We note here that if the control volume was measured at division or at birth, the response curve of the adder would be unchanged. The only difference would be that both and would correspond to volumes at division or birth volumes instead of volume at the G1/S transition. Here for example, the volume increment corresponds to the at birth.
Sizer
The sizer archetype describes mechanisms that measure size directly and allow a cell to return to a target volume after a single generation, irrespective of how big or small a cell was initially. For sizers, by definition.
We can then extract:
We can then write and as a function of the control volume at G1/S and the equilibrium volume . Using the same definitions as before and , we find that . The sizer response curve thus follows:
We note again here that if the volume was measured at division or at birth, the equation for the sizer would be identical with the only difference being that the control volume and the equilibrium volumes would correspond to division or birth volumes respectively.
The three archetypes’ response curves are shown in Appendix 1—figure 2B. From these curves and given the particular cell cycle structures we examined, we can extract multiple relevant measures of size control such as the volume at birth, G1/S, and division from which we get the added volume during each phase of the cell cycle. It is noteworthy that the derivative of the added volumes with respect to the birth volume for the timers, adders, and sizers, are respectively 1, 0, and –1. Size control mechanisms are typically compared to the 3 archetypes by measuring the amount of added cell volumes over their cell cycles and then fitting a linear model to these data points. The fitted slope of the linear model then informs what archetype this particular mechanism is more akin to. Some models evolved with added volume slopes lower than –1 and we call those super-sizers. Such mechanisms overcompensate for volume deviations about the equilibrium value which can increase variation in the size distribution instead of decreasing it.
Response curve of the seed networks
In light of the control volume and response curve definitions from subsection 2.2, we can revisit the initial seed model and investigate how different assumptions alter the stability of growth and division in a cell lineage. Specifically, we investigate the size scaling assumption of the protein production rates and the concentration vs. quantity sensing of the transcriptional regulator at G1/S as previously described in Section 1. We summarize our results in Appendix 1—figure 3.
First, let us consider the G1 trajectory of the transcriptional regulator who is solely produced during the S/G2/M timer. The dynamics of in G1 will be described by the following equation:
Here, the time variable represents the time since birth, , is the protein’s degradation rate and is the growth rate of the cell volume. This equation holds until the G1/S transition where the S/G2/M Switch is turned on again and G1 ends. Because the degradation of the inhibitor does not yet depend on volume in any way, the time spent in G1 will only be dependent on the ratio between: (1) the initial condition at birth ; (2) the final condition at the G1/S transition, .
We found that the size scaling assumption of the protein production rates influences the initial condition at birth . When production rates scale with size, we find that is independent of volume. This is expected as this assumption was chosen specifically to model proteins whose concentrations are independent of size. Conversely, when we modify this assumption and consider that protein production rates are independent of size (e.g. like Whi5 in budding yeast), we find that the system produces a constant quantity of inhibitor instead of a constant concentration. This means that the initial concentration at birth scales as .
Similarly, when imposing quantity sensing of I at G1/S, we found that the concentration of inhibitor at G1/S scales as . Finally, when imposing concentration sensing of I at G1/S, we found a constant concentration of inhibitor at G1/S as was expected by design.
Together, those assumptions alter the scaling of the duration of the G1 phase of the initial seed cycle. We summarize these results and present the models’ response curves in Appendix 1—figure 3 where each row and column corresponds to a specific combination of assumptions. There, in Appendix 1—figure 3A, we see that for the combination of size-scaling production rate and quantity sensing at G1/S, we get a cell cycle period that is increasing with control volume . This is undesirable and leads to unstable growth of the cell lineage towards 0 or , but rewards the evolution of size control mechanisms that can prevent this unstable growth. We chose this initial seed model for most of our evolutionary simulations. In Appendix 1—figure 3B,C, we found that the two assumptions compensated each other to create size-independent timer models. The parameters of the network can be precisely fine-tuned to yield a response period of exactly as was done to produce the response curves shown here. Thus, it is technically possible to evolve a size control mechanism using these initial seed models, but we chose not to go down that path because we wanted to evolve an active size control mechanism. Finally, in Appendix 1—figure 3D, we see that if we assume that protein production rates do not scale with size and that the G1/S transition depends on the concentration of inhibitor , we get an initial seed model that already accomplishes size control as it displays a response curve that decreases with control volume . This simple model loosely corresponds to the Whi5 inhibitor dilution model of budding yeast (Schmoller et al., 2015) where a constant quantity of inhibitor Whi5 is present at birth (and thus a concentration ) and is passively diluted in G1 until it reaches concentration threshold that triggers the G1/S transition.
Evolutionary algorithm
Here we briefly describe the -evo evolutionary algorithm from Henry et al., 2018 that we used to evolve size control networks. We refer the reader to the original publication’s main text and supplementary material for a more thorough description of the algorithm. A schematic representation of the algorithm’s architecture is shown in Appendix 1—figure 4.
First, an initial seed network is selected by the user as the starting point of the evolution simulation. -evo then clones this first individual to create a population of networks. At each epoch, mutations are randomly applied to the networks of the population. Those mutations vary from topological changes to the network, where biochemical species or interactions can be added or removed, to non-topological changes, where the networks’ kinetic parameter values are modified. Following mutations, networks are ranked based on their performance at accomplishing the biological function we select for. This performance is encoded via a user-defined fitness function that is problem specific. We give details about the specific implementation of the fitness functions for cell size control in the following subsection. After ranking the networks, -evo proceeds to select the most fit half of the network population. The less fit half is then discarded and replaced by a copy of the most fit half to maintain a constant population size. With this, -evo completes the first epoch of the evolutionary process. We use the term epoch here rather than the term ’generation’ which we retain to describe a cell lineage. A predetermined number of epochs of mutation and selection are then performed after which a final population of networks is extracted.
Fitness
In order to rank and select networks based on their performance at accomplishing a specific biological function, we design a specific objective function that we call fitness. Here, we chose a fitness function that could quantify a model’s ability to produce many viable descendants during a fixed time period of length . We initially considered a simple fitness function to be minimized by -evo, . Here is the number of divisions or generations in a cell lineage produced during a total time period of length . Noise at the G1/S transition and in the S/G2/M timer duration act as a source of variation in volume at each generation which needs to be controlled by the evolved networks in order to prevent the cell volumes from diverging. Thus, networks that perform size control display a high number of divisions . The cycle duration distribution of a size control network will be centered around the doubling time in order to promote stable growth. Thus, on average, we expect a fit network to exhibit a maximum number of divisions during a simulation of length . Note that this number is mostly independent of the volume range selected by the evolutionary simulation as the doubling time is independent of the initial volume of the cell at the beginning of each cycle. There is a small effect on from the initial conditions chosen for the system of ODEs modelling the cell cycle, but this effect is mostly negligible as long as the total time period .
In our first evolution experiments, we found that the single objective function was insufficient alone to evolve a cycling network. A possible reason for this is that the fitness landscape in parameter space defined by is mostly flat far from the optimum and is difficult to navigate as it doesn’t incrementally guide the evolution process towards a proper size control phenotype. Indeed, a network that does not perform size control will display very few divisions before diverging towards sizes of 0 or . In contrast, a network that does performs some size control, even if performed badly, will display mostly stable growth with many divisions and the volume will not diverge over successive generations. There is thus an all or nothing effect with this fitness function. We found that optimization process would often get stuck in a local optima with a low number of divisions and could not find a path to the global optima of size control. Because of this, we chose to turn towards multi-objective Pareto optimization which aims to simultaneously optimize several fitness functions. The idea here is that an additional fitness function can guide the evolutionary process through a different path in parameter space and could allow the evolutionary procedure to escape local optima.
In the Pareto optimization framework, we consider equally important objective functions . Assuming that fitness functions are to be maximized, we say that individual (strictly) dominates individual if and only if their fitness and are such that:
The algorithm then selects half of the population of the highest rank using a fitness sharing algorithm to maximize population diversity. We refer the reader to Warmflash et al., 2012 for more details on this procedure.
For this project, we chose to limit the optimization process to two fitness functions. We chose as the first fitness function. We tested different measures of size control for the second objective function which are described in the following subsections. In most of the cases, we chose the coefficient of variation of the size distribution at birth to be minimized as the second fitness function. In any case, we typically run 10 independent realizations of a network’s performance and compute the average fitness score over those runs to buffer variations in fitness scores.
Residuals
We first tested a least squared residuals fitness function to be minimized by -evo. This function yielded successful evolutionary runs but was abandoned due to requiring a user-defined target volume . Indeed, we wanted to avoid the bias where we could select for biochemical networks matching a specific volume range. We used the following equation for the fitness function with indicating the cell volume at birth at the -th generation in a lineage.
We nevertheless present the result of a successful evolution run using Pareto optimization of and in Appendix 1—figure 11 where we obtain a version of the feedback-based network topology of Model A1.
Coefficient of variation
We then considered the coefficient of variation of the size distribution at birth () to be minimized by -evo. The is a measure of size control that normalizes the variance of the size distribution at birth with respect to its mean and is thus mostly insensitive to the absolute volume range of the cell. We chose this second fitness function in most of the Pareto optimization evolution experiments as described in the main text.
Added volume slope in G1
We also considered directly optimizing the fitted slope of the volume added in G1 as a function of volume at birth to reinforce the sizer behavior in G1. As described in the main text, given a series of volume values at birth and at G1/S, and for , the added volumes in G1 are defined as:
The best fit slope of a linear model has a closed-form equation which is given as a function of the lineage data directly:
A slope of –1 corresponds to a sizer, a slope of 0 to an adder and a slope of +1 corresponds to a timer. In order to directly optimize the size control mechanism in G1 and to bias towards sizer mechanisms and to keep the fitness values positive, we considered the following fitness function to be minimized by -evo:
Fitness penalties
In order to keep biochemical concentrations and cell volumes at reasonable levels, we chose to bound volume growth to a range with and . To guide the evolution of size control and to have a well defined steady-state distribution of cell sizes, we chose to impose fitness penalties on networks that would see their volumes reach one of those bounds at some point during a run. This applies to all evolution experiments performed in this study, but we will describe the case of the and fitness functions as they were used most of the time during this study.
Firstly, if the volume reached , we considered the cell too small and declared it dead. At that point, any further cycles would not contribute to fitness scores. With this penalty, networks are guided towards preventing or at least delaying the time at which the volume becomes too small for the cell to remain viable.
Similarly, when volume reaches , we penalize the fitness functions but in a different way. Because volume growth is restricted to the domain below , we set the growth rate when . In this case, we sometimes see networks make use of this growth arrest and exploit this artificial feature. This phenomenon can sometimes be seen in computational evolution where digital mirages are often exploited by optimization processes (Lehman et al., 2020). Here, such exploitative networks have initially long cycle periods and can sometimes spend the majority of their cell cycle in this growth arrest phase. Over successive epochs, this period gets shortened to get more and more divisions to take place during a run, shortening the time spent in growth arrest at each cycle. Eventually, this optimization leads to a particular type of model that has very sloppy size control but that scores highly with the fitness functions because of the artificial growth arrest. Such networks develop a timer with period and use the artificial growth arrest to buffer any volume variation incurred from late G1/S transition or noisy S/G2/M duration. These networks are optimal from an perspective since they exactly double their mass at each cycle and perform the same number of division cycles during a run as an actual size control network. They are also more than optimal from a since their growth is always stopped at . Thus, without fail, their birth volume and there is no variation at all in the distribution. This size control illusion is a global optimum of the 2D-fitness space and must thus be heavily penalized to prevent the optimization from selecting this phenotype. Thus, when , we only count 70% of the divisions which is sufficient to distinguish this artificial phenotype from actual size control mechanisms. Additionally, we penalize the score by adding to it a penalty of . This is significantly different from the usual range of coefficient of variations which lie between 0 and 0.5 typically, and prevents the optimization from selecting the artificial phenotype.
The introduction of these fitness penalties improved the convergence rate of the -evo algorithm significantly. Specifically, we believe the penalty on being somewhat less severe than the one for improved the convergence rate drastically. This is probably because the initial cell cycle model chosen for the evolution runs exhibits unstable growth and inevitably sees the cell volume grow to or shrink to rapidly as shown in Appendix 1—figure 3. Thus, the fitness landscape surrounding the initial cell cycle model is quite flat and is difficult to navigate from an optimization perspective. Gradual improvements to the fitness function at by spending less and less time in growth arrest seemed to have helped the optimization process. This guided the algorithm towards better size control models more often than if penalties were absent.
Overall, even if the penalties somewhat biased the evolutionary process into following a phenotypic trajectory, they improved the convergence rate of the evolutionary process dramatically to the point that we decided to keep them for all experiments.
Biochemical interactions
Many ‘inverse-approach’ approaches in systems biology have focused on purely transcriptional networks (Cotterell and Sharpe, 2010; François et al., 2007; Fujimoto et al., 2008; Ten Tusscher and Hogeweg, 2011;), because they are generic, easier to study and can efficiently describe many biological dynamics (Alon, 2007). In this project, we extend the biochemical interactions available for evolution: we not only model transcriptional activation and repression but also include complexation also known as protein-protein interaction (PPI), and assume there passive degradation. Adding PPIs is especially crucial because they are well known to lead to non-linear effects (Buchler and Cross, 2009; Buchler and Louis, 2008) allowing for the simple implementation of complex dynamics such as genetic oscillations (François and Hakim, 2005) observed e.g. in circadian clocks (François, 2005), and such non-linear effects indeed play crucial roles for control in our evolved model. The equations for these interactions are presented in this subsection.
We model transcriptional activation of arbitrary network species Y’s production from species X using the following Hill equation:
Similarly, repression of arbitrary species Y’s production from species X is modeled via the following Hill equation:
In these equations, is the threshold required for to activate or repress at 50% of its capacity and is the Hill coefficient. While both types of interactions are modeled using Hill equations, their combined effects on a network species’ dynamics is computed differently. Indeed, only the maximum of all the activations is accounted for whereas the inhibitions are multiplicative and all are accounted for. Additionally, we allow some species to be produced at a basal rate independent of any activator which counts as an additional activation.
Altogether using an example, assuming multiple species activate the production of species Z while multiples species repress it, and assuming that Z has a basal production rate and a maximum production rate , then the total contribution of these interactions to the ODE for the dynamics of is given by:
We use the law of mass-action to model PPIs and passive degradation. Specifically, if arbitrary species X and Y interact together and form a complex Z given a forward rate kf and a backwards rate kb, then the contribution of these interactions to the ODE for the dynamics of the system will be given by:
Lastly, all network species are assumed to be degraded at a passive rate. Thus, if an arbitrary species X is solely degraded with a rate , then the dynamic equation for the dynamics of will be given by:
Analysis of sources of noise
As mentioned in the main text and in this Appendix, we chose a hierarchical way of introducing noise in the system, starting with the biggest contributing factor and incrementally adding additional sources of noise in subsequent analyses. We first included noise in the cell cycle phases, specifically in the timing of the G1/S transition and in the length of the S/G2/M phase. Then in the later parts, we introduced protein production noise modeled as Langevin noise.
In the simulations presented in the main text, we chose not to include noise in the growth rate and in the division ratio as the recorded noise level for in experiments for these measures is lower than that for the timing of the cell cycle and the protein concentration noise (Di Talia et al., 2007; Newman et al., 2006; Zatulovskiy et al., 2020). Nevertheless, those are crucial assumptions that we made that we chose to investigate in more details here.
In subsection S/G2/M noise, we investigate how the level of noise in S/G2/M affects the conclusions drawn in the main text. Then, in subsection Growth rate noise we do the same for noise in the growth rate and in subsection Division ratio noise for the division ratio.
Appendix 1—table 1 shows the values of of the three models presented in the main text compared to the values reported in the literature for budding yeast (Di Talia et al., 2007), fission yeast (Sveiczer et al., 1996) and mouse epidermal stem cell grown in the animal (Xie and Skotheim, 2020).
S/G2/M noise
Here, we perform similar evolution experiments to those reported in Figure 4 of the main text to examine the effect of modulating the noise in the S/G2/M timer. We thus perform three independent experiment where we set the CV in the timer period to 0%, 5%, and 8% corresponding to no, medium, and high noise respectively. For reference, the CV of the timer period in the control condition where Model A1 was evolved is 3%. Note that we maintain the average duration of the timer to be about half the time it takes to double the cell’s volume. Having specified the S/G2/M timer parameters and starting from the initial seed network of Model A1, we perform evolution and select networks as previously. We compare ensembles of 60 networks for each noise level, half of them evolved under the Pareto optimization of and and the other half under the single objective optimization of . The results are shown in Appendix 1—figure 5.
Increasing the noise, progressively leads to a loss of the sizer signature and increases the . This is likely because the fixed duration of S/G2/M allows the system to accurately reset protein concentrations for the subsequent cell cycle to promote accurate G1 control (Willis et al., 2020). Thus, an increasing level of noise becomes associated with a worse accuracy in the size control mechanism which leads to loss of the sizer signature and increased as can be seen in Appendix 1—figure 5D,E. Other results from the main text remain unchanged.
Growth rate noise
Here, we perform similar evolution experiments as we did for the noise in S/G2/M but this time by adding noise in the growth rate . Specifically, at each generation of a cell’s lineage, we sample a growth rate from a Gaussian distribution centered around 0.25, the initial value we used in the rest of this project. We perform three evolution experiments with coefficient of variations for the growth rate distributions set to 3%, 5% and 8% corresponding to low, medium and high noise respectively. We perform 30 independent evolution runs with the Pareto optimization framework for each noise level, each of them starting from the initial seed network of Model A1. The results are shown in Appendix 1—figure 6.
We note that on average, the growth rate will remain centered around the same value, thus not affecting the optimal fitness score networks are able to achieve. However, individual variations at each cycle perturb the ability of the system of accomplishing size control by always modifying the doubling time . This leads to progressive loss of the sizer signature and also increases the . This increased noise in the system sometimes sends the cell volume towards 0 or as volume is kicked outside of the control mechanism’s working range. Since this behavior is highly penalized in the fitness functions scoring, the evolution finds a way to prevent this from happening via different strategies. Interestingly, at higher noise levels, strong sizers can still evolve but are not the most common phenotype. Instead, evolution seems to favor timers that reliably ensure timely cell division generation after generation. There is however a trade-off and these models exhibit a higher due to the lower amount of size control. Adders can also be evolved at all tested noise levels and provide good size control with reliably low .
Division ratio noise
Here, we perform similar evolution experiments as we did for the noise in S/G2/M and in , but this time by adding noise in the division fraction . Specifically, at each generation of a cell’s lineage, we sample from a Gaussian distribution centered around 2, the initial value we used in the rest of this project for symmetrical divisions. We perform three evolution experiments with coefficient of variations for the growth rate distributions set to 2%, 4% and 8% corresponding to low, medium and high noise respectively. We perform 30 independent evolution runs with the Pareto optimization framework for each noise level, each of them starting from the initial seed network of Model A1. The results are shown in Appendix 1—figure 7.
The results of this experiment are very similar to those for noise in the growth rate described in the previous subsection. Indeed, changing the division fraction does not change the doubling time directly like for . Instead, it changes the cycle period around which cells see their volume shrink or grow over successive generations. Indeed, for a division fraction , this equilibrium time between shrinking and growth becomes . Intuitively, if is bigger than 2, then cells need to spend a little bit more time in their cell cycles for growth to occur in order to compensate for this increased division fraction. This increased noise in leads to progressive loss of the sizer signature as seen before and also increases the . The division fraction noise has a big effect on premature cell death. Indeed, at the highest noise level tested where , we saw many runs end with premature cell death to the point where three evolution runs were completely unable to find a mechanism able to prevent this. As a result of this strong pressure to avoid premature cell death, evolution turns once again to timers instead of sizers in the noisier regime. As before, adders seem to be the most reliable size control phenotype exhibiting low .
Model descriptions
In this section, we give the full set of equations and parameter values of the models from the main text. We remind the reader that we scale all our variables so that a concentration of one arbitrary unit corresponds roughly to 1000 proteins in a 100fL cell (Milo et al., 2010). Additionally, we scale the time variable such that 1 arbitrary time unit corresponds roughly to 30 min (Di Talia et al., 2007).
Model A1
The parameter values of the Model A1 are shown in Appendix 1—table 2 along with the corresponding differential equations in Equation 24.
Model A2
The parameter values of the Model A2 are shown in Appendix 1—table 3 along with the corresponding differential equations in Equation 25.
Model B
The parameter values of the fluctuation-sensing Model B are shown in Appendix 1—table 4 along with the corresponding stochastic differential equations in Equation 26. We note that the C0 parameter in the noise terms of the equation is a concentration scaling factor.
Additional models
Here we present additional models that were evolved but not discussed in the main text to provide more examples of size control mechanisms.
Model A3
Model A3 is similar to Model A1, but lacks the homodimerization interaction of (see Equation 24). This specific model results in a weak adder/timer that displays a non-linear size control response curve. This is similar to Model A2 where we see a sizer/adder behavior in the low control volume regime and adder/timer behavior in the high control volume regime.
This model’s behavior is summarized in Appendix 1—figure 8. The parameter values of the model are shown in Appendix 1—table 5 along with the corresponding differential equations in Equation 27. This model was evolved with the Pareto fitness optimization framework to maximize and minimize . The initial seed model topology for this evolutionary simulation was the quantity sensing oscillator shown in Appendix 1—figure 3A and was optimized during 3000 epochs.
Model A4
Model A4 is similar in essence to Model A1, albeit more unstable. In this model, is the size sensor. Instead of using a PPI to titrate in a size-dependent manner in G1, this model leverages a transcriptional repression to modulate the production of directly rather than its effective degradation. The model’s behavior is summarized in Appendix 1—figure 9. The parameter values of the model are shown in Appendix 1—table 6 along with the corresponding differential equations in Equation 28. Notably, Appendix 1—figure 9E shows that the model’s response curve in the low control volume regime is ill-defined. In this model specifically, when the low volume regime is reached, the concentration of inhibitor at G1/S increases due to quantity sensing . Then, because of the homodimerization of into S3, we see the concentration also rise. We can see both of these curves spike up momentarily in the trajectories of Appendix 1—figure 9D around . The problem arises if this increase is too strong. Then, S3 activates the production of additional , kick-starting a positive feedback loop that creates more and more inhibitor , effectively interrupting the oscillator and making the period ill-defined. In practice, due to intrinsic noise at the G1/S transition and in the S/G2/M timer length, we’ve seen a cell lineage terminate prematurely before it can reach the maximum number of allowed in a simulation because of this problem. One could say this model is thus less fit than those presented in the main text, although it displays an added volume slope over its cycle of –0.41 and is close to a sizer.
This model was evolved using a Pareto fitness optimization that maximizes and minimizes . The initial model topology was the quantity sensing oscillator of Appendix 1—figure 3A which was optimized over 4000 epochs.
Model A5
Model A5 is another variation on Model A1 but here within the S. pombe cell cycle framework where controls the timing of division directly and the Switch is turned on in G1 instead of S/G2/M. Here, directly senses size and does so via the PPI linking I, , and . Indeed, since I is inversely proportional to the volume at G1/S due to quantity sensing and since is solely produced via complex formation of I with , is also inversely proportional to volume. Consequently, , which is almost solely produced via the dissociation of , becomes a direct sensor of the size of the cell.
Then, instead of using a PPI to titrate in a size-dependent manner as is done in Models A1 and A2, Model A5 leverages a transcriptional repression mechanism to modulate the production of directly instead of its degradation, precisely like in Model A4. The model’s behavior is summarized in Appendix 1—figure 10. The parameter values of the model are shown in Appendix 1—table 7 along with the corresponding differential equations in Equation 29.
This model was evolved using a Pareto fitness optimization that maximizes and minimizes . The seed model topology was the quantity sensing oscillator of Appendix 1—figure 3A which was optimized over 2500 epochs.
Model A6
Model A6 is yet another version of Model A1 evolved with slightly different fitness functions. This model was evolved using a Pareto fitness optimization that maximizes and minimizes the sum of squared residuals from a target volume at birth as described in Equation 15. For reference, the target volume at birth chosen for this simulation was . The initial model topology was the quantity sensing oscillator of Appendix 1—figure 3A which was optimized over 3000 epochs.
The model’s behavior is summarized in Appendix 1—figure 11. The parameter values of the model are shown in Appendix 1—table 8 along with the corresponding differential equations in Equation 30.
Data availability
This is a theory paper, so there is no experimental data, and all results were generated by the code. The code used is freely available at https://github.com/FelixPG/PhiEvo_SizeControl, (copy archived at swh:1:rev:afa7f16a2f8a9d793aa3685116c2436faae100dd). Reference to the code has been added in the text.
References
-
Network motifs: theory and experimental approachesNature Reviews. Genetics 8:450–461.https://doi.org/10.1038/nrg2102
-
Cell size regulation in bacteriaPhysical Review Letters 112:208102.https://doi.org/10.1103/PhysRevLett.112.208102
-
Details matter: noise and model structure set the relationship between cell size and cell cycle timingFrontiers in Cell and Developmental Biology 5:92.https://doi.org/10.3389/fcell.2017.00092
-
Molecular titration and ultrasensitivity in regulatory networksJournal of Molecular Biology 384:1106–1119.https://doi.org/10.1016/j.jmb.2008.09.079
-
Protein sequestration generates a flexible ultrasensitive response in a genetic networkMolecular Systems Biology 5:272.https://doi.org/10.1038/msb.2009.30
-
The physics of cell-size regulation across timescalesNature Physics 15:993–1004.https://doi.org/10.1038/s41567-019-0629-y
-
Scaling properties of cell and organelle sizeOrganogenesis 6:88–96.https://doi.org/10.4161/org.6.2.11464
-
Two redundant oscillatory mechanisms in the yeast cell cycleDevelopmental Cell 4:741–752.https://doi.org/10.1016/s1534-5807(03)00119-9
-
Controlling cell size through sizer mechanismsCurrent Opinion in Systems Biology 5:86–92.https://doi.org/10.1016/j.coisb.2017.08.010
-
Control of cell size and cycle time in Schizosaccharomyces pombeJournal of Cell Science 24:51–67.https://doi.org/10.1242/jcs.24.1.51
-
A model for the neurospora circadian clockBiophysical Journal 88:2369–2383.https://doi.org/10.1529/biophysj.104.053975
-
Core genetic module: the mixed feedback loopPhysical Review. E, Statistical, Nonlinear, and Soft Matter Physics 72:031908.https://doi.org/10.1103/PhysRevE.72.031908
-
Deriving structure from evolution: metazoan segmentationMolecular Systems Biology 3:154.https://doi.org/10.1038/msb4100192
-
Evolving phenotypic networks in silicoSeminars in Cell & Developmental Biology 35:90–97.https://doi.org/10.1016/j.semcdb.2014.06.012
-
Stochastic simulation of chemical kineticsAnnual Review of Physical Chemistry 58:35–55.https://doi.org/10.1146/annurev.physchem.58.032806.104637
-
Dilution and titration of cell-cycle regulators may control cell size in budding yeastPLOS Computational Biology 14:e1006548.https://doi.org/10.1371/journal.pcbi.1006548
-
φ-evo: a program to evolve phenotypic models of biological networksPLOS Computational Biology 14:e1006244.https://doi.org/10.1371/journal.pcbi.1006244
-
Cell-size maintenance: universal strategy revealedTrends in Microbiology 23:4–6.https://doi.org/10.1016/j.tim.2014.12.001
-
BookNumerical Solution of Stochastic Differential EquationsBerlin, Heidelberg: Springer.https://doi.org/10.1007/978-3-662-12616-5
-
BioNumbers -- the database of key numbers in molecular and cell biologyNucleic Acids Research 38:D750–D753.https://doi.org/10.1093/nar/gkp889
-
Synthetic circuits reveal how mechanisms of gene regulatory networks constrain evolutionMolecular Systems Biology 14:e8102.https://doi.org/10.15252/msb.20178102
-
Mechanistic origin of cell-size control and homeostasis in bacteriaCurrent Biology 29:1760–1770.https://doi.org/10.1016/j.cub.2019.04.062
-
Mapping self-organized criticality onto criticalityJournal de Physique I 5:325–335.https://doi.org/10.1051/jp1:1995129
-
The size control of fission yeast revisitedJournal of Cell Science 109:2947–2957.https://doi.org/10.1242/jcs.109.12.2947
-
Evolution of networks for body plan patterning; interplay of modularity, robustness and evolvabilityPLOS Computational Biology 7:e1002208.https://doi.org/10.1371/journal.pcbi.1002208
-
Engineering self-organized criticality in living cellsNature Communications 12:4415.https://doi.org/10.1038/s41467-021-24695-4
-
Bacterial cell size: multifactorial and multifacetedAnnual Review of Microbiology 71:499–517.https://doi.org/10.1146/annurev-micro-090816-093803
-
Sizing up the bacterial cell cycleNature Reviews. Microbiology 15:606–620.https://doi.org/10.1038/nrmicro.2017.79
-
Sizing up to divide: mitotic cell-size control in fission yeastAnnual Review of Cell and Developmental Biology 31:11–29.https://doi.org/10.1146/annurev-cellbio-100814-125601
-
A G1 sizer coordinates growth and division in the mouse epidermisCurrent Biology 30:916–924.https://doi.org/10.1016/j.cub.2019.12.062
-
On the molecular mechanisms regulating animal cell size homeostasisTrends in Genetics 36:360–372.https://doi.org/10.1016/j.tig.2020.01.011
Article and author information
Author details
Funding
Natural Sciences and Engineering Research Council of Canada (Discovery Grant)
- Paul François
Natural Sciences and Engineering Research Council of Canada (Alexander Graham Bell Canada Graduate Scholarship)
- Felix Proulx-Giraldeau
Fonds de recherche du Québec – Nature et technologies (Doctoral research scholarship)
- Felix Proulx-Giraldeau
National Institutes of Health (NIH R35 GM134858)
- Jan M Skotheim
Chan Zuckerberg Initiative (Biohub Investigator Award)
- Jan M Skotheim
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
We thank Rodrigo Reyes-Lamothe, Nicolas E Buchler, and Lucas Fuentes Valenzuela for helpful comments on the manuscript. JS was supported by the NIH (R35 GM134858), PF was supported by Natural Sciences and Engineering Research Council of Canada (NSERC), Discovery Grant Program, FPG was supported by a Fonds de Recherche du Québec Nature et Technologies (FRQNT) Doctoral scholarship (B2X) and by a Natural Sciences and Engineering Research Council of Canada (NSERC) Doctoral Canada Graduate Scholarship (CGS-D).
Copyright
© 2022, Proulx-Giraldeau et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
-
- 1,740
- views
-
- 380
- downloads
-
- 10
- citations
Views, downloads and citations are aggregated across all versions of this paper published by eLife.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading
-
- Computational and Systems Biology
- Microbiology and Infectious Disease
Timely and effective use of antimicrobial drugs can improve patient outcomes, as well as help safeguard against resistance development. Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is currently routinely used in clinical diagnostics for rapid species identification. Mining additional data from said spectra in the form of antimicrobial resistance (AMR) profiles is, therefore, highly promising. Such AMR profiles could serve as a drop-in solution for drastically improving treatment efficiency, effectiveness, and costs. This study endeavors to develop the first machine learning models capable of predicting AMR profiles for the whole repertoire of species and drugs encountered in clinical microbiology. The resulting models can be interpreted as drug recommender systems for infectious diseases. We find that our dual-branch method delivers considerably higher performance compared to previous approaches. In addition, experiments show that the models can be efficiently fine-tuned to data from other clinical laboratories. MALDI-TOF-based AMR recommender systems can, hence, greatly extend the value of MALDI-TOF MS for clinical diagnostics. All code supporting this study is distributed on PyPI and is packaged at https://github.com/gdewael/maldi-nn.
-
- Computational and Systems Biology
- Genetics and Genomics
Enhancers and promoters are classically considered to be bound by a small set of transcription factors (TFs) in a sequence-specific manner. This assumption has come under increasing skepticism as the datasets of ChIP-seq assays of TFs have expanded. In particular, high-occupancy target (HOT) loci attract hundreds of TFs with often no detectable correlation between ChIP-seq peaks and DNA-binding motif presence. Here, we used a set of 1003 TF ChIP-seq datasets (HepG2, K562, H1) to analyze the patterns of ChIP-seq peak co-occurrence in combination with functional genomics datasets. We identified 43,891 HOT loci forming at the promoter (53%) and enhancer (47%) regions. HOT promoters regulate housekeeping genes, whereas HOT enhancers are involved in tissue-specific process regulation. HOT loci form the foundation of human super-enhancers and evolve under strong negative selection, with some of these loci being located in ultraconserved regions. Sequence-based classification analysis of HOT loci suggested that their formation is driven by the sequence features, and the density of mapped ChIP-seq peaks across TF-bound loci correlates with sequence features and the expression level of flanking genes. Based on the affinities to bind to promoters and enhancers we detected five distinct clusters of TFs that form the core of the HOT loci. We report an abundance of HOT loci in the human genome and a commitment of 51% of all TF ChIP-seq binding events to HOT locus formation thus challenging the classical model of enhancer activity and propose a model of HOT locus formation based on the existence of large transcriptional condensates.