Evolution of cell size control is canalized towards adders or sizers by cell cycle structure and selective pressures
Abstract
Cell size is controlled to be within a specific range to support physiological function. To control their size, cells use diverse mechanisms ranging from ‘sizers’, in which differences in cell size are compensated for in a single cell division cycle, to ‘adders’, in which a constant amount of cell growth occurs in each cell cycle. This diversity raises the question why a particular cell would implement one rather than another mechanism? To address this question, we performed a series of simulations evolving cell size control networks. The size control mechanism that evolved was influenced by both cell cycle structure and specific selection pressures. Moreover, evolved networks recapitulated known size control properties of naturally occurring networks. If the mechanism is based on a G1 size control and an S/G2/M timer, as found for budding yeast and some human cells, adders likely evolve. But, if the G1 phase is significantly longer than the S/G2/M phase, as is often the case in mammalian cells in vivo, sizers become more likely. Sizers also evolve when the cell cycle structure is inverted so that G1 is a timer, while S/G2/M performs size control, as is the case for the fission yeast S. pombe. For some size control networks, cell size consistently decreases in each cycle until a burst of cell cycle inhibitor drives an extended G1 phase much like the cell division cycle of the green algae Chlamydomonas. That these size control networks evolved such selforganized criticality shows how the evolution of complex systems can drive the emergence of critical processes.
Editor's evaluation
This paper develops evolutionary simulations to identify the type of molecular networks that can give rise to size control. The authors propose an evolutionary framework to find which factors select for particular mechanisms in cell size control. They show that the evolution of a specific cell size control mechanism is dependent on the cell cycle structure.
https://doi.org/10.7554/eLife.79919.sa0Introduction
Cell size is fundamental to cell physiology and function because it sets the scale of subcellular compartments, cellular biosynthetic capacity, metabolism, mechanical properties, surfacetovolume ratios, and molecular transport (Chan and Marshall, 2010; Ginzberg et al., 2015; Neurohr et al., 2019; Zatulovskiy and Skotheim, 2020). While different types of cells vary enormously in size to perform their functions, cells within a particular type are generally uniform in size indicating that cell growth may be accurately coupled to division and differentiation processes. On a phenomenological level, there are many commonalities in how cells regulate their size even though the molecules controlling cell division vary across the tree of life with the most striking differences separating eukaryotes and bacteria. This extreme molecular diversity in the regulatory proteins raises the question as to what are the common features of the control systems that evolved to implement size control.
Most generally, cell size control can be viewed as a return map where the division size is a function of the cell size at birth. The examination of proliferating cells in laboratory conditions has revealed a variety of size control phenomena that can be characterized quantitatively by plotting the size of a cell at birth against the amount of mass added before it divides (Amir, 2014; Facchetti et al., 2017; Jun and TaheriAraghi, 2015). A ‘sizer’ has a slope of –1 so that all variation in cell mass at birth is compensated for in one cell cycle, whereas an ‘adder’ has a slope of 0 so that each cell adds the same amount of mass during the cell cycle regardless of initial size. In the case of an adder, control is weaker so that multiple cell cycles are required for a particularly large or small cell to return to the average cell size. Importantly, the slope relating size at birth with the amount of growth in the cell cycle is a metric that quantifies the amount of size control occurring in a particular condition.
Studies of cell size control have revealed a diverse set of phenomena. Fission yeast and mouse epidermal stem cells exhibit ‘sizers’, and most bacteria, archaea, and cultured human cell lines exhibit behavior closer to adders (Cadart et al., 2018; Eun et al., 2018; Jun et al., 2018; Sveiczer et al., 1996; Westfall and Levin, 2017; Willis and Huang, 2017; Xie and Skotheim, 2020). Thus, while diverse size control behaviors have been observed, adders have been observed more often than sizers. This raises the question of why adders are more frequently observed if sizers, by definition, are more effective at controlling cell size (Barber et al., 2017; Willis et al., 2020).
To address the question of why adders are the most often observed form of cell size control, we used evolutionary algorithms (Holland, 1992) to identify commonalities between networks evolved to control cell size. Evolutionary algorithms are a class of machine learning techniques aiming at mimicking evolutionary processes (Crombach, 2021; François, 2014; François and Hakim, 2004; Xiong et al., 2019). Because of the nature of evolution, results of evolutionary computations are often more efficient and more creative than expected (Lehman et al., 2020). Furthermore, solutions found by evolutionary algorithms are constrained by their evolutionary paths followed and present similar characteristics to biologically evolved systems (Schaerli et al., 2018). Cell size is regulated through the cell cycle control network that governs transitions from one phase of the cell cycle to the next. The division cycle can be broken up into distinct phases that are characterized by different molecular activities (Morgan, 2007). While it is typically considered that there are 4 phases of the cell cycle (G1, S, G2, and M), we here consider a two phase model based on a G1 phase and a composite S/G2/M phase. This is because size control in general has been associated with either the G1/S transition or mitosis at the end of the cell cycle.
We start with a simple model of the cell cycle with a timer for an S/G2/M phase (Figure 1) that we evolve to optimize homeostatic cell size control (Figure 1B). We discovered that different control mechanisms could perform cell size control based on protein quantity or concentration. Simulations in which size control takes place in G1 phase converge toward an adder mechanism for the entire cell cycle and identified an active quantity sensing mechanism similar to dilutionbased mechanisms previously identified experimentally (Chen et al., 2020; Qu et al., 2019; Schmoller et al., 2015). The relative durations of G1 and S/G2/M were important in determining size control properties. A relatively shorter S/G2/M phase favors sizer mechanisms, while longer S/G2/M phases favor adders. Moreover, inverting the model so that cell size controls S/G2/M and G1 is a timer, like in fission yeast, results in more sizerlike control. Thus, we anticipate adders arise when cell size is controlled at a point intermediate in the cell cycle, like the G1/S transition, while sizers will appear when cell size regulates a point later in the cell cycle, as is the case when G1 is proportionally longer, or control takes place at the transition to mitosis. We finally identify a selforganized mechanism based on fluctuation sensing where size control occurs on average over multiple cycles. While there is no onetoone correspondence between a specific size control mechanism and a given evolutionary pressure, our work identifies clear evolutionary principles that shed light on the diverse cell size control phenomena previously observed experimentally.
Results
Initial cellcycle model
In general, there are two classes of mechanisms that cells use to control their size that can be separated in terms of whether cell division or cell growth per se is regulated by cell size. Note that in this work, we will use mass, size, and volume interchangeably. In the first class, it is crucial that the growth rate per unit mass of a cell depends on cell size so that cells that are significantly larger than the optimum cell size grow slower (Cadart et al., 2019; Ginzberg et al., 2018; Miettinen and Björklund, 2016; Nordholt et al., 2020; Tzur et al., 2009). Such slower growing cells are then outcompeted by cells closer to the optimum size even when divisions occur purely by chance (Conlon and Raff, 2003). While sizedependent growth mechanisms exist and do support size homeostasis, such mechanisms rely on inefficient growth in all the cells away from the optimum size (Ginzberg et al., 2018; Miettinen and Björklund, 2016; Nordholt et al., 2020). To avoid such inefficient growth, many types of cells use active size control mechanisms to accelerate progression through the cell cycle in larger cells (Zatulovskiy and Skotheim, 2020). In our simulations, we keep cell growth rates constant over a physiological range of cell sizes. This allows us to focus on the common features of the molecular networks in which increasing cell size drives changes in molecular activities to trigger cell division. We assume that cell volume $V$ grows at a rate $\lambda \left(V\right)\times V$, so that growth is exponential when $\lambda $ is a constant. Volume is divided by 2 at each division after which we follow one of the two daughter cells. The growth rate sets the time scale for the system dynamics as it defines the doubling time $\tau =ln\left(2\right)/\lambda $. Any interdivision time shorter than $\tau $ will see the cell volume shrink at the next generation while any interdivision time larger than $\tau $ will see the cell volume grow. We also use the chemistry square bracket convention such that any protein X’s concentration is denoted by $\left[X\right]$ . Correspondingly, its quantity is denoted by $X$ only and is defined as $X=\left[X\right]\times V$.
We initialize our network evolution simulations with a very simplified model of the cellcycle (Figure 1A). We model two independent phases of a symmetrically dividing cell, G1 and S/G2/M, separated by a commitment point at the end of G1 and division at the end of S/G2/M (Figure 1A, ChandlerBrown et al., 2017). We encode this cell cycle state information via a binary switch variable we call ‘S/G2/M Switch’ that is 0 in G1 and 1 in S/G2/M. In all simulations, we follow an inhibitor model (Heldt et al., 2018; Schmoller et al., 2015; Zatulovskiy et al., 2020) and assume that the probability of passing the G1/S transition is controlled by the quantity of a transcription regulator $I$. One way the quantity rather than the concentration of a molecule could be sensed is through its titration against a fixed cellular quantity such as the genome, which is part of a general class of titrationbased cell size sensing mechanisms (Amodeo et al., 2015; Heldt et al., 2018; Si et al., 2019; Wang et al., 2009). A lower quantity of this inhibitor $I$ means a higher probability of a G1/S transition at the current time step of the simulation (Appendix 1—figure 2). Like all other proteins, the quantity $I$ is produced with a rate proportional to volume, degraded at a constant rate, diluted by cell growth, and equally partitioned between mother and daughter cells at division (see Materials and Methods). We found that due to the volume scaling assumption, $\left[I\right]$’s concentration alone was largely independent of volume and could not trigger a sizedependent G1/S transition, which is why we opted for the quantity of $I$ instead (Appendix 1—figure 3). Upon passing the G1/S transition, we assume cells are committed to division and there is a fixed time delay before they divide thus modeling S/G2/M as a timer (ChandlerBrown et al., 2017; Doncic et al., 2015). We initially fixed the timer duration to be roughly equal to 50% of the doubling time $\tau $ with some uniform noise such that G1 and S/G2/M durations would be the same at equilibrium. Regulation of the quantity $I$ during the cell cycle thus controls the precise timing of the G1/S transition, but it is not always perfect since the transition is probabilistic. This, along with the noise in S/G2/M timer duration, creates natural cell to cell variability in volume that needs to be compensated for by the evolved mechanism. We note that we initialized most of our simulations with one added interaction in which production of $\left[I\right]$ is activated by the S/G2/M Switch variable to reset its concentration to a higher level before the next generation. We initially ran simulations without this specific interaction but found that it systematically appears in the initial stages of evolution simulations. We therefore included it in the initial network to speed up our simulations. We refer the reader to the Appendix 1 for more details. The models used in this study are publicly available (ProulxGiraldeau and François, 2022).
Typical dynamics of this simple cellcycle model are represented in Figure 1. These dynamics are similar to models of cell cycles based on relaxation oscillators (Cross, 2003; Tsai et al., 2008). The left and right slow branches correspond to G1 and S/G2/M, respectively, and the fast horizontal branches represent G1/S and division. An intermediate fictitious nullcline is shown as a line that connects the average concentration $\left[I\right]$ at G1/S and at division. Starting with cell birth, the system goes down the leftG1 branch because of degradation, then jumps to the rightS/G2/M branch below the threshold for the G1/S transition, stays there while moving up due to production by the S/G2/M Switch, until it jumps back to the left branch at the end of the timer phase. We note that there is no explicit volume control in this initial model since the only control comes from the quantity of $I$ which does not initially depend in any way on the volume. This initial quantity sensing oscillator does not perform size control and instead results in unstable growth where size deviations are amplified at each generation instead of being corrected (Appendix 1—figure 3) as had been previously described for a size scaling inhibitor dilution model (Barber et al., 2017; Willis et al., 2020). Thus, the network needs to evolve some other interactions and/or parameters to go beyond a simple G1 inhibitor driven by production in S/G2/M to create a viable cell lineage.
Evolution of quantitybased size control mechanisms
To examine how networks controlling cell size could evolve, we ran evolutionary simulations that optimize both the number of divisions ${N}_{Div}$ and the coefficient of variation of the size distribution at birth $C{V}_{Birth}$ (see Materials and Methods for algorithm details). Figure 2A illustrates the behavior and results of a typical evolutionary run, with axes defined by both fitnesses used. Simulations successfully evolving size control mechanisms typically follow the same pattern. Networks initially cluster in two regions: region [ii] where cells have low $C{V}_{Birth}$ but grow too small and die after a few divisions, and region [i] where cells grow too big and reach our cutoff for fitness 2 (yaxis). Notice that our Pareto evolutionary algorithm maximizes network diversity, so that those two clusters are at first maintained during evolution (rank 1 Pareto networks Warmflash et al., 2012). As the number of epochs increases, networks in cluster [ii] have more and more divisions, but still grow too big, so that those cells are therefore penalized (see details in Appendix 1). At some point in evolution (around epoch 700 for this particular simulation), some weak control mechanism suddenly evolves, preventing cells from becoming too big without imposing a tight control on the average volume (see also Figure 5 for an explicit example of how this is done). Thus, fitness 2 collapses and the number of divisions is optimized simultaneously. Cells later optimize the control to give a lower $C{V}_{Birth}$ . The optimal networks, at the right most end of this line, both maximize ${N}_{Div}$ and minimize $C{V}_{Birth}$ .
Evolution simulations are in part reproducible and most often lead to similar network topologies. The evolution trajectory leading to Model A1 is a typical example (Figure 2B, see also variations of this network in Model A2 in Figure 2H and models A36 in Appendix 1—figures 8–11). The minimal network common to all those models is very simple. One gene, $R$, is added to the seed network and is both repressed and titrated by $I$ forming the network motif known as a Mixed Feedback Loop (François and Hakim, 2005). Size control can then be understood intuitively as follows.$\left[I\right]$ represses $\left[R\right]$, which is thus only produced in the narrow window of the cycle when $\left[I\right]$ is low, i.e., when the cell is close to the G1/S transition and in early S/G2/M. But, since the quantity $I=\left[I\right]\times V$ is fixed at G1/S by design, the concentration $\left[I\right]$ is inversely proportional to the volume of the cell at the G1/S transition (V_{G1/S}) as shown in our simulations (Figure 2C). Because of this, the $\left[I\right]$ dependent synthesis rate of $\left[R\right]$ and therefore its subsequent concentration are (linear) functions of the volume of V_{G1/S} (Figure 2C), allowing for the cell to keep a memory of its volume at G1/S via the $\left[R\right]$ variable (this holds even once $\left[R\right]$ is constantly degraded for the remainder of the cycle). This has two effects. First, during S/G2/M, $I$ synthesis rate is proportional to volume by hypothesis (and thus to V_{G1/S}), and $I$ is titrated by $\left[R\right]$, also proportional to V_{G1/S}. Both effects even out so that cells are born with a fixed quantity of inhibitor $I$ that is independent of volume (Figure 2D). Second, after division, production of $I$ is 0 by hypothesis, but $I$ still is titrated in G1 by the remaining $\left[R\right]$ (still proportional to V_{G1/S}). Because $I$ quantity at the beginning of G1 is size independent, this ensures that daughter cells reach the $I$ quantity threshold of G1/S earlier if they were born larger, thus ensuring size control. To confirm this, we examine the change in quantity of $I$ as a function of time spent in G1 and of V_{G1/S}, and we see the slope of these two quantities is volumedependent (Figure 2E). We notice that this evolved size mechanism likely is the simplest possible allowed by our formalism: on the one hand, it entirely captures the volume dependency in a single variable $\left[R\right]$, and on the other hand, it ensures proper scaling of $I$ both at birth and at G1/S for size control with the help of a single titration. Notice that such simple control also explains the sudden evolutionary “jump” of Pareto front around epoch 700 on Figure 2A, which corresponds to when the Mixed Feedback Loop motif first appears.
These evolved size control networks, while relying on quantity sensing, are conceptually similar to the budding yeast network relying on concentration sensing of the cell cycle inhibitor Whi5 since there is a constant quantity of $I$ present right after division (just like Whi5). In budding yeast, the Whi5 protein is passively diluted in G1 to increase the stochastic rate of progression through the G1/S transition. The time spent in G1 depends on the initial concentration of Whi5 at birth, which scales as $\frac{1}{V}$ , to promote a sizer mechanism. Here, the concentration of $\left[I\right]$ at birth scales as $\frac{1}{V}$ , but so does the threshold concentration of $\left[I\right]$ regulating the G1/S transition. This is precisely why an active titration mechanism is required to obtain G1 size control in our setup. Such homeostatic control ensures cell size returns to its steady state distribution following an artificial perturbation as soon as a volume deviation is detected at G1/S (Figure 2F). When we plot the amount of volume added $\Delta V$ for different phases of the cell cycle as a function of the initial volume at the beginning of these phases, we find an approximate adder over the whole cycle that results from weak sizer in G1 followed by a timer in S/G2/M as has been found in budding yeast (ChandlerBrown et al., 2017; Di Talia et al., 2007; Soifer et al., 2016, Figure 2G).
While we chose one simple model to illustrate the control mechanism common to our set of evolved networks (Model A1 shown in Figure 2B), other evolved networks were more elaborate but illustrated a similar principle. For example, Model A2 contains extra interactions for the volume sensing gene $\left[R\right]$, where $\left[R\right]$ is repressed by the S/G2/M Switch (meaning its production is completely shut down in S/G2/M leading to sawtoothlike dynamics). Furthermore, $\left[R\right]$ represses the synthesis of $\left[I\right]$, adding another layer of repression to promote size control beyond the previously described titration by $\left[R\right]$ (Figure 2H). If we perturb cell size to examine the dynamics of the return to steady state and look at the added volume during the cell cycle, we again see overall a weak adder behavior similar to that found in Model A1 (Figure 2I–J). We give additional examples of similarly evolved networks in Appendix 1—figures 8–11 where we can see the sensing and the feedback mechanism being implemented in slightly different ways. Yet, despite these mechanistic differences in feedback regulation the resulting function of the evolved networks were similar as indicated by their $C{V}_{Birth}$ (Appendix 1—table 1).
Quantifying size control
To study the mechanisms implicated in cell size control, we modify the control at G1/S and introduce the control volume ${V}_{C}$ . This control variable is independent from the biochemical network and is maintained fixed allowing us to disconnect the actual cell volume $V$ from the biochemical network and by forcing the G1/S transition to be triggered once ${I}_{C}=\left[I\right]\times {V}_{C}$ is low enough. We then numerically integrate the differential equations of the model and measure the period $T\left({V}_{C}\right)$ of the simulated cellcycle for this control volume. Use of the control volume allows us to break the size feedback system and distinguish its input, ${V}_{C}$ , from its output, the induced cycle period T (Angeli et al., 2004). We compute $T\left({V}_{C}\right)$ for Models A1 and A2 and compare their responses with the analytical curves of the archetypical timer, adder, and sizer (Figure 3A and C). Those curves intersect at the point where the induced period is exactly equal to the population doubling time ($\tau =ln\left(2\right)/\lambda $), which defines the equilibrium volume achieved by our cell size control network corresponding to ${\u27e8V}_{G1/S}\u27e9$. Examination of the size control in different cell cycle phases indicates the contributions of G1 and S/G2/M to the overall system behavior (Figure 3B and D). We note that we later examine statistics of ensembles of evolved models but that Models A1 and A2 are both typical examples of evolved feedbackbased models.
Quantifying precisely how the cell cycle period depends on the control volume at G1/S allows us to see that the $T\left({V}_{C}\right)$ for Model A1 overlaps with the theoretical adder curve over a broad range of volumes. In contrast, Model A2 behaves as a sizer for volumes smaller than the equilibrium volume. However, at higher volumes it behaves more like an adder/timer similar to Model A1. Model A2’s equilibrium is thus ‘tuned’ by evolution to be in a regime corresponding to the minimum of the $\Delta {V}_{Cycle}$ curve extrapolated from the $T\left({V}_{C}\right)$ as shown in Figure 3D, precisely when the system transits from a sizer at small volumes to an adderlike behavior at higher volumes. Similar behavior was observed experimentally in budding yeast (ChandlerBrown et al., 2017; Delarue et al., 2017). Thus, both Models A1 and A2 approximate an adder near the equilibrium size, but their behavior differs further from equilibrium where for smaller volumes, Model A1 is still an adder while Model A2 is a sizer.
Modulating cell cycle structural constraints selects for adders or sizers
So far, our evolutionary algorithm selects networks that implement adders rather than sizers near the equilibrium size. This is surprising because sizers are in principle better than adders at controlling cell size and reducing the CV of the size distribution at birth, which is one of our fitness functions. That adders are more frequently observed in nature than sizers (Cadart et al., 2018; Eun et al., 2018; Jun et al., 2018; Westfall and Levin, 2017; Willis and Huang, 2017; Zatulovskiy and Skotheim, 2020), is consistent with our evolution simulations, but adds to the mystery as to why this takes place.
To gain insight into the underlying reason for the prevalence of adders, we considered what might be exceptional in the cases where sizers occur. The best studied, and highly accurate sizer, is found in the fission yeast S. pombe (Fantes, 1977; Sveiczer et al., 1996). In contrast to budding yeast and human cells, where the size control takes place largely in G1 phase, fission yeast exerts size control later in the cell cycle at the G2/M transition. That size control took place later in the cell cycle in fission yeast suggested that this structural feature of its cell cycle might be responsible for its stronger size control. To test this, we inverted our seed network so that the G1 phase was a timer, while cell size control could evolve in S/G2/M (see Materials and Methods). We compare these results to the evolutionary simulations starting with the ‘control’ seed network with G1 size control and an S/G2/M timer (Figure 2). We note that S. pombe growth rates have been reported to deviate from exponential (Wood and Nurse, 2015). While slower than exponential growth would aid cell size control, our analysis here is restricted to exponential growth.
To determine how the seed network structure influences the subsequent evolution of cell size control, we performed 120 independent evolutionary simulations for the two network structures initialized with the Model A1 topology and parameters (Figure 4). Sixty simulations were performed using Pareto optimization and another 60 simulations were performed using individual fitness optimization based on the number of cell divisions ${N}_{Div}$ . For each simulation’s most fit model after 500 epochs, we calculate the CV of the volume distribution at birth, $C{V}_{Birth}$ , and the slope of the linear fit of the volume added in the entire cell cycle as a function of the cell size at birth, Slope $\Delta {V}_{Cycle}$ (we remind the reader that a slope of –1 corresponds to a sizer, 0 to an adder, and 1 to a timer). We chose to use 500 epochs in our simulations because in our previous experience this was sufficient for networks to evolve to be near the optimum, but not so much that they were forced to extensively explore the effects of neutral mutations near the optimum. Model A1’s initial and Slope before evolution are indicated by the dashed black line. In the control experiment where G1 is a sizer and S/G2/M is a timer, most evolutionary simulations with two fitness functions (Pareto) yield models close to the adder regime with a low $C{V}_{Birth}$ . However, when only the number of divisions (${N}_{Div}$) is used as a fitness, the evolutionary simulations are closer to the sizer regime, albeit with a slightly higher $C{V}_{Birth}$ (Figure 4A, Welch’s ttest p<10^{–4}). When the cell cycle structure is inverted so that G1 is a timer and S/G2/M is a sizer like it is in the fission yeast S. pombe, we found that more sizerlike networks evolve than in the control experiment (Figure 4B, Welch’s ttest on agglomerated data p=0.03). This shows that having a network structure like the fission yeast S. pombe promotes sizers, while having the size control portion of the cell cycle earlier, as in the budding yeast S. cerevisiae, promotes adders. Thus, performing simulations using cell cycle network structures of these two yeasts results in the evolution of size control mechanisms that reflect those that are naturally occurring.
The general notion that having cell size control in G1 results in adderlike mechanisms, while control later in the cell cycle results in more sizerlike mechanisms fits most observations of human cell lines grown in culture, budding yeast, and fission yeast. However, a recent study examining mouse epidermal stem cells growing and dividing in the skin found both a strong sizer and that this sizer was largely due to the sizedependent regulation of G1 (Mesa et al., 2018; Xie and Skotheim, 2020). This raised the question as to how a network performing size control in G1 could result in a sizer for the entire cell cycle. One important difference between mammalian cells grown in culture and the mouse epidermal stem cells growing in an animal (in vivo) is the change in the relative durations of the G1 and S/G2/M phases of the cell cycle. While the S/G2/M phase of the cell cycle is similar in duration in cultured cells and the epidermal stem cells in vivo at ~12 hr, the G1 phase extends ~fivefold from ~10 hr in culture to ~50 hr in vivo (Cadart et al., 2018; Xie and Skotheim, 2020). This suggests the hypothesis that the overall size control behavior can be dominated by the relatively longer cell cycle phase, as is likely the case for the G1 phase of epidermal stem cells. To test this hypothesis, we performed evolutionary simulations with size control in G1 and a timer in S/G2/M but where we changed the duration of the S/G2/M timer phases of the cell cycle to be significantly shorter or longer than the G1 phase at equilibrium. When a timer in S/G2/M is relatively shorter compared to G1, we generally see more sizerlike behavior can evolve (Figure 4C, Welch’s ttest on agglomerated data p=4 x 10^{–3}), while when it is relatively longer, we see more adderlike or even timerlike behavior (Figure 4D, Welch’s ttest on agglomerated data p<10^{–4}).
We next considered the effect of changing the amount of noise in the timer phase of the cell cycle. To do this, we examined the evolution of networks performing size control in G1 and where the S/G2/M phase with an increasing amount of noise. Increasing the noise in the timer progressively reduced the amount of size control done by the network (Appendix 1—figure 5). This is likely because the fixed duration of S/G2/M allows the system to accurately reset protein concentrations for the subsequent cell cycle to promote accurate G1 control (Willis et al., 2020). We also examined the effects of adding noise to the cellular growth rate and to volume partitioning at division and found similar results (Appendix 1—figures 6–7).
Taken together, our simulations show how the structural features of the cell cycle are important for determining what type of size control ultimately evolves. G1 control is more conducive to the evolution of adders, while S/G2/M control is more conducive to sizers. Moreover, size control can be dominated by the cell cycle phase of longest duration and is modulated by the specific selection criteria.
A twostep evolutionary pathway for cell size control
From our series of evolution simulations, we found a somewhat paradoxical inverse correlation between the added volume slope quantifying the degree of size control (sizer vs. adder) and the $C{V}_{Birth}$ (Figure 4B–C). This was surprising because it means that there is a broader distribution of volumes in the sizer regime where control should be more effective in theory. To better understand this inverted correlation between Slope $\Delta {V}_{Cycle}$ and $C{V}_{Birth}$ , we revisited our evolutionary simulations to examine the evolutionary pathways through which the networks progressed through the simulated epochs.
A typical evolutionary pathway for a Pareto simulation of the S. pombelike network structure presented in Figure 4B is shown in detail in Figure 5. Here, Model A1 topology is conserved throughout evolution although individual parameter values change. In the early stages of the evolution (epoch 650), we typically see dynamics where small cell size triggers an overshoot to a large cell size, which is then reduced through a series of rapid divisions (Figure 5A). Because of this smallsizetriggered overshoot, the system behaves more like a sizer when the average behavior is analyzed (Figure 5B). However, the high degree of variability in the cell cycles also leads to a broad distribution of volumes (Figure 5C). The variability in volume is attenuated in later epochs where there are fewer and smaller volume overshoots triggered by small cell size (Figure 5D–I). Since small cell size no longer triggers a dramatic amount of cell growth, the slope of $\Delta {V}_{Cycle}$ is increased and the system converges towards a weak adder in which the distribution of volume at birth is more Gaussian and the $C{V}_{Birth}$ is lower (Figure 5H–I). Taken together, these analyses suggest a twostep evolutionary pathway, consistent with the evolutionary dynamics first seen in Figure 2A. First, a strong but imprecise sizer mechanism evolves where, because of noise in the system, small variations in volume lead to a dramatic overcorrection and overshoot of the target volume. The variability in volume produced by this overshoot is then reduced by attenuating the strength of the size control response. Indeed, the overall weaker size control allows the system to respond more mildly to size deviations, thus yielding a lower $C{V}_{Birth}$ overall which we select for. Thus, selecting for a smaller $C{V}_{Birth}$ , i.e. better size control, can end up selecting for adders rather than sizers. This paradoxical result is consistent with the fact that when we select only for the number of cell divisions (${N}_{Div}$), one sees that more sizerlike behavior can evolve (Figure 4A–C). The typical behavior before optimization of $C{V}_{Birth}$ is a strong sizer as illustrated in Figure 5A.
Fluctuation sensing and the evolution of selforganized criticality
One of the main features of smaller cells is that they have fewer proteins and mRNA. If some aspects of protein synthesis and degradation are subject to Poisson fluctuations, we expect such fluctuations to produce larger concentration fluctuations in smaller cells. For example, let us assume that the balance of synthesis and degradation of a generic protein results into a Poisson distribution with parameter $\frac{\rho V}{\delta}$ , where $\rho $ is the synthesis rate in number of proteins per unit of time for a reference volume of 1, $V$ is the volume, and $\delta $ is the degradation rate. The average concentration of this protein in an exponentially growing cell will be $\frac{\rho}{\delta}$ , which is independent of the cell volume $V$ as expected from the production rate scaling. However, following the Bienaymé formula, the variance in the concentration is $\frac{\rho V}{\delta}\times \frac{1}{{V}^{2}}=\frac{\rho}{\delta V}$ (Figure 6A), which decreases with volume. This result makes intuitive sense because bigger cells have to produce more proteins to keep concentrations constant, so that the fluctuations in the relative number of proteins (and thus concentration) are smaller (see Jia et al., 2021 for a complete analytical study of how in general variance scales differently from mean when volume varies). Thus, if the cell could sense the size of concentration fluctuations in some way, it would be able to harness the cell sizedependence of such stochastic fluctuations to regulate cell division and control cell size.
To test if we can evolve networks that control cell size through sensing Poissonian fluctuations, we first initialized our simulation with a network similar to that shown in Figure 1C, but with an added selfactivating gene $A$ that can activate the production of the $I$ inhibitor. We then ran evolutionary simulations using the cell cycle structure of a sizer controlling G1 and timer in S/G2/M, but where the G1/S transition is regulated by the concentration $\left[I\right]$ instead of its quantity. We also used Pareto fitness optimization of ${N}_{Div}$ and of $C{V}_{Birth}$ . Importantly, we use the stochastic version of our equations with the molecular noise modeled using a Langevin noise term with a variance inversely proportional to the volume as explained above. We then extracted the most fit network and optimized it further using Pareto optimization of ${N}_{Div}$ and of the $\Delta {V}_{Cycle}$ slope to push the models towards the sizer regime. The most fit network of those combined evolutionary simulations is presented in Figure 6B and demonstrates that we can indeed evolve fluctuationbased cell size control (Model B).
The mechanism for size control that evolved based on sizedependent fluctuation sensing is remarkably similar to what we observed for models without sizedependent fluctuations (Figure 5A). For large volumes, the cycle has a constant period which corresponds to approximately 85% of the doubling time $\tau $. This ensures that in the highvolume regime, the system shrinks over time. When the volume is small however, fluctuations allow the concentration $\left[A\right]$ to cross the threshold of the highly nonlinear transcriptional activation of $\left[I\right]$ by $\left[A\right]$. This results in a massive increase of $\left[I\right]$ that needs to be degraded to progress further into the cell cycle. Thus, the low volume regime occasionally leads to a considerable increase in G1 length and a correspondingly very large cell at division. These very large cells then reliably and deterministically reenter multiple, rapid cell cycles with short G1 until the cell is small again and the concentration fluctuations again become large enough to trigger the activation of $\left[I\right]$ by $\left[A\right]$. This mechanism thus appears very similar to the early sizer mechanism observed in other quantity sensing simulations shown in Figure 5. However, here the mechanism is based only on sizedependent fluctuations in protein concentration and the overall behavior is closer to an adder (Figure 6E).
The system dynamics that evolved to perform fluctuationbased cell size control produce volume distributions that are longtailed due to the stochastic occurrence of occasional exceptionally long G1 phases. Interestingly, the probability distribution of cell cycle durations follows a power law (Figure 6F), which is due to the very broad distributions of G1 duration at lower cell volumes. A more controlled analysis specifying the initial conditions showed that cell cycles get increasingly long, and their distributions widen with decreasing control volume ${V}_{C}$ (Figure 6G). Such nonGaussianity is the hallmark of critical behavior, suggesting that the evolution of fluctuationbased cell size control is based on selforganized criticality (SOC). SOC is defined as a system where an order parameter feeds back on a control parameter (Sornette et al., 1995; Vidiella et al., 2021). The canonical example of SOC is the sandpile to which grains of sand are added on top. As the sand accumulates, the slope steepens, and the angle of the pile (control parameter) increases. Eventually, this triggers avalanches (order parameter) that feedback to dramatically reduce the angle of the pile. This ensures that the system dynamically tunes itself at the critical value of the angle of the pile where avalanches can occur.
We conclude that our evolved size control network exhibits SOC based on several observations. Starting from a high volume, multiple divisions at a rate faster than it takes to double the biomass reduce cell volume $V$ just like the addition of grains of sand gradually increases the slope of the pile. Then, for small enough volumes, bursts of $\left[A\right]$ drive an extended G1 that greatly increases cell size, which, like the sandpile avalanches, resets the system’s control parameter (volume of the cell or angle of the sandpile). Interestingly, evolution tuned the system to be near a bifurcation (Figure 6I). If we consider the deterministic regime, in which the fluctuations in $\left[A\right]$ are small, the cycle is unperturbed and oscillates with a period roughly equal to 85% of the doubling time (Figure 6J). In contrast, if we consider the noisy regime, in which the fluctuations in $\left[A\right]$ are large, the cycle disappears, and the system stays locked in a prolonged G1 state with a high value of $\left[I\right]$ which is akin to a bifurcation destroying the cycle (Figure 6K). This bifurcation takes place because the position of the G1 attractor for $\left[I\right]$ becomes larger with increasing $\left[A\right]$ and eventually overcomes the concentration required to induce the stochastic G1/S transition (Figure 6H). Then, the system remains stuck in a state where G1/S cannot be triggered, and cells effectively exit the cycle. As growth occurs, noise dies down and so does the position of the G1 attractor, eventually becoming smaller than the $\left[I\right]$ concentration required to induce the G1/S transition which allows cells to reenter the cell cycle. Thus, the system is critical from a dynamical systems standpoint and also fits the general observation that SOC systems tune themselves to be right at the point where the order parameter is non zero, but infinitesimal (Sornette et al., 1995). In our case, the bifurcation corresponds exactly to the point where $\left[A\right]$ can sufficiently activate the production of $\left[I\right]$ to prevent the G1/S transition.
Discussion
The last decade saw an explosion of time lapse microscopy studies measuring how cells control their size. These studies revealed diverse phenomena that are characterized by the correlation between cell size at birth and cell size at division. Size control ranged from sizers, where the size at division is uncorrelated from the size at birth, to adders, which add a constant volume in each cell division cycle, to timers, whose cell cycle duration is sizeindependent (Cadart et al., 2018; Eun et al., 2018; Jun et al., 2018; Willis and Huang, 2017; Wood and Nurse, 2015; Zatulovskiy and Skotheim, 2020). The presence of these diverse phenomena raises the question as to why the underlying control networks evolve one rather than another type of cell size control?
To explore the evolution of cell size control networks subject to distinct selection pressures, we used computational evolution simulations. We initially examined the evolution of a seed cell cycle model consisting of G1 and S/G2/M phases of similar duration, where the G1 phase was free to evolve sizedependence, but the S/G2/M phase was constrained as a timer. Our simulations reliably evolved a control mechanism based on a Mixed Feedback Loop (François and Hakim, 2005, Figure 2). This network is centered on a cell cycle regulator ($I$) that inhibits the G1/S transition in proportion to its quantity. $\left[I\right]$ is titrated away into an inactive complex by an increasing amount of another protein $\left[R\right]$ that is synthesized in proportion to cell size. This results in a sizedependent decrease in the effective cell cycle inhibitor (free $\left[I\right]$). Thus, our evolved network implements an effective dilution of a cell cycle inhibitor that is conceptually similar to the welldescribed inhibitor dilution models of budding yeast, human cells, and Arabidopsis plants (D’Ario et al., 2021; Schmoller et al., 2015; Xie and Skotheim, 2020; Zatulovskiy et al., 2020). We note that we did not allow the synthesis of our proteins, such as $\left[I\right]$, to be sizeindependent as has been found for budding yeast (Chen et al., 2020; Schmoller et al., 2015; Swaffer et al., 2021) as this could result in a onestep implementation of cell size control through the pure dilution of a cell cycle inhibitor. It is therefore interesting that given the constraint that all proteins be made in proportion to cell size, the network still evolved an effective ‘dilution’ of the active form of the cell cycle inhibitor molecule $\left[I\right]$ .
Our evolution simulations gave insight into factors that bias evolution towards sizer or adder type control mechanisms (Figure 4). First, it is worth noting that our evolution simulations were not deterministic. There was no onetoone correspondence between a given evolutionary pressure and any one specific cell size control mechanism. Rather, our claims represent an average behavior observed over the course of many simulations. Size control, as measured by the CV at a particular point in the cell cycle, has contribution both from the slope of the correlation between cell size and the amount of cell growth, and from the amount of noise characterizing the differences between cells that are initially the same size (Di Talia et al., 2007). It is therefore possible that a low noise adder can produce a lower CV than a higher noise sizer. This is reflected in the evolutionary paths of some of our simulations, which traverse from a noisy sizer to a less noisy adder (Figure 5). However, we anticipate even noisy sizers will be better than adders at controlling cell size in response to large deviations away from the steady state distribution. This is because sizers will always return the cell size to be within the steady state distribution within a cell cycle. We note that these generic results of how sizers and adders can govern cell size homeostasis can be derived from more traditional analytical methods (Barber et al., 2017; Willis et al., 2020). However, our evolution simulations are particularly useful because the molecular networks that evolved give nontrivial insights into how the observed size homeostasis dynamics can be regulated (e.g. via a Mixed Feedback Loop or using a system close to criticality). They are also suggestive of evolutionary pathways: despite different evolutionary modalities and control types, a natural step in many of our simulated evolutions is a system with strong sizers at very small volume only (Figures 5 and 6). This is practically reflected in a strongly negative slope on the very left side of the $\Delta {V}_{Cycle}$ plots, and a positive slope at higher volume corresponding to timers (Figures 5B and 6E). Similar nonmonotonicity of $\Delta {V}_{Cycle}$ has been identified in models of various realism and complexity (ChandlerBrown et al., 2017; Delarue et al., 2017) and we provide here an evolutionary explanation for such an effect. We thus predict that this will be observed in systems where CV at birth does not need to be tightly controlled.
In the selection of a size controlling G1 network followed by a timer in S/G2/M, we observed a prevalence of adders that is consistent with the prevalence of adders reported in the literature. While fewer in number, sizers have also been observed. That the most accurate sizers have been observed in the fission yeast S. pombe (Fantes, 1977; Sveiczer et al., 1996; Wood and Nurse, 2015), and that this organism performs cell size control at G2/M rather than at G1/S led us to explore the effect of cell cycle structure on the evolution of cell size control. We found that controlling cell size later in the cycle in S/G2/M biases evolution away from adders and towards sizers. In retrospect, this result can be rationalized since any size deviations incurred earlier during the timer period can be compensated for by the end of the cycle with the sizer. However, when the order is inverted, any size deviations escaping a G1 control mechanism would only be amplified by exponential volume growth during the S/G2/M timer period. A second recent case exhibiting sizer control was found in mouse epidermal stem cells, which exhibit a greatly elongated G1 phase and a relatively short S/G2/M phase (Mesa et al., 2018; Xie and Skotheim, 2020). We found that if we increased the relative duration of G1 in our simulations by shortening the S/G2/M timer, we also see a bias towards sizer control. In essence, by extending G1 to a larger and larger fraction of the cell cycle the control system is gradually approaching a size control taking place at the end of the cell cycle, that is, an S/G2/M size control. Taken together, these simulations suggest the principle that having sizedependent transitions later in the cell cycle selects for sizers, while having such transitions earlier selects for adders.
In addition to identifying cell cycle structural features that canalize evolution towards sizers and adders as described above, we also observed an intriguing mechanism relying on molecular fluctuations. In this case, small cells would trigger an abnormally long G1 that would result in very large cells that decrease in size through a series of rapid cell divisions. This type of size control is reminiscent of that found in the green algae Chlamydomonas where a series of rapid, sizereducing cell divisions cease when cells go below a target size (Heldt et al., 2020). In our case, small size results in larger concentration fluctuations due to Poisson noise in the number of molecules. These concentration fluctuations, when large enough, are then able to trigger a burst of G1/S inhibitor that leads to an extended G1 phase and massive cell size growth before another series of rapid cell divisions is initiated (Figure 6). Interestingly, the system thus performs statistical size control over many generations. Intriguingly, this size control mechanism exhibits hallmarks of selforganized criticality (SOC). Just like adding grains of sand to a pile eventually triggers avalanches, the consistent decrease of cell size in the rapid division cycles eventually triggers a greatly extended G1 phase. To our knowledge, this is the first example where selforganized criticality is obtained in artificially evolved models of gene networks performing a welldefined function and is consistent with the idea that evolution of complex systems can favor the emergence of critical processes.
Materials and methods
Mathematical formalism
Request a detailed protocolTo model gene networks, we follow a standard ODE based formalism, where we simulate dynamics of the concentrations of proteins. We use Hill functions for transcriptional interactions, and standard mass action kinetics for proteinprotein interactions. We also assume that all proteins are degraded at a constant rate. For most of the simulations presented in the paper, we use deterministic ODEs for simplicity. Importantly, celltocell variability arises from the precise timing of cell cycle progression events. This allows for a natural way to generate noise on cell volume that should then be compensated for by the evolved network. In the last part of the paper, we explicitly include Langevin noise for biochemical reactions that are modeled using a classical tauleaping formalism (Gillespie, 2007). Thus, each biochemical reaction takes place with a rate that corresponds to the deterministic rate, to which we add one white Gaussian noise with a variance equal to that rate. For example, given a deterministic biochemical rate $k$ and a time interval of size $\Delta t$, we consider a tauleaping change of $k\mathrm{\Delta}t+\mathcal{N}\left(0,k\mathrm{\Delta}t\right)$ where $\mathcal{N}\left(0,k\mathrm{\Delta}t\right)$ is a random gaussian variable of mean 0 and variance $k\Delta t$.
Volume influences protein dynamics in three ways. First, protein production rates are generally proportional to cell volume so that proteins reach and maintain a constant concentration that is independent of the cell volume (Chen et al., 2020; Elliott and McLaughlin, 1978; Newman et al., 2006; Swaffer et al., 2021). We note that we are not allowing the cell to employ proteins such as Whi5 in budding yeast whose production is independent of cell size so that its concentration is a direct readout of cell size (Schmoller et al., 2015; Swaffer et al., 2021). We chose to do this because we want to explore how cell size control can be done by a network with multiple feedbacks rather than just the concentration of a single protein with a special dedicated synthesis mechanism. Thus, the only deterministic influence of volume on concentration dynamics is on the dilution rate, which is proportional to the cell growth rate $\lambda \left(V\right)$ (see details in Appendix 1). At cell division, we also assume that proteins are equally partitioned between the daughter cells, that is, the concentration is the same before and after division. Note that we scale all our variables so that a concentration of one arbitrary unit corresponds roughly to 1000 proteins in a 100fL cell (Milo et al., 2010). Additionally, we scale the time variable so that 1 arbitrary time unit corresponds roughly to 30 min (Di Talia et al., 2007).
In this study, we chose a hierarchical way of introducing noise in the system, starting with the biggest contributing factor and incrementally adding additional sources of noise in subsequent analyses. All simulations presented include noise (stochastic control of G1/S transition and timing of S/G2/M, see below) in the cell cycle phases, whose CV has been found to be as high as 50% (Di Talia et al., 2007). Then, we introduced protein production noise via Langevin noise because the CV of regulatory protein concentrations is typically 20–30% (Newman et al., 2006). Importantly, the cell volume also contributes to stochastic effects, which are larger in smaller cells with fewer molecules. Thus, for stochastic simulations, we include a multiplicative $\frac{1}{\sqrt{V}}$ contribution to the added Gaussian noise term (see more complete description in the Appendix 1).
We also checked that our results are largely invariant when adding other sources of noise (see Appendix 1—figures 5–7). In these simulations, we also included noise in cell growth rate (CV ~15%; e.g. Di Talia et al., 2007), and in mass partitioning at cytokinesis (CV ~10%; e.g. Zatulovskiy et al., 2020).
Evolutionary procedure
Request a detailed protocolTo evolve networks regulating cell size, we use the $\phi $evo software (Henry et al., 2018) with a modified numerical integrator accounting for volume dynamics and volume dependencies as described above (Figure 1B). $\phi $evo simulates the Darwinian evolution of a population of gene networks. A network is encoded with the help of a bipartite graph connecting biochemical species (typically proteins) and interactions between them (we use a custommade Python library). Networks are converted into an ensemble of stochastic differential equations using a Python to C interpreter. This code is then compiled and integrated on the fly to compute the behavior of the networks.
Each selection step in the algorithm is referred to as an epoch rather than the more commonly used term generation because we use the term generation to refer to cell divisions in the simulations. At each epoch of the algorithm, each gene network is simulated, and its fitness computed. Based on the fitness function(s) (see below), half of the networks are selected and duplicated, while the other half is discarded to maintain a constant population size. The duplicated networks are then randomly mutated. From the most to least probable, mutations consist in random changes of parameters of the network, random removal of interactions, and random additions of interactions or new proteins. Absolute mutation rates are adjusted as a function of the number of evolutionary epochs so that all networks in a population are mutated on average once per epoch. This implements a numerical equivalent of the biological Drake’s rule that mutation rates adjust with genome size (Lynch, 2007). Practically, this prevents the known phenomenon of codebloating in evolutionary simulations (Foster, 2001) and also means that the total number of epochs is a good proxy of the number of mutations (in random directions) needed to evolve the best networks. All of this is easily made with our customized Python library encoding networks. For more details on technical aspects and implementations of the $\phi $evo software, we refer the reader to Henry et al., 2018.
Realistic evolutionary processes select for multiple phenotypes in parallel. While tradeoffs between those phenotypes are nontrivial, it has been observed that phenotypes typically define an evolutionarily Pareto front (Shoval et al., 2012; Warmflash et al., 2012). We thus perform network selection using a Pareto mode (Warmflash et al., 2012), in which two distinct fitness functions are computed. During the selection step, networks are first Pareto ranked. For example, consider two networks A and B and two fitness functions f^{1} and f^{2}. f^{1}_{A} refers to the fitness of network A calculated with function f^{1}. Assuming fitness functions are to be maximized, we say network A Paretodominates network B if both f^{1}_{A} > f^{1}_{B} and f^{2}_{A} ≥ f^{2}_{B}. Rank 1 networks are networks which are not dominated by any other networks, Rank 2 networks are networks dominated only by Rank 1 networks, and Rank 3 networks are only dominated by Rank 1 and Rank 2 networks and so on. The algorithm then selects half of the population of Rank 1 networks using a fitness sharing algorithm to maximize population diversity (see details in Warmflash et al., 2012). One advantage of Pareto selection is the increased flexibility of the evolutionary process. Multiple fitness functions can provide different optimization paths in parameter space, which prevents the selection process from getting stuck in a local optimum of a single fitness function. We also perform a few simulations with only one fitness function, in which case networks are simply ranked based on their fitness.
We impose two evolutionary selection pressures in the form of two fitness functions. The first fitness function is simply the number of cell divisions during a long period, which we call ${N}_{Div}$ . This is consistent with the classical definition of fitness as optimizing the number of offspring and is to be maximized by the algorithm. The second fitness function is the coefficient of variation of the volume distribution at birth for those ${N}_{Div}$ generations, which we call $C{V}_{Birth}$ and is to be minimized by the algorithm. This penalizes broad distributions of volume at birth, which are detrimental to cell size homeostasis, which is what we aim to examine here. We further imposed fitness penalties to prevent way too small or too big cells, see Appendix 1. There, we also study alternative fitness functions, such as leastsquare residual function to minimize volume variation about a target size, and the fitted slope of the amount of volume added at each cycle to be minimized to drive models toward being a sizer.
Varying cell cycle structure
Request a detailed protocolWe also ran evolutionary simulations with different cell cycle structures. For evolutionary simulations where the G1/S transition was controlled by the concentration of $\left[I\right]$ , we simply change the probability to pass the G1/S transition to depend on concentration $\left[I\right]$ instead of its quantity $I$. For evolutionary simulations with a cell cycle structure similar to that found in the fission yeast S. pombe, we invert the cell cycle network structure. In this case, $I$ quantity controls division and the Switch is turned on for a fixed amount of time in G1. In terms of the relaxation oscillator, this means that the left branch is now S/G2/M and the right branch is G1.
Appendix 1
In section Mathematical implementation, we first describe how to modify deterministic and stochastic ODEs to account for a exponentially growing cell volume. In section Size control, we describe the three size control archetypes, namely the timer, the adder, and the sizer, as well as our implementation of the initial seed cell cycle model. Then, in section Evolutionary algorithm, we give details on the $\phi $evo evolutionary algorithm. In section Analysis of sources of noise, we present evolutionary simulations investigating the effects of noise on the evolved networks. Finally, in sections Model descriptions and Additional models, we provide parameter values and equations for the models presented in the main text and for additional models produced by our evolutionary simulations.
Mathematical implementation
Deterministic
All biochemical concentrations can be described as a quantity of molecules of a biochemical species divided by the volume that contains it. The fundamental equation describing all concentrations is thus:
Here, $X(t)$ is the quantity of molecules of an arbitrary biochemical species $X$ as a function of time and $V(t)$ is the volume of the cell containing said species as a function of time. We will use the bracket notation $[X](t)$ for concentrations. In this project, we will model all biochemical species directly at the concentration level and assume proteins are uniformly distributed in an exponentially growing cell volume. The absolute growth rate of the cell $\lambda$ is chosen to be constant over a large viable range of cell volume and is otherwise 0. Consequently, volume grows over time with a fixed absolute growth rate $\lambda =0.25$ over a viable volume range following the equation:
Deterministic rate equations describing the dynamics of the biochemical species at the concentration level have to be adjusted to take into consideration a timevarying volume. From Equation 1 and the derivative chain rule, we get:
which we can combine with Equation 2 to give:
The first term in the Equation 3, $f([X](t))$, corresponds to the usual biochemical reaction rates that occur when the volume of the cell is fixed. The second term, $\lambda [X](t)$, is a dilution term that we can interpret as an effective degradation of the concentration $[X](t)$ due to the exponential growth of the cell volume over time.
To further simplify the expression for $f([X](t))$, we make an assumption about the production rates of all biochemical species in our models. Let us consider the rate equation describing the dynamics of the quantity of an arbitrary protein $X$ with generic production rate $\rho $ and degradation rate $\delta $ contained in a fixed cell volume $V$.
Here, $f([X](t))$ is given by:
We assume that all the proteins in our models are constitutively expressed by the cell. In other words, the production rates $\rho $ are linear functions of the cell volume $\rho =\rho (V(t))={\rho}_{0}V(t)$. This ensures that the protein production rates scale with the volume such that concentrations stay constant over time which is a general feature of most proteins in S. cerevisiae (Chen et al., 2020; Newman et al., 2006; Swaffer et al., 2021). This yields: $f([X](t))={\rho}_{0}\delta [X](t)$
Taken altogether with Equation 3, the deterministic dynamics of a constitutively expressed arbitrary protein concentration $[X](t)$ contained in an exponentially growing cell volume are given by:
Stochastic
To simulate molecular noise, we follow a classical tauleaping formalism (Gillespie, 2007). Specifically, we choose the EulerMaruyama implementation to generate approximate solutions to stochastic differential equations (Kloeden and Platen, 1992).
As we have done before in the deterministic case, let us first consider the quantity of an arbitrary protein $X(t)$ in order to extract the equation for the concentration $[X](t)$. Let’s assume $X(t)$ is changing via a single biochemical reaction rate $\frac{dX}{dt}=g(X(t))$ over the time interval $[0,T]$ given $X(t=0)={X}_{0}$. We will show later how this approach can be generalized to include multiple reaction rates. We begin by partitioning the time interval in $N$ equal segments of length $\mathrm{\Delta}t$ such that $0<{t}_{1}<{t}_{2}<...<{t}_{N}=T$ with ${t}_{n}=n\cdot \mathrm{\Delta}t$, $n\in \{1,N\}$.
The EulerMurayama approximate solution to the stochastic differential equation at the discrete time points t_{n} is then recursively given by the following equation for $n\in \{1,N1\}$ where the single biochemical reaction is assumed to happen with a Poisson rate (corresponding to the deterministic rate), which adds one white Gaussian noise to the differential equations with a variance equal to that rate:
Here, ${X}_{n}=X({t}_{n})$, $g({X}_{n})$ is the drift term, $\sqrt{g({X}_{n})}$ is the diffusion term and $\mathcal{N}(0,1)$ is a random Gaussian variable of mean 0 and variance 1. In other words, ${X}_{n+1}$ is a random Gaussian variable of mean ${X}_{n}+g({X}_{n})\mathrm{\Delta}t$ and variance $g({X}_{n})\mathrm{\Delta}t$. Since this describes the quantity of proteins, we also have to consider the change in volume over time to recover the equation for the concentration of proteins, $[X](t)$. Thus, let’s consider the volume of the cell $V(t)$ at the discrete time points t_{n}, which is given recursively by the equation:
Here, we define ${V}_{n}=V({t}_{n})$ and recover the $\lambda {V}_{n}$ term from Equation 2. We assume that the volume time evolution is noiseless for simplicity. To recover, the differential equation describing the protein concentration, we evaluate the expression ${[X]}_{n+1}=\frac{{X}_{n+1}}{{V}_{n+1}}$. Thus, combining Equations 5 and 6, we get:
where we identify $\frac{1}{{V}_{n}}g({X}_{n})=g({[X]}_{n})$ as the rate equation describing the concentration of the consitutively expressed protein $[X](t)$ when the volume $V$ is fixed as described in the deterministic case. Then, we compute the derivative as:
Finally, we recover the approximate full differential equation for the protein concentration by expanding the prefactor in the last equation to the ${0}^{\text{th}}$ order in $\mathrm{\Delta}t$ assuming it to be small to give:
So far, we have assumed that there is only a single biochemical reaction rate $g([X](t))$. We can however easily generalize our approach to include additional reaction rates by summing up the contribution of each rate to the total differential equation. Given $M$ independent reaction rates ${g}_{i}([X](t))$ and the properties of random Gaussian variables, we can easily generalize:
Importantly, the $\mathcal{N}}_{i$ are Gaussian vectors accounting for the noise correlations associated with single reactions. For instance, imagine one protein ${X}_{k}$ turns into another protein ${X}_{p}$, then the corresponding Gaussian vector for this interaction takes the form $\mathcal{N}(0,1)(\overrightarrow{{x}_{k}}\overrightarrow{{x}_{p}})$ where $\overrightarrow{{x}_{k}}$ is vector of length corresponding with the number of variables in the system whose $k$th component is equal to 1 with 0s elsewhere. This indicates that the molecular fluctuation due to this reaction should have opposite signs for ${X}_{k}$ and ${X}_{p}$ as expected.
The term $\lambda {[X]}_{n}$ is a dilution term that corresponds to an effective degradation of protein concentration ${[X]}_{n}$ as seen in the deterministic case. Interestingly, we highlight the $\frac{1}{{V}_{n}}$ dependency in the noise term. We can understand this dependency intuitively by considering a protein production process with a Poisson parameter $\theta $. In this scenario, the mean and the variance of the protein quantity distribution is given by the parameter $\theta $. Going back to concentration space, there are an infinite number of combinations of protein quantity and volume that can give the same concentration $[X]=X/V$. Thus, we need to specify both the protein number $X$ and the volume $V$ to correctly model the molecular noise contributing to fluctuations in concentrations.
Size control
Initial seed network
To guide the evolutionary process, we begin with an initial seed network. We base our first seed network on the phenomenology of the budding yeast S. cerevisiae’s cell cycle, where cell size primarily regulates the timing of the START transition in late G1 (Schmoller et al., 2015). This regulation allows small daughter cells to delay the G1/S transition allowing them to catchup in size by extending the G1 phase. S/G2/M duration on the other hand is largely independent of cell size. We note that while budding yeast divide asymmetrically, our simulated cells divide symmetrically. In our simple initial seed network, the cell cycle consists of two phases, G1 and S/G2/M, which respectively denote the preG1/S and postG1/S phases of the cell cycle. The transition between these two phases is controlled by the level of a transcription regulator we call $I$. Like the Whi5 protein in S. cerevisiae, $I$ is an inhibitor of the G1/S transition such that the lower its level, the higher the chances of cell cycle progression. Since protein production rates were assumed to be dependent on volume (as described in section Mathematical implementation), we found that $I$’s concentration alone was largely independent of volume and could not trigger a sizedependent G1/S transition as Whi5 does in budding yeast. Thus, we chose the quantity of $I$ defined as $\text{I}(t)=[\text{I}](t)\times V(t)$ as the control variable for this transition. We chose to model the probability of the G1/S transition occurring at the next time point of the simulation with a sigmoidshaped curve given by Equation 9 that can be visualized in Appendix 1—figure 1. We maintain $\theta =0.8$ and ${n}_{\theta}=8$ fixed throughout this project. We chose these values because they give a similar amount of noise in the G1/S transition as observed experimentally (Di Talia et al., 2007; ChandlerBrown et al., 2017).
In our seed model, we encode cell cycle state using a binary variable S/G2/M Switch, which is 0 in G1 and turns to 1 in S/G2/M once the G1/S transition takes place. Following S. cerevisiae’s cell cycle structure where S/G2/M duration is independent of cell size, we fix S/G2/M duration to be $\sim 50\%$ of the doubling time $\tau =\frac{1}{\lambda}\mathrm{ln}2$ with uniform noise unless stated otherwise. This way, cells can tune the length of their cell cycle by adjusting G1 length while being constrained by the incompressible length of the timer in S/G2/M, ${T}_{\text{S/G2/M}}$.
Following S/G2/M, cells divide such that ${V}_{\text{Birth}}^{(n+1)}={V}_{\text{Division}}^{(n)}/f$ with $f$ the division fraction. The $n$ exponent here is referencing the $n$th generation in the cell lineage. We choose $f=2$ for all simulations performed in this study unless explicitly mentioned otherwise. We assume perfect partitioning of all proteins between the two daughter cells such that the proteins’ concentrations remain the same before and after division. After division, we follow one of the two daughter cells during their own subsequent cycle. If we simulate the cell lineage for a long time, ergodicity guarantees that all volume states will be visited given stable growth and we can extract population statistics from the lineage data itself. Here, cell growth is exponential on the single cell level since we were assessing size control mechanisms that take size as an input to cell cycle control. We are not exploring the very interesting case where growth deviates from the exponential. In that case, size homoeostasis would have a contribution from some cells in the population outcompeting others in terms of their growth and we would have to simulate the entire cell population and not disregard one of the daughter cells as we do here.
Inspired by the dynamics of Whi5, which is produced in S/G2/M and diluted in G1, we chose an initial seed network where S/G2/M Switch activates the transcription of the $I$ inhibitor. This ensures that the concentration $[\text{I}]$ is ‘reset’ to a higher value following S/G2/M and prevents cells from skipping entirely the G1 phase of the subsequent cycle which would quickly send the volume of the cell converging quickly towards 0. We note that this interaction systematically appeared anyway in our early evolution simulation so we chose to include it in the initial seed network to accelerate the evolutionary process.
Size control archetypes
Size control mechanisms are often compared to three wellcharacterized models or archetypes in order to quantify the strength of the size control mechanism under study. Specifically, there are timers, adders and sizers. In this subsection, we will define each archetype and show that we can summarize them via a control volume response curve $T({V}_{C})$ as described in the main text.
First, let’s consider the timescale of growth. Given $V(t=0)={V}_{0}$, the solution $V(t)$ of the volume Equation 2 is $V(t)={V}_{0}{e}^{\lambda t}$. From this equation, we can easily recover the doubling time $\tau $ defined as the time required to double a cell’s volume ($V(\tau )=2{V}_{0}$) which yields:
The doubling time $\tau $ is only a function of the growth rate $\lambda $. Since we are only considering symmetrical division events, fixed interdivision times shorter than the doubling time will yield progressively smaller daughter cells. Similarly, fixed interdivision times longer than the doubling time will yield progressively larger daughter cells. Thus, the absolute growth rate $\lambda $ sets the timescale for cell cycle dynamics if we want to simulate a stable cell lineage.
With the quantity sensing of the inhibitor I at the G1/S transition (see Equation 9), we find that the instantaneous volume at G1/S sets the concentration of the biochemical species for the rest of the cycle and until the next G1/S transition in the daughter’s cell cycle. Consequently, it regulates the timing of the G1 phase of the daughter cell and thus creates a return map for the volume at G1/S. We find that the volume at G1/S at the $(n+1)$th generation ${V}_{\text{G1/S}}^{(n+1)}$ is given recursively by:
To study the mechanisms of cell size control, we choose to define a useful new variable: the control volume ${V}_{C}$. This control variable is independent from the biochemical network and maintained fixed allowing us to break the size feedback, and distinguish its input, the volume $V$, from its output, the induced cycle period $T$. With this new variable, we can modify the control at G1/S by forcing the transitions to trigger once $\mathbf{I}}_{C}=[\mathbf{I}]\times {V}_{C$ is low enough. We can then extract the response curve of the system, that is, the cell cycle period induced from sensing this control volume at G1/S $T({V}_{C})$. We represent this process schematically in Appendix 1—figure 2A.
The control volume at which the response curve $T({V}_{C})$ is equal to the doubling time $\tau $ corresponds to the equilibrium volume, ${V}_{eq}$, for this network. Indeed, if $T({V}_{\text{eq}})=\tau $, then the cell cycle length will ensure that this cell exactly doubles its volume during its cell cycle and returns to the same ${V}_{\text{eq}}$ at the next generation. This volume is a fixed point of the volume return map and can be either stable or unstable. Theoretically, there could be size control mechanisms with multiple fixed points of the response curve, but practically we have not seen this emerge from any of our evolution experiments and therefore assume that ${V}_{\text{eq}}$ is unique. In the main text, we have substituted ${V}_{\text{eq}}$ by the average value of the volume at the time where volume is sensed as both of these values are essentially identical. This corresponds to $\u27e8{V}_{\text{G1/S}}\u27e9$ for models with a sizer in G1 and a timer in S/G2/M and $\u27e8{V}_{\text{Division}}\u27e9$ for networks with a timer in G1 and a sizer in S/G2/M.
Size variation naturally occurs in our models due to the precise timing of G1 and S/G2/M cycle phases which are both noisy, so the volume does not stay at ${V}_{\text{eq}}$ for very long. The stability of growth around this equilibrium volume however will depend on the sign of the local derivative with respect to control volume of the $T({V}_{C})$ response curve and we will consider the following three cases:
is strictly increasing with ${V}_{C}$. In this case, small deviations around ${V}_{\text{eq}}$ are amplified over successive generations and the volume quickly shrinks to 0 or explodes to $\mathrm{\infty}$. In this case, we say that ${V}_{\text{eq}}$ is an unstable fixed point of the response curve.
$T({V}_{C})$ is constant with ${V}_{C}$. In this special case and assuming exponential growth of the volume, the only stable mode of growth corresponds to the response curve $T({V}_{C})=\tau $. This corresponds to the only stable timer archetype. In this particular scenario, ${V}_{\text{eq}}$ is not well defined as there are an infinite number of volumes where the response curve intersects the doubling time.
$T({V}_{C})$ is strictly decreasing with ${V}_{C}$. In this case, cells correct for size deviations over successive generations and perform size control. In this case, we say that ${V}_{\text{eq}}$ is a stable fixed point of the response curve.
Here, we found the $T({V}_{C})$ curves that were selected by the evolutionary algorithm were all decreasing with ${V}_{C}$ as expected for stable size control mechanisms. Thus, for the remainder of this document, we will assume that ${V}_{\text{eq}}$ is uniquely defined and corresponds to a stable fixed point of the control volume response curve.
Timer
The timer archetype describes mechanisms that monitor time rather than size. If the cycle duration of the timer is tuned precisely to the doubling time $\tau =\frac{1}{\lambda}\mathrm{ln}2$, cells will double their mass over the course the cell cycle to ensure that newborn daughter cells have the same volume at birth as their mothers did when they were born. Consequently, this category of mechanisms is notoriously bad at correcting for size deviations given exponential cell volume. If growth was linear however, this mechanism would allow for size control to take place.
Adder
The adder archetype describes mechanisms where cells add a constant amount of cell volume during each cycle. We define $\mathrm{\Delta}$ the increment of added volume between birth and division, that is $\mathrm{\Delta}={V}_{\text{Division}}{V}_{\text{Birth}}$. For adder mechanisms, the added volume at each cycle is constant and does not depend on cell size. In this case, initial size deviations are reduced by a factor of 2 at each division such that the volume at birth geometrically converges to the added volume $\mathrm{\Delta}$ over successive generations. To recover the adder response curve ${T}_{\text{Adder}}({V}_{C})$, we consider that by definition, ${V}^{\text{Division}}={V}^{\text{Birth}}{e}^{\lambda {T}_{\text{Adder}}}$. From this equation and the definition of the adder, we can recover the cycle period ${T}_{\text{Adder}}$:
We would like write this equation as a function of the control volume ${V}_{C}$ at G1/S and the equilibrium volume ${V}_{\text{eq}}$ alone. Assuming that the G1/S transition is followed by a timer in S/G2/M, we can write ${V}_{\text{Birth}}={V}_{C}{e}^{\lambda {T}_{\text{S/G2/M}}}/2$. Similarly, since we know by definition that ${T}_{\text{Adder}}({V}_{\text{eq}})=\tau $, we can recover that the added volume increment is $\mathrm{\Delta}={V}_{\text{eq}}{e}^{\lambda {T}_{\text{S/G2/M}}}/2$. Finally, we can combine these two expressions with to recover the final expression of the adder response curve:
We note here that if the control volume was measured at division or at birth, the response curve of the adder would be unchanged. The only difference would be that both ${V}_{C}$ and ${V}_{\text{eq}}$ would correspond to volumes at division or birth volumes instead of volume at the G1/S transition. Here for example, the volume increment $\mathrm{\Delta}$ corresponds to the ${V}_{\text{eq}}$ at birth.
Sizer
The sizer archetype describes mechanisms that measure size directly and allow a cell to return to a target volume ${V}_{\text{Target}}$ after a single generation, irrespective of how big or small a cell was initially. For sizers, ${V}^{\text{Division}}={V}_{\text{Target}}={V}^{\text{Birth}}{e}^{\lambda {T}_{\text{Sizer}}}$ by definition.
We can then extract:
We can then write ${V}_{\text{Birth}}$ and ${V}_{\text{Target}}$ as a function of the control volume at G1/S ${V}_{C}$ and the equilibrium volume ${V}_{\text{eq}}$. Using the same definitions as before ${V}_{\text{Birth}}={V}_{C}{e}^{\lambda {T}_{\text{S/G2/M}}}/2$ and ${T}_{\text{Sizer}}({V}_{\text{eq}})=\tau $, we find that ${V}_{\text{target}}={V}_{\text{eq}}{e}^{\lambda {T}_{\text{S/G2/M}}}$. The sizer response curve thus follows:
We note again here that if the volume was measured at division or at birth, the equation for the sizer would be identical with the only difference being that the control volume ${V}_{C}$ and the equilibrium volumes ${V}_{\text{eq}}$ would correspond to division or birth volumes respectively.
The three archetypes’ response curves are shown in Appendix 1—figure 2B. From these curves and given the particular cell cycle structures we examined, we can extract multiple relevant measures of size control such as the volume at birth, G1/S, and division from which we get the added volume during each phase of the cell cycle. It is noteworthy that the derivative of the added volumes $\mathrm{\Delta}V$ with respect to the birth volume ${V}^{\text{Birth}}$ for the timers, adders, and sizers, are respectively 1, 0, and –1. Size control mechanisms are typically compared to the 3 archetypes by measuring the amount of added cell volumes over their cell cycles $\mathrm{\Delta}{V}_{\text{Cycle}}$ and then fitting a linear model to these data points. The fitted slope of the linear model then informs what archetype this particular mechanism is more akin to. Some models evolved with added volume slopes lower than –1 and we call those supersizers. Such mechanisms overcompensate for volume deviations about the equilibrium value which can increase variation in the size distribution instead of decreasing it.
Response curve of the seed networks
In light of the control volume and response curve definitions from subsection 2.2, we can revisit the initial seed model and investigate how different assumptions alter the stability of growth and division in a cell lineage. Specifically, we investigate the size scaling assumption of the protein production rates and the concentration vs. quantity sensing of the transcriptional regulator $I$ at G1/S as previously described in Section 1. We summarize our results in Appendix 1—figure 3.
First, let us consider the G1 trajectory of the transcriptional regulator $[\text{I}]$ who is solely produced during the S/G2/M timer. The dynamics of $[\text{I}](t)$ in G1 will be described by the following equation:
Here, the time variable $t$ represents the time since birth, ${[\text{I}]}_{0}=[\text{I}](t=0)$, $\delta $ is the protein’s degradation rate and $\lambda $ is the growth rate of the cell volume. This equation holds until the G1/S transition where the S/G2/M Switch is turned on again and G1 ends. Because the degradation of the inhibitor does not yet depend on volume in any way, the time spent in G1 will only be dependent on the ratio between: (1) the initial condition at birth ${[\text{I}]}_{0}$; (2) the final condition at the G1/S transition, ${[\text{I}]}_{\text{G1/S}}$.
We found that the size scaling assumption of the protein production rates influences the initial condition at birth ${[\text{I}]}_{0}$. When production rates scale with size, we find that ${[\text{I}]}_{0}$ is independent of volume. This is expected as this assumption was chosen specifically to model proteins whose concentrations are independent of size. Conversely, when we modify this assumption and consider that protein production rates are independent of size (e.g. like Whi5 in budding yeast), we find that the system produces a constant quantity of inhibitor instead of a constant concentration. This means that the initial concentration at birth ${[\text{I}]}_{0}$ scales as $1/V$.
Similarly, when imposing quantity sensing of I at G1/S, we found that the concentration of inhibitor at G1/S ${[\text{I}]}_{\text{G1/S}}$ scales as $1/V$. Finally, when imposing concentration sensing of I at G1/S, we found a constant concentration of inhibitor at G1/S ${[\text{I}]}_{\text{G1/S}}$ as was expected by design.
Together, those assumptions alter the scaling of the duration of the G1 phase of the initial seed cycle. We summarize these results and present the models’ response curves $T({V}_{C})$ in Appendix 1—figure 3 where each row and column corresponds to a specific combination of assumptions. There, in Appendix 1—figure 3A, we see that for the combination of sizescaling production rate and quantity sensing at G1/S, we get a cell cycle period that is increasing with control volume ${V}_{C}$. This is undesirable and leads to unstable growth of the cell lineage towards 0 or $\mathrm{\infty}$, but rewards the evolution of size control mechanisms that can prevent this unstable growth. We chose this initial seed model for most of our evolutionary simulations. In Appendix 1—figure 3B,C, we found that the two assumptions compensated each other to create sizeindependent timer models. The parameters of the network can be precisely finetuned to yield a response period of exactly $\tau $ as was done to produce the response curves shown here. Thus, it is technically possible to evolve a size control mechanism using these initial seed models, but we chose not to go down that path because we wanted to evolve an active size control mechanism. Finally, in Appendix 1—figure 3D, we see that if we assume that protein production rates do not scale with size and that the G1/S transition depends on the concentration of inhibitor $[\text{I}]$, we get an initial seed model that already accomplishes size control as it displays a response curve that decreases with control volume ${V}_{C}$. This simple model loosely corresponds to the Whi5 inhibitor dilution model of budding yeast (Schmoller et al., 2015) where a constant quantity of inhibitor Whi5 is present at birth (and thus a concentration $[\text{Whi5}]\propto 1/V$) and is passively diluted in G1 until it reaches concentration threshold that triggers the G1/S transition.
Evolutionary algorithm
Here we briefly describe the $\phi $evo evolutionary algorithm from Henry et al., 2018 that we used to evolve size control networks. We refer the reader to the original publication’s main text and supplementary material for a more thorough description of the algorithm. A schematic representation of the algorithm’s architecture is shown in Appendix 1—figure 4.
First, an initial seed network is selected by the user as the starting point of the evolution simulation. $\phi $evo then clones this first individual to create a population of networks. At each epoch, mutations are randomly applied to the networks of the population. Those mutations vary from topological changes to the network, where biochemical species or interactions can be added or removed, to nontopological changes, where the networks’ kinetic parameter values are modified. Following mutations, networks are ranked based on their performance at accomplishing the biological function we select for. This performance is encoded via a userdefined fitness function that is problem specific. We give details about the specific implementation of the fitness functions for cell size control in the following subsection. After ranking the networks, $\phi $evo proceeds to select the most fit half of the network population. The less fit half is then discarded and replaced by a copy of the most fit half to maintain a constant population size. With this, $\phi $evo completes the first epoch of the evolutionary process. We use the term epoch here rather than the term ’generation’ which we retain to describe a cell lineage. A predetermined number of epochs of mutation and selection are then performed after which a final population of networks is extracted.
Fitness
In order to rank and select networks based on their performance at accomplishing a specific biological function, we design a specific objective function that we call fitness. Here, we chose a fitness function that could quantify a model’s ability to produce many viable descendants during a fixed time period of length $t$. We initially considered a simple fitness function to be minimized by $\phi $evo, ${f}_{0}={N}_{Div}$. Here ${N}_{Div}$ is the number of divisions or generations in a cell lineage produced during a total time period of length $t$. Noise at the G1/S transition and in the S/G2/M timer duration act as a source of variation in volume at each generation which needs to be controlled by the evolved networks in order to prevent the cell volumes from diverging. Thus, networks that perform size control display a high number of divisions ${N}_{Div}$. The cycle duration distribution of a size control network will be centered around the doubling time $\tau $ in order to promote stable growth. Thus, on average, we expect a fit network to exhibit a maximum number of $\mathrm{max}({N}_{Div})=t/\tau $ divisions during a simulation of length $t$. Note that this number is mostly independent of the volume range selected by the evolutionary simulation as the doubling time $\tau $ is independent of the initial volume of the cell at the beginning of each cycle. There is a small effect on ${N}_{Div}$ from the initial conditions chosen for the system of ODEs modelling the cell cycle, but this effect is mostly negligible as long as the total time period $t\gg \tau $.
In our first evolution experiments, we found that the single objective function ${f}_{0}={N}_{Div}$ was insufficient alone to evolve a cycling network. A possible reason for this is that the fitness landscape in parameter space defined by $f}_{0$ is mostly flat far from the optimum and is difficult to navigate as it doesn’t incrementally guide the evolution process towards a proper size control phenotype. Indeed, a network that does not perform size control will display very few divisions before diverging towards sizes of 0 or $\mathrm{\infty}$. In contrast, a network that does performs some size control, even if performed badly, will display mostly stable growth with many divisions and the volume will not diverge over successive generations. There is thus an all or nothing effect with this fitness function. We found that optimization process would often get stuck in a local optima with a low number of divisions and could not find a path to the global optima of size control. Because of this, we chose to turn towards multiobjective Pareto optimization which aims to simultaneously optimize several fitness functions. The idea here is that an additional fitness function can guide the evolutionary process through a different path in parameter space and could allow the evolutionary procedure to escape local optima.
In the Pareto optimization framework, we consider $N$ equally important objective functions $\overrightarrow{f}=({f}_{0},{f}_{1},\mathrm{\dots},{f}_{N})$. Assuming that fitness functions are to be maximized, we say that individual $i$ (strictly) dominates individual $j$ if and only if their fitness ${\overrightarrow{f}}^{i}$ and ${\overrightarrow{f}}^{j}$ are such that:
The algorithm then selects half of the population of the highest rank using a fitness sharing algorithm to maximize population diversity. We refer the reader to Warmflash et al., 2012 for more details on this procedure.
For this project, we chose to limit the optimization process to two fitness functions. We chose ${f}_{0}={N}_{div}$ as the first fitness function. We tested different measures of size control for the second objective function which are described in the following subsections. In most of the cases, we chose the coefficient of variation of the size distribution at birth $C{V}_{\text{Birth}}$ to be minimized as the second fitness function. In any case, we typically run 10 independent realizations of a network’s performance and compute the average fitness score over those runs to buffer variations in fitness scores.
Residuals
We first tested a least squared residuals fitness function to be minimized by $\phi $evo. This function yielded successful evolutionary runs but was abandoned due to requiring a userdefined target volume ${V}_{t}$. Indeed, we wanted to avoid the bias where we could select for biochemical networks matching a specific volume range. We used the following equation for the fitness function with ${V}_{\text{Birth}}^{(n)}$ indicating the cell volume at birth at the $n$th generation in a lineage.
We nevertheless present the result of a successful evolution run using Pareto optimization of ${N}_{Div}$ and $f}_{1$ in Appendix 1—figure 11 where we obtain a version of the feedbackbased network topology of Model A1.
Coefficient of variation
We then considered the coefficient of variation of the size distribution at birth ($C{V}_{\text{Birth}}$) to be minimized by $\phi $evo. The $C{V}_{\text{Birth}}$ is a measure of size control that normalizes the variance of the size distribution at birth with respect to its mean and is thus mostly insensitive to the absolute volume range of the cell. We chose this second fitness function in most of the Pareto optimization evolution experiments as described in the main text.
Added volume slope in G1
We also considered directly optimizing the fitted slope of the volume added in G1 as a function of volume at birth to reinforce the sizer behavior in G1. As described in the main text, given a series of volume values at birth and at G1/S, ${V}_{\text{Birth}}^{(i)}$ and ${V}_{\text{G1/S}}^{(i)}$ for $i\in \{1,{N}_{\text{Div}}\}$, the added volumes in G1 are defined as:
The best fit slope $m$ of a linear model $\mathrm{\Delta}{V}_{\text{G1}}=m\cdot {V}_{\text{Birth}}+b$ has a closedform equation which is given as a function of the lineage data directly:
A slope of –1 corresponds to a sizer, a slope of 0 to an adder and a slope of +1 corresponds to a timer. In order to directly optimize the size control mechanism in G1 and to bias towards sizer mechanisms and to keep the fitness values positive, we considered the following fitness function to be minimized by $\phi $evo:
Fitness penalties
In order to keep biochemical concentrations and cell volumes at reasonable levels, we chose to bound volume growth to a range $V\in [{V}_{min},{V}_{max}]$ with ${V}_{min}=0.1$ and ${V}_{max}=100$. To guide the evolution of size control and to have a well defined steadystate distribution of cell sizes, we chose to impose fitness penalties on networks that would see their volumes reach one of those bounds at some point during a run. This applies to all evolution experiments performed in this study, but we will describe the case of the ${N}_{Div}$ and $C{V}_{\text{Birth}}$ fitness functions as they were used most of the time during this study.
Firstly, if the volume reached ${V}_{min}$, we considered the cell too small and declared it dead. At that point, any further cycles would not contribute to fitness scores. With this penalty, networks are guided towards preventing or at least delaying the time at which the volume becomes too small for the cell to remain viable.
Similarly, when volume reaches ${V}_{max}$, we penalize the fitness functions but in a different way. Because volume growth is restricted to the domain below ${V}_{max}$, we set the growth rate $\lambda =0$ when $V={V}_{max}$. In this case, we sometimes see networks make use of this growth arrest and exploit this artificial feature. This phenomenon can sometimes be seen in computational evolution where digital mirages are often exploited by optimization processes (Lehman et al., 2020). Here, such exploitative networks have initially long cycle periods $T\gg \tau $ and can sometimes spend the majority of their cell cycle in this growth arrest phase. Over successive epochs, this period $T$ gets shortened to get more and more divisions to take place during a run, shortening the time spent in growth arrest at each cycle. Eventually, this optimization leads to a particular type of model that has very sloppy size control but that scores highly with the fitness functions because of the artificial growth arrest. Such networks develop a timer with period $T\approx \tau $ and use the artificial growth arrest to buffer any volume variation incurred from late G1/S transition or noisy S/G2/M duration. These networks are optimal from an ${N}_{Div}$ perspective since they exactly double their mass at each cycle and perform the same number of division cycles during a run as an actual size control network. They are also more than optimal from a $C{V}_{\text{Birth}}$ since their growth is always stopped at ${V}_{max}$. Thus, without fail, their birth volume ${V}_{\text{B}irth}={V}_{max}/2$ and there is no variation at all in the distribution. This size control illusion is a global optimum of the 2Dfitness space and must thus be heavily penalized to prevent the optimization from selecting this phenotype. Thus, when $V={V}_{max}$, we only count 70% of the divisions ${N}_{Div}$ which is sufficient to distinguish this artificial phenotype from actual size control mechanisms. Additionally, we penalize the $C{V}_{\text{Birth}}$ score by adding to it a penalty of $+10$. This is significantly different from the usual range of coefficient of variations which lie between 0 and 0.5 typically, and prevents the optimization from selecting the artificial phenotype.
The introduction of these fitness penalties improved the convergence rate of the $\phi $evo algorithm significantly. Specifically, we believe the penalty on ${V}_{max}$ being somewhat less severe than the one for ${V}_{min}$ improved the convergence rate drastically. This is probably because the initial cell cycle model chosen for the evolution runs exhibits unstable growth and inevitably sees the cell volume grow to ${V}_{max}$ or shrink to ${V}_{min}$ rapidly as shown in Appendix 1—figure 3. Thus, the fitness landscape surrounding the initial cell cycle model is quite flat and is difficult to navigate from an optimization perspective. Gradual improvements to the ${N}_{Div}$ fitness function at ${V}_{max}$ by spending less and less time in growth arrest seemed to have helped the optimization process. This guided the algorithm towards better size control models more often than if penalties were absent.
Overall, even if the penalties somewhat biased the evolutionary process into following a phenotypic trajectory, they improved the convergence rate of the evolutionary process dramatically to the point that we decided to keep them for all experiments.
Biochemical interactions
Many ‘inverseapproach’ approaches in systems biology have focused on purely transcriptional networks (Cotterell and Sharpe, 2010; François et al., 2007; Fujimoto et al., 2008; Ten Tusscher and Hogeweg, 2011;), because they are generic, easier to study and can efficiently describe many biological dynamics (Alon, 2007). In this project, we extend the biochemical interactions available for evolution: we not only model transcriptional activation and repression but also include complexation also known as proteinprotein interaction (PPI), and assume there passive degradation. Adding PPIs is especially crucial because they are well known to lead to nonlinear effects (Buchler and Cross, 2009; Buchler and Louis, 2008) allowing for the simple implementation of complex dynamics such as genetic oscillations (François and Hakim, 2005) observed e.g. in circadian clocks (François, 2005), and such nonlinear effects indeed play crucial roles for control in our evolved model. The equations for these interactions are presented in this subsection.
We model transcriptional activation of arbitrary network species Y’s production from species X using the following Hill equation:
Similarly, repression of arbitrary species Y’s production from species X is modeled via the following Hill equation:
In these equations, ${t}_{X:Y}$ is the threshold required for $[\text{X}]$ to activate or repress $[\text{Y}]$ at 50% of its capacity and ${n}_{X:Y}$ is the Hill coefficient. While both types of interactions are modeled using Hill equations, their combined effects on a network species’ dynamics is computed differently. Indeed, only the maximum of all the activations is accounted for whereas the inhibitions are multiplicative and all are accounted for. Additionally, we allow some species to be produced at a basal rate $b$ independent of any activator which counts as an additional activation.
Altogether using an example, assuming multiple species ${\text{X}}_{1},\mathrm{\dots},{\text{X}}_{q}$ activate the production of species Z while multiples species ${\text{Y}}_{1},\mathrm{\dots},{\text{Y}}_{r}$ repress it, and assuming that Z has a basal production rate ${b}_{Z}$ and a maximum production rate ${p}_{Z}$, then the total contribution of these interactions to the ODE for the dynamics of $[\text{Z}]$ is given by:
We use the law of massaction to model PPIs and passive degradation. Specifically, if arbitrary species X and Y interact together and form a complex Z given a forward rate k_{f} and a backwards rate k_{b}, then the contribution of these interactions to the ODE for the dynamics of the system will be given by:
Lastly, all network species are assumed to be degraded at a passive rate. Thus, if an arbitrary species X is solely degraded with a rate ${\delta}_{X}$, then the dynamic equation for the dynamics of $[\text{X}]$ will be given by:
Analysis of sources of noise
As mentioned in the main text and in this Appendix, we chose a hierarchical way of introducing noise in the system, starting with the biggest contributing factor and incrementally adding additional sources of noise in subsequent analyses. We first included noise in the cell cycle phases, specifically in the timing of the G1/S transition and in the length of the S/G2/M phase. Then in the later parts, we introduced protein production noise modeled as Langevin noise.
In the simulations presented in the main text, we chose not to include noise in the growth rate and in the division ratio as the recorded noise level for in experiments for these measures is lower than that for the timing of the cell cycle and the protein concentration noise (Di Talia et al., 2007; Newman et al., 2006; Zatulovskiy et al., 2020). Nevertheless, those are crucial assumptions that we made that we chose to investigate in more details here.
In subsection S/G2/M noise, we investigate how the level of noise in S/G2/M affects the conclusions drawn in the main text. Then, in subsection Growth rate noise we do the same for noise in the growth rate and in subsection Division ratio noise for the division ratio.
Appendix 1—table 1 shows the values of $C{V}_{Birth}$ of the three models presented in the main text compared to the values reported in the literature for budding yeast (Di Talia et al., 2007), fission yeast (Sveiczer et al., 1996) and mouse epidermal stem cell grown in the animal (Xie and Skotheim, 2020).
S/G2/M noise
Here, we perform similar evolution experiments to those reported in Figure 4 of the main text to examine the effect of modulating the noise in the S/G2/M timer. We thus perform three independent experiment where we set the CV in the timer period to 0%, 5%, and 8% corresponding to no, medium, and high noise respectively. For reference, the CV of the timer period in the control condition where Model A1 was evolved is 3%. Note that we maintain the average duration of the timer to be about half the time it takes to double the cell’s volume. Having specified the S/G2/M timer parameters and starting from the initial seed network of Model A1, we perform evolution and select networks as previously. We compare ensembles of 60 networks for each noise level, half of them evolved under the Pareto optimization of ${N}_{\text{Div}}$ and $C{V}_{\text{Birth}}$ and the other half under the single objective optimization of ${N}_{\text{Div}}$. The results are shown in Appendix 1—figure 5.
Increasing the noise, progressively leads to a loss of the sizer signature and increases the $C{V}_{\text{Birth}}$. This is likely because the fixed duration of S/G2/M allows the system to accurately reset protein concentrations for the subsequent cell cycle to promote accurate G1 control (Willis et al., 2020). Thus, an increasing level of noise becomes associated with a worse accuracy in the size control mechanism which leads to loss of the sizer signature and increased $C{V}_{\text{Birth}}$ as can be seen in Appendix 1—figure 5D,E. Other results from the main text remain unchanged.
Growth rate noise
Here, we perform similar evolution experiments as we did for the noise in S/G2/M but this time by adding noise in the growth rate $\lambda $. Specifically, at each generation of a cell’s lineage, we sample a growth rate from a Gaussian distribution centered around 0.25, the initial value we used in the rest of this project. We perform three evolution experiments with coefficient of variations for the growth rate distributions set to 3%, 5% and 8% corresponding to low, medium and high noise respectively. We perform 30 independent evolution runs with the Pareto optimization framework for each $\lambda $ noise level, each of them starting from the initial seed network of Model A1. The results are shown in Appendix 1—figure 6.
We note that on average, the growth rate will remain centered around the same value, thus not affecting the optimal fitness score networks are able to achieve. However, individual variations at each cycle perturb the ability of the system of accomplishing size control by always modifying the doubling time $\tau =\mathrm{ln}(2)/\lambda $. This leads to progressive loss of the sizer signature and also increases the $C{V}_{\text{Birth}}$. This increased noise in the system sometimes sends the cell volume towards 0 or $\mathrm{\infty}$ as volume is kicked outside of the control mechanism’s working range. Since this behavior is highly penalized in the fitness functions scoring, the evolution finds a way to prevent this from happening via different strategies. Interestingly, at higher noise levels, strong sizers can still evolve but are not the most common phenotype. Instead, evolution seems to favor timers that reliably ensure timely cell division generation after generation. There is however a tradeoff and these models exhibit a higher $C{V}_{\text{Birth}}$ due to the lower amount of size control. Adders can also be evolved at all tested noise levels and provide good size control with reliably low $C{V}_{\text{Birth}}$.
Division ratio noise
Here, we perform similar evolution experiments as we did for the noise in S/G2/M and in $\lambda $, but this time by adding noise in the division fraction $f$. Specifically, at each generation of a cell’s lineage, we sample $f$ from a Gaussian distribution centered around 2, the initial value we used in the rest of this project for symmetrical divisions. We perform three evolution experiments with coefficient of variations for the growth rate distributions set to 2%, 4% and 8% corresponding to low, medium and high noise respectively. We perform 30 independent evolution runs with the Pareto optimization framework for each $f$ noise level, each of them starting from the initial seed network of Model A1. The results are shown in Appendix 1—figure 7.
The results of this experiment are very similar to those for noise in the growth rate $\lambda $ described in the previous subsection. Indeed, changing the division fraction $f$ does not change the doubling time $\tau $ directly like for $\lambda $. Instead, it changes the cycle period around which cells see their volume shrink or grow over successive generations. Indeed, for a division fraction $f$, this equilibrium time between shrinking and growth becomes ${\tau}_{f}=\mathrm{ln}(f)/\lambda $. Intuitively, if $f$ is bigger than 2, then cells need to spend a little bit more time in their cell cycles for growth to occur in order to compensate for this increased division fraction. This increased noise in $f$ leads to progressive loss of the sizer signature as seen before and also increases the $C{V}_{\text{Birth}}$. The division fraction noise has a big effect on premature cell death. Indeed, at the highest noise level tested where $C{V}_{f}=8\%$, we saw many runs end with premature cell death to the point where three evolution runs were completely unable to find a mechanism able to prevent this. As a result of this strong pressure to avoid premature cell death, evolution turns once again to timers instead of sizers in the noisier regime. As before, adders seem to be the most reliable size control phenotype exhibiting low $C{V}_{\text{B}irth}$.
Model descriptions
In this section, we give the full set of equations and parameter values of the models from the main text. We remind the reader that we scale all our variables so that a concentration of one arbitrary unit corresponds roughly to 1000 proteins in a 100fL cell (Milo et al., 2010). Additionally, we scale the time variable such that 1 arbitrary time unit corresponds roughly to 30 min (Di Talia et al., 2007).
Model A1
The parameter values of the Model A1 are shown in Appendix 1—table 2 along with the corresponding differential equations in Equation 24.
Model A2
The parameter values of the Model A2 are shown in Appendix 1—table 3 along with the corresponding differential equations in Equation 25.
Model B
The parameter values of the fluctuationsensing Model B are shown in Appendix 1—table 4 along with the corresponding stochastic differential equations in Equation 26. We note that the C_{0} parameter in the noise terms of the equation is a concentration scaling factor.
Additional models
Here we present additional models that were evolved but not discussed in the main text to provide more examples of size control mechanisms.
Model A3
Model A3 is similar to Model A1, but lacks the homodimerization interaction of $I$ (see Equation 24). This specific model results in a weak adder/timer that displays a nonlinear size control response curve. This is similar to Model A2 where we see a sizer/adder behavior in the low control volume regime and adder/timer behavior in the high control volume regime.
This model’s behavior is summarized in Appendix 1—figure 8. The parameter values of the model are shown in Appendix 1—table 5 along with the corresponding differential equations in Equation 27. This model was evolved with the Pareto fitness optimization framework to maximize ${N}_{Div}$ and minimize $C{V}_{\text{Birth}}$. The initial seed model topology for this evolutionary simulation was the quantity sensing oscillator shown in Appendix 1—figure 3A and was optimized during 3000 epochs.
Model A4
Model A4 is similar in essence to Model A1, albeit more unstable. In this model, ${\text{S}}_{4}$ is the size sensor. Instead of using a PPI to titrate $I$ in a sizedependent manner in G1, this model leverages a transcriptional repression to modulate the production of $I$ directly rather than its effective degradation. The model’s behavior is summarized in Appendix 1—figure 9. The parameter values of the model are shown in Appendix 1—table 6 along with the corresponding differential equations in Equation 28. Notably, Appendix 1—figure 9E shows that the model’s response curve in the low control volume regime is illdefined. In this model specifically, when the low volume regime is reached, the concentration of inhibitor $I$ at G1/S increases due to quantity sensing $[I]\propto 1/V$. Then, because of the homodimerization of $I$ into S_{3}, we see the concentration $[{S}_{3}]$ also rise. We can see both of these curves spike up momentarily in the trajectories of Appendix 1—figure 9D around $t\approx 136$. The problem arises if this increase is too strong. Then, S_{3} activates the production of additional $I$, kickstarting a positive feedback loop that creates more and more inhibitor $I$, effectively interrupting the oscillator and making the period illdefined. In practice, due to intrinsic noise at the G1/S transition and in the S/G2/M timer length, we’ve seen a cell lineage terminate prematurely before it can reach the maximum number of ${N}_{\text{Div}}$ allowed in a simulation because of this problem. One could say this model is thus less fit than those presented in the main text, although it displays an added volume slope over its cycle of –0.41 and is close to a sizer.
This model was evolved using a Pareto fitness optimization that maximizes ${N}_{Div}$ and minimizes $C{V}_{\text{Birth}}$. The initial model topology was the quantity sensing oscillator of Appendix 1—figure 3A which was optimized over 4000 epochs.
Model A5
Model A5 is another variation on Model A1 but here within the S. pombe cell cycle framework where $I$ controls the timing of division directly and the Switch is turned on in G1 instead of S/G2/M. Here, ${\text{S}}_{3}$ directly senses size and does so via the PPI linking I, ${\text{S}}_{3}$, and ${\text{S}}_{7}$. Indeed, since I is inversely proportional to the volume at G1/S due to quantity sensing and since ${\text{S}}_{7}$ is solely produced via complex formation of I with ${\text{S}}_{3}$, ${\text{S}}_{7}$ is also inversely proportional to volume. Consequently, ${\text{S}}_{3}$, which is almost solely produced via the dissociation of ${\text{S}}_{7}$, becomes a direct sensor of the size of the cell.
Then, instead of using a PPI to titrate $I$ in a sizedependent manner as is done in Models A1 and A2, Model A5 leverages a transcriptional repression mechanism to modulate the production of $I$ directly instead of its degradation, precisely like in Model A4. The model’s behavior is summarized in Appendix 1—figure 10. The parameter values of the model are shown in Appendix 1—table 7 along with the corresponding differential equations in Equation 29.
This model was evolved using a Pareto fitness optimization that maximizes ${N}_{\text{Div}}$ and minimizes $C{V}_{\text{Birth}}$. The seed model topology was the quantity sensing oscillator of Appendix 1—figure 3A which was optimized over 2500 epochs.
Model A6
Model A6 is yet another version of Model A1 evolved with slightly different fitness functions. This model was evolved using a Pareto fitness optimization that maximizes ${N}_{Div}$ and minimizes the sum of squared residuals from a target volume at birth as described in Equation 15. For reference, the target volume at birth chosen for this simulation was ${V}_{t}=10$. The initial model topology was the quantity sensing oscillator of Appendix 1—figure 3A which was optimized over 3000 epochs.
The model’s behavior is summarized in Appendix 1—figure 11. The parameter values of the model are shown in Appendix 1—table 8 along with the corresponding differential equations in Equation 30.
Data availability
This is a theory paper, so there is no experimental data, and all results were generated by the code. The code used is freely available at https://github.com/FelixPG/PhiEvo_SizeControl, (copy archived at swh:1:rev:afa7f16a2f8a9d793aa3685116c2436faae100dd). Reference to the code has been added in the text.
References

Network motifs: theory and experimental approachesNature Reviews. Genetics 8:450–461.https://doi.org/10.1038/nrg2102

Cell size regulation in bacteriaPhysical Review Letters 112:208102.https://doi.org/10.1103/PhysRevLett.112.208102

Details matter: noise and model structure set the relationship between cell size and cell cycle timingFrontiers in Cell and Developmental Biology 5:92.https://doi.org/10.3389/fcell.2017.00092

Molecular titration and ultrasensitivity in regulatory networksJournal of Molecular Biology 384:1106–1119.https://doi.org/10.1016/j.jmb.2008.09.079

Protein sequestration generates a flexible ultrasensitive response in a genetic networkMolecular Systems Biology 5:272.https://doi.org/10.1038/msb.2009.30

The physics of cellsize regulation across timescalesNature Physics 15:993–1004.https://doi.org/10.1038/s415670190629y

Scaling properties of cell and organelle sizeOrganogenesis 6:88–96.https://doi.org/10.4161/org.6.2.11464

Two redundant oscillatory mechanisms in the yeast cell cycleDevelopmental Cell 4:741–752.https://doi.org/10.1016/s15345807(03)001199

Controlling cell size through sizer mechanismsCurrent Opinion in Systems Biology 5:86–92.https://doi.org/10.1016/j.coisb.2017.08.010

Control of cell size and cycle time in Schizosaccharomyces pombeJournal of Cell Science 24:51–67.https://doi.org/10.1242/jcs.24.1.51

A model for the neurospora circadian clockBiophysical Journal 88:2369–2383.https://doi.org/10.1529/biophysj.104.053975

Core genetic module: the mixed feedback loopPhysical Review. E, Statistical, Nonlinear, and Soft Matter Physics 72:031908.https://doi.org/10.1103/PhysRevE.72.031908

Deriving structure from evolution: metazoan segmentationMolecular Systems Biology 3:154.https://doi.org/10.1038/msb4100192

Evolving phenotypic networks in silicoSeminars in Cell & Developmental Biology 35:90–97.https://doi.org/10.1016/j.semcdb.2014.06.012

Stochastic simulation of chemical kineticsAnnual Review of Physical Chemistry 58:35–55.https://doi.org/10.1146/annurev.physchem.58.032806.104637

Dilution and titration of cellcycle regulators may control cell size in budding yeastPLOS Computational Biology 14:e1006548.https://doi.org/10.1371/journal.pcbi.1006548

φevo: a program to evolve phenotypic models of biological networksPLOS Computational Biology 14:e1006244.https://doi.org/10.1371/journal.pcbi.1006244

Cellsize maintenance: universal strategy revealedTrends in Microbiology 23:4–6.https://doi.org/10.1016/j.tim.2014.12.001

BookNumerical Solution of Stochastic Differential EquationsBerlin, Heidelberg: Springer.https://doi.org/10.1007/9783662126165

BioNumbers  the database of key numbers in molecular and cell biologyNucleic Acids Research 38:D750–D753.https://doi.org/10.1093/nar/gkp889

Synthetic circuits reveal how mechanisms of gene regulatory networks constrain evolutionMolecular Systems Biology 14:e8102.https://doi.org/10.15252/msb.20178102

Mechanistic origin of cellsize control and homeostasis in bacteriaCurrent Biology 29:1760–1770.https://doi.org/10.1016/j.cub.2019.04.062

Mapping selforganized criticality onto criticalityJournal de Physique I 5:325–335.https://doi.org/10.1051/jp1:1995129

The size control of fission yeast revisitedJournal of Cell Science 109:2947–2957.https://doi.org/10.1242/jcs.109.12.2947

Evolution of networks for body plan patterning; interplay of modularity, robustness and evolvabilityPLOS Computational Biology 7:e1002208.https://doi.org/10.1371/journal.pcbi.1002208

Engineering selforganized criticality in living cellsNature Communications 12:4415.https://doi.org/10.1038/s41467021246954

Bacterial cell size: multifactorial and multifacetedAnnual Review of Microbiology 71:499–517.https://doi.org/10.1146/annurevmicro090816093803

Sizing up the bacterial cell cycleNature Reviews. Microbiology 15:606–620.https://doi.org/10.1038/nrmicro.2017.79

Sizing up to divide: mitotic cellsize control in fission yeastAnnual Review of Cell and Developmental Biology 31:11–29.https://doi.org/10.1146/annurevcellbio100814125601

A G1 sizer coordinates growth and division in the mouse epidermisCurrent Biology 30:916–924.https://doi.org/10.1016/j.cub.2019.12.062

On the molecular mechanisms regulating animal cell size homeostasisTrends in Genetics 36:360–372.https://doi.org/10.1016/j.tig.2020.01.011
Decision letter

Sandeep KrishnaReviewing Editor; National Centre for Biological Sciences‐Tata Institute of Fundamental Research, India

Aleksandra M WalczakSenior Editor; CNRS LPENS, France

Kabir HusainReviewer; University of Chicago, United States

Shiladitya BanerjeeReviewer; Carnegie Mellon University, United States
Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.
Decision letter after peer review:
Thank you for submitting your article "Evolution of cell size control is canalized towards adders or sizers by cell cycle structure and selective pressures" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Aleksandra Walczak as the Senior Editor. The following individuals involved in the review of your submission have agreed to reveal their identity: Kabir Husain (Reviewer #1); Shiladitya Banerjee (Reviewer #2).
The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.
Essential revisions:
The reviewers liked many aspects of the manuscript but have suggested a number of revisions, which include showing the robustness of the results, a better justification of the assumptions/methods, a comparison with existing data and mathematical approaches, including clarification of how the evolutionary approach differs from other approaches, and a more careful rewording of the conclusions. Please address all of the reviewer comments (see below).
Reviewer #1 (Recommendations for the authors):
Figures and text:
I think the paper would be greatly strengthened by less dense Figures. As it stands, I found it difficult to figure out what the main points are. As an example: the main point of Figure 5, as I understand it, might be best served by a timeseries plot of Slope \ΔV_{cycle}, or some other summary statistic that shows the transient evolution of a sizer. Otherwise, this point is buried in the legend of panels B, E, and H.
On feedback control mechanisms:
There are no statistical summaries of the simulations described in Figures 2 and 3 of the main text  Only Figure 4 (whose simulations are initialised with the evolved Model A1) contains statistics on evolved networks. If I understand the text, all the simulations resulted in topologically similar networks  is this the case. What is the range of CV_births, and does the achieved CV_birth depend on the molecular implementation of the feedback control (PPI, dimerisation, transcriptional control), or do these molecular details affect the \ΔV_{slope}?
On adders vs sizers (Figures4, 5, and lines 557 to 561 in the discussion):
Figure 4, 5, and the text around it, suggest that the CV_{birth} for adders is lower than that of sizers, in contrast to the common view that sizers are better at controlling cell size. I wonder if the distinction is between the *steadystate CV_{birth}* and the time taken to return to equilibrium from a large 'perturbation' of the cell size?
If it is true that adders are better at the former, but sizers are better at the latter, then would this also explain why cell size control late in the cycle tends to favour sizers? The intuition perhaps being that size control later in the cycle needs to deal with perturbations that occurred earlier in the cell cycle (as suggested by Lines 557 to 561)?
Reviewer #2 (Recommendations for the authors):
We have the following specific comments and recommendations for the authors.
1. Lines 7477: This is the one piece of the introduction where readers unfamiliar with the eukaryotic cell cycle would be confused. Including background information about G1/S and S/G2/M phases would help expand the target audience of the paper since the techniques discussed therein are widely applicable.
2. Lines 101106: The "poisson rate that corresponds to the deterministic rate" is unclear. These two sentences could be elaborated on further since significant prior knowledge of the reader is currently assumed.
3. Line 115: "While sizedependent growth mechanisms exist and do support size homeostasis" – This assertion should be backed up with relevant citations.
4. Lines 209210: "Upon passing the G1/S transition, we assume cells are committed to division and there is a fixed time delay before they divide thus modeling S/G2/M as a timer." – What motivates this assumption? A citation or further discussion is warranted.
5. Line 285: This would be a good place for a citation to direct readers to sources discussing concentration vs quantity sensing. Note that in the bacterial size control literature, quantity sensing of division initiators has been shown to regulate adder behavior (Si et al., Curr Biol 2019).
6. Figure 1B: The dashed grey line is defined afterwards in 1C. It would be good to include it in the defined interactions here instead. In addition, the clarity of this schematic would be better with a line showing how the final network becomes the initial network in the next epoch.
7. Lines 301302: "… its production is completely shut down in S/G2/M" does not come through clearly in the associated figure. You should clearly describe the chosen dynamics for the inhibitor protein in different phases of the cell cycle, and justify why the choices are different from known inhibitors such as Whi5.
8. Figure 2A: The message of this figure is not presented clearly. There is clustering with high CV volume and low N, another with negligible CV volume and widespread N division, and then the circled optimum. However, the trajectory of how a network evolves is not clear in this picture. Do all of them converge to the optimum eventually? Do they move to low CV before high N division or at the same time? How many epochs does it take to cross the large gap between the clustered networks and the optimum? Recommend somehow indicating sample evolutionary trajectories in addition to the aforementioned clarifications to remedy this issue. Additionally, why are there no numerical values in the axes? This makes it very difficult to assess the degree to which original values have changed.
9. Lines 309322: This paragraph is somewhat confusing, in particular lines 314316. The motivation for the control volume is unclear, especially in the physical sense of why a cell would use a nonphysical volume to control a transition. While the idea makes sense later in the supplementary material, it needs to be clear from the very start that the goal here is that the control volume is a tool to examine how size at G1/S affects the cycle time.
10. Figures 3A,3C: While intuitive, specifying the role of the dashed red line would improve clarity.
11. Figure 3D: The predictions for the cell cycle scatter appear much stronger than the scatter itself. Can you comment on this?
12. Figure 3B, D: Compare how the model predictions compare with binned means for the scatter.
13. Lines 357361: The numbers of 120 simulations and 500 epochs appear to be chosen arbitrarily. Why did you choose these initial conditions, and are the results of your paper robust with respect to higher/lower values? If so, including that point here would strengthen the argument, especially with a brief discussion on the lower limit. In addition, roughly how long in time is an epoch? The speed at which evolution is occurring would be of interest to many readers.
14. Figure 4: 4B is created to resemble S. pombe. Do the other panels have reallife analogies or are they arbitrarily chosen for qualitative representations of the discussed effects in the main text?
15. Figures 2F, 2I, 5A, 5D, 5G, 6C: The second zoomedin panel of 6C is essential to understand the innergenerational dynamics of your modeling. The first manygeneration panel shows stability in V but fails to address the other variables and the multiphase dynamics. The other figures (2F, 2I, 5A, 5D, 5G) would benefit greatly from either a similar treatment or just fewer generations. The stability can be shown with significantly fewer divisions than are currently used.
16. Figure 5B: The ΔV scatters for S/G2/M and the whole cycle could be grouped into two – a positive correlation and a negative correlation. A best fit to the entire scatter is misleading therefore and does not describe the correlation trend.
17. Lines 539542: Why is a onestep implementation of size control discarded? Surely a simpler control mechanism could be preferred naturally despite being a lesser theoretical interest in evolution simulations.
18. Lines 793794: In this paper, you consider parallels to organisms that divide asymmetrically, such as budding yeast. Have you run simulations considering asymmetric division? Surely that would impact cell size distribution and variability.
19. Lines 794795: Wouldn't disregarding one of the two daughter cells add a bias against faster dividing cells? I.e. if the number of divisions is one of your fitness functions, doesn't this method eliminate the natural advantage of a relatively larger population size for multicell level exponential growth? Also an issue at S130132.
20. Lines 801802: How is the extraction of the nullcline performed?
21. 829: Recommend providing the conversion from the arbitrary units used to physical values, here and all other figures.
22. 838839: Model A2 does not receive an explanation comparable to A1 in this figure; either move to supplemental materials or explain it clearly as well.
23. Figures 6H, 6I, 6J: These subfigures are not very clear, together with captions for 6I and 6J that do not sufficiently explain to the reader how they are read. Why is the Burst Amplitude axis extended so far beyond the heatmap?
24. S120S121: How do these theta and n theta values come to be? Currently, it seems like they are chosen with no supporting reasoning or explanation.
25. S239: V target missing a capital T.
26. S328S329: Need to use a left apostrophe rather than two right apostrophes for epoch and generation.
27. Eq. (S18): Why do you minimize m+1 rather than just m?
28. S406S408: Why do you choose these interactions to include? How much of all biochemical interactions do they encompass together? Are there others that you are aware of that you are choosing to neglect, and why? Can you provide citations and/or an argument to motivate this choice? This is another crucial ansatz for your modeling that needs to be discussed more carefully.
29. Supplementary Section 4: Translating the arbitrary units into physical values (when possible) would be immensely useful/helpful here.
30. Figure S6E: Why does cut off just below 1 (here, and not in any other plots)?
31. Since the model is generally applicable to any organism, comparisons to size control in bacterial cells (even qualitative) would be useful to widen the appeal. For example, could you predict why almost all bacterial cells (even evolutionary divergent ones) behave as adders? It has been shown that adder is regulated by threshold accumulation of an initiator protein that is produced at a rate proportional to cell volume, which your model could perhaps capture. Furthermore, many bacterial cells also exhibit biphasic size regulation during the cell cycle. It has been shown that Bacillus subtilis behave as sizers during the first phase, followed by a timer phase till division (DOI:10.1016/j.cub.2020.04.030). By contrast, Caulobacter crescentus cells implement a timer first, followed by an adder phase of size control (DOI:10.1038/nmicrobiol.2017.116). Both these organisms behave as approximate adders overall.
https://doi.org/10.7554/eLife.79919.sa1Author response
Reviewer #1 (Recommendations for the authors):
Figures and text:
I think the paper would be greatly strengthened by less dense Figures. As it stands, I found it difficult to figure out what the main points are. As an example: the main point of Figure 5, as I understand it, might be best served by a timeseries plot of Slope \ΔV_{cycle}, or some other summary statistic that shows the transient evolution of a sizer. Otherwise, this point is buried in the legend of panels B, E, and H.
We have clarified the main text and the captions to better explain Figure 5 but otherwise have kept the figure mostly unchanged as we find useful to see the actual temporal dynamics of the networks throughout evolution to illustrate the sloppy sizer evolving into a weak adder in order to reduce the CV of the size distribution at birth. This key point has now been added to the discussion as described above. To reemphasize its importance, and clarify the purpose of Figure 5, we have modified the top of the figure caption, which now includes the following sentence: “Evolutionary dynamics continually reduce the selected for $C{V}_{extBirth}$ and proceed through a noisy sizer to a less noisy adder”
On feedback control mechanisms:
There are no statistical summaries of the simulations described in Figures 2 and 3 of the main text  Only Figure 4 (whose simulations are initialised with the evolved Model A1) contains statistics on evolved networks. If I understand the text, all the simulations resulted in topologically similar networks  is this the case.
It is the case as all successful evolution simulations ended with some version of Model A. In Figure 2, we present two versions of Model A that have slightly different network topologies but perform the feedback mechanism in the same way. In the Supplement, we also give additional examples of evolved models in Figures S8, S9 and S10. We rewrote the second paragraph of the Results section, which now begins as: “Evolution simulations are in part reproducible and most often lead to similar network topologies. The evolution trajectory leading to Model A1 is a typical example in which all simulations produced similar networks (Figure 2B).”
What is the range of CV_births, and does the achieved CV_birth depend on the molecular implementation of the feedback control (PPI, dimerisation, transcriptional control), or do these molecular details affect the \ΔV_{slope}?
We have now included a Table in the Supplement where we show the ranges of CV_births obtained by our evolved models (see new Table S1). We’ve seen different combinations of biochemical interactions evolve and perform feedback in different ways. Yet, they generally result in similar CV_Birth as seen in the 5 different versions of Model A shown in Figures 2, 3, S8, S9, and S10. Our understanding is that the resulting \Δ V_Slope is a direct consequence of the strength of the feedback mechanism which is in itself dictated by the molecular interactions of the network. But, the feedback control can be implemented in different ways with similar results. We now describe these results in the last paragraph of the section ‘Evolution of quantitybased size control mechanisms’ which reads as: “We give additional examples of similarly evolved networks in Figures S8S10 where we can see the sensing and the feedback mechanism being implemented in different ways. Yet, despite these mechanistic differences in feedback regulation the resulting function of the evolved networks were similar as indicated by their CV_{Birth} (Table S1).”
On adders vs sizers (Figures4, 5, and lines 557 to 561 in the discussion):
Figure 4, 5, and the text around it, suggest that the CV_{birth} for adders is lower than that of sizers, in contrast to the common view that sizers are better at controlling cell size. I wonder if the distinction is between the *steadystate CV_{birth}* and the time taken to return to equilibrium from a large 'perturbation' of the cell size?
If it is true that adders are better at the former, but sizers are better at the latter, then would this also explain why cell size control late in the cycle tends to favour sizers? The intuition perhaps being that size control later in the cycle needs to deal with perturbations that occurred earlier in the cell cycle (as suggested by Lines 557 to 561)?
As we discussed above, a noisy sizer can have a higher CV at steady state than an accurate adder. However, in line with the reviewers intuition, expect sizers to be better at reducing the effect of large fluctuations where the system is far from steady state. This is because a sizer will return the system to the steady state in one cell cycle. We now emphasize this point in the discussion, where we write:
‘However, we anticipate even noisy sizers will be better than adders at controlling cell size in response to large deviations away from the steady state distribution. This is because sizers will always return the cell size to be within the steady state distribution within a cell cycle.’
The argument here is agnostic to the distribution of cell size control in the different phases of the cell cycle and independent of the fact that G2 size control promotes sizers (which is discussed at length in the text already).
Reviewer #2 (Recommendations for the authors):
We have the following specific comments and recommendations for the authors.
1. Lines 7477: This is the one piece of the introduction where readers unfamiliar with the eukaryotic cell cycle would be confused. Including background information about G1/S and S/G2/M phases would help expand the target audience of the paper since the techniques discussed therein are widely applicable.
We have added some basic information on the cell cycle and cited the main book of the field by David Morgan. The introduction now contains the following lines of text: “Cell size is regulated through the cell cycle control network that governs transitions from one phase of the cell cycle to the next. The division cycle can be broken up into distinct phases that are characterized by different molecular activities (D. O. Morgan, 2007). While it is typically considered that there are 4 phases of the cell cycle (G1, S, G2, and M), we here consider a two phase model based on a G1 phase and a composite S/G2/M phase. This is because size control in general has been associated with either the G1/S transition or mitosis at the end of the cell cycle.”
2. Lines 101106: The "poisson rate that corresponds to the deterministic rate" is unclear. These two sentences could be elaborated on further since significant prior knowledge of the reader is currently assumed.
We now write: “Thus, each biochemical reaction takes place with a rate that corresponds to the deterministic rate, to which we add one white Gaussian noise with a variance equal to that rate. For example, given a deterministic biochemical rate $k$ and a time interval of size $ext\mathrm{\Delta}t$, we consider a tauleaping change of $k\mathrm{\Delta}t+\mathcal{N}(0,k\mathrm{\Delta}t)$ where $\mathcal{N}(0,k\mathrm{\Delta}t)$ is a random gaussian variable of mean 0 and variance $extk\mathrm{\Delta}t$.”
3. Line 115: "While sizedependent growth mechanisms exist and do support size homeostasis" – This assertion should be backed up with relevant citations.
We now cite Miettinen and Bjorklund (2016; https://doi.org/10.1016/j.devcel.2016.09.004) and Ginzburg et al. (2018; https://doi.org/10.7554/eLife.26957).
4. Lines 209210: "Upon passing the G1/S transition, we assume cells are committed to division and there is a fixed time delay before they divide thus modeling S/G2/M as a timer." – What motivates this assumption? A citation or further discussion is warranted.
This assumption is justified by what happens in S. cerevisiae’s cell cycle. Upon passing the G1/S transition, during an event called Start, the cell becomes irreversibly committed to division. This means that at that point, even when treated with drugs that would disable the late cell cycle machinery, cells will divide no matter what. We now cite Doncic et al. (2011) as a reference for the commitment point, and ChandlerBrown et al. (2017) as a reference for the time nature of S/G2/M phase.
5. Line 285: This would be a good place for a citation to direct readers to sources discussing concentration vs quantity sensing. Note that in the bacterial size control literature, quantity sensing of division initiators has been shown to regulate adder behavior (Si et al., Curr Biol 2019).
As suggested, we now refer the reader to references discussing mechanisms to sense protein quantities rather than concentrations. We now discuss quantity sensing when it first is introduced in the methods section. The subsection on initial cell cycle model now contains the following sentences: “One way the amount rather than the concentration of a molecule could be sensed is through its titration against a fixed cellular quantity such as the genome, which is part of a general class of titrationbased cell size sensing mechanisms (Amodeo et al., 2015; Heldt et al., 2018; Si et al., 2019; Wang et al., 2009).”
6. Figure 1B: The dashed grey line is defined afterwards in 1C. It would be good to include it in the defined interactions here instead. In addition, the clarity of this schematic would be better with a line showing how the final network becomes the initial network in the next epoch.
The dashed grey line is a special interaction that cannot be evolved or mutated by the PhiEvo algorithm. This grey dashed line always connects the inhibitor I to the S/G2/M Switch to indicate that I is in fact an inhibitor of the S/G2/M cell cycle phase. Because of this distinction, we want to separate this interaction from the evolvable interactions (transcriptional activation, transcriptional repression and complexation) that are shown in Figure 1B. To avoid confusion, we have removed the dashed grey line from the cartoon networks of Figure 1B as the focus of this panel is on the general evolution algorithm itself and not the specific implementation used in this study.
As for a line showing how the final network becomes the initial network in the next epoch, we have included a figure in the Supplement showing this process in more detail (Figure S4).
7. Lines 301302: "… its production is completely shut down in S/G2/M" does not come through clearly in the associated figure. You should clearly describe the chosen dynamics for the inhibitor protein in different phases of the cell cycle, and justify why the choices are different from known inhibitors such as Whi5.
Here in these lines, we were talking about the R protein and not the I inhibitor that plays the role of Whi5 in our system. We apologize for this confusion and have clarified this point in the text, which now reads as: “For example, Model A2 contains extra interactions for the volume sensing gene [R], where [R] is repressed by the S/G2/M Switch (meaning its production is completely shut down in S/G2/M leading to sawtoothlike dynamics).”
8. Figure 2A: The message of this figure is not presented clearly. There is clustering with high CV volume and low N, another with negligible CV volume and widespread N division, and then the circled optimum. However, the trajectory of how a network evolves is not clear in this picture. Do all of them converge to the optimum eventually? Do they move to low CV before high N division or at the same time? How many epochs does it take to cross the large gap between the clustered networks and the optimum? Recommend somehow indicating sample evolutionary trajectories in addition to the aforementioned clarifications to remedy this issue. Additionally, why are there no numerical values in the axes? This makes it very difficult to assess the degree to which original values have changed.
We have completely remade this figure panel to render it more transparent (shown below) and have updated the caption accordingly. We have also added additional details about the evolutionary trajectory in the subsection S3A – Fitness of the Supplement.
Evolution happens in several stages. First, there are several epochs without any size control; networks then cluster in two regions of the Pareto front, essentially corresponding to volume going to 0 ([ii] in the new panel) and volume going to maximum volume ([i] in the new panel) both cases which are highly penalized in their fitness score (as is now explained in the Supplement). In Figure S3A, we show that the initial cell cycle network for the evolution process leads to unstable growth which is why we inevitably see the cell volume crash to 0 or maximum volume and cluster in [i] or [ii]. Evolution goes back and forth between those two clusters with a slow increase of the number of divisions (the fitness on the x axis). Eventually, some volume control evolves, most of the time corresponding to the Model 1 architecture described in the main text. Then, both the number of divisions N_div and the CV_Birth of those networks are optimized considerably, which is what we described in the main text as an “all or nothing” fitness score. Finally, CV_Birth keeps decreasing slowly until it reaches a plateau of 69% at the last generation. Not all networks go to optimum: in fact our Pareto evolution favours population diversity so makes sure that some networks are always relatively far from optimum. Also, notice that all simulations are different: those are stochastic simulations. Some simulations converge while others get lost on their way to the optimum as is expected. However, we implement the equivalent of a Drake’s rule: on average each network is mutated once per epoch. So the number of epochs before an evolutionary jump is a proxy of the typical number of mutations needed to select for one “good” mutation (changing Pareto front). We have added more details on all of this.
9. Lines 309322: This paragraph is somewhat confusing, in particular lines 314316. The motivation for the control volume is unclear, especially in the physical sense of why a cell would use a nonphysical volume to control a transition. While the idea makes sense later in the supplementary material, it needs to be clear from the very start that the goal here is that the control volume is a tool to examine how size at G1/S affects the cycle time.
The reviewer is absolutely correct and phrases this statement better than we did. We have now redone Figure 3 rewritten this part of the text, which now reads as: “We then numerically integrate the differential equations of the model and measure the period T(VC) of the simulated cellcycle for this control volume. Use of the control volume allows us to break the size feedback system and distinguish its input, V_{C}, from its output, the induced cycle period T (Angeli et al., 2004).”
10. Figures 3A,3C: While intuitive, specifying the role of the dashed red line would improve clarity.
We clarified this and have updated the figure accordingly.
11. Figure 3D: The predictions for the cell cycle scatter appear much stronger than the scatter itself. Can you comment on this?
We think there is some confusion here. We note that we are predicting the average response using the control volume framework described above. We do not predict the scatter, which results from the stochastic simulations in steady state. We have clarified this in the caption of Figure 3B which now reads: “Added volumes $ext\mathrm{\Delta}V$ for different phases of the cell cycles for simulations of Model A1. Individual dots correspond to different cell cycles for a simulation at steadystate. The full line corresponds to the extrapolation from the $T({V}_{C})$ curve shown in A for a restricted range of ${V}_{C}$ relevant to the scatter. The black cross, star and square indicate the average added volumes corresponding to when the system senses a volume corresponding to $\u27e8{V}_{G1/S}angle$ at the G1/S transition. We see that the model is predicted to follow an adder over a large range of volumes.”
12. Figure 3B, D: Compare how the model predictions compare with binned means for the scatter.
In evolutionary simulations, fitness typically improves rapidly in the beginning but then plateaus and only very gradually increases after a couple of hundred epochs. This is a general property of optimization simulations, as is also generally seen in machine learning. Based on our extensive previous experience with PhiEvo simulations, 500 epochs typically is sufficient to find the fitness plateau, but not so much that the network extensively explores the neutral mutations around the plateau. An epoch has no length per se, it just is a cycle of mutation/selection, however, the evolutionary algorithm adjust its mutation rates to have on average 1 mutation per generation (this is in fact an experimental fact called Drake’s law, and practically it prevents the so called “codebloat” that can be observed in evolutionary simulations). This means that if a network evolves within, say, 100 generations, it is at most 100 mutations away from the initial state, and practically much less than that because most mutations are either neutral or deleterious (and in that case are not kept). To clarify this point in the text, we now write: “We chose to use 500 epochs in our simulations because in our previous experience this was sufficient for networks to evolve to be near the optimum, but not so much that they were forced to extensively explore the effects of neutral mutations near the optimum.”
13. Lines 357361: The numbers of 120 simulations and 500 epochs appear to be chosen arbitrarily. Why did you choose these initial conditions, and are the results of your paper robust with respect to higher/lower values? If so, including that point here would strengthen the argument, especially with a brief discussion on the lower limit. In addition, roughly how long in time is an epoch? The speed at which evolution is occurring would be of interest to many readers.
For clarification, as suggested by the reviewer, we have redone the figure panels on the dynamics with fewer generations to allow for better visualization of the multigenerational protein and volume dynamics.
14. Figure 4: 4B is created to resemble S. pombe. Do the other panels have reallife analogies or are they arbitrarily chosen for qualitative representations of the discussed effects in the main text?
Panel A was designed to resemble S. cerevisiae, at least qualitatively, which was discussed in the text as having a size control at the G1/S transition.
15. Figures 2F, 2I, 5A, 5D, 5G, 6C: The second zoomedin panel of 6C is essential to understand the innergenerational dynamics of your modeling. The first manygeneration panel shows stability in V but fails to address the other variables and the multiphase dynamics. The other figures (2F, 2I, 5A, 5D, 5G) would benefit greatly from either a similar treatment or just fewer generations. The stability can be shown with significantly fewer divisions than are currently used.
For clarification, as suggested by the reviewer, we have redone the figure panels on the dynamics with fewer generations to allow for better visualization of the multigenerational protein and volume dynamics.
16. Figure 5B: The ΔV scatters for S/G2/M and the whole cycle could be grouped into two – a positive correlation and a negative correlation. A best fit to the entire scatter is misleading therefore and does not describe the correlation trend.
This is a good point, we have done this as there are indeed two distinct behaviors as pointed out by the referee depending on whether or not the cell is small or large. The figure panel now looks as follows and we have adjusted the caption accordingly:
17. Lines 539542: Why is a onestep implementation of size control discarded? Surely a simpler control mechanism could be preferred naturally despite being a lesser theoretical interest in evolution simulations.
As explained above in response to the reviewers major comment #4, we chose to do the analysis the way we did because we want to explore how cell size control can be done by a network with multiple feedbacks rather than just the concentration of a single protein, such as the budding yeast Whi5 protein, that has a special dedicated synthesis mechanism to make its concentration directly reflect cell size.
18. Lines 793794: In this paper, you consider parallels to organisms that divide asymmetrically, such as budding yeast. Have you run simulations considering asymmetric division? Surely that would impact cell size distribution and variability.
It is definitively of interest to explore the effects of asymmetric divisions, but this is outside the scope of this already dense manuscript and will be the subject of future investigations.
19. Lines 794795: Wouldn't disregarding one of the two daughter cells add a bias against faster dividing cells? I.e. if the number of divisions is one of your fitness functions, doesn't this method eliminate the natural advantage of a relatively larger population size for multicell level exponential growth? Also an issue at S130132.
Here, cell growth is exponential on the single cell level since we were assessing size control mechanisms that take size as an input to cell cycle control. We are not exploring the very interesting case where growth deviates from the exponential. In that case, what the reviewer says is absolutely essential because then size homoeostasis would have a contribution from some cells in the population outcompeting others in terms of their growth. We intend to explore this possibility in future work. We have now added this text to the supplementary material which reads: “Here, cell growth is exponential on the single cell level since we were assessing size control mechanisms that take size as an input to cell cycle control. We are not exploring the very interesting case where growth deviates from the exponential. In that case, size homoeostasis would have a contribution from some cells in the population outcompeting others in terms of their growth and we would have to simulate the entire cell population and not disregard one of the daughter cells as we do here.”
20. Lines 801802: How is the extraction of the nullcline performed?
We clarify in the text how the fictitious nullcline is extracted. We write:
“An intermediate fictitious nullcline is shown as a line that connects the average concentration $[I]$ at G1/S and at division.”
21. 829: Recommend providing the conversion from the arbitrary units used to physical values, here and all other figures.
Our arbitrary units are described in the methods where it states that: “Note that we scale all our variables so that a concentration of one arbitrary unit corresponds roughly to 1000 proteins in a 100fL cell (Milo et al., 2010). Additionally, we scale the time variable so that 1 arbitrary time unit corresponds roughly to 30 min (Di Talia et al., 2007).” However, we prefer to have the figures in AU since this leads to a numerically simpler presentation and corresponds directly to what we have evolved.
22. 838839: Model A2 does not receive an explanation comparable to A1 in this figure; either move to supplemental materials or explain it clearly as well.
We think the shorter explanation of A2 is fine for the figure caption because there is a longer explanation in the main text to explain this model. We now refer the reader to see the text in the figure caption.
23. Figures 6H, 6I, 6J: These subfigures are not very clear, together with captions for 6I and 6J that do not sufficiently explain to the reader how they are read. Why is the Burst Amplitude axis extended so far beyond the heatmap?
To make these figure panels more clear, we truncated the Burst Amplitude axis as suggested by the reviewer, and moved the inset to be its own panel. We also modified the caption to reflect these changes.
24. S120S121: How do these theta and n theta values come to be? Currently, it seems like they are chosen with no supporting reasoning or explanation.
This is an important point. We chose these values because they give a similar amount of noise in the G1/S transition as observed experimentally (e.g., Di Talia et al. 2007; ChandlerBrown et al. 2017). We have added this explanation in the supporting information.
25. S239: V target missing a capital T.
This correction has been made.
26. S328S329: Need to use a left apostrophe rather than two right apostrophes for epoch and generation.
This correction has been made.
27. Eq. (S18): Why do you minimize m+1 rather than just m?
In principle, it is the same. We chose to optimize m+1 rather than m in order to keep the fitness function positive. This helps when imposing multiplicative penalties on the fitness when cell volume is too low or too high and does not affect the optimization process.
28. S406S408: Why do you choose these interactions to include? How much of all biochemical interactions do they encompass together? Are there others that you are aware of that you are choosing to neglect, and why? Can you provide citations and/or an argument to motivate this choice? This is another crucial ansatz for your modeling that needs to be discussed more carefully.
We chose these interactions because they are commonly used in the systems biology literature. Moreover, we have found in our previous evolutionary simulations that a combination of transcriptional interactions with proteinprotein interactions is sufficient to account for many standard mechanisms in systems biology (e.g., switches, oscillators, biochemical adaptation). We note that other work in this area often restricts itself to purely transcriptional networks because they are easier to study and understand, and we added references to those works. Our approach is more flexible and generic.
We now write in the Supplement: “Many “inverseapproach” approaches in systems biology have focused on purely transcriptional networks (Francois et al., 2007, Fujimoto et al., 2008, Cotterell et al., 2010, Ten Tusscher et al., 2011), because they are generic, easier to study and can efficiently describe many biological dynamics (Alon, 2007).
In this project, we extend the biochemical interactions available for evolution: we not only model transcriptional activation and repression but also include complexation also known as proteinprotein interaction (PPI), and assume there passive degradation. Adding PPIs is especially crucial because they are well known to lead to nonlinear effects (Buchler et al., 2008, Buchler et al., 2009) allowing for the simple implementation of complex dynamics such as genetic oscillations (François et al., 2005) observed e.g. in circadian clocks (François, 2005), and such nonlinear effects indeed play crucial roles for control in our evolved model. The equations for these interactions are presented in this subsection.”
29. Supplementary Section 4: Translating the arbitrary units into physical values (when possible) would be immensely useful/helpful here.
We have made the requested modification.
30. Figure S6E: Why does cut off just below 1 (here, and not in any other plots)?
For smaller Vc than the cutoff, Model A4 does not produce an oscillation, but instead reaches a steady state so there is no defined T/tau. We now note this fact in the new S9E (old S6E) caption and include a more detailed explanation for this model in the Supplement.
We write: “Notably, Figure S9E shows that the model's response curve in the low control volume regime is illdefined. In this model specifically, when the low volume regime is reached, the concentration of inhibitor I at G1/S increases due to quantity sensing $[I]\text{}\propto \text{}1/V$. Then, because of the homodimerization of I into S_{3}, we see the concentration [S_{3}] also rise. We can see both of these curves spike up momentarily in the trajectories of Figure S9D around $t\text{}\approx \text{}136$. The problem arises if this increase is too strong. Then, S activates the production of additional I, kickstarting a positive feedback loop that creates more and more inhibitor I, effectively interrupting the oscillator and making the period illdefined.”
31. Since the model is generally applicable to any organism, comparisons to size control in bacterial cells (even qualitative) would be useful to widen the appeal. For example, could you predict why almost all bacterial cells (even evolutionary divergent ones) behave as adders? It has been shown that adder is regulated by threshold accumulation of an initiator protein that is produced at a rate proportional to cell volume, which your model could perhaps capture. Furthermore, many bacterial cells also exhibit biphasic size regulation during the cell cycle. It has been shown that Bacillus subtilis behave as sizers during the first phase, followed by a timer phase till division (DOI:10.1016/j.cub.2020.04.030). By contrast, Caulobacter crescentus cells implement a timer first, followed by an adder phase of size control (DOI:10.1038/nmicrobiol.2017.116). Both these organisms behave as approximate adders overall.
We thank the reviewer for indicating those references, that we now cite in the manuscript. We have a generic argument for why adders might arise, ie, if size control takes place early in the division cycle, but do not have any specific things to say about bacteria in comparison to eukaryotic cells.
https://doi.org/10.7554/eLife.79919.sa2Article and author information
Author details
Funding
Natural Sciences and Engineering Research Council of Canada (Discovery Grant)
 Paul François
Natural Sciences and Engineering Research Council of Canada (Alexander Graham Bell Canada Graduate Scholarship)
 Felix ProulxGiraldeau
Fonds de recherche du Québec – Nature et technologies (Doctoral research scholarship)
 Felix ProulxGiraldeau
National Institutes of Health (NIH R35 GM134858)
 Jan M Skotheim
Chan Zuckerberg Initiative (Biohub Investigator Award)
 Jan M Skotheim
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
We thank Rodrigo ReyesLamothe, Nicolas E Buchler, and Lucas Fuentes Valenzuela for helpful comments on the manuscript. JS was supported by the NIH (R35 GM134858), PF was supported by Natural Sciences and Engineering Research Council of Canada (NSERC), Discovery Grant Program, FPG was supported by a Fonds de Recherche du Québec Nature et Technologies (FRQNT) Doctoral scholarship (B2X) and by a Natural Sciences and Engineering Research Council of Canada (NSERC) Doctoral Canada Graduate Scholarship (CGSD).
Senior Editor
 Aleksandra M Walczak, CNRS LPENS, France
Reviewing Editor
 Sandeep Krishna, National Centre for Biological Sciences‐Tata Institute of Fundamental Research, India
Reviewers
 Kabir Husain, University of Chicago, United States
 Shiladitya Banerjee, Carnegie Mellon University, United States
Publication history
 Preprint posted: April 13, 2022 (view preprint)
 Received: May 2, 2022
 Accepted: September 25, 2022
 Accepted Manuscript published: September 30, 2022 (version 1)
 Version of Record published: November 11, 2022 (version 2)
Copyright
© 2022, ProulxGiraldeau et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics

 790
 Page views

 268
 Downloads

 0
 Citations
Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading

 Cancer Biology
 Computational and Systems Biology
Lung squamous cell carcinoma (LUSC) is a type of lung cancer with a dismal prognosis that lacks adequate therapies and actionable targets. This disease is characterized by a sequence of low and highgrade preinvasive stages with increasing probability of malignant progression. Increasing our knowledge about the biology of these premalignant lesions (PMLs) is necessary to design new methods of early detection and prevention, and to identify the molecular processes that are key for malignant progression. To facilitate this research, we have designed XTABLE (Exploring Transcriptomes of Bronchial Lesions), an opensource application that integrates the most extensive transcriptomic databases of PMLs published so far. With this tool, users can stratify samples using multiple parameters and interrogate PML biology in multiple manners, such as two and multiplegroup comparisons, interrogation of genes of interests, and transcriptional signatures. Using XTABLE, we have carried out a comparative study of the potential role of chromosomal instability scores as biomarkers of PML progression and mapped the onset of the most relevant LUSC pathways to the sequence of LUSC developmental stages. XTABLE will critically facilitate new research for the identification of early detection biomarkers and acquire a better understanding of the LUSC precancerous stages.

 Computational and Systems Biology
 Neuroscience
Humans make a number of choices when they walk, such as how fast and for how long. The preferred steady walking speed seems chosen to minimize energy expenditure per distance traveled. But the speed of actual walking bouts is not only steady, but rather a timevarying trajectory, which can also be modulated by task urgency or an individual’s movement vigor. Here we show that speed trajectories and durations of human walking bouts are explained better by an objective to minimize Energy and Time, meaning the total work or energy to reach destination, plus a cost proportional to bout duration. Applied to a computational model of walking dynamics, this objective predicts dynamic speed vs. time trajectories with inverted U shapes. Model and human experiment (N=10) show that shorter bouts are unsteady and dominated by the time and effort of accelerating, and longer ones are steadier and faster and dominated by steadystate time and effort. Individualdependent vigor may be characterized by the energy one is willing to spend to save a unit of time, which explains why some may walk faster than others, but everyone may have similarshaped trajectories due to similar walking dynamics. Tradeoffs between energy and time costs can predict transient, steady, and vigorrelated aspects of walking.