Evolution of irreversible somatic differentiation
Abstract
A key innovation emerging in complex animals is irreversible somatic differentiation: daughters of a vegetative cell also perform a vegetative function, thus forming a somatic lineage that can no longer be directly involved in reproduction. Primitive species use a different strategy: vegetative and reproductive tasks are separated in time rather than in space. Starting from such a strategy, how is it possible to evolve life forms which use some of their cells exclusively for vegetative functions? Here, we develop an evolutionary model of development of a simple multicellular organism and find that three components are necessary for the evolution of irreversible somatic differentiation: (i) costly cell differentiation, (ii) vegetative cells that significantly improve the organism’s performance even if present in small numbers, and (iii) large enough organism size. Our findings demonstrate how an egalitarian development typical for loose cell colonies can evolve into the germ-soma differentiation that dominates metazoans.
Introduction
In complex multicellular organisms, different cells specialise to execute different functions. These functions can be generally classified into two kinds: reproductive and vegetative. Cells performing reproductive functions contribute to the next generation of organisms, while cells performing vegetative function contribute to sustaining the organism itself. In unicellular species and simple multicellular colonies, these two kinds of functions are performed at different times by the same cells – specialization is temporal. In more complex multicellular organisms, specialization transforms from temporal to spatial (Mikhailov et al., 2009), where groups of cells focused on different tasks emerge in the course of organism development.
Typically, cell functions are changed via differentiation, such that a daughter cell performs a different function than the maternal cell. The vast majority of metazoans feature a very specific and extreme pattern of cell differentiation: any cell performing vegetative functions forms a somatic lineage, that is, producing cells performing the same vegetative function – somatic differentiation is irreversible. Since such somatic cells cannot give rise to reproductive cells, somatic cells do not have a chance to pass their offspring to the next generation of organisms. Such a mode of organism development opened a way for deeper specialization of somatic cells and consequently to the astonishing complexity of multicellular animals. Outside of the metazoans – in a group of green algae Volvocales serving as a model species for evolution of multicellularity – the emergence of irreversibly differentiated somatic cells is the hallmark innovation marking the transition from colonial life forms to multicellular species (Kirk, 2005).
While the production of individual cells specialized in vegetative functions comes with a number of benefits (Grosberg and Strathmann, 2007), the development of a dedicated vegetative cell lineage that is lost for organism reproduction is not obviously a beneficial adaptation. From the perspective of a cell in an organism, the guaranteed termination of its lineage seems the worst possible evolutionary outcome for itself. From the perspective of an entire organism, the death of somatic cells at the end of the life cycle is a waste of resources, as these cells could in principle become parts of the next generation of organisms. For example, exceptions from irreversible somatic differentiation are widespread in plants (Lanfear, 2018) and are even known in simpler metazoans among cnidarians (DuBuc et al., 2020) for which differentiation from vegetative to reproductive functions has been reported. Therefore, the irreversibility of somatic differentiation cannot be taken for granted in the course of the evolution of complex multicellularity.
Terminal differentiation is a type of cell differentiation distinct from irreversible cell differentiation. Unlike irreversibly differentiated cells, which are capable of cell division, terminally differentiated cells lose the ability to divide. Terminally differentiated cells often perform tasks too demanding to be compatible with cell division. For example, heterocysts of cyanobacteria perform nitrogen fixation, which requires anaerobic conditions; these cells are therefore very limited in resources and do not divide. In the scope of this study, we do not consider terminal differentiation but focus on somatic cells that are able to divide while being part of an organism (or cell colony) but are not able to grow into a new organism, that is, irreversible somatic differentiation.
The majority of the theoretical models addressing the evolution of somatic cells focus on the evolution of cell specialization, abstracting from the developmental process by which germ (reproductive specialists) and soma are produced in the course of organism growth. For example, a large amount of work focuses on the optimal distribution of reproductive and vegetative functions in the adult organism (Michod, 2007; Willensdorfer, 2009; Rossetti et al., 2010; Rueffler et al., 2012; Ispolatov et al., 2012; Goldsby et al., 2012; Solari et al., 2013; Goldsby et al., 2014; Amado et al., 2018; Tverskoi et al., 2018). However, these models do not consider the process of organism development. Other work takes the development of an organism into account to some extent: In Gavrilets, 2010, the organism development is considered, but the fraction of cells capable of becoming somatic is fixed and does not evolve. In Erten and Kokko, 2020, the strategy of germ-to-soma differentiation is an evolvable trait, but the irreversibility of somatic differentiation is taken for granted. In Rodrigues et al., 2012, irreversible differentiation was found, but both considered cell types pass to the next generation of organisms, such that the irreversible specialists are not truly somatic cells in the sense of evolutionary dead ends. Finally, in Cooper and West, 2018, a broad scope of cell differentiation patterns has been investigated in the context of the evolution of cooperation. However, irreversible somatic differentiation was not considered in that study. Hence, the theoretical understanding of the evolution of irreversibly differentiated somatic cell lines is limited so far.
In the present work, we developed a theoretical model to investigate conditions for the evolution of the irreversible somatic differentiation. In the model, we suppose there are two cell types: germrole and somarole, where only germrole cells pass to the next generation of organisms while somarole cells are responsible for vegetative functions. Both germrole cells and somarole cells can divide and they may switch to each other during growth. In our model, we incorporate factors including (i) costs of cell differentiation, (ii) benefits provided by presence of somarole cells, (iii) maturity size of the organism. We ask under which circumstances irreversible somatic differentiation is a strategy that can maximize the population growth rate compared to strategies in which differentiation does not occur or somatic differentiation is reversible.
Model
We consider a large population of clonally developing organisms composed of two types of cells: germrole and somarole. The roles differ in the ability to survive beyond the end of the organism life cycle: somarole cells die at the end, while germrole cells continue to live. Each organism is initiated as a single germrole cell. In the course of the organism growth, germrole cells may differentiate to give rise to somarole cells and vice versa, see Figure 1A,B. After $n$ rounds of synchronous cell divisions, the organism reaches its maturity size of ${2}^{n}$ cells. Immediately upon reaching maturity, the organism reproduces: germrole cells disperse and each becomes a newborn organism, while all somarole cells die and are thus lost, see Figure 1A. We assume that somarole cells are capable of accelerating growth: an organism containing more somarole cells grows faster, so having somarole cells during the life cycle is beneficial for the organism.
To investigate the evolution of irreversible somatic differentiation, we consider organisms in which the functional role of the cell (germrole or somarole) is not necessarily inherited. When a cell divides, the two daughter cells can change their role, leading to three possible combinations: two germrole cells, one germrole cell plus one somarole cell, or two somarole cells. We allow all these outcomes to occur with different probabilities, which also depend on the parental type, see Figure 1B. If the parental cell had the germrole, the probabilities of each outcome are denoted by ${g}_{gg}$, ${g}_{gs}$, and ${g}_{ss}$ respectively. If the parental cell had the somarole, these probabilities are ${s}_{gg}$, ${s}_{gs}$, and ${s}_{ss}$. Altogether, six probabilities define a stochastic developmental strategy $D=({g}_{gg},{g}_{gs},{g}_{ss};{s}_{gg},{s}_{gs},{s}_{ss})$. In our model, it is the stochastic developmental strategy that is inherited by offspring cells rather than the functional role of the parental cell.
To feature irreversible somatic differentiation, the developmental strategy must allow germrole cells to give rise to somarole cells (${g}_{gg}<1$) and must prevent somarole cells from giving rise to germrole cells (${s}_{ss}=1$). All other developmental strategies can be broadly classified into two classes. Reversible somatic differentiation describes strategies where cells of both roles can give rise to each other: ${g}_{gg}<1$ and ${s}_{ss}<1$. In the strategy with no somatic differentiation, somarole cells are not produced in the first place: ${g}_{gg}=1$, see Table 1.
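The three classes can be told apart mechanically from the six probabilities. The following is a minimal sketch of that classification; the function name `classify_strategy` and the tolerance `tol` are illustrative assumptions and not part of the original model description.

```python
def classify_strategy(g_gg, g_gs, g_ss, s_gg, s_gs, s_ss, tol=1e-9):
    """Classify a strategy D = (g_gg, g_gs, g_ss; s_gg, s_gs, s_ss).

    The three class conditions follow the text; the tolerance `tol`
    is an assumption added to cope with floating-point round-off.
    """
    # Each triple is a probability distribution over division outcomes.
    if abs(g_gg + g_gs + g_ss - 1.0) > tol or abs(s_gg + s_gs + s_ss - 1.0) > tol:
        raise ValueError("each probability triple must sum to 1")
    if g_gg >= 1.0 - tol:
        # Germrole cells never produce somarole cells.
        return "no differentiation"
    if s_ss >= 1.0 - tol:
        # Somarole cells are produced and never give rise to germrole cells.
        return "irreversible"
    # Cells of both roles can give rise to each other.
    return "reversible"
```

For example, `classify_strategy(0.9, 0.1, 0, 0, 0, 1)` falls into the irreversible class, because germrole cells sometimes produce a somarole daughter while somarole cells always produce two somarole daughters.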
In our model, evolution of the developmental strategy is driven by the growth competition between populations executing different strategies – those populations that are able to produce more offspring and/or complete their life cycle faster gain a selective advantage. Specifically, we measure the fitness in the growth competition by the population growth rate in a stationary regime of exponential growth (Pichugin et al., 2017; Gao et al., 2019). The rate of population growth is determined by the number of offspring produced by an organism (equal to the number of germrole cells at the end of the life cycle) and the time needed for an organism to develop from a single cell to maturity (which shortens as the number of somarole cells during the life cycle increases).
To obtain these growth rates, we simulate the process of the organism growth. Here, we assume that resource distribution among cells is coordinated at the level of the organism: cells which need more resources will get more, such that cell division is synchronous. Our main results depend on this assumption of synchronous cell division; however, we briefly explore the effects of asynchronous cell division in Appendix G. Any organism is born as a single germrole cell and passes through $n$ rounds of simultaneous cell divisions. Each round starts with every cell independently choosing the outcome of its division, with the probability of each outcome given by the developmental strategy ($D$). This step determines the composition that the organism will have at the next round of cell division. Then, the length of the cell doubling round ($t$) is computed as a product of two independent effects: the differentiation effect ${F}_{\text{diff}}$ representing costs of changing cell roles (Gallon, 1992) and the organism composition effect ${F}_{\text{comp}}$ representing benefits from having somarole cells (Grosberg and Strathmann, 1998; Shelton et al., 2012; Matt and Umen, 2016),
$$t={F}_{\text{diff}}\times {F}_{\text{comp}}.\qquad (1)$$
Both ${F}_{\text{diff}}$ and ${F}_{\text{comp}}$ are recalculated at every round of cell division.
The cell differentiation effect ${F}_{\text{diff}}$ represents the costs of cell differentiation. The differentiation of a cell requires efforts to modify epigenetic marks in the genome, the recalibration of regulatory networks, the synthesis of additional proteins, and the utilization of proteins that are no longer necessary. This requires an investment of resources and therefore additional time to perform cell division. Hence, any cell which is about to give rise to a cell of a different role incurs a differentiation cost ${c}_{g\to s}$ for germ-to-soma and ${c}_{s\to g}$ for soma-to-germ transitions (and double these costs if both offspring take a role different from the parent), see Figure 1C. The differentiation effect is determined by the differentiation cost averaged among all cells in an organism,
$${F}_{\text{diff}}=1+\frac{{c}_{s\to g}\left({N}_{s\to gs}+2{N}_{s\to gg}\right)+{c}_{g\to s}\left({N}_{g\to gs}+2{N}_{g\to ss}\right)}{N},\qquad (2)$$
where ${N}_{s\to gs}$ is the number of somarole cells that produce a germrole cell and a somarole cell in a cell division step. ${N}_{s\to gg}$, ${N}_{g\to gs}$ and ${N}_{g\to ss}$ are defined in an analogous way. $N$ is the total number of cells. As organisms undergo synchronous cell division, we have $N={2}^{n}$ cells after the $n$th cell division.
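As an illustration, the averaged differentiation cost of one division round can be computed as sketched below. This is a sketch under stated assumptions rather than the authors' implementation: it assumes that the averaged cost is added to a baseline of 1 (so that costless differentiation leaves the doubling time unchanged), and the function name `f_diff` and its argument names are hypothetical.

```python
def f_diff(n_g_gs, n_g_ss, n_s_gs, n_s_gg, n_cells, c_gs, c_sg):
    """Differentiation effect for one round of cell division.

    n_g_gs, n_g_ss : germrole cells dividing into (germ, soma) / (soma, soma)
    n_s_gs, n_s_gg : somarole cells dividing into (germ, soma) / (germ, germ)
    n_cells        : total number of cells N after the division
    c_gs, c_sg     : costs of germ-to-soma and soma-to-germ transitions
    Assumption: the averaged cost is added to a baseline of 1.
    """
    if n_cells <= 0:
        raise ValueError("n_cells must be positive")
    # A division in which both daughters change role is charged twice.
    total_cost = c_gs * (n_g_gs + 2 * n_g_ss) + c_sg * (n_s_gs + 2 * n_s_gg)
    return 1.0 + total_cost / n_cells
```

For instance, if one germrole cell produces a mixed pair and another produces two somarole cells in a round that ends with four cells, then with ${c}_{g\to s}=0.5$ the effect is $1+0.5\times 3/4=1.375$.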
The composition effect profile ${F}_{\text{comp}}(x)$ captures how the cell division time depends on the proportion of somarole cells $x=s/(s+g)$ present in an organism ($s$ and $g$ are the numbers of somarole and germrole cells). In this study, we use a functional form illustrated in Figure 1D and given by
$${F}_{\text{comp}}(x)=\begin{cases}1 & \text{for } 0\le x\le {x}_{0},\\ 1-b+b{\left(\frac{{x}_{1}-x}{{x}_{1}-{x}_{0}}\right)}^{\alpha } & \text{for } {x}_{0}<x<{x}_{1},\\ 1-b & \text{for } {x}_{1}\le x\le 1.\end{cases}\qquad (3)$$
With the functional form (3), somarole cells benefit organism growth only if their proportion in the organism exceeds the contribution threshold ${x}_{0}$. Interactions between somarole cells may lead to synergistic benefits (an increase in the number of somarole cells improves their efficiency) or discounting benefits (an increase in the number of somarole cells reduces their efficiency) to the organism growth, controlled by the contribution synergy parameter $\alpha $. The maximal achievable reduction in the cell division time is given by the maximal benefit $b$, realized beyond the saturation threshold ${x}_{1}$ of the somarole cell proportion. A further increase in the proportion of somarole cells does not provide any additional benefits. With the right combination of parameters, (3) is able to recover various characters of the somarole cell contribution to the organism growth: linear (${x}_{0}=0,{x}_{1}=1,\alpha =1$), power-law (${x}_{0}=0,{x}_{1}=1,\alpha \ne 1$), step functions (${x}_{0}={x}_{1}$), and a wide range of other scenarios. Previous works have shown that convex (accelerating) performance functions favour cell differentiation (Michod, 2006; Rueffler et al., 2012; Cooper and West, 2018). The performance functions measure the performance of organisms with respect to different traits, such as fertility and viability. Lately, the form of functions favouring cell differentiation has been extended to concave (decelerating) functions by including topological constraints in organisms (Yanni et al., 2020). Our model extends the form of performance functions by allowing them to have a contribution threshold and a saturation threshold.
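A composition effect profile with these four parameters can be sketched as a short function. The piecewise form used here is an assumption chosen to match the description in the text: no benefit up to ${x}_{0}$, the full benefit $b$ from ${x}_{1}$ onwards, and a power-law interpolation in between that is convex for $\alpha >1$. The name `f_comp` is illustrative.

```python
def f_comp(x, x0, x1, b, alpha):
    """Composition effect for a somarole cell proportion x in [0, 1].

    Assumed form: 1 below the contribution threshold x0, 1 - b at or
    above the saturation threshold x1, and
    1 - b + b * ((x1 - x) / (x1 - x0)) ** alpha in between.
    """
    if x <= x0:
        return 1.0          # too few somarole cells: no benefit
    if x >= x1:
        return 1.0 - b      # saturation: maximal benefit b
    return 1.0 - b + b * ((x1 - x) / (x1 - x0)) ** alpha
```

With the linear profile (${x}_{0}=0$, ${x}_{1}=1$, $\alpha =1$) and $b=0.5$, an organism with half of its cells in the somarole has `f_comp(0.5, 0, 1, 0.5, 1) == 0.75`; raising $\alpha $ to 2 lowers this value to 0.625, that is, the same somarole proportion already delivers a larger share of the maximal benefit.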
Once the outcome of all cell divisions is known and the time needed to complete the current cell doubling round is computed, the current round ends and the next starts. The development completes after $n$ rounds. At this stage, the number of germrole cells (organism offspring number) and the cumulative length of the life cycle are obtained.
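The procedure described in this section can be sketched as a small stochastic simulation of a single life cycle. Several details are assumptions made for illustration only: the composition effect is evaluated on the somarole proportion among the cells that divide in the current round, the differentiation effect is one plus the cost averaged over the cells present after the division, and the composition profile has the thresholded power-law form described above. The names `simulate_development` and `f_comp` are hypothetical.

```python
import random

def f_comp(x, x0, x1, b, alpha):
    # Assumed thresholded power-law profile (see the description in the text).
    if x <= x0:
        return 1.0
    if x >= x1:
        return 1.0 - b
    return 1.0 - b + b * ((x1 - x) / (x1 - x0)) ** alpha

def simulate_development(strategy, n, c_gs, c_sg, profile, rng):
    """Simulate one life cycle of n synchronous division rounds.

    strategy : (g_gg, g_gs, g_ss, s_gg, s_gs, s_ss)
    profile  : (x0, x1, b, alpha)
    rng      : random.Random instance
    Returns (number of germrole cells at maturity, total development time).
    """
    g_gg, g_gs, g_ss, s_gg, s_gs, s_ss = strategy
    outcomes = ("gg", "gs", "ss")
    germ, soma, total_time = 1, 0, 0.0   # born as a single germrole cell
    for _ in range(n):
        # Every cell independently draws the outcome of its division.
        g_out = rng.choices(outcomes, weights=(g_gg, g_gs, g_ss), k=germ)
        s_out = rng.choices(outcomes, weights=(s_gg, s_gs, s_ss), k=soma)
        # Assumption: benefit depends on the composition of the dividing cells.
        x = soma / (germ + soma)
        n_after = 2 * (germ + soma)
        # Role changes are charged once per daughter that switches role.
        cost = (c_gs * (g_out.count("gs") + 2 * g_out.count("ss"))
                + c_sg * (s_out.count("gs") + 2 * s_out.count("gg")))
        total_time += (1.0 + cost / n_after) * f_comp(x, *profile)
        # Update the composition for the next round.
        all_out = g_out + s_out
        germ = 2 * all_out.count("gg") + all_out.count("gs")
        soma = n_after - germ
    return germ, total_time
```

Calling `simulate_development((0.8, 0.2, 0, 0, 0, 1), 6, 0.1, 0.1, (0.0, 0.5, 0.5, 1.0), random.Random(1))` returns the offspring number and life cycle length of one sampled developmental trajectory of an irreversible strategy; repeating the call many times samples the distribution of trajectories.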
In Gao et al., 2019, we have shown that the growth rate ($\lambda $) of a population, in which organisms undergo a stochastic development and fragmentation, is given by the solution of
$$\sum _{i}{P}_{i}{G}_{i}{e}^{-\lambda {T}_{i}}=1.\qquad (4)$$
Here, $i$ is the developmental trajectory – in our case, the specific combination of all cell division outcomes; ${G}_{i}$ is the number of offspring organisms produced at the end of developmental trajectory $i$, equal to the number of germrole cells at the moment of maturity; ${P}_{i}$ is the probability that an organism development will follow the trajectory $i$; ${T}_{i}$ is the time necessary to complete the trajectory $i$ – from a single cell to the maturity size of ${2}^{n}$ cells.
For a given combination of differentiation costs (${c}_{g\to s}$, ${c}_{s\to g}$) and a composition effect profile (determined by four parameters: ${x}_{0}$, ${x}_{1}$, $b$, and $\alpha $), we screen through a number of stochastic developmental strategies $D$ and identify the one providing the largest growth rate ($\lambda $) to the population. In this study, we searched for those parameters under which irreversible strategies lead to the fastest growth and are thus evolutionarily optimal, see model details in Appendix A.
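Given a set of developmental trajectories with probabilities ${P}_{i}$, offspring numbers ${G}_{i}$ and durations ${T}_{i}$, the growth rate can be found numerically. The sketch below assumes that $\lambda $ solves the Euler–Lotka-type condition that the sum of ${P}_{i}{G}_{i}{e}^{-\lambda {T}_{i}}$ over all trajectories equals 1, and it uses plain bisection, which is a choice made here and not necessarily the numerical method used in the study. The function name `growth_rate` is hypothetical.

```python
import math

def growth_rate(trajectories, iterations=200):
    """Solve sum_i P_i * G_i * exp(-lam * T_i) = 1 for lam by bisection.

    trajectories : list of (P_i, G_i, T_i) with T_i > 0
    Assumption: the left-hand side is strictly decreasing in lam, which
    holds when at least one trajectory has P_i * G_i > 0.
    """
    if not any(p > 0 and g > 0 for p, g, _ in trajectories):
        raise ValueError("no trajectory produces offspring")

    def h(lam):
        return sum(p * g * math.exp(-lam * t) for p, g, t in trajectories) - 1.0

    lo, hi = -1.0, 1.0
    while h(lo) < 0:      # expand the bracket until h(lo) >= 0
        lo *= 2
    while h(hi) > 0:      # expand the bracket until h(hi) <= 0
        hi *= 2
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if h(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

As a check, a deterministic life cycle that turns one cell into 32 offspring in 5 time units gives `growth_rate([(1.0, 32, 5.0)])` equal to $\mathrm{ln}(32)/5=\mathrm{ln}2$. When trajectories are sampled by simulation, each of the $M$ sampled trajectories can be assigned ${P}_{i}=1/M$.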
Results
For irreversible somatic differentiation to evolve, cell differentiation must be costly
We found that irreversible somatic differentiation does not evolve when cell differentiation is not associated with any costs (${c}_{s\to g}={c}_{g\to s}=0$), see Figure 2A. Only reversible differentiation evolves there, see Figure 2B. This finding comes from the fact that when somatic differentiation is irreversible, the fraction of germrole cells can only decrease in the course of the life cycle. As a result, irreversible strategies face a trade-off between producing more somarole cells at the beginning of the life cycle and having more germrole cells by the end of it. On the one hand, irreversible strategies which produce a lot of somarole cells early on complete the life cycle quickly but preserve only a few germrole cells by the time of reproduction. On the other hand, irreversible strategies which generate a lot of offspring can deploy only a few somarole cells at the beginning of the life cycle, and thus their developmental time is inevitably longer. By contrast, reversible somatic differentiation strategies do not experience a similar trade-off, as germrole cells can be generated from somarole cells. As a result, reversible strategies allow higher differentiation rates: they can develop a high somarole cell fraction in the course of the organism growth and at the same time have a large number of germrole cells by the moment of reproduction. Under costless cell differentiation, for any irreversible strategy, we can find a reversible counterpart which leads to faster growth: the development proceeds faster, while the expected number of produced offspring is the same, see Appendix 2 for details. As a result, costless cell differentiation cannot lead to irreversible somatic differentiation.
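The decline of the germrole fraction under irreversible strategies can be made explicit with a short calculation. This is a sketch that uses only the definitions of the Model section; the symbol $m$ is introduced here for illustration. Under an irreversible strategy (${s}_{ss}=1$), each germrole cell leaves on average $m=2{g}_{gg}+{g}_{gs}$ germrole daughters per division round, so after $n$ rounds the expected number of germrole cells ${G}_{n}$ and their expected share among the ${2}^{n}$ cells are

$$E[{G}_{n}]={m}^{n}={\left(2{g}_{gg}+{g}_{gs}\right)}^{n},\qquad \frac{E[{G}_{n}]}{{2}^{n}}={\left({g}_{gg}+\frac{{g}_{gs}}{2}\right)}^{n}.$$

Since ${g}_{gg}+{g}_{gs}/2=1-{g}_{gs}/2-{g}_{ss}<1$ whenever ${g}_{gg}<1$, the expected germrole share shrinks geometrically with every round. For example, with ${g}_{gg}=0.9$, ${g}_{gs}=0.1$ and $n=6$, the expected offspring number is ${1.9}^{6}\approx 47$ out of 64 cells, a share of ${0.95}^{6}\approx 0.735$. Producing somarole cells faster at the start necessarily lowers this number at maturity.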
To confirm the reasoning that reversible strategies gain an edge over irreversible strategies by having larger differentiation rates, we asked which reversible and irreversible strategies become optimal at various cell differentiation costs ($c={c}_{s\to g}={c}_{g\to s}$). At each value of the costs, we found the evolutionarily optimal developmental strategy for 3000 different randomly sampled composition effect profiles ${F}_{\text{comp}}(x)$. We found that evolutionarily optimal reversible strategies feature much larger rates of cell differentiation than evolutionarily optimal irreversible strategies, see Figure 2D. Even at large costs, where frequent differentiation is heavily penalized, the distinction between differentiation rates of reversible and irreversible strategies remains apparent.
We screened through a spectrum of germ-to-soma (${c}_{g\to s}$) and soma-to-germ (${c}_{s\to g}$) differentiation costs, see Figure 2A–C. Irreversible somatic differentiation is most likely to evolve when it is cheap to differentiate from germrole to somarole (low ${c}_{g\to s}$) but expensive to differentiate back (high ${c}_{s\to g}$), see Figure 2A. Irreversible strategies are insensitive to high soma-to-germ costs, since their somarole cells never differentiate. At the same time, reversible strategies are heavily punished by high costs of differentiation of somarole cells.
It is not very surprising to find irreversible differentiation where the differentiation costs are highly asymmetric. However, irreversible strategies are consistently observed in other regions of the cost space, including those where the asymmetry is opposite (it is hard to go from germ to soma but easy to return), see Figure 2A,H. To identify what other factors, beyond asymmetric costs, can lead to the evolution of irreversible somatic differentiation, below we focus on the scenario of equal differentiation costs ${c}_{s\to g}={c}_{g\to s}=c$.
Evolution of irreversible somatic differentiation is promoted when even a small number of somatic cells provides benefits to the organism
The composition effect profiles ${F}_{\text{comp}}(x)$ that promote the evolution of irreversible somatic differentiation have certain characteristic shapes, see Figure 2E–H. We investigated what kind of composition effect profiles can make irreversible somatic differentiation become an evolutionary optimum. We sampled a number of random composition effect profiles with independently drawn parameter values and found optimal developmental strategies for each profile for a number of differentiation costs ($c$) and maturity size (${2}^{n}$) values. We took a closer look at the instances of ${F}_{\text{comp}}(x)$ which resulted in irreversible somatic differentiation being evolutionarily optimal.
We found that irreversible strategies are only able to evolve when the somarole cells contribute to the organism cell doubling time even if present in small proportions, see Figure 3A,B. Analysing the parameters of the composition effect profiles promoting irreversible differentiation, we found that this effect manifests in two patterns. First, the contribution threshold value (${x}_{0}$) has to be small, see Figure 3D – irreversible differentiation is promoted when somarole cells begin to contribute to the organism growth even in low numbers. Second, the contribution synergy was found to be large ($\alpha >1$) or, alternatively, the saturation threshold (${x}_{1}$) was small, see Figure 3C.
Both the contribution threshold ${x}_{0}$ and the contribution synergy $\alpha $ control the shape of the composition effect profile at intermediate abundances of somarole cells. If the contribution synergy $\alpha $ exceeds 1, the profile is convex, so the contribution of somarole cells quickly becomes close to the maximal benefit ($b$). A small saturation threshold (${x}_{1}$) means that the maximal benefit of soma is achieved already at low concentrations of somarole cells (and then the shape of the composition effect profile between two close thresholds has no significance). Together, these patterns provide evidence that the most crucial factor promoting irreversible somatic differentiation is the effectiveness of somarole cells at small numbers, see Appendix 4 for a more detailed data presentation.
These patterns are driven by the static character of the differentiation strategies we use: the chances for a cell to differentiate are the same at the first and the last round of cell division. Therefore, the optimal germ-to-soma differentiation rate is found as a balance between the need to deploy somarole cells early on and the need to keep a high number of germrole cells by the end of the life cycle. This implies that irreversible somatic differentiation strategies produce somarole cells at a lower rate than reversible strategies, see Figure 2D. With irreversible differentiation, an organism spends a significant amount of time having only a few somarole cells. Hence, the irreversible strategy can only be evolutionarily successful if the few somarole cells make a notable contribution to the organism growth time.
We also found that profiles featuring irreversible differentiation possess neither extremely large nor extremely small maximal benefit values $b$, see Figure 3D. When the maximal benefit is too small, cell differentiation does not provide enough benefits to be selected for, and the evolutionarily optimal strategy is no differentiation. In the opposite case, when the maximal benefit is very close to one, the cell doubling time approaches zero, see Equation (3). Then, the benefits of having many somarole cells outweigh the costs of differentiation and the optimal strategy is reversible, see Appendix 4.
For irreversible somatic differentiation to evolve, the organism size must be large enough
By screening through the maturity size (${2}^{n}$) and differentiation costs ($c$), we found that the evolution of irreversible somatic differentiation is heavily suppressed at small maturity sizes, see Figure 4A. In small organisms, either reversible strategies or the no differentiation strategy evolve. Reversible strategies can quickly reach a fixed fraction of somarole cells, so they can obtain the maximal benefits from somarole cells even at small maturity sizes (Appendix 2, figure 1). The no differentiation strategy does not involve cell differentiation and therefore incurs no cell differentiation costs. In contrast, irreversible strategies increase the fraction of somarole cells, and with it the benefits these cells provide, only gradually as the maturity size increases. Meanwhile, the cell differentiation costs for irreversible strategies decrease as the maturity size increases, because the fraction of germrole cells decreases. Thus, compared with other strategies, irreversible strategies have advantages in large organisms. We found that under ${c}_{s\to g}={c}_{g\to s}$, the minimal maturity size allowing irreversible somatic differentiation to evolve is ${2}^{n}=64$ cells. At the same time, organisms performing just a few more rounds of cell divisions are able to evolve irreversible differentiation at a wide range of cell differentiation costs, see also Appendix 5. This indicates that the evolution of irreversible somatic differentiation is strongly tied to the size of the organism.
Evolution of irreversible strategies at sizes smaller than 64 cells is possible for ${c}_{s\to g}>{c}_{g\to s}$. For instance, at ${c}_{s\to g}=2{c}_{g\to s}$ some irreversible strategies were found to be optimal at the maturity size ${2}^{5}=32$ cells, see Figure 4B. However, irreversible strategies were found only in a narrow range of cell differentiation costs, and the fraction of composition effect profiles that allow the evolution of irreversible differentiation there was quite low – about 1%. The evolution of irreversible strategies at such small maturity sizes becomes likely only at extremely unequal costs of transition between germ and soma roles, ${c}_{s\to g}\gg {c}_{g\to s}$, see Figure 4C. Hence, for irreversible somatic differentiation to evolve, the organism size should exceed a threshold of roughly 64 cells.
Irreversible somatic differentiation can also evolve when cell differentiation is risky
In our main model, we considered differentiation costs in a specific form of cell division delay. However, the process of cell differentiation may impact the organism development in another way. Differentiation requires modifications in DNA regulation, which in turn poses a risk of dysregulation resulting in an emergence of selfish mutants that could for example cause cancer. The disposable soma theory suggests that cells performing vegetative functions form separate lineages to contain emerging mutations and prevent them from passing to the next generations of organisms. In line with this hypothesis, we also considered a model of risky cell differentiation, where the transition between germ and soma roles incurs a risk of getting cancer that kills the entire organism, see Appendix 6.
The results obtained with a model of risky differentiation are very similar to the outcomes of our main model, where cell differentiation causes a delay, see Figure 5. In both models, irreversible differentiation only evolves if cell differentiation does not come for free but brings costly side effects (delay or risk). Also, in both models, irreversible differentiation is prevalent when costs of soma-to-germ transitions are high; reversible differentiation is prevalent when costs of both transitions are low; and no differentiation is prevalent when costs of germ-to-soma transitions are high, see Figure 2A–C.
Discussion
The vast majority of cells in a body of any multicellular being contains enough genetic information to build an entire new organism. However, in a typical metazoan species, very few cells actually participate in the organism reproduction – only a limited number of germ cells are capable of doing it. The other cells, called somatic cells, perform vegetative functions but do not contribute to reproduction – somatic differentiation is irreversible. We asked for the reason for the success of such a specific mode of organism development. We theoretically investigated the evolution of irreversible somatic differentiation with a model of clonally developing organisms taking into account benefits provided by somarole cells, costs arising from cell differentiation, and the effect of the raw organism size.
Our key findings are:
The evolution of irreversible somatic differentiation is inseparable from costly cell differentiation or risky cell differentiation.
For irreversible somatic differentiation to evolve in organisms with synchronous cell division, somatic cells should be able to contribute to the organism performance already when their numbers are small.
Only large enough organisms tend to develop irreversible somatic differentiation.
According to our results, cell differentiation costs are essential for the emergence of irreversible somatic differentiation, see Figure 2A. The costs punish strategies with a high rate of cell differentiation. As a result, irreversible strategies gain an advantage because their overall differentiation rate is low, see Figure 2D, and their somarole cells do not differentiate at all. Most models focus on traits that lead to benefits for the organism, while the costs of cell differentiation are rarely considered. For cells in a multicellular organism, differentiation costs arise from the material, energy, and time needed to produce components that are necessary for the performance of the differentiated cell but were absent in the parent cell. For instance, in filamentous cyanobacteria, nitrogen-fixing heterocysts develop a much thicker cell wall than the parent photosynthetic cells had. Also, reports indicate that between 23% (Ow et al., 2008) and 74% (Sandh et al., 2014) of the proteome changes in abundance in heterocysts compared with photosynthetic cells. Similarly, changes in the protein composition in the course of cell differentiation were found during the development of stalk and fruiting bodies of Dictyostelium discoideum (Bakthavatsalam and Gomer, 2010; Czarna et al., 2010).
An alternative to differentiation costs in terms of slower growth is a model with a risky differentiation, where we found similar patterns, see Figure 5. These results indicate that the exact mechanism of the differentiation costs does not play a major role in the evolution of irreversible somatic differentiation.
Our model demonstrates that irreversible somatic differentiation is more likely to evolve when a few somarole cells are able to provide a substantial benefit to the organism, see Figure 3. Volvocales algae demonstrate that a significant contribution by small numbers of somatic cells might indeed be found in a natural population: In Eudorina illinoiensis, only four out of thirty-two cells are vegetative (Sambamurty AVSS, 2005) (somarole in our terms). This species has developed some reproductive division of labour, and a fraction of only $1/8$ of vegetative cells is sufficient for colony success. Thus, it seems possible that highly efficient somarole cells open the way to the evolution of irreversible somatic differentiation. Several patterns of how cells provide benefits to an organism have been previously considered (Michod, 2007; Willensdorfer, 2009; Rossetti et al., 2010; Rueffler et al., 2012; Cooper and West, 2018; Yanni et al., 2020). The majority of papers focus on the resource allocation toward different tasks in each cell of an organism and on how divergent different cells can be. In our model, we assume that germrole and somarole cells differ in function and focus on the relationship between the number of somarole cells and their impact, e.g. the character of their interactions. While the ${F}_{\text{comp}}$ curves we found exhibit a convex-like shape, see Figure 3A,B, this finding has a different nature from the convex trade-off between fertility and viability found in the models of cell differentiation (Michod, 2007).
Our model shows that irreversible somatic differentiation does not evolve if the organism size is small, see Figure 4A. The maturity size plays an important role in an organism’s life cycle (Amado et al., 2018; Erten and Kokko, 2020): Large organisms have potential advantages to optimize themselves in multiple ways, such as to improve growth efficiency (Waters et al., 2010), to avoid predators (Matz and Kjelleberg, 2005; Fisher et al., 2016; Hiltunen and Becks, 2014), to increase problem-solving efficiency (Morand-Ferron and Quinn, 2011), and to exploit the division of labour in organisms (Carroll, 2001; Matt and Umen, 2016). Moreover, the maximum size has been related to the reproduction of the organism from the onset of multicellularity in Earth’s history (Ratcliff et al., 2012). Our results suggest that the smallest organism able to evolve irreversible somatic differentiation should typically be about 32–64 cells (unless the cost of soma-to-germ differentiation is extremely large and the cost of the reverse is low). This is in line with the pattern of development observed in Volvocales green algae. In Volvocales, cells are unable to move (vegetative function) and divide (reproductive function) simultaneously, as a unique set of centrioles is involved in both tasks (Wynne and Bold, 1985; Koufopanou, 1994). Chlamydomonas reinhardtii (unicellular) and Gonium pectorale (small colonies of up to 16 cells) perform these tasks at different times. They move towards the top layers of water during the day to get more sunlight. At night, however, these species perform cell division and/or colony reproduction, slowly sinking in the process. Among larger Volvocales, however, a division of labour begins to develop. In Eudorina elegans colonies, containing 16–32 cells, a few cells at the pole have a reduced chance of giving rise to an offspring colony (Marchant, 1977; Hallmann, 2011). In Pleodorina californica, half of the 128-celled colony is formed of smaller cells, which are totally dedicated to colony movement and die at the end of the colony life cycle (Kikuchi, 1978; Hallmann, 2011). In Volvox carteri, most of a 10,000-cell colony is formed by somatic cells, which die upon the release of offspring groups (Hallmann, 2011).
In the majority of our tests, we used a maturity size of ${2}^{10}=1024$ cells. This is significantly larger than the minimal size necessary for the evolution of irreversible somatic differentiation. However, a body size of the order of 1000 cells attracts attention because at this scale organisms of very diverse degrees of complexity are observed: from undifferentiated colonies (ocean algae Phaeocystis antarctica), to intermediary life forms (slime mold slugs), to paradigmatic multicellular organisms (higher Volvocales and the nematode Caenorhabditis elegans).
The model presented in our study focuses on the transition from colonial life forms to multicellular beings. Further development of complexity opens multiple new ways to optimize the life cycle. For example, a maternal organism can provide protection and nurture for offspring at their early stages of growth, as in V. carteri (10,000 cells), in which offspring colonies develop inside the parental organism. There, the rate of offspring growth depends mostly on the performance of the maternal organism and much less on the differentiation strategy of the offspring. Maternal protection allows the conditions for the evolution of irreversible differentiation indicated in our study to be relaxed. How much these conditions can be relaxed is a very interesting question.
One of the most significant assumptions we made is the synchronicity of cell divisions, even if division outcomes are different. This is only possible if cell actions are coordinated at the level of the organism – otherwise, cells that do not differentiate may complete their divisions before differentiating cells. When in the history of multicellularity such coordination emerged is an open question. However, in a number of rather simple species, synchronicity of cell divisions paired with cell differentiation is observed. One example is the green alga Eudorina illinoiensis – one of the simplest species demonstrating the first signs of reproductive division of labour, in which four out of 32 cells are differentiated (Sambamurty AVSS, 2005). Another example is the 128-celled alga Pleodorina californica, in which half of the cells are differentiated and yet cell divisions are synchronous (Kikuchi, 1978). Even the fact that the size of the mature organism is a power of two indicates that cells do not divide independently, but that their actions are controlled at the level of the organism.
To assess the impact of cell division synchronicity, we developed a model with asynchronous cell division, where cell differentiation costs are paid individually by each differentiating cell, see Appendix 7. We found that the evolution of irreversible differentiation is significantly suppressed even under the most favourable conditions (${c}_{s\to g}\gg {c}_{g\to s}$) – the frequency of composition profiles promoting irreversible somatic differentiation is much smaller and the minimal maturity size required is larger.
Another assumption that shapes the results of our study is the static differentiation strategy: the probability of each division outcome does not depend on the stage of the life cycle. On the one hand, the static nature of the differentiation strategy puts irreversible differentiation at a disadvantage, as it creates a trade-off between the fraction of soma-role cells at the early stage of the life cycle and the number of germ-role cells at the end of the life cycle. On the other hand, a set of fully flexible dynamic differentiation strategies presents an efficient but hardly realistic solution to the life cycle optimization problem: at the first round of cell divisions the organism converts to an all-soma state and remains so until the last round, when all cells convert back to the germ state. Theoretically, this strategy provides simultaneously the fastest possible development rate (100% soma-role cells during the life cycle) and the largest possible number of offspring (100% germ-role cells at the end of the life cycle). Still, we cannot provide an example of such a developmental program in nature. Nevertheless, the differentiation strategy of higher Volvocales is not static (Kirk, 2005), and the exploration of the vast space of dynamic differentiation strategies warrants further investigation.
We acknowledge that our discussion of natural examples of germ-soma differentiation relies heavily on Volvocales algae. This merely reflects the bias in the empirical literature about the evolution of germ/soma differentiation towards this group. We should note that our model is not a model of the Volvocales life cycle. Instead, we aim to answer the question about the emergence of irreversible somatic differentiation in a broad context without tailoring it to the features of a single group.
Our study originated from curiosity about the driving factors in the evolution of irreversible somatic differentiation: Why does the green alga Volvox from the kingdom Plantae shed most of its biomass in a single act of reproduction? And why, in another kingdom, Animalia, is the majority of body cells in most species outright forbidden to contribute to the next generation? Our results show which factors make the difference between the evolution of irreversible somatic differentiation and other strategies of development. One of these factors, the maturity size, is known in the context of the evolution of reproductive division of labour (Kirk, 2005). Another factor, the costs of cell differentiation, is, in general, discussed in a greater biological scope but is hardly acknowledged as a factor contributing to the evolution of organism development. Finally, the early contribution of soma-role cells to organism growth, even if they are few in number, is an unexpected outcome of our investigation that has also been overlooked so far. Despite the simplistic nature of our model (we did not aim to model any specific organism), all our results find confirmation among the Volvocales clade. Hence, we expect that the findings of this study reveal general properties of the evolution of irreversible somatic differentiation, independently of the clade where it evolves.
Appendix 1
Search for the evolutionarily optimal developmental program
Finding the population growth rate for a given developmental program
In Gao et al., 2019, we have shown that a population of organisms that begin their life cycle from the same state but develop stochastically eventually grows exponentially at the rate $\lambda $ given by the solution of
$$\sum _{i}{P}_{i}{G}_{i}{e}^{-\lambda {T}_{i}}=1. \tag{5}$$
Here, $i$ is the developmental trajectory – in our case, the specific combination of all cell division outcomes; ${P}_{i}$ is the probability that an organism's development will follow the trajectory $i$; ${T}_{i}$ is the time necessary to complete the trajectory $i$ – from a single cell to the maturity size of ${2}^{n}$ cells; ${G}_{i}$ is the number of offspring organisms produced at the end of developmental trajectory $i$, equal to the number of germ-role cells at the moment of maturity.
In order to find the population growth rate, we need to know ${G}_{i}$, ${T}_{i}$, and ${P}_{i}$ (how many offspring are produced, how long did it take to mature, and how likely is this developmental trajectory, respectively). The complete set of developmental trajectories is huge as it scales exponentially with the number of divisions $n$.
In our study, for each developmental strategy, we sampled $M=300$ developmental trajectories at random. To get each trajectory, we simulated the growth of a single organism according to the rules of our model. For each trajectory, the developmental time ${T}_{i}$ was computed as the sum of cell doubling times at each of the $n$ synchronous cell divisions, and the number of offspring ${G}_{i}$ was given by the count of germ-role cells at the end of development. The resulting ensemble of trajectories (with ${P}_{i}=1/M$) was plugged into Equation 5 to compute the population growth rate $\lambda $.
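The last step can be sketched numerically. The snippet below is a minimal illustration, assuming Equation 5 has the Euler-Lotka form $\sum_i P_i G_i e^{-\lambda T_i}=1$ used in Gao et al., 2019; the function name `population_growth_rate` and the bisection solver are our own choices for illustration and are not taken from the deposited code.

```python
import math

def population_growth_rate(offspring, times, probs=None, tol=1e-12):
    """Solve sum_i P_i * G_i * exp(-lam * T_i) = 1 for lam by bisection.

    offspring: G_i for each sampled trajectory; times: T_i (all positive).
    probs: P_i; defaults to equal weights 1/M for M sampled trajectories.
    """
    m = len(offspring)
    if probs is None:
        probs = [1.0 / m] * m
    if sum(offspring) <= 0:
        raise ValueError("at least one trajectory must produce offspring")

    def excess(lam):
        # Left-hand side minus one; strictly decreasing in lam when T_i > 0.
        return sum(p * g * math.exp(-lam * t)
                   for p, g, t in zip(probs, offspring, times)) - 1.0

    lo, hi = -1.0, 1.0
    while excess(lo) < 0:   # widen the bracket downwards if needed
        lo *= 2
    while excess(hi) > 0:   # widen the bracket upwards if needed
        hi *= 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical ensemble: three trajectories with germ-role counts and durations.
rate = population_growth_rate([512, 700, 640], [9.0, 8.5, 8.8])
```

For a single deterministic trajectory with $G$ offspring and duration $T$, the solution reduces to $\lambda =\mathrm{ln}(G)/T$, which is a convenient sanity check.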
Finding the developmental program with the largest population growth rate
We assume that evolution occurs by growth competition between populations executing different developmental strategies. Strategies that provide a larger population growth rate will outgrow the others. To find evolutionarily optimal strategies under given conditions, we screened a large set of developmental strategies and identified the one with the maximal population growth rate $\lambda $. Since the probabilities of cell division outcomes sum to one (${g}_{gg}+{g}_{gs}+{g}_{ss}=1$ and ${s}_{gg}+{s}_{gs}+{s}_{ss}=1$), these probabilities can be represented as points on two simplexes, one for the division of germ-role cells and one for the division of soma-role cells. Consequently, we chose the set of developmental strategies as a Cartesian product of two triangular lattices – one for the division probabilities of germ-role cells (${g}_{gg},{g}_{gs},{g}_{ss}$) and one for soma-role cells (${s}_{gg},{s}_{gs},{s}_{ss}$). The lattice spacing was set to 0.1, so each of the two independent lattices contained $11\times 12/2=66$ nodes, and the whole set comprised $66\times 66=4356$ different developmental strategies. For each of these strategies, the population growth rate $\lambda $ was calculated, and the strategy with the largest growth rate was identified as evolutionarily optimal.
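The strategy set can be enumerated directly. The following is a minimal sketch, not the deposited implementation: `simplex_lattice` and `classify` are hypothetical helper names, and the classification uses the switching probabilities ${m}_{g}={g}_{ss}+{g}_{gs}/2$ and ${m}_{s}={s}_{gg}+{s}_{gs}/2$ defined in Appendix 2.

```python
from collections import Counter
from itertools import product

def simplex_lattice(steps=10):
    """All (p_gg, p_gs, p_ss) summing to one on a lattice with spacing 1/steps."""
    nodes = []
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            k = steps - i - j
            nodes.append((i / steps, j / steps, k / steps))
    return nodes

def classify(germ, soma):
    """Class of a strategy given germ-role and soma-role division probabilities."""
    m_g = germ[2] + germ[1] / 2   # chance that a daughter of a germ-role cell is soma-role
    m_s = soma[0] + soma[1] / 2   # chance that a daughter of a soma-role cell is germ-role
    if m_g == 0:
        return "no differentiation"
    if m_s == 0:
        return "irreversible"
    return "reversible"

germ_nodes = simplex_lattice()
soma_nodes = simplex_lattice()
strategies = list(product(germ_nodes, soma_nodes))
counts = Counter(classify(germ, soma) for germ, soma in strategies)
```

On this lattice only a small share of the strategies is irreversible (those with ${s}_{ss}=1$ and ${g}_{gg}<1$), so the screening does not favour this class a priori.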
In our investigation, parameters such as differentiation costs (${c}_{s\to g}$, ${c}_{g\to s}$) and maturity size (${2}^{n}$) were used as control parameters. In other words, we either fixed them at specific values or screened through a range of values to obtain a map (see Figures 2 and 3 in the main text). However, the parameters that control the shape of the composition effect profile (${x}_{0}$, ${x}_{1}$, $\alpha $, and $b$) were treated differently. For each combination of control parameters, we randomly sampled a number (between 200 and 3000) of combinations of these profile parameters. The thresholds ($0\le {x}_{0}\le {x}_{1}\le 1$) were sampled as a pair of independently distributed random values from the uniform distribution $U(0,1)$. The contribution threshold ${x}_{0}$ was set to the minimum of the pair, and the saturation threshold ${x}_{1}$ was set to the maximum. The contribution synergy ($\alpha >0$) corresponds to a concave shape of the profile at $\alpha <1$ and to a convex shape at $\alpha >1$. Therefore, ${\mathrm{log}}_{10}(\alpha )$ was sampled from the uniform distribution $U(-2,+2)$, so that the profile is equally likely to be concave or convex. Finally, the maximum benefit ($0\le b<1$) was sampled from the uniform distribution $U(0,1)$. For each tested combination of control parameters, we found the optimal developmental strategy for every sampled profile. We then classified these as irreversible somatic differentiation, reversible somatic differentiation, or no somatic differentiation.
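The sampling scheme for the profile parameters can be sketched as follows. This is an illustration only: the function name `sample_profile`, the dictionary keys, and the fixed seed are hypothetical, and ${\mathrm{log}}_{10}(\alpha )$ is drawn from $U(-2,+2)$, the symmetric range implied by the equal chance of concave and convex profiles.

```python
import random

def sample_profile(rng):
    """Draw one composition effect profile (x0, x1, alpha, b)."""
    u, v = rng.random(), rng.random()     # two independent U(0, 1) draws
    x0, x1 = min(u, v), max(u, v)         # contribution and saturation thresholds
    alpha = 10 ** rng.uniform(-2.0, 2.0)  # log10(alpha) is uniform on [-2, 2]
    b = rng.random()                      # maximum benefit, 0 <= b < 1
    return {"x0": x0, "x1": x1, "alpha": alpha, "b": b}

rng = random.Random(0)                    # fixed seed, for reproducibility only
profiles = [sample_profile(rng) for _ in range(200)]
```

Sampling $\alpha $ on a logarithmic scale gives values below and above one the same weight, which a uniform draw of $\alpha $ itself would not.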
Appendix 2
Under costless cell differentiation, irreversible soma strategy cannot be evolutionarily optimal
In this section, we will show that an irreversible strategy can never be an evolutionary optimum without cell differentiation being costly. To do that, we first consider the deterministic dynamics of the expected composition of the organism. Then, for an arbitrary irreversible strategy, we identify a more advantageous reversible strategy which gives the same organism composition at the end of the life cycle but a higher number of soma-role cells during the life cycle.
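In our model, the composition of the organism is governed by the stochastic developmental strategy and differs between different organisms. Here, as a proxy for this complex stochastic dynamics, we consider the mathematical expectation of the composition. Assume that after $j$ cell divisions the fraction of soma-role cells is ${r}_{s}(j)$ and the fraction of germ-role cells is ${r}_{g}(j)=1-{r}_{s}(j)$, $j=1,\mathrm{\dots},n$, where $n$ is the maximal number of divisions. Then, the expected fractions of cells of the two types after the next cell division are
$$\begin{aligned}{r}_{s}(j+1)&=(1-{m}_{s})\,{r}_{s}(j)+{m}_{g}\,{r}_{g}(j),\\ {r}_{g}(j+1)&={m}_{s}\,{r}_{s}(j)+(1-{m}_{g})\,{r}_{g}(j),\end{aligned} \tag{6}$$
where we introduced ${m}_{s}={s}_{gg}+\frac{{s}_{gs}}{2}$ and ${m}_{g}={g}_{ss}+\frac{{g}_{gs}}{2}$ – the probabilities that the offspring of a cell will have a different role. Naturally, for irreversible somatic differentiation ${m}_{s}=0$ and ${m}_{g}>0$, for no somatic differentiation strategies ${m}_{g}=0$ with ${m}_{s}$ being irrelevant, while the reversible differentiation class covers the rest. Equation 6 can be written in matrix form
$$\begin{pmatrix}{r}_{s}(j+1)\\ {r}_{g}(j+1)\end{pmatrix}=\begin{pmatrix}1-{m}_{s}&{m}_{g}\\ {m}_{s}&1-{m}_{g}\end{pmatrix}\begin{pmatrix}{r}_{s}(j)\\ {r}_{g}(j)\end{pmatrix}. \tag{7}$$
A newborn organism contains a single germ-role cell (${r}_{s}(0)=0,{r}_{g}(0)=1$), therefore, the expected composition of an organism after $j$ divisions is
$$\begin{pmatrix}{r}_{s}(j)\\ {r}_{g}(j)\end{pmatrix}={\begin{pmatrix}1-{m}_{s}&{m}_{g}\\ {m}_{s}&1-{m}_{g}\end{pmatrix}}^{j}\begin{pmatrix}0\\ 1\end{pmatrix}. \tag{8}$$
The matrix has two eigenvalues: 1 and $1-{m}_{g}-{m}_{s}$, with associated right eigenvectors ${({m}_{g},{m}_{s})}^{T}$ and ${(1,-1)}^{T}$, respectively. Hence, the expected composition after $j$ divisions can be obtained in the explicit form
$$\begin{aligned}{r}_{s}(j)&=\frac{{m}_{g}}{{m}_{g}+{m}_{s}}\left[1-{(1-{m}_{g}-{m}_{s})}^{j}\right],\\ {r}_{g}(j)&=\frac{{m}_{s}+{m}_{g}{(1-{m}_{g}-{m}_{s})}^{j}}{{m}_{g}+{m}_{s}}.\end{aligned} \tag{9}$$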
For an arbitrary irreversible somatic differentiation strategy $D$, ${m}_{s}=0$, the expected fraction of soma-role cells changes as
$${r}_{s,D}(j)=1-{(1-{m}_{g})}^{j}, \tag{10}$$
which is a monotonically increasing function of the number of cell divisions $j$, see the green line in Fig. B. In a life cycle involving $n$ cell divisions, the fraction of soma-role cells at the end of the life cycle is ${r}_{s,D}(n)=1-{(1-{m}_{g})}^{n}$.
Now, we consider another developmental strategy ${D}^{\prime}$ with reversible somatic differentiation in which ${m}_{g}^{\prime}={r}_{s,D}(n)$ and ${m}_{s}^{\prime}=1-{r}_{s,D}(n)$. Using ${m}_{g}^{\prime}+{m}_{s}^{\prime}=1$ in (9), it can be shown that the expected fraction of soma-role cells in ${D}^{\prime}$ after the very first cell division is exactly ${r}_{s,D}(n)$ and stays constant thereafter, see the orange line in Fig. B. Thus, the expected number of offspring produced is the same for both developmental strategies.
If cell differentiation is costless (${d}_{s}={d}_{g}=0$), then the cell doubling time depends only on the fraction of soma-role cells. As all soma-role cells are then present already after the first cell division, organisms following the reversible strategy ${D}^{\prime}$ will grow faster than organisms using the irreversible strategy $D$ at any stage of organism development, independently of the choice of the composition effect profile (${F}_{\text{comp}}$). At the end of the life cycle, both strategies have the same expected number of offspring. Therefore, under costless cell differentiation, for any irreversible strategy, we can find a reversible strategy that leads to a larger population growth rate.
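The argument of this appendix can be checked numerically. The sketch below iterates the expected composition from a single germ-role cell, using only the switching probabilities ${m}_{g}$ and ${m}_{s}$ defined above; the function name `expected_soma_fraction` and the value ${m}_{g}=0.1$ are hypothetical choices made for illustration.

```python
def expected_soma_fraction(m_g, m_s, n):
    """Expected soma-role fraction after 0, 1, ..., n synchronous divisions."""
    r_s, r_g = 0.0, 1.0            # newborn organism: a single germ-role cell
    history = [r_s]
    for _ in range(n):
        # A soma-role daughter comes from a soma-role cell that keeps its role
        # or from a germ-role cell that switches, and vice versa.
        r_s, r_g = (1 - m_s) * r_s + m_g * r_g, m_s * r_s + (1 - m_g) * r_g
        history.append(r_s)
    return history

n = 10
m_g = 0.1                                        # hypothetical irreversible strategy D
irreversible = expected_soma_fraction(m_g, 0.0, n)
target = irreversible[-1]                        # r_{s,D}(n) = 1 - (1 - m_g)^n
reversible = expected_soma_fraction(target, 1 - target, n)   # strategy D'
```

Here `reversible` reaches `target` after the first division and stays there, while `irreversible` approaches the same value only at the end of the life cycle, which is the advantage of ${D}^{\prime}$ exploited in the proof.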
Appendix 3
Conditions promoting the evolution of reversible, irreversible, and no differentiation strategies
Appendix 4
Parameters of composition effect profiles promoting reversible, irreversible, and no differentiation strategies
Appendix 5
Evolution of irreversible somatic differentiation under various maturity sizes and unequal cell differentiation costs
Appendix 6
Model of risky cell differentiation
In the risky differentiation model, we assume that cell differentiation implies a risk of errors leading to defective cells (Aktipis et al., 2015). These cells act in their selfish interests, compromising the integrity of the organism. This leads to the death of the organism, much like the outcome of cancer in complex multicellular species.
The impact of a defective cell depends on the stage of the life cycle at which it appears. A defective cell that emerges during the first cell division will likely result in a non-viable organism. At the same time, a defective cell that emerges in the very last round of cell divisions is unlikely to affect the organism because its life cycle is about to end. To reflect this effect, we scaled the impact of a newly emerged defective cell by the number of cells already present in the organism. This way, the probability of getting cancer is proportional to the frequency of cell differentiation events. The proportions of soma-role cells and germ-role cells that differentiate upon division in the total number of cell divisions are
where ${N}_{x\to yz}$ is the number of cell divisions at which a cell of role $x$ gives rise to a $y$ cell and a $z$ cell, and $Z={2}^{n}-1$ is the total number of cell divisions during the growth of an organism with maturity size ${2}^{n}$.
We define the probabilities of death caused by defective cells emerged in germ to soma and soma to germ transitions as
where ${\delta}_{g\to s}$ and ${\delta}_{s\to g}$ characterize the risk of cancer from a germ to soma and from a soma to germ transition. The transformation function $\mathrm{tanh}(x)$ is chosen to grow linearly at a small number of differentiation events but exponentially saturates to one if these events are numerous, see Fig. F.
We assume that an organism successfully completes its life cycle and produces offspring only if no cancer emerges in the course of its growth. The probability of this at each round of cell division is
Otherwise, the organism dies and does not produce any offspring. There are no delay differentiation costs in this model $({c}_{s\to g}={c}_{g\to s}=0)$.
A typical feature of cancer cells in complex organisms is a high cell division rate. This has a large impact on complex animals, in which the division rate of regular cells is low and life cycles are long. However, the organisms in the focus of our study have very short life cycles (a few rounds of cell divisions), and even the regular cells actively proliferate. Hence, the growth advantage of defective cells should have a much smaller impact on simple species. Therefore, in this model, we neglect the difference in division rates between defective and regular cells and keep cell divisions synchronous.
The probability of getting cancer depends on the frequency of cell differentiation events. An organism with a higher cell differentiation rate has a higher death probability, which leads to slower population growth.
Appendix 7
Evolution of irreversible somatic differentiation in a model with asynchronous cell division
Our original model features synchronous cell division. This comes from the assumption that differentiation costs are paid collectively by the whole organism. Here, we consider another option, where differentiation costs are paid individually by each cell. An immediate consequence is that cell division in such a model is asynchronous because differentiating cells take more time to divide.
In the asynchronous model, we model cell division as a random process occurring with the reaction rates
where $g$ and $s$ are the numbers of germ-role and soma-role cells in the organism, ${s}_{xy},{g}_{xy}$ are elements of the differentiation program $D$, ${F}_{\text{comp}}$ is the composition effect profile computed identically to the synchronous model, see Equation 3, and ${c}_{s\to g}$ and ${c}_{g\to s}$ are differentiation costs.
We use the Gillespie algorithm to find which kind of cell division occurs next and how much time it takes. Then the chosen cell division occurs once (the organism grows by a single cell). After that, the ${F}_{\text{comp}}$ value is updated to reflect the changed composition. Then the next cell division is sampled, and the process continues until the organism reaches the maturity size. This model is designed to be an asynchronous implementation of our ideas that remains close to our original model presented in the main text. Therefore, the rest of the simulation protocol remains the same.
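The simulation loop described above can be sketched as a standard Gillespie step over six division outcomes. This is a minimal illustration and not the deposited implementation: the reaction rates of the model are not reproduced here, so `rate_fn` and `example_rates` are hypothetical placeholders (the example ignores both differentiation costs and ${F}_{\text{comp}}$).

```python
import random

# Change in (germ-role, soma-role) counts per outcome; each adds exactly one cell.
CHANNELS = {
    "g->gg": (+1, 0), "g->gs": (0, +1), "g->ss": (-1, +2),
    "s->gg": (+2, -1), "s->gs": (+1, 0), "s->ss": (0, +1),
}

def grow_organism(rate_fn, maturity_size, rng):
    """Grow from one germ-role cell to maturity_size cells; returns (g, s, time)."""
    g, s, t = 1, 0, 0.0
    while g + s < maturity_size:
        rates = rate_fn(g, s)                       # outcome name -> reaction rate
        positive = [(name, r) for name, r in rates.items() if r > 0]
        total = sum(r for _, r in positive)
        if total <= 0:
            raise RuntimeError("no cell division is possible")
        t += rng.expovariate(total)                 # waiting time to the next division
        threshold = rng.random() * total
        chosen = positive[-1][0]                    # fallback against rounding error
        cumulative = 0.0
        for name, r in positive:
            cumulative += r
            if threshold < cumulative:
                chosen = name
                break
        dg, ds = CHANNELS[chosen]
        g, s = g + dg, s + ds
    return g, s, t

def example_rates(g, s, germ=(0.7, 0.2, 0.1), soma=(0.0, 0.0, 1.0)):
    """Placeholder rates: every cell divides at unit rate, split by outcome odds."""
    return {
        "g->gg": g * germ[0], "g->gs": g * germ[1], "g->ss": g * germ[2],
        "s->gg": s * soma[0], "s->gs": s * soma[1], "s->ss": s * soma[2],
    }

g_final, s_final, duration = grow_organism(example_rates, 1024, random.Random(1))
```

In the model, `rate_fn` would be recomputed from the current composition at every step, which is where the updated ${F}_{\text{comp}}$ value and the individual differentiation costs would enter.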
The computation time of the asynchronous model scales linearly with the number of cell divisions: it takes 1023 simulation steps to simulate the growth from 1 to 1024 cells. The synchronous model, in contrast, scales linearly with the number of cell generations: the same growth to 1024 cells needs only 10 steps there. The asynchronous model is therefore computationally much more demanding, and maps similar to Figure 2A–C are unavailable for it for computational reasons. Still, we calculated optimal differentiation strategies for a single combination of costs: ${c}_{g\to s}=0$, ${c}_{s\to g}=10$. Under these conditions, which favour the evolution of irreversible differentiation in the synchronous model, it is significantly suppressed in the asynchronous model, see Fig. G.
The reason behind the difference between the results for the synchronous and asynchronous models is the different performance of reversible strategies in these models. If the costs of soma differentiation are large enough, the expected period of cell division for a differentiating soma-role cell is longer than the length of the life cycle. As a result, instead of redifferentiating, soma-role cells become effectively terminally differentiated. In such a situation, the growth of the organism is determined by the propagation of germ-role cells and does not depend on the value of the soma differentiation costs.
The key to the success of reversible strategies in the synchronous model was the ability to develop large fractions of soma-role cells early on and to keep this fraction in the course of the life cycle. There, the fraction of soma-role cells is preserved by a dynamic equilibrium between differentiation in both directions, see Appendix 2. In the asynchronous model with high soma differentiation costs, soma-role cells do not divide and such a dynamic equilibrium does not exist. The fraction of soma-role cells is maintained differently here. If we denote the number of germ-role and soma-role cells at time $j$ (unlike Appendix 2, it is a continuous parameter here) as $g(j)$ and $s(j)$, respectively, then in the case of non-dividing soma-role cells (${c}_{g\to s}=0$, ${c}_{s\to g}\gg 1$, ${s}_{ss}=0$), the dynamics of the organism is given by
The solution of this system of equations with the initial condition of one germ-role and no soma-role cells is
Hence, the fraction ${r}_{s}(j)$ of soma-role cells is
The differentiation strategy considered above (${s}_{ss}=0$) is an extreme case where a dynamic equilibrium between cell differentiations is not possible. Still, Equation 17 demonstrates that a balance between germ-role and soma-role cells is achieved here as well. Therefore, in the asynchronous model with highly asymmetric differentiation costs, the reversible strategies keep all components that make them successful in the no-cost scenario: the early production of soma-role cells due to high differentiation rates, the necessary fraction of soma-role cells during the life cycle (Equation 17), and the overall fast growth of the whole organism, despite having non-dividing soma-role cells (Equation 16).
Note that in irreversible strategies, soma-role cells do not differentiate and therefore divide at a normal rate. Therefore, the characteristic trade-off of irreversible strategies between having more soma-role cells early and more germ-role cells later in the life cycle remains in place even in the asynchronous model. As a result, in this model, reversible strategies are not punished by asymmetric costs and outcompete irreversible ones.
Data availability
The code implementing our model is deposited at https://github.com/YuanxiaoGao/Evolutionofirreversiblesomaticdifferentiation (copy archived at https://archive.softwareheritage.org/swh:1:rev:9a1ea7c84f3041ebe3720e7837b28182912b5e00).
References

Cancer across the tree of life: cooperation and cheating in multicellularity. Philosophical Transactions of the Royal Society B: Biological Sciences 370:20140219. https://doi.org/10.1098/rstb.2014.0219

A mechanistic model for the evolution of multicellularity. Physica A: Statistical Mechanics and Its Applications 492:1543–1554. https://doi.org/10.1016/j.physa.2017.11.080

Division of labour and the evolution of extreme specialization. Nature Ecology & Evolution 2:1161–1167. https://doi.org/10.1038/s4155901805649

From zygote to a multicellular soma: Body size affects optimal growth strategies under cancer risk. Evolutionary Applications 13:1593–1604. https://doi.org/10.1111/eva.12969

Multicellular group formation in response to predators in the alga Chlorella vulgaris. Journal of Evolutionary Biology 29:551–559. https://doi.org/10.1111/jeb.12804

Reconciling the incompatible: N2 fixation and O2. The New Phytologist 129:571–609. https://doi.org/10.1111/J.14698137.1992.TB00087.X

Interacting cells driving the evolution of multicellular life cycles. PLOS Computational Biology 15:e1006987. https://doi.org/10.1371/journal.pcbi.1006987

Rapid transition towards the division of labor via evolution of developmental plasticity. PLOS Computational Biology 6:e1000805. https://doi.org/10.1371/journal.pcbi.1000805

One cell, two cell, red cell, blue cell: the persistence of a unicellular stage in multicellular life histories. Trends in Ecology & Evolution 13:112–116. https://doi.org/10.1016/S01695347(97)01313X

The Evolution of Multicellularity: A Minor Major Transition? Annual Review of Ecology, Evolution, and Systematics 38:621–654. https://doi.org/10.1146/annurev.ecolsys.36.102403.114735

Evolution of reproductive development in the volvocine algae. Sexual Plant Reproduction 24:97–112. https://doi.org/10.1007/s0049701001584

Division of labour and the evolution of multicellularity. Proceedings of the Royal Society B: Biological Sciences 279:1768–1776. https://doi.org/10.1098/rspb.2011.1999

The evolution of soma in the volvocales. The American Naturalist 143:907–931. https://doi.org/10.1086/285639

Do plants have a segregated germline? PLOS Biology 16:e2005439. https://doi.org/10.1371/journal.pbio.2005439

Off the hook - how Bacteria survive protozoan grazing. Trends in Microbiology 13:302–307. https://doi.org/10.1016/j.tim.2005.05.009

Quantitative shotgun proteomics of enriched heterocysts from Nostoc sp. PCC 7120 using 8-plex isobaric peptide tags. Journal of Proteome Research 7:1615–1628. https://doi.org/10.1021/pr700604v

Fragmentation modes and the evolution of life cycles. PLOS Computational Biology 13:e1005860. https://doi.org/10.1371/journal.pcbi.1005860

Differences in cell division rates drive the evolution of terminal differentiation in microbes. PLOS Computational Biology 8:e1002468. https://doi.org/10.1371/journal.pcbi.1002468

The evolutionary path to terminal differentiation and division of labor in cyanobacteria. Journal of Theoretical Biology 262:23–34. https://doi.org/10.1016/j.jtbi.2009.09.009

Distributions of reproductive and somatic cell numbers in diverse Volvox (Chlorophyta) species. Evolutionary Ecology Research 14:707.

A general allometric and life-history model for cellular differentiation in the transition to multicellularity. The American Naturalist 181:369–380. https://doi.org/10.1086/669151

Book: Introduction to the Algae: Structure and Reproduction. Prentice-Hall, Incorporated.
Decision letter

Aleksandra M Walczak, Senior and Reviewing Editor; École Normale Supérieure, France

E Yagmur Erten, Reviewer; University of Zurich, Switzerland

Guy Cooper, Reviewer; St. John's College, United Kingdom

George Constable, Reviewer
Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.
Acceptance summary:
The paper proposes a model which studies an often-neglected aspect of cellular differentiation and division of labour. While the model is relatively simple, the premise and the findings are thought-provoking, and this study can potentially provide the groundwork for further investigation.
Decision letter after peer review:
Thank you for submitting your article "Evolution of irreversible somatic differentiation" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Aleksandra Walczak as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: E. Yagmur Erten (Reviewer #1); Guy Cooper (Reviewer #2); George Constable (Reviewer #3).
The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.
Essential revisions:
All reviewers found value in your work, stressed the simplicity and elegance of your model, and appreciated the insights and intuitions it provides. This elegance, however, comes at a cost. Your model relies on key assumptions, namely that cell divisions within an individual are synchronous and that there are no within-host fitness differences between the different cell types. Reviewers questioned the biological relevance of these assumptions, and we think it would be profitable to investigate further their impact on the model's results.
The discussion among reviewers also highlighted that the presentation of the model lacks details, and would need to be more precise and formalized. Reviewer #3, in particular, provides specific suggestions for the formalization of the model and leads for its analysis. I would like to encourage you to try and analyse the model, and at least to provide in the paper all elements necessary to fully understand it (in the form of equations, but also links to code). Notation should also be clarified, especially for the difference between time and cell generations, and add information about time in the notation (especially in the definition of c). The results are appealing and make intuitive sense, but careful readers need to be able to fully understand the model, and this is not the case with the current presentation of the manuscript. As the model is clarified, new questions may arise, which is why I am not making a recommendation for acceptance at this stage.
I encourage the authors to address all the comments made by the reviewers.
Reviewer #1 (Recommendations for the authors):
1. Lines 64-67: In what way does the current study (model, assumptions etc.) differ from Cooper and West (2018), such that irreversible somatic differentiation is observed in this study but not in Cooper and West (2018)?
2. Lines 80-81: It is unclear at this point through what mechanism somatic cells accelerate growth. Do the organisms grow faster because somatic cells themselves divide at a faster rate, so having more of them means shorter development time? Or do the somatic cells contribute to overall resources available to all cells and every cell (including germ-role ones) divides faster? It becomes clearer later on and I think in their particular model it would not make a difference. But it would help to at least indicate that more explanation will come later.
3. Lines 125-128: The authors use a functional form (Equation 2) to determine soma cells' contribution to the growth rate. As their results depend on the shape of this function, I am wondering if there are empirical studies that support one type of form or the other. For instance, under what conditions would soma cells work better alone (Line 128)? In other words, which of these functional forms are we more likely to encounter in nature? This is later discussed to some extent, but references to the relevant literature (e.g. other models) could be useful in the Methods section as well, if a reader wanted to check other related approaches.
4. The authors refer to Appendix 3 for the first time at line 177, whereas while reading the results up to this point, I kept wondering what the fractions of the other strategies (RSD and NSD) were. In case adding the figures for RSD and NSD to the main text distracts from the main message, I think at least mentioning that they are at Appendix 3 much earlier in the Results section would help the readers.
5. Line 565: Here the authors say that large b favours ISD and a very large one promotes RSD, whereas in the main text they say "neither extremely large, nor extremely small" b favours ISD (Lines 208–209), which I found somewhat inconsistent.
6. It is not clear to me why the evolution of irreversible somatic differentiation requires a large enough organismal size. Also, in the main text, the authors do not mention what instead evolves in smaller organisms (RSD or NSD? This is later found in Appendix 3, but is not referred to or discussed in the main text). The authors later link their results about body size to some empirical examples in the Discussion section, but again, they do not discuss what might underlie these empirical observations or their findings about body size.
7. The second paragraph of the Discussion seems out of place as it is. I also cannot follow the logic; why do these cell numbers indicate organismal synchronicity? And what about cell death?
Reviewer #2 (Recommendations for the authors):
I like the model, it is simple and easy to interpret, providing predictions that make sense. However, it is not as general a model as the discussion implies in some cases. The predictions of the model are likely to depend on modelling assumptions that may be unrealistic in different systems, including the examples often cited in the paper.
My biggest request is that I would like more of a discussion of the limits that arise due to these assumptions. In particular, to what extent are the predictions contingent on the fact that soma provide benefits continuously as the group grows? This is not the case for many of the systems cited in the work, such as in the Volvocine algae and in fruiting body formations such as in Dictyostelium. Furthermore, one could also imagine that differentiation probabilities are density dependent, or that germ cell fecundity depends on the number of soma cells in the last generation. I suspect that predictions 2 and 3 would not necessarily hold in these scenarios, which could explain for instance why many Volvocine species have a very large number of somatic cells. Acknowledging and discussing exactly how the predictions hinge on these assumptions would make the analysis much stronger.
Secondly, I think some definitions could be clearer in the introduction. For instance, if soma do not replicate at all, does it even make sense to speak of irreversible soma vs reversible soma? Many of the models cited have sterile soma that do not replicate (most Michod models, and Cooper and West model at least). Similarly, what if separation between germ and soma only occurs in one generation of the group life cycle? What does the distinction between irreversible vs reversible soma mean in this case? Is irreversible soma just the same as soma sterility? How does all of this compare to the germline sequestration question, which readers may be more familiar with? These distinctions could be much clearer, which would help to set up the key question of the paper and make its scope more obvious.
Finally, I think some aspects of the presentation of the results could be improved. I found Figure 2A in isolation difficult to fully interpret. There are three outcomes in this model, ISD, RSD, and NSD, and the frequency of each outcome is only shown in Appendix 3. I would suggest including the frequency of the two other strategies in the main text. The same applies to Figure 4. You can't infer from just looking at the frequency of ISD alone to what extent the patterns are driven by irreversible soma being favoured over reversible soma vs no soma being favoured at all.
Reviewer #3 (Recommendations for the authors):
I very much enjoyed this paper, and only have a few suggestions with respect to the model.
I think potential conceptual limitations of the model lie in the assumptions of synchronous cell division and constant development strategy.
It may be possible to address the first of these issues (and thus the initial concern of Dr. Walczak) with some illustrative supplementary simulations (e.g. preliminary results to demonstrate the extent to which maturation time is affected by such asynchronicity). These might even take the form of some simple continuous time ODE models.
However the second of these issues would be a highly difficult task, and lies well outside the scope of the current paper. While exploring this question might certainly serve as a nice extension to the current work, I would not expect the authors to tackle this in the current context, where it would merely muddle the story presented.
Finally, while I like the model in general, there are some points of clarification I think could be made. Although I feel I have understood the core elements, there are some points of ambiguity where it is possible that I may be mistaken, and ironing out these potential misconceptions in the appendices would be beneficial for readers.
As I understand it:
The fraction of soma and germ cells in an organism are given by g(t) and s(t) in Equation 7, with s(t)=x in the main text (see Equation 2 – this should be made consistent?).
Note that 't' here refers to the generation t=1,2,…,n
These dynamics are independent of the costs of differentiation, c_{g} and c_{s}.
However, the division time for cells in the organism during growth is dependent on these costs (see 't' in Equation 1 – note that 't' here is the continuous doubling time, which has an inconsistent notation with Equations 4–8).
Writing T_{gen}(t) for the doubling time at generation t, we have
T_{gen}(t)=F_{diff} * F_{comp}
= (1 + <c>) * (1 – b + b ((x_{1} – s(t)) / (x_{1} – x_{0})))^{\alpha}
(when x_{0}<s(t)<x_{1} – for simplicity I won't write out the other conditions)
At this point I'm not 100% sure how <c> is defined. I'd assume the following:
<c> = 1+s(t)*c_{s}*(s_{gs} + 2 s_{gg})+g(t)*c_{g}*(g_{gs} + 2 g_{ss})
(Is this correct?)
At this point I have T_{gen}(t) as a function of t (having substituted for g(t) and s(t) from Equation 7). This allows me to write the maturation time (time to reach a size 2^{n}) as
T_{mat} = \sum_{t=1}^{n} T_{gen}(t)
Finally, the fitness of an organism with a particular developmental strategy is given by the rate of gamete production (i.e. the number of gametes at maturity divided by the time taken to reach maturity)
W = g(n) / T_{mat}
Working out the evolutionary optimal strategy is then a matter of maximising W with respect to s_{gs}, s_{ss}, g_{gs} and g_{ss}.
Is this all correct?
If so, it may be possible to make analytical progress on this problem by replacing the discontinuous function in Equation 1 with a continuous approximation, e.g.
1 – (1 – b) x^{\beta} / (x^{\beta} + (1 – x)^{\beta})
I mainly mention this latter point as a potential area of future investigation. However with respect to the model details, I would recommend the authors clarify the points above in one of the appendices.
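The quantities summarised in this review (T_{gen}, T_{mat}, W and the smooth replacement for the composition effect) can be sketched in code as follows. This is a minimal mean-field illustration, not the authors' implementation, and it rests on several assumptions that are not confirmed here: the expected soma fraction is updated as s(t+1) = s(t)(s_{ss} + s_{gs}/2) + g(t)(g_{ss} + g_{gs}/2); the composition effect equals 1 below x_{0} and 1 – b above x_{1}, with the exponent α applied to the ratio so that the function is continuous; the differentiation factor is 1 plus the expected cost; and the composition before each division round sets that round's doubling time. All function names are hypothetical.

```python
def f_comp_piecewise(s, b, x0, x1, alpha):
    """Composition effect on doubling time (assumed piecewise form)."""
    if s <= x0:
        return 1.0                      # too little soma: no benefit
    if s >= x1:
        return 1.0 - b                  # benefit saturates at 1 - b
    return 1.0 - b + b * ((x1 - s) / (x1 - x0)) ** alpha


def f_comp_smooth(s, b, beta):
    """Smooth form exactly as written in the review, for 0 <= s <= 1."""
    num = s ** beta
    return 1.0 - (1.0 - b) * num / (num + (1.0 - s) ** beta)


def mean_field_fitness(germ, soma, n, c_g, c_s, f_comp):
    """Return W = g(n) / T_mat using expected (mean-field) compositions.

    germ = (g_gg, g_gs, g_ss): outcome probabilities of a germ-role division.
    soma = (s_gg, s_gs, s_ss): outcome probabilities of a soma-role division.
    f_comp: callable mapping the soma fraction to the composition effect.
    """
    g_gg, g_gs, g_ss = germ
    s_gg, s_gs, s_ss = soma
    g, s = 1.0, 0.0                     # assumed start: one germ-role cell
    t_mat = 0.0
    for _ in range(n):
        # expected differentiation cost of this round (assumed form)
        cost = s * c_s * (s_gs + 2.0 * s_gg) + g * c_g * (g_gs + 2.0 * g_ss)
        t_mat += (1.0 + cost) * f_comp(s)
        # expected soma fraction after the division round
        s = s * (s_ss + 0.5 * s_gs) + g * (g_ss + 0.5 * g_gs)
        g = 1.0 - s
    return g / t_mat


if __name__ == "__main__":
    effect = lambda s: f_comp_piecewise(s, b=0.5, x0=0.1, x1=0.3, alpha=1.0)
    no_soma = mean_field_fitness((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), 10, 1.0, 1.0, effect)
    irreversible = mean_field_fitness((0.9, 0.1, 0.0), (0.0, 0.0, 1.0), 10, 1.0, 1.0, effect)
    print(no_soma, irreversible)
```

Passing the composition effect as a callable lets the piecewise and smooth forms be swapped without touching the rest of the calculation. Evaluated as written, the smooth expression equals 1 at x = 0 and b at x = 1, so its upper endpoint coincides with the piecewise value 1 – b only when b = 1/2.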
[Editors' note: further revisions were suggested prior to acceptance, as described below.]
Thank you for submitting your article "Evolution of irreversible somatic differentiation" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by Aleksandra Walczak as the Senior and Reviewing Editor. The following individuals involved in review of your submission have agreed to reveal their identity: E. Yagmur Erten (Reviewer #1); Guy Cooper (Reviewer #2); George Constable (Reviewer #3).
The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.
Essential Revisions:
All reviewers note that the paper has greatly improved. However they still would like to see some small changes made to improve clarity (no new research is required). Here is a summary but please go through the reviews attached below for concrete points:
– Clarify the details of the new models, as suggested by reviewer 1;
– Define soma and germ cells clearly, as noted by reviewer 2;
– Provide more discussion on the assumption of fixed strategies, as noted by reviewer 2;
– Add clarifications as suggested by reviewer 3.
Reviewer #1 (Recommendations for the authors):
I thank the authors for their responses and clarifications, as well as for extending their model to include risky differentiation and asynchronous cell division. I very much enjoyed reading the revised version of their manuscript, but I have the following questions about the new models they included.
Risky cell differentiation (Appendix 6): I don't exactly follow why the authors multiply the frequency and not the number of cell differentiations with per differentiation death risk. Maybe I am misunderstanding something, but isn't this implicitly assuming that deadly differentiation errors, if they happen in larger organisms, will have less probability to have an impact than in smaller organisms? Or in other words, the effect of deadly differentiation will be larger in smaller organisms? If true, this assumption might still be realistic, e.g. one deadly cell within 32 cells compared to 1024 can plausibly have a larger organismal level effect. But one can also argue that if one cell becomes a mutant and acquires a growth advantage, the overall size of the organism might not matter, especially e.g. if the mutant occurs early in the development or has a very high cell proliferation/resource uptake rate. Although this might not change the results in the main text qualitatively, as there the authors use one maturity size (2^{10}) in their calculations.
Cell division asynchrony (Appendix 7): seeing the results of the cell division asynchrony, it seems like synchrony is almost a necessary condition for the evolution of irreversible differentiation in their model, just like those conditions summarized in Lines 335–339 albeit perhaps not as strict as them (since, although rarely, irreversible strategies still evolve). Perhaps it should be acknowledged as such earlier and more explicitly, rather than at the end of Discussion? Particularly given the fact that the authors looked at one of the most favourable conditions for irreversible strategies (c_{s→g} >> c_{g→s}) and found that the evolution of irreversible differentiation is very rare.
Reviewer #2 (Recommendations for the authors):
I think this model and analyses are very good and so I won't comment too much on these (with the exception of a thought on the asynchrony model). I think a few things need to be clarified but I am otherwise happy to endorse the paper for publication.
My main comments will concern the introduction and framing of this study as I think that there are still things that need to be made clear, which will really help the reader right from the start.
I think some very clear definitions of different terms are needed and as early as possible in the introduction. From what I can gather across different sections of the introduction: the soma cells are those that contribute to vegetative functions (sustaining the overall organism) but cannot act as a seed/propagule/spore for founding of a new organism. In contrast, germ cells are those that do not contribute to vegetative functions but can act as such a spore. The authors distinguish between terminally differentiated soma that do not divide such as in cyanobacteria and non-terminally differentiated soma that can divide such as in the Volvocales. They then ask the question in what conditions do the latter kind of soma (non-terminally differentiated) become irreversibly differentiated (that is when they can only divide and produce more soma). If the above is correct, I would suggest making these definitions and distinctions clearer and more localised rather than having bits of each definition spread across the introduction in a way that needs piecing together (perhaps a glossary type table could also help?)
If the above is correct, then the definitions need to be applied more consistently throughout the manuscript. For instance, the "somatic" cells that exist during the growth of the "higher" Volvocales do not qualify as somatic cells per the author's definition as they do not contribute to vegetative functions. In this case only the last generation of flagella beaters are "soma", none of which divide and so the distinction between reversibly and irreversibly differentiated does not apply here. The authors have added a paragraph about this in the discussion but they lean on the Volvocales so much in the introduction and discussion of their work that this mismatch needs to be flagged much sooner in the paper.
A similar issue applies to the discussion of Cooper and West 2018. Much as I would like to pretend that this paper could potentially cover all possibilities, group growth is not explicitly modelled here and so the distinction between reversible and irreversibly differentiated soma does not apply here (one can imagine that in this model there are no cooperative interactions as the group grows and that division of labour then may occur but only in the last generation of the group life cycle before spore dispersal much like for the Volvocales). If non-sterile helpers count as soma then this might be a different issue as sterile cells may be considered irreversible soma and non-sterile helpers as reversible soma, but then these non-sterile helpers can "seed" the next generation so I don't think they really qualify as soma per the author's definition? Having clearer definitions will help resolve these confusions.
Otherwise, I feel that the authors have sidestepped the potential impact of non-static traits in their model by saying in their response to reviewers that they have plans for a future paper on this. That is great and I am very much looking forward to what they find but this issue still needs to be discussed in this paper as many of their results here could be explained as arising directly from the assumption of static strategies as the group grows. I would suggest mentioning this at least once or twice as they go through the results (around lines 238–242 would be good) and then a whole paragraph on this in the discussion is warranted (can also mention plans for future work on this here). For instance, the need for just a few somatic cells that provide large benefits seems to arise directly from the fact that germ cells can't modulate the number of soma cells they spawn once these become too numerous, or that they can't have a time-based strategy that produces many soma earlier but fewer later as the group grows.
I think the results they have found in the asynchronous model are really good but need more explanation/discussion of why ISD can't seem to work here. They have modelled a time to replication cost as arising from the different differentiation costs. I find it strange in that case that RSD is not the worst affected strategy as the authors have established that this is the strategy with the most differentiation. I otherwise would have thought that having soma cells that divert their energy to vegetative functions as the slower replicator might have been a natural way to introduce asynchrony.
Finally, I think a word of caution is needed on the discussion of "convex" shapes and how this favours division of labour/terminal differentiation/irreversible differentiation (lines 323–332). In several of the models cited (if not all), the convexity at issue is the relationship between an individual's investment in a public good/vegetative function and the fitness return to the group. In the authors' paper, the convexity is between the number/proportion of "helpers/soma" in the group and the fitness return to the group. These are very different things (one has to do with synergy from internal efficiencies whereas the other comes from synergies from between-individual interactions) and so should not be treated as the same prediction.
Reviewer #3 (Recommendations for the authors):
The authors have made substantial changes to the manuscript that appear to have addressed many of the concerns of the reviewers.
In their response, the authors clarified some details of the model, and I now feel I have a better understanding as to how it works. However that has led to another couple of small suggestions on my part that I believe would help readers.
In my original review I stated:
"Note that 't' here refers to the generation t=1,2,…,n
…
However, the division time for cells in the organism during growth is dependent on these costs (see 't' in Equation 1 – note that 't' here is the continuous doubling time, which has an inconsistent notation with Equations 4–8)"
I now see that under the costless differentiation assumed in Appendix 2, t becomes an integer which helps simplify the subsequent analysis. It's worthwhile to make a note of this fact (before the sentence "Then, the expected fractions.…" would be an obvious potential place to mention this).
The authors' response also makes clear at multiple points that cell divisions are stochastic:
"Since the differentiation program is stochastic, the costs of differentiation depend on the actual number of differentiation events happened in the course of growth, rather than probabilities like g_{gs}.",
"Since the differentiation strategy is stochastic, the time to reach maturity (T_{mat}) and the number of offspring at the last stage (g(n)) are random values, which we sample by repeatedly simulating the process of growth."
"However, since the outcomes of cell divisions are stochastic, the sampling of developmental trajectories has to reflect that and in our case it is done numerically."
I understand this. My comments, which I may not have articulated clearly in my initial review, were more aimed at asking how much understanding could be gained from alternatively taking a mean field approach. Indeed, this is precisely the approach the authors themselves take in Appendix 2, where they "consider the mathematical expectation of the composition". This leads to the obvious question – why can't a similar approach be used when differentiation is not costless?
Of course, I completely understand that stochasticity could be very important in a model such as this (where initial cell numbers are low), and it may be that such a mean-field approach leads to misleading results with respect to the prediction of the mean population growth rate. If this is the case, I think the authors should make a statement of this fact somewhere, perhaps with a reference to results in Gao et al., 2019 with respect to the differences between mean field and stochastic predictions.
Otherwise the authors have done a good job of clarifying my questions and addressing my concerns.
https://doi.org/10.7554/eLife.66711.sa1
Author response
Essential revisions:
All reviewers found value in your work, stressed the simplicity and elegance of your model, and appreciated the insights and intuitions it provides. This elegance, however, comes at a cost. Your model relies on a key assumption, namely that cell divisions within an individual are synchronous, and that there does not seem to be within-host fitness differences between the different cell types. Reviewers questioned the biological relevance of the assumption, and we think it would be profitable to investigate further its impact on the model's results.
The discussion among reviewers also highlighted that the presentation of the model lacks details, and would need to be more precise and formalized. Reviewer #3, in particular, provides specific suggestions for the formalization of the model and leads for its analysis. I would like to encourage you to try and analyse the model, and at least to provide in the paper all elements necessary to fully understand it (in the form of equations, but also links to code). Notation should also be clarified, especially for the difference between time and cell generations, and add information about time in the notation (especially in the definition of c). The results are appealing and make intuitive sense, but careful readers need to be able to fully understand the model, and this is not the case with the current presentation of the manuscript. As the model is clarified, new questions may arise, which is why I am not making a recommendation for acceptance at this stage.
I encourage the authors to address all the comments made by the reviewers.
Thank you and the reviewers for such well-considered and constructive reviews. In the revised version of the manuscript, we additionally test our findings with models featuring risky differentiation (inspired by cancer) and asynchronous cell division. We also clarified the model presentation. Below, next to the reviewers’ comments, we describe the specific changes in detail.
Reviewer #1 (Recommendations for the authors):
1. Lines 64–67: In what way does the current study (model, assumptions etc.) differ from Cooper and West (2018), such that irreversible somatic differentiation is observed in this study but not in Cooper and West (2018)?
We discuss this in the updated manuscript. In principle, the ingredients we consider to be necessary for irreversible somatic differentiation are also included in that study, but the model setup is very different. For example, the sterile cells in Cooper and West do not divide further.
2. Lines 80–81: It is unclear at this point through what mechanism somatic cells accelerate growth. Do the organisms grow faster because somatic cells themselves divide at a faster rate, so having more of them means shorter development time? Or do the somatic cells contribute to overall resources available to all cells and every cell (including germ-role ones) divides faster? It becomes clearer later on and I think in their particular model it would not make a difference. But it would help to at least indicate that more explanation will come later.
Thank you for highlighting this issue and the others below. We reworked the presentation of the model to make it clear. Regarding the specific question here, having soma-role cells allows for higher resource uptake by the organism, so having more soma-role cells generally results in a shorter development time. Since our main model is based on synchronous division, we do not allow somatic cells alone to divide faster.
3. Lines 125–128: The authors use a functional form (Equation 2) to determine soma cells' contribution to the growth rate. As their results depend on the shape of this function, I am wondering if there are empirical studies that support one type of form or the other. For instance, under what conditions would soma cells work better alone (Line 128)? In other words, which of these functional forms are we more likely to encounter in nature? This is later discussed to some extent, but references to the relevant literature (e.g. other models) could be useful in the Methods section as well, if a reader wanted to check other related approaches.
The choice of the functional form is motivated by its flexibility – with the right combination of parameters, diverse scenarios can be covered: linear, concave, convex, a step threshold and so on. For instance, a prominent theoretical finding is that germ-soma differentiation arises when the trade-off between viability and fertility is convex. Our functional form covers this set of models as a special case with x_{0} = 0 and x_{1} = 1. In the revised manuscript, we also discuss equivalent expressions used by other models.
Our statement of “soma cells working better alone” implied that there is no synergy between soma-role cells and that increasing the number of soma-role cells brings diminishing benefits. We now see that this wording causes confusion and have rewritten that sentence.
4. The authors refer to Appendix 3 for the first time at line 177, whereas while reading the results up to this point, I kept wondering what the fractions of the other strategies (RSD and NSD) were. In case adding the figures for RSD and NSD to the main text distracts from the main message, I think at least mentioning that they are at Appendix 3 much earlier in the Results section would help the readers.
We also find the figures for RSD and NSD very insightful but had decided to place them in the appendix to keep the paper focused on ISD. We have now gladly moved Figure 1 from Appendix 3 to the main text.
5. Line 565: Here the authors say that large b favours ISD and a very large one promotes RSD, whereas in the main text they say "neither extremely large, nor extremely small" b favours ISD (Lines 208–209), which I found somewhat inconsistent.
Thank you for highlighting this. All fixed.
6. It is not clear to me why the evolution of irreversible somatic differentiation requires a large enough organismal size. Also, in the main text, the authors do not mention what instead evolves in smaller organisms (RSD or NSD? This is later found in Appendix 3, but is not referred to or discussed in the main text). The authors later link their results about body size to some empirical examples in the Discussion section, but again, they do not discuss what might underlie these empirical observations or their findings about body size.
Thank you for pointing out the lack of discussion on this topic. In the revised manuscript, we discuss what causes the limitation on organism size. In a nutshell, it is related to the monotonic increase in the fraction of soma cells in irreversible differentiation strategies and the decreased differentiation costs as the organism size increases.
7. The second paragraph of the Discussion seems out of place as it is. I also cannot follow the logic; why do these cell numbers indicate organismal synchronicity? And what about cell death?
There, we list two organisms, one with 32 and one with 128 cells. Both numbers are powers of two. If cell divisions are synchronized at the level of the organism, then at any moment the number of cells should be a power of two (cell death is not reported for these species). By contrast, in a bacterial colony founded by a single cell, cell divisions eventually desynchronize – at 32 cells this effect is already notable and by 128 cells it is apparent. Thus, we suggest that some organism-level events synchronize cell division in the listed Volvocales. We polished this paragraph to make it clearer and better integrated into the discussion.
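The power-of-two argument can be illustrated with a small simulation. The sketch below is only an illustration under assumptions that do not come from the manuscript: every cell divides one time unit after its birth, plus a uniform random jitter, and no cell dies. The function names and the jitter value are hypothetical.

```python
import random


def is_power_of_two(k):
    """True if k is 1, 2, 4, 8, ..."""
    return k > 0 and (k & (k - 1)) == 0


def division_times(generations, jitter, seed=0):
    """Sorted times of all divisions in a lineage grown from a single cell.

    Each cell divides 1 + U(-jitter, jitter) time units after its birth.
    Only the first `generations` division rounds are generated, so counts
    are valid for times below (generations + 1) * (1 - jitter).
    """
    rng = random.Random(seed)
    births = [0.0]
    events = []
    for _ in range(generations):
        next_births = []
        for born in births:
            t_div = born + 1.0 + rng.uniform(-jitter, jitter)
            events.append(t_div)
            next_births.extend([t_div, t_div])   # two daughters per division
        births = next_births
    return sorted(events)


def size_at(time, events):
    """Number of cells at `time`: one founder plus one cell per past division."""
    return 1 + sum(1 for t in events if t <= time)


if __name__ == "__main__":
    synchronous = division_times(generations=8, jitter=0.0)
    jittered = division_times(generations=8, jitter=0.1)
    for k in range(8):
        t = k + 0.5
        print(t, size_at(t, synchronous), size_at(t, jittered))
```

With zero jitter, the count observed between division rounds is always a power of two. With jitter, the early rounds still look synchronous, because the spread of division times is small compared with the interval between rounds, while later rounds overlap and intermediate cell numbers can be observed.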
Reviewer #2 (Recommendations for the authors):
I like the model, it is simple and easy to interpret, providing predictions that make sense. However, it is not as general a model as the discussion implies in some cases. The predictions of the model are likely to depend on modelling assumptions that may be unrealistic in different systems, including the examples often cited in the paper.
My biggest request is that I would like more of a discussion of the limits that arise due to these assumptions. In particular, to what extent are the predictions contingent on the fact that soma provide benefits continuously as the group grows? This is not the case for many of the systems cited in the work, such as in the Volvocine algae and in fruiting body formations such as in Dictyostelium. Furthermore, one could also imagine that differentiation probabilities are density dependent, or that germ cell fecundity depends on the number of soma cells in the last generation. I suspect that predictions 2 and 3 would not necessarily hold in these scenarios, which could explain for instance why many Volvocine species have a very large number of somatic cells. Acknowledging and discussing exactly how the predictions hinge on these assumptions would make the analysis much stronger.
Thank you for raising this important topic. In the revised manuscript, we extend the discussion of our assumptions. We agree that our model is simplified compared with the processes occurring in nature.
We assume that the soma-role cells provide benefits to the group growth during the whole ontogenesis. This may not accurately represent more complex species like Volvox, where daughter colonies grow within the maternal organism. Taking this into account can relax our last finding: “soma-role cells should bring benefits even in small numbers”, as only the final composition of the organism will matter in this case. However, a life cycle in which juvenile organisms are protected and nurtured by the maternal body represents quite an advanced degree of evolved complexity, which may not be directly relevant to the earliest stage of the evolution of multicellularity.
Secondly, I think some definitions could be clearer in the introduction. For instance, if soma do not replicate at all, does it even make sense to speak of irreversible soma vs reversible soma? Many of the models cited have sterile soma that do not replicate (most Michod models, and Cooper and West model at least). Similarly, what if separation between germ and soma only occurs in one generation of the group life cycle? What does the distinction between irreversible vs reversible soma mean in this case? Is irreversible soma just the same as soma sterility? How does all of this compare to the germline sequestration question, which readers may be more familiar with? These distinctions could be much clearer, which would help to set up the key question of the paper and make its scope more obvious.
Thank you for highlighting this oversight. Indeed, the range of cell specializations that are called “soma cells” is very wide and we are looking at a particular type of these. By soma-role cells, we mean cells that divide in the course of organism growth (unlike sterile/terminally differentiated cells) but do not contribute to organism reproduction (unlike temporal division of labor).
We also do not consider the sequestration of the germ line and one-off separation between germ and soma, as these require considering a dynamical differentiation program, in which probabilities of differentiation change with time. We even considered such a model at the conceptualization stage but found an efficient but degenerate differentiation program – turn all cells into soma-role at the beginning of the life cycle and keep them so until the end, when all cells turn to germ-role. This way, development is extremely fast and the number of offspring produced is high as well. Obviously, this means that the space of dynamic differentiation programs is constrained, but investigating this space and its constraints is too far from the scope of the current work. We are planning to investigate this topic in later work.
Finally, I think some aspects of the presentation of the results could be improved. I found Figure 2A in isolation difficult to fully interpret. There are three outcomes in this model, ISD, RSD, and NSD, and the frequency of each outcome is only shown in Appendix 3. I would suggest including the frequency of the two other strategies in the main text. The same applies to Figure 4. You can't infer from just looking at the frequency of ISD alone to what extent the patterns are driven by irreversible soma being favoured over reversible soma vs no soma being favoured at all.
Thank you for the suggestion. Figure 1 from appendix 3 is elevated to the main text.
Reviewer #3 (Recommendations for the authors):
I very much enjoyed this paper, and only have a few suggestions with respect to the model.
I think potential conceptual limitations of the model lie in the assumptions of synchronous cell division and constant development strategy.
It may be possible to address the first of these issues (and thus the initial concern of Dr. Walczak) with some illustrative supplementary simulations (e.g. preliminary results to demonstrate the extent to which maturation time is affected by such asynchronicity). These might even take the form of some simple continuous time ODE models.
However the second of these issues would be a highly difficult task, and lies well outside the scope of the current paper. While exploring this question might certainly serve as a nice extension to the current work, I would not expect the authors to tackle this in the current context, where it would merely muddle the story presented.
Finally, while I like the model in general, there are some points of clarification I think could be made. Although I feel I have understood the core elements, there are some points of ambiguity where it is possible that I may be mistaken, and ironing out these potential misconceptions in the appendices would be beneficial for readers.
Thank you for this feedback. With the help of your presentation, we rewrote the model section in the updated version of the manuscript. Below, we clarify the model steps.
As I understand it:
The fraction of soma and germ cells in an organism are given by g(t) and s(t) in Equation 7, with s(t)=x in the main text (see Equation 2 – this should be made consistent?).
Note that 't' here refers to the generation t=1,2,…,n
These dynamics are independent of the costs of differentiation, c_{g} and c_{s}.
However, the division time for cells in the organism during growth is dependent on these costs (see 't' in Equation 1 – note that 't' here is the continuous doubling time, which has an inconsistent notation with Equations 4–8).
Writing T_{gen}(t) for the doubling time at generation t, we have
T_{gen}(t)=F_{diff} * F_{comp}
= (1 + <c>) * (1 – b + b ((x_{1} – s(t))/(x_{1} – x_{0})))^{α}
(when x_{0}<s(t)<x_{1} – for simplicity I won't write out the other conditions)
Up to this point, all is correct.
At this point I'm not 100% sure how <c> is defined. I'd assume the following: <c> = 1+s(t)*c_{s}*(s_{gs} + 2 s_{gg})+g(t)*c_{g}*(g_{gs} + 2 g_{ss})
(Is this correct?)
Not exactly. Since the differentiation program is stochastic, the costs of differentiation depend on the actual number of differentiation events that happened in the course of growth, rather than on probabilities like g_{gs}. In each simulation, at each cell-generation step, we sample from a multinomial distribution how many differentiation events (i.e. an offspring cell is of a different type than its parent) will occur. Each event incurs a cost of c_{s} or c_{g}. The factor <c> used to compute the generation time is the average of these costs: the ratio between the cumulative cost and the total number of cells at the beginning of the simulation step.
This way, the costs and the organism composition will differ among different runs of the same simulation. Hence, for each combination of control parameters, we run 300 independent realizations of organism growth and compute the expected population growth from this whole dataset (see below).
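The sampling step described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `division_round`, the outcome labels, and the assignment of one cost to each germ-to-soma daughter and another to each soma-to-germ daughter are assumptions made for illustration. Each parent cell independently draws one of three division outcomes, which amounts to a multinomial draw over the cells of each type.

```python
import random
from collections import Counter

def division_round(n_germ, n_soma, germ_probs, soma_probs,
                   cost_germ_to_soma, cost_soma_to_germ, rng):
    """One synchronous division round (illustrative sketch, assumed names).

    germ_probs / soma_probs: probabilities that a parent of that type yields
    two germ-role daughters ("gg"), one of each ("gs"), or two soma-role ("ss").
    Returns (new_germ, new_soma, mean_cost), where mean_cost is the cumulative
    differentiation cost divided by the number of cells before the round.
    """
    outcomes = ("gg", "gs", "ss")
    # Multinomial draw: every parent picks one outcome independently.
    germ_out = Counter(rng.choices(outcomes, weights=germ_probs, k=n_germ))
    soma_out = Counter(rng.choices(outcomes, weights=soma_probs, k=n_soma))

    # A differentiation event is a daughter whose type differs from its parent.
    germ_to_soma = germ_out["gs"] + 2 * germ_out["ss"]
    soma_to_germ = soma_out["gs"] + 2 * soma_out["gg"]

    new_germ = (2 * germ_out["gg"] + germ_out["gs"]
                + 2 * soma_out["gg"] + soma_out["gs"])
    new_soma = 2 * (n_germ + n_soma) - new_germ

    total_cost = (germ_to_soma * cost_germ_to_soma
                  + soma_to_germ * cost_soma_to_germ)
    mean_cost = total_cost / (n_germ + n_soma)
    return new_germ, new_soma, mean_cost

# Example call with arbitrary, hypothetical parameter values.
rng = random.Random(1)
print(division_round(8, 0, (0.8, 0.15, 0.05), (0.0, 0.0, 1.0), 1.0, 10.0, rng))
```

In a full simulation, the returned mean cost would enter the doubling time through the factor (1 + <c>), multiplied by F_{comp}. Because the draws differ between runs, repeating the growth process many times yields a sample of trajectories rather than a single value.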
At this point I have T_{gen}(t) as a function of t (having substituted for g(t) and s(t) from Equation 7). This allows me to write the maturation time (time to reach a size 2^{n}) as
T_{mat} = \sum_{t=1}^{n} T_{gen}(t)
T_{mat} is a sum of individual cell doubling times T_{gen}(t), but it is a random value drawn from a rather complex distribution, not a deterministic function.
Finally, the fitness of an organism with a particular developmental strategy is given by the rate of gamete production (i.e. the number of gametes at maturity divided by the time taken to reach maturity)
W = g(n) / T_{mat}
Since the differentiation strategy is stochastic, the time to reach maturity (T_{mat}) and the number of offspring at the last stage (g(n)) are random values, which we sample by repeatedly simulating the process of growth. In our previous work (Interacting cells driving the evolution of multicellular life cycles, PLoS Computational Biology, 2019), we have shown that a population with stochastic development will grow exponentially with the rate (W) given by the solution of the equation
\sum_{i} P_{i} G_{i} e^{-W T_{i}} = 1
where the sum is over possible trajectories i, P_{i} is the probability of each trajectory, G_{i} is the number of offspring produced at maturity, and T_{i} is the time to maturity. In our work, we solve this equation numerically using a sampled distribution of trajectories. In deterministic scenarios, such as the absence of differentiation, the solution of this equation is
W = log(g(n)) / T_{mat},
which has a similar form as your expression.
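As an illustration of how such an equation can be solved numerically from sampled trajectories, here is a minimal Python sketch. It is an assumption-laden example, not the authors' implementation: the function name `growth_rate`, the use of bisection, and the equal weighting P_{i} = 1/N of N sampled trajectories are choices made here for illustration. The root is bracketed by the per-trajectory rates log(G_{i})/T_{i}, because the left-hand side decreases monotonically in W when all T_{i} are positive.

```python
import math

def growth_rate(offspring, times, tol=1e-12):
    """Solve (1/N) * sum_i G_i * exp(-W * T_i) = 1 for W by bisection.

    offspring[i] (G_i >= 0) and times[i] (T_i > 0) describe one sampled
    trajectory each; every sample gets the same weight 1/N (assumption).
    """
    pairs = list(zip(offspring, times))
    rates = [math.log(g) / t for g, t in pairs if g > 0]
    if not rates:
        raise ValueError("no sampled trajectory produces offspring")

    def excess(w):
        return sum(g * math.exp(-w * t) for g, t in pairs) / len(pairs) - 1.0

    # At the largest per-trajectory rate every term is <= 1, so excess <= 0.
    lo, hi = min(rates), max(rates)
    # Lower the left end until excess >= 0; this is needed when some
    # trajectories leave no offspring (or after floating-point rounding).
    while excess(lo) < 0:
        lo -= 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Deterministic check: identical trajectories should give log(G) / T.
print(growth_rate([1024] * 3, [10.0] * 3), math.log(1024) / 10.0)
```

When all trajectories are identical, the sum collapses to a single term G e^{-W T} = 1, which recovers the deterministic expression quoted above.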
Working out the evolutionary optimal strategy is then a matter of maximising W with respect to s_{gs}, s_{ss}, g_{gs} and g_{ss}.
Is this all correct?
This is correct.
If so, it may be possible to make analytical progress on this problem by replacing the discontinuous function in Equation 1 with a continuous approximation, e.g.
1 – (1 – b) x^{β} / (x^{β} + (1 – x)^{β})
Thank you for the suggestion. However, since the outcomes of cell divisions are stochastic, the sampling of developmental trajectories has to reflect that, and in our case it is done numerically. Unfortunately, replacing the F_{comp} function with a continuous approximation does not lead to deeper analytical insights.
I mainly mention this latter point as a potential area of future investigation. However with respect to the model details, I would recommend the authors clarify the points above in one of the appendices.
Thank you again for your careful consideration of our model. We now see that our initial presentation was confusing. In the updated text, we have extended the presentation of the model.
[Editors' note: further revisions were suggested prior to acceptance, as described below.]
Reviewer #1 (Recommendations for the authors):
I thank the authors for their responses and clarifications, as well as for extending their model to include risky differentiation and asynchronous cell division. I very much enjoyed reading the revised version of their manuscript, but I have the following questions about the new models they included.
Risky cell differentiation (Appendix 6): I don't exactly follow why the authors multiply the frequency and not the number of cell differentiations with per differentiation death risk. Maybe I am misunderstanding something, but isn't this implicitly assuming that deadly differentiation errors, if they happen in larger organisms, will have less probability to have an impact than in smaller organisms? Or in other words, the effect of deadly differentiation will be larger in smaller organisms? If true, this assumption might still be realistic, e.g. one deadly cell within 32 cells compared to 1024 can plausibly have a larger organismal level effect. But one can also argue that if one cell becomes a mutant and acquires a growth advantage, the overall size of the organism might not matter, especially e.g. if the mutant occurs early in the development or has a very high cell proliferation/resource uptake rate. Although this might not change the results in the main text qualitatively, as there the authors use one maturity size (2^{10}) in their calculations.
We appreciate your attention to the details of the model design. During the manuscript revision, we discussed at length how to implement the cell differentiation risk. The assumption you suggest, that each individual defective cell brings the same risk to the organism, was also closely considered. However, taking into account the context of our study, we settled on a different design. The key factor in our decision is that the organisms in our model constantly grow in size. As a result, the moment at which a defective cell emerges plays a large role in the defect’s impact. If a defective cell emerges in the last round of cell divisions, it cannot harm the organism much, because the life cycle is about to end. However, a defective cell that emerges in the very first cell division will have a large impact on the organism’s performance during the life cycle. To take this effect into account and make later defects less dangerous, we scaled the number of differentiation events by the organism size, thus attributing the risk to the frequency of differentiated cells.
We agree that a growth advantage of defective (defecting) cells may also play a role here. However, such an advantage should have a much smaller impact on ever-growing simple organisms than on large animals. In animals, the division rate of a typical cell is close to zero, hence even a modest growth advantage of a defective cell makes a large difference to the cell dynamics over the span of a long life cycle. In our case, every cell in the organism actively proliferates, while the life cycle is very short. Thus, the growth advantage must be large in order to have an impact. Hence, the emergence of defects leading to a change in relative cell growth must be less likely in our model than in complex animals. On top of that, the rapid propagation of malignant tumors in the human body is often the end result of a silent process of accumulation of multiple mutations, which can take years to complete. In the organisms we study, defective cells simply do not have enough time to turn into proper cancer cells before the life cycle ends. Hence, we are convinced that the growth advantage of defective cells can be ignored in the context of our study.
We acknowledge that this design choice is tailored to simple organisms and may not be optimal for complex animals. In the revised manuscript, we added an exposition of our reasoning to the risky model section in Appendix 6.
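The difference between counting differentiation events and scaling them by organism size can be made concrete with a small Python sketch. This is a hypothetical illustration only: the function `risk_exposure`, the parameter `per_event_risk`, and the choice of 2^j cells before round j (counting from zero) are assumptions, and the way this exposure translates into a death probability is defined in Appendix 6 of the paper and is not reproduced here.

```python
def risk_exposure(events_per_round, per_event_risk, scale_by_size=True):
    """Accumulate differentiation risk over synchronous division rounds.

    events_per_round[j] is the number of differentiation events in round j
    (0-based); the organism is assumed to hold 2**j cells before that round.
    With scale_by_size=True, each event is weighted by 1 / size, i.e. the
    risk follows the frequency of differentiation events, not their count.
    """
    exposure = 0.0
    for j, events in enumerate(events_per_round):
        size = 2 ** j
        weight = 1.0 / size if scale_by_size else 1.0
        exposure += per_event_risk * events * weight
    return exposure

n_rounds = 10
early = [1] + [0] * (n_rounds - 1)   # one event in the first round
late = [0] * (n_rounds - 1) + [1]    # one event in the last round

for label, events in (("early", early), ("late", late)):
    print(label,
          risk_exposure(events, 0.01, scale_by_size=True),
          risk_exposure(events, 0.01, scale_by_size=False))
```

Under the count-based variant the early and late events contribute equally, whereas under frequency scaling the late event is discounted by the organism size at that round, which is the behaviour the response argues for.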
Cell division asynchrony (Appendix 7): seeing the results of the cell division asynchrony, it seems like synchrony is almost a necessary condition for the evolution of irreversible differentiation in their model, just like those conditions summarized in Lines 335–339, albeit perhaps not as strict as them (since, although rarely, irreversible strategies still evolve). Perhaps it should be acknowledged as such earlier and more explicitly, rather than at the end of Discussion? Particularly given the fact that the authors looked at one of the most favourable conditions for irreversible strategies (c_{s→g} >> c_{g→s}) and found that the evolution of irreversible differentiation is very rare.
We agree that the assumption of synchrony should be mentioned as early as possible. We now explicitly state in the Model section that the model and the results are based on synchronous cell division. Asynchronous cell division is explored separately in Appendix 7.
Reviewer #2 (Recommendations for the authors):
I think this model and analyses are very good and so I won't comment too much on these (with the exception of a thought on the asynchrony model). I think a few things need to be clarified but I am otherwise happy to endorse the paper for publication.
My main comments will concern the introduction and framing of this study as I think that there are still things that need to be made clear, which will really help the reader right from the start.
I think some very clear definitions of different terms are needed and as early as possible in the introduction. From what I can gather across different sections of the introduction: the soma cells are those that contribute to vegetative functions (sustaining the overall organism) but cannot act as a seed/propagule/spore for founding of a new organism. In contrast, germ cells are those that do not contribute to vegetative functions but can act as such a spore. The authors distinguish between terminally differentiated soma that do not divide such as in cyanobacteria and nonterminally differentiated soma that can divide such as in the Volvocales. They then ask the question in what conditions do the latter kind of soma (nonterminally differentiated) become irreversibly differentiated (that is when they can only divide and produce more soma). If the above is correct, I would suggest making these definitions and distinctions clearer and more localised rather than having bits of each definition spread across the introduction in a way that needs piecing together (perhaps a glossary type table could also help?)
We adopted the suggestion to clarify the differences between terminal differentiation and irreversible differentiation. We also explained the irreversible differentiation that is of most interest for us further in the Model section.
If the above is correct, then the definitions need to be applied more consistently throughout the manuscript. For instance, the "somatic" cells that exist during the growth of the "higher" Volvocales do not qualify as somatic cells per the author's definition as they do not contribute to vegetative functions. In this case only the last generation of flagella beaters are "soma", none of which divide and so the distinction between reversibly and irreversibly differentiated does not apply here. The authors have added a paragraph about this in the discussion but they lean on the Volvocales so much in the introduction and discussion of their work that this mismatch needs to be flagged much sooner in the paper.
Thank you for your thorough treatment of definitions! We define the germ-role as a state of a cell in which it continues to live after the organism fragments at the end of the life cycle, and the soma-role as a state of a cell in which it dies upon fragmentation. We additionally assume that soma-role cells are capable of providing a benefit to the organism. However, this benefit is separate from the definition of soma-role itself. For example, according to our model, if the fraction of soma-role cells is below the contribution threshold x_{0}, there is no benefit to the organism. Still, these cells are considered soma-role. Hence, in your example, we see no conceptual problem with progenitors of flagella beaters not contributing to vegetative functions, as a contribution of soma-role cells is still possible under the right conditions (here, after the last round of cell divisions) but not mandatory. In the revised manuscript, we make our definitions clearer.
Our attention to Volvocales in the introduction and discussion reflects the overwhelming bias of the empirical literature on the evolution of germ/soma differentiation towards this group (and also cellular slime molds, but their multicellularity is aggregative). Without this group, there is simply little to discuss. Nevertheless, our model is not about Volvocales: we aim to answer the question about the emergence of irreversible somatic differentiation in a broad context, without tailoring it to the features of a single group. As a result, the design of our model does not fully reflect the features of the Volvocales life cycle. We overlooked that our introduction gave the impression that the manuscript is about a single prominent group. In the updated version, we modified the exposition of our work. We also explicitly acknowledge the bias of empirical examples in our discussion.
A similar issue applies to the discussion of Cooper and West 2018. Much as I would like to pretend that this paper could potentially cover all possibilities, group growth is not explicitly modelled here and so the distinction between reversible and irreversibly differentiated soma does not apply here (one can imagine that in this model there are no cooperative interactions as the group grows and that division of labour then may occur but only in the last generation of the group life cycle before spore dispersal much like for the Volvocales). If nonsterile helpers count as soma then this might be a different issue as sterile cells may be considered irreversible soma and nonsterile helpers as reversible soma, but then these nonsterile helpers can "seed" the next generation so I don't think they really qualify as soma per the author's definition? Having clearer definitions will help resolve these confusions.
We appreciate your suggestions. We have modified the comparison to this paper in the Introduction section. In addition, as you mentioned in the comment above, we have elaborated the definitions of germ-role and soma-role in the Model section and stressed the differences between your model and ours in the Discussion.
Otherwise, I feel that the authors have sidestepped the potential impact of nonstatic traits in their model by saying in their response to reviewers that they have plans for a future paper on this. That is great and I am very much looking forward to what they find but this issue still needs to be discussed in this paper as many of their results here could be explained as arising directly from the assumption of static strategies as the group grows. I would suggest mentioning this at least once or twice as they go through the results (around lines 238–242 would be good) and then a whole paragraph on this in the discussion is warranted (can also mention plans for future work on this here). For instance, the need for just a few somatic cells that provide large benefits seems to arise directly from the fact that germ cells can't modulate the number of soma cells they spawn once these become too numerous, or that they can't have a time based strategy that produces many soma earlier but fewer later as the group grows.
As you suggested, in the updated Results section we acknowledge that our findings are obtained under the assumption of static strategies. We also added a new paragraph to the Discussion about nonstatic differentiation strategies: their existence in nature and their possible impact on the model results.
I think the results they have found in the asynchronous model is really good but needs more explanation/discussion of why ISD can't seem to work here. They have modelled a time to replication cost as arising from the different differentiation costs. I find it strange in that case that RSD is not the worst affected strategy as the authors have established that this is the strategy with the most differentiation. I otherwise would have thought that having soma cells that divert their energy to vegetative functions as the slower replicator might have been a natural way to introduce asynchrony.
Irreversible differentiation is suppressed in the asynchronous model because there the reversible strategies are able to develop a large number of soma-role cells early on and keep a large fraction of germ-role cells throughout the life cycle, even at high soma differentiation costs (c_{s→g} = 10). Moreover, we would like to highlight that, unlike in the synchronous model, here high costs of soma differentiation bring an advantage to reversible differentiation strategies. The key effect is that if the cost of soma differentiation is large enough, the expected period of cell division for a differentiating soma-role cell is longer than the length of the life cycle. As a result, instead of redifferentiating, soma-role cells become effectively terminally differentiated. In such a situation, the growth of the organism is determined by the propagation of germ-role cells and is not affected by soma differentiation costs. Hence, the development of organisms using reversible strategies remains fast and irreversible strategies are outcompeted. In the updated manuscript, we discuss this effect in Appendix 7.
We appreciate the idea of soma-role cells being slower replicators. We believe it may lead to a more elegant model. However, in this appendix, we test the impact of the assumption of synchronous cell division used in the baseline model. Hence, it is more appropriate to present an asynchronous implementation of the base model that keeps the remaining design intact. This immediately dictates that the costs must be associated with differentiation events and not with the cell state itself. Otherwise, it would be difficult to compare results from the two models. We emphasize this motivation in the updated Appendix 7.
Finally, I think a word of caution on the discussion of "convex" shapes and how this favours division of labour/terminal differentiation/irreversible differentiation (lines 323–332). In several of the models cited (if not all), the convexity at issue is the relationship between an individual's investment in a public good/vegetative function and the fitness return to the group. In the authors' paper, the convexity is between the number/proportion of "helpers/soma" in the group and the fitness return to the group. These are very different things (one has to do with synergy from internal efficiencies whereas the other comes from synergies from between-individual interactions) and so should not be treated as the same prediction.
We agree with your analysis of the differences between our model and previous work. In the revised manuscript, we have stressed the differences between the models with respect to the curvature of these shapes.
Reviewer #3 (Recommendations for the authors):
The authors have made substantial changes to the manuscript that appear to have addressed many of the concerns of the reviewers.
In their response, the authors clarified some details of the model, and I now feel I have a better understanding as to how it works. However that has led to another couple of small suggestions on my part that I believe would help readers.
In my original review I stated:
"Note that 't' here refers to the generation t=1,2,…,n
…
However, the division time for cells in the organism during growth is dependent on these costs (see 't' in Equation 1 – note that 't' here is the continuous doubling time, which has an inconsistent notation with Equations 4–8)"
I now see that under the costless differentiation assumed in Appendix 2, t becomes an integer which helps simplify the subsequent analysis. It's worthwhile to make a note of this fact (before the sentence "Then, the expected fractions.…" would be an obvious potential place to mention this).
Thanks – this was obviously a flaw in our notation. We have taken a new letter “j” to represent the previous “t” in the revised manuscript in Appendix 2 to clarify the number of cell divisions. We have further explained the variable “n”, which describes the maximal number of divisions. Thus j=1,⋯,n.
The authors' response also makes clear at multiple points that cell divisions are stochastic:
"Since the differentiation program is stochastic, the costs of differentiation depend on the actual number of differentiation events that happened in the course of growth, rather than on probabilities like g_{gs}.",
"Since the differentiation strategy is stochastic, the time to reach maturity (T_{mat}) and the number of offspring at the last stage (g(n)) are random values, which we sample by repeatedly simulating the process of growth."
"However, since the outcomes of cell divisions are stochastic, the sampling of developmental trajectories has to reflect that and in our case it is done numerically."
I understand this. My comments, which I may not have articulated clearly in my initial review, were more aimed at asking how much understanding could be gained from alternatively taking a mean field approach. Indeed, this is precisely the approach the authors themselves take in Appendix 2, where they "consider the mathematical expectation of the composition". This leads to the obvious question – why can't a similar approach be used when differentiation is not costless?
Of course, I completely understand that stochasticity could be very important in a model such as this (where initial cell numbers are low), and it may be that such a mean-field approach leads to misleading results with respect to the prediction of the mean population growth rate. If this is the case, I think the authors should make a statement of this fact somewhere, perhaps with a reference to results in Gao et al., 2019 with respect to the differences between mean field and stochastic predictions.
Thank you for the clarification. While we completely agree that a mean-field model could potentially provide a lot of insight into our findings, we have to admit that it does not do so. The expressions in Appendix 2 are short, and it seems promising to develop such a model further. However, once we bring differentiation costs into play, things quickly get complicated. To find the population growth rate λ, we also need to compute the length of the life cycle, which is not always possible. For the special case of a step-function F_{comp} (i.e. x_{0}=x_{1}), ignoring the discreteness of the organism composition and relying on Mathematica software, we managed to find the population growth rate λ explicitly. Shortening the notation with shortcut variables, it is given by:
where
The condition x_{0}<x_{f} checks whether the fraction of soma-role cells ever reaches the threshold value x_{0} (x_{f} is the fraction of soma-role cells at the end of the life cycle, cf. Equation 9 in Appendix 2).
Finding the optimal developmental strategies means finding the maximum of the expression above. This can only be done numerically (and we already have a more detailed numerical model). In the absence of insightful results from this analytical approach, we do not see a reason to include this result in the paper.
In our previous paper (Gao et al., 2019), we did not use a mean-field approximation (replacement of stochastic trajectories with the most likely one). Instead, our analytical results were obtained with the weak selection approximation (all stochastic trajectories have approximately similar duration T_{i}). We do not use this approximation in our current work because all the interesting strategies are found under conditions where trajectories with many soma-role cells complete the life cycle much faster, and thus the times to complete different trajectories differ a lot.
https://doi.org/10.7554/eLife.66711.sa2
Article and author information
Funding
Ministry of Science and ICT, South Korea (2020R1A2C1101894)
 Hye Jin Park
JRG Program
 Hye Jin Park
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
YG, HJP, AT, and YP thank the Max Planck Society for generous funding. HJP was supported by the NRF grant funded by the Korea government (MSIT) Grant No. 2020R1A2C1101894 and by an appointment to the JRG Program at the APCTP through the Science and Technology Promotion Fund and Lottery Fund of the Korean Government. This was also supported by the Korean Local Governments - Gyeongsangbuk-do Province and Pohang City.
Senior and Reviewing Editor
 Aleksandra M Walczak, École Normale Supérieure, France
Reviewers
 E Yagmur Erten, University of Zurich, Switzerland
 Guy Cooper, St. John's College, United Kingdom
 George Constable
Version history
 Preprint posted: January 19, 2021
 Received: January 20, 2021
 Accepted: September 23, 2021
 Version of Record published: October 13, 2021 (version 1)
Copyright
© 2021, Gao et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.