Evolution of irreversible somatic differentiation
Abstract
A key innovation emerging in complex animals is irreversible somatic differentiation: daughters of a vegetative cell perform a vegetative function as well, thus forming a somatic lineage that can no longer be directly involved in reproduction. Primitive species use a different strategy: vegetative and reproductive tasks are separated in time rather than in space. Starting from such a strategy, how is it possible to evolve life forms which use some of their cells exclusively for vegetative functions? Here, we develop an evolutionary model of development of a simple multicellular organism and find that three components are necessary for the evolution of irreversible somatic differentiation: (i) costly cell differentiation, (ii) vegetative cells that significantly improve the organism’s performance even if present in small numbers, and (iii) large enough organism size. Our findings demonstrate how an egalitarian development typical for loose cell colonies can evolve into germ-soma differentiation dominating metazoans.
Introduction
In complex multicellular organisms, different cells specialise to execute different functions. These functions can be generally classified into two kinds: reproductive and vegetative. Cells performing reproductive functions contribute to the next generation of organisms, while cells performing vegetative functions contribute to sustaining the organism itself. In unicellular species and simple multicellular colonies, these two kinds of functions are performed at different times by the same cells: specialization is temporal. In more complex multicellular organisms, specialization transforms from temporal to spatial (Mikhailov et al., 2009), where groups of cells focused on different tasks emerge in the course of organism development.
Typically, cell functions are changed via differentiation, such that a daughter cell performs a different function than the maternal cell. The vast majority of metazoans feature a very specific and extreme pattern of cell differentiation: any cell performing vegetative functions forms a somatic lineage, that is, it produces only cells performing the same vegetative function, so that somatic differentiation is irreversible. Since such somatic cells cannot give rise to reproductive cells, they have no chance to pass their offspring to the next generation of organisms. Such a mode of organism development opened a way for deeper specialization of somatic cells and consequently to the astonishing complexity of multicellular animals. Outside of the metazoans, in the green algae Volvocales, which serve as a model group for the evolution of multicellularity, the emergence of irreversibly differentiated somatic cells is the hallmark innovation marking the transition from colonial life forms to multicellular species (Kirk, 2005).
While the production of individual cells specialized in vegetative functions comes with a number of benefits (Grosberg and Strathmann, 2007), the development of a dedicated vegetative cell lineage that is lost for organism reproduction is not obviously a beneficial adaptation. From the perspective of a cell in an organism, the guaranteed termination of its lineage seems the worst possible evolutionary outcome. From the perspective of an entire organism, the death of somatic cells at the end of the life cycle is a waste of resources, as these cells could in principle become parts of the next generation of organisms. Indeed, exceptions to irreversible somatic differentiation are widespread in plants (Lanfear, 2018) and are even known in simpler metazoans among cnidarians (DuBuc et al., 2020), for which differentiation from vegetative to reproductive functions has been reported. Therefore, the irreversibility of somatic differentiation cannot be taken for granted in the course of the evolution of complex multicellularity.
Terminal differentiation is a type of cell differentiation distinct from irreversible cell differentiation. Unlike irreversibly differentiated cells, which remain capable of cell division, terminally differentiated cells lose the ability to divide. Terminally differentiated cells often perform tasks too demanding to be compatible with cell division. For example, heterocysts of cyanobacteria perform nitrogen fixation, which requires anaerobic conditions; these cells are therefore very limited in resources and do not divide. In the scope of this study, we do not consider terminal differentiation but focus on somatic cells that are able to divide while being part of an organism (or cell colony) but are not able to grow into a new organism, that is, irreversible somatic differentiation.
The majority of the theoretical models addressing the evolution of somatic cells focus on the evolution of cell specialization, abstracting from the developmental process by which germ (reproductive specialists) and soma are produced in the course of organism growth. For example, a large amount of work focuses on the optimal distribution of reproductive and vegetative functions in the adult organism (Michod, 2007; Willensdorfer, 2009; Rossetti et al., 2010; Rueffler et al., 2012; Ispolatov et al., 2012; Goldsby et al., 2012; Solari et al., 2013; Goldsby et al., 2014; Amado et al., 2018; Tverskoi et al., 2018). However, these models do not consider the process of organism development. Other work takes the development of an organism into account to some extent. In Gavrilets, 2010, organism development is considered, but the fraction of cells capable of becoming somatic is fixed and does not evolve. In Erten and Kokko, 2020, the strategy of germ-to-soma differentiation is an evolvable trait, but the irreversibility of somatic differentiation is taken for granted. In Rodrigues et al., 2012, irreversible differentiation was found, but both considered cell types pass to the next generation of organisms, such that the irreversible specialists are not truly somatic cells in the sense of evolutionary dead ends. Finally, in Cooper and West, 2018, a broad scope of cell differentiation patterns was investigated in the context of the evolution of cooperation, but irreversible somatic differentiation was not considered. Hence, the theoretical understanding of the evolution of irreversibly differentiated somatic cell lines is limited so far.
In the present work, we developed a theoretical model to investigate conditions for the evolution of irreversible somatic differentiation. In the model, we suppose there are two cell types: germrole and somarole, where only germrole cells pass to the next generation of organisms while somarole cells are responsible for vegetative functions. Both germrole cells and somarole cells can divide, and daughter cells may switch from one role to the other during growth. Our model incorporates three factors: (i) the costs of cell differentiation, (ii) the benefits provided by the presence of somarole cells, and (iii) the maturity size of the organism. We ask under which circumstances irreversible somatic differentiation is a strategy that can maximize the population growth rate compared to strategies in which differentiation does not occur or somatic differentiation is reversible.
Model
We consider a large population of clonally developing organisms composed of two types of cells: germrole and somarole. The roles differ in the ability to survive beyond the end of the organism life cycle: somarole cells die at the end, while germrole cells continue to live. Each organism is initiated as a single germrole cell. In the course of organism growth, germrole cells may differentiate to give rise to somarole cells and vice versa, see Figure 1A,B. After $n$ rounds of synchronous cell divisions, the organism reaches its maturity size of ${2}^{n}$ cells. Immediately upon reaching maturity, the organism reproduces: germrole cells disperse and each becomes a newborn organism, while all somarole cells die and are thus lost, see Figure 1A. We assume that somarole cells are able to accelerate growth: an organism containing more somarole cells grows faster, so having somarole cells during the life cycle is beneficial for the organism.
To investigate the evolution of irreversible somatic differentiation, we consider organisms in which the functional role of the cell (germrole or somarole) is not necessarily inherited. When a cell divides, the two daughter cells can change their role, leading to three possible combinations: two germrole cells, one germrole cell plus one somarole cell, or two somarole cells. We allow all these outcomes to occur with different probabilities, which also depend on the parental type, see Figure 1B. If the parental cell had the germrole, the probabilities of each outcome are denoted by ${g}_{gg}$, ${g}_{gs}$, and ${g}_{ss}$ respectively. If the parental cell had the somarole, these probabilities are ${s}_{gg}$, ${s}_{gs}$, and ${s}_{ss}$. Altogether, six probabilities define a stochastic developmental strategy $D=({g}_{gg},{g}_{gs},{g}_{ss};{s}_{gg},{s}_{gs},{s}_{ss})$. In our model, it is the stochastic developmental strategy that is inherited by offspring cells rather than the functional role of the parental cell.
To feature irreversible somatic differentiation, the developmental strategy must allow germrole cells to give rise to somarole cells (${g}_{gg}<1$) and must forbid somarole cells from giving rise to germrole cells (${s}_{ss}=1$). All other developmental strategies can be broadly classified into two classes. Reversible somatic differentiation describes strategies where cells of both roles can give rise to each other: ${g}_{gg}<1$ and ${s}_{ss}<1$. In the strategy with no somatic differentiation, somarole cells are not produced in the first place: ${g}_{gg}=1$, see Table 1.
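The three classes can be written as a small decision rule. The sketch below is illustrative only: the names `Strategy` and `classify` are hypothetical, and it assumes that each triple of outcome probabilities is non-negative and sums to one, which the text implies but does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    """Developmental strategy D = (g_gg, g_gs, g_ss; s_gg, s_gs, s_ss)."""
    g_gg: float
    g_gs: float
    g_ss: float
    s_gg: float
    s_gs: float
    s_ss: float

    def __post_init__(self):
        # Assumption: each triple is a probability distribution over outcomes.
        for triple in ((self.g_gg, self.g_gs, self.g_ss),
                       (self.s_gg, self.s_gs, self.s_ss)):
            if min(triple) < 0 or abs(sum(triple) - 1.0) > 1e-9:
                raise ValueError("each triple must be non-negative and sum to 1")


def classify(d: Strategy) -> str:
    """Return the class of a strategy using the conditions given in the text."""
    if d.g_gg == 1.0:
        return "no differentiation"   # somarole cells are never produced
    if d.s_ss == 1.0:
        return "irreversible"         # somarole cells never produce germrole cells
    return "reversible"               # both roles can give rise to each other
```

For instance, `classify(Strategy(0.9, 0.1, 0.0, 0.0, 0.0, 1.0))` falls into the irreversible class, because germrole cells sometimes produce a somarole daughter while somarole cells always produce two somarole daughters.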
In our model, evolution of the developmental strategy is driven by the growth competition between populations executing different strategies: populations that produce more offspring and/or complete their life cycle faster gain a selective advantage. Specifically, we measure the fitness in the growth competition by the population growth rate in a stationary regime of exponential growth (Pichugin et al., 2017; Gao et al., 2019). The rate of population growth is determined by the number of offspring produced by an organism (equal to the number of germrole cells at the end of the life cycle) and the time needed for an organism to develop from a single cell to maturity (which shortens with the number of somarole cells present during the life cycle).
To obtain these growth rates, we simulate the process of organism growth. We assume that resource distribution among cells is coordinated at the level of the organism: cells which need more resources get more, such that cell division is synchronous. Our main results depend on this assumption of synchronous cell division; we briefly explore the effects of asynchronous cell division in Appendix G. Any organism is born as a single germrole cell and passes through $n$ rounds of simultaneous cell divisions. Each round starts with every cell independently choosing the outcome of its division, with the probability of each outcome given by the developmental strategy ($D$). This step determines what composition the organism will have at the next round of cell division. Then, the length of the cell doubling round ($t$) is computed as a product of two independent effects: the differentiation effect ${F}_{\text{diff}}$ representing costs of changing cell roles (Gallon, 1992) and the organism composition effect ${F}_{\text{comp}}$ representing benefits from having somarole cells (Grosberg and Strathmann, 1998; Shelton et al., 2012; Matt and Umen, 2016),
$$t={F}_{\text{diff}}\times {F}_{\text{comp}}.\qquad (1)$$
Both ${F}_{\text{diff}}$ and ${F}_{\text{comp}}$ are recalculated at every round of cell division.
The cell differentiation effect ${F}_{\text{diff}}$ represents the costs of cell differentiation. The differentiation of a cell requires efforts to modify epigenetic marks in the genome, recalibration of regulatory networks, synthesis of additional proteins, and utilization of proteins that are no longer necessary. This requires an investment of resources and therefore additional time to perform cell division. Hence, any cell that is about to give rise to a cell of a different role incurs a differentiation cost: ${c}_{g\to s}$ for germ-to-soma and ${c}_{s\to g}$ for soma-to-germ transitions (and twice these if both offspring take a role different from the parent), see Figure 1C. The differentiation effect is one plus the differentiation cost averaged over all cells in the organism,
$${F}_{\text{diff}}=1+\frac{{c}_{s\to g}\left({N}_{s\to gs}+2{N}_{s\to gg}\right)+{c}_{g\to s}\left({N}_{g\to gs}+2{N}_{g\to ss}\right)}{N},\qquad (2)$$
where ${N}_{s\to gs}$ is the number of somarole cells that produce a germrole cell and a somarole cell in a cell division step. ${N}_{s\to gg}$, ${N}_{g\to gs}$ and ${N}_{g\to ss}$ are defined analogously. $N$ is the total number of cells. As organisms undergo synchronous cell division, we have $N={2}^{n}$ cells after the $n$th cell division.
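As a minimal sketch of this bookkeeping, the function below computes the differentiation effect for one round. It assumes that ${F}_{\text{diff}}$ equals one plus the per-cell average cost (so that a round without differentiation leaves the doubling time unchanged), that a division producing two daughters of the other role is charged twice, and that $N$ counts the cells present after the division. The function name and argument order are hypothetical.

```python
def f_diff(n_g_gs, n_g_ss, n_s_gs, n_s_gg, n_cells, c_gs, c_sg):
    """Differentiation effect for one round of synchronous division.

    n_g_gs, n_g_ss : germrole parents producing one / two somarole daughters
    n_s_gs, n_s_gg : somarole parents producing one / two germrole daughters
    n_cells        : total cell number N (assumed: counted after the division)
    c_gs, c_sg     : germ-to-soma and soma-to-germ differentiation costs
    """
    total_cost = (c_gs * (n_g_gs + 2 * n_g_ss)
                  + c_sg * (n_s_gs + 2 * n_s_gg))
    # Assumed form: 1 + average cost per cell, so zero cost gives F_diff = 1.
    return 1.0 + total_cost / n_cells
```

With four cells after the division, one germrole parent producing one somarole daughter, one producing two, and $c_{g\to s}=1$, the call `f_diff(1, 1, 0, 0, 4, 1.0, 2.0)` gives $1+3/4$.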
The composition effect profile ${F}_{\text{comp}}(x)$ captures how the cell division time depends on the proportion of somarole cells $x=s/(s+g)$ present in an organism ($s$ and $g$ are the numbers of somarole and germrole cells). In this study, we use a functional form illustrated in Figure 1D and given by
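The display form of Equation (3) is not reproduced in this text. A piecewise expression consistent with the properties the model description attributes to it (no benefit up to the contribution threshold ${x}_{0}$, the maximal benefit $b$ beyond the saturation threshold ${x}_{1}$, a convex profile for $\alpha >1$, and the linear profile $1-bx$ for ${x}_{0}=0,{x}_{1}=1,\alpha =1$) is sketched here as an assumption, not as the authors' verified formula:

$${F}_{\text{comp}}(x)=\begin{cases}1, & x\le {x}_{0},\\ 1-b+b{\left(\dfrac{{x}_{1}-x}{{x}_{1}-{x}_{0}}\right)}^{\alpha }, & {x}_{0}<x<{x}_{1},\\ 1-b, & x\ge {x}_{1}.\end{cases}$$

Under this form the cell doubling time falls from $1$ to $1-b$ as the somarole proportion rises from ${x}_{0}$ to ${x}_{1}$, and for ${x}_{0}={x}_{1}$ the middle branch is empty, which gives a step function.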
With the functional form (3), somarole cells can benefit organism growth only if their proportion in the organism exceeds the contribution threshold ${x}_{0}$. Interactions between somarole cells may lead to synergistic benefits (an increase in the number of somarole cells improves their efficiency) or discounting benefits (an increase in the number of somarole cells reduces their efficiency) to organism growth, controlled by the contribution synergy parameter $\alpha $. The maximal achievable reduction in the cell division time is given by the maximal benefit $b$, realized beyond the saturation threshold ${x}_{1}$ of the somarole cell proportion. A further increase in the proportion of somarole cells does not provide any additional benefits. With the right combination of parameters, (3) is able to recover various characters of somarole cell contribution to organism growth: linear (${x}_{0}=0,{x}_{1}=1,\alpha =1$), power-law (${x}_{0}=0,{x}_{1}=1,\alpha \ne 1$), step functions (${x}_{0}={x}_{1}$), and a wide range of other scenarios. Previous works have shown that convex (accelerating) performance functions favour cell differentiation (Michod, 2006; Rueffler et al., 2012; Cooper and West, 2018). The performance functions measure the performance of organisms with respect to different traits, such as fertility and viability. Lately, the form of functions favouring cell differentiation has been extended to concave (decelerating) functions by including topological constraints in organisms (Yanni et al., 2020). Our model extends the form of performance functions by allowing them to have a contribution threshold and a saturation threshold.
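The role of the four parameters can be illustrated with a short function. This is a sketch under an assumed piecewise form chosen to match the properties described above (no benefit up to ${x}_{0}$, maximal benefit $b$ from ${x}_{1}$ onwards, convex for $\alpha >1$); it should not be read as the authors' exact expression, and the name `f_comp` is hypothetical.

```python
def f_comp(x, x0, x1, b, alpha):
    """Composition effect: multiplier on the cell doubling time.

    x      : proportion of somarole cells, s / (s + g)
    x0, x1 : contribution and saturation thresholds (x0 <= x1)
    b      : maximal benefit (largest reduction of the doubling time)
    alpha  : contribution synergy
    """
    if x <= x0:
        return 1.0            # too few somarole cells: no benefit
    if x >= x1:
        return 1.0 - b        # saturated: maximal benefit
    # Assumed interpolation between the thresholds; convex in x for alpha > 1.
    return 1.0 - b + b * ((x1 - x) / (x1 - x0)) ** alpha
```

With ${x}_{0}=0$, ${x}_{1}=1$ and $b=0.4$, a somarole proportion of $0.5$ gives a multiplier of $0.8$ for $\alpha =1$ and $0.7$ for $\alpha =2$: under this form a larger $\alpha $ delivers more of the benefit at low somarole proportions.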
Once the outcome of all cell divisions is known and the time needed to complete the current cell doubling round is computed, the current round ends and the next starts. The development completes after $n$ rounds. At this stage, the number of germrole cells (organism offspring number) and the cumulative length of the life cycle are obtained.
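The simulation loop described above can be sketched as follows. Several details are assumptions made for illustration: the piecewise form of ${F}_{\text{comp}}$, the form ${F}_{\text{diff}}=1+\text{(average cost per cell)}$ with $N$ counted after the division, and the choice to evaluate ${F}_{\text{comp}}$ on the composition at the start of each round. The names `develop` and `f_comp`, and the encoding of a strategy as a pair of probability triples, are hypothetical.

```python
import random


def f_comp(x, x0, x1, b, alpha):
    """Assumed piecewise composition effect (see lead-in)."""
    if x <= x0:
        return 1.0
    if x >= x1:
        return 1.0 - b
    return 1.0 - b + b * ((x1 - x) / (x1 - x0)) ** alpha


def develop(strategy, n, c_gs, c_sg, profile, rng):
    """Simulate one life cycle; return (germrole cells at maturity, total time).

    strategy : ((g_gg, g_gs, g_ss), (s_gg, s_gs, s_ss))
    profile  : (x0, x1, b, alpha) passed to f_comp
    """
    g_probs, s_probs = strategy
    germ, soma, total_time = 1, 0, 0.0   # organism starts as one germrole cell
    for _ in range(n):
        x = soma / (germ + soma)         # somarole proportion during this round
        new_germ = new_soma = 0
        cost = 0.0
        # k is the number of somarole daughters: 0 (gg), 1 (gs) or 2 (ss).
        for _ in range(germ):
            k = rng.choices((0, 1, 2), weights=g_probs)[0]
            new_germ += 2 - k
            new_soma += k
            cost += c_gs * k             # one cost per daughter that switched
        for _ in range(soma):
            k = rng.choices((0, 1, 2), weights=s_probs)[0]
            new_germ += 2 - k
            new_soma += k
            cost += c_sg * (2 - k)
        germ, soma = new_germ, new_soma
        f_diff = 1.0 + cost / (germ + soma)
        total_time += f_diff * f_comp(x, *profile)
    return germ, total_time
```

A call such as `develop(((0.9, 0.1, 0.0), (0.0, 0.0, 1.0)), 6, 0.1, 0.1, (0.0, 0.5, 0.5, 2.0), random.Random(1))` returns the offspring number and life cycle length of one sampled trajectory of an irreversible strategy.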
In Gao et al., 2019, we have shown that the growth rate ($\lambda $) of a population, in which organisms undergo a stochastic development and fragmentation, is given by the solution of
$$\sum _{i}{P}_{i}{G}_{i}{e}^{-\lambda {T}_{i}}=1.\qquad (4)$$
Here, $i$ is the developmental trajectory – in our case, the specific combination of all cell division outcomes; ${G}_{i}$ is the number of offspring organisms produced at the end of developmental trajectory $i$, equal to the number of germrole cells at the moment of maturity; ${P}_{i}$ is the probability that an organism development will follow the trajectory $i$; ${T}_{i}$ is the time necessary to complete the trajectory $i$ – from a single cell to the maturity size of ${2}^{n}$ cells.
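One way to obtain $\lambda $ numerically is sketched below, under the assumption that the growth rate solves an Euler-Lotka type condition $\sum _{i}{P}_{i}{G}_{i}{e}^{-\lambda {T}_{i}}=1$ and that the sum over trajectories is approximated by an average over independently simulated life cycles. Both the Monte Carlo averaging and the bisection solver are illustrative choices, not necessarily the procedure used in the study; the name `growth_rate` is hypothetical.

```python
import math


def growth_rate(samples, tol=1e-12):
    """Solve mean(G * exp(-lam * T)) = 1 for lam by bisection.

    samples : list of (G, T) pairs, one per simulated life cycle, with
              G the number of offspring and T > 0 the life cycle length.
    """
    if not any(g > 0 for g, _ in samples):
        raise ValueError("no offspring in any sampled trajectory")

    def lhs(lam):
        # Sample average standing in for the sum over trajectories.
        return sum(g * math.exp(-lam * t) for g, t in samples) / len(samples)

    # lhs decreases with lam, so expand a bracket until it contains the root.
    lo, hi = -1.0, 1.0
    while lhs(lo) < 1.0:
        lo *= 2.0
    while lhs(hi) > 1.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lhs(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

As a sanity check, an organism without differentiation that always produces $8$ offspring in $3$ time units has $\lambda =\mathrm{ln}(8)/3=\mathrm{ln}\,2$, the growth rate of cells doubling once per unit time.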
For a given combination of differentiation costs (${c}_{g\to s}$, ${c}_{s\to g}$) and a composition effect profile (determined by four parameters: ${x}_{0}$, ${x}_{1}$, $b$, and $\alpha $), we screen through a number of stochastic developmental strategies $D$ and identify the one providing the largest growth rate ($\lambda $) to the population. In this study, we searched for those parameters under which irreversible strategies lead to the fastest growth and are thus evolutionarily optimal, see model details in Appendix A.
Results
For irreversible somatic differentiation to evolve, cell differentiation must be costly
We found that irreversible somatic differentiation does not evolve when cell differentiation is not associated with any costs (${c}_{s\to g}={c}_{g\to s}=0$), see Figure 2A. Only reversible differentiation evolves there, see Figure 2B. This finding comes from the fact that when somatic differentiation is irreversible, the fraction of germrole cells can only decrease in the course of the life cycle. As a result, irreversible strategies face a trade-off between producing more somarole cells at the beginning of the life cycle and having more germrole cells by the end of it. On the one hand, irreversible strategies which produce a lot of somarole cells early on complete the life cycle quickly but preserve only a few germrole cells by the time of reproduction. On the other hand, irreversible strategies which generate a lot of offspring can deploy only a few somarole cells at the beginning of the life cycle, and thus their developmental time is inevitably longer. By contrast, reversible somatic differentiation strategies do not experience a similar trade-off, as germrole cells can be generated from somarole cells. As a result, reversible strategies allow higher differentiation rates and can develop a high somarole cell fraction in the course of organism growth while still having a large number of germrole cells by the moment of reproduction. Under costless cell differentiation, for any irreversible strategy, we can find a reversible counterpart which leads to faster growth: the development proceeds faster, while the expected number of produced offspring is the same, see Appendix 2 for details. As a result, costless cell differentiation cannot lead to irreversible somatic differentiation.
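The trade-off faced by irreversible strategies can be made explicit with a short calculation, given here as a sketch that follows from the model definitions (${s}_{ss}=1$, one initial germrole cell, synchronous doubling). Each germrole cell leaves on average $2{g}_{gg}+{g}_{gs}$ germrole daughters per round and somarole cells leave none, so after $n$ rounds

$$\mathbb{E}[{G}_{n}]={\left(2{g}_{gg}+{g}_{gs}\right)}^{n},\qquad \frac{\mathbb{E}[{G}_{n}]}{{2}^{n}}={\left({g}_{gg}+\frac{{g}_{gs}}{2}\right)}^{n}.$$

Because ${g}_{gg}<1$ implies ${g}_{gg}+{g}_{gs}/2\le (1+{g}_{gg})/2<1$, the expected germrole fraction shrinks geometrically with every round. For example, with ${g}_{gg}=0.9$, ${g}_{gs}=0.1$ and $n=6$, the expected germrole fraction at maturity is ${0.95}^{6}\approx 0.735$, that is, about $47$ offspring out of $64$ cells, while the somarole fraction after the first round is only $0.05$.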
To confirm the reasoning that reversible strategies gain an edge over irreversible strategies by having larger differentiation rates, we asked which reversible and irreversible strategies become optimal at various cell differentiation costs ($c={c}_{s\to g}={c}_{g\to s}$). At each value of costs, we found the evolutionarily optimal developmental strategy for 3000 different randomly sampled composition effect profiles ${F}_{\text{comp}}(x)$. We found that evolutionarily optimal reversible strategies feature much larger rates of cell differentiation than evolutionarily optimal irreversible strategies, see Figure 2D. Even at large costs, where frequent differentiation is heavily penalized, the distinction between differentiation rates of reversible and irreversible strategies remains apparent.
We screened through a spectrum of germ-to-soma (${c}_{g\to s}$) and soma-to-germ (${c}_{s\to g}$) differentiation costs, see Figure 2A–C. Irreversible somatic differentiation is most likely to evolve when it is cheap to differentiate from germrole to somarole (low ${c}_{g\to s}$) but expensive to differentiate back (high ${c}_{s\to g}$), see Figure 2A. Irreversible strategies are insensitive to high soma-to-germ costs, since somarole cells never differentiate. At the same time, reversible strategies are heavily punished by high costs of differentiation of somarole cells.
It is not very surprising to find irreversible differentiation where the differentiation costs are highly asymmetric. However, irreversible strategies are consistently observed in other regions of the cost space, including those where the asymmetry is opposite (it is hard to go from germ to soma but easy to return), see Figure 2A,H. To identify what other factors, beyond asymmetric costs, can lead to the evolution of irreversible somatic differentiation, below we focus on the scenario of equal differentiation costs ${c}_{s\to g}={c}_{g\to s}=c$.
Evolution of irreversible somatic differentiation is promoted when even a small number of somatic cells provides benefits to the organism
The composition effect profiles ${F}_{\text{comp}}(x)$ that promote the evolution of irreversible somatic differentiation have certain characteristic shapes, see Figure 2E–H. We investigated what kind of composition effect profiles can make irreversible somatic differentiation become an evolutionary optimum. We sampled a number of random composition effect profiles with independently drawn parameter values and found optimal developmental strategies for each profile for a number of differentiation costs ($c$) and maturity size (${2}^{n}$) values. We took a closer look at the instances of ${F}_{\text{comp}}(x)$ which resulted in irreversible somatic differentiation being evolutionarily optimal.
We found that irreversible strategies are only able to evolve when the somarole cells contribute to the organism cell doubling time even if present in small proportions, see Figure 3A,B. Analysing parameters of the composition factors promoting irreversible differentiation, we found that this effect manifests in two patterns. First, the contribution threshold value (${x}_{0}$) has to be small, see Figure 3D – irreversible differentiation is promoted when somarole cells begin to contribute to the organism growth even in low numbers. Second, the contribution synergy was found to be large ($\alpha >1$) or, alternatively, the saturation threshold (${x}_{1}$) was small, see Figure 3C.
Both the contribution threshold ${x}_{0}$ and the contribution synergy $\alpha $ control the shape of the composition effect profile at intermediate abundances of somarole cells. If the contribution synergy $\alpha $ exceeds 1, the profile is convex, so the contribution of somarole cells quickly becomes close to the maximal benefit ($b$). A small saturation threshold (${x}_{1}$) means that the maximal benefit of soma is achieved already at low concentrations of somarole cells (and then the shape of the composition effect profile between two close thresholds has no significance). Together, these patterns provide evidence that the most crucial factor promoting irreversible somatic differentiation is the effectiveness of somarole cells at small numbers, see Appendix 4 for a more detailed data presentation.
These patterns are driven by the static character of the differentiation strategies we use: the chances for a cell to differentiate are the same at the first and the last round of cell division. Therefore, the optimal germ-to-soma differentiation rate is found as a balance between the need to deploy somarole cells early on and the need to keep a high number of germrole cells by the end of the life cycle. This implies that irreversible somatic differentiation strategies produce somarole cells at a lower rate than reversible strategies, see Figure 2D. With irreversible differentiation, an organism spends a significant amount of time having only a few somarole cells. Hence, an irreversible strategy can only be evolutionarily successful if the few somarole cells make a notable contribution to the organism growth time.
We also found that profiles featuring irreversible differentiation possess neither extremely large nor extremely small maximal benefit values $b$, see Figure 3D. When the maximal benefit is too small, cell differentiation does not provide enough benefits to be selected for, and the evolutionarily optimal strategy is no differentiation. In the opposite case, when the maximal benefit is very close to one, the cell doubling time approaches zero, see Equation (3). Then, the benefits of having many somarole cells outweigh the costs of differentiation and the optimal strategy is reversible, see Appendix 4.
For irreversible somatic differentiation to evolve, the organism size must be large enough
By screening through the maturity size (${2}^{n}$) and differentiation costs ($c$), we found that the evolution of irreversible somatic differentiation is heavily suppressed at small maturity sizes, see Figure 4A. In small organisms, either reversible strategies or the no differentiation strategy evolve. Reversible strategies can quickly reach a fixed fraction of somarole cells, so they obtain the maximal benefit from somarole cells even at small maturity sizes (Appendix 2, figure 1). The no differentiation strategy involves no cell differentiation and therefore incurs no differentiation costs. In contrast, under irreversible strategies the fraction of somarole cells, and with it the benefit they provide, increases gradually as the maturity size increases. Meanwhile, the cell differentiation costs for irreversible strategies decrease as the maturity size increases, because the fraction of germrole cells decreases. Thus, compared with other strategies, irreversible strategies have an advantage in large organisms. We found that under ${c}_{s\to g}={c}_{g\to s}$, the minimal maturity size allowing irreversible somatic differentiation to evolve is ${2}^{n}=64$ cells. At the same time, organisms performing just a few more rounds of cell divisions are able to evolve irreversible differentiation at a wide range of cell differentiation costs, see also Appendix 5. This indicates that the evolution of irreversible somatic differentiation is strongly tied to the size of the organism.
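The different dynamics of the somarole fraction can be illustrated with expected cell counts, which follow a linear recurrence because every cell divides independently. The sketch below tracks expectations only, not individual stochastic trajectories; the function name and the example strategies are hypothetical choices made for illustration.

```python
def expected_soma_fractions(g_probs, s_probs, n):
    """Expected somarole fraction after each of n synchronous division rounds.

    g_probs : (g_gg, g_gs, g_ss), outcome probabilities for germrole parents
    s_probs : (s_gg, s_gs, s_ss), outcome probabilities for somarole parents
    """
    g_gg, g_gs, g_ss = g_probs
    s_gg, s_gs, s_ss = s_probs
    germ, soma = 1.0, 0.0            # expected counts; start from one germrole cell
    fractions = []
    for _ in range(n):
        germ, soma = (
            germ * (2 * g_gg + g_gs) + soma * (2 * s_gg + s_gs),
            germ * (g_gs + 2 * g_ss) + soma * (s_gs + 2 * s_ss),
        )
        fractions.append(soma / (germ + soma))
    return fractions


# Example: an irreversible strategy builds up soma slowly ...
irreversible = expected_soma_fractions((0.9, 0.1, 0.0), (0.0, 0.0, 1.0), 6)
# ... while a reversible strategy with frequent switching settles immediately.
reversible = expected_soma_fractions((0.2, 0.6, 0.2), (0.2, 0.6, 0.2), 6)
```

In this example the reversible strategy holds an expected somarole fraction of one half from the first round onwards, whereas the irreversible strategy follows $1-{0.95}^{k}$ after round $k$ and reaches only about $0.26$ after six rounds.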
Evolution of irreversible strategies at sizes smaller than 64 cells is possible for ${c}_{s\to g}>{c}_{g\to s}$. For instance, at ${c}_{s\to g}=2{c}_{g\to s}$ some irreversible strategies were found to be optimal at the maturity size ${2}^{5}=32$ cells, see Figure 4B. However, irreversible strategies were found only in a narrow range of cell differentiation costs, and the fraction of composition effect profiles that allow the evolution of irreversible differentiation there was quite low, about 1%. The evolution of irreversible strategies at such small maturity sizes becomes likely only at extremely unequal costs of transition between germ and soma roles, ${c}_{s\to g}\gg {c}_{g\to s}$, see Figure 4C. Hence, for irreversible somatic differentiation to evolve, the organism size should exceed a threshold of roughly 64 cells.
Irreversible somatic differentiation can also evolve when cell differentiation is risky
In our main model, we considered differentiation costs in a specific form of cell division delay. However, the process of cell differentiation may impact the organism development in another way. Differentiation requires modifications in DNA regulation, which in turn poses a risk of dysregulation resulting in an emergence of selfish mutants that could for example cause cancer. The disposable soma theory suggests that cells performing vegetative functions form separate lineages to contain emerging mutations and prevent them from passing to the next generations of organisms. In line with this hypothesis, we also considered a model of risky cell differentiation, where the transition between germ and soma roles incurs a risk of getting cancer that kills the entire organism, see Appendix 6.
The results obtained with a model of risky differentiation are very similar to the outcomes of our main model, where cell differentiation causes a delay, see Figure 5. In both models, irreversible differentiation only evolves if cell differentiation does not come for free but brings costly side effects (delay or risk). Also, in both models irreversible differentiation is prevalent when costs of soma-to-germ transitions are high; reversible differentiation is prevalent when costs of both transitions are low; and no differentiation is prevalent when costs of germ-to-soma transitions are high, see Figure 2A–C.
Discussion
The vast majority of cells in the body of any multicellular being contain enough genetic information to build an entire new organism. However, in a typical metazoan species, very few cells actually participate in organism reproduction: only a limited number of germ cells are capable of doing so. The other cells, called somatic cells, perform vegetative functions but do not contribute to reproduction, so that somatic differentiation is irreversible. We asked what explains the success of such a specific mode of organism development. We theoretically investigated the evolution of irreversible somatic differentiation with a model of clonally developing organisms taking into account benefits provided by somarole cells, costs arising from cell differentiation, and the effect of the raw organism size.
Our key findings are:
The evolution of irreversible somatic differentiation is inseparable from costly cell differentiation or risky cell differentiation.
For irreversible somatic differentiation to evolve in organisms with synchronous cell division, somatic cells should be able to contribute to the organism performance already when their numbers are small.
Only large enough organisms tend to develop irreversible somatic differentiation.
According to our results, cell differentiation costs are essential for the emergence of irreversible somatic differentiation, see Figure 2A. The costs punish strategies with a high rate of cell differentiation. As a result, irreversible strategies gain an advantage because their overall differentiation rate is low, see Figure 2D, and somarole cells do not differentiate at all. Most models focus on traits that lead to benefits for the organism, while the costs of cell differentiation are rarely considered. For cells in a multicellular organism, differentiation costs arise from the material, energy, and time needed to produce components that are necessary for the performance of the differentiated cell but were absent in the parent cell. For instance, in filamentous cyanobacteria, nitrogen-fixing heterocysts develop a much thicker cell wall than the parent photosynthetic cells had. Also, reports indicate that between 23% (Ow et al., 2008) and 74% (Sandh et al., 2014) of the proteome changes its abundance in heterocysts compared with photosynthetic cells. Similarly, changes in the protein composition in the course of cell differentiation were found during the development of stalk and fruiting bodies of Dictyostelium discoideum (Bakthavatsalam and Gomer, 2010; Czarna et al., 2010).
An alternative to differentiation costs in terms of slower growth is a model with risky differentiation, for which we found similar patterns, see Figure 5. These results indicate that the exact mechanism of the differentiation costs does not play a major role in the evolution of irreversible somatic differentiation.
Our model demonstrates that irreversible somatic differentiation is more likely to evolve when a few somarole cells are able to provide a substantial benefit to the organism, see Figure 3. Volvocales algae demonstrate that a significant contribution by small numbers of somatic cells might indeed be found in a natural population: in Eudorina illinoisensis, only four out of thirty-two cells are vegetative (Sambamurty AVSS, 2005) (somarole in our terms). This species has developed some reproductive division of labour, and a fraction of only $1/8$ of vegetative cells is sufficient for colony success. Thus, it seems possible that highly efficient somarole cells open the way to the evolution of irreversible somatic differentiation. Several patterns of how cells provide benefits to an organism have been previously considered (Michod, 2007; Willensdorfer, 2009; Rossetti et al., 2010; Rueffler et al., 2012; Cooper and West, 2018; Yanni et al., 2020). The majority of papers focus on the resource allocation toward different tasks in each cell in an organism and on how divergent different cells can be. In our model, we assume that germrole and somarole cells differ in function and focus on the relationship between the number of somarole cells and their impact, for example the character of their interactions. While the ${F}_{\text{comp}}$ curves we found exhibit a convex-like shape, see Figure 3A,B, this finding has a different nature from the convex trade-off between fertility and viability found in the models of cell differentiation (Michod, 2007).
Our model shows that irreversible somatic differentiation does not evolve if the organism size is small, see Figure 4A. The maturity size plays an important role in an organism’s life cycle (Amado et al., 2018; Erten and Kokko, 2020): large organisms have potential advantages to optimize themselves in multiple ways, such as to improve growth efficiency (Waters et al., 2010), to avoid predators (Matz and Kjelleberg, 2005; Fisher et al., 2016; Hiltunen and Becks, 2014), to increase problem-solving efficiency (Morand-Ferron and Quinn, 2011), and to exploit the division of labour in organisms (Carroll, 2001; Matt and Umen, 2016). Moreover, the maximum size has been related to the reproduction of the organism from the onset of multicellularity in Earth’s history (Ratcliff et al., 2012). Our results suggest that the smallest organism able to evolve irreversible somatic differentiation should typically be about 32–64 cells (unless the cost of soma-to-germ differentiation is extremely large and the cost of the reverse is low). This is in line with the pattern of development observed in Volvocales green algae. In Volvocales, cells are unable to move (vegetative function) and divide (reproductive function) simultaneously, as a unique set of centrioles is involved in both tasks (Wynne and Bold, 1985; Koufopanou, 1994). Chlamydomonas reinhardtii (unicellular) and Gonium pectorale (small colonies of up to 16 cells) perform these tasks at different times. They move towards the top layers of water during the day to get more sunlight. At night, however, these species perform cell division and/or colony reproduction, slowly sinking down in the process. Among larger Volvocales, however, a division of labour begins to develop. In Eudorina elegans colonies, containing 16–32 cells, a few cells at the pole have reduced chances to give rise to an offspring colony (Marchant, 1977; Hallmann, 2011). In Pleodorina californica, half of the 128-celled colony is formed of smaller cells, which are totally dedicated to colony movement and die at the end of the colony life cycle (Kikuchi, 1978; Hallmann, 2011). In Volvox carteri, most of a 10,000-cell colony is formed by somatic cells, which die upon the release of offspring groups (Hallmann, 2011).
In the majority of our tests, we used a maturity size of ${2}^{10}=1024$ cells. This is significantly larger than the minimal size necessary for the evolution of irreversible somatic differentiation. However, a body size of the order of 1000 cells attracts attention because organisms of very diverse degrees of complexity are observed at this scale: from undifferentiated colonies (ocean algae Phaeocystis antarctica), to intermediary life forms (slime mold slugs), to paradigm multicellular organisms (higher Volvocales and the nematode Caenorhabditis elegans).
The model presented in our study focuses on the transition from colonial life forms to multicellular beings. Further development of complexity opens multiple new ways to optimize the life cycle. For example, a maternal organism can provide protection and nurture for offspring at their early stages of growth, as in V. carteri (10,000 cells), in which offspring colonies develop inside the parental organism. There, the rate of offspring growth depends mostly on the performance of the maternal organism and much less on the differentiation strategy of the offspring. Maternal protection may relax the conditions for the evolution of irreversible differentiation indicated in our study. How much these conditions can be relaxed is an open and very interesting question.
One of the most significant assumptions we made is the synchronicity of cell divisions even if division outcomes are different. This is only possible if cell actions are coordinated at the level of the organism – otherwise, cells that do not differentiate may complete their divisions before differentiating cells. When in the history of multicellularity such coordination emerged is an open question. However, in a number of rather simple species, synchronicity of cell divisions paired with cell differentiation is observed. One example is the green alga Eudorina illinoiensis – one of the simplest species demonstrating the first signs of reproductive division of labour, in which four out of 32 cells are differentiated (Sambamurty AVSS, 2005). Another example is the 128-celled alga Pleodorina californica, in which half of the cells are differentiated and yet the cell divisions are synchronous (Kikuchi, 1978). Even the size of the mature organism being a power of two indicates that cells do not divide independently, but that their actions are controlled at the level of the organism.
To probe the impact of cell division synchronicity, we developed a model with asynchronous cell division, where cell differentiation costs are paid individually by each differentiating cell, see Appendix 7. We found that the evolution of irreversible differentiation is significantly suppressed even under the most favourable conditions (${c}_{s\to g}\gg {c}_{g\to s}$) – the frequency of composition profiles promoting irreversible somatic differentiation is much smaller and the minimal maturity size required is larger.
Another assumption that shapes the results of our study is the static differentiation strategy: the probability of each division outcome does not depend on the stage of the life cycle. On the one hand, the static nature of the differentiation strategy puts irreversible differentiation at a disadvantage, as it creates a trade-off between the fraction of soma-role cells at the early stage of the life cycle and the number of germ-role cells at the end of the life cycle. On the other hand, a set of fully flexible dynamic differentiation strategies presents an efficient but hardly realistic solution to the life cycle optimization problem: at the first round of cell divisions, the organism converts to an all-soma state and remains so until the last round, when all cells convert back to the germ state. Theoretically, this strategy simultaneously provides the fastest possible development rate (100% soma-role cells during the life cycle) and the largest possible number of offspring (100% germ-role cells at the end of the life cycle). Still, we cannot provide an example of such a developmental program in nature. Nevertheless, the differentiation strategy of higher Volvocales is not static (Kirk, 2005), and the exploration of the vast space of dynamic differentiation strategies warrants further investigation.
We acknowledge that our discussion of natural examples of germ-soma differentiation relies heavily on Volvocales algae. This merely reflects the bias of the empirical literature on the evolution of germ-soma differentiation towards this group. We should note that our model is not a model of the Volvocales life cycle. Instead, we aim to answer the question about the emergence of irreversible somatic differentiation in a broad context without tailoring it to the features of a single group.
Our study originated from curiosity about the driving factors in the evolution of irreversible somatic differentiation: Why does the green alga Volvox from the kingdom Plantae shed most of its biomass in a single act of reproduction? And why, in another kingdom, Animalia, is the majority of body cells in most species outright forbidden to contribute to the next generation? Our results show which factors make the difference between the evolution of irreversible somatic differentiation and other strategies of development. One of these factors, the maturity size, is known in the context of the evolution of reproductive division of labour (Kirk, 2005). Another factor, the costs of cell differentiation, is discussed in a broader biological scope but is hardly acknowledged as a factor contributing to the evolution of organism development. Finally, the early contribution of soma-role cells to organism growth, even if they are few in number, is an unexpected outcome of our investigation that has also been overlooked so far. Despite the simplistic nature of our model (we did not aim to model any specific organism), all our results find confirmation in the Volvocales clade. Hence, we expect that the findings of this study reveal general properties of the evolution of irreversible somatic differentiation, independently of the clade where it evolves.
Appendix 1
Search for the evolutionarily optimal developmental program
Finding the population growth rate for a given developmental program
In Gao et al., 2019, we have shown that a population of organisms, which begin their life cycle from the same state but have a stochastic development, eventually grows exponentially with the rate $\lambda $ given by the solution of
$$\sum _{i}{P}_{i}{G}_{i}{e}^{-\lambda {T}_{i}}=1.\qquad (5)$$
Here, $i$ is the developmental trajectory – in our case, the specific combination of all cell division outcomes; ${P}_{i}$ is the probability that an organism's development will follow the trajectory $i$; ${T}_{i}$ is the time necessary to complete the trajectory $i$ – from a single cell to the maturity size of ${2}^{n}$ cells; ${G}_{i}$ is the number of offspring organisms produced at the end of developmental trajectory $i$, equal to the number of germ-role cells at the moment of maturity.
In order to find the population growth rate, we need to know ${G}_{i}$, ${T}_{i}$, and ${P}_{i}$ (how many offspring are produced, how long did it take to mature, and how likely is this developmental trajectory, respectively). The complete set of developmental trajectories is huge as it scales exponentially with the number of divisions $n$.
In our study, for each developmental strategy, we sampled $M=300$ developmental trajectories at random. To get each trajectory, we simulated the growth of a single organism according to the rules of our model. For each trajectory, the developmental time ${T}_{i}$ was computed as the sum of cell doubling times at each of the $n$ synchronous cell divisions, and the number of offspring ${G}_{i}$ was given by the count of germ-role cells at the end of development. The resulting ensemble of trajectories (with ${P}_{i}=1/M$) was plugged into Equation 5 to compute the population growth rate $\lambda $.
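The last step can be illustrated with a minimal sketch. It assumes that Equation 5 has the Euler-Lotka form $\sum _{i}{P}_{i}{G}_{i}{e}^{-\lambda {T}_{i}}=1$ with ${P}_{i}=1/M$; the function name `growth_rate` and the bisection solver are illustrative choices, not taken from the deposited code.

```python
import math

def growth_rate(offspring, times, iterations=200):
    """Solve (1/M) * sum_i G_i * exp(-lam * T_i) = 1 for lam by bisection.

    offspring: G_i, number of germ-role cells at maturity for each trajectory
    times:     T_i, developmental time of each trajectory (all positive)
    """
    if not any(g > 0 for g in offspring):
        raise ValueError("at least one trajectory must produce offspring")
    m = len(offspring)

    def f(lam):
        return sum(g * math.exp(-lam * t) for g, t in zip(offspring, times)) / m - 1.0

    # f decreases monotonically in lam, so first bracket the root, then bisect.
    lo, hi = -1.0, 1.0
    while f(lo) < 0:
        lo *= 2
    while f(hi) > 0:
        hi *= 2
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For a deterministic life cycle that yields two offspring per unit of time, the sketch recovers $\lambda =\mathrm{ln}2$, the growth rate of a population that doubles once per time unit.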
Finding the developmental program with the largest population growth rate
We assume that evolution occurs by growth competition between populations executing different developmental strategies. Strategies that provide a larger population growth rate will outgrow others. To find evolutionarily optimal strategies under given conditions, we screened through a large set of developmental strategies and identified the one with the maximal population growth rate $\lambda $. Since the probabilities of cell division outcomes sum to one (${g}_{gg}+{g}_{gs}+{g}_{ss}=1$ and ${s}_{gg}+{s}_{gs}+{s}_{ss}=1$), these probabilities can be represented as a point on two simplexes, one for the division of germ-role cells and one for the division of soma-role cells. Consequently, we chose the set of developmental strategies as a Cartesian product of two triangular lattices – one for division probabilities of germ-role cells (${g}_{gg},{g}_{gs},{g}_{ss}$) and one for soma-role cells (${s}_{gg},{s}_{gs},{s}_{ss}$). The lattice spacing was set to 0.1, so each of the two independent lattices contained $11\times 12/2=66$ nodes, and the whole set of developmental strategies comprised 66 × 66 = 4356 different strategies. For each of these strategies, the population growth rate $\lambda $ was calculated and the strategy with the largest growth rate was identified as evolutionarily optimal.
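The lattice of strategies described above can be enumerated in a few lines. This is a sketch only: the names `simplex_lattice`, `germ_nodes`, `soma_nodes`, and `strategies` are hypothetical, and the growth-rate evaluation of each strategy is left out.

```python
from itertools import product

def simplex_lattice(steps=10):
    """All triples (p_gg, p_gs, p_ss) on a grid of spacing 1/steps that sum to one."""
    return [(i / steps, j / steps, (steps - i - j) / steps)
            for i in range(steps + 1)
            for j in range(steps + 1 - i)]

germ_nodes = simplex_lattice()  # candidate (g_gg, g_gs, g_ss)
soma_nodes = simplex_lattice()  # candidate (s_gg, s_gs, s_ss)

# A developmental strategy is one node from each lattice.
strategies = list(product(germ_nodes, soma_nodes))
```

Given any scoring function that returns $\lambda $ for a strategy, the optimum would then be `max(strategies, key=score)`.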
In our investigation, parameters such as differentiation costs (${c}_{s\to g}$, ${c}_{g\to s}$) and maturity size (${2}^{n}$) were used as control parameters. In other words, we either fixed them at specific values or screened through a range of values to obtain a map (see Figures 2 and 3 in the main text). However, the parameters that control the shape of the composition effect profile (${x}_{0}$, ${x}_{1}$, $\alpha $, and $b$) were treated differently. For each combination of control parameters, we randomly sampled a number (between 200 and 3000) of combinations of these parameters. The thresholds ($0\le {x}_{0}\le {x}_{1}\le 1$) were sampled as a pair of independent random values from the uniform distribution $U(0,1)$. The contribution threshold ${x}_{0}$ was set to the minimum of the pair, and the saturation threshold ${x}_{1}$ was set to the maximum. The contribution synergy ($\alpha >0$) corresponds to a concave shape of the profile at $\alpha <1$ and to a convex shape at $\alpha >1$. Therefore, ${\mathrm{log}}_{10}(\alpha )$ was sampled from the uniform distribution $U(-2,+2)$, so that the profile has an equal probability of being concave or convex. Finally, the maximum benefit ($0\le b<1$) was sampled from a uniform distribution, $U(0,1)$. For each tested combination of control parameters, we found the optimal developmental strategy for every sampled profile. We then classified these as irreversible somatic differentiation, reversible somatic differentiation, or no somatic differentiation.
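The sampling of profile parameters can be sketched as follows. The function name `sample_profile` is hypothetical, and the range of ${\mathrm{log}}_{10}(\alpha )$ is assumed to be symmetric around zero (from -2 to +2), as implied by the requirement that concave and convex shapes are equally likely.

```python
import random

def sample_profile(rng):
    """Draw one composition effect profile (x0, x1, alpha, b)."""
    u, v = rng.random(), rng.random()   # two independent U(0, 1) values
    x0, x1 = min(u, v), max(u, v)       # contribution and saturation thresholds
    alpha = 10 ** rng.uniform(-2, 2)    # assumed symmetric range for log10(alpha)
    b = rng.random()                    # maximum benefit, 0 <= b < 1
    return x0, x1, alpha, b

rng = random.Random(0)                  # fixed seed only for reproducibility
profiles = [sample_profile(rng) for _ in range(200)]
```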
Appendix 2
Under costless cell differentiation, irreversible soma strategy cannot be evolutionarily optimal
In this section, we show that an irreversible strategy can never be an evolutionary optimum unless cell differentiation is costly. To do that, we first consider the deterministic dynamics of the expected composition of the organism. Then, for an arbitrary irreversible strategy, we identify a more advantageous reversible strategy which gives the same organism composition at the end of the life cycle but a higher number of soma-role cells during the life cycle.
In our model, the composition of the organism is governed by the stochastic developmental strategy and differs between organisms. Here, as a proxy for this complex stochastic dynamics, we consider the mathematical expectation of the composition. Assume that after $j$ cell divisions the fraction of soma-role cells is ${r}_{s}(j)$ and the fraction of germ-role cells is ${r}_{g}(j)=1-{r}_{s}(j)$, $j=1,\mathrm{\dots},n$, where $n$ is the maximal number of divisions. Then, the expected fractions of cells of the two types after the next cell division are
$$\begin{array}{rl}{r}_{s}(j+1)&=(1-{m}_{s}){r}_{s}(j)+{m}_{g}{r}_{g}(j),\\ {r}_{g}(j+1)&={m}_{s}{r}_{s}(j)+(1-{m}_{g}){r}_{g}(j),\end{array}\qquad (6)$$
where we introduced ${m}_{s}={s}_{gg}+\frac{{s}_{gs}}{2}$ and ${m}_{g}={g}_{ss}+\frac{{g}_{gs}}{2}$ – the probabilities that the offspring of a cell will have a different role. Naturally, for irreversible somatic differentiation ${m}_{s}=0$ and ${m}_{g}>0$, for no somatic differentiation strategies ${m}_{g}=0$ and ${m}_{s}$ is irrelevant, while the reversible differentiation class covers the rest. Equation 6 can be written in matrix form
$$\begin{pmatrix}{r}_{s}(j+1)\\ {r}_{g}(j+1)\end{pmatrix}=\begin{pmatrix}1-{m}_{s}&{m}_{g}\\ {m}_{s}&1-{m}_{g}\end{pmatrix}\begin{pmatrix}{r}_{s}(j)\\ {r}_{g}(j)\end{pmatrix}.\qquad (7)$$
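A newborn organism contains a single germ-role cell (${r}_{s}(0)=0,{r}_{g}(0)=1$); therefore, the expected composition of an organism after $j$ divisions is
$$\begin{pmatrix}{r}_{s}(j)\\ {r}_{g}(j)\end{pmatrix}={\begin{pmatrix}1-{m}_{s}&{m}_{g}\\ {m}_{s}&1-{m}_{g}\end{pmatrix}}^{j}\begin{pmatrix}0\\ 1\end{pmatrix}.\qquad (8)$$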
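The matrix has two eigenvalues: 1 and $1-{m}_{g}-{m}_{s}$, with associated right eigenvectors ${({m}_{g},{m}_{s})}^{T}$ and ${(1,-1)}^{T}$, respectively. Hence, the expected composition after $j$ divisions can be obtained in the explicit form
$$\begin{pmatrix}{r}_{s}(j)\\ {r}_{g}(j)\end{pmatrix}=\frac{1}{{m}_{g}+{m}_{s}}\begin{pmatrix}{m}_{g}\\ {m}_{s}\end{pmatrix}-\frac{{m}_{g}{(1-{m}_{g}-{m}_{s})}^{j}}{{m}_{g}+{m}_{s}}\begin{pmatrix}1\\ -1\end{pmatrix}.\qquad (9)$$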
For an arbitrary irreversible somatic differentiation strategy $D$, ${m}_{s}=0$, the expected fraction of soma-role cells changes as
$${r}_{s,D}(j)=1-{(1-{m}_{g})}^{j},\qquad (10)$$
which is a monotonically increasing function of the number of cell divisions $j$, see the green line in Fig. B. In a life cycle involving $n$ cell divisions, the fraction of soma-role cells at the end of the life cycle is ${r}_{s,D}(n)=1-{(1-{m}_{g})}^{n}$.
Now, we consider another developmental strategy ${D}^{\prime}$ with reversible somatic differentiation in which ${m}_{g}^{\prime}={r}_{s,D}(n)$ and ${m}_{s}^{\prime}=1-{r}_{s,D}(n)$. Using ${m}_{g}^{\prime}+{m}_{s}^{\prime}=1$ in (9), it can be shown that the expected fraction of soma-role cells in ${D}^{\prime}$ after the very first cell division is exactly ${r}_{s,D}(n)$ and stays constant thereafter, see the orange line in Fig. B. Thus, the number of offspring produced is the same for both developmental strategies.
If cell differentiation is costless (${d}_{s}={d}_{g}=0$), then the cell doubling time depends only on the fraction of soma-role cells. As all soma-role cells are then present already after the first cell division, organisms following the reversible strategy ${D}^{\prime}$ will grow faster than organisms using the irreversible strategy $D$ at any stage of organism development, independently of the choice of the composition effect profile (${F}_{\text{comp}}$). At the end of the life cycle, both strategies have the same expected number of offspring. Therefore, under costless cell differentiation, for any irreversible strategy, we can find a reversible strategy that leads to a larger population growth rate.
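The comparison between $D$ and ${D}^{\prime}$ can be checked numerically with a short sketch. It iterates the expected composition from a single germ-role cell, using the switching probabilities ${m}_{g}$ and ${m}_{s}$ defined above; the function name and the example values $n=10$, ${m}_{g}=0.1$ are arbitrary illustrations.

```python
def expected_soma_fraction(m_g, m_s, n):
    """Expected soma-role fraction r_s(j) for j = 0..n, starting from one germ-role cell."""
    r_s, r_g = 0.0, 1.0
    fractions = [r_s]
    for _ in range(n):
        # Soma-role cells keep their role with probability 1 - m_s;
        # germ-role offspring switch to the soma role with probability m_g.
        r_s, r_g = r_s * (1 - m_s) + r_g * m_g, r_s * m_s + r_g * (1 - m_g)
        fractions.append(r_s)
    return fractions

n, m_g = 10, 0.1                                            # illustrative values
irreversible = expected_soma_fraction(m_g, 0.0, n)          # strategy D, m_s = 0
target = irreversible[-1]                                   # r_{s,D}(n)
reversible = expected_soma_fraction(target, 1 - target, n)  # strategy D'
```

In this example, `reversible` jumps to `target` after the first division and stays there, while `irreversible` approaches the same value only at the last division, so ${D}^{\prime}$ never has a smaller expected soma-role fraction than $D$.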
Appendix 3
Conditions promoting the evolution of reversible, irreversible, and no differentiation strategies
Appendix 4
Parameters of composition effect profiles promoting reversible, irreversible, and no differentiation strategies
Appendix 5
Evolution of irreversible somatic differentiation under various maturity sizes and unequal cell differentiation costs
Appendix 6
Model of risky cell differentiation
In the risky differentiation model, we assume that cell differentiation implies a risk of errors leading to defective cells (Aktipis et al., 2015). These cells act in their selfish interests, compromising the integrity of the organism. This leads to the death of the organism, very similar to the outcome of cancer in complex multicellular species.
The impact of a defective cell depends on the stage of the life cycle at which it appears. A defective cell that emerges during the first cell division will likely result in a non-viable organism. At the same time, a defective cell that emerges in the very last round of cell divisions is unlikely to affect the organism because its life cycle is about to end. To reflect this effect, we scaled the impact of a newly emerged defective cell by the number of cells already present in an organism. This way, the probability to get cancer is proportional to the frequency of cell differentiation events. The proportions of soma-role cells and germ-role cells that differentiate upon division in the total number of cell divisions are
where ${N}_{x\to yz}$ is the number of cell divisions at which a cell of role $x$ gives rise to a $y$ cell and a $z$ cell, and $Z={2}^{n}-1$ is the total number of cell divisions during the growth of an organism with maturity size ${2}^{n}$.
We define the probabilities of death caused by defective cells emerged in germ to soma and soma to germ transitions as
where ${\delta}_{g\to s}$ and ${\delta}_{s\to g}$ characterize the risk of cancer from a germ to soma and from a soma to germ transition. The transformation function $\mathrm{tanh}(x)$ is chosen to grow linearly at a small number of differentiation events but exponentially saturates to one if these events are numerous, see Fig. F.
We assume that an organism successfully completes its life cycle and produces offspring only if no cancer emerges in the course of its growth. The probability of this at each round of cell division is
Otherwise, the organism dies and does not produce any offspring. There are no delay differentiation costs in this model $({c}_{s\to g}={c}_{g\to s}=0)$.
A typical feature of cancer cells in complex organisms is a high cell division rate. This has a large impact on complex animals, in which the division rate of regular cells is low and life cycles are long. However, the organisms in the focus of our study have very short life cycles (a few rounds of cell divisions), and even the regular cells actively proliferate. Hence, the growth advantage of defective cells should have a much smaller impact on simple species. Therefore, in this model, we neglect the difference in division rates between defective and regular cells and keep cell divisions synchronous.
The probability of getting cancer depends on the frequency of cell differentiation events. An organism with a higher cell differentiation rate has a higher death probability, which leads to slower population growth.
Appendix 7
Evolution of irreversible somatic differentiation in a model with asynchronous cell division
Our original model features synchronous cell division. This comes from the assumption that differentiation costs are paid collectively by the whole organism. Here, we consider another option, where differentiation costs are paid individually by each cell. An immediate consequence is that cell division in such a model is asynchronous because differentiating cells take more time to divide.
In the asynchronous model, we model cell division as a random process occurring with the reaction rates
where $s$ and $g$ are the numbers of soma-role and germ-role cells in the organism, ${s}_{xy},{g}_{xy}$ are elements of the differentiation program $D$, ${F}_{\text{comp}}$ is the composition effect profile computed identically to the synchronous model, see Equation 3, and ${c}_{s\to g}$ and ${c}_{g\to s}$ are differentiation costs.
We use the Gillespie algorithm to find which kind of cell division occurs next and how much time it takes. Then the chosen cell division occurs once (the organism grows by a single cell). After that, the ${F}_{\text{comp}}$ value is updated to reflect the changed composition. Then the next cell division is sampled, and the process continues until the organism reaches the maturity size. This model is designed to be an asynchronous implementation of our ideas that remains close to our original model presented in the main text. Therefore, the rest of the simulation protocol remains the same.
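The simulation loop described above can be sketched as follows. The Gillespie step itself is the standard algorithm; everything model-specific is an assumption: the placeholder `toy_rates` ignores ${F}_{\text{comp}}$ and the differentiation costs and is not the rate law of the asynchronous model, and all function names are hypothetical.

```python
import random

def gillespie_step(rates, rng):
    """One Gillespie step: rates maps each reaction to its propensity (> 0)."""
    total = sum(rates.values())
    dt = rng.expovariate(total)       # exponential waiting time until the next event
    pick = rng.random() * total       # choose a reaction proportionally to its rate
    acc = 0.0
    for reaction, rate in rates.items():
        acc += rate
        if pick < acc:
            return reaction, dt
    return reaction, dt               # fallback against floating-point round-off

def toy_rates(g, s, g_probs=(0.8, 0.1, 0.1), s_probs=(0.0, 0.0, 1.0)):
    """PLACEHOLDER propensities: every cell divides at unit rate, no F_comp, no costs.

    Keys are (parent role, change in germ-role count, change in soma-role count).
    """
    g_gg, g_gs, g_ss = g_probs
    s_gg, s_gs, s_ss = s_probs
    rates = {
        ("g", +1, 0): g * g_gg, ("g", 0, +1): g * g_gs, ("g", -1, +2): g * g_ss,
        ("s", +2, -1): s * s_gg, ("s", +1, 0): s * s_gs, ("s", 0, +1): s * s_ss,
    }
    return {k: v for k, v in rates.items() if v > 0}

def grow(rates_fn, maturity_size, rng):
    """Grow one organism from a single germ-role cell, one division at a time."""
    g, s, t, steps = 1, 0, 0.0, 0
    while g + s < maturity_size:
        (_, dg, ds), dt = gillespie_step(rates_fn(g, s), rng)
        g, s, t, steps = g + dg, s + ds, t + dt, steps + 1
    return g, s, t, steps

g, s, t, steps = grow(toy_rates, 1024, random.Random(1))
```

Because every division adds exactly one cell, growing to 1024 cells takes 1023 sampled events regardless of the rates used.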
The computation time of the asynchronous model scales linearly with the number of cell divisions: it takes 1023 simulation steps to simulate the growth from 1 to 1024 cells. Therefore, it is computationally much more demanding than the synchronous model, which scales linearly with the number of cell generations: the same growth to 1024 cells needs only 10 steps there. Maps similar to Figure 2A–C are unavailable for the asynchronous model for computational reasons. Still, we calculated optimal differentiation strategies for a single combination of costs: ${c}_{g\to s}=0$, ${c}_{s\to g}=10$. Under these conditions, which favour the evolution of irreversible differentiation in the synchronous model, it is significantly suppressed in the asynchronous model, see Fig. G.
The reason behind the difference between the results for the synchronous and asynchronous models is the different performance of reversible strategies in these models. If the costs of soma differentiation are large enough, the expected period of cell division for a differentiating soma-role cell is longer than the length of the life cycle. As a result, instead of re-differentiating, soma-role cells become effectively terminally differentiated. In such a situation, the growth of the organism is determined by the propagation of germ-role cells and does not depend on the value of the soma differentiation costs.
The key to the success of reversible strategies in the synchronous model was the ability to develop large fractions of soma-role cells early on and to keep this fraction in the course of the life cycle. There, the fraction of soma-role cells is preserved by a dynamic equilibrium between differentiation in both directions, see Appendix 2. In the asynchronous model with high soma differentiation costs, soma-role cells do not divide and such a dynamic equilibrium does not exist. The fraction of soma-role cells is maintained differently here. If we denote the number of germ-role and soma-role cells at time $j$ (unlike Appendix 2, it is a continuous parameter here) as $g(j)$ and $s(j)$, respectively, then in the case of non-dividing soma-role cells (${c}_{g\to s}=0$, ${c}_{s\to g}\gg 1$, ${s}_{ss}=0$), the dynamics of the organism is given by
The solution of this system of equations with the initial condition of one germ-role and no soma-role cells is
Hence, the fraction ${r}_{s}(j)$ of soma-role cells is
The differentiation strategy considered above (${s}_{ss}=0$) is an extreme case where a dynamic equilibrium between cell differentiations is not possible. Still, Equation 17 demonstrates that a balance between germ-role and soma-role cells is achieved here as well. Therefore, in the asynchronous model with highly asymmetric differentiation costs, the reversible strategies keep all components that make them successful in the no-costs scenario: the early production of soma-role cells due to high differentiation rates, the necessary fraction of soma-role cells during the life cycle (Equation 17), and the overall fast growth of the whole organism, despite having non-dividing soma-role cells (Equation 16).
Note that in irreversible strategies, soma-role cells do not differentiate and therefore divide at a normal rate. Therefore, the characteristic trade-off of irreversible strategies between having more soma-role cells early and more germ-role cells later in the life cycle remains in place even in the asynchronous model. As a result, in this model, reversible strategies are not punished by asymmetric costs and outcompete irreversible ones.
Data availability
The code implementing our model is deposited at https://github.com/YuanxiaoGao/Evolutionofirreversiblesomaticdifferentiation (copy archived at https://archive.softwareheritage.org/swh:1:rev:9a1ea7c84f3041ebe3720e7837b28182912b5e00).
References

Cancer across the tree of life: cooperation and cheating in multicellularity. Philosophical Transactions of the Royal Society B: Biological Sciences 370:20140219. https://doi.org/10.1098/rstb.2014.0219

A mechanistic model for the evolution of multicellularity. Physica A: Statistical Mechanics and Its Applications 492:1543–1554. https://doi.org/10.1016/j.physa.2017.11.080

Division of labour and the evolution of extreme specialization. Nature Ecology & Evolution 2:1161–1167. https://doi.org/10.1038/s4155901805649

From zygote to a multicellular soma: Body size affects optimal growth strategies under cancer risk. Evolutionary Applications 13:1593–1604. https://doi.org/10.1111/eva.12969

Multicellular group formation in response to predators in the alga Chlorella vulgaris. Journal of Evolutionary Biology 29:551–559. https://doi.org/10.1111/jeb.12804

Reconciling the incompatible: N2 fixation and O2. The New Phytologist 129:571–609. https://doi.org/10.1111/J.14698137.1992.TB00087.X

Interacting cells driving the evolution of multicellular life cycles. PLOS Computational Biology 15:e1006987. https://doi.org/10.1371/journal.pcbi.1006987

Rapid transition towards the division of labor via evolution of developmental plasticity. PLOS Computational Biology 6:e1000805. https://doi.org/10.1371/journal.pcbi.1000805

One cell, two cell, red cell, blue cell: the persistence of a unicellular stage in multicellular life histories. Trends in Ecology & Evolution 13:112–116. https://doi.org/10.1016/S01695347(97)01313X

The Evolution of Multicellularity: A Minor Major Transition? Annual Review of Ecology, Evolution, and Systematics 38:621–654. https://doi.org/10.1146/annurev.ecolsys.36.102403.114735

Evolution of reproductive development in the volvocine algae. Sexual Plant Reproduction 24:97–112. https://doi.org/10.1007/s0049701001584

Division of labour and the evolution of multicellularity. Proceedings of the Royal Society B: Biological Sciences 279:1768–1776. https://doi.org/10.1098/rspb.2011.1999

The evolution of soma in the volvocales. The American Naturalist 143:907–931. https://doi.org/10.1086/285639

Do plants have a segregated germline? PLOS Biology 16:e2005439. https://doi.org/10.1371/journal.pbio.2005439

Off the hook – how Bacteria survive protozoan grazing. Trends in Microbiology 13:302–307. https://doi.org/10.1016/j.tim.2005.05.009

Quantitative shotgun proteomics of enriched heterocysts from Nostoc sp. PCC 7120 using 8-plex isobaric peptide tags. Journal of Proteome Research 7:1615–1628. https://doi.org/10.1021/pr700604v

Fragmentation modes and the evolution of life cycles. PLOS Computational Biology 13:e1005860. https://doi.org/10.1371/journal.pcbi.1005860

Differences in cell division rates drive the evolution of terminal differentiation in microbes. PLOS Computational Biology 8:e1002468. https://doi.org/10.1371/journal.pcbi.1002468

The evolutionary path to terminal differentiation and division of labor in cyanobacteria. Journal of Theoretical Biology 262:23–34. https://doi.org/10.1016/j.jtbi.2009.09.009

Distributions of reproductive and somatic cell numbers in diverse Volvox (Chlorophyta) species. Evolutionary Ecology Research 14:707.

A general allometric and life-history model for cellular differentiation in the transition to multicellularity. The American Naturalist 181:369–380. https://doi.org/10.1086/669151

Introduction to the Algae: Structure and Reproduction (book). Prentice-Hall, Incorporated.
Article and author information
Funding
Ministry of Science and ICT, South Korea (2020R1A2C1101894): Hye Jin Park
JRG Program: Hye Jin Park
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
YG, HJP, AT, and YP thank the Max Planck Society for generous funding. HJP was supported by the NRF grant funded by the Korea government (MSIT), Grant No. 2020R1A2C1101894, and by an appointment to the JRG Program at the APCTP through the Science and Technology Promotion Fund and Lottery Fund of the Korean Government. This work was also supported by the Korean Local Governments – Gyeongsangbuk-do Province and Pohang City.
Version history
 Preprint posted: January 19, 2021
 Received: January 20, 2021
 Accepted: September 23, 2021
 Version of Record published: October 13, 2021 (version 1)
Copyright
© 2021, Gao et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.