Cellular organization in lab-evolved and extant multicellular species obeys a maximum entropy law

  1. Thomas C Day  Is a corresponding author
  2. Stephanie S Höhn  Is a corresponding author
  3. Seyed A Zamani-Dahaj
  4. David Yanni
  5. Anthony Burnetti
  6. Jennifer Pentz
  7. Aurelia R Honerkamp-Smith
  8. Hugo Wioland
  9. Hannah R Sleath
  10. William C Ratcliff  Is a corresponding author
  11. Raymond E Goldstein  Is a corresponding author
  12. Peter J Yunker  Is a corresponding author
  1. School of Physics, Georgia Institute of Technology, United States
  2. Department of Applied Mathematics and Theoretical Physics, Centre for Mathematical Sciences, University of Cambridge, United Kingdom
  3. School of Biological Sciences, Georgia Institute of Technology, United States
  4. Quantitative Biosciences Graduate Program, Georgia Institute of Technology, United States
  5. Department of Molecular Biology, Umeå University, Sweden

Abstract

The prevalence of multicellular organisms is due in part to their ability to form complex structures. How cells pack in these structures is a fundamental biophysical issue, underlying their functional properties. However, much remains unknown about how cell packing geometries arise, and how they are affected by random noise during growth - especially absent developmental programs. Here, we quantify the statistics of cellular neighborhoods of two different multicellular eukaryotes: lab-evolved ‘snowflake’ yeast and the green alga Volvox carteri. We find that despite large differences in cellular organization, the free space associated with individual cells in both organisms closely fits a modified gamma distribution, consistent with maximum entropy predictions originally developed for granular materials. This ‘entropic’ cellular packing ensures a degree of predictability despite noise, facilitating parent-offspring fidelity even in the absence of developmental regulation. Together with simulations of diverse growth morphologies, these results suggest that gamma-distributed cell neighborhood sizes are a general feature of multicellularity, arising from conserved statistics of cellular packing.

Editor's evaluation

This work uncovers a simple but far-reaching statistical principle that describes the geometry of cell packing in snowflake yeast and green algae. It draws on ideas from granular physics to offer new insight into universal rules of multicellular geometry that are otherwise easily obscured by the cell-scale idiosyncrasies of the different biological systems.

https://doi.org/10.7554/eLife.72707.sa0

Introduction

The evolution of multicellularity was transformative for life on Earth, occurring in at least 25 separate lineages (Grosberg and Strathmann, 2007). The success of multicellular organisms is due in part to their ability to assemble cells into complex, functional arrangements. Self-assembly, however, is fundamentally subject to random noise (Zeravcic and Brenner, 2014; Szavits-Nossan et al., 2014; Damavandi and Lubensky, 2019) that affects the final emergent structure (Michel and Yunker, 2019). The physiology of multicellular organisms can depend sensitively on the geometry of cellular packing (Bi et al., 2015b; Drescher et al., 2016; Jacobeen et al., 2018b; Brunet et al., 2019; Schmideder et al., 2021), and such noise may therefore have direct consequences on organismal fitness. Understanding the evolution of multicellularity, and the subsequent evolution of multicellular complexity (Bell and Mooers, 1997), requires understanding the impact of random noise on multicellular self-assembly. How do organisms accurately assemble functional multicellular components in the presence of noise?

Recent work has shown that extant multicellular organisms can either suppress (Hong et al., 2016) or leverage (Haas et al., 2018) variability in the process of reliably generating structures, and their tissues can change function based on cellular packing geometry (Bi et al., 2015a). This occurs through a coordinated developmental process involving genetic (Davidson, 2001), chemical (Sampathkumar, 2020), mechanical (Deneke and Di Talia, 2018), and bioelectric (Levin, 2004) feedbacks between interacting cells. However, even with coordinated developmental processes, noise during self-assembly results in deviations from perfectly regular structures. Further, as these developmental processes have not yet evolved in nascent multicellular organisms, it is unclear how unregulated assembly can reliably result in reproducible packing geometries and multicellular structures.

Multicellular organisms also exhibit diverse growth morphologies; for example, cells can remain attached through incomplete cytokinesis (Bonner, 1998; Grosberg and Strathmann, 2007; Knoll, 2011), they can adhere through aggregative bonds (Claessen et al., 2014), and they can assemble multicellular groups through successive cell division within a confining membrane (Angert, 2005; Herron et al., 2019). These growth morphologies can have distinct intercellular connection topologies (Yanni et al., 2020), changing how randomness is manifested. For instance, groups that grow with persistent mother-daughter bonds maintain the same intercellular connections, ‘freezing’ in place any structural randomness that arises during reproduction. In contrast, cells in aggregates can rearrange, so their final structure emerges from a combination of reproduction and intercellular interactions and noise (Delarue et al., 2016; Hartmann et al., 2019). Further, the dimensionality of multicellular groups can vary, from filaments (Herrero et al., 2016) and quasi-two-dimensional sheets (Besson and Dumais, 2011; Brunet et al., 2019) to groups that grow equally in three dimensions (Ratcliff et al., 2012; Tang et al., 2020; Butterfield, 2016). While the impact of noise on systems in thermal equilibrium is well known to depend sensitively on spatial dimensionality (Mermin and Wagner, 1966; Hohenberg, 1967; Vivek et al., 2017), no such information is yet at hand for biological development, which is intrinsically out of equilibrium. The growth morphology, connection topology, and dimensionality therefore altogether determine a multicellular architecture. Randomness resulting from many sources, such as stochastic cell division, variability in cell growth, intercellular interactions, and more, subsequently occurs as perturbations to this idealized form. It would appear that noise manifests in a unique, context-dependent manner in each of these different multicellular systems.

Here, we provide experimental evidence that, rather than being context-dependent, fluctuations in cell packing geometry instead follow a universal distribution, independent of the presence or absence of developmental regulation. We quantify the distributions of cellular space in two different types of organisms: experimentally evolved multicellular yeast (Ratcliff et al., 2012) and wild-type multicellular green algae (Goldstein, 2015). In both cases, maximum entropy considerations (Aste and Di Matteo, 2008) (see inset box) accurately predict the cell packing distribution. Building on these observations, we use computational models of diverse prescribed growth rules, mimicking extant biological morphologies, to show that cells are ubiquitously packed according to the maximum entropy principle. Detailed analysis of the case of green algae shows that correlations, that is, the lack of structural randomness, produce deviations from maximum entropy predictions, but that even a relatively small amount of randomness is sufficient to generate cellular packings that largely follow maximum entropy predictions. Next, we explore the evolutionary consequences of cell packing. We use the cell packing distribution to predict the distribution of snowflake yeast group sizes, an emergent multicellular trait that arises from cell crowding (Jacobeen et al., 2018b). Then, we use a theoretical analysis to show that the effects of fluctuations in intercellular space on the motility of green algae are small. These findings together suggest that, rather than impeding innovation, fluctuations in cell packing are highly repeatable, and may play a fundamental role in the origin and subsequent evolution of multicellular organisms.

Results

Maximum entropy

Within statistical physics, the maximum entropy principle relates randomness in low-level units (e.g. cells) to the properties of the assembly (e.g. a multicellular group). It works by enumerating all low-level configurations that conform to a set of constraints. Any particular group-level property can be generated by many different low-level configurations, but some group-level properties may correspond to more low-level configurations than others. Those that are generated by many configurations are more likely to be observed than those that correspond to relatively few configurations; in this way, the maximum entropy principle allows one to calculate the probability of observing different group properties, given a set of constraints. Multicellular groups obey a simple but universal constraint: each group has some total volume, V. This volume can be divided into N pieces, where N is the total number of cells. Each piece is associated with a particular cell, and the N pieces must sum to the total volume of the group, V=ivi, for i=1,2,,N. Using this constraint, and assuming no correlations, one can predict the most likely distribution of volumes for the N pieces. This approach has been successfully used to predict the distribution of free volumes within granular materials and foams (Aste and Di Matteo, 2008; Katgert and van Hecke, 2010). Here, we use it to predict the distribution of cellular free volumes in the absence of spatial correlations in cell positions.

Consider the ensemble of all possible cellular configurations in a simple group. As first derived by Aste and Di Matteo, 2008 and Aste et al., 2007 for granular materials, the maximum entropy probability distribution p(v) of cell neighborhood volumes within V is the modified gamma distribution

(1) p(v)=kkΓ(k)(v-vc)k-1(v¯-vc)kexp(-kv-vcv¯-vc)

where v¯ is the mean cell neighborhood volume, vc is the minimum cell neighborhood volume, Γ(k) is the gamma function, and k(v¯-vc)2/σv2 is a shape parameter that is defined by vc, v¯, and the variance of the cell neighborhood volumes, σv2. This distribution is expected if cell neighborhood volumes are determined independently of each other (while still conforming to the total volume constraint). In other words, volumes must be set randomly; correlations between the size of separate volumes will lead to deviations from maximum entropy predictions. If this condition holds, then maximum entropy volume distribution predictions should be valid, regardless of other geometric or structural details. For example, maximum entropy statistics hold in granular materials, despite the fact that they must obey strict force and torque balance conditions (Aste and Di Matteo, 2008; Snoeijer et al., 2004; Bi et al., 2015a). Further, the same approach applies to groups with a constraint on total area or length; this does not change the result, and V can be replaced by A or L without other modifications. However, we would not expect this prediction to necessarily hold for confluent cell layers (i.e. packing fraction ϕ1), since in general cell size is a highly regulated process; when cells tile space, the cell neighborhood volume distribution will strongly correlate with the the cell size distribution.

In practice, we divide the total group volume or area into N pieces via a Voronoi tessellation. The size of the space associated with cell i includes the cell itself and the portion of intercellular space closer to its center than to the center of any other cell. As cells must have non-zero size, we therefore set vc to be the volume of a single cell without any intercellular space (or ac, the area of a single cell).

Experimental tests of multicellular maximum entropy predictions

To test whether different kinds of multicellular groups pack their cells according to the maximum entropy principle, we investigated cell packing in two different multicellular organisms. First, we used experimentally evolved ‘snowflake’ yeast (Ratcliff et al., 2012), a model system of undifferentiated multicellularity. Second, we used the green microalga Volvox carteri, a member of the volvocine algae that first evolved multicellularity in the Triassic (Starr, 1969; Herron et al., 2009).

Snowflake yeast

Snowflake yeast grow via incomplete cytokinesis, generating branched structures in which mother-daughter cells remain attached by permanently bonded cell walls (Figure 1A). New buds appear on ellipsoidal cells at a polar angle θ=42%±23% and azimuthal angle ϕ that is randomly distributed (ϕ=180%±104%, Figure 1—figure supplement 1). Therefore, even though snowflake yeast structure is limited to configurations in which daughter cells remain attached at their base to their mother, these new cells bud in random orientations. Each snowflake yeast cluster has a different specific configuration of cells; the ensemble of snowflake clusters will include many different configurations. We expect that this structural randomness produces predictable distributions of cellular neighborhood volumes. Conversely, if there are strong correlations in the locations of daughter cells, then we will observe deviations from maximum entropy predictions regarding the cell neighborhood volumes.

Figure 1 with 1 supplement see all
Cell packing in two multicellular species.

(A) Cross-section of a multicellular yeast organism, which grows with persistent intercellular bonds. Scalebar is 5 m. The inset shows a smaller section, with ellipsoidal fits to individual cells along with their corresponding Voronoi polyhedra. Black overlays indicate the connection topology between yeast cells; not all connections are labeled. (B) Darkfield microscopy image of Volvox carteri, scalebar is 100µm. Inset: a small piece of the Voronoi-tessellated surface; black points are somatic cell positions. (C) Distributions of Voronoi polyhedron volumes as a function of cell size normalized by average size vc for snowflake yeast. In orange is the histogram for all cells; the other three distributions correspond to different subsections of Voronoi volumes. The cells were grouped into spherical shells with radius R and width ΔR from the cluster center of mass. Shown are shells with edges [0,6.2], [6.2,9.7], and [9.7,20.4]. Black lines are maximum entropy predictions. (D) Distributions of solid angles subtended by Volvox somatic cells divided by a minimum solid angle Ωc. Solid black lines are the maximum entropy predictions. The top row shows the histogram for all cells in green and a subsection of correlated areas in gray. Bottom row illustrates the subsectioning process: blue polygon is the center of the subsectioned region. Only the Voronoi polygons, i.e. not the somatic cells, are shown for clarity. (E,F) Empirical cumulative distribution function vs entropic predictions for all distributions shown in c,d. The dashed black line represents hypothetical perfect agreement between observation and prediction.

Figure 1—source data 1

Experimental data files enumerating the cell centers positions for each organism sampled.

The folders are subdivided into those on snowflake yeast (from SEM studies) and Volvox (from lightsheet studies). Each subdirectory contains an explanatory README file.

https://cdn.elifesciences.org/articles/72707/elife-72707-fig1-data1-v1.zip

To determine the distribution of cell neighborhood volumes, we first must measure the position of every cell in a cluster. It is difficult to image individual cells within snowflake yeast clusters due to excessive light scattering. Instead, we used a serial block face scanning electron microscope equipped with a microtome to scan and shave thin (50nm) layers off a resin block with embedded yeast clusters with stained cytoplasms. This process allowed us to determine the 3D structure of N=20 snowflake yeast clusters and locate cell centers with nanometer precision.

We define the group volume as the smallest convex hull that surrounds all cells in the cluster and computed the 3D Voronoi tessellation of cell centers within that (Figure 1A). The distribution of cellular Voronoi volumes closely matched the predicted k-gamma distribution (Figure 1C, k=2.88). This agreement is quantified via ‘P-P plots’ of the empirical cumulative distribution function (CDF) plotted against the predicted k-gamma CDF. We find a root-mean-square residual rRMS=(F(v)-Fi)2=0.02, where Fi is the empirical CDF and F(v) is the predicted k-gamma CDF.

The influence of the convex hull on these results was investigated by using an alternative procedure in which the Voronoi volumes were binned into shells centered at the cluster’s center of mass (Figure 1C and E). We binned cells into shells with shell edges of [0,6.2], [6.2,9.7], and [9.7,20.4] away from the center of mass. We found that the distribution of Voronoi volumes within each shell matched the predicted k-gamma distribution, with rRMS={0.037,0.020,0.014}, k={3.45,3.08,4.63} in the shells shown in Figure 1C(ii-iv).

Volvocine algae

To test if cell neighborhood volumes in extant multicellular organisms are consistent with maximum entropy cell packing predictions, we examined cell packing within the green microalgae Volvox carteri. Development in V. carteri, which evolved over millions of years, is highly regulated, occurring through a stereotyped morphological progression (Kirk, 2005). V. carteri embryos arise as a spherical cellular monolayer from palintomic cell divisions with incomplete cytokinesis, which leaves the cells attached via cytoplasmic bridges. These bridges disappear when ECM is secreted by the cells, filling the entire sphere, and eventually moving the cells apart. The approximately 1000 somatic cells remain embedded on the surface of a translucent sphere of extracellular matrix (Figure 1B). While six-fold coordination is the most frequent local arrangement of somatic cells, the fact that the cells are embedded in a surface with spherical topology requires there to be ‘defects’ with differing coordination number (e.g. 5,7), and these are found interspersed around the spheroid. Thus, despite their developmental regulation, somatic cells exhibit a degree of disorder with respect to coordination number. From a physics perspective, the local hexatic order in the somatic cell arrangement is low (see Materials and methods).

To determine the distribution of Volvox cell neighborhood sizes, we imaged somatic cells using their chlorophyll autofluorescence in a light sheet microscope. Since the somatic cells are arranged around a surface embedded in 3D space, we constructed a 2D Voronoi tessellation of somatic cells on the surface. Each organism imaged had a different size, and therefore had a different mean Voronoi area A. To compare distributions across organisms, we removed the systematic area differences by recording the solid angle Ωi=4πAi/S subtended by each somatic cell, where S=iAi is the total surface area of the organism. We found that the k-gamma distribution largely matched the distribution of solid angles (Figure 1D, k=2.40, rRMS=0.04). However, there are systematic deviations between the data and maximum entropy predictions (Figure 1F).

We next investigated if maximum entropy predictions are more accurate within subregions with similar mean solid angles; specifically we examine regions whose mean is Ω=0.0185±0.0003, obtained across six organisms. The distribution of Voronoi solid angles within these subregions closely follows the k-gamma distribution (Figure 1F, k=10.66, rRMS=0.01). This observation suggests that while there are systematically correlated subregions of cells, within these subregions cells are largely arranged randomly. Thus, the organization of Volvox carteri somatic cells is consistent with maximum entropy predictions.

Simulations of different growth morphologies

We next used simulations to investigate the impact on cell packing of four different growth morphologies: growth via incomplete cell division (cf. snowflake yeast), cells distributed on a spherical surface (cf. Volvox), aggregation, and palintomy. The goal of these studies was to determine if morphological details and constraints impact entropic packing using simplified models that capture the essential features of the growth and behavior of these varied organisms. These simulations captured a couple basic properties of multicellularity: organisms may grow in both two and three dimensions, and they may assemble with two different classes of bonds: bonds that are reformable, and bonds that are not reformable. Nonreformable, or ‘fixed’ bonds (like those formed from incomplete cytokinesis) are common in plants, algaes, and fungi; on the other hand, reformable bonds (such as those formed via sticky proteins) are common in animal tissues and cellular aggregates, like biofilms and slime molds. Finally, palintomy, or growth confined inside a maternal membrane, is also common from algae to animals.

First, we performed geometric simulations of multicellular groups that grow via incomplete cell division; these simulations were inspired by previous simulations of snowflake yeast (Jacobeen et al., 2018a; Jacobeen et al., 2018b). Daughter cells bud from mother cells with experimentally determined polar angle and random azimuthal angle, and remain attached to mother cells with rigid bonds. We ran 9, 100 simulations starting from a single cell, each of which underwent seven generations of division, and calculated the Voronoi tessellation of the final structure from each simulation. The distribution of Voronoi volumes closely matched the k-gamma distribution across four orders of magnitude (Figure 2A, k=2.26, rRMS=0.007).

Figure 2 with 3 supplements see all
Entropic packing is a general feature of simple multicellularity.

We simulated four different growth morphologies: (A) Tree-like groups formed with rigid, permanent bonds between cells, (B) surface-attached cells located on a spherical surface, (C) aggregates formed with attractive ‘sticky’ interactions, and (D) groups formed by rapid cell division within a maternal membrane. In all subfigures, left panel shows the predicted and observed probability distributions, and right panel plots the observed cumulative distribution vs. the expected cumulative distribution. Histogram bars represent measured Voronoi volume distribution in simulations, and black solid line represents the maximum entropy prediction. Maximum entropy predictions accurately described the distribution of cellular volumes/areas, despite their varying mechanisms of group formation (rRMS0.01).

Figure 2—source data 1

Simulation data (enumerating the cell center positions) for the six classes of numerical studies; aggregation, apoptosis, polydispersity, snowflake yeast growth, tree-like growth, and Volvox growth.

Each subdirectory contains an explanatory README file.

https://cdn.elifesciences.org/articles/72707/elife-72707-fig2-data1-v1.zip

Inspired by Volvox, we simulated cells distributed across the two-dimensional surface of a sphere through a random Poisson point process. We completed 10 simulations, each with 1000 cells, and computed the distribution of solid angles subtended by Voronoi cells. As shown in Figure 2B, the distribution of Voronoi solid angles is consistent with maximum entropy predictions (k=9.29, rRMS=0.009).

Next, we simulated organisms that stick together via reformable cell-cell adhesions, a mechanism of group formation that is common in biofilms and extant aggregative multicellular life (Claessen et al., 2014) (i.e. Dictylostelium and Myxococcus; Figure 2C). In these simulations, multicellular aggregates were grown from a single cell. Seven generations of cell division occurred, in which new cells appear on the surface of existing cells at random positions, and steric interactions force cells to separate after division and occupy space. Aggregative bonds were modeled through harmonic interactions of the cell centers. The observed Voronoi volume distributions were consistent with maximum entropy predictions (k=7.84 and rRMS=0.007).

Finally, we modeled cells undergoing palintomic division within a maternal cell wall, as is common in green algae (Lürling and Van Donk, 1997; Boraas et al., 1998; Ratcliff et al., 2013; Fisher et al., 2016; Herron et al., 2019), and is reminiscent of both baeocyte production in Stanieria bacteria (Angert, 2005) and neoproterozoic fossils of early multicellularity (Xiao et al., 1998; Figure 2D). The details of these simulations remained similar to the simulations of aggregative multicellularity, with the important difference being that instead of harmonic interactions between cell centers enforcing groups to stay together, cells interacted with a spherical maternal wall acting as a corral. The Voronoi volume distributions for these simulations were also consistent with maximum entropy predictions (k=15.16 and rRMS=0.013).

Taken together, the results of these simulations suggest that a broad distribution of cell neighborhood sizes is a general feature of multicellular growth morphologies. In particular, when cell locations are random under these rules, cell neighborhood size distributions closely follow the k-gamma distribution.

The role of spatial correlations

While we have shown that the distribution of cell neighborhood volumes closely follows the k-gamma distribution in two very different organisms, we have also seen that in some cases maximum entropy predictions are more accurate in sub-sections of an organism than across its entirety. For instance, in Volvox we observed that rRMS is much smaller within subregions with similar mean solid angles than across the whole organism. This observation suggests that correlations exist in the arrangement of V. carteri somatic cells. Such a correlation is not wholly unexpected; V. carteri have a well-known anterior-posterior polarity which they use to orient and swim toward light sources (Drescher et al., 2010b). This polarity may affect the somatic cell packing distribution in different subregions of the organism, leading to deviations from maximum entropy predictions.

The spatial correlations in the cellular areas in Volvox were studied first by plotting a 3D heatmap of Voronoi solid angle sizes (Figure 3A). It is apparent that extended spatial regions have well-defined and non-random mean Voronoi solid angles. We quantified this feature by calculating the spatial correlation function C(Q) of the solid angle

(2) C(Q)=(Ω-Ω)YQσΩσYQ,

where YQ=J(Q)-1j(Ωj-Ω) is the average deviation of the solid angle of a given polygon’s neighbors at a neighbor distance Q from the mean. Here, the number of neighbors is J(Q), a function of Q, which enumerates the network distance from the polygon of interest (i.e. Q=1 calls the nearest neighbors, of which there are J(1), Q=2 calls the next nearest neighbors, of which there are J(2), and so on). The standard deviation of the solid angle across the population is σΩ, and σYQ is the standard deviation of YQ across the population. We find that Volvox Voronoi solid angles are positively correlated at distances as large as Q=10 (Figure 3A). This analysis suggests that there are systematic differences in Volvox group structure in different spatial regions. We therefore should expect to observe deviations from the k-gamma distribution, which was derived under the assumption that there are no correlations in the division of space among cells.

Correlations lead to deviations from maximum entropy predictions in Volvox carteri.

(A) Correlation function of Voronoi polygon areas vs. network neighbor distance Q. Green circles represent all experimental Volvox data. Lines indicate the same correlation function calculated in subsections of size Q0={6,10,14,18}. Inset: visualization of spatial correlations of solid angle; one Volvox’s Voronoi tessellation is displayed with a three-color heatmap corresponding to polygons with areas smaller than (light gray), within (gray) and larger (black) than one standard deviation of the mean. Scale bar is 200µm. (B) PP plots for the observed vs predicted cumulative distribution function. In green is the Volvox distribution for all cells before corrections for correlations. A selection of differently-sized subsections is also plotted, corresponding to sizes Q0={6,10,14,18}. Arrow indicates direction of increasing Q0 value. (C) Root-mean-square residual deviation from maximum entropy predictions as a function of subsection size, as a function of nearest neighbor number Q0. As the subsection size increases (including more and more uncorrelated Voronoi areas), the deviation from predictions first decreases until Q0=6, then increases.

Figure 3—source data 1

Used in this figure is specifically Volvox cartesian cell center position measured from lightsheet microscopy.

https://cdn.elifesciences.org/articles/72707/elife-72707-fig3-data1-v1.zip

A natural question is whether maximum entropy predictions are more accurate within correlated subregions of an organism. We measured the Voronoi distribution in subregions with similar mean solid angles across six organisms and, for each subregion, a central node and its neighbors up to Q0 were identified. We varied Q0 from 3 (corresponding to, on average, 38±2 cells in the subregion) to Q0=20 (928±10 cells on average in the subregion) to measure the Voronoi solid angle distributions in subregions of different sizes. Spatial correlations were weaker within smaller domains (Figure 3A), and deviations from maximum entropy predictions smaller as well (Figure 3B) with the minimum rRMS at Q0=6 (Figure 3C, average of 133±5 cells). These observations suggest that while there are systematic correlations between subregions, cell neighborhood sizes are largely randomly distributed within subregions.

The crucial role of randomness

How much randomness is necessary for the k-gamma distribution to predict cell neighborhood size distributions? Our analysis of the solid angle distribution of Volvox cells demonstrates that maximum entropy principle predictions are relatively accurate (rRMS=0.04) even in the presence of some spatial correlations. However, it is unclear how much randomness is necessary for the k-gamma distribution to be predictive. Developmental processes are known to incur random effects (Hong et al., 2016; Haas et al., 2018), but, at the same time, fluctuations must be sufficiently suppressed for complex structures to assemble. This dilemma is augmented because there are few methods for measuring the degree of randomness in developmental processes. Therefore, it is difficult to disentangle exactly which reproducible qualities and traits stem from regulated processes, and which ones stem from random processes. In this section, we address this gap by assessing the stability of maximum entropy packing distributions to correlative perturbations (Figure 4). We do so by simulating three different sources of correlations: (i) size polydispersity, (ii) precisely defined growth patterns, and (iii) coordinated cellular apoptosis, each of which is a common multicellular developmental process. In each scenario, we varied the relative strength of correlations and noise, and monitored how closely the cell neighborhood size distributions agreed with the k-gamma distribution via P-P plots and the rRMS. We do not attempt to exactly recapitulate naturally-occurring levels of randomness strength; rather, we investigate the limits of how well maximum entropy predictions can measure the impact of correlations.

Introducing correlations and structure can break the maximum entropy distribution.

In A-C are PP plots of the observed vs. predicted cumulative distribution function for three different simulations. The colors correspond to increasing levels of noisiness in the simulations, from red (strongest correlations/determinism) to blue (strongest noise). The dashed black line in each represents y=x, or exact predictive efficacy. (A) Aggregative groups with bimodal size polydispersity; noise is introduced by varying the probability that small cells reproduce into small or large cells, and vice versa. (B) Tree-like groups with persistent intercellular bonds that grow according to a growth plan modified by noise in cell placement. (C) Surface-bound groups with programmed cell death events that may be localized or randomly dispersed. (D) The root mean square deviation from predicted values for each simulation case. Circles are aggregative simulations from A, triangles are tree-like simulations from B, and squares are surface-bound simulations from C. Note: the discrete points in (B) arise because absent randomness in the cell locations the exact same cellular structure is achieved by every simulation. Therefore, the observed distribution of Voronoi volumes is noticeably discrete for this case. Upon adding randomness, the cellular structure is altered between successive simulations, and the distribution of Voronoi volumes becomes less discrete.

Figure 4—source data 1

This figure was generated from the data provided through simulations of aggregative growth with size polydispersity, tree-like growth with noise in cell position, and growth on a spherical surface with planned apoptosis.

https://cdn.elifesciences.org/articles/72707/elife-72707-fig4-data1-v1.zip

The impact of heritable size polydispersity was investigated by simulating aggregative groups consisting of large and small cells. All simulations were seeded with one small cell and one large cell. We then varied the probability ξ that a new cell is the same size as its mother from 0.5 to 1.0. When ξ=1, cells always produce offspring with the same radius; for ξ=0.5 it is equally likely that a small cell produces a small or large daughter (and vice versa for large cells). Therefore, groups with ξ=1 have correlated regions of cell size, but the degree of correlations decreases with decreasing ξ. While groups with ξ=1 deviate significantly from the k-gamma distribution (rRMS=0.09), we observed that even a small amount of randomness results in excellent agreement between simulated groups and the k-gamma distribution (in order from ξ=1 to ξ=0.5, rRMS={0.09,0.04,0.03,0.03,0.02}).

Next we investigated groups with varying amounts of noise on top of defined growth patterns. In these simulations, new cells bud in precise positions; the first daughter at the position (θ=0±η, ϕ=0±η in spherical coordinates), the second at θ=90±η, ϕ=0±η, the third at θ=90±η, ϕ=180±η, etc., where the noise is uniformly distributed with zero mean and width η. For η=0 (no noise), the distribution of Voronoi volumes was discontinuous, since cells could only access a finite number of local configurations. As expected, as η increases (η={0,5,30,60,90}), rRMS decreases (rRMS={0.07,0.05,0.02,0.02,0.02}).

Finally, we investigated groups with localized and random cell death. In these simulations, 50 cells were confined to the surface of a sphere of unit radius following the protocol described above. One cell is randomly selected to die. Centered at this cell, a spherical region of radius R is defined, and then 10 cells in this region were randomly selected to die (and disappear, thereby not contributing to the Voronoi tessellation). For small R, cell death is highly localized, and thus spatially correlated. As R increases, cell death events become less localized, and therefore more random. We find that highly correlated cell death resulted in large deviations from maximum entropy predictions. Conversely, as R increases dead cells become less localized, the observed distribution becomes more accurately described by the k-gamma distribution; as R increases from R={0.75,1,1.25,1.5,1.75,2}, we find rRMS={0.10,0.07,0.03,0.02,0.01}.

In summary, absent randomness, spatial correlations lead to large deviations from the k-gamma distribution. Yet, with even a small amount of randomness, the k-gamma distribution holds significant predictive power. These simulations suggest that maximum entropy predictions are likely to be robust against even moderate correlations.

Parent-offspring fidelity via maximum entropy packing

So far, we have shown that randomness in cellular packing leads to highly predictable packing statistics. Here we show that maximum entropy statistics can directly impact the emergence of a highly heritable multicellular trait, organism size.

Prior work has shown that the size of snowflake yeast at fragmentation is remarkably heritable – higher, in fact, than the traits of most clonally reproducing animals (Ratcliff et al., 2015). The size to which snowflake yeast grow depends strongly on the aspect ratio of its constituent cells; more elongated cells allow the growth of larger clusters before strain from cellular packing causes group fragmentation (Jacobeen et al., 2018a; Jacobeen et al., 2018b). Recently, experiments with engineered yeast showed that this emergent multicellular trait, group size, was in fact more heritable than the underlying cellular trait upon which it was based (cellular aspect ratio), despite the fact that the mutations engineered in this system only affected cellular aspect ratio directly (Zamani-Dahaj et al., 2021). Simulations of multicellular chemotaxis observed a similar effect (Colizzi et al., 2020). While at first glance this may seem surprising, we show below that the high heritability of snowflake yeast group size arises from the direct dependence of size on the robust maximum entropy distribution of volume within groups.

Before addressing how fracture impacts the distribution of cluster sizes by impacting the number of cells within a group, we first must address fluctuations in size among clusters with the same number of cells. Given a number of cells N in the cluster, variation in cell packing fraction results in variation of the total volume. The arguments given above for predicting the distribution of individual cell volumes also applies to the distribution of total volume (Aste and Di Matteo, 2008); the distribution of total volume for clusters with the same number of cells should follow the k-gamma distribution. To generate enough clusters with identical N to test this prediction, we used simulations. We generated 3000 snowflake yeast clusters, each with 100 cells, and measured their total volumes. The distribution of volumes is consistent (rRMS=0.0043, k=23.0) with the k-gamma distribution as shown in Figure 5A. Further, these fluctuations in size are small compared to the differences in size gained via reproduction of cells or lost via fracture.

Maximum entropy cell packing generates a consistent and predictable life cycle in snowflake yeast.

(A) Distribution of total cluster volume for 3000 simulated snowflake clusters, each with N=100 cells. The total volume is divided by the minimum possible volume Nvc. The k-gamma distribution (black line, k=23.0,rRMS=0.0043) provides a good description of the data. (B) Distribution of all experimental Voronoi volumes (black circles) and the maximum entropy prediction (black line). The vertical black dashed line is the critical Voronoi volume v*=2.02 predicted from simulations. Orange filled region integrates up to p* the probability that any one cell occupies a volume less than v*. Insets: sequential brightfield microscope images of one yeast cluster undergoing group fragmentation. White arrowhead indicates location of fracture point. The images measure 150µm across from top to bottom. (C), Experimentally measured yeast cluster size distribution (solid black line) along with the prediction from weakest link theory (orange line). Gray region represents 1σ confidence bounds on the measured distribution from estimating the number of cells in a group. (D), Coefficient of variation in group radius (σ/R¯) for snowflake groups (S) and flocculating groups (F). Data from Pentz et al., 2020.

Figure 5—source data 1

This figure was generated from a combination of the simulation data and experimental data provided with Figures 1 and 2, as well as epxerimental size measurements included here from a particle multisizer, or Coulter Counter.

https://cdn.elifesciences.org/articles/72707/elife-72707-fig5-data1-v1.zip

To predict the group size distribution, we consider the probability of fragmentation via a weakest-link model of fracture. As the location of new cells is random (see Figure 1—figure supplement 1), each new cell has a chance of causing intercellular bond fracture. It was previously observed that bonds only break if cells are highly confined, that is they have smaller Voronoi volumes; otherwise flexible cellular branches simply bend (Jacobeen et al., 2018b). We model fracture as occurring when a cell’s Voronoi volume is below a critical value denoted by v* (Figure 5B) such that its motion is completely restricted. We measure v* from simulations that determine the maximum local packing density for groups with same cell size and shape distributions as seen in experiments (see Materials and methods). The probability that a particular cell is confined to a Voronoi volume vv* is the integral

(3) p=vcvp(v)dv.

As each cell in a cluster of N cells independently has probability p* of having vv* (and thus causing fracture), the probability of a cluster with N cells not fragmenting is

(4) P(N)=(1-p*)N

As we do not model the fate of products of fragmentation (i.e. the size of the separate pieces post-fracture), we expect the weakest link model to be more accurate for larger clusters than it is for smaller clusters.

We measured group size for approximately 10,000 snowflake clusters, all descendants of a single isolate, using a particle multisizer, and found strong agreement between the experimentally observed cluster size distribution and the weakest-link prediction (the coefficient of determination is r2=0.97 for log(Counts) vs N) (Figure 5C). Hence, the predictable statistics of entropic cell packing guides the distribution of group size among offspring of a single isolate.

For context, we compared the distribution of group size in snowflake yeast to that of flocculating yeast, which forms multicellular groups via aggregation. The multicellular size of flocculating yeast depends on the rate of collisions with other cells and groups of cells. The growth rate of aggregates is thus typically proportional to their size, as larger aggregates are more likely to contact more cells (Pentz et al., 2020). In fact, the maximum size of a flocculating yeast aggregate is bounded by the duration of aggregation, an extrinsic parameter, while the minimum size can be a single cell (Stratford, 1992). Using data from Pentz et al., 2020, we compared the group size distributions of snowflake yeast and flocculating yeast grown in the same environmental conditions. We find that flocculating yeast groups exhibit a much larger coefficient of variation in size compared to snowflake yeast groups (Pentz et al., 2020; Figure 5D). These results demonstrate that randomly assembled groups can exhibit more reproducible group traits than groups assembled with correlations.

Multicellular motility is robust to cellular area heterogeneity

One of the issues arising from the existence of the broad distribution of somatic cell areas in Volvox is the extent to which colony motility is affected by that heterogeneity. Each of the somatic cells at the surface of a Volvox colony has two flagella that beat at 30 Hz, in planes that are primarily oriented in the anterior-posterior (AP) direction but with a slight lateral tilt that makes each colony spin around its AP axis. A longstanding focus in biological fluid mechanics of multicellular flagellates has been to understand the connection between the beating of the carpet of flagella that cover their surface and their self-propulsion. Measurements of the flow fields around micropipette-held (Short et al., 2006) and freely-swimming colonies (Drescher et al., 2010a) have shown that despite the discreteness of the flagella, the flow is remarkably smooth, albeit often displaying metachronal waves (Brumley et al., 2015), long-wavelength phase modulations of the beating pattern. However, the effect of heterogeneity in cell area on motility is as yet unknown; naively, we might expect that heterogeneities in flagella packing may lower the motility of the organism in comparison to a highly regular arrangement. Here, we show in fact that Volvox motility is largely unaffected by fluctuations in cell area, indicating that area heterogeneity may not be detrimental to the organism’s swimming fitness. The absence of such a barrier may represent a scenario where trait optimization can be achieved without first evolving sophisticated developmental genes.

A heuristic explanation for the smoothness of the flows can be developed by noting first that the flow arising from each flagellum, beating close to the no-slip surface of the colony, will fall off only as an inverse power of distance r from the flagellum. Thus, the superposition of the flows from many flagella will be sensitive to contributions from distant neighbors (Boselli et al., 2021) and will tend to wash out local variations in flagellar actuation. This argument can be made quantitative using two different models for the motility of such flagellates. The first is the ‘squirmer’ model (Lighthill, 1952), in which the flagellate is characterized by a tangential ‘slip’ velocity u(θ) on the surface, which can be thought of as corresponding to the mean motion of the flagella tips. Here, θ[0,π] is the polar angle with respect to the AP axis. In this approach the details of the fluid velocity profile below the tips are not resolved, and in particular the no-slip condition at the surface of the ECM is ignored. In the second approach (Ishikawa et al., 2020), which builds on earlier work (Short et al., 2006) that specified a force density at the colony surface instead of a slip velocity, there is a specified force density applied at some small distance above the no-slip colony surface, and the flow field below that locus is resolved. This approach, termed the ‘shear stress, no-slip’ model, captures the very large viscous dissipation that occurs in the region between the ECM and the locus of forcing. Within either of these two approaches above the effects of area inhomogeneities can be investigated by coarse-graining the flagella dynamics; either the local slip velocity u(θ) or the local tangential force density f(θ) has noise.

In the squirmer model, the swimming speed U is (Lauga, 2020)

(5) U=120πdθsinθuθ(θ)V1(θ),

where

(6) Vn(θ)=2n(n+1)Pn(cosθ)sinθ,

Pn is the Legendre polynomial, and the prime indicates differentiation with respect to its argument. If we represent the effects of area inhomogeneities as noise in the slip velocity, then it is most natural to use Vn as the basis functions for the tangential slip velocity, expressed as

(7) uθ(θ)=n=1unVn(θ),

where Vn(0)=Vn(π)=0, guaranteeing that the slip velocity vanishes at the anterior and posterior poles (Short et al., 2006). Accurate experimental measurements of the azimuthal velocity field of Volvox (Drescher et al., 2010b) show that it is well-captured by that lowest mode, leading to a modest anterior-posterior asymmetry. From the orthogonality relation for the Vn,

(8) dθsinθV1(θ)Vn(θ)=2n(n+1)2n+1δ1n,

we see immediately that the contributions from all modes n>1 vanish identically, and thus the swimming speed is given identically by the amplitude of the lowest mode V1(θ)=sinθ,

(9) U=23u1.

Thus, within the squirmer model, motility is essentially insensitive to area inhomogeneities. This result does not preclude effects of those higher modes, only that such effects will be on quantities other than the swimming speed, such as the nutrient uptake rate (Magar et al., 2003).

In the shear-stress, no-slip model, the velocity field in the region between the colony radius R and the radius R(1+ϵ) at which the shear stress is applied is solved separately from that for r>R(1+ϵ) and the two flow fields are matched at R(1+ϵ) through boundary conditions of continuity in velocity and normal stress and the specified discontinuity in shear stress. Analogously to the expansion of the slip velocity in the squirmer model (7), noise in that discontinuity can be expressed by assuming that the coarse-grained shear force applied by the flagella has spatial variations, and can be expanded in the form

(10) fθ(θ)=n=1fnVn(θ).

The swimming speed again depends only on the lowest-order mode in this expansion,

(11) U=2ϵR3μf1,

and we again have insensitivity of U to inhomogeneities in the area per somatic cell.

Discussion

In this paper, we demonstrated that universal cellular packing geometries are an inevitable consequence of noisy multicellular assembly. We measured the distribution of Voronoi polytope sizes in both nascent and extant multicellular organisms, and showed that they are consistent with the k-gamma distribution, which arises via maximum entropy considerations. Using simulations, we demonstrated that k-gamma distributions arise in many different growth morphologies, and do so requiring only a relatively small amount of structural randomness. Further, we showed that the distribution of cell neighborhood sizes can be used to distinguish the effects of randomness from the effects of developmental patterning. Finally, we demonstrated that consistent packing statistics can lead to highly reproducible, and thus heritable, multicellular traits, such as group size in snowflake yeast. Altogether, these results indicate that entropic cell packing is a general organizing feature of multicellularity, applying to multicellular organisms with varying growth morphologies, connection topologies, and dimensionalities.

One of the strengths of the packing-based maximum entropy framework employed here is its simplicity. We have demonstrated that the distribution of cell neighborhood sizes can be predicted with high accuracy, in many different multicellular morphologies, from only the first two moments of the distribution. Deviations from maximum entropy predictions therefore encode important information about additional correlations that can arise via a variety of sources, such as developmental regulation or interactions with the environment. These additional correlations could be explored via, for example, higher order structures to the maximum entropy model (Schneidman et al., 2003). Relatedly, it would be interesting to extend our analysis to include topological information, which has recently been applied to theoretical and experimental systems (Skinner et al., 2021; Jeckel et al., 2021). Developing methods to incorporate these higher-order effects may prove especially useful when assessing the importance of random fluctuations in organisms with structures more complex than those studied here.

The effect of random noise has been an important area of research in developmental biology (Tsimring, 2014; Lander, 2011). During development, cellular growth, reproduction, differentiation, and patterning combine to form a multicellular organism. Random noise introduced at any stage in this process can result in phenotypic variability, which may affect an organism’s fitness (Waddington, 1957). But while some multicellular traits exhibit high variability, others are tightly conserved, leading to a wide body of research addressing the origin of mechanisms underlying robustness and stability, and the nature of feedback mechanisms that must be present to manage the large number of stochastic fluctuations in gene expression and growth (Gregor et al., 2007; Haas et al., 2018; Hong et al., 2016; Sampathkumar, 2020; Deneke and Di Talia, 2018). In this context, our results demonstrate that random noise can itself lead to highly reproducible multicellular traits such as the cell packing distribution.

Our observation that heritable properties can arise from random processes is reminiscent of the reproducible structures and phenomena generated by random noise in a wide range of physical (Shinbrot and Muzzio, 2001; Manoharan, 2015) and biological systems (Tsimring, 2014; Lander, 2011). While it may be surprising that the distribution of free space in snowflake yeast and Volvox follow the same k-gamma distributions despite the many differences between these organisms, this universality actually extends beyond multicellular organisms to non-living materials, such as those seen in granular materials and foams (Katgert and van Hecke, 2010; Varadan and Solomon, 2002; Aste and Di Matteo, 2008). This broad universality likely arises due to the simple requirements for application of the maximum entropy principle to packing; specifically, there must be a total volume, individual volumes cannot overlap, and volumes must be determined independently (subject to the total volume constraint). It is thus important to note that entropic packing is not necessarily adaptive; it can readily emerge as a consequence of random cellular reproduction or interactions. While entropic packing statistics may produce advantages in some cases, they could be neutral or detrimental in others.

An example of one possible advantage granted by entropic packing is the parent-offspring fidelity that arises from its ensemble statistics. Since both parents and their offspring are assembled through similar noisy processes, they achieve similar cell packing distributions. This statistical similarity therefore details at least one heritable multicellular trait that does not rely on genetically regulated multicellular development. Other multicellular traits that build on the cell packing distribution are similarly affected by this emergent process and could become heritable as well. Such parent-offspring heredity could play a crucial role in the evolutionary transition to multicellularity, providing a mechanism for nascent multicellular organisms to participate in the evolutionary process without first having to possess genetically regulated development. Over time, developmental innovation may arise via multicellular adaptation, modifying or replacing entropic cell packing as a mechanism of multicellular heredity. Consistent with this hypothesis, maximum entropy retains considerable predictive power in extant multicellular organisms such as Volvox, animal embryos (Alsous et al., 2018), and epithelial tissue monolayers (Atia et al., 2018), each of which have canalized development. There may be other examples of organisms with highly regulated development which pack cells according to maximum entropy predictions, and future work could address cell packing in, for example, animal embryos, brain tissue, and more. Finally, as fragmentation is a common mode of multicellular reproduction (Larson et al., 2020; Prakash et al., 2021; Angert, 2005; Keim et al., 2004; Koyama et al., 1977), fracture driven by maximum entropy packing statistics may be relevant to organisms other than snowflake yeast.

The broad distributions in cellular volumes we have found in two very different types of organisms, with two very different modes of reproduction and growth, suggest that noise in developmental geometry may be an inevitable consequence of almost any microscopic mechanism. In this sense, they may be just as unavoidable in biological contexts as thermal fluctuations are in systems that obey the rules of equilibrium statistical physics. As an example, we recall the ‘flicker phenomenon’ of erythrocytes, in which the red blood cell membrane exhibits stochastic motions around its equilibrium biconcave discoid shape. Thought for many years to be a consequence of specific biochemical processes associated with living systems, flickering was eventually shown by quantitative video microscopy (Brochard and Lennon, 1975) to be consistent with equilibrium thermal fluctuations of elastic biomembranes immersed in water. This was later confirmed by similar studies of shape fluctuations exhibited by large lipid vesicles (Schneider et al., 1984). The generalization of these considerations to homeostatic tissues with cell division, rearrangements and apoptosis has also been considered (Risler et al., 2015; Kalziqi et al., 2018). While such membrane systems may differ greatly in the specific values of their elastic modulus (and, indeed, of their microscopic membrane constituents), the viscosity of the surrounding fluid, and their physical size, the space-time correlation function of fluctuations about the equilibrium shape adopts a universal form in appropriately rescaled length and frequency variables.

These results on equilibrium fluctuations provide a conceptual precedent for the results reported here. A central issue that then arises from our results is how to connect any given stochastic biochemical growth process defined at the microscopic level to the more macroscopic probability distribution function observed for cellular volumes. Mathematically, this is the same question that arises in the theory of random walks, wherein a Langevin equation defined at the microscopic level leads, through suitable averaging, to a Fokker-Planck equation for the probability distribution function of displacements. Can the same procedure be implemented for growth laws?

Materials and methods

Yeast genotypes and growth morphology

Snowflake yeast genotypes

Request a detailed protocol

Multicellular yeast groups were constructed from initially unicellular Saccharomyces cerevisiae. Petite yeast groups (P-) were used in all experiments except those noted below. Snowflake yeast were engineered by replacing a functional copy of ace2 with a nonfunctional version as described in Ratcliff et al., 2015 (these modified genotypes will be referred to as either snowflakes or Ace2KO). Under daily selection for large size through settling in liquid media, groups can arise via a single mutation in the ace2 gene (Ratcliff et al., 2012; Ratcliff et al., 2015). When the ace2 gene is not expressed, the final stage of cell division is not completed, and mother-daughter cells remain attached at the chitinous bud site. Since all cells are attached directly to their mothers, snowflake groups form a fractal-like branched tree collective. To measure bud scar size, we used a unicellular strain of Y55 yeast; these measurements were only used to pick parameters for snowflake yeast simulations.

Yeast growth morphology

Request a detailed protocol

S. cerevisiae cells reproduce by budding, a type of asexual reproduction where a new cell extrudes from the surface of the parent cell. During budding, mother and daughter cells remain attached via a rigid chitinous bond; in unicellular yeast, chitinase will degrade this bond as the last step in cell division, releasing the daughter cell and leaving behind a ‘bud scar’ on the mother surface and a ‘birth scar’ defining the proximal hemisphere on the daughter’s cell surface. In all experiments, we use yeast expressing bipolar budding patterns (Chant and Pringle, 1995). The bipolar budding pattern is characterized by bud sites that typically do not form along the equator of the cell. Usually, the first daughter buds near the distal pole. Subsequent budding sites are typically positioned along a budding ring defined by a polar angle θ ( Figure 1—figure supplement 1). Some buds will ‘backbud’ toward the mother cell (i.e. on the proximal end of the cell), but most buds are placed on the distal side. By contrast, the azimuthal positions of all buds appears to be randomly distributed.

Growth conditions

Request a detailed protocol

All experiments were performed on yeast grown for approximately 24 hr in 10 mL of yeast peptone dextrose (YPD, 10 g/L yeast extract, 20 g/L peptone, and 20 g/L dextrose) liquid medium at 30 C, and shaken at 250 rpm in a Symphony Incubating Orbital Shaker model 3,500I. All cultures were therefore in the stationary phase of growth at the time of experiments.

Scanning electron microscopy to measure group structure

Since yeast cells have thick cell walls that limit the effectiveness of optical microscopy, we used a Zeiss Sigma VP 3View scanning electron microscope (SEM) equipped with a Gatan 3View SBF microtome installed inside a Gemini SEM column to obtain high resolution images of the internal structure of snowflake yeast groups and locate the positions of all cells. All SEM images were obtained in collaboration with the University of Illinois’s Materials Research Laboratory at the Grainger College of Engineering. Snowflake yeast clusters were grown overnight in YPD media, then fixed, stained with osmium tetroxide, and embedded in resin in an eppendorf tube. A cube of resin 200µm x 200µm x 200µm (with an isotropic distribution of yeast clusters) was cut out of the resin block for imaging. The top surface of the cube was scanned by the SEM to acquire an image with resolution 50nm per pixel (4000 × 4000 pixels). Then, a microtome shaved a 50-nm-thick layer from the top of the specimen, and the new top surface was scanned. This process was repeated until 4000 images were obtained so that the data cube had equal resolution in x,y,z dimensions.

Custom image analysis scripts were written for the SEM datasets. First, a local adaptive threshold was used to binarize the image. A distance transform was used to identify the center of each cell slice in a particular 2d image. A watershed algorithm was then seeded with the cell slice centers, followed by a particle tracking algorithm to label cells across image slices. After labeling, the boundary for each cell was found, resulting in a point cloud of the exterior of each cell. Each cell was then fitted with an ellipsoid with nine fit parameters: (x0,y0,z0) cell center, (a,b,c) cell radii, and (θ,φ,ψ) for cell orientation. The net rotation matrix R was then found, where each column of R corresponds to the direction vector of one principal axis of the ellipsoid. We consider the radii of the principal axes (a,b,c) to be part of a diagonal scaling matrix S which sets the ellipsoid size. Since the SEM images only capture the cell cytoplasm, each principal axis was increased in size by an additional 100nm to account for the cell wall during visualization. Last, although there is no possible 3d 3 × 3 translation matrix, a 4 × 4 translation matrix T can capture the position of the cell center (x0,y0,z0). Adding one additional column and row to the matrices R and S with the diagonal element being 1 and all other elements being 0 then means that a unit sphere centered at the origin can be mapped to any specific cell by a surface matrix M=TRS, and furthermore any point on the cell’s surface can be mapped back to the unit sphere by the inverse of M. Then, the surface matrices are the only information that must be stored. From this dataset, 20 clusters of 105±51 cells in each cluster were identified along with their intercellular mother-daughter chitin bonds. We chose to stop at N=20 clusters due to the expensive nature of hand-identifying mother-daughter bonds, and since N=20 clusters gave over 2000 measurements of single cells in the clusters. In following experiments with different organisms and in following simulations of different growth morphologies, we always made at least as many measurements as for these experiments of snowflake yeast.

Petite yeast cell size and shape

Request a detailed protocol

We measured cellular volumes from SEM images by ellipsoid fits. The average cellular volume of petite yeast was vc=17.44µm3±7.33µm3. This measurement was used in our Voronoi distribution derivations. We measured the mean cellular aspect ratio to be αa/b=1.28±0.20.

Bud scar size

Request a detailed protocol

We next measured the typical size of bud scars on the surface of Y55 yeast cells. Single cells were stained with calcafluor to highlight the chitinous bud scars (Figure 1—figure supplement 1). Confocal z-stacks were obtained on a Nikon A1R confocal microscope equipped with a 40× oil immersion objective. These images were visualized using the image processing software FIJI, and the 3d volume viewer plugin. To track the location and size of bud scars, a custom MatLab script was written to map the strongest calcafluor signals, since calcafluor makes bud scars brighter than other portions of the cell wall. Brightness isosurfaces then isolated the bud scars from the cell wall. Next, the isosurface points were rotated to the x-y plane by finding its principal components in a principal component analysis. The rotated surface points were then fit with an ellipse, returning the major and minor axes. The average of the major and minor axes returned an average interior bud scar diameter of 1.2µm. This value was later used in simulations of yeast groups.

Bud scar locations

Request a detailed protocol

We measured bud scar positional distributions for petite yeast Ace2KO. Since the SEM does not image chitinous bud scars, we approximated bud scar positions as the closest point on a mother cell’s surface to the corresponding daughter cell’s proximal pole. We recorded 1990 bud scar positions in polar coordinates, as defined in Figure 1—figure supplement 1. There is a clearly defined polar angle for the budding ring, while the azimuthal angle is uniformly distributed. The mean and standard deviations of the two angular coordinates were θ=42%±23%, and φ=180%±104%.

Imaging Volvox

Cultivation and selective plane illumination microscopy

Request a detailed protocol

The V. carteri f. nagariensis strain HK10 (UTEX 1885) was obtained from the Culture Collection of Algae at the University of Texas at Austin and cultured as previously described (Brumley et al., 2014). To visualise somatic cells, V. carteri spheroids were embedded in 1% low-melting-point agarose, suspended in liquid medium and imaged using a custom-built Selective Plane Illumination Microscope (Haas et al., 2018). Each somatic cell is mostly filled with a single chloroplast. Chlorophyll autofluorescence was excited at λ=561nm and detected at λ=570nm. To increase the accuracy with which we identify somatic cell positions, z-stacks of six spheroids were acquired from three different angles (0, 120, 240 degrees) and fused as described in the following paragraph. We used data from N=6 different Volvox organisms, which gave us 6000 measurements of single-cell positions. This gave us more data than what we measured for snowflake yeast clusters.

Registration of cell positions

Request a detailed protocol

Positions of cells were registered based on fluorescence intensity using custom Matlab scripts. This was achieved by carrying out a 2D convolution of each frame of the z-stack with a basic kernel modeling the appearance of a cell – this was set to be an asymmetric double sigmoidal function. Cell segmentation was corrected manually. Z-stacks taken from different angles were roughly aligned using Fiji and the Matlab function fminsearch to minimise distances between the reproductive cells. This alignment was used as starting point for alignment of the somatic cells again using fminsearch. The positions of somatic cells were merged and averaged.

Voronoi tessellation

We used a Voronoi tessellation algorithm to measure the distribution of cell neighborhood sizes in groups. We computed both 3D and 2D Voronoi tessellations.

3D voronoi tessellations

Request a detailed protocol

First, we computed 3D Voronoi tessellations within a defined boundary. These tessellations were performed for experimental snowflake yeast data from the SEM and simulations of 3D groups using the open-source Voronoi code Voro++ (Rycroft, 2009), wrapped in a custom MatLab script. Voro ++ takes as input the Cartesian coordinates of the cell centers and the boundary of the shape within which to compute the tessellation. Without a boundary, all of the Voronoi cells located on the periphery would extend to infinity. We started the tessellation process by setting the input boundary to be a sphere; the Voronoi algorithm tessellated space within the spherical boundary. Then, pieces of the sphere were pared away until a Voronoi tessellation within the group’s convex hull was obtained, as described in the next paragraph.

The boundary sphere was centered on the cluster’s center of mass. Its radius was the distance to the farthest cell center plus an additional 5µm. Upon tessellation within the sphere, each Voronoi polyhedron is defined by Cartesian vertices rj. We group these preliminary vertices by the cells to which they correspond, so that Qi={r1,r2,...,rm} is a list of the m vertices corresponding to cell i[1,N], N being the total number of cells in the organism. We next computed the cluster’s convex hull, which is the smallest convex polyhedron that contains all cell centers. We then extended the vertices of the convex hull by 3µm outwards from the cluster center of mass so the boundary contained the entirety of each cell. This new boundary polyhedron, whose vertices are labeled B, defines the cluster boundary. We then found the intersection polyhedron, Zi=QiB by taking the union of the dual of their vertices. This process thereby trims all Voronoi polyhedra to lie exclusively within the cluster’s convex hull. The polyhedra Zi were the final Voronoi polyhedra used for the remaining data analysis.

Voronoi tessellation on a sphere

Request a detailed protocol

For Voronoi tessellations of cells on the surface of simulated spheres (see Figure 3 and Figure 4 of the main text), we used a built-in Matlab function called ‘voronoisphere’ for Voronoi tessellations on a sphere.

Voronoi tessellation on non-spherical surfaces

Request a detailed protocol

We also computed 2D Voronoi tessellations on surfaces embedded in 3D space using custom-written MatLab functions. This approach was used for Volvox experimental data. Performing this computation with Volvox experiments presented a challenge as Volvox are roughly spherical, but with varying local curvature. It was therefore necessary to compute a Voronoi tessellation on an arbitrary surface.

The first step toward generating the proper Voronoi tessellation was computing the Delaunay triangulation of the cells on the surface (the Voronoi tessellation is the dual of the Delaunay triangulation). First, we found the Cartesian coordinates of each somatic cell (as described above), and normalized these coordinates so that all cell centers laid on the unit sphere. Then, a Delaunay triangulation of the normalized points was calculated. Edges of the triangulation that cut through the unit sphere were eliminated, and edges that laid along the sphere surface were kept. This Delaunay triangulation therefore mapped out the connectivity of the somatic cells. We then projected that triangulation onto the lumpy surface. The Voronoi polygon vertices are the circumcenters of each Delaunay triangle. Further, any edge shared between two Delaunay triangles denotes an edge shared between the Voronoi vertices associated with those two triangles. We found all edges connecting the Voronoi vertices. Next, connected edges were flattened so that each Voronoi cell was a 2D polygon. This step eliminates the curvature associated with the surface of the organism. However, we found that the distribution of Voronoi areas was unaffected by taking either the planar approximation or by approximating the area by taking the local curvature into account – the average difference between Voronoi areas when approximating the surface as a plane Ap vs. approximating the surface as a spherical cap As was found to be Ap-AsAp=0.001, measured for one organism. Therefore, we used the flattened Voronoi polygons as the final tessellation shapes.

Data analysis of voronoi measurements

In all cases, the output of the Voronoi algorithm is a list of Voronoi polytope sizes: in 3D, the measurements were the final Voronoi polyhedron volumes, while in 2D the measurements were polygon areas. Histograms of these sizes were generated to compare with the k-gamma distribution. As we observe cells in direct contact with each other, the minimum size of a Voronoi volume or area was defined by single cell measurements. For petite yeast cells, the mean cell size was calculated from the ellipsoid fits described above to be vc=17.44µm3±7.33µm3. In simulations, the minimum volume was set by the defined cell radius; in bidisperse simulations, the minimum size was set by the volume of the smallest cells.

We then calculated the expected maximum entropy distribution using only the mean and variance of the observed Voronoi volumes, v¯ and σ2, as inputs. Together with the minimum volume vc, these measurements define k=(v¯-vc)2/σ2, a dimensionless shape parameter (Aste and Di Matteo, 2008). The maximum entropy distribution was therefore not fit to the data using, for example, a least squares method, but inferred from the first two moments of the distribution.

Volvox

Request a detailed protocol

Along the surface of the Volvox organisms, there are gaps between some of the somatic cells due to the gonidia that lie beneath, but near the surface of the organism. These gonidia effectively occupy space on the surface, making it inaccessible to somatic cells. We excluded all Voronoi cells that intersected these gonidia gaps. We identified gaps in the soma cells by flagging Delaunay triangles with exceptionally high aspect ratios. Any Voronoi polygons that intersect the flagged Delaunay triangles were then flagged and later excluded from the dataset. The polygons were generally spatially clustered, indicating that the gonidial gaps were being correctly isolated. Roughly 90 polygons were excluded from each organism.

In Volvox organisms, each cell is surrounded by extracellular matrix, so cells do not contact each other. Furthermore, each of the six organisms studied varied in diameter (standard deviation in diameter was 28.2µm), yet all contained roughly the same number of somatic cells, leading to systematic differences in average surface area per cell across the organisms. Quantitatively, the coefficient of variation of the diameter of the groups was CVD=0.05, while the coefficient of variation in the number of cells in each group was roughly 10 times smaller, CVN=0.006. To counter the systematic size differences between organisms, we converted the Voronoi polygon areas into solid angles by dividing by the total surface area of each organism, Ωi=Ai/S; we then grouped all six organisms together into one histogram. We allowed the minimum solid angle, used in the k-gamma equation, to be a fit parameter in a least squares minimization procedure. There was one outlier cell with solid angle Ω=0.0048 steradians; the next two smallest cells had solid angles 0.0068 and 0.0069 steradians. We removed the outlier; the least squares minimization procedure then fit a minimum solid angle Ωc=0.0070 steradians. We used this value for all further calculations. Just as in the 3D case, the mean and variance of the solid angle were measured to set the expected maximum entropy distribution.

Goodness-of-fit analysis

To test how well the k-gamma distribution performed as a predictive distribution, we systematically compared its performance to three other distributions: the normal (Gaussian) distribution, the log-normal distribution, and the beta prime distribution. We used two of our datasets for this comparison: experimental values of Voronoi volumes for snowflake yeast, and simulations of snowflake yeast (which provided us with more datapoints for comparison). First, we used the measured mean V¯ and standard deviation s of the datasets to estimate each distribution’s parameters. In Table 1 are the two parameters of each distribution and how they are calculated from the measured mean and variance of the dataset.

Table 1
Model parameters for distribution comparisons.
K-gamma (μ,k)μ=V¯k=(V¯-vc)2/s2
Normal (μ,σ)μ=V¯σ=s
Log-normal (μ,σ)μ=exp(V¯+s2/2)σ=(exp(s2)-1)exp(2V¯+s2)
Beta prime (α,β)α=V¯(V¯+V¯2+s2)/s2β=(V¯+V¯2+2s2)/s2

The three distributions that we used to compare to the k-gamma distribution were chosen based upon their properties. For instance, all three comparison distributions have two parameters; also, all three chosen distributions have either semi-infinite or infinite domain. We chose the Normal distribution because it is the centrally-limiting distribution for a process with summative random errors. We chose the Log-normal distribution because it is the centrally-limiting distribution for a process with multiplicative random errors. Last, we chose the beta prime distribution arbitrarily; the goal was to make comparisons to a distribution which we have no reason to believe would accurately describe this dataset.

To systematically compare the distributions, we sampled N datapoints, with replacement, from the original datasets. We varied N to test the effect of finite datasets. Each time we sub-sampled the dataset, we calculated the mean and variance of the new data sample, then used those measured values to estimate the parameters of the four distributions. We then recorded the root-mean-square residual for all four cases. We plotted the root-mean-square residual for each distribution as a function of the data sample size, N, in Figure 2. We observed that the k-gamma distribution outperformed the other distributions at every sample size.

Next, we used the four distributions to predict the skewness of the simulation data. We empirically determined the first two moments of the dataset: we take these to be the first two moments for each of our distributions. Then, we used the first two moments to predict the skewness, which is related to the third moment of the distribution. We then compared this value to experimentally-measured skewness, finding that the k-gamma distribution estimated the experimentally observed skewness with only a 7% error; the other three distributions estimated the skewness with percent errors of 100%,96%,and174% for the normal, log-normal, and beta-prime distributions, respectively (Figure 2).

As an additional test, we used least-squares fitting to test the performance of the k-gamma distribution compared to the other three distributions for simulation data. In this case, we let the least-squares fitting algorithm find the best parameters to fit each distribution; then, we compared the least-squares-achieved mean and standard deviation of the fit distribution to the measured mean and standard deviation of the population. We found that the k-gamma fit reproduced the closest values to the measured mean and standard deviation (Figure 2).

Cluster size distribution measurements

Request a detailed protocol

Cluster sizes were measured using a Beckman Coulter Multisizer 4e particle analyzer in the Cellular Analysis and Cytometry Core of the Shared User Management System located at the Georgia Institute of Technology. Petite Ace2KO clusters were taken from steady state concentration in YPD and then submerged in electrolytic fluid and passed through a 100µm aperture tube. The volume measured on the multisizer corresponds to the volume of electrolyte displaced by the cluster. The number of cells in each cluster was then estimated by N=V/vc, where V is the volume of organism measured by the Coulter Counter, and vc is the average cell volume from SEM measurements, vc=17.44µm3.

Cumulative distribution function statistics

Request a detailed protocol

To quantify goodness-of-fit for predicted maximum entropy distributions, we compared the predicated cumulative distribution function (CDF), F(x), to the empirical CDF, Fi, using P-P plots. Exactly predicted points will lie on the line y=x in these plots. We measured the root-mean-square residual from the line

(12) rRMS=(FiF(x))2

Measurements of Ψ6 in V. carteri

From the light sheet images of Volvox, we obtained the Cartesian coordinates of each somatic cell. From Delaunay triangulation, we then obtained a list of every cell’s closest neighbors. Each cell and its NN nearest neighbors did not generally lie in a plane due to local curvature of the Volvox surface. We therefore calculated in-plane and out-of-plane components using principal component analysis. The in-plane components were then used to write the positions of each nearest neighbor in polar coordinates. The formula for calculating Ψ6 is

(13) Ψ6=|1NNj=1NNe6iθj|

where θj defines the polar angle coordinates around the cell of interest and denotes averaging over all cells. We calculated Ψ6 separately for each of six different organisms; we report Ψ6=0.03±0.01.

Correlation of voronoi areas

Request a detailed protocol

In Volvox organisms, we calculated the spatial correlation of polygon areas. First, we extracted the list of cell neighbors from the Delaunay triangulation of the organism surface. Nearest neighbors were designated as living a network distance of 1 away from a cell of interest; next nearest neighbors live a network distance of 2 away from the cell of interest, etc. The number of neighbors a network distance of Q away is then J(Q), which is empirically determined. The network correlation function is then

(14) C(Q)=(Ω-Ω)YQσΩσYQ

where YQ=J(Q)-1j(Ωj-Ω) is the average deviation of the solid angle of a given polygon’s neighbors from the mean. The standard deviation of the solid angle across the population is σΩ, and σYQ is the standard deviation of YQ across the population.

Simulation methods

Simulations of snowflake yeast groups

Request a detailed protocol

Simulations of snowflake yeast groups were adapted from previously published work by Jacobeen et al., 2018a; Jacobeen et al., 2018b that found simulations of snowflake yeast growth morphology accurately replicated experimentally measured cellular packing fractions and average group sizes. In the present work, cells were modeled as prolate ellipsoids of revolution with a semi-major axis a=2.88µm and semi-minor axis b=2.29µm, characterized by the aspect ratio αa/b=1.26. Each generation, every cell attempted to reproduce; however, if new cells closely overlapped with existing cells (i.e. their bud scars are closer than 1.2µm), they were eliminated. Setting the number of generations (for example, 7) sets the maximum possible number of cells in the group at the end of the simulation (27=128), and roughly sets the expected number of cells in the group (100). In our simulations, cells were 80% likely to bud first from the distal pole (i.e. θ=0±10 degrees). Subsequent cells budded at a polar angle θ, and with an azimuthal angle randomly chosen from a uniform distribution φ[0,2π]; in other words, after the first bud, cells generally appeared along a ‘budding ring’. There was a 20% chance that the first bud would appear along this budding ring instead of exactly at the pole. After 3 bud scars, there was a 50% chance that new cells bud on the proximal side (π-θ) instead of the distal side. The orientation of the new cell is determined by the surface normal to the mother cell at the position of the bud site; the major axis of the new cell lies along the surface normal.

To compare exhaustively the distribution of Voronoi volumes between simulations and the k-gamma distribution, we simulated 9,100 clusters. In each simulation, clusters were allowed to grow for 7 generations of cell division, corresponding to an average of 94.2±10.9 cells per cluster. The budding ring was defined by the polar angle θ=45%, a close approximation to the experimentally measured mean polar angle. These simulations did not include intercellular forces. The cell centers were recorded and then Voronoi tessellations were made within each cluster’s convex hull.

Simulations of V. carteri

Request a detailed protocol

We simulated 10 Volvox-like groups with N=1000 cells confined to the surface of a sphere. Cells were placed on the surface of a sphere of unit radius by randomly selecting polar and azimuthal coordinates in a Poisson point process. The process proceeds as follows: each new cell was randomly placed, and its distance from all other cells was calculated. If the new cell is within a threshold distance d from any existing cell, it was removed and a new cell was placed elsewhere on the spherical surface. This process was iterated until all 1000 cells were placed. We chose a minimum separation distance of d=0.088, which allowed reasonably rapid convergence. We then calculated the Voronoi tessellation and the correlation function as described above. The N=10 groups, each with 1000 cells, returned 10000 single cell measurements.

Simulations of two additional growth morphologies

Request a detailed protocol

We next sought to model two additional classes of growth morphologies: sticky aggregates and cells contained within a maternal membrane. In both simulations, cells were modeled as spheres with unit radius.

Aggregative groups

Request a detailed protocol

First, we considered a multicellular model of sticky aggregates, mimicking group formation in, for example, flocculating yeast and bacterial aggregates. In our simulations, groups were grown from a single cell. New spherical daughters appeared at a polar angle θ and azimuthal angle φ. Within each step, there was stochasticity in the budding location: cells would appear at θ=θ0±15%. The azimuthal angle was always drawn from a uniform distribution on the interval [0%,360%]. We simulated N=50 aggregative groups, each with 64 cells, for a total of gt3000 measurements of single-cell volume.

Cells interacted with both steric and attractive interactions in overdamped dynamics. Steric interactions were modeled through a harmonic potential when two cells overlapped, with a cutoff once cells were no longer overlapping. That is, for two cells i and j (radii Ri and Rj) separated by the vector rij=rjri, the steric force acting on cell i from cell j is

(15) Fij={0|rij|>(Ri+Rj)κs(|rij|(Ri+Rj))r^ij|rij|(Ri+Rj)

Attractive interactions (i.e. sticky, aggregative bonds) were also modeled through a harmonic potential, but these interactions had both a lower bound and upper bound cutoff.

(16) Gij={0|rij|>2(Ri+Rj)κa(|rij|a(Ri+Rj))r^ij(Ri+Rj)|rij|2(Ri+Rj)0|rij|<(Ri+Rj),

where a sets the location of the attractive well minimum. We used a=0.9, so that the attractive interactions allow a small amount of cell overlap.

Size polydispersity

Request a detailed protocol

In simulations in which we introduced size polydispersity, cells were allowed to reproduce into two separate sizes, R1=1 and R2=2. The probability of budding cells of the same size as the mother cell is denoted ξ. When ξ=1, the mother cell always produces cells of the same size, while when ξ=0.5, there is a 50% chance that the mother cell produces a cell of size R1 or R2, independent of the radius of the mother. Simulations were seeded with a pair of contacting cells, one each of the two radii. The simulation then proceeded with subsequent rounds of cell division and mechanical relaxation. We varied the polydispersity parameter ξ in even increments from ξ=0.5 to ξ=1.0 for a total of 6 different simulations. In each simulation, we simulated N=50 organisms, each with 64 cells, for a total of 3200 cells per simulation, and almost 20,000 cells total.

Groups confined within a membrane

Request a detailed protocol

In another common mode of group formation, cells divide repeatedly within a confining membrane. This type of group formation has been observed in experimentally-evolved multicellular algae derived from unicellular Chlamydomonas reinhardtii (Herron et al., 2019), and is reminiscent of both baeocyte production in Stanieria bacteria (Angert, 2005), and neoproterozoic embryo fossils (Xiao et al., 1998). In a simulation model, we adopted the essential components of this class of growth: groups grow from a single spherical cell, cells divide stochastically, and cells interact sterically with both a maternal cell wall and each other. Typically, palintomic cell division occurs rapidly, meaning that the packing fraction remains the same within the maternal cell wall. We simulated this by increasing the radius of the cell membrane after each cell division, but before allowing any mechanical relaxations.

Steric forces between a cell and the maternal cell wall were modeled as being proportional to the non-overlapping volume of the cell and the maternal cell wall. In other words, if a cell is not contacting the membrane, there is no force acting on it. However, if the cell is contacting the membrane, the force is proportional to how much of the cell volume lies outside the membrane. Each cell was assigned volume vc=4/3*pi. The overlapping volume of the cell and the membrane is labeled vi. The force the cell experiences from the membrane is then

(17) Fi=κm(vcvi)r^i,

where r^i is a unit vector pointing to the center of the maternal membrane. Additionally, steric interactions between cells were calculated as described above for aggregative groups.

We simulated N=200 groups growing in a confining membrane, each with 64 cells in the group, for a total of 12800 measurements of single cells.

Groups confined to a spherical surface

Request a detailed protocol

Some groups form by arranging cells around a central core of extracellular matrix (ECM). To simulate such groups, we modeled a sphere of ECM with cells arranged randomly along the surface. Cell positions were chosen by selecting a position in spherical coordinates from uniform polar θ[0,π] and uniform azimuthal φ[0,2π] distributions. The only rule implemented in cell placement is that no two cells can be located closer than two cell radii from one another. If a new cell is chosen to be located too close to any existing cells, it is eliminated and a new position is chosen. We iterated this process until N cells were placed on the ECM surface.

First, we chose to place N=50 cells on the surface. Therefore, the maximum cell radius allowing all 50 cells to be placed is 0.283 units (where the total sphere has unit radius). We chose the cell radius to be 0.1980 units, which allowed for reasonably rapid random placement of all 50 cells (other choices of cell radii demonstrate qualitatively similar results).

Simulated cellular apoptosis

Request a detailed protocol

In simulations with apoptosis events, cell death occurred after group generation. Briefly, groups were generated by iterated generations of cell division starting from a single cell. After this process, one cell was chosen at random to die. Then, all cells within a localization radius R were flagged. Of the flagged cells, 9 more were chosen at random to die. Therefore, small localization radii correspond to highly localized death events, where 10 juxtaposed cells may die together. As the localization radius increases, there are more flagged cells, and therefore more randomness in cell death. All other cells were unaffected by the cell death process. We simulated six different cell death radii. Each cell death radius simulation included 100 organisms simulated, each with 43 cells after the cell death process. Therefore, each cell death radius R includes 4300 single cell measurements, for a total of 25,800 measurements.

Tree-like groups with precisely defined cell placement/location

Request a detailed protocol

We also investigated groups with precisely defined growth patterns. The spherical cells were held together with fixed, chitin-like bonds. The first cell was placed at the origin. It then proceeded to bud 3 daughter cells, each of which also budded subsequent cells. The exact budding pattern is described below.

Daughter cells were placed as follows. In spherical coordinates on the surface of the mother cell, the first daughter cell was placed at (θ=0±η, ϕ=0±η), the second at (θ=90±η, ϕ=90±η), and the third at (θ=90±η, ϕ=270±η), where η is the strength of random noise added.

The first daughter cell’s coordinate system was rotated 90%±η around the z-axis from the mother cell; in other words, for the first daughter cell, xx, yy, and zz, where x=Rz(π/2+η)x, y=Rz(π/2+η)y, and z=Rz(π/2+η)z, and Rz is the rotation matrix around the z-axis. This daughter cell then proceeded to bud daughters in the exact same pattern as its mother; however, because its local coordinates were rotated, the budding positions were also rotated 90% with respect to the mother cell’s buds. This process was iterated for five generations of cell division. When η=0, this corresponds to only three cells overlapping three other cells. The three overlapping cells were then removed.

After each round of cell division, cells were allowed to relax mechanically in overdamped dynamics according to steric repulsive interactions and sticky, rigid bond interactions to their mother cell. The steric interactions were the same as described above. Fixed bond interactions were modeled as follows. When new cells appear, they incur a bud scar on the mother cell’s surface and a birth scar on the daughter cell’s surface. The positions rbu and rbi of the bud scar and the birth scar were recorded and tracked. The vector pointing from the bud scar (on the mother’s surface) to the birth scar (on the daughter’s surface) is r=rbirbu. Then, the force acting on a cell from it’s mother cell is

(18) Fmother=κ(|r|2)r^

where κ was the chitin bond strength. In addition, cells experienced forces from all of their daughter buds (given by the same relationship and the same chitin bond strength). The initially seeded cell did not experience forces from a mother cell.

For η=0 (i.e. no noise), the distribution of Voronoi volumes was visually discontinuous, since cells could only access a finite number of local configurations. As the noise strength increased, the maximum entropy predictions were gradually recovered. We simulated 11 different noise strenths, each with 150 organisms, each with 28 different cells simulated, for a total of 46,800 measurements.

Appendix 1

It may appear surprising that the distribution of cell volumes is not governed by the Central Limit Theorem (CLT), i.e. the volumes are not distributed normally. After all, Voronoi polytope volumes are generated from many randomly interacting pieces - should not these many different random fluctuations sum to a CLT-like scenario? A simple comparison between the modified gamma distribution, a normal distribution, and a log-normal distribution shows in fact that both the normal distribution and the log-normal distribution fail to capture essential characteristics of the volume packing, while the k-gamma distribution does. For snowflake yeast, the reason for this disagreement is that as each new cell is added to a cluster, it changes the entire volume distribution, since the new cell occupies space which was previously unoccupied. It therefore changes the volumes of all its nearest neighbors; if they flex to accommodate the new cell, then those neighbors change the Voronoi volumes of their neighbors, and so on. Therefore, adding a new cell does not sample the same distribution as before - the distribution itself changes, rendering the limit inapplicable.

In the case of the Volvox, the somatic cells are originally connected together only by cytoplasmic bridges, forming a small sphere. As the ECM is generated the sphere “inflates”. This process, in which many random fluctuations in the amount of ECM excreted by each cell over time can integrate together, seems appropriate for CLT-like arguments. However, it is worth noting that the cells are generally locally oriented with a hexagonal symmetry. In order to maintain a non-wrinkled surface, more ECM must be secreted in some local regions, such as the corners of the hexagons, than in other places, such as at the hexagon edges. Since there is no local wrinkling observed, the secretion of ECM from the somatic cells cannot be a completely random process orientationally. In other words, the ECM excretion process is controlled, which implies that the CLT does not properly capture the sampling space. Instead, the cells inevitably occupy positions on the surface of the sphere that vary from organism to organism; the maximum entropy distribution of their Voronoi areas is then the k-gamma distribution.

Data availability

Figures 1 & 3 source data: Experimental data files enumerating the cell centers positions for each organism sampled. The folders are subdivided into those on snowflake yeast (from SEM studies) and Volvox (from lightsheet studies). Each subdirectory contains an explanatory README file. Figures 2, 4 & 5 source data: Simulation data (enumerating the cell center positions) for the six classes of numerical studies; aggregation, apoptosis, polydispersity, snowflake yeast growth, tree-like growth, and Volvox growth. Each subdirectory contains an explanatory README file.

References

  1. Book
    1. Davidson EH
    (2001)
    Chapter 1 - regulatory hardwiring: A brief overview of the genomic control apparatus and its causal role in development and evolution
    In: Davidson EH, editors. Genomic Regulatory Systems. San Diego: Academic Press. pp. 1–23.
    1. Kirk DL
    (2005) A twelve-step program for evolving multicellularity and a division of labor
    BioEssays : News and Reviews in Molecular, Cellular and Developmental Biology 27:299–310.
    https://doi.org/10.1002/bies.20197
    1. Schneider M
    2. Jenkins J
    3. Webb W
    (1984)
    Thermal fluctuations of large quasi-spherical bimolecular phospholipid vesicles
    Biophysical Journal 45:891–899.
    1. Starr R
    (1969)
    Structure, reproduction and differentiation in volvox carteri f. nagariensis lyengar, strains hk9 and 10
    Archiv Für Protistenkunde 111:204–222.
    1. Stratford M
    (1992)
    Yeast flocculation: a new perspective
    Advances in Microbial Physiology 33:2–71.
    1. Tsimring LS
    (2014) Noise in biology
    Reports on Progress in Physics. Physical Society 77:026601.
    https://doi.org/10.1088/0034-4885/77/2/026601
  2. Book
    1. Waddington C
    (1957)
    The Strategy of the Genes
    Routledge.

Decision letter

  1. Irene Giardina
    Reviewing Editor; Università Sapienza, Italy
  2. Aleksandra M Walczak
    Senior Editor; École Normale Supérieure, France

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Decision letter after peer review:

Thank you for submitting your article "Cellular organization in lab-evolved and extant multicellular species obeys a maximum entropy law" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Aleksandra Walczak as the Senior Editor. The reviewers have opted to remain anonymous.

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

Both reviewers praise the quality of the paper and the relevance of the results. There are not, in their opinion, critical aspects in the manuscript to be further addressed. However, they suggest a number of revisions, which would improve the clarity and presentation of the work. In particular, both reviewers think that more discussion is needed of the maximum entropy model, e.g. whether additional information on higher order structures or morphology related correlations might lead to more effective statistical models. Referee 2 also advises for more contextualization of the results, and a wider discussion about their generality.

Below the full reports, the authors are kindly invited to take into account the referees' comments in a revised version.

Reviewer #1:

The manuscript by Day and colleagues investigates the geometry of cell packing in two multicellular eukaryotes (snowflake yeast and green algae). Using a combination of experiments and models drawn from statistical physics, they show that the distribution of cellular neighborhood volumes follows a simple universal form – a modified gamma function – that arises from a maximum entropy argument. Using simulations of different growth processes, they then show that these universal distributions are ubiquitous-arising, for example, even in correlated systems as long as there is a minimal level of noise. Finally, they show how these principles contribute to emergent evolutionary features (specifically group size distributions) in snowflake yeast, and use simple theoretical models to argue that fluctuations, while inherently stochastic, give rise to robust structures that do not depend sensitively on the microscopic and biological features of the system.

This paper is a beautiful example of how simple biophysical models can provide fundamental and unifying insight into complex biological systems. It is well written, addresses an important and timely topic, and raises intriguing questions about the balance between "regulated" biology and simple statistical physics as selective forces for evolution.

I have several comments for the authors to consider, at their discretion. Overall, I really enjoyed this paper and learned a great deal from it.

– The manuscript offers an interesting guiding principle that describes two considerably different biological systems. As the authors show in simulations, the principle is expected to hold over a broad range of conditions, but of course not universally (though even small levels of stochasticity broaden the range of applicability). I think the paper could be improved by expanding on the discussion of these limitations. In particular, it is not clear to me exactly how surprising it is to see "good" fits to a 2-parameter distribution of this sort (or more generally, what level of "good" we should expect of the fits in finite data sets like these). The authors address this issue in part by showing fits to other distributions, which is nice. But I wonder if it would be helpful to also include (or at least discuss) more systematic model selection. To be clear, I find the analysis quite convincing as is. But I am trying to get my head around the limitations, and in particular, to get a feel for how likely one is to see similar "goodness of fit" results using other distributions with a relatively small number of parameters.

– Related to the previous point: one approach might be to construct a type of "null" model from the data, perhaps by systematically shuffling the data in some way and then bootstrapping to evaluate the likelihood of achieving fits of similar quality.

– Have the authors considered trying to systematically quantify the impact of including higher-order structures in the max ent model? For example, one could perhaps use multi-information metrics (https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.91.238701) to evaluate the extent to which higher-order features of the data are relevant / necessary; the idea is essentially to construct maximum entropy models with various levels of data complexity "built in" and then evaluate (perhaps with an info-theoretic style metric) the extent to which that complexity improved the model. Perhaps something similar has been done for granular materials to capture higher-order statistics of packing? I ask this primarily out of curiosity, not as a serious criticism of the current approach. A discussion of this point might add to the paper.

Reviewer #2:

Day, Thomas C. et al. investigate the geometrical statistic of cell packing in multicellular organisms. Using a maximum entropy prediction originally developed in the study of granular materials, the authors show that the statistics of cell packing imposes a robust physical, entropic constraint on the geometrical arrangement of cells. Strikingly, the authors show that both snowflake yeast evolved under lab conditions and wild-type Volvox, which develop according to very different processes and have disparate overall morphology, both exhibit cell packing statistics consistent with the maximum entropy predictions. They then use simulations to show that entropic cellular packing can arise from various modes of multicellular development due to randomness in cell positions and that substantial deviations from entropic packing arise only in the case of low developmental noise (randomness) and strong correlations in cell positions. Finally, the authors use theory and measurements from experiments with yeast to show how maximum entropy statistics dictate size heritability in simple multicellular systems. Together, their results support the perhaps counterintuitive result that developmental randomness can actually underpin developmental reproducibility, in this case reproducibility in the geometry of cell packing in terms of the free space associated with individual cells within a multicellular structure. This work contributes the identification and new consideration of a fundamental physical constraint of particular relevance to the evolutionary origins of multicellularity and to multicellular morphogenesis in general.

The conclusions of this paper are well supported by the rigorous analysis of data and simulations.

Work on the evolution of multicellularity has traditionally focused on molecular and genetic mechanisms, but because multicellular morphogenesis is an inherently physical process, biophysical studies provide an important complementary perspective. A particular strength of this paper is that insights are derived from theory that requires few, but specific, conditions be met in order to be satisfied, and therefore stands to apply generally to diverse multicellular systems, irrespective of many differences between them. The combination of empirical results from disparate multicellular systems in conjunction with simulations encompassing an expanded set of multicellular morphologies and growth processes compellingly support the generality of the insights. Beyond simply speculating about the implications of entropic packing on the function of multicellular systems, the authors demonstrate impact or lack thereof on aspects of form and function in multicellular yeast and Volvox. Importantly, simulations allowed the authors to investigate in detail the robustness of theoretical predictions in terms of deviations from theory arising from developmental processes. In addition to providing new insight, this work lays the foundation for the exciting possibility of inferring aspects of developmental dynamics and regulation simply by observing the statistics of cell packing in an organism, which could be of great use in comparative evo-devo studies where developmental processes are difficult or impossible to observe.

While the work is very strong overall, there are a few caveats to consider, primarily concerning the simulations. Multicellularity takes many forms by many different processes among eukaryotes. While simulations do cover a range of different morphologies and developmental processes found in nature, factors not explicitly addressed such as constraint or patterning by secreted extracellular matrix, differences in cell shape, cell migration, and others can lead to different kinds of multicellular form. The extent to which potential correlations imposed by diverse morphologies might lead to deviations from theoretical maximum entropy predictions, and how robust those deviations might be to noise is not entirely clear. Additionally, the randomness strength in simulations from high to low, while reasonable, does not appear to be grounded in empirical characterization of randomness strength in developmental processes across biological systems. Ultimately, although they leave some uncertainty as to the generality of the results, these limitations do not contradict or significantly diminish the key claims of the paper.

Comments for the authors:

1) The simulation results are compelling but left me with some questions. To what extent do the morphologies and processes investigated by simulations address the diverse forms of multicellularity encountered across eukaryotes? To what extent does the overall shape of the multicellular structure affect the cell packing distribution (e.g. multi-lobed structure as in Zoothamnium niveum, dichotomous branching as in Dinobryon, something with an undulating boundary)? Are there any examples of simple multicellular eukaryotes that might exhibit very strongly correlated cell positions? What is known about randomness strength or precision in developmental processes in biological systems, and if anything is known, how does this compare to values in simulations? Providing a bit more contextualization or motivation for specific choices in simulations could help address these questions and would support the generality of conclusions drawn from the simulations. Although I am convinced that the results hold for a broad range of multicellular architectures and do not think that the possible existence of a few edge cases contradicts the main conclusions of the work, it is not entirely clear to me that the effects of growth morphology, connection topology, and dimensionality have been accounted for.

2) The sections titled "Multicellular motility is robust to cellular area heterogeneity" starting on p. 11 is slightly perplexing. It is certainly interested, and I see that it addresses a question that may arise from analysis of Volvox cell packing, but in its current form, I do not believe it contributes substantially to the key points of the paper. The introduction section seemed to imply that the results would demonstrate that fluctuations in cell packing may play a role in the evolution of multicellular systems, but as I understood them, the results suggest that fluctuations do not affect motility, at least implying that there should be little to no effect on any aspect of fitness related to motility. It is possible that there could be other aspects of organismal fitness related to cell packing, so while these results are consistent with cell packing fluctuations not necessarily impeding or constraining the evolution of multicellularity, they do not strongly support that conclusion. Perhaps contextualizing the results a bit more in terms of key points of the paper while reporting them and referring to them in the Discussion section might help the reader better appreciate their significance within the context of the paper overall.

3) I might suggest removing or otherwise modifying the phrase "highly-evolved" (p.14) as its meaning is unclear, has connotations of evolutionary teleology, and clashes with the fact that all extant organisms share an evolutionary history of equal length. Maybe something such as "organisms with highly-regulated development" may be more appropriate.

4) Is anything known about the source of correlated subregions of cells in Volvox? Do the authors have any ideas about this? Either way, it would be interesting to know and may warrant at least a small comment in the text.

5) In the author list, SSH is missing an asterisk to denote corresponding authorship.

6) An "e" is missing in "surface" in the caption for Figure 2B.

7) I believe the dotted red line in Figure 4B should be a solid line to match those in panels A and C.

https://doi.org/10.7554/eLife.72707.sa1

Author response

Reviewer #1:

[…] I have several comments for the authors to consider, at their discretion. Overall, I really enjoyed this paper and learned a great deal from it.

– The manuscript offers an interesting guiding principle that describes two considerably different biological systems. As the authors show in simulations, the principle is expected to hold over a broad range of conditions, but of course not universally (though even small levels of stochasticity broaden the range of applicability). I think the paper could be improved by expanding on the discussion of these limitations. In particular, it is not clear to me exactly how surprising it is to see "good" fits to a 2-parameter distribution of this sort (or more generally, what level of "good" we should expect of the fits in finite data sets like these). The authors address this issue in part by showing fits to other distributions, which is nice. But I wonder if it would be helpful to also include (or at least discuss) more systematic model selection. To be clear, I find the analysis quite convincing as is. But I am trying to get my head around the limitations, and in particular, to get a feel for how likely one is to see similar "goodness of fit" results using other distributions with a relatively small number of parameters.

We completely agree that assessing goodness of fit is crucial, despite the fact that doing so is difficult for complex, non-linear, non-monotonic functions. Thanks to these comments, we have clarified our discussion of the topic and added a new analysis (described in detail in response to the next question). Thank you for this suggestion; we believe addressing it has strengthened the manuscript.

We agree that it is, in general, unclear how well one should expect 2-parameter distributions to perform and what exactly distinguishes a good 2-parameter distribution from a bad 2-parameter distribution for our datasets. To rectify this, we consider four different 2-parameter distributions that use the empirically measured mean and standard deviation (as well as the measured minimum cell size); they are the k-gamma, normal, log-normal, and beta-prime distributions (in the original text, we compared the data only to the normal and log-normal distributions). We do not calculate least-squares fits for these distributions, but simply use the empirically measured values of the mean and variance to extract the two relevant parameters of each distribution. For both experiments and simulations, the k-gamma distribution gives the best match to our data (Figure 2 Supplement 1). Further, as our simulations are less data-limited, we can confirm that the k-gamma distribution is significantly more accurate than the other distributions at predicting the frequency of large Voronoi volumes.

These four distributions were purposefully chosen. The rationale for the k-gamma distribution with regards to maximum entropy packing is detailed in the text. The normal, log-normal and beta-prime choices are discussed below.

We chose the normal distribution as it is the maximum entropy distribution given only the mean and variance of a population; furthermore, it sits as the limiting distribution according to the central limit theorem. Therefore, should random fluctuations add together completely independently, we might expect to observe a (truncated) normal distribution of the volumes. The absence of agreement between the data and this distribution implies that there is an additional feature that must be considered; namely, that cell volumes are not completely independent of one another but rather must sum to match a total volume (this is, of course, a requirement for the k-gamma maximum entropy distribution).

Similarly, the log-normal distribution was chosen for comparison due to its MaxEnt and Central-Limitlike properties. In essence, the log-normal distribution is the central limit theorem result for the logarithm of a variable. Further, volumes must be positive numbers, and the log-normal distribution only exists in the positive domain. Log-normal distributions are observed in many natural systems such as ecology, physiology, geology and more, and have recently been shown to describe organism swimming speeds quite well [6]. They define a good null model for any complex process with many interacting elements, particularly when there is multiplicative noise. By observing that the k-gamma distribution outperforms the log-normal distribution, we recognize that the volumes are not multiplicatively independent of one another.

Last, we chose to compare the performance of the k-gamma distribution to an arbitrary distribution with two parameters: we chose the beta-prime distribution, which has two parameters (α and β) that can be computed from the mean and variance of the population. It also has a semi-infinite domain. As we might expect, the arbitrarily chosen beta-prime distribution underperforms compared to the k-gamma distribution.

Therefore, showing these four distributions together allows us to compare four “minimal” models, and to demonstrate that the k-gamma distribution is the most accurate match for our experiments and simulations.

– Related to the previous point: one approach might be to construct a type of "null" model from the data, perhaps by systematically shuffling the data in some way and then bootstrapping to evaluate the likelihood of achieving fits of similar quality.

Thank you for this suggestion. We agree that this is an excellent approach to justify systematically the chosen distributions. For four 2-parameter distributions (k-gamma, normal, log-normal, and beta-prime), we applied a bootstrapping/resampling approach to understand how much error one might expect to see in these fits given the finite data sets. We applied this approach to experimental and simulated snowflake yeast Voronoi volumes. We took variable number subsamples of these datasets and measured how these four 2-parameter distributions perform. For each sample size, we computed the dataset’s mean and standard deviation, then used these parameters to calculate the necessary parameters for each of the distributions (i.e., we again did not do any “fitting” in a least-squares sense). By calculating the root-mean-square residual of the cumulative distribution functions, we found that the k-gamma distribution always outperforms the others (see Figure 2 Supp 1c). Further, the k-gamma distribution error continually decreases with increasing sample size (albeit slowly for large sample sizes), while the error plateaus at a larger value for all other distributions.

Next, we used the four distributions to predict the skewness of the simulation data. We empirically determined the first two moments of the dataset: we take these to be the first two moments for each of our distributions. Then, we used the first two moments to predict the skewness, which is related to the third moment of the distribution. We then compared this value to experimentally-measured skewness, finding that the k-gamma distribution estimated the experimentally-observed skewness with an error of only 7 percent, while the other three distributions (normal, log-normal, and beta-prime) estimated the skewness with percent errors of −100%, 96%, and 174%, respectively.

As an alternative approach, we test the ability of our selected distributions to perform in a least-squares fit. In other words, we fit the k-gamma, log-normal, Gaussian, and beta-prime distributions to the observed dataset. Then, we compare the values of the fitted parameters (namely, the mean and variance) to the observed values. Distributions that more closely fit the observed data are expected to produce closer estimates of the observed moments. Indeed, upon least-squares fitting the simulated dataset, we observe that the k-gamma distribution fit to the data most closely matches both the observed mean and the observed standard deviation, indicating that it is not only the distribution that best describes the tail of the distribution (as visualized in Supplementary Figure 2) but also the best fit for the distribution when considering the first two moments.

Figure additions: We revised and appended two supplemental figures for this response and the previous one. We included one figure on the probability distribution and cumulative distribution functions for all datapoints (Figure 2—figure supplement 1a,b), and then an additional panel on the bootstrapping analysis (Figure 2—figure supplement 1c). Then, we added a figure detailing the comparison of the estimated skewness from each distribution to the true measured values, and then of the measured mean and variance to the least-squares fit mean and variance for all four distributions (Figure 2—figure supplement 2).

Text additions: We added a subsection to the Methods section titled “Goodness-of-fit analysis” in which we present these results. In this section, we describe our reasons for choosing the three comparison distributions, indicate the parameters involved in each, and finally compare the performance of each.

– Have the authors considered trying to systematically quantify the impact of including higher-order structures in the max ent model? For example, one could perhaps use multi-information metrics (https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.91.238701) to evaluate the extent to which higher-order features of the data are relevant / necessary; the idea is essentially to construct maximum entropy models with various levels of data complexity "built in" and then evaluate (perhaps with an info-theoretic style metric) the extent to which that complexity improved the model. Perhaps something similar has been done for granular materials to capture higher-order statistics of packing? I ask this primarily out of curiosity, not as a serious criticism of the current approach. A discussion of this point might add to the paper.

Thank you for these thought-provoking resources and ideas. We agree that this would be a very interesting next step to take. In particular, in Figures 1, 3, and 4 we probe cases where our predictions and observations differ. We think that future work could build upon these observations, though the framework to do so for maximum entropy packing may not yet exist. Nevertheless, it would be very interesting to use higher-order correction terms to understand exactly what kinds of new considerations are relevant for organisms that achieve various levels of morphological complexity.

We now directly address this topic in the discussion, and discuss both the suggested higher order structure paper as well as recently published work that uses topological information about the contact network of bacterial cells to uncover universal motif distributions. By combining topological and further additions to geometric models, future research may build upon the platform constructed here.

Text Additions: Inspired by this comment, we added a new paragraph (the second) to the Discussion section.

Reviewer #2:

[…] While the work is very strong overall, there are a few caveats to consider, primarily concerning the simulations. Multicellularity takes many forms by many different processes among eukaryotes. While simulations do cover a range of different morphologies and developmental processes found in nature, factors not explicitly addressed such as constraint or patterning by secreted extracellular matrix, differences in cell shape, cell migration, and others can lead to different kinds of multicellular form. The extent to which potential correlations imposed by diverse morphologies might lead to deviations from theoretical maximum entropy predictions, and how robust those deviations might be to noise is not entirely clear. Additionally, the randomness strength in simulations from high to low, while reasonable, does not appear to be grounded in empirical characterization of randomness strength in developmental processes across biological systems. Ultimately, although they leave some uncertainty as to the generality of the results, these limitations do not contradict or significantly diminish the key claims of the paper.

Comments for the authors:

1) The simulation results are compelling but left me with some questions. To what extent do the morphologies and processes investigated by simulations address the diverse forms of multicellularity encountered across eukaryotes? To what extent does the overall shape of the multicellular structure affect the cell packing distribution (e.g. multi-lobed structure as in Zoothamnium niveum, dichotomous branching as in Dinobryon, something with an undulating boundary)? Are there any examples of simple multicellular eukaryotes that might exhibit very strongly correlated cell positions? What is known about randomness strength or precision in developmental processes in biological systems, and if anything is known, how does this compare to values in simulations? Providing a bit more contextualization or motivation for specific choices in simulations could help address these questions and would support the generality of conclusions drawn from the simulations. Although I am convinced that the results hold for a broad range of multicellular architectures and do not think that the possible existence of a few edge cases contradicts the main conclusions of the work, it is not entirely clear to me that the effects of growth morphology, connection topology, and dimensionality have been accounted for.

As this is a detailed comment with many independent questions, we interleave them and our responses below.

“To what extent do the morphologies and processes investigated by simulations address the diverse forms of multicellularity encountered across eukaryotes?”

Broadly, multicellular growth morphologies can be sorted into two general classes: intercellular bonds may be reformable, or they can be permanent (i.e., un-reformable). Our manuscript addresses both classes.

“Permanent” intercellular bonds are not reformable if broken; mother and daughter cells remain physically attached after cell division is complete. This process occurs in many clades of multicellularity, including plants, green algae, brown algae, red algae, fungi, bacteria, and in some stages of animal development. Incomplete cell separation is one of the oldest forms of multicellular assembly and one of the most successful, dominating the planet’s biomass [1]. Thus, we have primarily focused on this class of growth morphologies, which includes experiments on snowflake yeast and Volvox carteri, which differ in terms of dimensionality. In two simulations, we aspire to construct these experimentally observed growth morphologies. In the remaining five simulations, we seek to address a range of other multicellular morphologies. In particular, we simulate groups with reformable, sticky intercellular bonds and groups of cells that are confined by a corralling membrane. “Sticky” aggregates are broadly prevalent in nature (e.g., flocculating yeast, bacterial aggregates, or slime molds), and growth morphologies featuring maternal membranes are common as well (e.g., animals, algae).

Of course, multicellularity is rich in form and function, and we cannot capture the growth morphologies of all multicellular organisms. However, sampling a range of growth morphologies with “reformable” and “non-reformable” bonds provides at least an initial sampling of the diversity of growth morphologies that exist. We specifically do not simulate growth morphologies of confluent tissues, whether these be assembled through permanent bonds or sticky cells. As confluent tissues exhibit packing fractions of nearly 1.0, the distribution of Voronoi volumes or areas is highly impacted by the cell size distribution, which is regulated at the cellular level.

Text additions: We added the following sentences to clarify the motivation for our simulations:

“These simulations captured a couple basic properties of multicellularity: organisms may grow in both two and three dimensions, and they may assemble with two different classes of bonds: bonds that are reformable, and bonds are not reformable. […] Finally, palintomy, or growth confined inside a maternal membrane, is also common from algae to animals.”

“To what extent does the overall shape of the multicellular structure affect the cell packing distribution (e.g. multi-lobed structure as in Zoothamnium niveum, dichotomous branching as in Dinobryon, something with an undulating boundary)? Are there any examples of simple multicellular eukaryotes that might exhibit very strongly correlated cell positions?”

This is a very interesting question. What is the role played by the boundary between ‘inside’ and ‘outside’ an organism? Geometric constraints, by definition, limit how cells can be arranged, and sufficiently strong geometric constraints, i.e., constraints that precisely determine cellular positions, could dampen or eliminate fluctuations in cell packing volumes. So, how do boundary conditions impact the cell packing distribution?

First, in the case of groups with unreformable bonds, such as snowflake yeast, Zoothamnium, and Dinobryon, cells are “frozen” into place (subject to minor displacements due to, for example, mechanical deformations). However, so long as the exact location is subject to random fluctuations, each organism will have a similar but different cell spatial structure. The maximum entropy principle thus suggests that the ensemble of all organisms precisely follows the most likely cell packing distribution, even though each individual organism deviates slightly from this distribution.

Second, we would like to clarify how we define the boundary of the organism. This is a non-trivial point: where should the boundary of a multicellular organism be drawn? In some cases, for example cells contained within an epithelial sheet (such as Volvox), the answer is fairly clear. However, even with snowflake yeast, an algorithm must be applied to mathematically define inside versus outside. We used multiple approaches to test that our results did not depend on the method chosen. In most analyses, we used a convex hull to determine the boundary of the organism. In others, we examined “shells” within snowflake yeast clusters; we calculated the Voronoi tessellation of the entire dataset, but then separately analyzed the distribution of Voronoi volumes for cells within a range of distances from the cluster center of mass (e.g., those within 15 microns). For distances smaller than the cluster radius, the shape of Voronoi volumes is unaffected by the exact choice of boundary. Crucially, we find that both approaches exhibit maximum entropy distributions of cell packing volumes, suggesting that the convex hull method and the rough, fractal-like boundary of snowflake yeast does not significantly influence the accuracy of maximum entropy cell packing predictions.

Having clarified those points, we now more directly address the referee’s question: could a particular boundary condition make cell positions more correlated, damping fluctuations and producing deviations from maximum entropy predictions? While we cannot completely address this question, at a minimum it is known that packing statistics in thermal systems are different near and far from boundaries. This phenomenon is well demonstrated in soft matter physics, impacting everything from colloidal particles to polymers [3, 5, 7, 8, 9]. Thus, we expect that boundaries can and do impact packing statistics in multicellular organisms. In particular, it is likely that the maternal membrane we simulate causes similar effects. To test this hypothesis, we returned to our simulations of palintomy within a confining membrane to examine how the boundary impacted cell packing statistics. We bin cells into shells of finite width centered at the center of mass of the organism. As expected, we find that the distribution of Voronoi volumes depends on the distance from the boundary; cells located near the boundary are packed more densely than cells deeper in the interior. However, we find that packing within the ‘core’ and within the ‘shell’ still follow separate maximum entropy predictions (see Figure 2—figure supplement 3).

There are, in fact, a number of simple multicellular organisms with highly correlated cell positions. Volvox carteri is just such an example. More broadly, linear filaments (e.g., cyanobacteria) are highly correlated; the topological constraints of the bond network in linear filaments present strong non-stochastic effects, and their cell packing distribution will generally not follow the k-gamma distribution. Last, organisms that form with confluent cell layers will likely be highly correlated as the distribution of Voronoi volumes or areas will largely arise from the distribution of cell size, a trait that is highly regulated at the cellular level.

Text additions: To make this subject more clear to the reader, we have revised the first paragraph of the subsection “Snowflake Yeast” in the section “Experimental tests of multicellular maximum entropy predictions”. We have changed this paragraph to read as follows:

“Snowflake yeast grow via incomplete cytokinesis, generating branched structures in which mother-daughter cells remain attached by permanently bonded cell walls (Figure 1A). […] Conversely, if there are strong correlations in the locations of daughter cells, then we will observe deviations from maximum entropy predictions regarding the cell neighborhood volumes.”

We also added a sentence about confluent cell layers: “However, we would not expect this prediction to hold for confluent cell layers (i.e. packing fraction φ ≈ 1), since in general cell size is a highly regulated process; when cells tile space, the cell size distribution is highly correlated with the cell neighborhood volume distribution.”

Figure additions: We have added Figure 2—figure supplement 3, in which we show that in simulations of cells confined by a maternal membrane, the membrane poses boundary effects on the packing of the group, as shown in panel A. In panel A, we show the volume of the Voronoi volumes as a function of the distance from the center of mass (in normalized units). Then, we partition the membrane into spherical shells and show that within each shell, the distribution of Voronoi volumes is well-described by the k-gamma distribution (panels B-E).

“What is known about randomness strength or precision in developmental processes in biological systems, and if anything is known, how does this compare to values in simulations?”

A number of studies have shown that fluctuations can be crucial to highly-functioning developmental processes, for instance in sheet folding sepal growth [2, 4]. Such studies tell us three important things: (i) randomness can be just as integral a component of development as directed growth; (ii) it may be difficult to disentangle exactly which reproducible qualities and traits stem from a regulated developmental process, and which stem from random processes; and (iii) these studies imply that random processes may be more prevalent in development than we currently know, as there are not many tools by which we can actually measure the effect of randomness. With this in mind, our manuscript provides a mechanism for testing the effect of randomness on cell packing distributions. Our simulations provide test cases for what we might expect to observe in the presence or absence of randomness. We show how deviations from maximum entropy packing predictions occur, and that they indicate the presence of correlations. With this information, future work can then use this tool to quantify the randomness strength of various developmental processes.

Text additions: To highlight better this point, we have modified a paragraph of the text. The first paragraph of the section “The crucial role of randomness” now reads:

“How much randomness is necessary for the k-gamma distribution to predict cell neighborhood size distributions? […] We do not attempt to recapitulate exactly any naturally-occurring levels of randomness strength; rather, we are looking to investigate the limits about how well maximum entropy predictions can measure a deviation from pure randomness.”

2) The sections titled "Multicellular motility is robust to cellular area heterogeneity" starting on p. 11 is slightly perplexing. It is certainly interested, and I see that it addresses a question that may arise from analysis of Volvox cell packing, but in its current form, I do not believe it contributes substantially to the key points of the paper. The introduction section seemed to imply that the results would demonstrate that fluctuations in cell packing may play a role in the evolution of multicellular systems, but as I understood them, the results suggest that fluctuations do not affect motility, at least implying that there should be little to no effect on any aspect of fitness related to motility. It is possible that there could be other aspects of organismal fitness related to cell packing, so while these results are consistent with cell packing fluctuations not necessarily impeding or constraining the evolution of multicellularity, they do not strongly support that conclusion. Perhaps contextualizing the results a bit more in terms of key points of the paper while reporting them and referring to them in the Discussion section might help the reader better appreciate their significance within the context of the paper overall.

We agree that this section could be made clearer to the reader. Broadly, random perturbations could be detrimental, beneficial, or neutral to an organism. Naively, we may expect that perturbations push an organism further from an ‘ideal’ morphology and thus are largely detrimental. With snowflake yeast, we show that randomness in cell positions can lead to highly repeatable group level traits, which allows emergent multicellular traits to become remarkably heritable (Zamani-Dahaj et al., 2021: doi.org/10.1101/2021.07.19.452990). In this way, entropic cell packing imbues nascent multicellular organisms like snowflake yeast with an evolutionary benefit, allowing natural selection to act upon multicellular traits that emerge from the mechanics of cellular packing. Our analysis of Volvox swimming speed shows that randomness can also be neutral. One may have expected that swimming speed is optimized via precise arrangement of flagella; instead, our results show that, to leading order, swimming speed is unaffected by cell packing fluctuations. Thus, while randomness does not produce a benefit to Volvox swimming speed, the fact that optimal swimming does not require an ordered array of flagella represents the absence of a barrier. Swimming speed, a complex group-level trait, can be optimized without first evolving a developmental mechanism that produces precisely arranged cells.

Text additions: We make the following additions to the text to help the reader contextualize these results:

“However, the effect of heterogeneity in cell area on motility is as yet unknown; naively, we might expect that heterogeneities in flagella packing may lower the motility of the organism in comparison to a highly regular arrangement. […] The absence of such a barrier may represent a scenario where trait optimization can be achieved without first evolving sophisticated developmental genes.”

3) I might suggest removing or otherwise modifying the phrase "highly-evolved" (p.14) as its meaning is unclear, has connotations of evolutionary teleology, and clashes with the fact that all extant organisms share an evolutionary history of equal length. Maybe something such as "organisms with highly-regulated development" may be more appropriate.

We agree and have made the suggested change.

4) Is anything known about the source of correlated subregions of cells in Volvox? Do the authors have any ideas about this? Either way, it would be interesting to know and may warrant at least a small comment in the text.

We agree that this question should be addressed in the manuscript. Volvox exhibit a known anterior/posterior polarity. The organism will orient its anterior pole towards a light source and swim in that direction. It is possible that this polarity also affects the somatic cell area distribution.

Typically, an anterior/posterior polarity is measured by observing a moving organism: however, in our snapshot data, the organism is stationary. In other cases, the distribution of the germ cells within the interior of the organism can be used, at some stages of the Volvox lifespan, to correlate with the anterior/posterior positions. However, we have considered here only the somatic cells, and although we have returned to the original images and labeled the germ cells, we have found that, at best, any labeling of anterior/posterior is messy and unreliable due to uncertainty both in the position of the germ cells and in extrapolating those positions to define a definitive pole. Therefore, we do not report these measurements here, but instead discuss the idea that anterior/posterior polarity may affect the correlated subregions of the somatic cell areas.

Text additions: We have revised the first paragraph of the subsection “The role of spatial correlations” to read:

“While we have shown that the distribution of cell neighborhood volumes closely follows the k-gamma distribution in two very different organisms, we have also seen that in some cases maximum entropy predictions are more accurate in sub-sections of an organism than across its entirety. […] This polarity may affect the somatic cell packing distribution in different subregions of the organism, leading to deviations from maximum entropy predictions.”

5) In the author list, SSH is missing an asterisk to denote corresponding authorship.

Corrected.

6) An "e" is missing in "surface" in the caption for Figure 2B.

Corrected.

7) I believe the dotted red line in Figure 4B should be a solid line to match those in panels A and C.

We appreciate your close scrutiny of our figures. Here, the dotted red line signifies that there is only a small, discrete number of observed cell volumes when there is no noise added to the system. We modified the caption to emphasize this point and ensure it is clear to the reader.

Text additions: We added the following sentences to the figure caption:

“Note: the discrete points in (B) arise because, absent randomness in the cell locations, the exact same cellular structure is achieved by every simulation. […] Upon adding randomness, the cellular structure is altered between successive simulations, and the distribution again achieves continuity.”

References:

1. Yinon M. Bar-On, Rob Phillips, and Ron Milo. “The biomass distribution on Earth”. In: Proceedings of the National Academy of Sciences 115.25 (2018), pp. 6506–6511. issn: 10916490. doi: 10.1073/pnas.1711842115.

2. Pierre A. Haas et al. “The noisy basis of morphogenesis: Mechanisms and mechanics of cell sheet folding inferred from developmental variability”. In: PLoS Biology 16.7 (2018), pp. 1–37. issn: 15457885. doi: http://dx.doi.org/10.1371/journal.pbio.2005536.

3. K. Harth, A. Mauney, and R. Stannarius. “Frustrated packing of spheres in a flat container under symmetry-breaking bias”. In: Physical Review E – Statistical, Nonlinear, and Soft Matter Physics 91.3 (2015), pp. 1–5. issn: 15502376. doi: 10.1103/PhysRevE.91.030201.

4. Lilan Hong et al. “Variable Cell Growth Yields Reproducible OrganDevelopment through Spatiotemporal Averaging”. In: Developmental Cell 38.1 (2016), pp. 15–32. issn: 18781551. doi: http://dx.doi.org/ 10.1016/j.devcel.2016.06.016.

5. S´ara L´evay et al. “Frustrated packing in a granular system under geometrical confinement”. In: Soft Matter 14.3 (2018), pp. 396–404. issn: 17446848. doi: 10.1039/c7sm01900a.

6. Maciej Lisicki et al. “Swimming eukaryotic microorganisms exhibit a universal speed distribution”. In: eLife 8 (2019), pp. 1–20. issn: 2050084X. doi: 10.7554/eLife.44907.

7. Joerg Reimann et al. “Pebble bed packing in prismatic containers”. In: Fusion Engineering and Design 88.9-10 (2013), pp. 2343–2347. issn: 09203796. doi: 10.1016/j.fusengdes.2013.05.100. url: http://dx.doi.org/10.1016/j.fusengdes.2013.05.100.

8. Mohammad Mahdi Roozbahani, Bujang B.K. Huat, and Afshin Asadi. “Effect of rectangular container’s sides on porosity for equal-sized sphere packing”. In: Powder Technology 224 (2012), pp. 46–50. issn: 00325910. doi: 10.1016/j.powtec.2012.02.018. url: http://dx.doi.org/10.1016/j.powtec.2012. 02.018.

9. Yu G. Stoyan and G. N. Yaskov. “Packing identical spheres into a cylinder”. In: International Transactions in Operational Research 17.1 (2010), pp. 51–70. issn: 14753995. doi: 10.1111/j.1475-3995.2009.00733.x.

https://doi.org/10.7554/eLife.72707.sa2

Article and author information

Author details

  1. Thomas C Day

    School of Physics, Georgia Institute of Technology, Atlanta, United States
    Contribution
    Conceptualization, Data curation, Formal analysis, Investigation, Software, Visualization, Writing – original draft, Writing – review and editing, Methodology
    For correspondence
    tday31@gatech.edu
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-4681-9348
  2. Stephanie S Höhn

    Department of Applied Mathematics and Theoretical Physics, Centre for Mathematical Sciences, University of Cambridge, Cambridge, United Kingdom
    Contribution
    Conceptualization, Funding acquisition, Investigation, Methodology, Project administration, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review and editing
    For correspondence
    sh753@cam.ac.uk
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-1815-705X
  3. Seyed A Zamani-Dahaj

    1. School of Physics, Georgia Institute of Technology, Atlanta, United States
    2. School of Biological Sciences, Georgia Institute of Technology, Atlanta, United States
    3. Quantitative Biosciences Graduate Program, Georgia Institute of Technology, Atlanta, United States
    Contribution
    Investigation, Methodology
    Competing interests
    No competing interests declared
  4. David Yanni

    School of Physics, Georgia Institute of Technology, Atlanta, United States
    Contribution
    Formal analysis, Investigation, Validation, Writing – review and editing
    Competing interests
    No competing interests declared
  5. Anthony Burnetti

    School of Biological Sciences, Georgia Institute of Technology, Atlanta, United States
    Contribution
    Data curation, Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
  6. Jennifer Pentz

    1. School of Biological Sciences, Georgia Institute of Technology, Atlanta, United States
    2. Department of Molecular Biology, Umeå University, Umeå, Sweden
    Contribution
    Data curation, Formal analysis, Investigation, Software, Writing – review and editing
    Competing interests
    No competing interests declared
  7. Aurelia R Honerkamp-Smith

    Department of Applied Mathematics and Theoretical Physics, Centre for Mathematical Sciences, University of Cambridge, Cambridge, United Kingdom
    Present address
    Department of Physics, Lehigh University, Bethlehem, United States
    Contribution
    Conceptualization, Formal analysis, Investigation, Methodology, Software
    Competing interests
    No competing interests declared
  8. Hugo Wioland

    Department of Applied Mathematics and Theoretical Physics, Centre for Mathematical Sciences, University of Cambridge, Cambridge, United Kingdom
    Present address
    Institut Jacques Monod de Paris Diderot/CNRS, Paris, France
    Contribution
    Data curation, Investigation, Methodology, Software, Visualization
    Competing interests
    No competing interests declared
  9. Hannah R Sleath

    Department of Applied Mathematics and Theoretical Physics, Centre for Mathematical Sciences, University of Cambridge, Cambridge, United Kingdom
    Present address
    Department of Chemistry, Imperial College London, London, United Kingdom
    Contribution
    Investigation, Software
    Competing interests
    No competing interests declared
  10. William C Ratcliff

    School of Biological Sciences, Georgia Institute of Technology, Atlanta, United States
    Contribution
    Conceptualization, Formal analysis, Funding acquisition, Resources, Supervision, Writing – original draft, Writing – review and editing
    For correspondence
    william.ratcliff@biology.gatech.edu
    Competing interests
    No competing interests declared
  11. Raymond E Goldstein

    Department of Applied Mathematics and Theoretical Physics, Centre for Mathematical Sciences, University of Cambridge, Cambridge, United Kingdom
    Contribution
    Conceptualization, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Validation, Writing – original draft, Writing – review and editing
    For correspondence
    R.E.Goldstein@damtp.cam.ac.uk
    Competing interests
    Reviewing editor, eLife
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-2645-0598
  12. Peter J Yunker

    School of Physics, Georgia Institute of Technology, Atlanta, United States
    Contribution
    Conceptualization, Formal analysis, Funding acquisition, Resources, Supervision, Writing – original draft, Writing – review and editing
    For correspondence
    peter.yunker@gatech.edu
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-8471-4171

Funding

National Institutes of Health (1R35GM138030)

  • Will Ratcliff

Wellcome Trust (207510/Z/17/Z)

  • Stephanie S Hohn
  • Raymond E Goldstein

Engineering and Physical Sciences Research Council (EP/M017982/1)

  • Raymond E Goldstein

National Institutes of Health (1R35GM138354-01)

  • Peter J Yunker

Engineering and Physical Sciences Research Council (Vacation Bursary)

  • Hannah R Sleath

John Templeton Foundation (A009723003)

  • Raymond Goldstein
  • Stephanie S Höhn

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

Core Facilities at the Carl R Woese Institute for Genomic Biology. WCR was supported by NIH grant 1R35GM138030. This work was funded in whole, or in part, by the Wellcome Trust (Grant 207510/Z/17/Z; REG & SSH). For the purpose of open access, the authors have applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission. This work was also supported in part by Established Career Fellowship EP/M017982/1 from the Engineering and Physical Sciences Research Council (REG).

Senior Editor

  1. Aleksandra M Walczak, École Normale Supérieure, France

Reviewing Editor

  1. Irene Giardina, Università Sapienza, Italy

Publication history

  1. Received: August 2, 2021
  2. Accepted: January 4, 2022
  3. Version of Record published: February 21, 2022 (version 1)

Copyright

© 2022, Day et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 1,565
    Page views
  • 111
    Downloads
  • 0
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Thomas C Day
  2. Stephanie S Höhn
  3. Seyed A Zamani-Dahaj
  4. David Yanni
  5. Anthony Burnetti
  6. Jennifer Pentz
  7. Aurelia R Honerkamp-Smith
  8. Hugo Wioland
  9. Hannah R Sleath
  10. William C Ratcliff
  11. Raymond E Goldstein
  12. Peter J Yunker
(2022)
Cellular organization in lab-evolved and extant multicellular species obeys a maximum entropy law
eLife 11:e72707.
https://doi.org/10.7554/eLife.72707

Further reading

    1. Physics of Living Systems
    Patrick Bakenecker et al.
    Research Article

    Little is known about how muscle length affects residual force enhancement (rFE) in humans. We therefore investigated rFE at short, long, and very long muscle lengths within the human quadriceps and patellar tendon (PT) using conventional dynamometry with motion capture (rFETQ) and a new, non-invasive shear-wave tensiometry technique (rFEWS). Eleven healthy male participants performed submaximal (50% max.) EMG-matched fixed-end reference and stretch-hold contractions across these muscle lengths while muscle fascicle length changes of the vastus lateralis (VL) were captured using B-mode ultrasound. We found significant rFETQ at long (7±5%) and very long (12±8%), but not short (2±5%) muscle lengths, whereas rFEWS was only significant at the very long (38±27%), but not short (8±12%) or long (6±10%) muscle lengths. We also found significant relationships between VL fascicle length and rFETQ (r=0.63, p=.001) and rFEWS (r=0.52, p=.017), but relationships were not significant between VL fascicle stretch amplitude and rFETQ (r=0.33, p=.126) or rFEWS (r=0.29, p=.201). PT shear-wave speed-angle relationships did not agree with estimated PT force-angle relationships, which indicates that estimating PT loads from shear-wave tensiometry might be inaccurate. We conclude that increasing muscle length rather than stretch amplitude contributes more to rFE during submaximal voluntary contractions of the human quadriceps.

    1. Cell Biology
    2. Physics of Living Systems
    Sohyeon Park et al.
    Research Article Updated

    In addition to diffusive signals, cells in tissue also communicate via long, thin cellular protrusions, such as airinemes in zebrafish. Before establishing communication, cellular protrusions must find their target cell. Here, we demonstrate that the shapes of airinemes in zebrafish are consistent with a finite persistent random walk model. The probability of contacting the target cell is maximized for a balance between ballistic search (straight) and diffusive search (highly curved, random). We find that the curvature of airinemes in zebrafish, extracted from live-cell microscopy, is approximately the same value as the optimum in the simple persistent random walk model. We also explore the ability of the target cell to infer direction of the airineme’s source, finding that there is a theoretical trade-off between search optimality and directional information. This provides a framework to characterize the shape, and performance objectives, of non-canonical cellular protrusions in general.