On the origin of universal cell shape variability in confluent epithelial monolayers
Abstract
Cell shape is fundamental in biology. The average cell shape can influence crucial biological functions, such as cell fate and division orientation. But celltocell shape variability is often regarded as noise. In contrast, recent works reveal that shape variability in diverse epithelial monolayers follows a nearly universal distribution. However, the origin and implications of this universality remain unclear. Here, assuming contractility and adhesion are crucial for cell shape, characterized via aspect ratio (r), we develop a meanfield analytical theory for shape variability. We find that all the systemspecific details combine into a single parameter α that governs the probability distribution function (PDF) of r; this leads to a universal relation between the standard deviation and the average of r. The PDF for the scaled r is not strictly but nearly universal. In addition, we obtain the scaled area distribution, described by the parameter μ. Information of α and μ together can distinguish the effects of changing physical conditions, such as maturation, on different system properties. We have verified the theory via simulations of two distinct models of epithelial monolayers and with existing experiments on diverse systems. We demonstrate that in a confluent monolayer, average shape determines both the shape variability and dynamics. Our results imply that cell shape distribution is inevitable, where a single parameter describes both statics and dynamics and provides a framework to analyze and compare diverse epithelial systems. In contrast to existing theories, our work shows that the universal properties are consequences of a mathematical property and should be valid in general, even in the fluid regime.
Editor's evaluation
In this important study, the authors unveil the reason for nearly universal shape fluctuations that have been reported earlier by theoretically analysing the energy of a confluent epithelial tissue. The comparison of their analytic results with simulations and experimental data is compelling, only the justification of the cell area distribution is somewhat incomplete. The manuscript is relevant for all people with an interest in tissue structure and dynamics.
https://doi.org/10.7554/eLife.76406.sa0Introduction
D’Arcy Thompson argued, in his book On Growth and Form, physical principles could explain tissue packing and cell shape (Thompson, 1917). Shape formation of tissues and organs during embryogenesis is a longstanding, fascinating problem of developmental biology. Since cells are the functional units of a tissue, shapes in the organs must originate at the cellular level (Paluch and Heisenberg, 2009; Wyatt et al., 2015; PérezGonzález et al., 2021; Hannezo et al., 2013). Cell shapes are vital in both health and disease. As cancer progresses (Sailem and Bakal, 2017; Park et al., 2016), as asthma advances (Park et al., 2016; Park et al., 2015; Veerati et al., 2020; Atia et al., 2018), as wounds heal (Nnetu et al., 2012; Poujade et al., 2007), as an embryo develops (Farhadifar et al., 2007; Atia et al., 2018), cells progressively change their shape. Besides, cell shape may influence crucial biological functions, such as cell growth or selective programmed cell death (apoptosis) (Chen et al., 1997), the orientation of the mitotic plane (Wyatt et al., 2015; Bosveld et al., 2016; Xiong et al., 2014), stem cell lineage (McBeath et al., 2004; Wang et al., 2011), terminal differentiation (Watt et al., 1988; Roskelley et al., 1994), and divisioncoupled interspersion in many mammalian epithelia (McKinley et al., 2018). Moreover, the nuclear positioning mechanism in neuroepithelia depends on cell shape variation (Yanakieva et al., 2019). Thompson regarded celltocell shape variability as a biologically unimportant noise Thompson, 1917; however, it is now known that shape variability is not an exception but a fundamental property of a confluent cellular monolayer (Graner and Riveline, 2017). In a seminal work, (Atia et al., 2018) showed that cell shape variability, quantified by the aspect ratio ($r$), follows virtually the same distribution across different epithelial systems. But, the origin of this nearuniversal behavior, and whether it is precisely universal, remains unclear.
Previous works have shown the similarity in dynamics between cellular monolayers and glassy systems (Angelini et al., 2011; Park et al., 2015; Malinverno et al., 2017; Garcia et al., 2015). Additionally, crucial insights into the dynamics of cellular systems are obtained via simplified model systems, treating cells as polygons (Park et al., 2015; Bi et al., 2015; Bi et al., 2016; Chiang and Marenduzzo, 2016; Sadhukhan and Nandi, 2021; Sussman et al., 2018). One of these models, the vertex model (VM), shows a rigidity transition akin to the jamming transition (Bi et al., 2015; Merkel et al., 2019). However, this transition does not exist in other confluent models, such as the Voronoi model (Sussman and Merkel, 2018) or the cellular Potts model (CPM) (Sadhukhan and Nandi, 2021). But, all three models are similar from the perspective of glass transition (Bi et al., 2016; Sussman et al., 2018; Chiang and Marenduzzo, 2016; Sadhukhan and Nandi, 2021). A point of clarification on the terminology is quintessential here. The jamming (rigidity) transition is a zerotemperature, zeroactivity phenomenon. It is a genuine phase transition characterized via an order parameter. The observed shape index, $q$, the ratio of perimeter to the square root of cell area, is an order parameter of the jamming transition (Park et al., 2015; Bi et al., 2015; Merkel et al., 2019). By contrast, the glass transition refers to the extreme dynamical slowing down when the relaxation time, $\tau $, becomes a certain value, usually taken as $10100s$ in experiments. It is not associated with any phase transition, and no static order parameter exists (Berthier and Biroli, 2011). Extensive research in the last couple of decades shows that although jamming and glass transitions can coexist, they are distinct phenomena controlled by entirely different physics (Mari et al., 2009; Biroli and Garrahan, 2013; Berthier and Witten, 2009; Ikeda et al., 2012). However, these terms are often used imprecisely while describing the dynamics of biological systems. The ‘jamming transition’ is sometimes used to describe the changing system behavior from fluidlike fast to solidlike slow dynamics (Berthier et al., 2019; Atia et al., 2021); this is actually the glass transition. Their synonymous use may lead to erroneous conclusions. Since both transitions exist in this field, distinguishing them is crucial to avoid confusion (Park et al., 2015; Chiang and Marenduzzo, 2016; Sussman et al., 2018; Sadhukhan and Nandi, 2021).
Atia et al., 2018 argued that the universal behavior of aspect ratio in an epithelial monolayer is related to the jamming transition. In a jammed system, one can carry out a voronoi tesselation centering each particle and obtain the tesselated volume, $x$, of the individual particles. $x$ is known to follow a $k$Gamma distribution, $P(x,k)=[{k}^{k}/\mathrm{\Gamma}(k)]{x}^{k1}\mathrm{exp}[kx]$, where $\mathrm{\Gamma}$ is the Gamma function and the distribution is characterised by a single parameter $k$ (Aste and Di Matteo, 2008). From this association, (Atia et al., 2018) conjectured the viability of describing the distribution of $r$ via $P(x,k)$. But, the analytical derivation of the $k$Gamma distribution in granular packing (Aste and Di Matteo, 2008) relies on the fact that $x$ is additive, whereas, as the authors of Atia et al., 2018 rightly point out, $r$, is not. Thus, there “exists no rigorous basis for the applicability of the kGamma distribution” (Atia et al., 2018). Yet, (Atia et al., 2018) and several subsequent works Lin et al., 2018; Li et al., 2021; Kim et al., 2020; Wenzel and Voigt, 2021 have shown that the probability distribution function (PDF) of scaled aspect ratio, r_{s} (defined in ‘Aspects of universality’), in diverse biological and model systems can be described by $P({r}_{s},k)$, and the value of $k$ is nearly the same, around 2.5, for these diverse systems. Furthermore, the standard deviation, $sd$, vs the mean aspect ratio, $\overline{r}$, follows a universal relation (Atia et al., 2018; Kim et al., 2020; Ilina et al., 2020). What is the origin of this universality? What determines the value of $k\sim 2.5$? How is the PDF of $r$ related to the microscopic properties of a system? Answers to questions like these are crucial for deeper insights into the cell shape variability and unveiling the implications of the universality. However, it requires an analytical theory that is rare in this field due to the inherent complexity of the problem and the presence of manybody interactions.
The existing theoretical works on cellular shapes assume solid like property of the system: specifically, the property that deformation of a solid is elastic in nature, that is the system resumes its shape when the external force is relaxed. For example, cell shape in the context of the jamming transition within the VM has been studied in Czajkowski et al., 2018. In a recent work, Li et al demonstrated the presence of a gamma distribution of $r$ in a wide variety of systems (Li et al., 2021). The analytical framework of Li et al., 2021 related this distribution to Boltzmannlike features and the elastic nature of the system. However, the origin of the universal properties and what dictates the value of $k\sim 2.5$ remained unclear. Furthermore, in the context of cellular systems, the solidlike nature is only applicable deep in the glassy regime or in a jammed system (Czajkowski et al., 2018). But, most biological systems are fluidlike due to activity. The universal behavior in diverse systems seems to suggest that a more general mechanism must exist. In this work, we take a different approach. We develop a meanfield analytical theory for cell shape variability without consideration of solidlike nature or rigidity, and thus, our results should be applicable even in the fluid regime. Crucially, our work reveals that the origin of the universal features is a mathematical property.
The main results of this work are as follows: (1) We find that the aspect ratio distribution is described by a single parameter, α, containing all the systemspecific parameters. Having a single parameter within the theory implies that $\overline{r}$ determines the distribution. We demonstrate this in Figure 4f, illustrating the predictive power of the theory. This also implies a universal relation between $sd$ and $\overline{r}$ (Figure 1e, Figure 3f, Figure 4d). (2) The PDF of r_{s} is not strictly, but nearly, universal; $k\sim 2.5$ is a direct consequence of a mathematical property. The kGamma distribution for r_{s} is obtained as a rough approximation of our analytical expression. Crucially, this nearly universal distribution of r_{s} exists for systems even in the fluid regime; we demonstrate this by comparing our analytical results with our simulations (Figure 3c) and existing experimental data (Figure 4c). (3) We also obtain the PDF for the scaled area, $a$, and show that it is not universal, in contrast to what has been proposed elsewhere (Wilk et al., 2014; Figure 2 and Figure 4e). (4) We demonstrate that simultaneous measurements of the PDFs for $a$ and $r$ can reveal the effects of changing physical conditions, such as maturation, on the individual model parameters. We have verified our theory via simulations of two distinct models of a confluent epithelial monolayer: the discrete latticebased CPM on square and hexagonal lattices and the continuous VM (see Appendix 2 for details). Moreover, comparisons with existing experimental data on a wide variety of epithelial systems show excellent agreements.
One remarkable aspect of our work is as follows. It is often hard to control a specific property in a biological system as a perturbation can significantly affect other proteins. However, α includes all such effects. Therefore, even in the absence of detailed knowledge of the individual changes, our theory allows the characterization of different effects in diverse systems by treating α as a control parameter. Such a characterization can illuminate the mechanistic notion if functions and shapes are related irrespective of the molecular details. We further demonstrate in our simulations that α can also be a parameter for the dynamics. Thus, the same parameter describes both statics and dynamics, governs the origin and aspects of universality, and provides a framework to analyze and compare diverse epithelial systems.
Results
Analytical theory for the shape variability
Simplified model systems, representing cells as polygons, have been remarkably successful in describing both the static and dynamic aspects of an epithelial monolayer (Honda, 1978; Honda and Eguchi, 1980; Farhadifar et al., 2007; Bi et al., 2014; Fletcher et al., 2014; Bi et al., 2015; Park et al., 2015). The energy function, $\mathscr{H}$, governing these models is
where $N$ is the number of cells, the first term constrains area, ${A}_{i}$, to a target area, A_{0}, determined by cell height and cell volume, with strength $\lambda}_{A$. Cell heights in experiments remain almost constant in an epithelial monolayer (Farhadifar et al., 2007). The second term describes cortical contractility and adhesion (Figure 1a; Bi et al., 2015; Farhadifar et al., 2007; Prost et al., 2015). It constrains the perimeter, ${P}_{i}$, to the target perimeter $P}_{0$ with strength ${\lambda}_{P}$. The energy function, Equation 1, can be numerically studied (Albert and Schwarz, 2016) via different confluent models, such as the VM (Farhadifar et al., 2007; Bi et al., 2015), the Voronoi model (Bi et al., 2016; Honda, 1978), or more microscopic models such as the CPM (Graner and Glazier, 1992; Hirashima et al., 2017; Hogeweg, 2000) and the phasefield model (Nonomura, 2012; Palmieri et al., 2015; Wenzel and Voigt, 2021). For concreteness, we mostly focus our discussions below within the CPM that has been demonstrated to be more appropriate for variable cellular shapes and sizes (Bosveld et al., 2016). However, our analytical results should generally apply for a confluent system independent of the microscopic details of the models, and we have verified them via numerical simulations in both the VM and the CPM on a square and hexagonal lattice. Furthermore, we have neglected cell division and apoptosis in our simulations for the results presented in the main text; their rates are usually low in epithelial monolayers (Poujade et al., 2007; Park et al., 2015). For example, they are of the order of ${10}^{2}$ per hour and per day, respectively, for an MDCK monolayer (Puliafito, 2017). Nevertheless, we show in Appendix 4, that the general conclusions of the theory remain unchanged when their rates are not very large.
Our starting point is the energy function, Equation 1, describing a confluent system of cells. We assume that the probability of a specific cellular configuration is given by a Boltzmann weight at an effective temperature, $T$ (see ‘Details of the analytical calculation’ for details). Note that $T$ in an active system includes contributions from all possible activities and the equilibrium temperature. An exact interpretation of $T$ depends on the system, and several definitions of $T$ exist, for example, the ratio of correlation to response function (Nandi and Gov, 2018; Petrelli et al., 2020; Nandi and Gov, 2017), from Einstein relation (Szamel, 2014), etc. The confluent models (such as the VM or the CPM) for epithelial systems, have two main variants: depending on the presence or absence of activity in the form of selfpropulsion. The second variant represents equilibrium systems; $T$ is treated at the same footing as an equilibrium temperature and provides good agreements with experiments (Glazier and Graner, 1993; Hirashima et al., 2017; Sussman et al., 2018; Fletcher et al., 2014). Thus, the Boltzmann distribution is justified, at least within our simulations (Appendix 1). Excellent agreements of our results with experiments and analyses of the experimental systems in terms of an effective temperature (Atia et al., 2018; Kim et al., 2020) also validate this description. (Li et al., 2021) also finds a Boltzmann distribution description at an effective temperature applies to a wide variety of systems. An exact analytical calculation for the distribution of $r$ is impractical; therefore, we have made several simplifying assumptions. They are either motivated by or justified in our simulations. Here, we briefly discuss the main aspects of the calculation and relegate the technical details to the Materials and Methods (‘Details of analytical calculation’) . A detailed comparison of the analytical theory with our simulations and justifications of the assumptions are shown in ‘Comparison with simulations’.
One crucial aspect of these model systems (such as the VM, the CPM, or the Voronoi model) of epithelial monolayer is the constraint of confluency, which is area fraction is unity at all times; it enters the problem via the area term in Equation 1. This constraint is an intricate mathematical problem rendering a direct analytical calculation impractical. However, we can bypass this difficulty and gain valuable insights into the distribution of $r$. First, a thin actomyosin layer, known as cortex, mainly governs the cellular mechanical properties (Prost et al., 2015). Therefore, the perimeter term should be dominant in determining shape. Second, shape fluctuation in these models can occur only via changes in the boundary. Third, $r$ being nondimensional can vary independently of the cell area. Thus, in the regime of our interest, when the cells are compact objects (in contrast to being fractallike in other regimes Sadhukhan and Nandi, 2021, see Appendix 2), we expect the area term in Equation 1 to be not crucial in determining $r$. We have tested this assumption in two different ways in our simulations. First, if the area term is not paramount, the distribution of $r$ should not depend on ${\lambda}_{A}$; as detailed later (Figure 2b and Figure 2—figure supplement 1), this is indeed true. Second, in the regime of our interest, the distribution of $r$ of a single cell (treating the rest of the system as medium Graner and Glazier, 1992; Glazier and Graner, 1993) is nearly the same as that of a confluent system (Figure 2a). Moreover, as discussed below, the energy from the area term varies only slightly as ${\lambda}_{A}$ changes. Therefore, we assume that the area constraint is satisfied and not crucial for the aspect ratio distribution.
Then the energy function, Equation 1, becomes a sum of energies coming from individual cells. Since the perimeter of a cell is independent of that of others, we can concentrate on a particular cell, $i$, with energy
where the first term represents contractility, and the second, effective adhesion. We have ignored the constant part as it does not affect any system properties. We first develop a coarsegrained description of cellperimeter designating it via a set of representative points, as described in ‘Details of the analytical calculation’. To calculate the aspect ratio, $r$, we first need to obtain the two radii of gyrations, s_{1} and s_{2}, around the two principal axes (see ‘Radius of gyration’ for the definitions). Then $r={s}_{1}/{s}_{2}$, considering $s}_{1}>{s}_{2$ without loss of generality. However, a direct calculation of s_{1} and s_{2} is intricate due to their anisotropic natures. A slightly simpler calculation is possible for $s$, the radius of gyration around the center of mass, and we have ${s}^{2}={s}_{1}^{2}+{s}_{2}^{2}$ (Davis and Denton, 2018). Therefore, we first obtain the distribution of ${s}^{2}$, $P({s}^{2})$, and then using this, obtain $P(r)$. As detailed in ‘Details of the analytical calculation’, using ${s}^{2}=A(r+1/r)$, with $A$ being the cell area, we obtain
where the normalization constant $\mathcal{N}$ is determined via the constraint that total probability must be unity and $\alpha \propto {\lambda}_{P}(1K{P}_{0})/T$ with $K$ being a constant.
Additionally, as detailed in ‘Distribution for area’, Equation 18 together with the constraint of confluency (Weaire, 1986; Gezer et al., 2021), give the distribution for the scaled area $a=A/\overline{A}$, where $\overline{A}$ is the average of area. It is a Gamma distribution, with a single parameter μ,
Since μ is related to the constraint of confluency, it should be independent of ${\lambda}_{P}$; our simulations show that this is indeed true (Figure 3e). Therefore, α and μ together can distinguish how the model parameters ${\lambda}_{P}$ and $T$ are affected by changing conditions such as maturation.
Note that the power 3/2 of the algebraic term in Equation 3 comes from the mathematical property of closedlooped objects. That is, for closedloop objects, the lowest nonzero eigenvalue will have degeneracy 2, leading to the exponent 3 in the algebraic part of Equation 18, as detailed in ‘Details of the analytical calculation’. As shown below (next section), Equation 3 can be roughly approximated as a $k$Gamma distribution for r_{s} that has been fitted with experimental and simulation data (Atia et al., 2018; Kim et al., 2020; Li et al., 2021; Wenzel and Voigt, 2021). The value of $k\sim 2.5$ found in these fits comes from this mathematical property. Since the perimeter of a cell must be closedlooped, this mathematical property is inevitable. On the other hand, all the system specific details are contained in the parameter $\alpha $. We treat it as a free parameter in the theory and obtain its value via fits with data. Thus, Equation 3 provides a remarkable description, where all the systemspecific details enter through a single parameter, α; it has profound implications leading to the universal behavior as we now illustrate.
Aspects of universality
We show the PDF of the aspect ratio, $P(r)$, at different values of α in Figure 1b, $P(r)$ decays faster as $\alpha $ increases, as expected from Equation 3. The plots look remarkably similar to the experimental results shown in Atia et al., 2018; we present detailed comparisons with experiments later. (Atia et al., 2018) has demonstrated that the PDFs of the scaled aspect ratio, ${r}_{s}=(r1)/(\overline{r}1)$, where $\overline{r}$ is the ensembleaveraged $r$, across different systems follow a nearuniversal behavior. We now plot the PDFs of r_{s} in Figure 1c. The PDFs almost overlap, but they are not identical. A closer look at Equation 3 shows that if $\overline{r}+1/\overline{r}$ goes as $1/\alpha $, we can scale $\alpha $ out of the equation and obtain a universal scaled distribution for r_{s}. However, as shown in Figure 1d, there is a slight deviation in $\overline{r}+1/\overline{r}$ with the functional form of $1/\alpha $. This tiny deviation implies that the PDF of r_{s} is not universal. If one ignores $1/r$ compared to $r$, Equation 3 becomes a $k$Gamma function for r_{s} with $k=2.5$. However, since $r\sim \mathcal{O}(1)$, this cannot be a good approximation, and the observed spread of $k$ around 2.5 is natural when fitted with this function (Atia et al., 2018; Li and Ciamarra, 2018; Kim et al., 2020; Wenzel and Voigt, 2021). On the other hand, since the deviation (Figure 1d) is minute, the PDFs of r_{s} for different systems look nearly universal (Figure 1c): the αdependence becomes so weak for the PDFs of r_{s} that they seem to be independent of α. This result is a strong prediction of the theory and, as we show below, is corroborated by available experimental data on diverse epithelial systems.
Although the PDFs of r_{s} are not strictly universal, there is another aspect, $sd$ vs $\overline{r}$, which is universal. We show the $sd=\sqrt{\overline{{r}^{2}}{\overline{r}}^{2}}$ as a function of $\overline{r}$ in Figure 1e. Since there is only one parameter, α, in Equation 3, it determines both $sd$ and $\overline{r}$. The monotonic dependence of $\overline{r}$ on α (Figure 1d) implies a unique relationship between them. Therefore, we can express $\alpha $ in terms of $\overline{r}$ and, in turn, $sd$ as a function of $\overline{r}$. Since there is no other systemdependent parameter in this relation, it must be universal. Note that $\alpha \propto {\lambda}_{P}/T$ at a constant $P}_{0$, thus, α increases as ${\lambda}_{P}$ increases or $T$ decreases. Both $sd$ and $\overline{r}$ become smaller as α increases, and the system on the $sd$ vs $\overline{r}$ plot moves towards the origin (Figure 1e). From the perspective of the dynamics, the relaxation time, $\tau $, of the system grows as $\alpha $ increases (Sadhukhan and Nandi, 2021). Thus, small $\overline{r}$ and large $\tau$, that is, less elongated cells and slow dynamics, follow each other, and the energy function, Equation 1, controls both behaviors. Finally, we show in Figure 1f some representative PDFs, Equation 4, at different values of μ for the scaled area $a=A/\overline{A}$. The PDF of $a$ has been argued to be universal Wilk et al., 2014; our theory shows that although the PDFs of $a$ for different systems follow the same functional form, they are not identical.
Comparison with simulations
We now compare our analytical theory with simulations of two distinct confluent models: the CPM and the VM. In our simulations, we use the original energy function, Equation 1, and other simulation details are presented in the Appendix 2. Unless otherwise stated, the CPM simulations, presented in the main text, are on the square lattice; the data for hexagonal lattice CPM simulations are shown in Figure 3—figure supplement 1. We first present tests of the crucial assumption that, in the regime of our interest here, we can ignore the constraint of confluency that enters via the area term in Equation 1. We simulate, within the CPM, single cells treating the rest of the system as medium and compare the distribution of $r$ with that in a confluent system. As shown in Figure 2a, the PDFs are nearly the same. Next, we have simulated the confluent systems with varying ${\lambda}_{A}$ and find that the distribution of $r$ remains almost independent of ${\lambda}_{A}$. We have obtained $\alpha $ that characterizes the aspect ratio distribution via Equation 3 at fixed ${\lambda}_{P}$, $P}_{0$, and $T$ but varying ${\lambda}_{A}$. As shown in Figure 2b, within both the CPM and the VM, α remains almost constant with varying ${\lambda}_{A}$ (the distributions are shown in Figure 2—figure supplement 2). Finally, the scaled area, $a$, given by Equation 4, is a sharply peaked function around $a=1$. We show the distribution of $a$ within the CPM in Figure 2c: the lines are fits with Equation 4 with μ as a fitting parameter. Thus, assuming that cells satisfy the target area, we can ignore the area term in Equation 1 to obtain the distribution of $r$.
Figure 2c shows that $P(a)$ becomes more sharply peaked as ${\lambda}_{A}$ increases; this makes sense as greater values of ${\lambda}_{A}$ ensure the area constraint is more effective and the distributions become sharply peaked around the average area. Thus, the standard deviation of $A$ decreases as ${\lambda}_{A}$ increases. This implies that the variation in energy from the area term in Equation 1 is even less. We have checked that it is less than 10% for about a 300% change in ${\lambda}_{A}$. Thus, treating the area part of Equation 1 as a constant is justified. Figure 2d shows that μ almost linearly increases with ${\lambda}_{A}$. We present similar results for the VM in Figure 2—figure supplement 3. Since the area term is related to the cell height that remains nearly constant and the geometric constraint of confluency, we do not expect a substantial variation in ${\lambda}_{A}$ in a particular system. However, μ also varies with $T$ (see Figure 2—figure supplement 4). Thus, in contrast to what has been proposed elsewhere (Wilk et al., 2014), as $T$ changes, though the PDF of $a$ is welldescribed by the same function, Equation 4, the values of μ can be different. Thus, the PDFs for different systems or the same system at different levels of activity and maturation need not be identical.
We now show that our analytical theory agrees well with simulation data. To highlight that the distribution of $r$ and its associated universal properties are also valid in the fluid regime, we have mostly simulated the systems in this regime. However, some of the simulations are also in the glassy regime (with relaxation time greater than 10^{4}). Since glass transition is not associated with any thermodynamic transition, we do not expect a drastic change in the static properties, such as the distribution of $r$. Figure 3a and b show representative plots for the comparison of the PDFs of $r$ within the CPM, and the VM, respectively, where the lines represent fits with Equation 3. Figure 3a shows data with varying ${\lambda}_{P}$, and Figure 3b shows data with changing $T$. As discussed above, our theory predicts nearly universal behavior for the PDFs of r_{s} (Figure 1c). We plot the simulation data for the PDFs of r_{s} for both the models at different parameters in Figure 3c; the PDFs almost overlap, consistent with the theory (see Figure 3—figure supplement 1 for more results). An important prediction of the theory is that the parameter $\alpha $, which governs the behavior of the cell shape variability, is linearly proportional to both ${\lambda}_{P}$ and $1/T$, hence with ${\lambda}_{P}/T$. This prediction also agrees with our simulations (Figure 3d and Figure 3—figure supplement 2). The slopes within the CPM and the VM are different; this possibly comes from the distinctive natures of the two models, but the qualitative behaviors are the same.
Figure 3e shows α vs μ within the CPM when we vary one of the parameters, ${\lambda}_{A}$, ${\lambda}_{P}$, and $T$, keeping the other two fixed. First, when ${\lambda}_{A}$ increases, the value of μ increases, but α remains almost constant (also see Figure 2b). Next, when ${\lambda}_{P}$ increases, although α linearly increases, μ remains nearly the same. Finally, both parameters linearly increase with $1/T$; since higher $T$ implies more fluctuations, decreasing $T$ helps both $r$ and $a$ to become sharply peaked (see Figure 2—figure supplement 4, Figure 3—figure supplement 2 for their specific behaviors, and results within the VM). These results show when ${\lambda}_{A}$ remains constant, varying ${\lambda}_{P}$ and $T$ have distinctive effects on μ and α. These results are significant from at least two aspects: First, μ comes from the constraint of confluency (see ‘Distribution for area’), which should depend only on the area and be independent of the perimeter. Thus, the ${\lambda}_{P}$independence of μ validates the phenomenological implementation Weaire, 1986; Gezer et al., 2021 of this constraint. Second, these results can provide crucial insights regarding the model parameters. The maturation of a monolayer can affect both ${\lambda}_{P}$ and $T$. Additional junctional proteins may be employed during maturation to increase ${\lambda}_{P}$. On the other hand, different forms of activity may reduce, decreasing $T$. Since $\alpha $ increases linearly with ${\lambda}_{P}/T$, $r$ alone is not enough to determine the dominant mechanism during the maturation process. However, assuming that ${\lambda}_{A}$ remains constant in a particular system, simultaneous measurements of μ and α allow distinguishing effects of changing physical conditions, such as maturation, on the individual parameters.
We next verify the universal result of the theory: $sd$ vs $\overline{r}$. Figure 3f shows $sd$ vs $\overline{r}$ within the CPM; we plot the theoretical prediction by the dotted line for comparison. The theory predicts that the state points move towards the origin as $\alpha $ increases (Figure 1e); this is consistent with our simulations. Since $\alpha \propto {\lambda}_{P}/T$, higher $\alpha $ should correspond to slower dynamics. To test this hypothesis, we have simulated the CPM at different ${\lambda}_{P}$, $P}_{0$, and $T$ to obtain the relaxation time, $\tau $ (see Appendix 2 for details). From these control parameters, we have calculated α and then $\overline{r}$, using our theory. We show $\tau $ as functions of $\alpha $ and $\overline{r}$ in Figure 3g; it is clear that indeed $\tau $ grows as $\alpha $ increases or $\overline{r}$ decreases. To show this behavior of $\tau $ on the same plot as $sd$ vs $\overline{r}$, we plot $2.5/\mathrm{ln}\tau 0.25$ in Figure 3f as a function of $\overline{r}$. A monolayer fluidizes under compressive or stretching experiments, where cell shape changes, but not cell area (Krishnan et al., 2012; Park et al., 2015; Atia et al., 2018). Such perturbations make the cells more elongated, increasing $\overline{r}$; thus, our theory rationalizes the decrease in $\tau $ associated with fluidization under such perturbations. Finally, we show that our meanfield result that $\alpha $ decreases linearly with $P}_{0$ agrees with simulations (Figure 3h). Further, to test our hypothesis that our main results remain unchanged in the presence of cell division and apoptosis, when their rates are small, we have simulated the CPM, including these processes. As shown in the Appendix 4, the simulation results justify our hypothesis.
Comparison with existing experiments
Having shown that our theory agrees well with both the CPM and the VM simulations, we next confront it with the existing experimental data. We first compare the theory with data taken from Atia et al., 2018 for three different confluent cell monolayers: the MDCK cells, the asthmatic HBEC, and the nonasthmatic HBEC. We chose the PDFs at three different times from Figure 3a and c in Atia et al., 2018. We fit Equation 3 with the data to obtain $\alpha $ and present their values in Table 1; the corresponding fits for the MDCK cells are shown in Figure 4a (see Figure 4—figure supplement 1 for the other fits). Table 1 shows that α increases with maturation. Thus, progressive maturation can be interpreted as an increase in either ${\lambda}_{P}$ or $1/T$ or both. The PDFs for r_{s} corresponding to the MDCK cells are shown in the inset of Figure 4a together with one set of experimental data Atia et al., 2018; note that this is not a fit, yet the theory agrees remarkably well with the data.
We next calculate $sd$ as a function of $\overline{r}$ using the values of α, noted in Table 1 for the three systems. They are shown in Figure 4b along with the theory prediction. With maturation, the state points move towards lower $\overline{r}$, represented by the arrow in Figure 4b. As shown in Figure 3g, larger α corresponds to a system with higher $\tau $. Thus, with maturation, as the PDFs become sharply peaked, as the cells look more roundish and $\overline{r}$ becomes smaller, the system becomes more sluggish. This maturation effect is the same in all the systems (Figure 4b) and agrees with the interpretation presented in Atia et al., 2018. We have also examined that the theoretical prediction of $sd$ vs $\overline{r}$ agrees well with the experimentally measured values shown in Atia et al., 2018.
Our theory predicts that the PDF for r_{s}, although not strictly universal, should be almost the same for different systems (Figure 1c). This prediction is a consequence of a crucial aspect of the theory: all the systemspecific details enter via a single parameter, α in Equation 3. As shown in Figure 1d, $\overline{r}+1/\overline{r}$ deviates slightly from the behavior $1/\alpha $. This slight deviation implies that the PDF for r_{s} can not be strictly universal and manifests as a variation in k when the PDF is fitted with the kGamma function in different experiments and simulations (Atia et al., 2018; Kim et al., 2020; Li and Ciamarra, 2018; Lin et al., 2020; Wenzel and Voigt, 2021). Nevertheless, since the deviation in Figure 1d is very weak, the values of $k$ are very close to each other. Therefore, the PDFs for r_{s} in diverse epithelial systems–in experiments, simulations, and theory–should be nearly universal. To test this prediction, we have collected existing experimental and simulation data on different systems and show the PDFs of r_{s} in Figure 4c. The variety in our chosen set is spectacular: it consists of various cancer cell lines (Kim et al., 2020), both asthmatic and nonasthmatic HBEC cells, MDCK cells (Atia et al., 2018), Drosophila wing disk (Lin et al., 2020), simulations data on both active (Li and Ciamarra, 2018) and equilibrium versions of the VM, the active Voronoi model (Atia et al., 2018), and the CPM. Yet, the PDFs shown in Figure 4c look nearly universal and in agreement with our analytical theory.
Additionally, our theory predicts a strictly universal behavior for $sd$ vs $\overline{r}$. Since this relation does not have any systemspecific details, data across diverse confluent monolayers must follow a universal relationship. We have collected existing experimental data for several systems: cancerous cell lines (Kim et al., 2020), human breast cancer cells (Ilina et al., 2020), and a jammed epithelial monolayer of MDCK cells (Fujii et al., 2019). Figure 4d shows the experimental data together with our theoretical prediction; the agreement with our theory, along with the aspect of universality, is truly remarkable. As α increases, dynamics slows down, and the points on this plot move towards lower $\overline{r}$. This result is consistent with the finding that cell shapes are more elongated and variable as the dynamics become faster in different epithelial systems (Atia et al., 2018; Park et al., 2015).
We have argued that simultaneous measurements of the PDFs of cell area and $r$ distinguish the effects of maturation on the two key parameters: ${\lambda}_{P}$ and $T$. The argument relies on the negligible effect of the perimeter constraint on μ (Equation 4). We now show a comparison of our theoretical result for the PDF of $a$ with existing experiments. Figure 4e shows experimental data for four different systems (Wilk et al., 2014; Puliafito, 2017) and the corresponding fits of Equation 4. Unlike what has been proposed elsewhere that epithelial monolayers have a universal area distribution (Wilk et al., 2014), we find, in agreement with experiments, that although the functional form remain the same, the distribution can vary.
What are the implications of these universal aspects of cell shape variability and our theory? Cell shape controls several crucial biological functions such as the mitoticorientation (Wyatt et al., 2015; Bosveld et al., 2016; Xiong et al., 2014) and cell fate (McBeath et al., 2004; Wang et al., 2011; Roskelley et al., 1994). Our theory shows that the microscopic system properties are encoded via a single parameter, $\alpha $. Consequently, knowledge of one of the observables, such as $\overline{r}$, contains the information of the entire statistical properties in a monolayer. We now illustrate this predictive aspect of the theory. Experimental measurement of an average property is usually less complex and more reliable. We have collected the data for $\overline{r}$ from the supplementary material of Kim et al., 2020 for three different systems: 10Ca1A, 10AT, and 10 A.ErbB2, shown by the hexagrams in Figure 4d. From these average values, we obtain α, which we use to theoretically calculate the PDFs for $r$, as shown in Figure 4f. The inset of Figure 4f shows our theoretical PDFs for r_{s}, together with the corresponding experimental data for comparison. The excellent agreement demonstrates that cell shape variability results from the geometric constraint imposed by the energy function, Equation 1, and is not a choice but inevitable for such systems. This result, we believe, will foster analysis of diverse epithelial systems to understand the interrelation between geometric properties and biological functions within a unified framework.
Discussion
We have obtained a meanfield theory for cell shape variability through the energy function $\mathcal{H}$, (Equation 1; Farhadifar et al., 2007; Honda, 1978; Bi et al., 2015; Bi et al., 2016; Sadhukhan and Nandi, 2021). We have used simplifying assumptions for analytical tractability and justified them in detailed simulations of the VM and the CPM on a square and a hexagonal lattice. The geometric restriction of confluency is a strong constraint on cell area. Considering that the area constraint is satisfied and that the cell cortex, described by the perimeter term, is crucial in determining the cell shape, allowed us to ignore the area term and obtain the distribution of $r$. We have justified this assumption in our simulations in the regime of our interest where cells are compact objects. A detailed comparison of our simplified analytical theory with simulations and experiments shows excellent agreements. Recent experiments and simulations have revealed that cell shape variability is nearly universal in confluent epithelial monolayers Atia et al., 2018; Wenzel and Voigt, 2021; Kim et al., 2020; Ilina et al., 2020; our work provides the theoretical basis for such behavior. We have shown that the universal properties are associated with a mathematical property and valid in general, even in the fluid regime; this is significant since most biological systems are in the fluid regime due to activity. Our analytical theory reveals that the microscopic system properties enter the distribution via a single parameter, α: this leads to the universal behavior for $sd$ vs $\overline{r}$ and a nearly universal distribution for the scaled aspect ratio r_{s}. Thus far, the PDF of r_{s} has been fitted with a kGamma function with $k$ being around 2.5. We show that a rough approximation of our expression leads to the kGamma function; the slight variation in $k$ comes from the fact that the PDF is not strictly but nearly universal. On the other hand, $k\simeq 2.5$ is a direct consequence of a mathematical property: the lowest degeneracy of the eigenvalues of the connectivity matrix being two for a closedlooped object, here the perimeter.
A better understanding of the connection between the theoretical parameters and different system properties is crucial to exploit the universal aspects for deeper insights. Since all of the parameters combine into α, the effects of changing physical conditions on the individual parameters are difficult to determine from the measurements of $r$ alone. ${\lambda}_{P}$ describes the cortical properties, and $T$ parameterizes different biological activities, including temperature. These are effective parameters, and their direct estimation in biological systems is impractical. Our theory provides an indirect way to estimate these parameters. The cell area in a confluent system is geometrically constrained. We have used the phenomenological implementation of the constraint of confluency and obtained the PDF for $a$ (Weaire, 1986). It is a Gamma function, described by a single parameter μ. (Wilk et al., 2014) has proposed that the PDF of $a$ is universal in various epithelial monolayers. However, we show that though the area distribution follows the same function, there is a variation in μ. Our work connects μ to the microscopic model parameters of Equation 1. In particular, μ should be independent of ${\lambda}_{P}$, whereas α varies linearly with both ${\lambda}_{P}$ and $1/T$. This distinction, assuming ${\lambda}_{A}$ remains constant, allows inferring the effects of maturation on the individual model parameters.
We have neglected cell division, growth, and apoptosis in our theory. These are nonequilibrium processes, and the assumption of effective equilibrium becomes questionable. It will be interesting to see if the nonequilibrium statistical mechanics approach of Grossman and Joanny, 2022 can be applied to obtain the distribution in the presence of these processes. However, more generally, since the rate of these processes are low, we expect the basic form of the distribution of $r$, Equation 3 to remain the same. The algebraic part of Equation 3 comes from the geometric property that remains the same. Therefore, we hypothesized that the effect of these additional processes should enter the distribution via $\alpha $. We have included these processes in our simulations in Appendix 4. The results suggest that the general form of the distribution, and the prediction that all systemspecific details enter via the single parameter α, remain valid when these processes are not dominant (Appendix 4).
Our work demonstrates that a single parameter, α, describes both the cell shape statistics and the dynamics. We have shown in our simulations that the relaxation time, $\tau $, grows as α increases or $\overline{r}$ decreases. Experiments on confluent cellular monolayer have also reported similar results (Atia et al., 2018; Park et al., 2015; Kim et al., 2020). Thus, as cells become more compact and their shape variability reduces, the monolayer becomes more sluggish. This result may have farreaching consequences. Most experiments usually measure $\overline{r}$ and analyze biological functions via $\overline{r}$ (Wyatt et al., 2015; Bosveld et al., 2016). Our theory implies that such knowledge contains a wealth of information. One can obtain α from $\overline{r}$, and all other properties, such as the distribution, the standard deviation, and the dynamics, can be analytically calculated. Pattern formation in a biological system is omnipresent during development. Pattern formation is generally controlled by gradients of different morphogens, such as the wingless or Dpp in the Drosophila wing (Jaiswal et al., 2006). Such gradients can be incorporated within our model. Since biological functions are related to cell shape, our results provide a statistical way to describe the system from the perspective of cellular functions, such as division or apoptosis. Such a description, in turn, will allow us to study the pattern formation in the presence of morphogen gradients. We are currently developing, in collaboration with colleagues, such a formalism to study the mechanochemical pattern formation in the Drosophila wing disc.
It is wellrecognized that different levels of organizations in biology are mechanistically related. One fundamental open question is how molecularlevel events are related to cellular machines that control the cell shape (Gilmour et al., 2017). It is a difficult question as varying a specific property in a biological system is nearly impossible. Any perturbation will have significant impacts on several other proteins. Our work shows that all these perturbations enter the cell shape variability via a single parameter. The striking predictability, demonstrated by our theory, where $\overline{r}$ determines the PDF, shows that the statistical distribution of cell shape is unavoidable. How do different cells respond to this inevitable distribution? Is cellular response similar across diverse systems? How is it related to organlevel morphogenesis? Having a single parameter that describes the static and dynamic aspects at the cellular level should help compare and analyze different systems and answer these questions.
We have emphasized that the jamming and the glass transitions are distinct phenomena controlled by different physics (Mari et al., 2009; Biroli and Garrahan, 2013; Berthier and Witten, 2009; Berthier et al., 2019; Atia et al., 2021). The glass transition is not associated with any thermodynamic transition; therefore, we expect our results to remain valid even in the glassy regime. Although most of our simulations are in the fluid regime, some are also in the glassy regime. Glassiness is relevant only in the study of dynamics, and no static order parameter exists to date (Berthier and Biroli, 2011). Since our focus in this work is the static properties, we have not discussed the glass transition. By contrast, the shape index, $q$, has been shown to be an order parameter of the jamming transition (Bi et al., 2015; Merkel et al., 2019). Our formalism can also provide the distribution of $q$. These results will be presented elsewhere. In addition, our expression for the distribution of area is the same as that obtained for the Voronoi tessellated volume of a jammed granular system (Aste and Di Matteo, 2008). Although the Boltzmann distribution aspect at an effective temperature is similar, the bases of the two theories are quite different. This connection provides an alternative way to think about the constraint of confluency.
In conclusion, we have developed a simple meanfield theory for the aspect ratio distribution in a confluent epithelial monolayer. We show that the universal properties of a biological system, whose physics is controlled by the energy function Equation 1, come from a mathematical property. We have analytically derived the cell shape variability, characterized via $r$. The PDF of $r$ is described by a single parameter, α. As a result, $sd$ vs $\overline{r}$ becomes universal, and the PDF for the scaled aspect ratio, r_{s}, is nearly universal. A rough approximation of our analytical form for the PDF of r_{s} leads to the kGamma distribution (Aste and Di Matteo, 2008) that has been fitted to date with the existing experimental data (Atia et al., 2018; Kim et al., 2020). The distribution is valid in general, even in the fluid regime. The nearuniversal value of $k\sim 2.5$ is the consequence of a mathematical property; the variation results from the fact that the PDFs are not strictly universal. α can also provide information on dynamics. Having a single parameter for the statistical and dynamical aspects of an epithelial monolayer should foster a detailed comparison of diverse epithelial systems, provide insights on the relation of biological functions to shapes, and elucidate the detailed cellular responses to the inevitable shape variability.
Materials and methods
Details of the analytical calculation
Request a detailed protocolTo set up the calculation, we represent the cell perimeter in a coarsegrained description via $n$ points, where l_{j} is the infinitesimal lineelement between jth and (j+1)th points (see Appendix 5). Note that our final results are independent of this discretization of the perimeter. To calculate the aspect ratio, $r$, we first obtain the gyration tensor related to the moment of inertia tensor, a $2\times 2$ tensor in spatial dimension two, in a coordinate system whose origin coincides with the center of mass (CoM) of the cell. Diagonalization of this tensor gives the two principal eigenvalues, ${s}_{1}^{2}$ and ${s}_{2}^{2}$, the squaredradii of gyrations around the respective principal axes, and $r={s}_{1}/{s}_{2}$. However, as discussed in the main text, a direct calculation of s_{1} and s_{2} is nontrivial due to their anisotropic natures. Therefore, we calculate the distribution of the radius of gyration, $s$, around the center of mass, and we have ${s}^{2}={s}_{1}^{2}+{s}_{2}^{2}$ (Davis and Denton, 2018). The distribution of ${s}^{2}$ is
where $Z$ is the partition function, $\dot{x}={\prod}_{\rho =1}^{2}{\prod}_{j=1}^{n}d{x}_{j}^{\rho}$, the volume element, ${k}_{B}$, the Boltzmann constant, and $T$, the temperature. The first $\delta $function in Equation 5 ensures that the origin coincides with the CoM of the cell; the second $\delta$function selects specific values of ${s}^{2}$, giving the distribution function.
A precise mathematical description of adhesion remains unclear (Graner and Riveline, 2017; Hilgenfeldt et al., 2008; Käfer et al., 2007). The VM represents it via a line tension: ${\sum}_{\u27e8ij\u27e9}{\mathrm{\Lambda}}_{ij}{\mathrm{\ell}}_{ij}$, where ${\mathrm{\ell}}_{ij}$ is the length between two consecutive vertices, $i$ and $j$ , and ${\mathrm{\Lambda}}_{ij}$ gives the line tension. Since the degrees of freedom in the VM are the vertices, considering ${\mathrm{\Lambda}}_{ij}$ constant, we obtain Equation 1. The constant ${\mathrm{\Lambda}}_{ij}$ implies a regular cell perimeter between vertices. However, the entire cell boundary is in contact with other cells (Figure 1a), and the perimeter is often irregular in experiments (Käfer et al., 2007; Atia et al., 2018). Therefore, we take a more general description, where the tension in a lineelement l_{j} is proportional to l_{j} with strength $P}_{0$. Thus, the adhesion part becomes ${\lambda}_{P}\stackrel{~}{K}{P}_{0}{\sum}_{j}{l}_{j}^{2}$, where $\stackrel{~}{K}$ is a constant. At the cellular level description, this microscopic difference can be accounted for by a renormalized $P}_{0$. Since it is unclear how to measure $P}_{0$ in experiments, we mainly restrict our discussions on ${\lambda}_{P}$ and $T$. Unless otherwise stated, we assume $P}_{0$ remains constant. However, the theory does capture the variation in $P}_{0$, as we show within both the VM and the CPM simulations (Figure 3h).
Within our coarsegrained description, the contractile term is ${P}_{i}^{2}={({\sum}_{j}{l}_{j})}^{2}$. Now we can evaluate the integral, Equation 5, via a Gaussian approximation. However, an exact calculation, even at this level, is complicated. For analytical tractability and to gain insights, we use the CauchySchwartz inequality (Arfken et al., 2018) and write ${({\sum}_{j}{l}_{j})}^{2}\le n{\sum}_{j}{l}_{j}^{2}=\nu {\sum}_{j}{l}_{j}^{2}$, where $\nu \sim \mathcal{O}(n)$ is a constant. Note that this inequality is exact for the CPM, where we can consider ${l}_{j}=1$ as the line element and $\nu =n$, the number of sides comprising the perimeter (see Appendix 5). In the experiments, the perimeter is usually obtained via a similar discretization (Atia et al., 2018). As shown in Appendix 5, this inequality, with the largest $n$, is a reasonable approximation for the perimeter term since the variation in perimeter is not too strong due to the perimeter constraint in Equation 1. Use of this inequality makes the evaluation of the integral slightly easier. In any case, it only affects the constant in the exponential that we treat as a fitting parameter within our theory. Since we are not investigating any transition here, the use of this relation is justified. Crucially, as we describe below, the parameter $k\simeq 2.5$ has a different origin that is not affected by this inequality. Then, the perimeter part of $\mathscr{H}$ becomes $\nu {\stackrel{~}{\lambda}}_{P}{\sum}_{j}{l}_{j}^{2}=\nu {\lambda}_{P}(1K{P}_{0}){\sum}_{j}{l}_{j}^{2}$, where $K=\stackrel{~}{K}/\nu $. The contractility and adhesion act as two competing effects (Figure 1a).
Thus, in the regime of our interest, the cell perimeter with the energy given by Equation 2 governs the distribution of $r$ that we calculate via that of $s$. In the field of polymer physics, the distribution of $s$ has been calculated (Fixman, 1962; Eichinger, 1977; Eichinger, 1980). The mathematical structure of the two problems at this stage becomes equivalent. However, note that their physics are quite different. Specifically, our assumptions will not hold in the regime of interest of the polymer physics problem. To take advantage of an established notation and provide a connection between two disparate fields, we present our calculation in the notation of Eichinger, 1977; Eichinger, 1980. As discussed earlier, we describe the cell perimeter by the vector $\mathbf{x}=\{{x}_{1}^{1},{x}_{1}^{2},{x}_{2}^{1},{x}_{2}^{2},\mathrm{\dots}{x}_{n}^{1},{x}_{n}^{2}\}$, representing a set of $n$ points on the perimeter. Then, we have ${\mathscr{H}}_{P}=\nu {\stackrel{~}{\lambda}}_{P}\mathbf{x}(\mathbf{K}\otimes {\mathbf{I}}_{2}){\mathbf{x}}^{\prime}$, where ${\mathbf{x}}^{\prime}$ is the column vector, the transpose of $\mathbf{x}$, $\mathbf{K}$, the Kirchhoff’s matrix (Eichinger, 1977; Eichinger, 1980) with ${\mathbf{K}}_{ii}=2$ and ${\mathbf{K}}_{(i1)i}={\mathbf{K}}_{i(i1)}=1$, and ${\mathbf{I}}_{2}$, the twodimensional identity tensor. Thus, we have
where $\gamma =\nu {\lambda}_{P}(1K{P}_{0})/{k}_{B}T$. The distribution of ${s}^{2}$ is
where the squared radius of gyration is ${s}^{2}={n}^{1}{\mathbf{xx}}^{\prime}$. Since the radius of gyration does not depend on the coordinate system, we are allowed to chose one that diagonalizes $\mathbf{K}$. Say the diagonal matrix is $\mathbf{\Lambda}$, and $\mathbf{q}$ represents the normal coordinates in this system.
The radius of gyration can be defined as the rootmeansquare distance of different parts of a system either from its center of mass or around a given axis. We have designated the former as $s$, defined as
where $N$ is the total volume element in the system with coordinates ${\mathbf{x}}_{i}$, and ${\mathbf{x}}_{CM}$ is the center of mass (CoM) of the system. The other two radii of gyration can be defined around the two principal axes (since we are in spatial dimension two) passing through the CoM. We calculate these two radii of gyration by writing the inertia tensor in a coordinate system whose origin coincides with the CoM and diagonalizing the tensor. The eigenvalues ${s}_{1}^{2}$ and ${s}_{2}^{2}$ are the squaredradii of gyrations around the respective principal axes. Thus, the aspect ratio, $r$, is obtained as $r={s}_{1}/{s}_{2}$, assuming ${s}_{1}\ge {s}_{2}$. As discussed in the main text, due to the anisotropic nature of s_{1} and s_{2}, a direct calculation for their distributions is more complex than that of $s$. So, we first calculate the distribution for ${s}^{2}$ and then, using this result, we obtain the distribution of $r$.
Equation 5 can be written in the normal coordinate system as.
where we have used ${q}_{n}^{\rho}\propto \sum {x}_{j}^{\rho}$ that corresponds to the zeroeigenvalue mode of the matrix. Integrating over ${q}_{n}^{\rho}$, we get rid of this zeroeigenvalue that gives translation. Thus,
where we have defined ${\mathbf{q}}_{0}$ as the $2(n1)$ dimensional vector excluding the coordinates corresponding to the zeroeigenvalue. The normalization factor, $Z$, can be calculated exactly through the integration as
Note that the integration in the calculation of $P({s}^{2})$ is around the boundary of the cell; to separate out the radial part, we now write the volume element in polar coordinate $\mathbf{u}$. Then ${\dot{q}}_{0}={n}^{(n1)}{s}^{2(n1)1}ds\dot{u}$ and ${\mathbf{q}}_{0}{\mathbf{q}}_{0}^{\prime}/n{s}^{2}={\mathbf{uu}}^{\prime}=1$. Thus, we obtain from Equation 9
with $\mathcal{A}={\left(\frac{\gamma}{\pi}\right)}^{(n1)}{\mathbf{\Lambda}}_{0}\frac{1}{2\pi}{n}^{(n1)}{s}^{2(n1)1}{I}_{n1}$ is the identity matrix of rank $n1$. Carrying out the integration over $\mathbf{u}$, we obtain
Using the value of $\mathcal{A}$, we obtain
where $\lambda}_{j$’s are the eigenvalues of $\mathbf{K}$. The integral in Equation 13 can be performed via the contour integral and the resultant solution can be written as.
where ${\lambda}_{k}$ are the distinct eigenvalues of $\mathbf{K}$ and $Res({\lambda}_{k})$ gives the residue at the pole ${\lambda}_{k}$. As we show below, the residues will have a term $\mathrm{exp}[n\gamma {s}^{2}{\lambda}_{k}]$ and in the limit ${s}^{2}\to \mathrm{\infty}$, only the smallest ${\lambda}_{k}$ will contribute.
Since the cell perimeter must be closedlooped, $\mathbf{K}$ is a tridiagonal matrix with periodicity. Therefore, the number of zeroeigenvalues must be one, and the lowest degeneracy of the nonzero eigenvalues must be two (Kulkarni et al., 1999; Witt et al., 2009; Eichinger, 1977; Eichinger, 1980). We have already integrated out the coordinate corresponding to the zeroeigenvalue. Let us designate the lowest nonzero eigenvalue as λ. The pole corresponding to λ is located at $\beta =i\gamma n{s}^{2}\lambda $, and of order 2. Thus, we obtain the residue as
Let’s first take the derivative, with respect to β, of the numerator and write part of the residue as
Next, differentiating the denominator, we obtain the other part of the residue as
A comparison of term1 and term2, given by Equation 16 and Equation 17, respectively, shows that there is an extra factor of ${s}^{2}$ in the denominator of term2. Thus, in the limit $s}^{2}\to \mathrm{\infty$ term2 can be ignored compared to term1. Therefore, we obtain the distribution function for ${s}^{2}$ as.
where $C$ is the normalization constant that we will fix later, and $\stackrel{~}{\alpha}=\gamma n\lambda $. Note that the lowest eigenvalue for the $n$dimensional Kirchoff’s matrix is proportional to $1/n$, thus $n\lambda \sim \mathcal{O}(1)$.
Now, ${s}^{2}={s}_{1}^{2}+{s}_{2}^{2}$ and the aspect ratio $r={s}_{1}/{s}_{2}$. Moreover, we have ${s}_{1}{s}_{2}=A$, where $A$ is the area of the cell. Since the distribution of cell area is sharply peaked (Figure 2c in the main text), as the cell division and apoptosis are slow processes, $A$ can be taken as a constant. Therefore, using the last two relations in the first, we obtain ${s}^{2}=A(r+1/r)$. Thus, we obtain from Equation 18, the distribution of $r$ as.
where $\mathcal{N}$ is the normalization constant: $\mathcal{N}=\mathrm{\Gamma}(5/2)/{\alpha}^{5/2}W(5/2{)}_{1}{F}_{1}(5/2,7/2,2\alpha )$, where $W(x)={2}^{x}/x$, $\mathrm{\Gamma}(x)$ is the Gamma function, and ${}_{1}F_{1}(a,b,c)$ is the Kummer’s confluent Hypergeometric function (Erdelyi et al., 1953), $\alpha \propto {\lambda}_{P}(1K{P}_{0})/T$.
Note that the exponent 3 in the algebraic part of Equation 18 comes from the mathematical property, therefore, must be true for any system. This is the source of the value $k\simeq 2.5$ when Equation 19 is approximated as a $k$Gamma distribution, as discussed in the main text. For the analysis of the theory, a symbolic software, such as ‘Mathematica’ (Wolfram Research Inc, 2019) is helpful. We note here that the relation between $sd$ and $\overline{r}$ is quite complex, though for most practical purposes it can be taken as a straight line: $sd\simeq 0.71\overline{r}0.75$.
Radius of gyration
Request a detailed protocolThere are different ways to define the radius of gyration depending on the application. Mathematically, it is the root mean square distance from the centre of mass or a given axis. We have designated the first as $s$ (Equation 7). For a twodimensional object, there are two independent axes, therefore, there will be two radii of gyration, s_{1} and s_{2}, in the second method. As detailed in the Discussion, s_{1} and s_{2} are obtained as the two eigenvalues of the gyration tensor. The significance of these two eigenvalues are as follows: the geometric properties of the system are equivalent to those of an ellipse with s_{1} and s_{2} as the major and minor axes. Considering that the mass is uniformly distributed throughout the system, the area is proportional to ${s}_{1}{s}_{2}$ or ${s}^{2}$. We have used this property to obtain the distribution of $r$ in the analytical derivation of Equation 19.
Distribution for area
Request a detailed protocolEquation 18 gives the distribution of the squared radius of gyration of different cells in a monolayer. Using this equation, we now derive the distribution of area, $A$. Note that Equation 18 is valid irrespective of whether the system is confluent or not. We have argued in the main text that the constraint of confluency is not crucial to obtain the distribution of the aspect ratio. But, this argument is not valid when we are interested in the distribution of cellular area, $A$. Since confluency is a strong geometric constraint on $A$, we must include this constraint in the derivation of $P(A)$. Now, as detailed in ‘Radius of gyration’, ${s}^{2}\propto A$. Therefore, Equation 18 gives.
where β is a constant related to $\stackrel{~}{\alpha}$. We now consider the constraint of confluency. It is actually a longstanding hard mathematical problem. Even for random patterns, no exact result exists yet. (Weaire, 1986) proposed a phenomenological implementation of this constraint as a polynomial function of area; this remains one of the simplest possible ways to date to deal with this constraint (Gezer et al., 2021). Keeping only one term of this polynomial for simplicity, we can write this constraint as $f(A)\sim {A}^{\nu}$ (Weaire, 1986; Gezer et al., 2021). Using this in Equation 20, we obtain
where we have defined $\mu =\nu +5/2$. From Equation 21, we obtain the average area, $\overline{A}=\mu /\beta $. Thus, we obtain the normalized distribution for the scaled area, $a=A/\overline{A}$, as
This is the wellknown kGamma function, defined in Aste and Di Matteo, 2008, and usually denoted with the variable $k$. This same function has been used in fitting the scaled aspect ratio, r_{s}, data in different existing experiments and simulations. Therefore, to avoid confusion with $k$, which is obtained fitting the r_{s} data, we have used μ to define the distribution function for $a$. Since μ comes from the constraint of confluency, it should be independent of ${\lambda}_{P}$. Our simulation results within both the CPM and the VM (Figure 3e, in the main text) support this hypothesis.
Appendix 1
Boltzmann Distribution
As we discussed in the main text, there are two types of models of confluent systems, depending on the presence or absence of selfpropulsion. In the absence of selfpropulsion, the models are in equilibrium. It has been shown in the literature, that the equilibrium variants of the models do capture the static and dynamical properties of a confluent epithelial system (Farhadifar et al., 2007; Atia et al., 2018; Park et al., 2015). For concreteness, we have considered the equilibrium variants of the models in this work. Therefore, we expect that the Boltzmann distribution should hold, that is, a configuration with energy $E$, given by Equation 1, should have the weightage of $\sim \mathrm{exp}[E/{k}_{B}T]$. However, as discussed below in Appendix 2, we restrict ourselves in the low$P}_{0$ regime, where the cells, on average, cannot satisfy the perimeter constraint. This implies a nontrivial dependence of the distribution around $E\to 0$ coming from the constraint of confluency that in turn restricts the minimum value of the perimeter; the energy cannot go below a certain value. However, we show in Appendix 1—figure 1 that we can still approximate the distribution via a Boltzmann distribution.
If the system follows the Boltzmann distribution, that is, $P(E)\sim \mathrm{exp}(E/T)$, plots of $\mathrm{log}[P(E)]$ as a function of $E/T$ for different $T$ should follow a master curve that is a linearly decreasing function. As shown in Appendix 1—figure 1, except at very close to $E\to 0$, $P(E)$ as a function of $E/T$ at different $T$ indeed follows a master curve, implying the existence of the Boltzmann distribution.
Appendix 2
Simulation details
We have verified our theory via simulations of two distinct models: the CPM and the VM. In both the simulations, we have used the energy function $\mathscr{H}$, given in Equation 1 in the main text. The CPM is a latticebased model. The underlying lattice has some effects on the quantitative aspects of the model; for example, on the square lattice, the object with the minimum possible perimeter for a given area is a square. However, the qualitative behaviors are independent of the lattice. To ascertain that the simulation results are independent of the lattice, we have simulated the CPM on two different lattices: the square lattice and the hexagonal lattice.
For the glassy dynamics, the geometric restriction leads to two different regimes, the low$P}_{0$, and the large$P}_{0$ regime. However, the large$P}_{0$ regime characterizes quite large adhesion compared to the cortical contractility; this regime, we feel, is not relevant for the experiments. Therefore, we present most results when $P}_{0$ is not too large. Here, we briefly discuss the different models.
Cellular Potts Model (CPM)
In CPM, dynamics proceeds by stochastically updating one boundary lattice site at a time. In a MonteCarlo simulation (MC), we accept a move with a probability, $P(\sigma \to {\sigma}^{\prime})={e}^{\frac{\mathrm{\Delta}\mathscr{H}}{T}}$, where we have set Boltzmann constant ${k}_{B}$ to unity, $\mathrm{\Delta}\mathscr{H}$ is the change in energy going from one configuration to the other. $\sigma $ and ${\sigma}^{\prime}$ are the cell indices of the current cell and the target cell indices, respectively. Unit of time (1 MC time) refers to $M$ such elementary moves, where $M$ is the total number of lattice sites in the simulation.
With square lattice
We have implemented CPM with square lattice in Fortran 90 and followed a Connectivity Algorithm developed in Durand, 2016 to prevent fragmentation of cells (Sadhukhan and Nandi, 2021). We have chosen a simulation box of size $120\times 120$ with 360 total cells in the system. The average area of cells in the system is 40, and the minimum possible perimeter on a square lattice with this area is 26. Unless otherwise specified, we always start with an initial configuration where each cell is rectangular with a size $5\times 8$. We first equilibrate the system for $8\times {10}^{5}$ MC time before collecting the data of aspect ratio. The variables that characterize the system are ${\lambda}_{A}$, ${\lambda}_{P}$, $P}_{0$, and $T$. We have ignored cell division and apoptosis, as discussed in the main text. However, to test the effects of these processes, we have included them within this model and present a specific set of results in Appendix 4.
To obtain the relaxation time, $\tau $, we have calculated the overlap function, $Q(t)$, defined as
where ${X}_{cm}^{\sigma}(t)$ is the center of mass at time $t$ of a cell with index $\sigma $, $W(x)$ is a Heaviside step function
${\u27e8\mathrm{\dots}\u27e9}_{{t}_{0}}$ denotes averaging over initial times and the overline implies ensemble average. Relaxation time is defined as $Q(t=\tau )=0.3$. The parameter $a$ is related to the vibrational motion of the particles (here cells) inside the cage formed by its neighbors. We have set $a=1.12$ (Sadhukhan and Nandi, 2021). We have taken 50 t_{0} averaging and 20 configurations for ensemble averaging.
With Hexagonal lattice
To simulate CPM on the hexagonal lattice, we have used an opensource application CompuCell3D (Swat, 2012; Zajac et al., 2003). CPM is the core of CompuCell3D. We have simulated the 2D confluent cell monolayer with periodic boundary conditions using the same energy function, Equation 1, in the main text. In this simulation, we have taken $128\times 128$ simulation box size with 361 cells. Initially, Cells are of three different types of areas and perimeters. We have started with a rectangular slab of the cell monolayer. We first equilibrate the system for 10^{5} MC time steps before collecting the data. Fragmentation is allowed in these simulations; however, we have restricted ourselves in the low $T$ regime, where fewer cells are fragmented. While calculating the PDFs, we have eliminated the fragmented cells.
Vertex Model (VM)
Vertex models (VM) can be viewed as the continuum version of the CPM. Within VM, vertices of the polygons are the degrees of freedom, and cell edges are defined as straight lines (or lines with a constant curvature) connecting between vertices (Farhadifar et al., 2007; Fletcher et al., 2014). Within the MonteCarlo simulatoin (Wolff et al., 2019), dynamics proceeds by stochastically updating each vertex position of all the cells by a small amount $\delta r$. We accept the move with a probability $P(\mathcal{C}\to {\mathcal{C}}^{\mathrm{\prime}})={e}^{\frac{\mathrm{\Delta}\mathcal{H}}{T}}$ where we have set Boltzmann constant ${k}_{B}$ to unity, $\mathrm{\Delta}\mathscr{H}$ is the change in energy going from $\mathcal{C}\to {\mathcal{C}}^{\mathrm{\prime}}$. Initially, we start will 1024 cells having equal area $(=1.0)$ and perimeter $(=3.72)$. We have equilibrated the system for $2\times {10}^{6}$ MC time before the start of collecting data.
Two different regimes
As we discussed in the main text, theoretical studies have shown the existence of two distinct regimes as a function of $P}_{0$ within these models of confluent epithelial cells (Sadhukhan and Nandi, 2021; Bi et al., 2015). The existence of these two regimes is controlled by a geometric constraint. Given a certain average cell area, the perimeter can have a minimum value. If $P}_{0$ in Equation 1 is less than this value, cells cannot achieve this target perimeter: the cells in this regime are compact. It seems that this is the regime of experimental interest. In contrast, when $P}_{0$ is large compared to this minimum value, cells can achieve this value to minimize energy $\mathscr{H}$. Within the CPM, the cells look like fractal objects in this regime (Sadhukhan and Nandi, 2021). This behavior is expected to be similar within both the VM and the Voronoi model. But there are technical difficulties to simulate these latter models in this regime due to numerical instabilities.
Appendix 3
Procedure to calculate aspect ratio ($r$) in our simulation
As stated in the main text, we characterize the cell shape via aspect ratio, $r$. To calculate $r$, we follow (Atia et al., 2018) and briefly discuss the method here. Consider one particular cell, represented by the set of points $\{{x}_{i}^{1},{x}_{i}^{2}\}$, as schematically shown in Appendix 3—figure 1. We first calculate the gyration tensor, $\mathbf{I}$, (proportional to the moment of inertia tensor) in a frame of reference whose origin coincides with the center of mass $({x}_{c}^{1},{x}_{c}^{2})$ of the cell. $\mathbf{I}$ is a $2\times 2$ tensor in spatial dimension two. We next diagonalize $\mathbf{I}$ and obtain the two eigenvalues, ${s}_{1}^{2}$ and ${s}_{2}^{2}$. Without loss of generality, we assume that ${s}_{1}\ge {s}_{2}$, and obtain $r={s}_{1}/{s}_{2}$. This process can be viewed as approximating the cell with an ellipse with the major and minor axes given by s_{1} and s_{2}, respectively. For the CPM on the square lattice and the VM, we have written our own codes; for the CompuCell3D (Swat, 2012; Zajac et al., 2003), we have used <Plugin Name="MomentOfInertia"/>in the.XML file and using the example of Demos/MomentOfInertia, we have calculated the lengths of semiaxes in the Steppables.py file. We have obtained the aspect ratio of a cell by taking the ratio of lengths of semimajor and semiminor axes.
Appendix 4
Including cell division and apoptosis
We have ignored cell division and apoptosis in our theory since the rates of these processes are extremely low. However, even when these processes are present, as long as their rates are not very high, we expect the form of the distribution, Equation 3 in the main text, should remain valid. There are two distinct parts in $P(r)$. The algebraic part is determined by the geometric property, which is a cell perimeter is a closedloop object. This property cannot change. Therefore, the effects of these processes can enter the distribution via $\alpha $ alone. Thus, the main conclusions of the theory should remain valid even when division and apoptosis are present.
We have included cell division and apoptosis in our simulations to test this hypothesis. We keep the rate of division, ${k}_{d}^{1}$, and apoptosis, ${k}_{a}^{1}$, the same such that the average number of cells remains the same. Every k_{d} time step, we randomly select a cell and divide it into two, with a randomly chosen division plane. To decide the division plane, we first calculate the center of mass (CoM) and then chose a point on the perimeter in a random direction. The line connecting the CoM and this point gives the division plane (a line for the twodimensional system). The area of these two daughter cells then grows till it becomes of the order of A_{0} (Equation 1 in the main text). To avoid dividing a cell that has just undergone division, we impose a cutoff on the area ($\ge 32$) for selecting a cell for division. For apoptosis, every k_{a} time step, we randomly choose a cell and assign it a target area ${A}_{0}=0$. We also relax the constraint that does not allow fragmentation in our simulation for the cells undergoing apoptosis.
We show the PDF of $r$ at three representative values of k_{a} above. As shown in Appendix 4—figure 1a PDFs of r at different values of ${k}_{d}={k}_{a}$ can be fitted with Equation 3, the values of α for the best fits are quoted in the figure. Crucially, an important prediction of the theory that the PDF for r_{s} becomes nearly universal remains also valid, as shown in Appendix 4—figure 1b. We find that α as a function of k_{d} or k_{a} quickly reaches a constant with increasing k_{a} (that is, with decreasing rate) (Appendix 4—figure 1c).
Appendix 5
Validity of the approximation of perimeter
We have used the CauchySchwartz inequality, ${({\sum}_{j}{l}_{j})}^{2}\le n{\sum}_{j}{l}_{j}^{2}=\nu {\sum}_{j}{l}_{j}^{2}$, where $\nu \sim \mathcal{O}(n)$ is a constant. Within the CPM, the perimeter consists of $n$ elementary sides, as schematically shown in Appendix 5—figure 1a above, then ${P}_{i}^{2}={[{\sum}_{j=1}^{n}{\mathrm{\ell}}_{j}]}^{2}$. Since ${\mathrm{\ell}}_{j}=1$, ${P}_{i}^{2}={[{\sum}_{j=1}^{n}{\mathrm{\ell}}_{j}]}^{2}=n{\sum}_{j=1}^{n}{\mathrm{\ell}}_{j}^{2}$. The inequality is exact in that case. The data from the simulation confirms this (Appendix 5—figure 1). In the case of the continuum models, such as the vertex model, we can use a similar discretization as schematically shown by the red line in the inset of Appendix 5—figure 1. In fact, the estimate of cell perimeter in the experiments is obtained via a similar discretization method. With such a discretization representing the continuous cell perimeter via a discrete line, we again find that the inequality is very close to equality (Appendix 5—figure 1b).
The approximation of replacing the inequality by equality now depends on the fluctuation of the total perimeter or $n$. This fluctuation depends on the values of the model parameters, such as $T$, ${\lambda}_{P}$, $P}_{0$, etc. We find that for the parameter values used in this work to simulate the systems, the highest fluctuation of the perimeter, or in the value of $n$, is less than 5%. Thus, we are justified to approximate the inequality by equality.
Data availability
We have uploaded the source files of the simulation data and the Mathematica analysis files in Dryad.

Dryad Digital RepositoryData from: On the origin of universal cell shape variability in a confluent epithelial monolayer.https://doi.org/10.5061/dryad.xsj3tx9h0
References

Modeling cell shape and dynamics on micropatternsCell Adhesion & Migration 10:516–528.https://doi.org/10.1080/19336918.2016.1148864

Emergence of gamma distributions in granular materials and packing modelsPhysical Review. E, Statistical, Nonlinear, and Soft Matter Physics 77:021309.https://doi.org/10.1103/PhysRevE.77.021309

Geometric constraints during epithelial jammingNature Physics 14:613–620.https://doi.org/10.1038/s4156701800899

Are cell jamming and unjamming essential in tissue development?Cells & Development 168:203727.https://doi.org/10.1016/j.cdev.2021.203727

Glass transition of dense fluids of hard and compressible spheresPhysical Review. E, Statistical, Nonlinear, and Soft Matter Physics 80:021502.https://doi.org/10.1103/PhysRevE.80.021502

Theoretical perspective on the glass transition and amorphous materialsReviews of Modern Physics 83:587–645.https://doi.org/10.1103/RevModPhys.83.587

Glassy dynamics in dense systems of active particlesJ Chem Phys 150:200901.https://doi.org/10.1063/1.5093240

Energy barriers and cell migration in densely packed tissuesSoft Matter 10:1885–1890.https://doi.org/10.1039/c3sm52893f

A densityindependent rigidity transition in biological tissuesNature Physics 11:1074–1079.https://doi.org/10.1038/nphys3471

Motilitydriven glass and jamming transitions in biological tissuesPhysical Review. X 6:021011.https://doi.org/10.1103/PhysRevX.6.021011

An efficient cellular potts model algorithm that forbids cell fragmentationComputer Physics Communications 208:54–63.https://doi.org/10.1016/j.cpc.2016.07.030

An approach to distribution functions for Gaussian moleculesMacromolecules 10:671–675.https://doi.org/10.1021/ma60057a035

Configuration statistics of Gaussian moleculesMacromolecules 13:1–11.https://doi.org/10.1021/ma60073a001

Vertex models of epithelial morphogenesisBiophysical Journal 106:2291–2304.https://doi.org/10.1016/j.bpj.2013.11.4498

Statistical properties of poissonvoronoi tessellation cells in bounded regionsJournal of Statistical Computation and Simulation 91:915–933.https://doi.org/10.1080/00949655.2020.1836184

Simulation of the differential adhesion driven rearrangement of biological cellsPhysical Review. E, Statistical Physics, Plasmas, Fluids, and Related Interdisciplinary Topics 47:2128–2154.https://doi.org/10.1103/physreve.47.2128

Simulation of biological cell sorting using a twodimensional extended Potts modelPhysical Review Letters 69:2013–2016.https://doi.org/10.1103/PhysRevLett.69.2013

Instabilities and geometry of growing tissuesPhysical Review Letters 129:048102.https://doi.org/10.1103/PhysRevLett.129.048102

Cellular potts modeling of complex multicellular behaviors in tissue morphogenesisDevelopment, Growth and Differentiation 59:12358.https://doi.org/10.1111/dgd.12358

Evolving mechanisms of morphogenesis: on the interplay between differential adhesion and cell differentiationJournal of Theoretical Biology 203:317–333.https://doi.org/10.1006/jtbi.2000.1087

Description of cellular patterns by Dirichlet domains: the twodimensional caseJournal of Theoretical Biology 72:523–543.https://doi.org/10.1016/00225193(78)903156

How much does the cell boundary contract in a monolayered cell sheet?Journal of Theoretical Biology 84:575–588.https://doi.org/10.1016/s00225193(80)80021x

Unified study of glass and jamming rheology in soft particle systemsPhysical Review Letters 109:018301.https://doi.org/10.1103/PhysRevLett.109.018301

Unjamming and collective migration in MCF10A breast cancer cell linesBiochemical and Biophysical Research Communications 521:706–715.https://doi.org/10.1016/j.bbrc.2019.10.188

Fluidization, resolidification, and reorientation of the endothelial cell in response to slow tidal stretchesAmerican Journal of Physiology. Cell Physiology 303:C368–C375.https://doi.org/10.1152/ajpcell.00074.2012

Eigenvalues of tridiagonal pseudotoeplitz matricesLinear Algebra and Its Applications 297:63–80.https://doi.org/10.1016/S00243795(99)001147

Role of cell deformability in the twodimensional melting of biological tissuesPhysical Review Materials 2:045602.https://doi.org/10.1103/PhysRevMaterials.2.045602

E2 and gamma distributions in polygonal networksPhysical Review Research 3:L042001.https://doi.org/10.1103/physrevresearch.3.l042001

Dynamic migration modes of collective cellsBiophysical Journal 115:1826–1835.https://doi.org/10.1016/j.bpj.2018.09.010

Universal statistical laws for the velocities of collective migrating cellsAdvanced Biosystems 4:e2000065.https://doi.org/10.1002/adbi.202000065

Endocytic reawakening of motility in jammed epitheliaNature Materials 16:587–596.https://doi.org/10.1038/nmat4848

Jamming versus glass transitionsPhysical Review Letters 103:025701.https://doi.org/10.1103/PhysRevLett.103.025701

Effective temperature of active fluids and sheared soft glassy materialsThe European Physical Journal. E, Soft Matter 41:117.https://doi.org/10.1140/epje/i2018117317

Biology and physics of cell shape changes in developmentCurrent Biology 19:R790–R799.https://doi.org/10.1016/j.cub.2009.07.029

Unjamming and cell shape in the asthmatic airway epitheliumNature Materials 14:1040–1048.https://doi.org/10.1038/nmat4357

Collective migration and cell jamming in asthma, cancer and developmentJournal of Cell Science 129:3375–3383.https://doi.org/10.1242/jcs.187922

Cellsize distribution in epithelial tissue formation and homeostasisJournal of the Royal Society Interface 14:20170032.https://doi.org/10.1098/rsif.2017.0032

BookGlazierIn: Swat MH, editors. Computational Methods in Cell Biology. Academic Press. pp. 325–366.

Airway mechanical compression: its role in asthma pathogenesis and progressionEuropean Respiratory Review 29:190123.https://doi.org/10.1183/16000617.01232019

On the distribution of cell areas in a voronoi networkLetters Section 53:L101–L105.https://doi.org/10.1080/13642818608240647

Universal area distributions in the monolayers of confluent mammalian cellsPhysical Review Letters 112:138104.https://doi.org/10.1103/PhysRevLett.112.138104

SoftwareMathematica, version version 12.0Wolfram research inc.

Cell and tissue morphology determine actindependent nuclear migration mechanisms in neuroepitheliaThe Journal of Cell Biology 218:3272–3289.https://doi.org/10.1083/jcb.201901077

Simulating convergent extension by way of anisotropic differential adhesionJournal of Theoretical Biology 222:247–259.https://doi.org/10.1016/s00225193(03)00033x
Decision letter

Karsten KruseReviewing Editor; University of Geneva, Switzerland

Aleksandra M WalczakSenior Editor; CNRS LPENS, France

JeanFrançois JoannyReviewer; Institut Curie, CNRS UMR168, France
Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.
Decision letter after peer review:
Thank you for submitting your article "On the origin of universal cell shape variability in a confluent epithelial monolayer" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by a Reviewing Editor and Aleksandra Walczak as the Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: JeanFrançois Joanny (Reviewer #2).
The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.
Essential revisions:
1) The authors consider a regime where the area is equal to the target or preferred area of the cells and the area part of the free energy can be neglected. Quantitatively check the validity of this approximation.
2) Justify, why effectively equilibrium statistics can be used in the analysis.
3) In Appendix A you replace the square of the perimeter by the sum of the squares of the lengths of the edges of the cell. Check the accuracy of this approximation quantitatively.
4) Make the text more accessible to biologists.
Reviewer #1 (Recommendations for the authors):
I do not have any recommendations for further analysis or computation. However, the way the manuscript is written is rather poor. The authors often use jargon and, at places, the text is repetitive and confusing. Below I give some examples that might help the authors to make their work more accessible to the broad readership of eLife.
The authors very often switch between AR and r for denoting the aspect ratio. The figures sometimes, but not always contain both notions. I would strongly suggest introducing one symbol only and using it throughout the manuscript.
2nd paragraph is very long and partly repetitive.
p.2
"In a jammed granular system, the statistics of tessellated volume, x, follows a kΓ distribution," This sentence is not very biologist friendly.
"Almost universal" Other expressions used are "nearly universal", "virtually universal". Neither of them is welldefined and it is not clear if they all refer to the same feature. If yes, use only one notion. Also, try to make it scientific and not just intuitive.
"The existing theoretical works on cellular shapes are based on the elasticity framework." Again, hard to understand for a biologist.
p.3
On this page there a several jumps in the presentation. Try to streamline.
"One crucial aspect of these systems" Which ones?
"the cortex, a thin layer of cytoplasm" Better: the cortex is a thin actomyosin layer
"we mostly focus our discussions below within the CPM " but then "cells are compact objects (in contrast to being fractallike in the CPM simulation [30])" Can you clarify the compact vs fractal nature of the cells in the CPM you use?
"we first need to obtain the two radii of gyrations, s1 and s2, around the two principal axes." You should probably explain 'radius of gyration' for biologists as well as their relation to cell area.
"as detailed in the Appendix B, Equation (??), presented in the Appendix," Number is missing.
"Since μ is related to the constraint of confluency" Here Ii was confused, because you consider an isolated cell, right?
p.4
"Note that the power 3/2 of the algebraic term in Equation (3) comes from the mathematical property of closedlooped objects." Again, this requires more explanation for biologists
Figure 1d you could maybe show a loglog plot.
p.5
"Figure 3g" It seems to be premature to refer to this figure here.
"Finally, we show" Where?
"although they follow the same distribution, they need not be identical" What are 'they'? Also, I find it hard to understand this sentence.
"we have simulated the confluent systems with varying λ_A and find that the AR distribution remains almost independent of λ_A" Can you show the distribution?
"as T changes, the PDF of a, though welldescribed by a single parameter μ via Equation (4), can still be different" Yo mean the values of \mu can be different?
Figure 4ce Why do you not show all curves for all cell types?
Figure 4d: Hexagons? Maybe stars?
p.10
last paragraph of the main text is just a repetition of the previous paragraphs
Reviewer #2 (Recommendations for the authors):
– The authors consider a regime where the area is equal to the target or preferred area of the cells and the area part of the free energy can be neglected. Can the validity of this approximation be made more quantitative? A naive guess would be that the modulus \λ_a is very large but in the end the authors calculate fluctuations of the area due to shape fluctuations and they would cost a large energy which is not taken into account. In the vertex model, as explained by the authors there is a transition between a soft solid and a hard solid behavior and the transition only depends on the ratio between the perimeter and the square root of the area. Which regime do the authors consider?
– The authors define the tissue by an energy which is used in all the models that they quote. In my mind this energy is an efficient way to write that the forces in the epithelial layer are balanced. However, they then use a Boltzmann Boltzmann statistics with a well defined temperature for the configuration of the tissue. This temperature seems to play an important role as it enters the expression for the parameter \α. Is this equilibrium statistics also used in the simulations? In my mind, a tissue is an intrinsically nonequilibrium system and I do not see how the equilibrium distribution can be justified. The authors also claim that the results are not changed if cell division is considered, which is difficult for me to understand. The manuscript would be much stronger if the authors were discussing this effective equilibrium assumption in details.
– Another approximation is made in appendix A which replaces the square of the perimeter by the sum of the squares of the lengths of the edges of the cell. I understand that this strongly simplifies the calculations. Could the authors make more quantitative this approximation which is only justified by an inequality. How accurate is it?
– On more general grounds, the results obtained by the authors are interesting but do they have any biological relevance? Could the authors point out a biological phenomenon for which cell shape fluctuations would play an important role?
https://doi.org/10.7554/eLife.76406.sa1Author response
Essential revisions:
Reviewer #1 (Recommendations for the authors):
I do not have any recommendations for further analysis or computation. However, the way the manuscript is written is rather poor. The authors often use jargon and, at places, the text is repetitive and confusing. Below I give some examples that might help the authors to make their work more accessible to the broad readership of eLife.
We sincerely thank the reviewer for the insightful comments, appreciation of the importance of the work, and suggestions to make the manuscript accessible to a broader audience. We have revised the manuscript accordingly and removed the repetitions. We hope the reviewer and the editor finds the revised manuscript accessible to a broad audience. Here we briefly respond to the comments of the reviewer.
The authors very often switch between AR and r for denoting the aspect ratio. The figures sometimes, but not always contain both notions. I would strongly suggest introducing one symbol only and using it throughout the manuscript.
Thanks, we have now used the notation r to denote the aspect ratio.
2nd paragraph is very long and partly repetitive.
We have now shortened this paragraph and removed the repetitions.
p.2
"In a jammed granular system, the statistics of tessellated volume, x, follows a kΓ distribution," This sentence is not very biologist friendly.
We have expanded this sentence, explaining what are these volumes: “In a jammed system, one can carry out …. by a single parameter k [41].”
"Almost universal" Other expressions used are "nearly universal", "virtually universal". Neither of them is welldefined and it is not clear if they all refer to the same feature. If yes, use only one notion. Also, try to make it scientific and not just intuitive.
We thank the reviewer for this suggestion. They do refer to the same feature and we now use the term “nearly universal” alone. We have also explained what it means (Page 5, left column): “the αdependence becomes so weak … independent of α.”
"The existing theoretical works on cellular shapes are based on the elasticity framework." Again, hard to understand for a biologist.
We have now rephrased this sentence explaining the elasticity framework: “The existing theoretical works … when external force is relaxed.” (Page 2, first column).
p.3
On this page there a several jumps in the presentation. Try to streamline.
Thanks for pointing this out. We have now made it more contiguous by providing the details, either in the Appendix or within the main text itself. We have also streamlined the presentation by providing more details in the Materials and methods Section and including additional details in the Appendix.
"One crucial aspect of these systems" Which ones?
We have explicitly mentioned the models now: “One crucial aspect of these model systems (such as the VM, the CPM, or the Voronoi model) of epithelial monolayer is the constraint of confluency”.
"the cortex, a thin layer of cytoplasm" Better: the cortex is a thin actomyosin layer
We have rephrased it as “First, a thin actomyosin layer, known as cortex, mainly governs the cellular mechanical properties”.
"we mostly focus our discussions below within the CPM " but then "cells are compact objects (in contrast to being fractallike in the CPM simulation [30])" Can you clarify the compact vs fractal nature of the cells in the CPM you use?
There are two regimes of these models, the system behaves like soft and hard solids in these two regimes. We mainly focussed on the CPM, because from the theoretical perspective, this model has a lot of advantage and easier to understand many of the complex behaviours. Within the CPM, the cells look like fractal objects and compact objects, respectively in these two regimes. We have added a subsection in Appendix C, titled “Two different regimes”, explaining this behaviour (page 15, last subsection of Appendix 2).
"we first need to obtain the two radii of gyrations, s1 and s2, around the two principal axes." You should probably explain 'radius of gyration' for biologists as well as their relation to cell area.
We have added a section, “Radius of gyration”, in the Materials and methods Section with the various definitions of radii of gyration and their relation with the area (Page 13, Section IV B).
"as detailed in the Appendix B, Equation (??), presented in the Appendix," Number is missing.
Thanks for pointing this out, we have fixed it in the revised manuscript.
"Since μ is related to the constraint of confluency" Here Ii was confused, because you consider an isolated cell, right?
We apologise for this confusion. Actually, there are two different calculations presented in this work: the distribution of aspect ratio and the distribution of the cell area. The first one is the main result of the work, but the second result is also important as it connects the derivation to several other problems.
Our interest is the cellular aspect ratio in a confluent monolayer. But the constraint of confluency is an extremely hard mathematical problem and no exact results exist till date to treat it analytically. Motivated by our simulation results, we have argued that the constraint of confluency is not crucial for the aspect ratio distribution. We have also verified this in our simulations. This slightly simplifies the derivation of P(r).
But, for the distribution of cellular area, the constraint of confluency is crucial. Equation
(18) is valid for a general system. We then included this constraint in our analytical derivation following Weaire et al. (Ref. 67) and obtained the distribution of cell area.
We have clarified this in the text in section Materials and methods, IV C.
p.4
"Note that the power 3/2 of the algebraic term in Equation (3) comes from the mathematical property of closedlooped objects." Again, this requires more explanation for biologists.
We have detailed this point in the Appendix A. In the main text (page 4, second paragraph), we have now stated the property that leads to this value and has referred the reader to the appendix: “That is, for closedloop objects, …. As detailed in Sec. IV A”.
Figure 1d you could maybe show a loglog plot.
We have now shown the plot in loglog scale in the inset of Figure 1(d). p.5
p.5
"Figure 3g" It seems to be premature to refer to this figure here.
We have removed the reference to Figure 3(g) here.
"Finally, we show" Where?
Thanks, we have now added the reference [Figure 1(f)].
"although they follow the same distribution, they need not be identical" What are 'they'? Also, I find it hard to understand this sentence.
We have now clarified this sentence: “The PDFs of a has been argued… they are not identical”.
"we have simulated the confluent systems with varying λ_A and find that the AR distribution remains almost independent of λ_A" Can you show the distribution?
We have now included these distributions in the supplementary figure, Figure 2 —figure supplement 2 and added in the main text: “(the distributions are shown in Figure 2 —figure supplement 2).”
"as T changes, the PDF of a, though welldescribed by a single parameter μ via Equation (4), can still be different" Yo mean the values of \mu can be different?
Yes. We have now rephrased this sentence for better clarity: “… by the same function, Equation (4), …. need not be identical”.
Figure 4ce Why do you not show all curves for all cell types?
We are a bit confused with this question. We do have the data corresponding to all the cell types mentioned in the figure.
On the other hand, if the reviewer means plotting the lines for all the cell types: we represented most data with symbols as plotting the lines for all the cell types makes the figure extremely hard to understand.
Figure 4d: Hexagons? Maybe stars?
Thanks for pointing this out. Actually these symbols are called Hexagrams that we incorrectly wrote Hexagons, we have corrected this in the revised manuscript.
p.10
last paragraph of the main text is just a repetition of the previous paragraphs
We have now removed the repetitions. The last paragraph now summarises the achievements of the work.
Reviewer #2 (Recommendations for the authors):
– The authors consider a regime where the area is equal to the target or preferred area of the cells and the area part of the free energy can be neglected. Can the validity of this approximation be made more quantitative? A naive guess would be that the modulus \λ_a is very large but in the end the authors calculate fluctuations of the area due to shape fluctuations and they would cost a large energy which is not taken into account. In the vertex model, as explained by the authors there is a transition between a soft solid and a hard solid behavior and the transition only depends on the ratio between the perimeter and the square root of the area. Which regime do the authors consider?
We thank the reviewer, Prof. Joanny, for the insightful comments, questions, and suggestions. We have included all the suggestions in the revised manuscript. Here, we briefly respond to the comments and questions.
There are two distinct analytical results in the paper: the distributions of aspect ratio and cell area. We have argued that the area constraint (or the constraint of confluency) is not crucial for the aspect ratio distribution, but it is extremely essential for the distribution of the area. As detailed below, we have made this argument more quantitative with additional results in the revise manuscript.
As we discussed in the main text, this approximation is motivated by our simulation results:
– First, as shown in Figure 2(c), the area distribution is sharply peaked around the average value. As the reviewer correctly points out, the sharpness of this distribution grows as λ_{A} increases.
– Second, we find that even at moderate values of λ_{A} (=0.1 — 0.8), where the area distribution is not a δfunction, the value of α, which determines the aspect ratio distribution, does not change (Figures 2b). We now show these distributions as well in the new supplementary Figure 2 —figure supplement 2.
– These results are consistent with the expectation that aspect ratio, being a dimensionless parameter, should be independent of the actual value of cellular area. Our simulation results, presented in Figure 2(b), and the additional results presented in the supplementary figures, Figure 2 —figure supplement 1 and Figure 2 —figure supplement 2, show that the area fluctuation does not significantly affect the shape fluctuation. Therefore, we are justified to neglect the area term in the energy, Equation (1), to obtain the aspect ratio distribution.
– On the other hand, as you also pointed out, for the other result, that is the area fluctuation due to shape fluctuation, there is indeed a large cost. We have taken this into account in an indirect manner (the discussion between Eqs. (20) and (21) in Sec. IV C), considering the very strong geometric constraint, that is the constraint of confluency. As you know, this is a hard mathematical problem and no exact results exist. We have used the results of Weaire et al. [Ref. 67] in the form of a constraint. This is a phenomenological approach, our analysis for the area distribution can be viewed as a support of this approach.
– We fixed the target perimeter, P_{0}, in our simulations, except for the data in Figure 2(h), such that the system is in lowP_{0} regime. We believe that the lowP_{0} (hardsolidlike) regime is more relevant experimentally. But, as shown in Figure 2(h), the theory seems to capture the variation with P_{0} as well. We have now highlighted these points in the revised manuscript, and have specifically added a subsection (Two different regimes) in Appendix 2 discussing this.
– The authors define the tissue by an energy which is used in all the models that they quote. In my mind this energy is an efficient way to write that the forces in the epithelial layer are balanced. However, they then use a Boltzmann Boltzmann statistics with a well defined temperature for the configuration of the tissue. This temperature seems to play an important role as it enters the expression for the parameter \α. Is this equilibrium statistics also used in the simulations? In my mind, a tissue is an intrinsically nonequilibrium system and I do not see how the equilibrium distribution can be justified. The authors also claim that the results are not changed if cell division is considered, which is difficult for me to understand. The manuscript would be much stronger if the authors were discussing this effective equilibrium assumption in details.
The confluent models used in this work are wellestablished models for epithelial monolayers. These models have two distinct variants: depending on the presence or absence of activity. When there is no activity, the models are in equilibrium. The nonequilibrium models can have several interesting complex behaviours, as shown by you in the recent PRL paper (Ref. 73). But, the equilibrium models are also significant and capture the key static and dynamic properties of such systems, as shown by several groups including Jülicher, Fredberg, etc (Refs. 31, 52, 56, 65, etc.). How far can these equilibrium models represent a cellular system is an important question that we did not address in this work. This is also an important question in general. The role of activity is also significant, but for a cleaner understanding, we have not considered the effects of activity in the current paper. One crucial question we are currently exploring is whether an effective equilibrium behaviour is possible even under activity, similar to what we found in some other contexts of glassy dynamics. In fact, a single particle effective temperature description seems to capture many aspects in a dense glassy system [Nandi and Gov, Soft Matter (2017)]. We intend to explore this question in the future.
In the current work, in the absence of cell division and apoptosis, we are justified to use the Boltzmann distribution, as the temperature in these models has the same status as an equilibrium temperature. We did not consider any Boltzmann distribution in our simulations, but we now show in the additional data ( Appendix 1) that this is a reasonable approximation.
For the second point, we are sorry for the confusion. It is not yet clear to us how to develop an analytical description in the presence of celldivision and apoptosis, one possibility is to include these rates following this work: https://www.pnas.org/doi/ 10.1073/pnas.1011086107. We are also trying to apply the non equilibrium approach that you developed in the recent PRL (Ref. 73). Our point in the current manuscript is simply the following: when the rates of these processes are small, the basic characteristic of the system should remain similar. The analytical derivation suggests that the aspect ratio distribution will have two different parts: the algebraic part is determined by the geometric property that cell perimeter is closedlooped.
This cannot change. So, all system specific details should enter via α alone. What we showed in the manuscript is that this expectation seems to be reasonable when we include these processes in our simulations, with their rates being not very large. We have now clarified this in the manuscript (Page 9, Column 2, second paragraph): “We have neglected …. processes are not dominant (Appendix 4).”
– Another approximation is made in appendix A which replaces the square of the perimeter by the sum of the squares of the lengths of the edges of the cell. I understand that this strongly simplifies the calculations. Could the authors make more quantitative this approximation which is only justified by an inequality. How accurate is it?
Thanks for raising this question. This is indeed an important trick in our analytical derivation. We have now discussed this approximation in detail in Appendix 5 justifying this approximation. We have also discussed this point in Sec. IV A, Page 11: “Note that this inequality … integral slightly easier.”
– On more general grounds, the results obtained by the authors are interesting but do they have any biological relevance? Could the authors point out a biological phenomenon for which cell shape fluctuations would play an important role?
We believe these results can pave the way to understand pattern formation in a new light. Apart from providing explanations for several existing experimental results, the analytical result is important as we can now formulate a theoretical framework to understand the mechanochemical pattern formation in confluent epithelial tissues in terms of this analytical result. Many past works have shown that cellular aspect ratio is directly correlated with biological functions, such as differentiation and division. But all these studies are either at the single cell level or in terms of the averages in a monolayer. Having the analytical distribution of the aspect ratio, we can now develop a statistical description of the monolayer. For example, we are now studying how to couple the model with different gradients of morphogens (such as the wingless and Dpp in the Drosophila wing disc). Having such a description will then allow us to study the pattern formation in such systems.
We have included a brief discussion of this aspect in the conclusion (Pages 910): “Pattern formation in a biological system … in the Drosophila wing disc.”
https://doi.org/10.7554/eLife.76406.sa2Article and author information
Author details
Funding
Department of Atomic Energy, Government of India (RTI 4007)
 Souvik Sadhukhan
 Saroj Kumar Nandi
Science and Engineering Research Board (SRG/2021/002014)
 Saroj Kumar Nandi
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
We thank Satyabrata Bandyopadhyay, Mustansir Barma, Tamal Das, Manish Jaiswal, Sritama Datta, Basil Thurakkal. and Sumedha for many enlightening discussions and Sam A Safran for comments. We acknowledge the support of the Department of Atomic Energy, Government of India, under Project Identification No. RTI 4007 and the computational facility of TIFR Hyderabad. We also thank SERB for grant via SRG/2021/002014.
Senior Editor
 Aleksandra M Walczak, CNRS LPENS, France
Reviewing Editor
 Karsten Kruse, University of Geneva, Switzerland
Reviewer
 JeanFrançois Joanny, Institut Curie, CNRS UMR168, France
Version history
 Preprint posted: August 21, 2021 (view preprint)
 Received: December 15, 2021
 Accepted: December 22, 2022
 Accepted Manuscript published: December 23, 2022 (version 1)
 Version of Record published: January 11, 2023 (version 2)
Copyright
© 2022, Sadhukhan and Nandi
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics

 1,762
 Page views

 239
 Downloads

 1
 Citations
Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading

 Developmental Biology
 Neuroscience
Development of the nervous system depends on signaling centers – specialized cellular populations that produce secreted molecules to regulate neurogenesis in the neighboring neuroepithelium. In some cases, signaling center cells also differentiate to produce key types of neurons. The formation of a signaling center involves its induction, the maintenance of expression of its secreted molecules, and cell differentiation and migration events. How these distinct processes are coordinated during signaling center development remains unknown. By performing studies in mice, we show that Lmx1a acts as a master regulator to orchestrate the formation and function of the cortical hem (CH), a critical signaling center that controls hippocampus development. Lmx1a coregulates CH induction, its Wnt signaling, and the differentiation and migration of CHderived Cajal–Retzius neurons. Combining RNAseq, genetic, and rescue experiments, we identified major downstream genes that mediate distinct Lmx1adependent processes. Our work revealed that signaling centers in the mammalian brain employ master regulatory genes and established a framework for analyzing signaling center development.

 Developmental Biology
 Evolutionary Biology
Cephalochordates and tunicates represent the only two groups of invertebrate chordates, and extant cephalochordates – commonly known as amphioxus or lancelets – are considered the best proxy for the chordate ancestor, from which they split around 520 million years ago. Amphioxus has been an important organism in the fields of zoology and embryology since the 18^{th} century, and the morphological and genomic simplicity of cephalochordates (compared to vertebrates) makes amphioxus an attractive model for studying chordate biology at the cellular and molecular levels. Here we describe the life cycle of amphioxus, and discuss the natural histories and habitats of the different species of amphioxus. We also describe their use as laboratory animal models, and discuss the techniques that have been developed to study different aspects of amphioxus.