I. Introduction

Due to their structural complexity, proteins can interact in different ways, leading to coexisting phases or assemblies such as fibers and aggregates. Long-lived assemblies are often kept together by strong adhesive forces, with binding free energies ranging from 9 kBT in the case of insulin dimers [1], over 2.5 kBT per beta sheet in amyloid fibers, to 0.9 kBT per beta sheet in the formation of assemblies of specific FUS segments called low-complexity aromatic-rich kinked segments [2]. Weak interactions are often responsible for the separation into liquid phases, each of distinct molecular composition. The interaction free energies associated with the formation of P granules via phase separation in living cells are about 0.5 kBT per molecule [3]. The biological function of both assemblies and phase-separated compartments relies on the recruitment of specific biomolecules such as proteins, RNA or DNA [4–7]. Since assemblies and condensed phases can adhere to membrane surfaces, both not only mediate mechanisms for sorting and transport of molecules [8] but also affect the composition, shape and properties of intracellular surfaces [9–12].

Despite these similarities, molecular assemblies and coexisting phases also exhibit crucial differences. While the size of a condensed phase at equilibrium increases with the size of the system [13], this is not necessarily the case for molecular assemblies [14–16]. Moreover, the assembly kinetics tends to an equilibrium characterised by assemblies of different sizes [14–16], while condensed phases equilibrate physico-chemical properties such as temperature, pressure and chemical potential between the spatially separated phases [13]. These differences suggest a rich interplay in a system where the molecular constituents can both oligomerise, forming assemblies, and give rise to coexisting phases [17–22].

In recent years, the interplay between phase separation and assembly formation has been the focus of many experimental efforts. Different proteins capable of forming condensed phases were shown to form oligomers below the saturation concentration [23, 24]. The authors proposed that such oligomers affect the phase separation propensity; however, the detailed mechanism remains elusive. Moreover, several experimental studies of protein phase separation indicate that proteins in the dense phase are linked, reminiscent of a physical gel [25–27]. Molecular simulations have been performed to identify the sequence-specific origin of such phenomena [28–31]. However, even in elegantly coarse-grained simulation approaches, the large number of parameters makes it difficult to extract the general mechanisms across different proteins. A theoretical framework that relies on thermodynamic principles, and that could provide an understanding of the general mechanisms underlying the interplay between phase separation and molecular assembly, is lacking.

While the theory of phase separation of a low number of different components [13, 32], as well as the formation of molecular assemblies in dilute environments [14, 33, 34], are well developed, only a few works have addressed assembly formation beyond the dilute limit, where assemblies can form and also phase separate. For example, it has been shown that, in the presence of coexisting phases, the assembly size distributions at equilibrium can vary in the two phases and that the dense phase can gelate [35–38]. These studies account for the scaling of the internal free energies of assemblies with their size but neglect the size dependence of the interaction propensities. Moreover, a discussion of the coupled phase separation and assembly kinetics is lacking.

Other authors focused on systems composed of a scaffold component that drives phase separation, and studied the dilute assembly kinetics of a second component that can interact with the scaffold [39–42]. In these works, the assemblies are considered to be dilute and the feedback of the assembly kinetics on the phase-separated compartment is neglected.

In this work, we introduce a framework that unifies the thermodynamic theories for phase separation with the theories developed for the formation of micelles and molecular assemblies at dilute conditions. We present two classes of size-dependent interactions that are inspired by biologically relevant proteins. Our theory is able to reproduce results observed in recent experimental studies, such as the emergence of anomalous size distributions below saturation and the gelation of condensed phases above saturation, and to characterise for which class and parameter values these phenomena manifest. Furthermore, we propose a non-equilibrium thermodynamic theory for the kinetics of molecular assembly at non-dilute conditions, which can lead to macroscopic, condensed phases above the saturation concentration. The complexity of our theory is reflected in a high-dimensional phase space that is set by the number of differently sized assemblies. We developed efficient numerical schemes to investigate the kinetics of such systems for the case where diffusion is fast compared to assembly kinetics. In particular, we study how condensates, initially formed via the phase separation of monomers from the solvent, change in response to the formation of assemblies. Our unified theory could be key to interpreting and understanding recent observations of protein condensation in vitro [43] and in the cell cytoplasm [24, 27, 44, 45].

II. Assembly and phase equilibria

We begin by reviewing the equilibrium theory of multi-component mixtures composed of solvent (s) and monomers (i = 1) that can form assemblies composed of i monomers, see Fig. 1a. In the case when monomers and assemblies are dissolved in the solvent, the free energy density of the solution can be written as [35, 37, 46, 47]:

where ρi = vi/vs are the relative molecular volumes, vi is the molecular volume of an assembly of size i, and vs is the solvent molecular volume. The solvent volume fraction can be expressed as a function of the assembly volume fractions via $\phi_s = 1 - \sum_{i=1}^{M} \phi_i$, where the sum runs over all assembly sizes up to the maximum size M. The first and fourth terms in Eq. (1) are the mixing entropies. The second and fifth terms of fsol characterize the internal free energies.

Illustration of assembly reaction scheme and classification.

a Illustration of the chemical reaction network associated with the formation of assemblies Ai with size i. b Identification of three classes based on assembly dimension: d = 1, 2, 3. c Classification of assemblies based on the scaling of their Flory-Huggins interaction propensity.

Here, ωs denotes the internal free energy of the solvent, and ωi are the internal free energies per monomer of an assembly of size i. Note that we chose to keep ϕi/i in the logarithm argument instead of reabsorbing the linear term −ϕi ln(ρi)/i in the internal free energies ωi. With this choice, ωi depends only on bond free energies; see Appendix B, Ref. [46], and the recent overview in the SI of Ref. [41]. The third and last terms in Eq. (1) capture the interactions of monomers belonging to different assemblies and with the solvent, where χij is the corresponding interaction parameter. The exchange chemical potential of monomers belonging to an assembly of size i reads

Assembly equilibrium

Assemblies can grow and shrink via association and dissociation. Such transitions among assemblies of different sizes are reminiscent of chemical transitions; see Fig. 1a. The condition of chemical equilibrium reads [16]:

\[ \mu_i = \mu_1 , \tag{3} \]

where μi is the exchange chemical potential of monomers belonging to an assembly of size i; see Eq. (2). Using the free energy Eq. (1) and the equilibrium conditions Eq. (3), we can express the volume fraction of the assembly of size i as a function of the monomer volume fraction ϕ1 in the following form:

The equation above, together with the conservation of monomers,

\[ \phi_{\rm tot} = \sum_{i=1}^{M} \phi_i , \tag{5} \]

allows us to rewrite the volume fraction ϕi of each assembly of size i as a function of the conserved quantity ϕtot. This relation ϕi = ϕi(ϕtot) has an analytical expression in the case d = 1; see Eq. (C1) and Eq. (C3) in Appendix C.
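When no closed form is available, the relation ϕi(ϕtot) can be obtained numerically. The following is a minimal sketch, not the expression of Eq. (4): it assumes an ideal, dilute linear-assembly law ϕi = i K^(i−1) ϕ1^i with a single hypothetical association constant `K`, and solves the monomer conservation for ϕ1 by bisection. The names `size_distribution` and `solve_monomer_fraction` and all parameter values are illustrative assumptions.

```python
def size_distribution(phi1, K, M):
    """Volume fractions phi_i (i = 1..M) for an ASSUMED ideal linear-assembly law,
    phi_i = i * K**(i-1) * phi1**i (this is not Eq. (4) of the text)."""
    x = K * phi1
    return [i * x ** i / K for i in range(1, M + 1)]


def solve_monomer_fraction(phi_tot, K, M, iterations=200):
    """Find phi1 such that sum_i phi_i(phi1) = phi_tot, using bisection."""
    lo, hi = 0.0, phi_tot  # the sum increases with phi1 and is never below phi1
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sum(size_distribution(mid, K, M)) > phi_tot:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


K, M = 1000.0, 50  # hypothetical association constant and maximum assembly size
for phi_tot in (1e-6, 1e-2):
    phi1 = solve_monomer_fraction(phi_tot, K, M)
    print(f"phi_tot = {phi_tot:g}: monomer fraction phi1/phi_tot = {phi1 / phi_tot:.3f}")
```

Bisection is used here because the total volume fraction is a monotonically increasing function of ϕ1, so the root is unique and always bracketed by 0 and ϕtot.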

Phase equilibrium

Two phases in an incompressible, multi-component system are at phase equilibrium when the chemical potentials μi and the osmotic pressure Π balance between the phases [13, 48]:

\[ \mu_i^{\rm I} = \mu_i^{\rm II} , \qquad \Pi^{\rm I} = \Pi^{\rm II} , \tag{6} \]

where the superscripts I and II indicate the ϕtot-rich and the ϕtot-poor phase, respectively.

Thermodynamic equilibrium

Our system is at thermodynamic equilibrium when assembly and phase equilibrium hold simultaneously. The conditions above for phase equilibrium can thus be rewritten using ϕi(ϕtot) (Eq. (4)). In particular, the free energy density Eq. (1) can be recast in terms of the conserved variable ϕtot [49, 50]. The phase diagram of the system can then be obtained via the common tangent construction (i.e., Maxwell construction). This construction corresponds to the balance between the exchange chemical potentials and the osmotic pressure in both phases; see Chapter 2 in Refs. [49, 50]:

\[ \left.\frac{\partial f}{\partial \phi_{\rm tot}}\right|_{\phi_{\rm tot}^{\rm I}} = \left.\frac{\partial f}{\partial \phi_{\rm tot}}\right|_{\phi_{\rm tot}^{\rm II}} = \frac{f(\phi_{\rm tot}^{\rm I}) - f(\phi_{\rm tot}^{\rm II})}{\phi_{\rm tot}^{\rm I} - \phi_{\rm tot}^{\rm II}} . \tag{7} \]
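The common tangent construction can be illustrated on the simplest possible case. The sketch below assumes a binary monomer-solvent mixture with equal molecular volumes and the Flory-Huggins free energy f = ϕ ln ϕ + (1 − ϕ) ln(1 − ϕ) + χ ϕ(1 − ϕ) in units of kBT; it does not include assemblies and is not the full free energy of Eq. (1). Because this f is symmetric under ϕ → 1 − ϕ, the common tangent is horizontal and the binodal follows from f′(ϕ) = 0. The function names and the value χ = 3 are assumptions for illustration.

```python
import math


def f(phi, chi):
    """Symmetric Flory-Huggins free energy density, in units of kBT (assumed model)."""
    return phi * math.log(phi) + (1.0 - phi) * math.log(1.0 - phi) + chi * phi * (1.0 - phi)


def df(phi, chi):
    """Derivative of f with respect to phi (exchange chemical potential)."""
    return math.log(phi / (1.0 - phi)) + chi * (1.0 - 2.0 * phi)


def symmetric_binodal(chi, iterations=200):
    """Return (dilute, dense) binodal volume fractions by bisection on f'(phi) = 0."""
    lo, hi = 1e-12, 0.5 - 1e-6
    if df(lo, chi) >= 0.0 or df(hi, chi) <= 0.0:
        raise ValueError("no bracketed binodal: chi must exceed 2 in this model")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if df(mid, chi) < 0.0:
            lo = mid
        else:
            hi = mid
    dilute = 0.5 * (lo + hi)
    return dilute, 1.0 - dilute  # the dense branch follows from the symmetry of f


dilute, dense = symmetric_binodal(3.0)
print(f"chi = 3: dilute phase {dilute:.4f}, dense phase {dense:.4f}")
```

For an asymmetric free energy, such as one recast in terms of ϕtot at assembly equilibrium, the tangent is no longer horizontal and both conditions of the construction have to be solved simultaneously, for example with a two-dimensional root finder.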

III. Scaling of molecular volumes, internal free energies and interaction energies with assembly size

The composition of the phase-separated compartments and the size distributions of the assemblies in each phase will depend on the scaling form of the key parameters of the model: the relative molecular volumes (ρi), the internal free energy of assemblies (ωi), and the interaction energies of assemblies among themselves (χij) and with the solvent (χis). For simplicity, we choose ρi = i for the results shown in this work. In Appendix B, we derive the scaling relationships for the internal free energies of linear (d = 1), planar (d = 2) and three-dimensional (d = 3) assemblies:

Here, ω∞ = lim i→∞ ωi is a constant that affects neither chemical nor phase equilibrium, except in the limit M → ∞, which will be discussed later. Moreover, eint − sintT is the free energy of an internal bond that keeps each assembly together, which can be separated into an enthalpic and an entropic contribution, eint and sint, respectively. How bond energy affects phase separation is discussed in Appendix E.

For the scaling of interaction energies χij and χis, we introduce two classes inspired by biologically relevant classes of proteins that can form assemblies and phase separate:

  1. Class 1: Constant assembly-solvent interactions.

    This class corresponds to the case where each monomer, independently of the assembly it is part of, interacts equally with the solvent, χis = χ; see Appendix B. Moreover, monomers in assemblies of different sizes interact equally with each other, implying that the corresponding Flory-Huggins parameter χij vanishes:

    \[ \chi_{is} = \chi , \qquad \chi_{ij} = 0 . \tag{9} \]

    This class is inspired by biologically relevant proteins for which the oligomerization domains are well separated along the protein from hydrophobic phase separation domains. In this case, when monomers form an assembly, their phase separation domains remain exposed, leading to a monomer-solvent interaction that does not depend on assembly size. Examples belonging to this class include synthetic constructs like the so-called ‘Corelets’, realised by tethering intrinsically disordered protein fragments to oligomerizing domains [51], and proteins like NPM1, whose N-terminal oligomerization domain (which allows for the formation of pentamers) is considered to be separated from the disordered region (responsible for phase separation) and the RNA binding domain [52, 53].

  2. Class 2: Size-dependent assembly-solvent interactions

    This class describes the case where monomers in the assembly bulk and monomers at the assembly boundary have different interaction propensities with the solvent (χ and χ respectively, see Appendix B for details). Similar to class 1, monomers in assemblies of different sizes interact equally with each other, leading to

    This class corresponds to the general case in which the oligomerization domains of the protein overlap with the phase separation domains. This case applies to segments of the intrinsically disordered region of the protein FUS, for example. In fact, recent experiments have shown the formation of assemblies in solutions containing specific FUS domains, called low-complexity aromatic-rich kinked segments (LARKS) [2, 54]. Strikingly, it was shown that hydrophobic domains along LARKS were buried in the formation of these assemblies, and the authors could quantify the hydrophobic area buried upon assembly formation. Another example could be Whi3, since it has recently been found that a mutation that enhances oligomerization strength lowers the density of Whi3 in the RNP condensates [45], suggesting that the formation of assemblies could screen the phase separation propensity of Whi3.

IV. Assembly size distributions below and above saturation

We first consider systems that are spatially homogeneous and composed of linear assemblies (d = 1). Homogeneity can be realized in dilute solutions if the total protein volume fraction ϕtot is below the saturation concentration of phase separation (for a definition see Sec. II). Homogeneous systems governed by Eq. (4) at equilibrium, obeying the conservation Eq. (5), exhibit two limiting behaviours depending on the value of the conserved variable ϕtot. We define the assembly threshold ϕ*(T), which separates these two behaviours, as the value of ϕtot for which the maximum of ϕi corresponds to monomers

Indeed, for ϕtot ≪ ϕ* the size distribution of linear assemblies (d = 1) is dominated by monomers (ϕ1 ≃ ϕtot), while larger assemblies have vanishing volume fractions. For higher total volume fractions (ϕtot ≫ ϕ*), the monomer concentration saturates at ϕ1 ≃ ϕ* and bigger assemblies start to populate the mixture. Above ϕ*, the distribution becomes peaked at a value imax > 1 and then decays exponentially for larger i; see Fig. 6 in Appendix B. Both the maximum and the average of the distribution ϕi grow with ϕtot, indicating that larger and larger assemblies populate the system as ϕtot is increased; see Appendix C for the scaling and a detailed discussion for Class 1.
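The two limiting behaviours can be made explicit in a simplified setting. The sketch below assumes the ideal linear-assembly law ϕi = i K^(i−1) ϕ1^i with a hypothetical association constant `K` and no upper bound on the assembly size; this is an assumption for illustration and not the expression derived in Appendix C. Under this assumption the conservation of monomers sums to x/(1 − x)² = t, with x = Kϕ1 and t = Kϕtot, which can be inverted in closed form, and 1/K plays the role of the threshold.

```python
import math


def reduced_monomer_fraction(t):
    """x = K*phi1 solving x / (1 - x)**2 = t, where t = K*phi_tot.
    Written in a form that avoids cancellation at small t."""
    return 2.0 * t / ((2.0 * t + 1.0) + math.sqrt(4.0 * t + 1.0))


def peak_size(t):
    """Position of the maximum of phi_i ~ i * x**i, treating i as continuous.
    A value below 1 means the distribution decreases monotonically from i = 1."""
    return -1.0 / math.log(reduced_monomer_fraction(t))


for t in (1e-2, 1e2, 1e4):
    x = reduced_monomer_fraction(t)
    print(f"K*phi_tot = {t:g}: K*phi1 = {x:.4f}, "
          f"peak at i ~ {peak_size(t):.1f}, sqrt(K*phi_tot) = {math.sqrt(t):.1f}")
```

In this assumed model, x approaches t well below the threshold (monomers dominate), saturates towards 1 well above it, and the peak position approaches the square root of Kϕtot.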

Now we consider systems that can phase separate. As outlined in Sec. II, at assembly equilibrium, we can recast the free energy as a function of the conserved variable ϕtot by using Eq. (4). For sufficiently large assembly-solvent interaction parameters, the system can demix into two phases with different total volume fractions $\phi_{\rm tot}^{\rm I}$ and $\phi_{\rm tot}^{\rm II}$, which are the solutions of Eq. (7). By means of $\phi_{\rm tot}^{\rm I/II}$, we can calculate the whole assembly size distribution in the two phases, i.e., $\phi_i^{\rm I/II}$, via Eq. (4) and Eq. (5).

We first discuss linear assemblies belonging to class 1, in the regime of high assembly strength, −eint/χ ≫ 1; see Fig. 2a-c. In Fig. 2a, we show the corresponding phase diagram as a function of ϕtot and the rescaled temperature T/T0, with T0 = χ/kB. The domain enclosed by the binodal corresponds to phase separation. As indicated by the colour code (depicting the monomer fraction ϕ1/ϕtot), each point in the diagram can have a different assembly composition. In green we plot the assembly threshold ϕ*(T), at which intermediate-sized assemblies start to appear. Note that, with this choice of parameters, the assembly threshold lies at lower ϕtot than the dilute branch of the binodal. We stress that, for d = 1, crossing the assembly threshold does not lead to a phase transition since, in contrast to crossing the binodal, it is not accompanied by a jump in the free energy or its derivatives. We can now define regions corresponding to qualitatively different phase and assembly behaviour. In particular, starting from a homogeneous system composed of monomers only (region “i”), increasing ϕtot leads to the emergence of intermediate-sized assemblies (region “ii”). Increasing ϕtot further, the system demixes into two phases, both of which are rich in intermediate assemblies (region “iii”). Representative size distributions and illustrations of the state of the system in the different regions are shown in Fig. 2b and Fig. 2c, respectively. For parameter values see Table I in Appendix A. This analysis showcases the potential of this framework to describe the appearance of mesoscopic clusters below the saturation concentration, as recently observed experimentally in Ref. [24].

Parameters corresponding to the figures in the main text. We made use of the temperature scale T0 = χ/kB.

Phase diagram and assembly size distributions for different classes and assembly strengths.

a Phase diagram as a function of ϕtot and rescaled temperature T/T0 (with T0 = χ/kB) in the regime of high assembly strength, i.e. −eint/χ ≫ 1. The green line is the volume fraction threshold ϕ*(T) at which intermediate-sized assemblies start to appear, which in this regime precedes the binodal (coloured curve). As indicated by the colour code, the monomer fraction ϕ1/ϕtot varies mildly between the two phases. b Size distributions and c pictorial representations corresponding to different regions of the phase diagram, defined by the relative position of the binodal and the assembly threshold. In region “i”, the system is homogeneous and composed of monomers only. Increasing the total volume fraction of assemblies ϕtot beyond the assembly threshold ϕ*, the system enters region “ii”, where intermediate assemblies appear. Here, the sizes corresponding to the maximum and the average of the distribution ϕi grow with ϕtot; see Appendix C. Finally, once ϕtot exceeds the binodal, the system enters region “iii” and demixes into two phases, both rich in intermediate assemblies. In d-f we focus on the low assembly strength regime, i.e. −eint/χ ∼ 1. In phase diagram d, the binodal now precedes the assembly threshold in ϕtot. e In region “iv”, the system phase separates but monomers dominate the size distribution in both phases, while in region “v” the dense phase becomes populated by intermediate-sized assemblies. Progressively lowering the temperature allows switching between these regions, as depicted in f. g, h Behaviour of dilute mixtures as a function of assembly strength, for the two different classes. Notably, assembly below saturation becomes much more accessible for class 2, as can be seen by comparing the green regions “ii” in g and h.

Remaining within class 1, we now discuss the case of low assembly strength, −eint/χ ∼ 1; see Fig. 2d-f. The intersection between the binodal and the assembly threshold ϕ* defines two new regions, “iv” and “v”; see Fig. 2d. In particular, in region “iv” both binodal branches lie below the assembly threshold, resulting in monomers dominating both coexisting phases; see Fig. 2e, centre. On the other hand, in region “v” the dense phase exceeds the assembly threshold, resulting in phases with dramatically different compositions: the dilute phase is populated only by monomers, while intermediate-sized assemblies develop in the dense phase; see Fig. 2e, right. In Fig. 2f, we illustrate states corresponding to fixed ϕtot and decreasing temperature T. Starting from a homogeneous monomeric state, region “i”, the system transitions into a demixed state with monomers dominating both phases, region “iv”, and finally to a demixed state with larger assemblies abundant in the dense phase, region “v”.

We now highlight the differences between the two classes defined in Sec. III. To this end, we characterise how mixtures behave with increasing ϕtot, varying the assembly strength eint but keeping the temperature T fixed. For class 1, the emergence of assemblies before saturation typically occurs in a very narrow interval of volume fractions; see the green region labelled “ii” in Fig. 2g. Strikingly, for class 2, assembly below saturation is more favoured; see again region “ii” in Fig. 2h. This difference arises because, within class 2, monomers in the bulk of an assembly have a reduced interaction propensity compared to those at the boundary. As a consequence, the formation of large clusters shifts the onset of phase separation to higher ϕtot values.

V. Gelation of the dense phase

In this section, we discuss the case of planar (d = 2) and three-dimensional (d = 3) assemblies, referring for simplicity to systems belonging to Class 1. In this case, as shown in Appendix D, even when neglecting protein-solvent interactions (χ = 0), the system can undergo a phase transition in the thermodynamic limit M → ∞. In fact, above the volume fraction ϕsg (defined in Eq. (D2)), we observe the emergence of a macroscopic assembly occupying a finite fraction of the system volume that contains a macroscopic fraction of all monomers in the system, a behaviour reminiscent of Bose-Einstein condensation; see for example Chapter 7.3 of Ref. [37]. Since we do not explicitly include the solvent in assembly formation (see reaction scheme in Fig. 1a), we will consider the gel as a phase without solvent and thus ϕtot = 1.

We now focus on systems that phase separate as the result of interactions with the solvent (χ ≠ 0 in Eq. (9)) and discuss the interplay between phase separation and gelation. Volume fractions in the coexisting phases are determined by Eq. (7), and assembly equilibrium requires that Eq. (3) is satisfied. As pointed out in Sec. II, we aim to find an expression for ϕi(ϕtot) via Eq. (3) and Eq. (5), and then substitute it into the free energy Eq. (1). However, for planar (d = 2) and three-dimensional (d = 3) assemblies, performing the thermodynamic limit M → ∞ leads to a free energy composed of series that cannot be calculated analytically. This is a consequence of the gelation transition, and the limitation can be dealt with by explicitly introducing the infinite-sized gel in the free energy. For this reason, we write the system free energy as a composition of the solvent free energy fsol and the gel free energy fgel:

where fsol is defined in Eq. (1). The gel free energy reads

with δ(·) denoting the delta distribution. The gel free energy fgel is the free energy of a state with no solvent, where all monomers belong to an assembly of size i → ∞. In fact, in the limit ϕi = 0 for all finite i and ϕtot = 1, the free energy in Eq. (1) simplifies to ω∞/v1. For a detailed discussion of the role of ω∞, see Appendix D.

We can now perform a Maxwell construction by using Eq. (12) in Eq. (7). The resulting phase diagram is displayed in Fig. 3a, where the binodal is coloured by the monomer fraction ϕ1/ϕtot in the coexisting phases. In phase-separated systems, gelation can be considered as a special case of phase coexistence between a dilute phase (“sol”), in which ϕsol < 1, and the gel phase, corresponding to ϕgel = 1. The domain in the phase diagram where a gel phase coexists with a soluble phase is shaded in blue and labelled as “sol-gel” in Fig. 3a. In the same panel, we show that lowering the temperature for large ϕtot leads to a transition from the homogeneous state to the sol-gel coexistence. By contrast, for intermediate volume fractions, the system first transits through a domain corresponding to two-phase coexistence; see the light blue domain labelled as “sol-sol” in Fig. 3a, where ϕtot < 1 in both phases. In Fig. 3b, we show assembly size distributions representative of the “sol-sol” and “sol-gel” regions. The transition from the “sol-sol” to the “sol-gel” region is accompanied by a jump in the total volume fraction of the dense phase, $\phi_{\rm tot}^{\rm I}$; see Fig. 3c for an illustration.

Gelation transition in phase-separating systems.

a Phase diagram for planar (d = 2) and three-dimensional (d = 3) assemblies in the limit M → ∞, as a function of ϕtot and the rescaled temperature T/T0 (with T0 = χ/kB). The coloured curve represents the binodal associated with the free energy f, which accounts for the emergence of an infinite assembly. The colour code of the binodal line depicts the monomer fraction ϕ1/ϕtot in the phases. In the region labelled as “sol-sol”, the system demixes into two phases, both populated mainly by monomers and with ϕtot < 1; see panel b. In the region labelled as “sol-gel”, on the other hand, a phase (the “sol”), obeying ϕsol < 1, coexists with a phase (the “gel”) that is a macroscopic assembly containing no solvent (ϕgel = 1). The latter scenario is represented in panel b, right side. c Lowering the temperature allows transitions from the “sol-sol” to the “sol-gel” region, which manifest as a jump in the total volume fraction of the dense phase.

VI. Kinetic theory of assembly at phase equilibrium

Building upon the thermodynamic framework discussed in the previous sections, we devise a non-equilibrium kinetic theory for molecular assembly at non-dilute conditions, where the interactions can give rise to coexisting phases. Here, we restrict ourselves to the case where each phase is homogeneous and at phase equilibrium but not at assembly equilibrium [56], i.e., Eq. (6) is fulfilled during the kinetics while Eq. (3) is not satisfied in general. This partial equilibrium holds when the molecular transitions among assemblies are slow compared to phase separation. This case is often referred to as reaction-limited [57, 58] and applies particularly well to molecular assemblies involving biological enzymes [59]. For simplicity, we present the kinetic theory and discuss the results for two coexisting phases.

We tailor the concepts developed in Ref. [56] to the case of incompressible systems, dvi/dt = 0 and dvs/dt = 0, and volume-conserving assembly kinetics, described by an assembly rate for each assembly size i in each phase. In this case, the total volume V is constant, dV/dt = 0, with $V = V^{\rm I} + V^{\rm II}$, and the volume fraction of the assembly of size i in each phase, $\phi_i^{\rm I/II}$, is governed by

and the solvent volume fraction in each phase is given by $\phi_s^{\rm I/II} = 1 - \phi_{\rm tot}^{\rm I/II}$, with $\phi_{\rm tot}^{\rm I/II} = \sum_{i=1}^{M} \phi_i^{\rm I/II}$. Moreover, Eq. (14) contains the diffusive exchange rates between the phases. The last term in Eq. (14) accounts for variations in volume fractions due to the changes of the respective phase volumes $V^{\rm I/II}$. The kinetics of phase volumes follows

Moreover, mass conservation at the interface relates the diffusive exchange rates of the assemblies and of the solvent in the two phases. Thus, the assembly kinetics conserves the total volume fraction of the system, defined as the volume-weighted average of the total volume fractions in the two phases, $(V^{\rm I}\phi_{\rm tot}^{\rm I} + V^{\rm II}\phi_{\rm tot}^{\rm II})/V$. The exchange rates are determined by the conditions that maintain phase equilibrium, $\mu_i^{\rm I} = \mu_i^{\rm II}$ and $\Pi^{\rm I} = \Pi^{\rm II}$, where $\mu_i^{\rm I/II}$ are the exchange chemical potentials of the monomers in an assembly of size i (Eq. (2)), and Π is the osmotic pressure; for more information, see Appendix F. Using our kinetic theory, we can study the relaxation toward thermodynamic equilibrium, which corresponds to simultaneous phase and assembly equilibrium. Here, we focus on assembly growth and shrinkage occurring via association and dissociation; see the reaction scheme in Fig. 1. However, note that our framework can be easily generalised to include other assembly mechanisms, including primary and secondary nucleation [60]. To account for association and dissociation processes, the phase-dependent net reaction rate for the formation of an (i + j)-mer from an i-mer and a j-mer, and vice versa, is set by the exchange chemical potentials via

where kij is a size-dependent kinetic rate coefficient. The assembly rates entering Eq. (14) can finally be expressed as a function of the monomer exchange rate , by

VII. Assembly kinetics in coexisting phases

By integrating Eq. (14) numerically, we obtain the time evolution of $\phi_i^{\rm I/II}(t)$ and $V^{\rm I}(t)$, given their initial values $\phi_i^{\rm I/II}(t = 0)$ and $V^{\rm I}(t = 0)$ at phase equilibrium. Specifically, we consider an initial state solely composed of solvent and monomers, demixed into a monomer-rich and a monomer-poor phase (labeled I and II, respectively; see the illustration in Fig. 4a). We focus on linear assemblies (d = 1) and highlight differences between Class 1 and 2; for parameters see the caption of Fig. 4.
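To give a feeling for this kind of numerical integration, the following minimal sketch integrates reversible association and dissociation, A_i + A_j ⇌ A_(i+j), in a single well-mixed, dilute phase with an explicit Euler scheme. It is a strongly simplified stand-in and not the scheme used for Eq. (14): there is no second phase, no exchange between phases, and the rates follow mass action with constant hypothetical coefficients `k_on` and `k_off` instead of being set by exchange chemical potentials. Concentrations `c[i]` are counted in units where an assembly of size i carries i monomers.

```python
def step(c, k_on, k_off, dt):
    """One explicit Euler step for A_i + A_j <-> A_(i+j) with i + j <= M.
    c[0] is unused so that c[i] refers to assemblies of size i."""
    M = len(c) - 1
    dc = [0.0] * (M + 1)
    for i in range(1, M + 1):
        for j in range(i, M - i + 1):
            net = k_on * c[i] * c[j] - k_off * c[i + j]  # net association flux
            dc[i] -= net
            dc[j] -= net        # for i == j this removes two assemblies of size i
            dc[i + j] += net
    return [ci + dt * dci for ci, dci in zip(c, dc)]


def total_monomers(c):
    """Conserved quantity: total number of monomers per unit volume."""
    return sum(i * ci for i, ci in enumerate(c))


M, k_on, k_off, dt = 10, 1.0, 1.0, 0.01   # hypothetical values
c = [0.0, 1.0] + [0.0] * (M - 1)          # initial state: monomers only
for _ in range(20000):
    c = step(c, k_on, k_off, dt)

print("conserved monomer content:", round(total_monomers(c), 9))
print("c_2 / c_1^2 at late times:", round(c[2] / c[1] ** 2, 4), "(k_on / k_off =", k_on / k_off, ")")
```

Every reaction removes i + j monomers from the reactants and adds the same number to the product, so the monomer content is conserved at each step; at late times the fluxes vanish and the distribution satisfies detailed balance for each reaction.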

Assembly kinetics at phase equilibrium.

Assuming that the relaxation to phase equilibrium is fast compared to assembly kinetics, we study the slow relaxation to assembly equilibrium in a compartmentalized system. a In the sketch, starting from an initial state composed of monomers and solvent only, assemblies selectively appear in phase I, increasing its volume $V^{\rm I}$ and total volume fraction $\phi_{\rm tot}^{\rm I}$. b, c For Class 1, as time proceeds, the total macromolecule volume fractions in the two phases, $\phi_{\rm tot}^{\rm I/II}$, change, inducing the growth of phase I. In d and e we show the time evolution of the full size distribution in phase II and I, respectively. f, g For Class 2, as time proceeds, changes in total macromolecule volume fraction in the two phases cause a shrinkage of phase I. This is reminiscent of recent experimental findings that quantify droplet volume changes along with droplet ageing [55]. h, i Time evolution of the assembly volume fractions ϕi(t) in phase II and I, respectively. Time is measured in units of the discretization time step, where the rate is introduced in Eq. (F4).

For Class 1, as monomers start forming assemblies, the mixing entropy decreases. As a result, the total amount of protein in the monomer-rich phase, $\phi_{\rm tot}^{\rm I}$, increases while $\phi_{\rm tot}^{\rm II}$ decreases (Fig. 4b). Such changes in total protein volume fractions induce phase volume variations (Fig. 4c). In particular, remaining within Class 1, since the monomer enrichment of phase I is less pronounced than the monomer depletion of phase II, the volume of the dense phase $V^{\rm I}$ increases. An important finding of our work is that the distribution of assembly sizes evolves differently in each phase (Fig. 4d,e; and SI Movie 1). In phase II, which is initially poor in monomers, assemblies grow slowly toward an equilibrium distribution which decreases monotonically with assembly size, following an exponential decay. The kinetics in the initially monomer-rich phase I is fundamentally different. First, a very pronounced peak of intermediate-sized assemblies develops quickly. The faster kinetics compared to phase II is caused by monomer diffusion from II to I, which leads to negative feedback for assembly in II and positive feedback in I. This observation is reminiscent of studies on dilute, irreversible aggregation in coexisting phases [39]. The most abundant populations of intermediate-sized assemblies then shrink slowly in time, feeding the growth of larger assemblies. The resulting equilibrium distribution shows a notable peak of intermediate-sized assemblies followed by an exponential decay. Thus, the difference in the kinetics between the phases is predominantly a consequence of the fact that each phase strives towards a significantly different equilibrium distribution.

Assemblies belonging to Class 2 exhibit a different behaviour. Indeed, in this class, as monomers assemble, their interaction propensity decreases. As a result, depending on the values of the interaction parameters, the total amount of protein in both phases, $\phi_{\rm tot}^{\rm I}$ and $\phi_{\rm tot}^{\rm II}$, can decrease, as in the case of Fig. 4f. In this case, furthermore, the changes in total protein volume fractions induce a non-monotonic phase volume variation (Fig. 4g) that ultimately leads to the shrinkage of phase I. In Fig. 4h,i, and SI Movie 2, we show how the volume fractions ϕi(t) corresponding to all the assembly sizes i evolve in both phases.

VIII. Assembly formation can increase or decrease condensate volume

Here, we discuss changes in phase volumes caused by the assembly kinetics introduced in Sec. VI. In particular, we focus on mixtures initially demixed into two phases, both composed of monomers only, and let the system relax to thermodynamic equilibrium. We then assess for which values of the control parameters ϕtot and T the formation of assemblies in both phases leads to a growth of the ϕtot-rich phase (phase I) and vice versa. Moreover, we distinguish the two protein classes introduced in Sec. III.

To this end, we compare the phase diagram corresponding to the initial system, composed of monomers only, with the equilibrium phase diagram in which large assemblies populate the mixture. In Fig. 5a, we show the initial and final equilibrium binodals (black and coloured curve, respectively) for the case of linear assemblies (d = 1) belonging to class 1. In this case, the domain corresponding to demixing enlarges once the system reaches its equilibrium state, i.e., assembly facilitates phase separation. We focus on the ϕtot-T domain enclosed by the black curve, where the system is phase separated at all times, and compute the initial and final dense phase volumes via the conservation of the total volume fraction. As displayed in Fig. 5a, this allows us to identify two parameter regimes: at low ϕtot (orange area), the dense phase grows as assemblies form, while above the dashed grey line (light blue area), it shrinks. Remarkably, linear assemblies (d = 1) belonging to class 2 exhibit a completely different behaviour; see Fig. 5b. In this case, assembly formation shrinks the domain corresponding to demixing, thereby suppressing phase separation. In the domain enclosed by the coloured curve, we can compute the initial and final dense phase volume for each value of ϕtot and T. In contrast to the previous case, we find that at low ϕtot (light blue area) the dense phase shrinks as assemblies are formed, while for higher ϕtot values (orange area) the condensate volume grows, as illustrated in Fig. 5b.
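The dense phase volume follows from the conservation of the total volume fraction through the lever rule. The sketch below is a minimal illustration of that step only: the function name and the binodal values are hypothetical and are not taken from Fig. 5. They merely mimic a binodal that widens as assemblies form.

```python
def dense_phase_volume_fraction(phi_bar, phi_dense, phi_dilute):
    """Lever rule: V_I / V obtained from
    phi_bar * V = V_I * phi_dense + (V - V_I) * phi_dilute."""
    if not phi_dilute < phi_dense:
        raise ValueError("the dense phase must have the larger volume fraction")
    if not phi_dilute <= phi_bar <= phi_dense:
        raise ValueError("phi_bar lies outside the binodal: the system is homogeneous")
    return (phi_bar - phi_dilute) / (phi_dense - phi_dilute)


# Hypothetical (dense, dilute) binodal values at one temperature
initial = (0.70, 0.10)  # monomers and solvent only
final = (0.80, 0.02)    # after relaxation to assembly equilibrium

for phi_bar in (0.3, 0.6):
    v0 = dense_phase_volume_fraction(phi_bar, *initial)
    v1 = dense_phase_volume_fraction(phi_bar, *final)
    trend = "grows" if v1 > v0 else "shrinks"
    print(f"phi_bar = {phi_bar}: V_I/V goes from {v0:.3f} to {v1:.3f}, dense phase {trend}")
```

With these illustrative numbers, the same widening of the binodal makes the dense phase grow at low total volume fraction and shrink at high total volume fraction, which is the kind of distinction drawn between the two coloured areas.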

Identification of shrinkage and growth regions for different classes.

Here, we study phase-separating systems initially composed of monomers only, and we monitor phase volume changes as they relax to thermodynamic equilibrium. a For linear assemblies (d = 1) belonging to class 1, the final binodal (coloured curve) is wider than the initial one (black curve), which corresponds to monomers and solvent only. Areas in orange and light blue correspond to growth and shrinkage of the ϕtot-dense phase (phase I), respectively. b The behaviour of linear assemblies (d = 1) belonging to class 2 is remarkably different. Since, in this class, the interaction with the solvent is screened, the final binodal is shrunk compared to the initial one. As a consequence, for class 2 the domain corresponding to phase I shrinkage (light blue area) precedes in ϕtot the growth domain (orange area).

A volume fraction threshold separates two assembly regimes in homogeneous systems.

a Illustration of assemblies belonging to Class 1 with different spatial dimensions. b Assembly size distribution at low total macromolecular volume fraction: ϕtot = 0.2ϕ*. Regardless of the assembly dimension d, the macromolecules are mainly in the monomer state, i.e., ϕ1 ≃ ϕtot. c For ϕtot = 10ϕ*, the monomer concentration saturates at ϕ1 ≃ ϕ* and large assemblies begin to populate the system. For linear assemblies (corresponding to d = 1 in Eq. (8)), the distribution becomes peaked at an intermediate value imax > 1 and is then exponentially cut off. For planar and three-dimensional assemblies, d = 2, 3, the distribution becomes bimodal, with peaks at i = 1 and i = M, the maximum assembly size (M = 50). This bimodal behaviour hints at the emergence of a gelation transition in the limit M → ∞. In the insets, we show the scaling of concentrations ci with assembly size. For d = 2, 3 and above the ϕ* threshold, deviations from the classical exponential decay are present. Here eint = 1, sint/kB = 1, M = 50, T/T0 = 0.25.

IX. Conclusion

We discuss an extension of the classical theory of molecular assembly [14–16] to non-dilute conditions and study it for the case where assemblies can phase-separate from the solvent and gelate. This extension relies on a thermodynamic free energy governing the interactions among all assemblies of different sizes and the solvent. We propose two classes of molecular interactions to account for protein interactions relevant for biological systems that can phase separate and form assemblies. The classes differ in how the energetic parameters for interactions and the internal free energies depend on assembly size.

Using our theory, we report several key findings that arise from non-dilute conditions and the ability of assemblies to form a condensed phase. First, size distributions, in general, differ between the phases. In particular, monomers are not necessarily the most abundant species, and distribution tails can significantly deviate from the exponential decay known for classical assembly at dilute conditions [15]. Interestingly, this statement also applies to conditions below the saturation concentration beyond which phase separation can occur. Second, we show that by lowering the temperature, the dense phase can gelate, i.e., it consists of a single connected assembly of volume equal to the dense phase (a gel). Upon gelation, the composition of the dilute phase changes continuously, while the dense liquid phase transitions discontinuously to the gel phase. Third, when monomers start assembling in the respective phases, the volume of the protein-dense phase can grow or shrink depending on the molecular interactions among the constituents.

Our key findings are consistent with recent experimental observations in living cells and in vitro assays using purified proteins. A decrease in droplet volume has been observed in phase-separated condensates composed of purified FUS proteins [55]. Up to now, it has remained unclear whether this kinetics relies on a glass transition, as suggested in the discussion of Ref. [55], or on the formation of FUS oligomers in the dense phase. However, a potential hint comes from independent studies, which indicate that FUS can form amyloid-like assemblies that are associated with neurodegenerative disorders [6] at similar conditions [61, 62]. Moreover, the gelation of dense protein condensates upon temperature and heat stress was suggested in several in vivo studies [26]. The transition to a gelated condensate is believed to provide a protection mechanism for the protein expression machinery in the case of intracellular stress. Recent in vitro experiments using purified proteins indicate anomalous size distributions of phase-separating proteins below saturation [24]. Our theoretically predicted size distributions could be compared to systematic experimental studies using single-molecule techniques such as FRET. From this comparison, the interactions of assembly-prone and phase-separating proteins can be characterized using our proposed classes.

Though many biologically relevant assembly processes are reversible and governed by thermodynamic principles, there are also a large number of assemblies that are persistently maintained away from equilibrium. For example, the formation or disassembly of assemblies can depend on the hydrolysis of ATP [63], while ATP can also act as a cosolute [64, 65]. Since fuel levels are kept approximately constant in living cells, fuel-driven assembly processes are maintained away from equilibrium and thus cannot relax to thermodynamic equilibrium. It is an exciting extension of our work to consider fuel and waste components and how distributions of assembly sizes and the gelation of condensates are affected when maintained away from equilibrium.

Acknowledgements

We thank J. Bauermann, K. Alameh, P. McCall, T. Harmon, L. Hubatsch, L. Jawerth and F. Jülicher for fruitful discussions about the topic. We thank C. Seidel and T. Franzmann for pointing out the relevance of our theory for protein aggregation in biomolecular condensates. We acknowledge J.-F. Joanny for pointing out the references [35, 37]. We thank J. Bauermann, S. Horvát and C. Duclut for help improving the Mathematica code. G. Bartolucci and C. Weber acknowledge the SPP 2191 “Molecular Mechanisms of Functional Phase Separation” of the German Science Foundation for financial support. C. Weber acknowledges the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Fuelled Life, Grant Number 949021) for financial support. Figures created with BioRender.com.

Appendix A Supplementary information and parameters used

The parameters used in each figure are listed in Table I.

Movie 1 shows the time evolution of the assembly volume fractions in both phases, for Class 1. Parameters are the same as in Fig. 4b-e.

Movie 2 shows the time evolution of the assembly volume fractions in both phases, for Class 2. Parameters are the same as in Fig. 4f-i.

Appendix B Scaling laws for internal and interaction energies

Here we provide a physical interpretation of the internal free energy ωi. For simplicity, we consider a homogeneous system solely composed of assemblies of size i, characterized by the volume fraction vector ϕ(i), whose components vanish for all sizes j ≠ i. Making use of Eq. (1), the internal free energy of such systems can be written as

with f(ϕ(i))v1 being the free energy associated with each monomer belonging to the i-th assembly. The second term in the equation above is the conformational entropy that stems from having more accessible states with increasing assembly size. Thus, Eq. (B1) allows interpreting ωi as the free energy of monomers inside an assembly of size i, coming only from bonds between monomers. To quantify it, we introduce the number of binding sites for each monomer, z. Following Ref. [16], we distinguish between nb monomers at the boundaries of the assembly, and (i − nb) in the assembly bulk. Monomers in the bulk can saturate all their z binding sites while, in general, monomers at the boundaries are able to saturate only zb < z. Thus, we get

where Δω is the free energy associated with the formation of a single bond, that is composed of an energetic and an entropic part. The factor two avoids double counting.
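The bond counting described above can be written out as a short sketch. The display equation (B2) is not legible in this copy, so the function below follows only the verbal description: (i − nb) bulk monomers saturate z sites, nb boundary monomers saturate zb sites, and a factor of one half avoids double counting. The normalisation per monomer and the function name are assumptions, and the conformational entropy term of Eq. (B1) is not included.

```python
def bond_free_energy_per_monomer(i, n_b, z, z_b, delta_omega):
    """Bond free energy per monomer in an assembly of size i.

    (i - n_b) bulk monomers saturate z binding sites and n_b boundary
    monomers saturate z_b < z; the factor 1/2 avoids counting each bond twice.
    delta_omega is the free energy of a single bond.
    """
    n_bonds = 0.5 * (z * (i - n_b) + z_b * n_b)
    return n_bonds * delta_omega / i


# Linear assemblies (n_b = 2, z = 2, z_b = 1): an i-mer contains i - 1 bonds,
# so the value per monomer approaches delta_omega for large i.
omega_5 = bond_free_energy_per_monomer(5, n_b=2, z=2, z_b=1, delta_omega=-1.0)
```

For the linear case the count reduces to (i − 1) bonds shared among i monomers, which makes the boundary penalty proportional to 1/i explicit.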

We describe three species of assemblies: linear, disc-like and three-dimensional. These can be realised by varying the number of binding sites and their orientation. Linear assemblies (d = 1) are defined to have only two binding sites. They can be pictured as one-dimensional semi-flexible assemblies with no loops, leading to nb = 2, z = 2 and zb = 1. Planar assemblies (d = 2) are defined to have z > 2 co-planar binding sites, for which . Three-dimensional assemblies (d = 3) are characterized by z > 2 binding sites with no precise orientation leading to . Summing up, we get

that inserted in Eq. (B2), decomposing Δω in its energetic and entropic contribution, gives

where α is a constant that depends on the number and geometry of the binding sites. Identifying ω = α (eint − T sint), Eq. (B4) leads to Eq. (8) in the main text. The constant term ω affects neither chemical nor phase equilibrium. However, in the case of d = 2, 3 and M → ∞, ω becomes important for studying the gelation of the dense phase, see Appendix D. In Eq. (8), the second term represents a boundary interaction penalty, accounting for the fact that monomers at the assembly boundary can realise fewer internal bonds than monomers in the assembly bulk, in analogy with the physical origin of surface tension.

We now discuss the size dependence of the interaction parameters χij. Starting from a lattice model, these parameters can be expressed in terms of the energetic parameters eij corresponding to having two neighbouring monomers belonging to assemblies i and j. In particular, χij = 2eij − eii − ejj. Assuming that the energies associated with monomer-monomer interactions do not vary within assemblies, i.e., eij = e11 is constant, we get χij = 0. We next discuss the scaling of χis = 2eis − eii − ess. If the monomer-solvent interactions are also chosen to be size-independent, i.e., eis = e1s, we get χis = 2e1s − e11 − ess = χ. This explains the scaling in Class 1 (see Eq. (9)).
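The Class 1 argument can be checked numerically with a small sketch. It assumes the lattice-model combination of contact energies given in the text, with any prefactor (coordination number, kBT) omitted as it is there; the energy values are arbitrary illustrative numbers.

```python
def chi(e_ab, e_aa, e_bb):
    """Lattice-model interaction parameter between species a and b."""
    return 2.0 * e_ab - e_aa - e_bb


# Hypothetical contact energies (arbitrary units).
e_11 = -1.0   # monomer-monomer, assumed identical for all assembly sizes
e_ss = -0.5   # solvent-solvent
e_1s = -0.6   # monomer-solvent, assumed size-independent (Class 1)

# Any two assembly sizes i, j share the same monomer-monomer energy,
# so their mutual interaction parameter vanishes.
chi_ij = chi(e_11, e_11, e_11)

# The assembly-solvent parameter is the same for every size i.
chi_is = chi(e_1s, e_11, e_ss)
```

Because no size index enters either call, the sketch makes explicit that Class 1 has a single interaction parameter χ shared by all assemblies.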

However, many proteins of interest screen their hydrophobic interaction when forming assemblies [2, 45, 54], implying that the interaction energy eis between monomers in assembly i and the solvent (s) varies with the assembly size i. In each assembly, this energy per monomer comes from two contributions. The first corresponds to the (i − nb) monomers in the bulk, whose interaction energy with the solvent differs from that of free monomers. The second corresponds to the nb monomers at the assembly boundary, characterised by the interaction energy e1s with the solvent. We get

Using the scaling of nb/i already introduced above in the discussion of the internal free energy scaling, see Eq. (B3), we obtain Eq. (10). By abbreviating this case corresponds to Class 2.

Appendix C Linear assemblies belonging to class 1

For class 1 and d = 1, Eq. (4) reads

where we have introduced the characteristic volume fraction

It is straightforward to verify that the latter volume fraction is proportional to the assembly threshold ϕ* defined in Eq. (11). In Fig. 6 in Appendix B, we show the assembly size distribution in homogeneous mixtures obtained by numerically solving Eq. (4) together with Eq. (5), with a cut-off M = 50. We characterise the behaviour of assemblies with different spatial dimensions d = 1, 2, 3, see Fig. 6a. For dilute solutions, corresponding to ϕtot ≪ ϕ*, the size distribution is dominated by monomers while larger assemblies have vanishing volume fraction, i.e., ϕ1 ≃ ϕtot, see Fig. 6b. For ϕtot ≫ ϕ*, the monomer concentration saturates at ϕ1 ≃ ϕ* and assemblies begin to populate the system. As depicted in Fig. 6c, above this threshold the size distribution depends crucially on the assembly dimension d. For linear assemblies (d = 1 in Eq. (8)), the distribution becomes peaked at an intermediate value imax > 1 and then decays exponentially. For planar and three-dimensional assemblies, d = 2, 3 in Eq. (8), the distribution becomes bimodal, peaked at i = 1 and i = M, the maximum assembly size (M = 50 in Fig. 6c). The behaviour of the system at high density can be quantitatively studied by performing the thermodynamic limit, i.e., M → ∞. Within this limit, the series defined in the conservation law, Eq. (5), can be explicitly solved, leading to

Recalling that the characteristic volume fraction is proportional to ϕ*, this leads to ϕ1 ≃ ϕtot in the regime ϕtot ≪ ϕ*, while for ϕtot ≫ ϕ* we get ϕ1 ≃ ϕ*.
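The two limits can be illustrated with a short numerical sketch. Eq. (C1) and the closed-form solution are not legible in this copy, so the code assumes the classical linear-assembly form ϕi = i ϕc x^i with x = ϕ1/ϕc, where ϕc stands for the characteristic volume fraction; prefactors in the actual equations may differ. Under this assumption the conservation law for M → ∞ reads ϕtot/ϕc = x/(1 − x)², which is solved below for the root with 0 ≤ x < 1. All names are hypothetical.

```python
import math


def monomer_ratio(phi_tot, phi_c):
    """Return x = phi_1 / phi_c solving phi_tot / phi_c = x / (1 - x)**2, 0 <= x < 1.

    The root is written in a form that avoids cancellation for small phi_tot.
    """
    a = phi_tot / phi_c
    return 2.0 * a / (2.0 * a + 1.0 + math.sqrt(4.0 * a + 1.0))


def size_distribution(phi_tot, phi_c, i_max):
    """Volume fractions phi_i = i * phi_c * x**i for i = 1 .. i_max (assumed form)."""
    x = monomer_ratio(phi_tot, phi_c)
    return [i * phi_c * x**i for i in range(1, i_max + 1)]


# Dilute regime: almost all material is monomeric, phi_1 is close to phi_tot.
x_dilute = monomer_ratio(0.01, 1.0)
# Dense regime: the monomer volume fraction saturates close to phi_c.
x_dense = monomer_ratio(1.0e4, 1.0)
```

Summing `size_distribution(phi_tot, phi_c, i_max)` for a large enough `i_max` recovers `phi_tot`, which is a quick consistency check of the assumed series.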

The maximum of the volume fraction distribution in Eq. (C1) can be obtained by imposing ∂iϕi = 0, leading to

The approximate expression on the right-hand side is obtained using Eq. (C3) and expanding for .

The average ⟨i⟩ =∑i/∑ϕi is given by

where we expanded for to obtain the approximate expression.

We can also derive an expression for the free energy as a function of the conserved quantity ϕtot alone, making use of Eq. (C1) together with Eq. (C3):

Appendix D Gelation transition for two and three-dimensional assemblies

As outlined in Fig. 6 in Appendix B for d = 2, 3, at high ϕtot for M finite, the size distribution shows a bimodal behaviour. This suggests for the limit M → ∞ that the system undergoes a gelation transition, which is defined as the emergence of an assembly that is comparable with the system size [16, 37, 38]. To precisely locate the ϕtot value at which the transition occurs, we recall Eq. (4) and consider the series

We note that when N → ∞, this series converges only if ⟩. Thus, we get an upper bound for the series, namely

Approximating the series with the integral, we get an estimation for ϕsg:

By the Maxwell construction, Eq. (7) with the free energy Eq. (12), we can study the interplay between the gelation transition and phase separation. Here, the parameter ω plays a crucial role. As discussed in Appendix B, ω contains an energetic and an entropic part, and is proportional to eint − T sint, with a coefficient that depends on the assembly dimension and on the number and geometry of the binding sites. Here, for simplicity, we set

In Fig. 7a, we display the result of the construction (coloured curve); the colour code depicts the monomer fraction ϕ1/ϕtot in the coexisting phases. Note that, with our choice of ω, the boundary between homogeneous mixtures and the gel state, at high ϕtot, agrees very well with the estimate ϕsg introduced in Eq. (D3) (black curve). In Fig. 7b-d, we display the free energy for three temperature values corresponding to sol-gel coexistence (Fig. 7b), sol-sol and sol-gel coexistence (Fig. 7c), and sol-sol coexistence only (Fig. 7d). The dashed lines represent values where f is not convex. Notice that, for consistency, we use values of fsol only up to ϕsg (denoted by a vertical black line). This is because, as depicted in Fig. 6 in Appendix B, beyond this value a peak appears at M, the finite cut-off used for the numerics.

Gel-sol free energies.

a The coloured curve indicates the binodal obtained with the Maxwell construction for f = fsol + fgel, together with the estimate ϕsg(T) (black line, defined in Eq. (D2)) for the transition between homogeneous and gel states. b-d Maxwell construction for three different temperature values; the coloured and the black dashed curves represent convex and concave branches of f, respectively. Parameters are the same as in Fig. 3, see Table I.

Appendix E Mutual feedback between phase separation and assembly equilibria

We first discuss how assemblies can shape the phase diagram. For linear assemblies (d = 1) belonging to Class 1, assemblies facilitate phase separation. Indeed, as illustrated in Fig. 8a-b, increasing the relative assembly strength, i.e., decreasing eint, leads to an upshift in critical temperature and a downshift in critical volume fraction. This trend can be explained by considering that assembly formation, even if energetically disfavoured, reduces the mixing entropy (see the first term in Eq. (1)). In Fig. 8a, we show the binodal lines corresponding to three representative values of the assembly strength: eint = 0, −1, −2. We compare them to the black curve, which corresponds to a binary mixture made of monomers and solvent only. This reference case can be thought of as the limiting case in which assemblies have an infinite energy penalty, i.e., eint → ∞. In Fig. 8b, we quantify the changes in critical temperature and critical volume fraction as a function of the relative assembly strength eint. In Fig. 8c-d, we illustrate the behaviour of linear assemblies (d = 1) belonging to Class 2. In contrast to Class 1, assemblies can suppress phase separation. Indeed, when assemblies are made more favourable by decreasing eint, the critical temperature decreases and, even though the critical volume fraction also decreases, the binodal shrinks, see Fig. 8c. In Fig. 8d, we display the variations of the critical temperature and critical volume fraction as a function of the relative assembly strength eint.

The influence of assemblies on the system phase behaviour.

a Focusing on systems with d = 1 belonging to class 1, we compare three binodals corresponding to assembly strength eint = 0.5, −1, −2 (coloured curves) and the reference binary mixture composed of monomers and solvent only (black curve). The latter can be associated with the limit eint/χ → ∞. The region enclosed by the binodal, corresponding to phase separation, expands even for assemblies with no assembly energy, eint = 0. This can be explained by the entropic advantage caused by size polydispersity. b Dependence of the critical volume fraction and critical temperature on the assembly strength eint. The presence of assemblies causes Tc and ϕc to deviate from the reference values (black dashed lines) corresponding to a binary mixture with monomers and solvent only (eint/χ → ∞). In particular, for Class 1, making assemblies more energetically favourable, i.e., decreasing eint, induces an increase in Tc and a decrease in ϕc, in turn making phase separation more accessible. Here sint/kB = 2, M = ∞. c Comparison between three binodal lines corresponding to systems belonging to class 2 and d = 1, with assembly energies eint = 0, −0.5, −1 (coloured curves) and the reference binary mixture composed of monomers and solvent only (black curve). d For Class 2, decreasing eint causes Tc and ϕc to decrease, overall hindering phase separation. This is caused by the screening of the interaction propensity for monomers in the bulk of assemblies belonging to class 2, see Eq. (10). Here sint/kB = 2, M = ∞. χ= 0.2χ.

Fig. 8 clearly shows that the presence of assemblies affects the phase equilibrium of a mixture. We now show that, in turn, the total number of assemblies can differ between phase-separating and homogeneous systems with the same total protein volume fraction. To this end, we fix the interaction propensity χ, the temperature T/T0, and the total macromolecule volume fraction ϕtot to values corresponding to two-phase coexistence at thermodynamic equilibrium. We then compare the assembly size distribution (after averaging over both phases) with the distribution in the corresponding homogeneous state, with the same values of T and ϕtot. Recall that, due to our choice of interaction propensity scaling in Eq. (9), the size distribution in the homogeneous system, Eq. (4), does not depend on χ. For this reason, the homogeneous state can be thought of either as an unstable state corresponding to the same χ as the phase-separating one, which has not reached phase equilibrium yet, or as the equilibrium state of a system with the same parameters as the phase-separating one, but formed by assemblies that do not interact with the solvent (χ = 0).

In Fig. 9a, we display results for linear assemblies (d = 1) with T/T0 = 0.2 and ϕtot = 0.016. We compare the size distribution in the homogeneous system with the weighted average over compartments, defined as

The influence of phase separation on assembly size.

a Comparison between the size distribution in a homogeneous system, and in the corresponding phase-separated system (averaged in both compartments). Here, we consider linear assemblies (d = 1), M→ ∞, ϕtot = 0.016 and T/T0 = 0.2. We note that the presence of compartments can favour assembly formation, even when the corresponding homogeneous mixture is populated mainly by monomers. The difference in distributions can be quantified utilizing the functional distance, defined in Eq. (E2). b The magnitude of this distance depends on the droplet size and the temperature chosen. The volume corresponding to the maximum distribution distance shifts towards lower values with decreasing temperature T/T0. The distributions separated by the maximum distance, for T/T0 = 0.2, are the ones displayed in a. eint =0.5, sint/kB =2, T/T0 = 0.25

in the corresponding phase-separated system. Clearly, the two distributions differ, showing that the presence of compartments can lead to larger assemblies. The difference in size distributions can be quantified utilizing the so-called total variation distance, defined as

This quantity characterizes the distance between two normalised functions as the largest possible distance among values that they assign to the same argument. The distance between the homogeneous size distribution and the distribution defined in Eq. (E1) depends on the temperature T and the total volume fraction ϕtot, which in turn determines the droplet size. In Fig. 9b, we display distribution distances corresponding to different temperatures and droplet volumes. In the limits V I/V → 0 and V I/V → 1, the system becomes homogeneous. As a result, the distribution distance vanishes. Note that the volume corresponding to the maximum distribution distance shifts towards lower values with decreasing temperature.
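The comparison can be sketched as follows. Eqs. (E1) and (E2) are not legible in this copy, so the code assumes a volume-weighted average over the two compartments and the standard discrete form of the total variation distance, half the sum of absolute differences between the two normalised distributions, which equals the largest difference in probability that the two distributions assign to the same set of sizes. The actual definitions may differ in normalisation, and the function names are hypothetical.

```python
def compartment_average(phi_I, phi_II, v_I):
    """Volume-weighted average of two size distributions; v_I = V_I / V."""
    return [v_I * a + (1.0 - v_I) * b for a, b in zip(phi_I, phi_II)]


def total_variation_distance(p, q):
    """Total variation distance between two discrete distributions.

    Both inputs are normalised to unit sum before they are compared.
    """
    sum_p, sum_q = sum(p), sum(q)
    return 0.5 * sum(abs(a / sum_p - b / sum_q) for a, b in zip(p, q))
```

In the limits `v_I = 0` and `v_I = 1`, `compartment_average` returns the distribution of a single phase, so when that phase coincides with the homogeneous state the distance to the homogeneous distribution is zero, in line with the vanishing distance noted above.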

Appendix F Assembly kinetics

1. Assembly kinetics in homogeneous mixtures

In this section, we give the details on the kinetic theory for assembly in non-dilute homogeneous systems that can relax toward chemical equilibrium. Each component i follows

dϕi/dt = ri . (F1)

The assembly rates ri read

These rates conserve the total volume fraction ϕtot, i.e., ∂tϕtot = ∑i ri = 0. The assembly flux between two assemblies of size i and j, and the combined (i + j)-mer reads

and is determined by differences in chemical potential per monomer.
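The conservation property of the rates can be illustrated with a simplified stand-in. The fluxes in the theory are determined by chemical potential differences, and their explicit form is not legible in this copy; the sketch below swaps in a plain mass-action flux with constant, size-independent rate constants (`k_plus` and `k_minus` are hypothetical) for the reaction i + j ⇌ (i + j). It is meant only to show how pairwise fluxes combine into rates ri that sum to zero.

```python
def assembly_rates(phi, k_plus=1.0, k_minus=0.1):
    """Rates r_i = d(phi_i)/dt for reversible binary association i + j <-> (i + j).

    phi[i - 1] is the volume fraction of i-mers. With the monomer volume set
    to one, the number concentration of i-mers is c_i = phi_i / i.
    """
    M = len(phi)
    c = [phi[i - 1] / i for i in range(1, M + 1)]
    dc = [0.0] * M
    for i in range(1, M + 1):
        for j in range(i, M - i + 1):  # unordered pairs with i + j <= M
            # Net mass-action flux from an i-mer and a j-mer to an (i + j)-mer.
            flux = k_plus * c[i - 1] * c[j - 1] - k_minus * c[i + j - 1]
            dc[i - 1] -= flux
            dc[j - 1] -= flux
            dc[i + j - 1] += flux
    # Convert number-concentration rates back to volume-fraction rates.
    return [i * dc[i - 1] for i in range(1, M + 1)]
```

Each flux removes volume i + j from the reactants and adds the same volume to the product, so the sum of the returned rates vanishes up to rounding for any input, for example `assembly_rates([0.1, 0.05, 0.02, 0.01])`.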

We recast the assembly flux in Eq. (F2) as

to recover a finite flux in the limit ϕi≪1. In the literature, Fij is known as the fragmentation kernel. For linear assemblies (d = 1) belonging to Class 1, we find a constant kernel, in agreement with standard polymerization models [15]. For linear assemblies (d = 1) belonging to Class 2, we find that the fragmentation kernel is still size independent but now depends on the total monomer volume fraction ϕtot. Following again Ref. [15], we can express the time evolution of ϕi(t) via

where γ is the following function of the fragmentation kernel F

Appendix G Assembly kinetics in phase-separated systems

Here, we generalise the assembly kinetics described in the previous section to the case of phase coexistence. To this end, we focus on passive systems that can relax toward thermodynamic equilibrium. Moreover, we restrict ourselves to systems that are at phase equilibrium at any time during the relaxation kinetics toward thermodynamic equilibrium, following the theory originally developed in Ref. [56]. Chemical kinetics constrained to phase equilibrium is valid if the chemical reaction rates are small compared to diffusion rates. By choosing initial average volume fractions corresponding to two-phase coexistence, we can consider the system volume to be divided into two homogeneous compartments as a result of phase separation. We then study the time evolution of compartment sizes and volume fractions due to chemical reactions, enforcing instantaneous phase equilibrium at all times. To this aim, we start with the variation of particle numbers in compartments I and II:

where are the variations due to chemical reactions and describes the exchange of assemblies between the two phases. Particle conservation during crossing implies . Due to volume conservation in the two-phase, we have

Furthermore, V = V I + V II. We now introduce volume fractions and the rescaled rates and , leading to

which correspond to Eq. (F1) generalised to two-phase coexistence. The rates in both phases are given in Eq. (18). Eq. (G1) and Eq. (G2) can be combined to get . Using the volume conserving properties of the rates,

we finally get

Assembly mass conservation at the interface implies

with the volume dynamics obeying dt(V I + V II) = 0.

The currents enforce that phase equilibrium is satisfied at all times, which can be expressed by taking a time derivative of Eq. (7):

provided that the initial phase volume V I(t = 0) and the initial volume fractions are a solution of Eq. (7). Once expressions for ∂μi/∂ϕj and ∂Π/∂ϕj are calculated, we can derive a set of M + 1 equations by inserting Eq. (G3), Eq. (G4), and Eq. (G5) in Eq. (G6). These equations are linear and enable us to solve for the currents in terms of the volume fractions in both phases and V I/V. We finally have all the ingredients to characterize the dynamics of the volume fractions and of the phase volume V I(t) by integrating Eq. (G3) and Eq. (G4), provided we can solve the initial phase equilibrium problem to find V I(t = 0)/V and the initial volume fractions. This scheme can be used to study the kinetics of a system initially composed of two phases filled by monomers only that relaxes to its thermodynamic equilibrium. An example of such relaxation kinetics is depicted in Fig. 10. Note that the currents restrict the trajectories to lie in the binodal manifold at all times.

Kinetic trajectory in the multicomponent phase diagram. Illustration of the assembly kinetics at phase equilibrium, for systems corresponding to M = 3 and initially composed of monomers only.