Stochastic yield catastrophes and robustness in self-assembly
Abstract
A guiding principle in self-assembly is that, for high production yield, nucleation of structures must be significantly slower than their growth. However, details of the mechanism that impedes nucleation are broadly considered irrelevant. Here, we analyze self-assembly into finite-sized target structures employing mathematical modeling. We investigate two key scenarios to delay nucleation: (i) by introducing a slow activation step for the assembling constituents and, (ii) by decreasing the dimerization rate. These scenarios have widely different characteristics. While the dimerization scenario exhibits robust behavior, the activation scenario is highly sensitive to demographic fluctuations. These demographic fluctuations ultimately disfavor growth compared to nucleation and can suppress yield completely. The occurrence of this stochastic yield catastrophe does not depend on model details but is generic as soon as number fluctuations between constituents are taken into account. On a broader perspective, our results reveal that stochasticity is an important limiting factor for self-assembly and that the specific implementation of the nucleation process plays a significant role in determining the yield.
eLife digest
The self-assembly of a large biological molecule from small building blocks is like finishing a puzzle of magnetic pieces by shaking the box. Even though each piece of the puzzle is attracted to its correct neighbours, the limited control makes it very hard to finish the puzzle in a short amount of time.
The problem becomes even more difficult if several copies of the same puzzle are assembled in one box. If several puzzles start at the same time, the different parts might steal pieces from each other, making it impossible to successfully complete any of the puzzles. This is called a depletion trap. If the box is only shaken and there is no real control over individual pieces, these traps occur at random.
Overcoming these random depletion traps is an important challenge when assembling nanostructures and other artificial molecules designed by humans without wasting many, potentially expensive, components. Previous studies have shown that when multiple copies of the same structure are assembled simultaneously, slowing the rate of initiation increases the yield of correctly-made structures. This prevents new structures from stealing pieces from existing structures before they are fully completed.
Now, Gartner, Graf, Wilke et al. have used a mathematical model to show that changing the way initiation is delayed leads to different yields. This was especially true for small systems where fluctuations in the availability of the different pieces strongly enhanced the initiation of new structures. In these cases, the self-assembly process terminated undesirably with many incomplete structures.
Nanostructures have various applications ranging from drug delivery to robotics. These findings suggest that in order to efficiently assemble biological molecules, the concentrations of the different building blocks need to be tightly controlled. A question for further research is to investigate strategies that reduce fluctuations in the availability of the building blocks to develop more efficient assembly protocols.
Introduction
Efficient and accurate assembly of macromolecular structures is vital for living organisms. Not only must resource use be carefully controlled, but malfunctioning aggregates can also pose a substantial threat to the organism itself (Jucker and Walker, 2013; Drummond and Wilke, 2009). Furthermore, artificial self-assembly processes have important applications in a variety of research areas like nanotechnology, biology, and medicine (Zhang, 2003; Whitesides and Grzybowski, 2002; Whitesides et al., 1991). In these areas, we find a broad range of assembly schemes. For example, while a large number of viruses assemble capsids from identical protein subunits, some others, like the Escherichia virus T4, form highly complex and heterogeneous virions encompassing many different types of constituents (Zlotnick et al., 1999; Zlotnick, 2003; Hagan, 2014; Leiman et al., 2010). Furthermore, artificially built DNA structures can reach up to Gigadalton sizes and can, in principle, comprise an unlimited number of different subunits (Ke et al., 2012; Reinhardt and Frenkel, 2014; Gerling et al., 2015; Wagenbauer et al., 2017). Notwithstanding these differences, a generic self-assembly process always includes three key steps: First, subunits must be made available, for example by gene expression, or rendered competent for binding, for example by nucleotide exchange (Alberts and Johnson, 2015; Chen et al., 2008; Whitelam, 2015) (‘activation’). Second, the formation of a structure must be initiated by a nucleation event (‘nucleation’). Due to cooperative or allosteric effects in binding, there might be a significant nucleation barrier (Chen et al., 2008; Jacobs and Frenkel, 2015; Sear, 2007; Lazaro and Hagan, 2016; Hagan and Elrad, 2010). Third, following nucleation, structures grow via aggregation of substructures (‘growth’). 
To avoid kinetic traps that may occur due to irreversibility or very slow disassembly of substructures (Hagan et al., 2011; Grant et al., 2011), structure nucleation must be significantly slower than growth (Zlotnick et al., 1999; Ke et al., 2012; Reinhardt and Frenkel, 2014; Wei et al., 2012; Jacobs et al., 2015; Hagan and Elrad, 2010). Physically speaking, there are no irreversible reactions. However, in the biological context, self-assembly describes the (relatively fast) formation of long-lasting, stable structures. Therefore, at least part of the assembly reactions are often considered to be irreversible on the time scale of the assembly process. In this manuscript we investigate, for a given target structure, whether the nature of the specific mechanism employed in order to slow down nucleation influences the yield of assembled product. To address this question, we examine a generic model that incorporates the key elements of self-assembly outlined above.
Model definition
We model the assembly of a fixed number of well-defined target structures from limited resources. Specifically, we consider a set of different species of constituents that assemble into rings of a fixed size. Systems with a single species are denoted as homogeneous, and systems with several species as heterogeneous: partially heterogeneous if a species occurs more than once per ring, and fully heterogeneous if every position in the ring is occupied by a distinct species. The homogeneous model builds on previous work on virus capsid (Chen et al., 2008; Hagan et al., 2011), linear protein filament assembly (Michaels et al., 2016; Michaels et al., 2017; D'Orsogna et al., 2012) and aggregation and polymerization models (Krapivsky et al., 2010). The heterogeneous model in turn links to previous model systems used to study, for example, DNA-brick-based assembly of heterogeneous structures (Murugan et al., 2015; Hedges et al., 2014; D'Orsogna et al., 2013). We emphasize that, even though strikingly similar experimental realizations of our model exist (Gerling et al., 2015; Wagenbauer et al., 2017; Praetorius and Dietz, 2017), it is not intended to describe any particular system. The ring structure represents a general linear assembly process involving building blocks with equivalent binding properties and resulting in a target of finite size. The main assumption in the ring model is that the different constituents assemble linearly in a sequential order. In many biological self-assembling systems, like bacterial flagellum assembly or biogenesis of the ribosome subunits, the assumption of a linear binding sequence appears to be justified (Peña et al., 2017; Chevance and Hughes, 2008). In order to test the validity of our results beyond these constraints, we also perform stochastic simulations of generalized self-assembling systems that do not obey a sequential binding order: (i) by explicitly allowing for polymer-polymer binding and (ii) by considering the assembly of finite-sized squares that grow independently in two dimensions (see Figures 6 and 7).
The assembly process starts with the same number of inactive monomers of each species. The initial concentration of each monomer species is given by this number divided by the reaction volume. Monomers are activated independently at the same per capita rate and, once active, are available for binding. Binding takes place only between constituents of species with periodically consecutive indices, for example species 1 and 2, or the last species and species 1, so that the species sequence repeats periodically along a structure; see Figure 1. To avoid ambiguity, we restrict ring sizes to integer multiples of the number of species. Furthermore, we neglect the possibility of incorrect binding, for example species 1 binding to species 3. Polymers, that is incomplete ring structures, grow via consecutive attachment of monomers. For simplicity, polymer-polymer binding is disregarded at first, as it is typically assumed to be of minor importance (Zlotnick et al., 1999; Chen et al., 2008; Murugan et al., 2015; Haxton and Whitelam, 2013). To probe the robustness of the model, we later consider an extended model including polymer-polymer binding, for which the results are qualitatively the same (see Figure 6 and the discussion). Furthermore, it has been observed that nucleation phenomena play a critical role in self-assembly processes (Ke et al., 2012; Wei et al., 2012; Reinhardt and Frenkel, 2014; Chen et al., 2008). It is therefore in general necessary to take into account a critical nucleation size, which marks the transition between slow particle nucleation and the faster subsequent structure growth (Michaels et al., 2016; Lazaro and Hagan, 2016; Morozov et al., 2009; Murugan et al., 2015). In terms of classical nucleation theory, this critical nucleation size corresponds to the structure size at which the free energy barrier has its maximum. For structures below the critical nucleation size, attachment of monomers to existing structures and decay of structures into monomers (reversible binding) take place at size-dependent reaction rates (Figure 1).
Here, we focus on attachment and decay rates that are identical for all structure sizes below the critical nucleation size. A discussion of the general case is given in Appendix 4. Above the nucleation size, polymers grow by attachment of monomers at a fixed reaction rate per binding site. As we consider successfully nucleated structures to be stable on the observational time scales, monomer detachment from structures above the critical nucleation size is neglected (irreversible binding) (Murugan et al., 2015; Chen et al., 2008). Complete rings neither grow nor decay (absorbing state).
We investigate two scenarios for the control of nucleation speed, first separately and then in combination. For the ‘activation scenario’ we set all binding rates equal and control the assembly process by varying the activation rate. For the ‘dimerization scenario’ all particles are inherently active and we control the assembly process by varying the dimerization rate. It has been demonstrated previously (Chen et al., 2008; Endres and Zlotnick, 2002; Hagan and Elrad, 2010; Morozov et al., 2009) that either a slow activation step or a slow dimerization step is suitable in principle to retard nucleation and favour growth of the structures over the initiation of new ones. We quantify the quality of the assembly process in terms of the assembly yield, defined as the number of successfully assembled ring structures relative to the maximal number of rings that can be built from the available monomers. Yield is measured when all resources have been used up and the system has reached its final state. We do not discuss the assembly time in this manuscript; however, in Appendix 5 we show typical trajectories for the time evolution of the yield in the activation and dimerization scenarios. If the assembly product is stable (absorbing state), the yield can only increase with time. Consequently, the final yield constitutes the upper limit for the yield irrespective of additional time constraints. Therefore, the final yield is an informative and unambiguous observable to describe the efficiency of the assembly reaction.
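As a worked illustration of the yield definition above, the following expression uses notation that is assumed here rather than taken from the text: N for the initial number of monomers per species, S for the number of species, L for the ring size, and n_ring for the number of completed rings. Since each ring consumes L monomers out of N S in total, at most N S / L rings can be formed.

```latex
y \;=\; \frac{n_{\mathrm{ring}}}{N S / L}, \qquad 0 \le y \le 1 .
% Example with assumed numbers: S = 4 species, N = 100 monomers per species
% and ring size L = 8 allow at most N S / L = 50 rings,
% so 40 completed rings correspond to a yield of y = 0.8.
```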
We simulated our system both stochastically via Gillespie’s algorithm (Gillespie, 2007) and deterministically as a set of ordinary differential equations corresponding to chemical rate equations (see Appendix 1).
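To make the stochastic simulation more concrete, the following is a minimal Python sketch of a Gillespie-type simulation for the simplest variant described above: fully heterogeneous rings (ring size equal to the number of species), a critical nucleation size of two, irreversible binding, and growth by monomer attachment only. It is not the code used for the published results. The function names and the symbols `S`, `N`, `alpha`, `mu`, `nu` are assumptions of this sketch, and waiting times are not sampled because only the final state is needed for the yield.

```python
import random
from collections import Counter


def simulate_rings(S, N, alpha, mu, nu, seed=0):
    """One stochastic run for fully heterogeneous rings (ring size L = S >= 3).

    Sketch assumptions: dimers are already stable (critical nucleation size 2),
    binding is irreversible, and polymers grow by monomer attachment only.
    Returns (rings, polymers, active, inactive) in the final absorbing state.
    """
    if S < 3:
        raise ValueError("this sketch requires at least three species")
    rng = random.Random(seed)
    L = S
    inactive = [N] * S
    active = [0] * S
    polymers = Counter()  # (left_species, length) -> number of such polymers
    rings = 0
    while True:
        events = []  # entries: (propensity, kind, payload)
        for s in range(S):
            if inactive[s]:
                events.append((alpha * inactive[s], "activate", s))
            nxt = (s + 1) % S
            if active[s] and active[nxt]:
                events.append((mu * active[s] * active[nxt], "dimerize", s))
        for (left, length), count in polymers.items():
            ends = (("right", (left + length) % S), ("left", (left - 1) % S))
            for kind, need in ends:
                if active[need]:
                    events.append((nu * count * active[need], kind, (left, length)))
        total = sum(rate for rate, _, _ in events)
        if total == 0.0:
            break  # nothing can react any more
        # Choose the next reaction with probability proportional to its propensity.
        r = rng.random() * total
        for rate, kind, payload in events:
            r -= rate
            if r < 0.0:
                break
        if kind == "activate":
            inactive[payload] -= 1
            active[payload] += 1
        elif kind == "dimerize":
            active[payload] -= 1
            active[(payload + 1) % S] -= 1
            polymers[(payload, 2)] += 1
        else:  # monomer attachment at the right or left end
            left, length = payload
            need = (left + length) % S if kind == "right" else (left - 1) % S
            active[need] -= 1
            polymers[payload] -= 1
            if polymers[payload] == 0:
                del polymers[payload]
            new_left = left if kind == "right" else need
            if length + 1 == L:
                rings += 1
            else:
                polymers[(new_left, length + 1)] += 1
    return rings, dict(polymers), active, inactive


def mean_yield(S, N, alpha, mu, nu, runs=20):
    """Average yield over several runs; for L = S at most N rings can form."""
    done = sum(simulate_rings(S, N, alpha, mu, nu, seed=k)[0] for k in range(runs))
    return done / (runs * N)
```

A call such as `mean_yield(5, 10, 0.01, 1.0, 1.0)` averages the yield over independent runs. Recomputing all propensities in every step keeps the sketch short, but it is only practical for small systems.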
Results
Deterministic behavior in the macroscopic limit
First, we consider the macroscopic limit of large particle numbers and investigate how assembly yield depends on the activation rate (activation scenario) and the dimerization rate (dimerization scenario). Here, the deterministic description coincides with the stochastic simulations (Figure 2a and b). For both high activation and high dimerization rates, yield is very poor. Upon decreasing either the activation rate (Figure 2a) or the dimerization rate (Figure 2b), however, we find a threshold value of the respective rate below which a rapid transition to the perfect yield of 1 is observed in both the deterministic and the stochastic simulation. By exploiting the symmetries of the system with respect to relabeling of species, one can show that, in the deterministic limit, the behavior is independent of the number of species when the remaining parameters are held fixed (see Appendix 1). Consequently, all systems behave equivalently to the homogeneous system and yield becomes independent of the number of species in this limit. Note, however, that equivalent systems with differing numbers of species have different total numbers of particles and hence assemble different total numbers of rings.
Decreasing the activation rate reduces the concentration of active monomers in the system. Hence growth of the polymers is favored over nucleation, because growth depends linearly on the concentration of active monomers while nucleation shows a quadratic dependence. Likewise, lower dimerization rates slow down nucleation relative to growth. Both mechanisms therefore restrict the number of nucleation events, and ensure that initiated structures can be completed before resources become depleted (see Figure 2c and d).
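The scaling argument in this paragraph can be written out schematically. The symbols below are assumed notation for this sketch: a for the concentration of active monomers, c_l for the concentration of polymers of size l, and mu and nu for the dimerization and growth rates.

```latex
\text{nucleation flux} \;\propto\; \mu\, a^{2},
\qquad
\text{growth flux} \;\propto\; \nu\, a \sum_{l} c_{l},
\qquad\Longrightarrow\qquad
\frac{\text{growth flux}}{\text{nucleation flux}}
\;\propto\; \frac{\nu}{\mu}\,\frac{\sum_{l} c_{l}}{a}.
```

In this form, lowering the active monomer concentration (slow activation) or lowering the dimerization rate relative to the growth rate (slow dimerization) both increase the ratio of growth to nucleation.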
Mathematically, the deterministic time evolution of the polymer size distribution is described by an advection-diffusion equation (Endres and Zlotnick, 2002; Yvinec et al., 2012) with advection and diffusion coefficients depending on the instantaneous concentration of active monomers (see Appendix 2). Solving this equation results in the wavefront of the size distribution advancing from small to large polymer sizes (Figure 2e). Yield production sets in as soon as the distance travelled by this wavefront reaches the maximal ring size . Exploiting this condition, we find that in the deterministic system for , a non-zero yield is obtained if either the activation rate or the dimerization rate remains below a corresponding threshold value, that is if or , where
(see Appendix 3), each with its own proportionality constant. These relations generalize previous results (Morozov et al., 2009) to finite activation rates and to heterogeneous systems. A comparison between the threshold values given by Equation 1 and the simulated yield curves is shown in Figure 2a,b. The relations highlight important differences between the two scenarios: While the threshold activation rate decreases cubically with the ring size, the threshold dimerization rate does so only quadratically. Furthermore, the threshold activation rate increases with the initial monomer concentration. Consequently, for fixed activation rate, the yield can be optimized by increasing the initial monomer concentration. In contrast, the threshold dimerization rate is independent of the initial monomer concentration, and the yield curves coincide for large particle numbers. Finally, if the activation rate is finite and dimerization is slower than growth, the interplay between the two slow-nucleation scenarios may lead to enhanced yield. This is reflected by a corresponding factor in the threshold activation rate, and we will come back to this point later when we discuss the stochastic effects.
In summary, for large particle numbers, perfect yield can be achieved in two different ways, independently of the heterogeneity of the system: by decreasing either the activation rate (activation scenario) or the dimerization rate (dimerization scenario) below its respective threshold value.
Stochastic effects in the case of reduced resources
Next, we consider the limit where the particle number becomes relevant for the physics of the system. In the activation scenario, we find a markedly different phenomenology if resources are sparse. Figure 3a shows the dependence of the average yield on the activation rate for different, low particle numbers in the completely heterogeneous case. Here, we restrict our discussion to the average yield. The error of the mean is negligible due to the large number of simulations used to calculate the average yield. Still, due to the randomness in binding and activation, the yield can differ between simulations. A figure with the average yield and its standard deviation is shown in Appendix 6. For very low and very high average yield, the standard deviation has to be small due to the boundedness of the yield. For intermediate values of the average, the standard deviation is highest but still small compared to the average yield. Thus, the average yield is meaningful for the essential understanding of the assembly process. Whereas the deterministic theory predicts perfect yield for small activation rates, in the stochastic simulation yield saturates at an imperfect value. Reducing the particle number decreases this saturation value until no finished structures are produced. The magnitude of this effect strongly depends on the size of the target structure if the system is heterogeneous. Figure 3c shows a diagram characterizing different regimes for the saturation value of the yield as a function of the particle number and the size of the target structure for fully heterogeneous systems. We find that the threshold particle number necessary to obtain a fixed yield increases nonlinearly with the target size. For the depicted parameter range, the dependence of the threshold for nonzero yield on the target size can approximately be described by a power law. Consequently, for large targets, more than 10^5 rings must already be assembled in order to obtain a yield larger than zero.
In Appendix 8 we included two additional plots that show the dependence of on for fixed and the dependence on for fixed , respectively. The suppression of the yield is caused by fluctuations (see explanation below) and is not captured by a deterministic description. Because these stochastic effects can decrease the yield from a perfect value in a deterministic description to zero (see Figure 3a), we term this effect ‘stochastic yield catastrophe’. For fixed target size and fixed maximum number of target structures , increases with decreasing number of species, see Figure 3d. In the fully homogeneous case, , a perfect yield of 1 is always achieved for . The decrease of the maximal yield with the number of species thus suggests that, in order to obtain high yield, it is beneficial to design structures with as few different species as possible. In large part this effect is due to the constraint , whereby the more homogeneous systems (small ) require larger numbers of particles per species and, correspondingly, exhibit less stochasticity. If is fixed instead of , the yield still initially decreases with increasing number of species but then quickly reaches a stationary plateau and gets independent of for , see Appendix 7. Moreover, increasing the nucleation size , and with it the reversibility of binding, also increases , see Figure 3(d). This indicates that, beside heterogeneity of the target structure, irreversibility of binding on the relevant time scale makes the system susceptible to stochastic effects.
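The power-law dependence of the threshold particle number on the target size, mentioned earlier in this section, can be estimated from simulation output by a straight-line fit in log-log coordinates. The sketch below shows one generic way to do this. The function name `fit_power_law` is hypothetical, and the numbers are synthetic values generated from an exact power law purely to exercise the code; they are not data from this study.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of log(y) = log(A) + xi * log(x); returns (A, xi)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    xi = sxy / sxx                          # slope in log-log space = exponent
    A = math.exp(mean_y - xi * mean_x)      # intercept gives the prefactor
    return A, xi


# Synthetic illustration only (NOT simulation results): thresholds generated
# from an exact power law with prefactor 3 and exponent 2.
sizes = [10, 20, 40, 80]                    # hypothetical target sizes
thresholds = [3.0 * s ** 2 for s in sizes]  # hypothetical threshold particle numbers
A, xi = fit_power_law(sizes, thresholds)
```

With real simulation output, `sizes` and `thresholds` would be replaced by the measured target sizes and the corresponding threshold particle numbers.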
The stochastic yield catastrophe is mainly attributable to fluctuations in the number of active monomers. In the deterministic (mean-field) equation the different particle species evolve in balanced stoichiometric concentrations. However, if activation is much slower than binding, the number of active monomers present at any given time is small, and the mean-field assumption of equal concentrations is violated due to fluctuations (for ). Activated monomers then might not fit any of the existing larger structures and would instead initiate new structures. Figure 4a illustrates this effect and shows how fluctuations in the availability of active particles lead to an enhanced nucleation and, correspondingly, to a decrease in yield. Due to the effective enhancement of the nucleation rate, the resulting polymer size distribution has a higher amplitude than that predicted deterministically (Figure 4b) and the system is prone to depletion traps. A similar broadening of the size distribution has been reported in the context of stochastic coagulation-fragmentation of identical particles (D'Orsogna et al., 2015).
In the dimerization scenario, in contrast, there is no stochastic activation step. All particles are available for binding from the outset. Consequently, stochastic effects do not play an essential role in the dimerization scenario and perfect yield can be reached robustly for all system sizes, regardless of the number of species (Figure 3(b)).
Non-monotonic yield curves for a combination of slow dimerization and activation
So far, the two implementations of the ‘slow nucleation principle’ have been investigated separately. Surprisingly, we observe counter-intuitive behavior in a mixed scenario in which both dimerization and activation occur slowly. Figure 5 shows that, depending on the ratio of the dimerization rate to the growth rate, the yield can become a non-monotonic function of the activation rate. In the regime where the activation rate is large, nucleation is dimerization-limited; therefore activation is irrelevant and the system behaves as in the dimerization scenario. Upon decreasing the activation rate we then encounter a second regime, where activation and dimerization jointly limit nucleation. The yield increases due to synergism between slow dimerization and activation (see Equation 1), whilst the average number of active monomers is still high and fluctuations are negligible. Finally, a stochastic yield catastrophe occurs if the activation rate is further reduced and activation becomes the limiting step. This decline is caused by an increase in nucleation events due to relative fluctuations in the availability of the different species (‘fluctuations between species’). This contrasts with the deterministic description, in which nucleation is always slower for smaller activation rates. Depending on this ratio, the ring size and the particle number, maximal yield is obtained either in the dimerization-limited (red curves, Figure 5), activation-limited (blue curve, Figure 5b) or intermediate regime (green and orange curves, Figure 5).
Robustness of the results to model modifications
In our model, the reason for the stochastic yield catastrophe is that, due to fluctuations between species, the effective nucleation rate is strongly enhanced. Hence, if binding to a larger structure is temporarily impossible, activated monomers tend to initiate new structures, causing an excess of structures that ultimately cannot be completed. Natural questions that arise are whether (i) relaxing the constraint that polymers cannot bind other polymers or (ii) abandoning the assumption of a linear assembly path will resolve the stochastic yield catastrophe. To answer these questions, we performed stochastic simulations for extensions of our model system, showing that the stochastic yield catastrophe indeed persists. We start by considering the ring model from the previous section but take polymer-polymer binding into account in addition to growth via monomer attachment (Figure 6). In detail, we assume that two structures of arbitrary size, whose combined length does not exceed the ring size, bind at a fixed rate if they fit together, that is if the left (right) end of the first structure is periodically continued by the right (left) end of the second one. Realistically, the rate of binding between two structures is expected to decrease as the structures become larger and thus less motile. In order to assess the effect of polymer-polymer binding, we focus on the worst case where the rate for binding is independent of the size of both structures. If a stochastic yield catastrophe occurs for this choice of parameters, we expect it to be even more pronounced in all the ‘intermediate cases’. Figure 6 shows the dependence of the yield on the activation rate in the polymer-polymer model. As before, yield increases below a critical activation rate and then saturates at an imperfect value for small activation rates. Decreasing the number of particles per species decreases this saturation value.
Compared to the original model, the stochastic yield catastrophe is mitigated but still significant: For structures of size , yield saturates at around 0.87 for particles per species and at around 0.33 for particles per species. We thus conclude that polymer-polymer binding indeed alleviates the stochastic yield catastrophe but does not resolve it. Since binding only happens between consecutive species, structures with overlapping parts intrinsically can not bind together and depletion traps continue to occur. Taken together, also in the extended model, fluctuations in the availability of the different species lead to an excess of intermediate-sized structures that get kinetically trapped due to structural mismatches. Note that in the extreme case of , incomplete polymers can always combine into one final ring structure so that in this case the yield is always 1. Analogously, for high activation rates yield is improved for compared to (Figure 6b).
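The fitting condition for polymer-polymer binding described above can be made explicit with a small Python sketch. The representation of a polymer as a pair `(left_species, length)`, the 0-based species indices, and the function names are assumptions of this sketch. The requirement that the combined length must not exceed the ring size is likewise assumed here, on the grounds that no structure can be longer than a complete ring.

```python
def continues(first, second, S, L):
    """True if `second` can attach to the right end of `first`.

    Each polymer is (left_species, length), with species indexed 0..S-1 and
    consecutive species along the polymer (periodic in S). L is the ring size.
    """
    left_a, len_a = first
    left_b, len_b = second
    if len_a + len_b > L:
        return False                          # combined structure would exceed a ring
    return (left_a + len_a) % S == left_b % S  # right end of first continued by second


def fit_together(p, q, S, L):
    """Two polymers fit if either one periodically continues the other."""
    return continues(p, q, S, L) or continues(q, p, S, L)


# Fully heterogeneous example with S = L = 6:
adjacent = fit_together((0, 3), (3, 2), S=6, L=6)     # species 0-2 followed by 3-4
overlapping = fit_together((0, 3), (2, 3), S=6, L=6)  # both contain species 2
too_long = fit_together((0, 4), (4, 3), S=6, L=6)     # ends match, but 7 > 6
```

In the overlapping case neither end is continued by the other polymer, which corresponds to the structural mismatch that leaves such polymers kinetically trapped.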
Kinetic trapping due to structural mismatches can occur in every (partially) irreversible heterogeneous assembly process with finite-sized target structure and limited resources. From our results, we thus expect a stochastic yield catastrophe to be common to such systems. In order to further test this hypothesis, we simulated another variant of our model where finite-sized squares assemble via monomer attachment from a pool of initially inactive particles, see Figure 7. In contrast to the original model, the assembled structures are non-periodic and exhibit a non-linear assembly path where structures can grow independently in two dimensions. While the ring model assumes a sequential order of binding of the monomers, the square allows for a variety of distinct assembly paths that all lead to the same final structure. Note that, because of the absence of periodicity, the square model is only well defined for the completely heterogeneous case. Figure 7 depicts the dependence of the yield on the activation rate for the square model. Also in this case, we find that the yield saturates at an imperfect value for small activation rates. Hence, we showed that the stochastic yield catastrophe is resolved neither by accounting for polymer-polymer combination nor by considering more general assembly processes with multiple parallel assembly paths. This observation supports the general validity of our findings and indicates that stochastic yield catastrophes are a general phenomenon of (partially) irreversible and heterogeneous self-assembling systems that occur if particle number fluctuations are non-negligible.
Discussion
Our results show that different ways to slow down nucleation are indeed not equivalent, and that the explicit implementation is crucial for assembly efficiency. Susceptibility to stochastic effects is highly dependent on the specific scenario. Whereas systems for which dimerization limits nucleation are robust against stochastic effects, stochastic yield catastrophes can occur in heterogeneous systems when resource supply limits nucleation. The occurrence of stochastic yield catastrophes is not captured by the deterministic rate equations, for which the qualitative behavior of both scenarios is the same. Therefore, a stochastic description of the self-assembly process, which includes fluctuations in the availability of the different species, is required. The interplay between stochastic and deterministic dynamics can lead to a plethora of interesting behaviors. For example, the combination of slow activation and slow nucleation may result in a non-monotonic dependence of the yield on the activation rate. While deterministically, yield is always improved by decreasing the activation rate, stochastic fluctuations between species strongly suppress the yield for small activation rate by effectively enhancing the nucleation speed. This observation clearly demonstrates that a deterministically slow nucleation speed is not sufficient in order to obtain good yield in heterogeneous self-assembly. For example, a slow activation step does not necessarily result in few nucleation events although deterministically this behavior is expected. Thus, our results indicate that the slow nucleation principle has to be interpreted in terms of the stochastic framework and have important implications for yield optimization.
We showed that demographic noise can cause stochastic yield catastrophes in heterogeneous self-assembly. However, other types of noise, such as spatiotemporal fluctuations induced by diffusion, are also expected to trigger stochastic yield catastrophes. Hence, our results have broad implications for complex biological and artificial systems, which typically exhibit various sources of noise. We characterize conditions under which stochastic yield catastrophes occur, and demonstrate how they can be mitigated. These insights could usefully inform the design of experiments to circumvent yield catastrophes: In particular, while slow provision of constituents is a feasible strategy for experiments, it is highly susceptible to stochastic effects. On the other hand, irrespective of its robustness to stochastic effects, the experimental realization of the dimerization scenario relies on cooperative or allosteric effects in binding, and may therefore require more sophisticated design of the constituents (Sacanna et al., 2010; Zeravcic et al., 2017). Our theoretical analysis shows that stochasticity can be alleviated either by decreasing heterogeneity (presumably lowering realizable complexity) or by increasing reversibility (potentially requiring fine-tuning of bond strengths and reducing the stability of the assembly product). Alternative approaches to control stochasticity include the promotion of specific assembly paths (Murugan et al., 2015; Gartner, Graf and Frey, in preparation) and the control of fluctuations (Graf, Gartner and Frey, in preparation). One possibility to test these ideas and the ensuing control strategies could be via experiments based on DNA origami. Instead of building homogeneous ring structures as in Wagenbauer et al. (2017), one would have to design heterogeneous ring structures made from several different types of constituents with specified binding properties. 
By varying the opening angle of the ‘wedges’ (and thus the preferred number of building blocks in the ring) and/or the number of constituents, both the target structure size as well as the heterogeneity of the target structure could be controlled.
Moreover, the ideas presented in this manuscript are relevant for the understanding of intracellular self-assembly. In cells, provision of building blocks is typically a gradual process, as synthesis is either inherently slow or an explicit activation step, such as phosphorylation, is required. In addition, the constituents of the complex structures assembled in cells are usually present in small numbers and subject to diffusion. Hence, stochastic yield catastrophes would be expected to have devastating consequences for self-assembly, unless the relevant cellular processes use elaborate control mechanisms to circumvent stochastic effects. Further exploration of these control mechanisms should enhance the understanding of self-assembly processes in cells and help improve synthesis of complex nanostructures.
Materials and methods
All our simulation data was generated with either C++ or MATLAB. The source code is available at the eLife website.
Here we show the derivation of Equation 1 in the main text, giving the threshold values for the rate constants below which finite yield is obtained. The details can be found in Appendices 1–3.
Master equation and chemical rate equations
We start with the general Master equation and derive the chemical rate equations (deterministic/mean-field equations) for the heterogeneous self-assembly process. We refrain from showing the full Master equation here and instead state the system that describes the evolution of the first moments. To this end, we denote the random variable that describes the number of polymers of size $\ell$ and species $s$ in the system at time $t$ by $n_\ell^s(t)$ with $2 \leq \ell < L$ and $1 \leq s \leq S$. The species of a polymer is defined by the species of the respective monomer at its left end. Furthermore, $n_0^s$ and $n_1^s$ denote the number of inactive and active monomers of species $s$, respectively, and $n_L$ the number of complete rings. We signify the reaction rate for binding of a monomer to a polymer of size $\ell$ by $\nu_\ell$. $\alpha$ denotes the activation rate and $\delta_\ell$ the decay rate of a polymer of size $\ell$. By $\langle \cdot \rangle$ we indicate (ensemble) averages. The system governing the evolution of the first moments (the averages) of the $n_\ell^s$ is then given by:
The different terms of this equation are illustrated graphically in Figure 8. The first equation describes loss of inactive particles due to activation at rate . Equation 2b gives the temporal change of the number of active monomers that is governed by the following processes: activation of inactive monomers at rate , binding of active monomers to the left or to the right end of an existing structure of size at rate , and decay of below-critical polymers of size into monomers at rate (disassembly). Equations 2c and 2d describe the dynamics of dimers and larger polymers of size , respectively. The terms account for reactions of polymers with active monomers (polymerization) as well as decay in the case of below-critical polymers (disassembly). The indicator function equals 1 if the condition is satisfied and 0 otherwise. Note that a polymer of size can grow by attaching a monomer to its left or to its right end whereas the formation of a dimer of a specific species is only possible via one reaction pathway (dimerization reaction). Finally, polymers of length – the complete ring structures – form an absorbing state and, therefore, include only the respective gain terms (cf Equation 2e).
We simulated the Master equation underlying Equation 2 stochastically using Gillespie’s algorithm. For the following deterministic analysis, we neglect correlations between particle numbers, which is a valid assumption for large particle numbers. Then the two-point correlator can be approximated as the product of the corresponding mean values (mean-field approximation)
Furthermore, for the expectation values it must hold
because all species have equivalent properties (there is no distinct species) and hence the system is invariant under relabelling of the upper index. By
we denote the concentration of any monomer or polymer species of size , where is the reaction volume. Due to the symmetry formulated in Equation 4, the heterogeneous assembly process decouples into a set of identical and independent homogeneous assembly processes in the deterministic limit. The corresponding homogeneous system then is described by the following set of equations that is obtained by applying (Equation 3, Equation 4) and (Equation 5) to (Equation 2)
The rate constants in Equations 6 and 2 differ by a factor of . For convenience, we use however the same symbol in both cases. The rate constants in Equation 6 can be interpreted in the usual units . Due to the symmetry, the yield, which is given by the quotient of the number of completely assembled rings and the maximum number of complete rings, becomes independent of the number of species
Hence, it is enough to study the dynamics of the homogeneous system, Equation 6, to identify the condition under which non-zero yield is obtained.
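The stochastic simulation mentioned above can be illustrated with a minimal Gillespie sketch. It is not the authors' C++/MATLAB code. It assumes, for brevity, that the ring size equals the number of species (`L = S`, every species occurs once per ring), irreversible binding, and rate constants given per particle pair. The function name `gillespie_ring_assembly` and all variable names are hypothetical.

```python
import random

def gillespie_ring_assembly(S, N, alpha, mu, nu, seed=0):
    """Gillespie sketch of irreversible ring assembly with ring size L = S (assumption)."""
    if S < 3:
        raise ValueError("this sketch assumes at least three species")
    rng = random.Random(seed)
    inactive = [N] * S          # inactive monomers per species
    active = [0] * S            # active monomers per species
    polymers = {}               # (leftmost species, length) -> number of such polymers
    rings, t = 0, 0.0
    while True:
        reactions = []          # list of (propensity, (kind, species, length))
        def add(rate, event):
            if rate > 0.0:
                reactions.append((rate, event))
        for s in range(S):
            add(alpha * inactive[s], ("activate", s, 0))
            # dimerization of neighbouring species s and s+1
            add(mu * active[s] * active[(s + 1) % S], ("dimerize", s, 0))
        for (s, l), count in polymers.items():
            # a polymer grows by the species that fits its right or its left end
            add(nu * count * active[(s + l) % S], ("grow_right", s, l))
            add(nu * count * active[(s - 1) % S], ("grow_left", s, l))
        total = sum(rate for rate, _ in reactions)
        if total <= 0.0:
            break               # absorbing state: nothing can react any more
        t += rng.expovariate(total)            # exponentially distributed waiting time
        pick, acc = rng.random() * total, 0.0  # choose a reaction proportional to its rate
        for rate, (kind, s, l) in reactions:
            acc += rate
            if acc >= pick:
                break
        if kind == "activate":
            inactive[s] -= 1
            active[s] += 1
        elif kind == "dimerize":
            active[s] -= 1
            active[(s + 1) % S] -= 1
            polymers[(s, 2)] = polymers.get((s, 2), 0) + 1
        else:
            polymers[(s, l)] -= 1
            if kind == "grow_right":
                active[(s + l) % S] -= 1
                new_s = s
            else:
                active[(s - 1) % S] -= 1
                new_s = (s - 1) % S
            if l + 1 == S:
                rings += 1      # completed rings are absorbing
            else:
                polymers[(new_s, l + 1)] = polymers.get((new_s, l + 1), 0) + 1
    return {"yield": rings / N, "rings": rings, "time": t,
            "active": active, "inactive": inactive, "polymers": polymers}
```

Since only the final yield is of interest here, the waiting time `t` is tracked for completeness and does not influence which reaction fires. Averaging the returned yield over many seeds gives an estimate of the mean yield for one parameter set.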
Effective description by an advection-diffusion equation
The dynamical properties of the evolution of the polymer-size distribution become evident if the set of ODEs, Equation 6, is rewritten as a partial differential equation. This approach was previously described in the context of virus capsid assembly (Zlotnick et al., 1999; Morozov et al., 2009). For simplicity, we restrict ourselves to the case $L_{\text{nuc}} = 2$ and let $\nu_1 = \mu$ and $\nu_{\ell \geq 2} = \nu$. Then, for the polymers with $3 \leq \ell < L$ we have
$$\partial_t c_\ell = 2 \nu c_1 \left[ c_{\ell-1} - c_\ell \right]. \tag{8}$$
As a next step, we approximate the index $\ell$ indicating the length of the polymer as a continuous variable $x$ and define $c(x = \ell, t) := c_\ell(t)$. By $A := c_1$ we denote the concentration of active monomers in the following to emphasize their special role. Formally expanding the right-hand side of Equation 8 in a Taylor series up to second order
$$c(x-1, t) = c(x,t) - \partial_x c(x,t) + \tfrac{1}{2} \partial_x^2 c(x,t) + \dots \tag{9}$$
one arrives at the advection-diffusion equation with both advection and diffusion coefficients depending on the concentration of active monomers
$$\partial_t c(x,t) = -2 \nu A(t)\, \partial_x c(x,t) + \nu A(t)\, \partial_x^2 c(x,t). \tag{10}$$
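Equation 10 can be written in the form of a continuity equation $\partial_t c = -\partial_x J$ with flux $J = 2 \nu A c - \nu A\, \partial_x c$. The flux at the left boundary $x = 2$ equals the influx of polymers due to dimerization of free monomers, $J(2,t) = \mu A^2$. This enforces a Robin boundary condition at $x = 2$
$$2 \nu A\, c(2,t) - \nu A\, \partial_x c(2,t) = \mu A^2. \tag{11}$$
At $x = L$ we set an absorbing boundary, $c(L,t) = 0$, so that completed structures are removed from the system. The time evolution of the concentration of active monomers is given by
$$\partial_t A = \alpha C e^{-\alpha t} - 2 \mu A^2 - 2 \nu A \int_2^L c(x,t)\, dx. \tag{12}$$
The terms on the right-hand side account for activation of inactive particles, dimerization, and binding of active particles to polymers (polymerization).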
Qualitatively, Equation 10 describes a profile that emerges at $x = 2$ from the boundary condition Equation 11, moves to the right with time-dependent velocity $2 \nu A(t)$ due to the advection term, and broadens with a time-dependent diffusion coefficient $\nu A(t)$. In Appendices 2–3 we show how the full solution of Equations 10 and 11 can be found assuming knowledge of $A(t)$. Here, we focus only on the derivation of the threshold activation rate $\alpha_{\text{th}}$ and threshold dimerization rate $\mu_{\text{th}}$ that mark the onset of non-zero yield. Yield production starts as soon as the density wave reaches the absorbing boundary at $x = L$. Therefore, finite yield is obtained if the sum of the advectively travelled distance $d_{\text{adv}}$ and the diffusively travelled distance $d_{\text{diff}}$ exceeds the system size $L$
$$d_{\text{adv}} + d_{\text{diff}} \geq L. \tag{13}$$
According to Equation 10, $d_{\text{adv}} = 2 \nu \int_0^\infty A(t)\, dt$ and $d_{\text{diff}} = \sqrt{2 \nu \int_0^\infty A(t)\, dt}$, giving as condition for the onset of finite yield
$$\nu \int_0^\infty A(t)\, dt \;\geq\; \frac{1}{2} \left( L + \frac{1}{2} - \sqrt{L + \frac{1}{4}} \right) \approx \frac{L}{2}, \tag{14}$$
where the last approximation is valid for large $L$.
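In order to obtain $\int_0^\infty A\, dt$ we derive an effective two-component system that governs the evolution of $A$. To this end, we denote the total number of polymers in Equation 12 by $B(t) := \int_2^L c(x,t)\, dx$ (as long as yield is zero the upper boundary is irrelevant and we can consider $L \to \infty$). Equation 12 then reads
$$\partial_t A = \alpha C e^{-\alpha t} - 2 \mu A^2 - 2 \nu A B, \tag{15}$$
and the dynamics of $B$ is determined from the boundary condition, Equation 11
$$\partial_t B = \mu A^2. \tag{16}$$
Measuring $A$ and $B$ in units of the initial monomer concentration $C$ and time in units of $(\nu C)^{-1}$, the equations are rewritten in dimensionless units as
$$\partial_\tau A = \omega e^{-\omega \tau} - 2 \eta A^2 - 2 A B, \qquad \partial_\tau B = \eta A^2, \tag{17}$$
where $\omega = \alpha / (\nu C)$ and $\eta = \mu / \nu$. Equation 17 describes a closed two-component system for the concentration of active monomers $A$ and the total concentration of polymers $B$. It describes the dynamics exactly as long as yield is zero. In order to evaluate the condition (14) we need to determine the integral over $A$ as a function of $\omega$ and $\eta$
$$g(\omega, \eta) := \int_0^\infty A_{\omega,\eta}(\tau)\, d\tau. \tag{18}$$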
To that end, we proceed by looking at both scenarios separately. The numerical analysis, confirming our analytic results, is given in Appendix 3.
Dimerization scenario
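The activation rate in the dimerization scenario is $\alpha \to \infty$, and instead of the term $\omega e^{-\omega \tau}$ in $\partial_\tau A$, we set the initial condition $A(0) = 1$ (and $B(0) = 0$). Furthermore, $\eta \ll 1$ and we can neglect the term proportional to $\eta$ in $\partial_\tau A$. As a result,
$$\frac{dA}{dB} = \frac{\partial_\tau A}{\partial_\tau B} = -\frac{2B}{\eta A}. \tag{19}$$
Solving this equation for $A$ as a function of $B$ using the initial condition $A(B = 0) = 1$, the totally travelled distance of the wave is determined to be
$$g(\eta) = \int_0^\infty A\, d\tau = \int_0^{B_\infty} \frac{dB}{\eta A(B)} = a_\mu\, \eta^{-1/2}, \qquad a_\mu = \frac{\pi}{2\sqrt{2}}, \tag{20}$$
where for the evaluation of the integral we used the substitution $d\tau = dB / (\eta A^2)$.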
Activation scenario
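In the activation scenario, yield sets in only if the activation rate and thus the effective nucleation rate is slow. As a result, in addition to $\omega \ll 1$, we can again neglect the term proportional to $\eta$ in $\partial_\tau A$. This time, however, we have to keep the term $\omega e^{-\omega \tau}$. As a next step, we assume that $\partial_\tau A$ is much smaller than the remaining terms on the right-hand side, $\omega e^{-\omega \tau}$ and $2AB$. This assumption might seem crude at first sight but is justified a posteriori by the solution of the equation (see Appendix 3). Hence, we get the algebraic equation $A = \omega e^{-\omega \tau} / (2B)$. Using it to solve for $B$, and then to determine $A$, the totally travelled distance of the wave is deduced as
$$g(\omega, \eta) = \int_0^\infty A\, d\tau = a_\alpha\, (\eta \omega)^{-1/3}. \tag{21}$$
Taken together, we therefore obtain two conditions out of which one must be fulfilled in order to obtain finite yield
$$\alpha < \alpha_{\text{th}} := P_\alpha \frac{\nu}{\mu} \frac{\nu C}{L^3} \qquad \text{or} \qquad \mu < \mu_{\text{th}} := P_\mu \frac{\nu}{L^2}, \tag{22}$$
where $a_\alpha = \sqrt{\pi}\, \Gamma(2/3) / \left( 2 \cdot 3^{1/3}\, \Gamma(7/6) \right) \approx 0.90$ and $a_\mu = \pi / (2\sqrt{2}) \approx 1.11$ are numerical factors, and $P_\alpha = (2 a_\alpha)^3 \approx 5.77$ and $P_\mu = (2 a_\mu)^2 \approx 4.93$. This verifies Equation 1 in the main text.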
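The two threshold conditions can be evaluated numerically with a small helper. This is a sketch under stated assumptions: the prefactors are derived here from the travelled-distance scalings $g \approx a_\mu \eta^{-1/2}$ and $g \approx a_\alpha (\eta\omega)^{-1/3}$ together with the onset condition $g \geq L/2$ for large $L$, with $\omega = \alpha/(\nu C)$ and $\eta = \mu/\nu$. The closed forms for `A_MU` and `A_ALPHA` and the function name `thresholds` are assumptions of this example, not values quoted from the text.

```python
import math

# Numerical factors of the travelled distance (assumed closed forms, see lead-in).
A_MU = math.pi / (2.0 * math.sqrt(2.0))
A_ALPHA = (math.sqrt(math.pi) * math.gamma(2.0 / 3.0)
           / (2.0 * 3.0 ** (1.0 / 3.0) * math.gamma(7.0 / 6.0)))

# Prefactors that follow from g >= L/2 at the onset of finite yield.
P_MU = (2.0 * A_MU) ** 2
P_ALPHA = (2.0 * A_ALPHA) ** 3

def thresholds(L, nu, C, mu):
    """Return (alpha_th, mu_th) for ring size L, growth rate nu,
    initial monomer concentration C and dimerization rate mu."""
    alpha_th = P_ALPHA * (nu / mu) * nu * C / L ** 3
    mu_th = P_MU * nu / L ** 2
    return alpha_th, mu_th

if __name__ == "__main__":
    alpha_th, mu_th = thresholds(L=100, nu=1.0, C=1.0, mu=1.0)
    print(f"alpha_th = {alpha_th:.3e}, mu_th = {mu_th:.3e}")
```

The cubic dependence of the activation threshold on $L$ versus the quadratic dependence of the dimerization threshold is visible directly in the two return expressions.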
Appendix 1
Chemical reaction equations and the equivalence of models with different numbers of species
In this section we derive the chemical rate equations (deterministic equations) for the self-assembly process as described in the main text. Furthermore, we show that for general in the deterministic limit the model is equivalent to a set of independent assembly processes with only one species.
Homogeneous structures
First, we consider the homogeneous model (). By we denote the concentration of complexes of length () at time , is the concentration of active monomers and the concentration of inactive monomers at time . In the following we will usually skip the time argument for better readability. We denote the reaction rate for binding of a monomer to a polymer of size by . The model from the main text is recovered by setting if , and otherwise. The ensuing set of ordinary differential equations then reads:
The indicator function equals 1 if the condition is satisfied and 0 otherwise. The first equation describes loss of inactive particles due to activation at rate . It is uncoupled from the remainder of the equations and is solved by , with denoting the initial concentration of inactive monomers. The temporal change of the active monomers is governed by the following processes (Equation A1b): activation of inactive monomers at rate , binding of active monomers to existing structures at rate (polymerization), and decay of below-critical polymers into monomers at rate (disassembly). All binding rates appear with a factor of 2 because a monomer can attach to a polymer on its left or on its right end.
Note that there is a subtlety with the dimerization term in Equation A1b: the dimerization term as well bears a factor of 2 because two identical monomers and can form a dimer in two possible ways, either as or . Additionally, there is a stoichiometric factor of 2 for the monomers in this reaction. However, one factor of 2 is cancelled again because, assuming there are monomers, the number of ordered pairs of monomers that describe possible reaction partners is (if is large) rather than (the number of reaction partners when two different species react). This leaves us with a single factor of 2 like for all the other binding reactions.
Equations A1c and A1d describe the dynamics of dimers and larger polymers of size , respectively. The terms account for reactions of polymers with active monomers (polymerization) as well as decay in the case of below-critical polymers (disassembly). The dimerization term in the equation for lacks the factor of 2 because the stoichiometric factor is missing for the dimers as compared with the dimerization term for the monomers in the line above. Finally, polymers of length – the complete ring structures – form an absorbing state and therefore only include a reactive gain term (Equation A1e).
Heterogeneous structures
Next we consider systems with more than one particle species (). The heterogeneous system can be described by dynamical equations equivalent to the homogeneous system. We show this starting from a full description that distinguishes both monomers and polymers into a set of different species . The species of a polymer is defined by the species of the respective monomer at its left end. As polymers assemble in consecutive order of species, a polymer is uniquely determined by its length and species (i.e. species of leftmost monomer). In that sense, with and denotes the concentration of a polymer of length and species ( and again denote inactive and active monomers of species , respectively). For example, denotes the concentration of polymers [5678] if , or of polymers [5612] if . Upper indices are always assumed to be taken modulo whenever they lie outside the range . Therefore, the dynamics of the concentrations with is given by
The terms on the right-hand side account for the influx due to binding of the respective polymers of length with a monomer either on the right or on the left (first and second term), and for the outflux due to reactions of a polymer of length and species with a monomer on the right or on the left (third and fourth term), as well as for decay into monomers for (last term). For the dynamics of the dimers, however, there is only one gain term arising from dimerization:
Equivalently, for the active monomers we find:
Now we exploit the symmetry of the system with respect to the species index, that is, the upper index in : Since all species in the system are equivalent, the dynamic equations are invariant under relabelling of the upper indices. Consequently, it must hold that:
In other words, the upper index is irrelevant and can also be discarded. The variable then denotes the concentration of any one polymer species of length . Taking advantage of this symmetry for the equations of the heterogeneous system, (Equation A2, Equation A3 and Equation A4), and collecting equal terms leads to a set of equations fully identical to those for the homogeneous system (Equation A1). We show the equivalence to the homogeneous model exemplarily for the dynamics of the polymers with size in Equation A2. Applying to Equation A2 yields for the dynamics of the concentration of an arbitrary polymer species of size :
which is identical to the respective dynamic Equation A1d for the homogeneous model. The other equations for the heterogeneous system reduce to those for the homogeneous system in an analogous manner.
Summarizing, we have shown that the (deterministic) heterogeneous assembly process decouples into a set of identical and independent homogeneous processes. In particular, yield, which is given by the quotient of the number of completely assembled rings and the maximal possible number of complete rings, becomes independent of :
Appendix 2
Effective description of the evolution of the polymer size distribution as an advection-diffusion equation
The dynamical properties of the evolution of the polymer size distribution become evident if the set of ODEs, Equation A1, is rewritten as a partial differential equation. This approach was previously described in the context of virus capsid assembly (Morozov et al., 2009; Zlotnick et al., 1999; Endres and Zlotnick, 2002), but we restate the essential steps here for the convenience of the reader. To this end, we interpret the length index $\ell$ of the polymer as a continuous variable that we rename $x$. With such a continuous description in view, we write $c(x,t)$ to denote the concentration of polymers of size $x$.
Since the active monomers play a special role, we denote their concentration in the following by . For simplicity we restrict our discussion to the case and let and . Generalizations to can be done in a similar way. Then, for the polymers with we have:
Formally, expanding the right-hand side in a Taylor series up to second order
we arrive at an advection-diffusion equation with both advection and diffusion coefficients depending on the concentration of active monomers ,
Equation A9 can be written in the form of a continuity equation with flux . The flux at the left boundary, , equals the influx of polymers due to dimerization of free monomers, . This enforces a Robin boundary condition at ,
At , we have an absorbing boundary so that completed structures are removed from the system. Furthermore, the time evolution of the concentration of active particles is given by
The terms on the right-hand side account for activation of inactive particles, dimerization, and binding of active particles to polymers (polymerization).
Qualitatively, Equation A9 describes a profile that emerges at from the boundary condition, Equation A10, moves to the right with time dependent velocity due to the advection term, and broadens with a time-dependent diffusion coefficient . The concentration of active particles determines both the influx of dimers at , as well as the speed and diffusion of the wave profile.
Next, we derive an expression that solves Equation A9, assuming that we know . We start by solving Equation A9 at the left boundary , and then translate the resulting expression to obtain a solution for . To obtain in dependence of we can solve (see Equation A1c) by ’variation of the constants’ as
With help of this expression we find : Given , the advective part of Equation A9,
is solved by
Here, denotes the time when a particle now at position and time was at . In other words, a particle at time and position has entered the system at at time . This ansatz solves the PDE (Equation A13) if and only if satisfies
with being an arbitrary integral of such that and denoting its inverse. More easily, we find this form of by requiring that the integral over the velocity from time to equals the travelled distance :
To include the diffusive contribution in Equation A13, we use the diffusion kernel,
with the time dependent diffusion constant . The kernel accounts for the mass that has been diffusively transported from over a distance of . Because the mass has entered the system at at time , it diffused for the time . The complete expression for is then obtained as the convolution of (Equation A14), that is obtained from Equation A12 and Equation A15, and the diffusion kernel (Equation A17):
Interpreting the terms in the equations and the general form of the solution, we are able to understand the qualitative behavior of the system. If both the activation and the dimerization rate are large, the system produces zero yield: both advection and diffusion are driven by the concentration of active monomers . If activation is fast, the concentration of active monomers will become large initially since activation is faster than the reaction dynamics. Consequently, provided , dimerization dominates over binding because it depends quadratically on , see Equation A11. The reservoir of free particles then depletes quickly and cannot sustain the motion of the wave for long enough to reach the absorbing boundary, resulting in a very low yield. Only if either the activation rate is low enough or if , the motion of the wave can be sustained until it reaches the absorbing boundary.
Appendix 3
Threshold values for the activation and dimerization rate
Based on the analysis from the previous section, we will now determine the threshold activation rate and threshold dimerization rate which mark the onset of non-zero yield. Yield production starts as soon as the density wave reaches the absorbing boundary at . Therefore, finite yield is obtained if and only if the sum of the advectively travelled distance and the diffusively travelled distance exceeds the system size :
The condition for the onset of non-zero yield is obtained by assuming equality in this relation. The advectively travelled distance is obtained from Equation A16 by setting the borders of the integral over the velocity to and :
The diffusively travelled distance is approximately given by the standard deviation of the Gaussian diffusion kernel, Equation A17, again with and ,
Taken together, we obtain a condition for the onset of finite yield:
Substituting and requiring that is positive, we solve the quadratic equation and find that Equation A22 is equivalent to
where the last approximation is valid for large .
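The quadratic step can be written out explicitly. The following is a sketch that assumes, in line with the remark that the advectively travelled distance equals the square of the diffusively travelled distance, that both are controlled by the single quantity $y^2 := 2\nu \int_0^\infty A(t)\,dt$, so that $d_{\text{adv}} = y^2$ and $d_{\text{diff}} = y$. The symbol $y$ is introduced only for this illustration.

```latex
\begin{align}
  y^2 + y &= L
    && \text{(onset condition with equality)} \\
  y &= \tfrac{1}{2}\left(-1 + \sqrt{1 + 4L}\right)
    && \text{(positive root)} \\
  y^2 &= L - y = L + \tfrac{1}{2} - \sqrt{L + \tfrac{1}{4}} \\
  \Rightarrow\quad
  2\nu \int_0^\infty A(t)\,dt &\geq L + \tfrac{1}{2} - \sqrt{L + \tfrac{1}{4}}
    \;\approx\; L
    && \text{(for large } L\text{)}
\end{align}
```

The relative correction to the leading term is of order $L^{-1/2}$, which is why the approximation requires large $L$.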
We determine the threshold values for the activation rate and the dimerization rate by finding solutions of the dynamical equation for the active particles , Equation A11, such that the condition, Equation A23, is fulfilled. Thus, we start by deriving the dependence of on and .
The concentration appears in Equation A11 only in terms of an integral , counting the total number of polymers in the system. As long as yield is zero there is no outflux of polymers at the absorbing boundary and the total number of polymers in the system only increases due to the influx at the left boundary . As long as yield is zero we can therefore equivalently consider the limit . We denote the total number of polymers in Equation A11 by for which the dynamics is determined from the boundary condition, Equation A10:
Hence, as long as yield is zero, the total number of polymers increases with the rate of the dimerization events. The system then simplifies to a set of two coupled ordinary differential equations for and :
The dynamics of and is equivalent to a two-state activator-inhibitor system, where dimerizes into at rate , and degrades (inhibits) at rate . Note that Equation A25 describes the exact dynamics of the active monomers and total number of polymers in the deterministic system as long as yield is zero. The system has therefore been greatly reduced from originally coupled ODEs to now only two coupled ODEs.
For the further analysis it is useful to non-dimensionalize Equation A25 by measuring and in units of the initial concentration of inactive monomers and time in units of :
with the remaining dimensionless parameters and . We are interested in the integral over as a function of and ,
which relates to the totally travelled distance of the wave. Note that, in case of zero yield, is the total advectively travelled distance of the wave (cf. Equation A20) and the square of the diffusively travelled distance (cf. Equation A21).
Analysis of the dimerization scenario
The dimerization scenario is characterized by fast activation and slow dimerization . For the dimensionless parameters these assumptions translate to and . Because for small nucleation is much slower than growth we neglect the dimerization term in Equation A26a against the growth term. Furthermore, because activation happens on a fast time scale compared with nucleation and we may therefore integrate out the fast time scale assuming that all particles are activated instantaneously at the beginning. The system Equation A26 then reduces to
with the initial condition and . We divide the first equation by the second one (formally applying the chain rule and the inverse function theorem) to obtain a single equation for the dynamics of :
where . This first order ODE can be solved by separation of variables and subsequent integration, yielding
Because the number of active monomers must vanish for , the final value of is
Thereby, we calculate the function via variable substitution :
So, the dependence of the travelled distance of the wave on obeys a power law with exponent , confirming the previous result (Morozov et al., 2009). For the coefficient we find .
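The power law can be checked numerically. The sketch below assumes the reduced dimerization-scenario system in the form $\partial_\tau A = -2AB$, $\partial_\tau B = \eta A^2$ with $A(0) = 1$, $B(0) = 0$, and compares the integral of $A$ with the closed form $\pi/(2\sqrt{2})\,\eta^{-1/2}$. Both the explicit form of the system and the coefficient are reconstructed for this example. The classical fourth-order Runge-Kutta scheme is used purely for simplicity.

```python
import math

def travelled_distance_dimerization(eta, h=0.01):
    """Integrate dA/dtau = -2AB, dB/dtau = eta*A^2 (A(0)=1, B(0)=0) with RK4
    and return g = integral of A over tau."""
    # A decays like exp(-sqrt(2*eta)*tau); 20 decay times make the tail negligible.
    tau_max = 20.0 / math.sqrt(2.0 * eta)

    def rhs(a, b):
        # third component accumulates the integral of A
        return (-2.0 * a * b, eta * a * a, a)

    a, b, g = 1.0, 0.0, 0.0
    for _ in range(int(tau_max / h)):
        k1 = rhs(a, b)
        k2 = rhs(a + 0.5 * h * k1[0], b + 0.5 * h * k1[1])
        k3 = rhs(a + 0.5 * h * k2[0], b + 0.5 * h * k2[1])
        k4 = rhs(a + h * k3[0], b + h * k3[1])
        a += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        b += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        g += h * (k1[2] + 2.0 * k2[2] + 2.0 * k3[2] + k4[2]) / 6.0
    return g

def travelled_distance_analytic(eta):
    """Closed form assumed in this sketch: pi / (2*sqrt(2)) * eta**(-1/2)."""
    return math.pi / (2.0 * math.sqrt(2.0)) / math.sqrt(eta)

if __name__ == "__main__":
    for eta in (1e-2, 1e-3):
        print(eta, travelled_distance_dimerization(eta), travelled_distance_analytic(eta))
```

Plotting the numerical values against $\eta$ on a double-logarithmic scale should give a straight line with slope $-1/2$.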
Additionally, we can determine the time dependent solutions and . Using the solution for from Equation A30 in Equation A28b we obtain as
We use this expression for in Equation A28a to obtain . The resulting ODEs can again be solved by separation of variables as
Analysis of the activation scenario
In the activation scenario, , such that and . As we know already that decreasing will slow down nucleation relative to growth we can again neglect the dimerization term in Equation A26a. In contrast to the dimerization scenario, however, we have to keep the activation term. Transforming time via such that and writing and the system in Equation A26 becomes:
with the initial condition . The function transforms as
In the following we derive the asymptotic solution for in the limit of small in order to evaluate the integral in Equation A36. In the limit () both and will become small whereas increases monotonically. The reaction term in Equation A35a is furthermore weighted by a factor which will become large if . We therefore postulate that for sufficiently large the derivative is much smaller than the two terms on the right-hand side of Equation A35a and hence negligible. This assumption has to be justified a posteriori with the obtained solution. Neglecting the derivative term in (Equation A35a) reduces the equation to an algebraic equation and we find
Using this result in Equation A35b we can solve for by separation of variables and subsequent integration:
From Equation A37 we immediately obtain :
where by we denote the part of the solution that depends only on . Hence, we find that and hence also scale like , and will thus become small if and is large enough. Therefore the solution is consistent and justifies the approximation in which we neglected the derivative term in the limit of small and sufficiently large .
Note that consistency of the solution with the approximation is a sufficient criterion for the validity of the approximation: We can solve the system for and in Equation A35 iteratively by defining
Assuming that for , and converge to the correct solutions and when starting with , we obtain and as given by Equation A39 and Equation A38 and can iteratively refine the approximation. The next iteration step then reads: . As we know that the left-hand side will be small and and solve the system if the left-hand side equals 0. Writing and this gives:
From dimensional analysis it follows that the correction terms and must scale like and and are hence much smaller than the first order approximations and . Higher order corrections will give even smaller contributions showing that if , is indeed a very good approximation.
In the limit , however, the expression for in Equation A39 diverges and consistency is violated. Hence, the obtained solution is valid only for sufficiently large .
We fix some small such that the approximation can be assumed to be sufficiently good if . Furthermore, we define such that for all . Using Equation A39 we can write this as for all , where the left-hand side, , depends only on . Hence, by decreasing we can make arbitrarily small: . In order to calculate the integral in Equation A36 can be separated in a domain where the approximation is accurate and a domain where the correct solution deviates strongly from :
We see from Equation A35a that describes an upper bound to showing that . Therefore we can bound the contribution of the first integral as . Because this upper bound for the integral goes to 0 if and hence become small the first integral will become negligible against the second one. Asymptotically, we therefore only need to consider the second integral with the solution for as given by Equation A39:
where we used the substitution and is the (Euler) Gamma function. So, in the limit of small , scales with and with identical exponent . This contrasts the dimerization scenario where as well as and depend only on and are independent of (cf. Equation A32, A33 and A34).
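The Gamma-function coefficient can be cross-checked by direct quadrature. The sketch assumes that the quasi-steady-state solution leads to $g = \tfrac{1}{2}\,(8/(3\eta\omega))^{1/3} \int_0^1 (1-u^2)^{-1/3}\,du$ with $u = e^{-\omega\tau}$, which gives the closed-form prefactor $\sqrt{\pi}\,\Gamma(2/3) / (2 \cdot 3^{1/3}\,\Gamma(7/6))$. These expressions are reconstructed for this example and are not copied from the text. The function names are hypothetical.

```python
import math

def activation_coefficient_closed_form():
    """Prefactor of (eta*omega)**(-1/3), written with Euler Gamma functions (assumed form)."""
    return (math.sqrt(math.pi) * math.gamma(2.0 / 3.0)
            / (2.0 * 3.0 ** (1.0 / 3.0) * math.gamma(7.0 / 6.0)))

def activation_coefficient_quadrature(n=200_000):
    """Same prefactor from a midpoint rule.

    With u = sin(theta), the singular integral int_0^1 (1 - u^2)^(-1/3) du becomes
    the bounded integral int_0^{pi/2} cos(theta)^(1/3) d(theta)."""
    h = (math.pi / 2.0) / n
    integral = h * sum(math.cos((i + 0.5) * h) ** (1.0 / 3.0) for i in range(n))
    return 0.5 * (8.0 / 3.0) ** (1.0 / 3.0) * integral

if __name__ == "__main__":
    print(activation_coefficient_closed_form(), activation_coefficient_quadrature())
```

The trigonometric substitution is a design choice of this sketch: it removes the integrable singularity at $u = 1$, so a plain midpoint rule converges without special treatment of the endpoint.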
Numerical analysis and the threshold values for the rate constants
In order to confirm the results of the last two paragraphs and to see how behaves in the intermediate regime where and are of the same order of magnitude we also investigate the function numerically. For that purpose we numerically integrate the ODE-system for and in Equation A26 for different values of and with a semi-implicit method. Subsequently, we integrate the solution using an adaptive recursive Simpson’s rule. Plotting in dependence of for fixed on a double-logarithmic scale reveals a rather simple bipartite form of , see Appendix 3—figure 1a:
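A minimal version of such a numerical evaluation is sketched below. It deliberately swaps in simpler techniques than those named above: a fixed-step classical Runge-Kutta scheme instead of a semi-implicit method, and accumulation of the integral as a third ODE variable instead of an adaptive Simpson rule. The dimensionless system is assumed to read $\partial_\tau A = \omega e^{-\omega\tau} - 2\eta A^2 - 2AB$, $\partial_\tau B = \eta A^2$ with $A(0) = B(0) = 0$, which is a reconstruction for this example. The step size must resolve the activation time scale $1/\omega$.

```python
import math

def travelled_distance(omega, eta, tau_max, h):
    """Integrate the assumed two-component system with RK4 and return
    g = integral of A from 0 to tau_max."""
    def rhs(tau, a, b):
        da = omega * math.exp(-omega * tau) - 2.0 * eta * a * a - 2.0 * a * b
        db = eta * a * a
        return (da, db, a)      # third entry accumulates the integral of A

    tau, a, b, g = 0.0, 0.0, 0.0, 0.0
    for _ in range(int(round(tau_max / h))):
        k1 = rhs(tau, a, b)
        k2 = rhs(tau + 0.5 * h, a + 0.5 * h * k1[0], b + 0.5 * h * k1[1])
        k3 = rhs(tau + 0.5 * h, a + 0.5 * h * k2[0], b + 0.5 * h * k2[1])
        k4 = rhs(tau + h, a + h * k3[0], b + h * k3[1])
        a += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        b += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        g += h * (k1[2] + 2.0 * k2[2] + 2.0 * k3[2] + k4[2]) / 6.0
        tau += h
    return g

if __name__ == "__main__":
    # Fast activation and slow dimerization: expected to approach the eta**(-1/2) law.
    print(travelled_distance(omega=20.0, eta=1e-4, tau_max=1500.0, h=0.02))
```

Scanning `omega` at fixed `eta` and plotting the result on a double-logarithmic scale is one way to look for the two regimes described in the text. For stiff parameter combinations (very large `omega`) a semi-implicit or adaptive integrator is the better choice.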
The transition between these two regimes is rather sharp so that is best described in a piecewise fashion
Next, we plot the coefficients and against . Here we find that with and is again bipartite with a sharp kink in between (Appendix 3—figure 1b):
where and . The transition between both regimes is at . The second regime is not relevant for self-assembly since it refers to both large and large , hence the travelled distance is too small to give finite yield in this regime. Therefore, we discard the second regime and obtain as final result
with and . This confirms perfectly the exponents as well as the coefficients found in the last two paragraphs. It is, however, surprising that there is such a sharp transition between both regimes, which allows one to define in a piecewise fashion. This behavior must be the result of a series of lower-order terms in which are unimportant in the limits and but cause the sharp transition when and are of the same order of magnitude.
Finally, we return to our original task of finding the threshold values of the activation and dimerization rate for the onset of yield. Using our result for in Equation A23 we find as necessary and sufficient condition to obtain finite yield in the deterministic system:
Alternatively, we can state this result as two separate conditions out of which at least one must be fulfilled to obtain finite yield:
where and . This verifies Equation 1 in the main text.
Appendix 4
Impact of the implementation of sub-nucleation reactions
In the main text we focused our discussion on irreversible binding . In this section we investigate the effect of different implementations of the sub-nucleation reactions.
In general, perfect yield is trivially achieved if the complete ring is the only stable structure. However, yield can be maximal already for smaller nucleation sizes depending on the explicit decay rate . In the deterministic limit without the dimerization and activation mechanisms (, ) a rapid transition from zero yield to perfect yield occurs in dependence of the critical nucleation size (see Appendix 4—figure 1). The threshold value in this case is approximately half the ring size and is weakly affected by the decay rate . In order to obtain finite yield for small nucleation sizes, an extremely high decay rate would be necessary. Hence, maximizing the yield solely by increasing the nucleation size is not very feasible.
In our model, the subcritical reaction rates may take different values. Here, we want to restrict our discussion to two scenarios. First, all rates have an identical value and second, the rates increase linearly up to the super-nucleation reaction rate: .
In the deterministic limit, both implementations show the same qualitative behavior as the dimerization mechanism with in the main text (see Appendix 4—figure 2). The only relevant aspect for the final yield is the extent to which nucleation is slowed down in total. In the constant scenario all reaction steps contribute equally. As a result, there is a strong dependence on the number of such reaction steps, that is, on the critical nucleation size. If, however, the reaction rates increase linearly with the size of the polymers, the dimerization rate dominates. Only in the case finite yield is observed at all. In this limit the dimerization rate is much smaller than the subsequent growth rates. The explicit form of the different is not of major importance for the yield. The total slowdown of nucleation is the central feature. Structure decay does not play any role for intermediate nucleation sizes.
The last question we want to address is how the combination of activation and dimerization mechanism and the corresponding non-monotonic behavior is affected by the nucleation size. Again, we compare constant sub-nucleation growth with a linearly increasing growth rate (see Appendix 4—figure 3). In the deterministic regime both implementations behave qualitatively similarly to the dimerization mechanism discussed in the main text. However, in both cases the stochastic yield catastrophe is less pronounced. For the constant growth rates a saturation of the maximal yield is observed for sufficiently low . If the profile is linear this effect is weaker than in the constant case and a dependency on the explicit value of is still observed. The saturation value is not reached for these reaction rates.
Taking all our results for the sub-nucleation behavior together, we draw the following conclusions: First, structure decay by itself is not very efficient at maximizing yield. Second, the explicit choice of the sub-nucleation rates is of minor importance for the qualitative behavior. The system behaves similarly to the case . Third, larger nucleation sizes mitigate the stochastic yield catastrophe in general.
Appendix 5
Time evolution of the yield in the activation and dimerization scenario
In the main text we focus on the final yield, which represents the maximal yield that can be obtained in the assembly reaction for . Here, we briefly discuss the temporal evolution of the yield in the two scenarios. Appendix 5—figure 1 shows the yield as a function of time for the dimerization scenario (blue) and the activation scenario (red) for the corresponding parameters indicated in the plot. Solid lines show the evolution of the yield in the stochastic simulation, whereas dashed lines represent its deterministic evolution obtained by integrating the corresponding mean-field rate equations (only shown for the activation scenario). In both scenarios, yield production sets in after a short lag time (Hagan and Elrad, 2010). The emergence of a lag time can be understood by interpreting the assembly process as the progression of a travelling wave (see Sec. B). The travelling wave describes the polymer size distribution, and the time needed for the wave to reach the absorbing boundary equals the lag time for yield production observed in Appendix 5—figure 1. After the lag time, the yield increases very abruptly in the dimerization scenario and more gradually in the activation scenario. The reason is that monomers are provided gradually in the activation scenario, so the emerging wave is flatter and extends over a larger range (in polymer size space) than in the dimerization scenario. For the same reason, the dimerization scenario is generally ‘faster’ or more time efficient than the activation scenario. For a detailed analysis of the time efficiency of these and other self-assembly scenarios we refer the reader to our manuscript in preparation (Gartner, Graf and Frey, in preparation).
In all depicted situations, the yield increases monotonically with time. This is, of course, generally true since the completed ring structures define an absorbing state in our system. The final yield, which is indicated in the right bar, therefore represents the upper limit for the yield that can be achieved in the assembly reaction. Appendix 5—figure 1 shows that the temporal yield curves initially are rather steep and quickly reach a value that lies within 10% of the final yield (‘quickly’ thereby refers to the respective time scale), before the curves flatten and increase more slowly. This underlines that the final yield is a meaningful observable that not only describes the upper limit for the yield but also approximates the typical yield of the assembly reaction under appropriate time constraints that are not too restrictive (on the time scale set by the respective lag time).
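The deterministic yield curve described above can be illustrated with a minimal sketch. It is not the authors' Matlab code. It assumes a homogeneous (single-species) mean-field model in which inactive monomers activate at rate `alpha`, two active monomers dimerize at rate `mu`, polymers grow irreversibly by monomer attachment at rate `nu`, and polymers of size `L` are complete and absorbing. The function name, the prefactors in the rate equations, the rescaling of the total concentration to 1, and the explicit Euler integrator are all assumptions made for illustration.

```python
def mean_field_yield(L=10, alpha=0.01, mu=1.0, nu=1.0, dt=0.02, t_max=1000.0):
    """Integrate assumed mean-field rate equations with explicit Euler steps.

    Returns recorded times, the yield L*c[L] at those times, and the final state.
    Concentrations are rescaled so that the total monomer concentration is 1.
    """
    if L < 3:
        raise ValueError("this sketch assumes L >= 3")
    inactive, m = 1.0, 0.0          # inactive and active monomer concentrations
    c = [0.0] * (L + 1)             # c[l]: polymers of size l (indices 2..L used)
    times, yields = [], []
    steps = round(t_max / dt)
    for step in range(steps + 1):
        if step % 100 == 0:         # record a coarse time series
            times.append(step * dt)
            yields.append(L * c[L])
        growing = sum(c[2:L])       # all incomplete polymers can bind a monomer
        d_inactive = -alpha * inactive
        d_m = alpha * inactive - 2.0 * mu * m * m - nu * m * growing
        d_c = [0.0] * (L + 1)
        d_c[2] = mu * m * m - nu * m * c[2]
        for l in range(3, L):
            d_c[l] = nu * m * (c[l - 1] - c[l])
        d_c[L] = nu * m * c[L - 1]  # completed structures are absorbing
        inactive += dt * d_inactive
        m += dt * d_m
        for l in range(2, L + 1):
            c[l] += dt * d_c[l]
    return times, yields, (inactive, m, c)


if __name__ == "__main__":
    times, yields, _ = mean_field_yield()
    for t, y in list(zip(times, yields))[::50]:
        print(f"t = {t:7.1f}   yield = {y:.4f}")
```

Because completed structures never decay in this sketch, the recorded yield can only grow with time, which mirrors the monotonic behavior noted above. The rate equations conserve total mass, which is a convenient sanity check for any such integration.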
Appendix 6
Standard deviation of the yield
In the main text, the analysis focuses on the average yield. A priori it is, however, not apparent that this average quantity is informative, in particular due to the strong effect of stochasticity in the system. Here, we therefore complement this picture by additionally considering a simple measure for the fluctuations of the yield, its standard deviation. Appendix 6—figure 1 is an extension of Figure 3a in the main text, showing the dependence of the average yield and its sample standard deviation on the activation rate. Since the yield is non-negative, the standard deviation of the yield has to be small if the average yield is close to 0 ( in Appendix 6—figure 1). The same holds true for an average yield close to 1, as the yield is bounded by one from above ( in Appendix 6—figure 1). For intermediate values of the average yield, the standard deviation is highest but still small compared to the average yield ( in Appendix 6—figure 1). The average yield is, thus, meaningful. Naturally, the ratio of the standard deviation to the average yield also depends on the number of particles per species and on the number of species . Generally speaking, for higher and , this ratio decreases (see Appendix 7—figure 1 for the dependency on ).
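The argument that the standard deviation must be small near average yields of 0 and 1 can be made concrete. For any quantity bounded between 0 and 1 with mean m, the variance is at most m(1 - m), so the standard deviation is at most sqrt(m(1 - m)). The following sketch computes the average yield, its sample standard deviation, and this bound. The function name and the list of yields are hypothetical placeholders, not data from the article.

```python
import statistics


def summarize_yield(yields):
    """Return mean, sample SD, population SD and the bound sqrt(m * (1 - m)).

    `yields` is a list of final yields in [0, 1], one per simulation run.
    """
    mean = statistics.mean(yields)
    sample_sd = statistics.stdev(yields)   # n - 1 in the denominator
    pop_sd = statistics.pstdev(yields)     # n in the denominator
    bound = max(0.0, mean * (1.0 - mean)) ** 0.5
    return mean, sample_sd, pop_sd, bound


if __name__ == "__main__":
    # Hypothetical yields from five runs, for illustration only.
    runs = [0.0, 0.05, 0.0, 0.1, 0.05]
    mean, sample_sd, pop_sd, bound = summarize_yield(runs)
    print(f"mean = {mean:.3f}, sample SD = {sample_sd:.3f}, bound = {bound:.3f}")
```

The bound applies strictly to the population standard deviation of the sample. The sample standard deviation can exceed it by at most a factor of sqrt(n / (n - 1)) for n runs.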
Appendix 7
Influence of the heterogeneity of the target structure for fixed number of particles per species
Figure 3d in the main text shows how the maximal yield depends on the number of species if the ring size and the number of possible ring structures are fixed. This comparison for fixed is motivated by the question of what role the heterogeneity of a structure plays for assembly efficiency if a certain number of structures should be realized. Figure 3d illustrates that a higher number of species (more heterogeneous structures) leads to a lower maximally possible yield, suggesting that it is beneficial to build structures with as few different species as possible. However, this situation does not correspond to the deterministically equivalent case of a fixed number of particles per species (note, though, that in the deterministic case the maximally possible yield is always 1, namely for ). Instead, for a higher number of species , the number of particles per species decreases. How does the heterogeneity of the structures alter the maximally possible yield if and (instead of and ) are fixed? Appendix 7—figure 1 shows how the maximal yield and its standard deviation depend on the number of species . Both are obtained as the average yield and sample standard deviation for , when the yield has well saturated and the dynamics (except for the timescale) become independent of the exact value of the rate-limiting activation rate. For homogeneous structures , yield is always perfect since in this case there can be no fluctuations between species. As a result, the average yield is 1 and the standard deviation is 0. For increasing , the average yield decreases until it levels off for . This behavior indicates that the decreasing number of particles per species for larger is indeed essential for the decrease of the maximal yield with in Figure 3d. As mentioned above, the standard deviation is largest for small and decreases with .
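How number fluctuations between species enter can be sketched with a small stochastic simulation. This is not the authors' C++ code. It assumes a fully heterogeneous ring in which each of the `L` species occurs exactly once and binds only to its two neighbors, `N` initially inactive particles per species, irreversible binding, and rates `alpha` (activation), `mu` (dimerization) and `nu` (attachment of a monomer to either end of a polymer). All names and these modeling choices are assumptions. Only the sequence of reactions is sampled, since waiting times do not affect the final yield.

```python
import random


def simulate_activation(N, L, alpha, mu, nu, seed=0):
    """Run one realization until no reaction can fire; return the final state."""
    if L < 3:
        raise ValueError("this sketch assumes L >= 3")
    rng = random.Random(seed)
    inactive = [N] * L   # inactive monomers per species
    active = [0] * L     # active monomers per species
    polymers = {}        # (first species, length) -> number of such polymers
    complete = 0         # finished rings (absorbing state)
    while True:
        reactions = []   # entries: (propensity, kind, data)
        for s in range(L):
            if inactive[s]:
                reactions.append((alpha * inactive[s], "activate", s))
            if active[s] and active[(s + 1) % L]:
                rate = mu * active[s] * active[(s + 1) % L]
                reactions.append((rate, "dimerize", s))
        for (a, l), n in polymers.items():
            for end, monomer in (("left", (a - 1) % L), ("right", (a + l) % L)):
                if active[monomer]:
                    reactions.append((nu * n * active[monomer], end, (a, l)))
        total = sum(r[0] for r in reactions)
        if total <= 0.0:
            break
        x = rng.random() * total          # choose a reaction by its propensity
        for prop, kind, data in reactions:
            x -= prop
            if x < 0.0:
                break
        if kind == "activate":
            inactive[data] -= 1
            active[data] += 1
        elif kind == "dimerize":
            active[data] -= 1
            active[(data + 1) % L] -= 1
            polymers[(data, 2)] = polymers.get((data, 2), 0) + 1
        else:
            a, l = data
            polymers[(a, l)] -= 1
            if polymers[(a, l)] == 0:
                del polymers[(a, l)]
            if kind == "left":
                a = (a - 1) % L           # polymer now starts one species earlier
                active[a] -= 1
            else:
                active[(a + l) % L] -= 1
            if l + 1 == L:
                complete += 1
            else:
                polymers[(a, l + 1)] = polymers.get((a, l + 1), 0) + 1
    return complete / N, active, polymers, complete


if __name__ == "__main__":
    final_yield, _, leftovers, _ = simulate_activation(N=20, L=5, alpha=0.01, mu=1.0, nu=1.0)
    print("yield:", final_yield, "unfinished polymers:", sum(leftovers.values()))
```

Averaging the returned yield over many seeds gives an estimate of the mean yield and its sample standard deviation for this simplified model. Unfinished polymers left at the end correspond to structures that ran out of matching monomers.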
Appendix 8
Dependence of the maximal yield in the activation scenario on and
Figure 3c in the main text characterizes the dependence of the maximal yield in the activation scenario as a ‘phase diagram’ distinguishing different regimes of in dependence of the particle number and target size . Supplementing this figure in the main text, Appendix 8—figure 1 shows the maximum yield that is obtained in the activation scenario in the limit for fixed in dependence of (Appendix 8—figure 1a) as well as for fixed in dependence of (Appendix 8—figure 1b). For larger particle number , the maximal yield exhibits a transition from 0 to 1 over roughly three orders of magnitude. Increasing shifts the transition to larger . The threshold particle number where the transition starts is characterised by (see main text). Approximately, for , we find (cf. main text, Figure 3c). Similarly, decreasing the target size for fixed , the maximal yield exhibits a transition from 0 to 1 over roughly one order of magnitude in . The corresponding threshold value as a function of is obtained as the inverse function of . Hence, at least for , approximately it holds . Since is largely independent of the number of species for fixed and (see Appendix 7), the maximal yield in the activation scenario (for ) can be fully characterized as a function of and . Hence, can roughly be expressed in terms of the threshold particle number as
As can be seen from Figure 3c in the main text, the transition line between zero and nonzero yield slightly flattens with increasing . Hence, the power law (and similarly for ) only holds approximately and for a restricted range in and . The asymptotic behavior of in the limit remains elusive.
Data availability
All data was generated from stochastic simulations in C++ and deterministic simulations in Matlab. The source code files are included with the article.
References
- Self-assembly of brome mosaic virus capsids: insights from shorter time-scale experiments. The Journal of Physical Chemistry A 112:9405–9412. https://doi.org/10.1021/jp802498z
- Coordinating assembly of a bacterial macromolecular machine. Nature Reviews Microbiology 6:455–465. https://doi.org/10.1038/nrmicro1887
- Stochastic self-assembly of incommensurate clusters. The Journal of Chemical Physics 136:084110. https://doi.org/10.1063/1.3688231
- Combinatoric analysis of heterogeneous stochastic self-assembly. The Journal of Chemical Physics 139:121918. https://doi.org/10.1063/1.4817202
- First assembly times and equilibration in stochastic coagulation-fragmentation. The Journal of Chemical Physics 143:014112. https://doi.org/10.1063/1.4923002
- The evolutionary consequences of erroneous protein synthesis. Nature Reviews Genetics 10:715–724. https://doi.org/10.1038/nrg2662
- Stochastic simulation of chemical kinetics. Annual Review of Physical Chemistry 58:35–55. https://doi.org/10.1146/annurev.physchem.58.032806.104637
- Analyzing mechanisms and microscopic reversibility of self-assembly. The Journal of Chemical Physics 135:214505. https://doi.org/10.1063/1.3662140
- Mechanisms of kinetic trapping in self-assembly and phase transformation. The Journal of Chemical Physics 135:104115. https://doi.org/10.1063/1.3635775
- Modeling viral capsid assembly. Advances in Chemical Physics 155:1. https://doi.org/10.1002/9781118755815.ch01
- Allosteric control of icosahedral capsid assembly. The Journal of Physical Chemistry B 120:6306–6318. https://doi.org/10.1021/acs.jpcb.6b02768
- Morphogenesis of the T4 tail and tail fibers. Virology Journal 7:355. https://doi.org/10.1186/1743-422X-7-355
- Fluctuations in the kinetics of linear protein self-assembly. Physical Review Letters 116:258103. https://doi.org/10.1103/PhysRevLett.116.258103
- Assembly of viruses and the pseudo-law of mass action. The Journal of Chemical Physics 131:155101. https://doi.org/10.1063/1.3212694
- Undesired usage and the robust self-assembly of heterogeneous structures. Nature Communications 6:6203. https://doi.org/10.1038/ncomms7203
- Eukaryotic ribosome assembly, transport and quality control. Nature Structural & Molecular Biology 24:689–699. https://doi.org/10.1038/nsmb.3454
- Numerical evidence for nucleated self-assembly of DNA brick structures. Physical Review Letters 112:238103. https://doi.org/10.1103/PhysRevLett.112.238103
- Nucleation: theory and applications to protein solutions and colloidal suspensions. Journal of Physics: Condensed Matter 19:033101. https://doi.org/10.1088/0953-8984/19/3/033101
- First passage times in homogeneous nucleation and self-assembly. The Journal of Chemical Physics 137:244107. https://doi.org/10.1063/1.4772598
- Colloquium: Toward living matter with colloidal particles. Reviews of Modern Physics 89:031001. https://doi.org/10.1103/RevModPhys.89.031001
- Fabrication of novel biomaterials through molecular self-assembly. Nature Biotechnology 21:1171–1178. https://doi.org/10.1038/nbt874
Article and author information
Funding
Deutsche Forschungsgemeinschaft (GRK2062)
- Patrick Wilke
Deutsche Forschungsgemeinschaft (QBM)
- Florian M Gartner
- Isabella R Graf
Aspen Center for Physics (PHY-1607611)
- Erwin Frey
Deutsche Forschungsgemeinschaft (EXC-2094 - 390783311)
- Erwin Frey
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
We thank Nigel Goldenfeld for a stimulating discussion, and Raphaela Geßele and Laeschkir Hassan for helpful feedback on the manuscript. This research was supported by the German Excellence Initiative via the program ‘NanoSystems Initiative Munich’ (NIM) and was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC-2094–390783311. FMG and IRG are supported by a DFG fellowship through the Graduate School of Quantitative Biosciences Munich (QBM). We also gratefully acknowledge financial support by the DFG Research Training Group GRK2062 (Molecular Principles of Synthetic Biology). Finally, EF thanks the Aspen Center for Physics, which is supported by National Science Foundation grant PHY-1607611, for their hospitality and inspiring discussions with colleagues.
Copyright
© 2020, Gartner et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.