Proofreading through spatial gradients

  1. Vahe Galstyan
  2. Kabir Husain
  3. Fangzhou Xiao
  4. Arvind Murugan (corresponding author)
  5. Rob Phillips (corresponding author)
  1. Biochemistry and Molecular Biophysics Option, California Institute of Technology, United States
  2. Department of Physics and the James Franck Institute, University of Chicago, United States
  3. Division of Biology and Biological Engineering, California Institute of Technology, United States
  4. Department of Physics, California Institute of Technology, United States


Key enzymatic processes use the nonequilibrium error correction mechanism called kinetic proofreading to enhance their specificity. The applicability of traditional proofreading schemes, however, is limited because they typically require dedicated structural features in the enzyme, such as a nucleotide hydrolysis site or multiple intermediate conformations. Here, we explore an alternative conceptual mechanism that achieves error correction by having substrate binding and subsequent product formation occur at distinct physical locations. The time taken by the enzyme–substrate complex to diffuse from one location to another is leveraged to discard wrong substrates. This mechanism does not have the typical structural requirements, making it easier to overlook in experiments. We discuss how the length scales of molecular gradients dictate proofreading performance, and quantify the limitations imposed by realistic diffusion and reaction rates. Our work broadens the applicability of kinetic proofreading and sets the stage for studying spatial gradients as a possible route to specificity.


The nonequilibrium mechanism called kinetic proofreading (Hopfield, 1974; Ninio, 1975) is used for reducing the error rates of many biochemical processes important for cell function (e.g. DNA replication [Kunkel, 2004], transcription [Sydow and Cramer, 2009], translation [Rodnina and Wintermeyer, 2001; Ieong et al., 2016], signal transduction [Swain and Siggia, 2002], or pathogen recognition [McKeithan, 1995; Goldstein et al., 2004; Cui and Mehta, 2018]). Proofreading mechanisms operate by inducing a delay between substrate binding and product formation via intermediate states for the enzyme–substrate complex. Such a delay gives the enzyme multiple chances to release the wrong substrate after initial binding, allowing far lower error rates than what one would expect solely from the binding energy difference between right and wrong substrates.

Traditional proofreading schemes require dedicated molecular features such as an exonuclease pocket in DNA polymerases (Kunkel, 2004) or multiple phosphorylation sites on T-cell receptors (McKeithan, 1995; Goldstein et al., 2004); such features create intermediate states that delay product formation (Figure 1a) and thus allow proofreading. Additionally, since proofreading is an active nonequilibrium process often involving near–irreversible reactions, the enzyme typically needs to have an ATP or GTP hydrolysis site to enable the use of energy supplies of the cell (Yamane and Hopfield, 1977; Rodnina and Wintermeyer, 2001). Due to such stringent structural requirements, the number of confirmed proofreading enzymes is relatively small. Furthermore, generic enzymes without such dedicated features are assumed to not have active error correction available to them.

Figure 1. Error correction schemes that operate by delaying product formation.

(a) The traditional proofreading scheme with multiple biochemically distinct intermediates, transitions between which are typically accompanied by energy–consuming reactions. The T-cell activation mechanism with successive phosphorylation events is used for demonstration (McKeithan, 1995; Cui and Mehta, 2018). (b) The spatial proofreading scheme where the delay between binding and catalysis is created by constraining these events to distinct physical locations. The wavy arrows stand for the diffusive motion of the complex. Binding events primarily take place on the length scale λS of substrate localization.

In this work, we propose an alternative scheme where the delay between initial substrate binding and product formation is achieved by separating these events in space. If substrates are spatially localized and product formation is favorable only in a region of low substrate concentration where an activating effector is present, then the time taken by the enzyme–substrate complex to travel from one location to the other can be used to discard the wrong substrates, which are assumed to unbind from the enzyme more readily than the right substrates (Figure 1b). When this delay is longer than substrate unbinding time scales, very low error rates of product formation can be achieved, allowing this spatial proofreading scheme to outperform biochemical mechanisms with a finite number of proofreading steps.

In contrast to traditional proofreading, the nonequilibrium mechanism here does not require any direct energy consumption by the enzyme or substrate itself (e.g. through ATP hydrolysis). This liberates the enzyme from any proofreading-specific molecular features; indeed, any ‘equilibrium’ enzyme with a localized effector can proofread using our scheme if appropriate concentration gradients of the substrates or enzymes are set up. In this way, the energetic and structural requirements of proofreading can be outsourced from the enzyme and substrate to the gradient maintaining mechanism. It also means that spatial proofreading is easy to overlook in experiments, and that the fidelity of reconstituted reactions in vitro could be lower than the fidelity in vivo.

The lack of reliance on structure makes spatial proofreading more adaptable. We study how tuning the length scale of concentration gradients can trade off error rate against speed and energy consumption on the fly. In contrast, traditional proofreading schemes rely on nucleotide chemical potentials, for example, the out of equilibrium [ATP]/[ADP] ratio in the cell, and cannot modulate their operation without broader physiological disruptions.

Our proposed scheme can be leveraged for specificity if appropriate concentration gradients are set. Such gradients arise in multiple cellular contexts (e.g. near the nucleus, the plasma membrane, the Golgi apparatus, the endoplasmic reticulum [ER], kinetochores, microtubules [Bivona et al., 2003; Caudron et al., 2005; Kholodenko, 2006]) and several gradient-forming mechanisms have been discussed in the literature (Wu et al., 2018; Kholodenko, 2006; Kholodenko, 2003). We conclude our analysis of spatial proofreading by quantifying its limitations as set by realistic reaction rates and gradient formation mechanisms, and discuss examples from the literature, including the localization of mRNAs in polarised cells, and the non-vesicular transport of lipids in eukaryotic cells, in which this mechanism might be in play. Our work motivates a detailed investigation of spatial structures and compartmentalization in living cells as possible delay mechanisms for proofreading enzymatic reactions.


Slow transport of enzymatic complex enables proofreading

Our proposed scheme is based on spatially separating substrate binding and product formation events for the enzyme (Figure 1b). Such a setting arises naturally if substrates are spatially localized by having concentration gradients in a cellular compartment. Similarly, an effector needed for product formation (e.g. through allosteric activation) may have a spatial concentration gradient localized elsewhere in that compartment. To keep our model simple, we assume that the right (R) and wrong (W) substrates have identical concentration gradients of length scale λS but that the effector is entirely localized to one end of the compartment, for example via membrane tethering. In Appendix 4, we extend our study of model performance to the scenario where the two substrates have different localization length scales.

We model our system using coupled reaction–diffusion equations for the substrate-bound (‘ES’ with S=R,W) and free (‘E’) enzyme densities, namely,

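$$\frac{\partial \rho_{ER}}{\partial t} = D\,\frac{\partial^2 \rho_{ER}}{\partial x^2} - k_{\mathrm{off}}^{R}\,\rho_{ER} + k_{\mathrm{on}}\,\rho_{R}\,\rho_{E}, \tag{1}$$

$$\frac{\partial \rho_{EW}}{\partial t} = D\,\frac{\partial^2 \rho_{EW}}{\partial x^2} - k_{\mathrm{off}}^{W}\,\rho_{EW} + k_{\mathrm{on}}\,\rho_{W}\,\rho_{E}, \tag{2}$$

$$\frac{\partial \rho_{E}}{\partial t} = D\,\frac{\partial^2 \rho_{E}}{\partial x^2} + \sum_{S=R,W} k_{\mathrm{off}}^{S}\,\rho_{ES} - \sum_{S=R,W} k_{\mathrm{on}}\,\rho_{S}\,\rho_{E}. \tag{3}$$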

Here, $D$ is the enzyme diffusion constant, $k_{\mathrm{on}}$ and $k_{\mathrm{off}}^{S}$ (with $k_{\mathrm{off}}^{W} > k_{\mathrm{off}}^{R}$) are the substrate binding and unbinding rates, respectively, and $\rho_S(x) \sim e^{-x/\lambda_S}$ is the spatially localized substrate concentration profile. We take this profile to be exponentially decaying, as is often the case for profiles created by cellular gradient formation mechanisms (Driever and Nüsslein-Volhard, 1988; Brown and Kholodenko, 1999). We limit our discussion to this one-dimensional setting, though our treatment can be generalized to two and three dimensions in a straightforward way.

The above model does not explicitly account for several effects relevant to living cells, such as depletion of substrates or distinct diffusion rates for the free and substrate-bound enzymes. More importantly, it does not account for the mechanism of substrate gradient formation. We analyze a biochemically detailed model with this latter feature and experimentally constrained parameters later in the paper. Here, we proceed with the minimal model above for explanatory purposes. To identify the key determinants of the model’s performance, we assume throughout our analysis that the amount of substrates is sufficiently low that the enzymes are mostly free with a roughly uniform profile (i.e. $\rho_E \approx \text{constant}$). This assumption makes Equations 1–3 linear and allows us to solve them analytically at steady state. We demonstrate in Appendix 5 that proofreading is, in fact, most effective under this assumption and discuss the consequences of having high substrate amounts on the performance of the scheme.

In our simplified picture, enzyme activation and catalysis take place upon reaching the right boundary at a rate r that is identical for both substrates. Therefore, the density of substrate–bound enzymes at the right boundary can be taken as a proxy for the rate of product formation vS, since 

$$v_S = r\,\rho_{ES}(L), \tag{4}$$

where L is the size of the compartment. In order to keep the analytical results concise and intuitive, we perform our main analyses under the assumption that catalysis is slow, mirroring the study of traditional proofreading schemes (Hopfield, 1974). In Appendix 3, we derive the precise conditions under which this treatment is valid, and generalize our analysis to arbitrary catalysis rates.
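The steady state of Equations 1 and 2 under the constant-$\rho_E$ assumption can be explored numerically. The following is a minimal finite-difference sketch, not the scripts from the Supplementary files. It assumes that slow catalysis can be represented by no-flux boundaries at both ends of the compartment, measures lengths in units of $L$, and sets $k_{\mathrm{on}}\rho_E\rho_S(0) = 1$; the function names `complex_density_at_L` and `fidelity` are hypothetical.

```python
import numpy as np

def complex_density_at_L(tau_D, k_off, lam_S, n=801):
    """Steady-state rho_ES at x = L for a constant free-enzyme profile.

    Assumed units: lengths in L, k_on * rho_E * rho_S(0) = 1,
    tau_D = L^2 / D in the same time unit as 1 / k_off.
    """
    x = np.linspace(0.0, 1.0, n)
    h = x[1] - x[0]
    d = 1.0 / (tau_D * h**2)          # D / h^2 with D = L^2 / tau_D
    source = np.exp(-x / lam_S)       # exponential substrate profile

    # Discretize D * rho'' - k_off * rho = -source on a uniform grid.
    A = np.zeros((n, n))
    idx = np.arange(n)
    A[idx, idx] = -2.0 * d - k_off
    A[idx[:-1], idx[:-1] + 1] = d
    A[idx[1:], idx[1:] - 1] = d
    # No-flux boundaries (assumed stand-in for slow catalysis): mirror ghost nodes.
    A[0, 1] = 2.0 * d
    A[-1, -2] = 2.0 * d

    rho = np.linalg.solve(A, -source)
    return rho[-1]

def fidelity(tau_D, k_off_R, k_off_W, lam_S):
    """Ratio of right to wrong complex densities at x = L (rate r cancels)."""
    return (complex_density_at_L(tau_D, k_off_R, lam_S)
            / complex_density_at_L(tau_D, k_off_W, lam_S))

if __name__ == "__main__":
    # Uniform substrate profile: fidelity stays near k_off_W / k_off_R.
    print(fidelity(10.0, 1.0, 10.0, lam_S=1e6))
    # Tightly localized substrates: fidelity is strongly enhanced.
    print(fidelity(10.0, 1.0, 10.0, lam_S=0.01))
```

A dense solve is used only for brevity; a banded or sparse solver would be the natural choice for finer grids.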

To demonstrate the proofreading capacity of the model, we first analyze the limiting case where substrates are localized to the left end of the compartment ($\lambda_S \to 0$). In this limit, the fidelity $\eta$, defined as the number of right products formed per single wrong product, becomes

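$$\eta = \frac{v_R}{v_W} = \sqrt{\eta_{\mathrm{eq}}}\;\frac{\sinh\!\left(\sqrt{\tau_D k_{\mathrm{off}}^{W}}\right)}{\sinh\!\left(\sqrt{\tau_D k_{\mathrm{off}}^{R}}\right)}, \tag{5}$$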

where $\eta_{\mathrm{eq}} = k_{\mathrm{off}}^{W}/k_{\mathrm{off}}^{R}$ is the equilibrium fidelity, and $\tau_D = L^2/D$ is the characteristic time scale of diffusion across the compartment (see Appendix 1 for the derivation).
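As a quick numerical illustration, the closed form of Equation 5 can be evaluated directly. The sketch below assumes the form $\eta = \sqrt{\eta_{\mathrm{eq}}}\,\sinh(\sqrt{\tau_D k_{\mathrm{off}}^{W}})/\sinh(\sqrt{\tau_D k_{\mathrm{off}}^{R}})$; the function name `fidelity_ideal` is hypothetical.

```python
import math

def fidelity_ideal(tau_D, k_off_R, k_off_W):
    """Fidelity in the ideal localization limit (lambda_S -> 0), Equation 5.

    Note: math.sinh overflows for arguments above roughly 710, so this
    direct form is only suitable for moderate tau_D * k_off values.
    """
    eta_eq = k_off_W / k_off_R
    a_W = math.sqrt(tau_D * k_off_W)
    a_R = math.sqrt(tau_D * k_off_R)
    return math.sqrt(eta_eq) * math.sinh(a_W) / math.sinh(a_R)

if __name__ == "__main__":
    # Time is measured in units of 1 / k_off_R, with eta_eq = 10.
    for tau_D in (1e-6, 1.0, 10.0, 100.0):
        print(tau_D, fidelity_ideal(tau_D, 1.0, 10.0))
```

For very small `tau_D` the value stays close to the equilibrium fidelity, while for large `tau_D` it grows rapidly.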

Equation 5 is plotted in Figure 2 for a family of different parameter values. As can be seen, when diffusion is fast (small $\tau_D$), fidelity converges to its equilibrium value and proofreading is lost ($\eta \approx \sqrt{\eta_{\mathrm{eq}}} \times \sqrt{\tau_D k_{\mathrm{off}}^{W}}/\sqrt{\tau_D k_{\mathrm{off}}^{R}} = \eta_{\mathrm{eq}}$). Conversely, when diffusion is slow (large $\tau_D$), the enzyme undergoes multiple rounds of binding a substrate at the left end and unbinding midway until it manages to diffuse across the whole compartment as a complex and form a product. These rounds serve as ‘futile cycles’ that endow the system with proofreading. In this regime, fidelity scales as

$$\eta \sim e^{\left(\sqrt{k_{\mathrm{off}}^{W}} - \sqrt{k_{\mathrm{off}}^{R}}\right)\sqrt{\tau_D}}. \tag{6}$$

Figure 2. Dependence of fidelity on the diffusion time scale in the limit of very high substrate localization.

Individual curves were made for different choices of $k_{\mathrm{off}}^{W}$ (varied in the $[10, 100]\,k_{\mathrm{off}}^{R}$ range). $\tau_{\mathrm{off}}^{R} = 1/k_{\mathrm{off}}^{R}$ is the unbinding time scale of right substrates, kept fixed in the study. Fidelity values corresponding to integer degrees of proofreading in a traditional sense ($\eta/\eta_{\mathrm{eq}} = \eta_{\mathrm{eq}}^{n}$, $n = 1, 2, 3, \ldots$) are marked as circles. Dominant processes in the two limiting regimes are highlighted in red in the schematics shown as insets.

To get further insights, we introduce an effective number of extra biochemical intermediates ($n$) that a traditional proofreading scheme would need to have in order to yield the same fidelity, that is $\eta/\eta_{\mathrm{eq}} = \eta_{\mathrm{eq}}^{n}$. We calculate this number as (see Appendix 1)

$$n \approx \frac{\sqrt{\tau_D k_{\mathrm{off}}^{W}}}{\ln \eta_{\mathrm{eq}}}. \tag{7}$$

Notably, since $\tau_D \sim L^2$, the result above suggests a linear relationship between the effective number of proofreading realizations and the compartment size ($n \sim L$). In addition, because the right-hand side of Equation 7 is an increasing function of $k_{\mathrm{off}}^{W}$, the proofreading efficiency of the scheme rises with larger differences in substrate off-rates (Figure 2) – a feature that ‘hard–wired’ traditional proofreading schemes with a fixed number of proofreading steps lack.
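The effective number of steps can also be computed from Equation 5 without approximation, by solving $\eta/\eta_{\mathrm{eq}} = \eta_{\mathrm{eq}}^{n}$ for $n$. The sketch below does this and compares it with the leading-order estimate $\sqrt{\tau_D k_{\mathrm{off}}^{W}}/\ln\eta_{\mathrm{eq}}$; the function names are hypothetical. Because the estimate keeps only the leading term, the two values need not agree closely at moderate $\eta_{\mathrm{eq}}$, but both grow linearly with $L \sim \sqrt{\tau_D}$ for slow diffusion.

```python
import math

def log_sinh(a):
    """log(sinh(a)) for a > 0, written to avoid overflow at large a."""
    return a + math.log1p(-math.exp(-2.0 * a)) - math.log(2.0)

def effective_steps(tau_D, k_off_R, k_off_W):
    """n such that eta / eta_eq = eta_eq**n, with eta taken from Equation 5."""
    eta_eq = k_off_W / k_off_R
    log_eta = (0.5 * math.log(eta_eq)
               + log_sinh(math.sqrt(tau_D * k_off_W))
               - log_sinh(math.sqrt(tau_D * k_off_R)))
    return (log_eta - math.log(eta_eq)) / math.log(eta_eq)

def effective_steps_leading(tau_D, k_off_W, eta_eq):
    """Leading-order estimate sqrt(tau_D * k_off_W) / ln(eta_eq)."""
    return math.sqrt(tau_D * k_off_W) / math.log(eta_eq)

if __name__ == "__main__":
    # Compartment size L in arbitrary units, so that tau_D = L**2 (D = 1).
    for L in (2.0, 4.0, 6.0, 8.0):
        tau_D = L**2
        print(L,
              effective_steps(tau_D, 1.0, 10.0),
              effective_steps_leading(tau_D, 10.0, 10.0))
```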

Navigating the speed–fidelity trade-off

As is inherent to all proofreading schemes, the fidelity enhancement described earlier comes at a cost of reduced product formation speed. This reduction, in our case, happens because of increased delays in diffusive transport. Here, we explore the resulting speed–fidelity trade-off and its different regimes by varying two of the model parameters: diffusion time scale τD and the substrate localization length scale λS.

Speed and fidelity for different sampled values of $\tau_D$ and $\lambda_S$ are depicted in Figure 3a. As can be seen, for a fixed $\tau_D$, the reduction of $\lambda_S$ can trade off fidelity against speed. This trade-off is intuitive; with tighter substrate localization, the complexes are formed closer to the left boundary. Hence, a smaller fraction of complexes reach the activation region, reducing reaction speed. The Pareto-optimal front of the trade-off over the whole parameter space, shown as a red curve on the plot, is reached in the limit of ideal substrate localization ($\lambda_S \to 0$). Varying the diffusion time scale allows one to navigate this optimal trade-off curve and access different performance regimes.

Figure 3. Speed–fidelity trade-off and consequences of having weak substrate gradients.

(a) Speed and fidelity evaluated for sampled values of the diffusion time scale ($\tau_D$) and substrate localization length scale ($\lambda_S$). Here, $v_{\mathrm{eq}} \sim 1/k_{\mathrm{off}}^{R}$ is the speed in the equilibrium limit of a uniform substrate profile ($\lambda_S \to \infty$). The red line corresponds to the Pareto-optimal front and is reached in the high substrate localization limit. The example speed–fidelity trade-off illustrated through the black dotted curve is obtained for $\tau_D \approx 20\,\tau_{\mathrm{off}}^{R}$. (b) Density profiles of wrong ($EW$) and right ($ER$) complexes in three qualitatively different performance regimes. The normalization factor $\rho_{ES}^{\mathrm{eq}}$ corresponds to the equilibrium complex densities. (c) Fidelity as a function of diffusion time scale for different choices of $\lambda_S$ (varied in the $[0.04, 0.4]\,L$ range). The dashed line corresponds to the ideal substrate localization limit ($\lambda_S \to 0$). Inset: Fidelity as a function of $L/\lambda_S$ for a fixed $\tau_D$. Shaded area indicates the range where the bulk of fidelity enhancement takes place. Equilibrium fidelity $\eta_{\mathrm{eq}} = 10$ was used in generating all the panels.

Specifically, if the diffusion time scale is fast compared with the time scales of substrate unbinding (i.e. $\tau_D \lesssim 1/k_{\mathrm{off}}^{R}, 1/k_{\mathrm{off}}^{W}$), then both right and wrong complexes that form near the left boundary arrive at the activation region with high probability, resulting in high speeds, although at the expense of error–prone product formation (Figure 3b, top). In the opposite limit of slow diffusion, both types of complexes have exponentially low densities at the activation region, but due to the difference in substrate off-rates, production is highly accurate (Figure 3b, bottom). There also exists an intermediate regime where a significant fraction of right complexes reach the activation region while the vast majority of wrong complexes do not (Figure 3b, middle). As a result, an advantageous trade-off is achieved where a moderate decrease in the production rate yields high fidelity enhancement – a feature that was also identified in multi-step traditional proofreading models (Murugan et al., 2012).
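The three regimes can be illustrated along the ideal-localization front with a few lines of code. This is a sketch under stated assumptions: substrates bind only at the left end at a fixed total rate, catalysis is slow, and speed is normalized by its fast-diffusion value (a normalization chosen here for convenience, which gives $\sqrt{\tau_D k_{\mathrm{off}}^{R}}/\sinh\sqrt{\tau_D k_{\mathrm{off}}^{R}}$ and may differ from the $v_{\mathrm{eq}}$ normalization used in Figure 3). The function name `pareto_point` is hypothetical.

```python
import math

def pareto_point(tau_D, k_off_R, k_off_W):
    """Relative speed and fidelity in the ideal localization limit.

    speed: density of right complexes at x = L relative to its value
           for infinitely fast diffusion (assumed normalization).
    fidelity: Equation 5.
    """
    a_R = math.sqrt(tau_D * k_off_R)
    a_W = math.sqrt(tau_D * k_off_W)
    speed = a_R / math.sinh(a_R)
    fidelity = math.sqrt(k_off_W / k_off_R) * math.sinh(a_W) / math.sinh(a_R)
    return speed, fidelity

if __name__ == "__main__":
    # Time in units of 1 / k_off_R, eta_eq = 10.
    for tau_D in (0.1, 1.0, 10.0, 100.0):
        s, f = pareto_point(tau_D, 1.0, 10.0)
        print(f"tau_D = {tau_D:6.1f}: relative speed = {s:.3g}, fidelity = {f:.3g}")
```

Sweeping `tau_D` traces a curve along which speed falls while fidelity rises, with the intermediate values giving a modest loss in speed for a large gain in fidelity.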

In Appendix 3, we also study the trade-off that arises from varying the catalysis rate $r$. Briefly, we find that when all other parameters are fixed, increasing $r$ trades off fidelity against speed in a linear fashion, with the ratio of highest and lowest fidelity values falling in the $[\sqrt{\eta_{\mathrm{eq}}}, \eta_{\mathrm{eq}}]$ range. The Pareto–optimal front of the trade-off, however, monotonically shifts toward the higher speed region, suggesting that faster catalysis is, in fact, more favorable if the diffusion time scale $\tau_D$ can be adjusted accordingly (see Appendix 3 for details).

We saw in Figure 3a that in the case of ideal substrate localization, the slowdown of diffusive transport necessarily reduced the production rate and increased the fidelity. The latter part of this statement, however, breaks down when substrate gradients are weak. Indeed, fidelity exhibits a non-monotonic response to tuning $\tau_D$ when the substrate gradient length scale $\lambda_S$ is non-zero (Figure 3c). The reason for the eventual decay in fidelity is the fact that with slower diffusion (larger $\tau_D$), substrate binding and unbinding events take place more locally and therefore, the right and wrong complex profiles start to resemble the substrate profile itself, which does not discriminate between the two substrate kinds. We show in Appendix 1 that the optimal diffusion time scale can be roughly approximated as $\tau_D^{*}/\tau_{\mathrm{off}}^{R} \sim \eta_{\mathrm{eq}}^{-1}(L/\lambda_S)^2$, which increases monotonically with $L/\lambda_S$, consistent with the shifting peaks in Figure 3c.

Not surprisingly, the error–correcting capacity of the scheme improves with better substrate localization (lower $\lambda_S$). For a fixed $\tau_D$, the bulk of this improvement takes place when $L/\lambda_S$ is tuned in a range set by the two key dimensionless numbers of the model, namely, $\sqrt{\tau_D k_{\mathrm{off}}^{R}}$ and $\sqrt{\tau_D k_{\mathrm{off}}^{W}}$ (Figure 3c, inset). In Appendix 1, we provide an analytical justification for this result. Taken together, these parametric studies uncover the operational principles of the spatial proofreading scheme and demonstrate how the speed–fidelity trade-off could be dynamically navigated as needed by tuning the key time and length scales of the model.
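The non-monotonic dependence of fidelity on $\tau_D$ at finite $\lambda_S$ can be reproduced with a short quadrature sketch. It assumes a constant free-enzyme profile, slow catalysis (no-flux boundaries), and the standard Green's function of the steady-state diffusion–decay equation, for which the complex density at $x = L$ is proportional to $\int_0^L \cosh(\kappa x)\,\rho_S(x)\,\mathrm{d}x / (\kappa \sinh \kappa L)$ with $\kappa L = \sqrt{\tau_D k_{\mathrm{off}}}$. Function names are hypothetical and lengths are in units of $L$.

```python
import numpy as np

def density_at_L(tau_D, k_off, lam_S, n=20001):
    """rho_ES(L) up to a factor shared by both substrates at fixed tau_D."""
    kappa = np.sqrt(tau_D * k_off)            # L divided by the complex decay length
    x = np.linspace(0.0, 1.0, n)
    # cosh(kappa * x) / sinh(kappa), rewritten to avoid overflow at large kappa
    green = (np.exp(kappa * (x - 1.0)) * (1.0 + np.exp(-2.0 * kappa * x))
             / (1.0 - np.exp(-2.0 * kappa)))
    w = green * np.exp(-x / lam_S)            # weight by the substrate profile
    integral = np.sum(0.5 * (w[1:] + w[:-1])) * (x[1] - x[0])   # trapezoid rule
    return integral / kappa

def fidelity(tau_D, k_off_R, k_off_W, lam_S):
    return density_at_L(tau_D, k_off_R, lam_S) / density_at_L(tau_D, k_off_W, lam_S)

def best_tau(k_off_R, k_off_W, lam_S, taus):
    """Diffusion time scale in `taus` that maximizes fidelity."""
    values = [fidelity(t, k_off_R, k_off_W, lam_S) for t in taus]
    return taus[int(np.argmax(values))]

if __name__ == "__main__":
    taus = np.logspace(-1, 4, 51)             # in units of 1 / k_off_R
    for lam_S in (0.4, 0.1, 0.04):
        print(lam_S, best_tau(1.0, 10.0, lam_S, taus))
```

In this sketch the fidelity returns toward its equilibrium value at both very small and very large `tau_D`, and the location of the maximum moves to larger `tau_D` as `lam_S` decreases.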

Energy dissipation and limits of proofreading performance

A hallmark signature of proofreading is that it is a nonequilibrium mechanism with an associated free energy cost. In our scheme, the enzyme itself is not directly involved in any energy-consuming reactions, such as hydrolysis. Instead, the free energy cost comes from maintaining the spatial gradient of substrates, which the enzymatic reaction tends to homogenize by releasing bound substrates in regions of low substrate concentration. As the activating effectors are assumed to be tethered at x=L, they do not require dissipation to remain localized.

While mechanisms of substrate gradient maintenance may differ in their energetic efficiency, there exists a thermodynamically dictated minimum energy that any such mechanism must dissipate per unit time. We calculate this minimum power P as

$$P = \sum_{S \in \{R, W\}} \int_0^L j_S(x)\,\mu(x)\,\mathrm{d}x. \tag{8}$$

Here $j_S(x) = k_{\mathrm{on}}\rho_S(x)\rho_E - k_{\mathrm{off}}^{S}\rho_{ES}(x)$ is the net local binding flux of substrate ‘S’, and $\mu(x)$ is the local chemical potential (see Appendix 2.1 for details). For substrates with an exponentially decaying profile considered here, the chemical potential is given by

$$\mu(x) = \mu(0) + k_BT \ln\frac{\rho_S(x)}{\rho_S(0)} = \mu(0) - k_BT\,\frac{x}{\lambda_S}, \tag{9}$$

where kBT is the thermal energy scale. Notably, the chemical potential difference across the compartment, which serves as an effective driving force for the scheme, is set by the inverse of the nondimensionalized substrate localization length scale, namely,

$$\beta\Delta\mu = \frac{L}{\lambda_S}, \tag{10}$$

where $\beta^{-1} = k_BT$. This driving force is zero for a uniform substrate profile ($\lambda_S \to \infty$) and increases with tighter localization (lower $\lambda_S$), as intuitively expected.
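The minimum power of Equation 8 can be estimated numerically from the steady-state complex profiles. The sketch below is illustrative only and rests on several assumptions: a constant free-enzyme profile, no-flux boundaries (slow catalysis), lengths in units of $L$, and $k_{\mathrm{on}}\rho_E\rho_S(0) = 1$. Since the net binding flux integrates to zero at steady state under these boundary conditions, the constant $\mu(0)$ drops out and only $-x/\lambda_S$ from Equation 9 is needed. All function names are hypothetical.

```python
import numpy as np

def solve_profile(tau_D, k_off, source, h):
    """Steady-state rho_ES on a uniform grid with no-flux boundaries."""
    n = source.size
    d = 1.0 / (tau_D * h**2)
    A = np.zeros((n, n))
    i = np.arange(n)
    A[i, i] = -2.0 * d - k_off
    A[i[:-1], i[:-1] + 1] = d
    A[i[1:], i[1:] - 1] = d
    A[0, 1] = 2.0 * d          # mirrored ghost node at x = 0
    A[-1, -2] = 2.0 * d        # mirrored ghost node at x = L
    return np.linalg.solve(A, -source)

def trapezoid(y, h):
    return h * (np.sum(y) - 0.5 * (y[0] + y[-1]))

def cost_per_binding(tau_D, k_offs, lam_S, n=801):
    """beta * P / J_bind, summed over substrates with off-rates `k_offs`."""
    x = np.linspace(0.0, 1.0, n)
    h = x[1] - x[0]
    source = np.exp(-x / lam_S)        # local binding rate k_on * rho_E * rho_S(x)
    beta_mu = -x / lam_S               # beta * (mu(x) - mu(0)), Equation 9
    power, binding = 0.0, 0.0
    for k_off in k_offs:
        rho = solve_profile(tau_D, k_off, source, h)
        j = source - k_off * rho       # net local binding flux j_S(x)
        power += trapezoid(j * beta_mu, h)
        binding += trapezoid(source, h)
    return power / binding

if __name__ == "__main__":
    for lam_S in (1.0, 0.3, 0.1, 0.05):
        print(lam_S, cost_per_binding(10.0, (1.0, 10.0), lam_S))
```

In this sketch the cost vanishes for a uniform substrate profile and grows as the substrates become more tightly localized.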

We used Equation 8 to study the relationship between dissipation and fidelity enhancement as we tuned Δμ for different choices of the diffusion time scale τD. As can be seen in Figure 4, power rises with increasing fidelity, diverging when fidelity reaches its asymptotic maximum given by Equation 5 in the large Δμ limit. For the bulk of each curve, power scales as the logarithm of fidelity, suggesting that a linear increase in dissipation can yield an exponential reduction in error. Notably, such a scaling relationship has also been calculated in the context of E. coli chemoreceptor adaptation (Lan et al., 2012). In particular, it was shown that the adaptation error decreases exponentially with energy dissipated through multiple methylation–demethylation cycles which are used to stabilize the activity state of the receptor. Analogies in the cost-performance trade-off across these functionally distinct mechanisms contribute to the search for overarching thermodynamic themes underlying cellular information processing (Lan et al., 2012; Lan and Tu, 2013; Horowitz et al., 2017; Sartori and Pigolotti, 2015).

Figure 4. Power–fidelity relationship when tuning the effective driving force Δμ for different choices of the diffusion time scale τD.

$J_{\mathrm{bind}} = \int k_{\mathrm{on}}\rho_E\rho_S(x)\,\mathrm{d}x$ is the integrated rate of substrate binding. The red line indicates the large dissipation limit of fidelity given by Equation 5. The circles indicate the $\Delta\mu$ range specified in Equation 11 for different $\tau_D$ choices. For sufficiently large $\tau_D$ values, the cost per binding event approaches βηeq at the end of this range (see Appendix 2.1 for details). In making this plot, $\eta_{\mathrm{eq}} = 10$ was used.

The logarithmic scaling is achieved in our model when the driving force is in a range where most of the fidelity enhancement takes place, namely,

$$\beta\Delta\mu \in \left[\sqrt{\tau_D k_{\mathrm{off}}^{R}},\ \sqrt{\tau_D k_{\mathrm{off}}^{W}}\right]. \tag{11}$$

At the end of this range, the cost per substrate binding event approaches ηeq in kBT units (see Appendix 2.1 for details). And beyond the range, additional error correction is attained at an increasingly higher cost.

Note that the power computed here does not include the baseline cost of creating the substrate gradient, which, for instance, would depend on the substrate diffusion constant. We only account for the additional cost to be paid due to the operation of the proofreading scheme which works to homogenize this substrate gradient. The baseline cost in our case is analogous to the work that ATP synthase needs to perform to maintain a nonequilibrium [ATP]/[ADP] ratio in the cell, whereas our calculated power is analogous to the rate of ATP hydrolysis by a traditional proofreading enzyme. We discuss these two classes of dissipation in greater detail in Appendix 2.3.

Just as the cellular chemical potential of ATP or GTP imposes a thermodynamic upper bound on the fidelity enhancement by any proofreading mechanism (Qian, 2006), the effective driving force $\Delta\mu$ imposes a similar constraint for the spatial proofreading model. This thermodynamic limit depends only on the available chemical potential and is equal to $e^{\beta\Delta\mu}$. This limit can be approached very closely by our model, which for $\beta\Delta\mu \gg 1$ achieves the exponential enhancement with an additional linear prefactor, namely, $(\eta/\eta_{\mathrm{eq}})_{\max} \sim e^{\beta\Delta\mu}/\beta\Delta\mu$ (see Appendix 2.2). Such scaling behavior was theoretically accessible only to infinite-state traditional proofreading schemes (Qian, 2006; Ehrenberg and Blomberg, 1980). This offers a view of spatial proofreading as a procession of the enzyme through an infinite series of spatial filters and suggests that, from the perspective of peak error reduction capacity, our model outperforms the finite-state schemes.

Proofreading by biochemically plausible intracellular gradients

Our discussion of the minimal model thus far was not aimed at a particular biochemical system and thus did not involve the use of realistic reaction rates and diffusion constants typically seen in living cells. Furthermore, we did not account for the possibility of substrate diffusion, as well as for the homogenization of substrate concentration gradients due to enzymatic reactions, and have thereby abstracted away the gradient maintaining mechanism. The quantitative inspection of such mechanisms is important for understanding the constraints on spatial proofreading in realistic settings.

Here, we investigate proofreading based on a widely applicable mechanism for creating gradients by the spatial separation of two opposing enzymes (Stelling and Kholodenko, 2009; Bivona et al., 2003; Brown and Kholodenko, 1999). Consider a protein S that in its free state is phosphorylated by a membrane-bound kinase and dephosphorylated by a delocalized cytoplasmic phosphatase, as shown in Figure 5a. This setup will naturally create a gradient of the active form of protein (S*), with the gradient length scale controlled by the rate of phosphatase activity $k_p$ ($S^{*} \xrightarrow{k_p} S$). Such mechanisms are known to create gradients of the active forms of MEK and ERK (Kholodenko, 2006), of GTPases such as Ran (with GEF and GAP [Kalab et al., 2002] playing the role of kinase and phosphatase, respectively), of cAMP (Kholodenko, 2006) and of stathmin oncoprotein 18 (Op18) (Bastiaens et al., 2006; Niethammer et al., 2004) near the plasma membrane, the Golgi apparatus, the ER, kinetochores and other places.

Figure 5. Proofreading based on substrate gradients formed by spatially separated kinases and phosphatases.

(a) The active form S* of many proteins exhibits gradients because kinases that phosphorylate S are anchored to a membrane while phosphatases can diffuse in the cytoplasm (Kholodenko, 2006). An enzyme can exploit the resulting spatial gradient for proofreading. (b) At low enzyme activity (i.e. low $k_{\mathrm{on}}\rho_E$), the gradient of S* is successfully maintained, allowing for proofreading. The upper dashed line corresponds to the peak fidelity when the substrate profile is exponential. At high enzyme activity (large $k_{\mathrm{on}}\rho_E$), the dephosphorylation with rate $k_p = 5\ \mathrm{s}^{-1}$ is no longer sufficient to maintain the gradient and proofreading is lost. (c) Profiles of right substrates for different choices of enzyme activity. Numbers indicate $k_{\mathrm{on}}\rho_E$ in $\mathrm{s}^{-1}$ units. The black line shows an exponential substrate profile with a length scale $\lambda_S = \sqrt{D/k_p} \approx 0.5$ μm.

We test the proofreading power of such gradients, assuming experimentally constrained biophysical parameters for the gradient forming mechanism. Specifically, we consider an enzyme E that acts on the active forms of cognate (R*) and non-cognate (W*) substrates which have off-rates $0.1\ \mathrm{s}^{-1}$ and $1\ \mathrm{s}^{-1}$, respectively (hence, $\eta_{\mathrm{eq}} = 10$). These off-rates are consistent with typical values for substrates proofread by cellular signaling systems (Cui and Mehta, 2018; Gascoigne et al., 2001). The kinases and phosphatases in our setup act identically on right and wrong substrates. We consider a dephosphorylation rate constant $k_p = 5\ \mathrm{s}^{-1}$ that falls in the range 0.1–100 $\mathrm{s}^{-1}$ reported for different phosphatases (Brown and Kholodenko, 1999; Kholodenko et al., 2000; Todd et al., 1999), and a cytosolic diffusion constant $D = 1\ \mathrm{\mu m^2/s}$ for all proteins in this model. With this setup, exponential gradients of length scale ∼0.5 μm are formed for R* and W*. We evaluate the proofreading and energetic performance of the model in a compartment of size $L = 10$ μm – a typical length scale in eukaryotic cells (see Appendix 6 for details).
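The derived scales implied by these parameters follow from simple arithmetic, sketched below. The relations $\lambda_S = \sqrt{D/k_p}$, $\tau_D = L^2/D$ and $\beta\Delta\mu = L/\lambda_S$ are taken from the text; the ideal-localization fidelity from Equation 5 is included only as an upper reference and is not the fidelity of the detailed kinase/phosphatase model. Variable names are illustrative.

```python
import math

# Parameters quoted in the text
D = 1.0            # diffusion constant, um^2 / s
k_p = 5.0          # dephosphorylation rate constant, 1 / s
L = 10.0           # compartment size, um
k_off_R = 0.1      # off-rate of the right substrate, 1 / s
k_off_W = 1.0      # off-rate of the wrong substrate, 1 / s

# Derived quantities
lam_S = math.sqrt(D / k_p)         # gradient length scale, um
tau_D = L**2 / D                   # diffusion time scale, s
eta_eq = k_off_W / k_off_R         # equilibrium fidelity
beta_dmu = L / lam_S               # effective driving force in k_B T units
a_R = math.sqrt(tau_D * k_off_R)   # dimensionless number for right complexes
a_W = math.sqrt(tau_D * k_off_W)   # dimensionless number for wrong complexes

# Upper reference: Equation 5 in the ideal localization limit (lambda_S -> 0)
eta_ideal = math.sqrt(eta_eq) * math.sinh(a_W) / math.sinh(a_R)

print(f"lambda_S = {lam_S:.3f} um, tau_D = {tau_D:.0f} s")
print(f"beta * delta_mu = {beta_dmu:.1f}")
print(f"sqrt(tau_D k_off) = {a_R:.2f} (R), {a_W:.2f} (W)")
print(f"ideal-localization fidelity = {eta_ideal:.3g}")
```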

Although not cost-efficient, this setup achieves proofreading in a wide range of regimes. Specifically, it is most effective when the enzyme–substrate binding is slow, in which case the exponential substrate profile is maintained and the system attains the fidelity predicted by our earlier explanatory model (Figure 5b). The system’s proofreading capacity is retained if the first–order on-rate is raised up to $k_{\mathrm{on}}\rho_E \sim 10\ \mathrm{s}^{-1}$, where a roughly 10-fold increase in fidelity is still possible. If the binding rate constant ($k_{\mathrm{on}}$) or the enzyme’s expression level ($\rho_E$) is any higher, then enzymatic reactions overwhelm the ability of the kinase/phosphatase system to keep the active forms of substrates sufficiently localized (Figure 5c) and proofreading is lost. Overall, this model suggests that enzymes can work at reasonable binding rates and still proofread, when accounting for an experimentally characterized gradient maintaining mechanism.


We have outlined a way for enzymatic reactions to proofread and improve specificity by exploiting spatial concentration gradients of substrates. Like the classic model, our proposed spatial proofreading scheme is based on a time delay; but unlike the classic model, here the delay is due to spatial transport rather than transitions through biochemical intermediates. Consequently, the enzyme is liberated from the stringent structural requirements imposed by traditional proofreading, such as multiple intermediate conformations and hydrolysis sites for energy coupling. Instead, our scheme exploits the free energy supplied by active mechanisms that maintain spatial structures.

The decoupling of the two crucial features of proofreading – time delay and free energy dissipation – allows the cell to tune proofreading on the fly. For instance, all proofreading schemes offer fidelity at the expense of reaction speed and energy. For traditional schemes, navigating this trade-off is not always feasible, as it requires structural changes via mutations or modulation of the [ATP]/[ADP] ratio, which can cause collateral effects on the rest of the cell. In contrast, the spatial proofreading scheme is more adaptable to the changing conditions and needs of the cell. The scheme can prioritize speed in one context, and fidelity in another, simply by tuning the length scale of intracellular gradients (e.g. through the regulation of the phosphatase or free enzyme concentration in the scheme discussed earlier).

On the other hand, this modular decoupling can complicate the experimental identification of proofreading enzymes and the interpretation of their fidelity. Here, the enzymes need not be endowed with the structural and biochemical properties typically sought in a proofreading enzyme. At the same time, any attempt to reconstitute enzymatic activity in a well-mixed in vitro assay will show poor fidelity compared to in vivo measurements, even when all necessary molecular players are present in vitro. Therefore, more care is required in studies of cellular information processing mechanisms that hijack a distant source of free energy compared to the case where the relevant energy consumption is local and easier to link causally to function.

While we focused on spatially localized substrates and delocalized enzymes, our framework would apply equally well to other scenarios, like one with a spatially localized enzyme (or its active form [Kalab et al., 2002; Nalbant et al., 2004]) and effector with delocalized substrates, an example of which would be an alternative version of the scheme in Figure 5a where the target of the kinase/phosphatase activity is changed from substrates to enzymes. Our framework can also be extended to signaling cascades, where slightly different phosphatase activities can result in magnified concentration ratios of two competing signaling molecules at the spatial location of the next cascade step (Roy and Cyert, 2009; Bauman and Scott, 2002; Kholodenko, 2006).

The spatial gradients needed for the operation of our model can be created and maintained through multiple mechanisms in the cell, ranging from the kinase/phosphatase system modeled here, to the passive diffusion of substrates/ligands combined with active degradation (e.g. Bicoid and other developmental morphogens), to active transport processes combined with diffusion. A particularly simple implementation of our scheme is via compartmentalization – substrates and effectors are localized in two spatially separated compartments with the enzyme–substrate complex having to travel from one to another to complete the reaction.

Many molecular localization pathways involving the naturally compartmentalized parts of the cell require high substrate selectivity and are therefore potential candidates for the implementation of spatial proofreading. For example, in polarized, asymmetric cells (e.g. budding yeast or neuronal cells) gene expression often needs to be spatially regulated (Parton et al., 2014; Martin and Ephrussi, 2009). Such regulation is achieved with designated ribonucleoproteins that bind specific mRNAs near the cell nucleus, perform a biased random walk to the mRNA localization site and deliver them for translation. During transport, mRNAs are protected from ribosome binding, and when they unbind, they are subject to degradation, which would prevent rebinding events at intermediate locations. Another example process is the non-vesicular transport of lipids between the membrane–bound domains of the cell (e.g. the ER, mitochondria, the Golgi apparatus, or the plasma membrane). This transport mechanism is mediated by lipid-transfer proteins that bind lipids on the donor membrane, diffuse to the acceptor membrane and, upon interacting with it, undergo a conformational change, delivering the ‘cargo’ (Lev, 2010). Although closer proximity of the two membranes is thought to enhance the transport efficiency, it would be interesting to study the optimality of the inter-membrane distance in the context of the fidelity–transport efficiency trade-off, given that some of the lipid-transfer proteins are known to exhibit specificity for their cognate substrates.

Our scheme may also be applicable as a quality control mechanism in protein secretion pathways (Ellgaard and Helenius, 2003; Arvan et al., 2002), in high-fidelity targeting of membrane proteins mediated by signal recognition particles (Rao et al., 2016; Chio et al., 2017), as well as in selective glycosylation reactions in the Golgi apparatus (Jaiman and Thattai, 2020). Lastly, considering the recent advances in generating synthetic morphogen patterns in multicellular organisms (Toda et al., 2020; Stapornwongkul et al., 2020), spatial proofreading could also be employed in pathways acting on engineered protein gradients. Experimental investigations of these processes in light of our work will reveal the extent to which spatial transport promotes specificity.

In conclusion, we have analyzed the role played by spatial structures in endowing enzymatic reactions with kinetic proofreading. Simply by spatially segregating substrate binding from catalysis, enzymes can enhance their specificity. This suggests that enzymatic reactions may acquire de novo proofreading capabilities by coupling to pre-existing spatial gradients in the cell.

Materials and methods

Detailed derivations of the analytical results presented in the main text along with additional studies on our model are included in the Appendices. In addition, Python scripts and Jupyter notebooks used to generate all the plots in the main text and Appendices are included as Supplementary files.

Appendix 1

Analytical calculations of the complex density profile and fidelity

We begin this section by deriving an analytical expression for the density profile of substrate-bound enzymes (ρES(x)) in the case where the ρE(x) ≈ constant assumption holds. Based on this result, we then obtain expressions for fidelity in low, high, and intermediate substrate localization regimes. We reserve the studies of speed and fidelity in the general case of a nonuniform free enzyme profile to Appendix 5.

1. Derivation of the complex density profile ρES(x)

The ordinary differential equation (ODE) that defines the steady state profile of substrate-bound enzymes is

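$$\underbrace{D\,\frac{d^{2}\rho_{ES}}{dx^{2}}}_{\text{diffusion}} \;-\; \underbrace{k_{\text{off}}^{S}\,\rho_{ES}(x)}_{\text{unbinding}} \;+\; \underbrace{k_{\text{on}}\,\rho_{S}(0)\,e^{-x/\lambda_{S}}\,\rho_{E}(x)}_{\text{binding}} \;=\; 0. \tag{S1}$$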

Here, ρS(0) is the substrate density at the leftmost boundary, whose value can be calculated from the condition that the total number of free substrates is Stotal, namely,

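$$S_{\text{total}} = \int_{0}^{L} \rho_{S}(0)\,e^{-x/\lambda_{S}}\,dx = \rho_{S}(0)\,\lambda_{S}\left(1 - e^{-L/\lambda_{S}}\right) \tag{S2}$$
$$\Rightarrow\quad \rho_{S}(0) = \frac{S_{\text{total}}}{\lambda_{S}\left(1 - e^{-L/\lambda_{S}}\right)}. \tag{S3}$$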

In the limit of low substrate amounts where the approximation ρE(x) ≈ constant is valid, Equation S1 represents a linear nonhomogeneous ODE. Hence, its solution can be written as

$$\rho_{ES}(x) = \rho_{ES}^{(h)}(x) + \rho_{ES}^{(p)}(x), \tag{S4}$$

where ρES(h)(x) is the general solution to the corresponding homogeneous equation, while ρES(p)(x) is a particular solution.

Looking for solutions of the form C e^(−x/λ) for the homogeneous part, we find

$$C\left(\frac{D}{\lambda^{2}} - k_{\text{off}}^{S}\right)e^{-x/\lambda} = 0. \tag{S5}$$

The two possible roots for λ are ±√(D/koffS). Calling the positive root λES, which represents the mean distance traveled by the substrate–bound enzyme before releasing the substrate, we can write the general solution to the homogeneous part of Equation S1 as

$$\rho_{ES}^{(h)}(x) = C_{1}\,e^{-x/\lambda_{ES}} + C_{2}\,e^{x/\lambda_{ES}}, \tag{S6}$$

where C1 and C2 are constants which will be determined from the boundary conditions.

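Since the nonhomogeneous part of Equation S1 is a scaled exponential, we look for a particular solution of the same functional form, namely, ρES(p)(x) = Cp e^(−x/λS). Substituting this form into the ODE, we obtain

$$C_{p}\left(\frac{D}{\lambda_{S}^{2}} - k_{\text{off}}^{S}\right)e^{-x/\lambda_{S}} = -k_{\text{on}}\,\rho_{S}(0)\,e^{-x/\lambda_{S}}\,\rho_{E}. \tag{S7}$$

The constant coefficient Cp can then be found as

$$C_{p} = \frac{k_{\text{on}}\,\rho_{S}(0)\,\rho_{E}}{k_{\text{off}}^{S} - D/\lambda_{S}^{2}} = \frac{k_{\text{on}}\,\rho_{S}(0)\,\rho_{E}}{k_{\text{off}}^{S}\left(1 - \dfrac{D/k_{\text{off}}^{S}}{\lambda_{S}^{2}}\right)} = \frac{k_{\text{on}}\,\rho_{S}(0)\,\rho_{E}}{k_{\text{off}}^{S}\left(1 - \lambda_{ES}^{2}/\lambda_{S}^{2}\right)}, \tag{S8}$$

where we have used the equality λES = √(D/koffS).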

Now, to find the unknown coefficients C1 and C2, we impose the no-flux boundary conditions for the density ρES(x) at the left and right boundaries of the compartment, namely,

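$$\left.\frac{d\rho_{ES}}{dx}\right|_{x=0} = -\frac{C_{1}}{\lambda_{ES}} + \frac{C_{2}}{\lambda_{ES}} - \frac{C_{p}}{\lambda_{S}} = 0, \tag{S9}$$
$$\left.\frac{d\rho_{ES}}{dx}\right|_{x=L} = -\frac{C_{1}}{\lambda_{ES}}\,e^{-L/\lambda_{ES}} + \frac{C_{2}}{\lambda_{ES}}\,e^{L/\lambda_{ES}} - \frac{C_{p}}{\lambda_{S}}\,e^{-L/\lambda_{S}} = 0. \tag{S10}$$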

Note that we did not take into account the product formation flux at the rightmost boundary when writing Equation S10 in order to simplify our calculations. This is justified in the limit of slow catalysis – an assumption that we make in our treatment. The above system of two equations can then be solved for C1 and C2, yielding

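$$C_{1} = -\frac{\lambda_{ES}}{2\lambda_{S}}\,\frac{e^{L/\lambda_{ES}} - e^{-L/\lambda_{S}}}{\sinh(L/\lambda_{ES})}\,C_{p}, \tag{S11}$$
$$C_{2} = \frac{\lambda_{ES}}{2\lambda_{S}}\,\frac{e^{-L/\lambda_{S}} - e^{-L/\lambda_{ES}}}{\sinh(L/\lambda_{ES})}\,C_{p}. \tag{S12}$$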

With the constant coefficients known, we obtain the general solution for the complex profile as

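$$\begin{aligned}
\rho_{ES}(x) &= C_{1}\,e^{-x/\lambda_{ES}} + C_{2}\,e^{x/\lambda_{ES}} + C_{p}\,e^{-x/\lambda_{S}} \\
&= C_{p}\left(\frac{\lambda_{ES}}{\lambda_{S}\sinh(L/\lambda_{ES})}\left[-\frac{e^{(L-x)/\lambda_{ES}} + e^{(x-L)/\lambda_{ES}}}{2} + \frac{e^{x/\lambda_{ES}} + e^{-x/\lambda_{ES}}}{2}\,e^{-L/\lambda_{S}}\right] + e^{-x/\lambda_{S}}\right) \\
&= \frac{k_{\text{on}}\,\rho_{S}(0)\,\rho_{E}}{k_{\text{off}}^{S}\left(1 - \lambda_{ES}^{2}/\lambda_{S}^{2}\right)}\left(\frac{\lambda_{ES}}{\lambda_{S}\sinh(L/\lambda_{ES})}\left[-\cosh\!\left(\frac{L-x}{\lambda_{ES}}\right) + \cosh\!\left(\frac{x}{\lambda_{ES}}\right)e^{-L/\lambda_{S}}\right] + e^{-x/\lambda_{S}}\right).
\end{aligned} \tag{S13}$$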
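As a quick numerical sanity check of Equation S13, the sketch below evaluates the profile and confirms by finite differences that it satisfies both no-flux boundary conditions and Equation S1. It is a minimal illustration, not the authors' supplementary code: the function names, the `prefactor` argument (standing for kon ρS(0) ρE/koffS) and the parameter values are assumptions made here, and the signs inside the bracket are those obtained by solving Equations S9 and S10.

```python
import math

def complex_profile(x, L, lam_S, lam_ES, prefactor=1.0):
    """Complex density rho_ES(x) from Equation S13.

    prefactor stands for k_on * rho_S(0) * rho_E / k_off (illustrative name).
    The expression is singular when lam_ES == lam_S.
    """
    c_p = prefactor / (1.0 - (lam_ES / lam_S) ** 2)
    bracket = (-math.cosh((L - x) / lam_ES)
               + math.cosh(x / lam_ES) * math.exp(-L / lam_S))
    return c_p * (lam_ES / (lam_S * math.sinh(L / lam_ES)) * bracket
                  + math.exp(-x / lam_S))

def slope(x, L, lam_S, lam_ES, h=1e-4):
    """Central-difference estimate of d(rho_ES)/dx."""
    return (complex_profile(x + h, L, lam_S, lam_ES)
            - complex_profile(x - h, L, lam_S, lam_ES)) / (2.0 * h)

def ode_residual(x, L, lam_S, lam_ES, h=1e-3):
    """Left-hand side of Equation S1 divided by k_off, for prefactor = 1."""
    curvature = (complex_profile(x + h, L, lam_S, lam_ES)
                 - 2.0 * complex_profile(x, L, lam_S, lam_ES)
                 + complex_profile(x - h, L, lam_S, lam_ES)) / h ** 2
    return (lam_ES ** 2 * curvature
            - complex_profile(x, L, lam_S, lam_ES)
            + math.exp(-x / lam_S))
```

With, for example, L = 1, λS = 0.3 and λES = 0.5, the slopes at x = 0 and x = L and the residual at interior points should all vanish to within finite-difference error.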

2. Density profile in low and high substrate localization regimes

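If substrate localization is very poor (λS ≫ L), the substrate distribution will be uniform (ρS(x) = ρ̄S = Stotal/L), resulting in a similarly flat profile of enzyme–substrate complexes with their density ρ̄ES given by

$$\bar{\rho}_{ES} = \frac{k_{\text{on}}\,\rho_{S}(0)\,\rho_{E}}{k_{\text{off}}^{S}} = \frac{k_{\text{on}}\,\bar{\rho}_{S}\,\rho_{E}}{k_{\text{off}}^{S}}. \tag{S14}$$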

This is the expected equilibrium result where the complex concentration is inversely proportional to the dissociation constant (koffS/kon).

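In the opposite limit where the substrates are highly localized (λS ≪ λES, L and ρS(0) ≈ Stotal/λS from Equation S3), the complex density profile simplifies into

$$\begin{aligned}
\rho_{ES}(x) &\approx \frac{k_{\text{on}}\,S_{\text{total}}\,\rho_{E}}{k_{\text{off}}^{S}\,\lambda_{S}\left(-\lambda_{ES}^{2}/\lambda_{S}^{2}\right)}\left(-\frac{\lambda_{ES}}{\lambda_{S}\sinh(L/\lambda_{ES})}\cosh\!\left(\frac{L-x}{\lambda_{ES}}\right)\right) \\
&= \frac{k_{\text{on}}\,S_{\text{total}}\,\rho_{E}}{k_{\text{off}}^{S}\,L}\,\frac{L/\lambda_{ES}}{\sinh(L/\lambda_{ES})}\cosh\!\left(\frac{L-x}{\lambda_{ES}}\right) \\
&= \bar{\rho}_{ES} \times \frac{L/\lambda_{ES}}{\sinh(L/\lambda_{ES})}\cosh\!\left(\frac{L-x}{\lambda_{ES}}\right).
\end{aligned} \tag{S15}$$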

The x-dependence through the cosh() function suggests that the complex density is the highest at the leftmost boundary and lowest at the rightmost boundary, with the degree of complex localization dictated by the length scale parameter λES. Notably, this localization of complexes does not alter their total number, since the average complex density is conserved, that is,

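$$\langle\rho_{ES}\rangle = \frac{1}{L}\int_{0}^{L}\rho_{ES}(x)\,dx = \bar{\rho}_{ES} \times \frac{L/\lambda_{ES}}{\sinh(L/\lambda_{ES})} \times \frac{1}{L}\int_{0}^{L}\cosh\!\left(\frac{L-x}{\lambda_{ES}}\right)dx = \bar{\rho}_{ES} \times \frac{L/\lambda_{ES}}{\sinh(L/\lambda_{ES})} \times \frac{\lambda_{ES}}{L}\sinh(L/\lambda_{ES}) = \bar{\rho}_{ES}. \tag{S16}$$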

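Equation S15 for the complex profile can be alternatively written in terms of the diffusion time scale τD = L²/D and the substrate off-rate koffS. Noting that L/λES = √(L² koffS/D) = √(τD koffS) and introducing a dimensionless coordinate x̃ = x/L, we find

$$\rho_{ES}(x) = \bar{\rho}_{ES} \times \frac{\sqrt{\tau_{D} k_{\text{off}}^{S}}}{\sinh\!\left(\sqrt{\tau_{D} k_{\text{off}}^{S}}\right)}\cosh\!\left(\sqrt{\tau_{D} k_{\text{off}}^{S}}\,(1-\tilde{x})\right). \tag{S17}$$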

The above equation is what was used for generating the plots in Figure 3b of the main text for different choices of the diffusion time scale.
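The normalized profile of Equation S17 is easy to evaluate directly. The sketch below, with illustrative function names and a simple midpoint rule chosen here for convenience, also checks the conservation statement of Equation S16, namely that the spatial average of ρES(x)/ρ̄ES equals 1 regardless of the diffusion time scale.

```python
import math

def ideal_profile(x_tilde, tau_k):
    """rho_ES(x) / rho_bar_ES from Equation S17.

    x_tilde = x / L is the dimensionless coordinate and
    tau_k = tau_D * k_off is the dimensionless diffusion time.
    """
    a = math.sqrt(tau_k)
    return a / math.sinh(a) * math.cosh(a * (1.0 - x_tilde))

def mean_profile(tau_k, n=2000):
    """Midpoint-rule average of the normalized profile over x_tilde in [0, 1]."""
    return sum(ideal_profile((i + 0.5) / n, tau_k) for i in range(n)) / n
```

For instance, `mean_profile(25.0)` should stay close to 1 even though `ideal_profile(1.0, 25.0)` is far below `ideal_profile(0.0, 25.0)`.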

3. Fidelity in low and high substrate localization regimes

Let us now evaluate the fidelity of the model in the two limiting regimes discussed earlier. In the poor substrate localization case, which corresponds to an equilibrium setting, the fidelity can be found from Equation S14 as

$$\eta_{\text{eq}} = \frac{r\,\bar{\rho}_{ER}}{r\,\bar{\rho}_{EW}} = \frac{k_{\text{off}}^{W}}{k_{\text{off}}^{R}}, \tag{S18}$$

where we have employed the assumption about the right and wrong substrates having identical density profiles. This is the expected result for equilibrium discrimination where no advantage is taken of the system’s spatial structure.

In the regime with high substrate localization, the enzyme–substrate complexes have a nonuniform spatial distribution. What matters for product formation is the complex density at the rightmost boundary (x~=1), which we obtain from Equation S17 as

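$$\rho_{ES}(L) = \bar{\rho}_{ES} \times \frac{\sqrt{\tau_{D} k_{\text{off}}^{S}}}{\sinh\!\left(\sqrt{\tau_{D} k_{\text{off}}^{S}}\right)}. \tag{S19}$$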

Substituting the above expression written for right and wrong complexes into the definition of fidelity, we find

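$$\eta = \frac{r\,\rho_{ER}(L)}{r\,\rho_{EW}(L)} = \eta_{\text{eq}} \times \sqrt{\frac{k_{\text{off}}^{R}}{k_{\text{off}}^{W}}}\;\frac{\sinh\!\left(\sqrt{\tau_{D} k_{\text{off}}^{W}}\right)}{\sinh\!\left(\sqrt{\tau_{D} k_{\text{off}}^{R}}\right)} = \sqrt{\eta_{\text{eq}}}\;\frac{\sinh\!\left(\sqrt{\tau_{D} k_{\text{off}}^{W}}\right)}{\sinh\!\left(\sqrt{\tau_{D} k_{\text{off}}^{R}}\right)}. \tag{S20}$$

This is the result reported in Equation 5 of the main text. To gain more intuition about it and draw parallels with traditional kinetic proofreading, let us consider the limit of long diffusion time scales where proofreading is the most effective. In this limit, the hyperbolic sine functions above can be approximated as sinh(√(τD koffS)) ≈ 0.5 e^(√(τD koffS)), simplifying the fidelity expression into

$$\eta = \sqrt{\eta_{\text{eq}}}\;\frac{e^{\sqrt{\tau_{D} k_{\text{off}}^{W}}}}{e^{\sqrt{\tau_{D} k_{\text{off}}^{R}}}} = \sqrt{\eta_{\text{eq}}}\;e^{\sqrt{\tau_{D} k_{\text{off}}^{W}} - \sqrt{\tau_{D} k_{\text{off}}^{R}}} = \sqrt{\eta_{\text{eq}}}\;e^{\sqrt{\tau_{D} k_{\text{off}}^{R}}\left(\sqrt{\eta_{\text{eq}}} - 1\right)}, \tag{S21}$$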

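where we have used the definition of equilibrium fidelity (Equation S18). In traditional proofreading, a scheme with n proofreading realizations can yield a maximum fidelity of η/ηeq = ηeq^n. The value of n for the original Hopfield model, for instance, is 1. It would be informative to also know the effective parameter n for the spatial proofreading model. Dividing Equation S21 by ηeq, we find

$$\begin{aligned}
\frac{\eta}{\eta_{\text{eq}}} = \frac{1}{\sqrt{\eta_{\text{eq}}}}\;e^{\sqrt{\tau_{D} k_{\text{off}}^{R}}\left(\sqrt{\eta_{\text{eq}}} - 1\right)} = \eta_{\text{eq}}^{\,n}
\quad&\Rightarrow\quad e^{\sqrt{\tau_{D} k_{\text{off}}^{R}}\left(\sqrt{\eta_{\text{eq}}} - 1\right)} = \eta_{\text{eq}}^{\,n + \frac{1}{2}} \\
&\Rightarrow\quad \sqrt{\tau_{D} k_{\text{off}}^{R}}\left(\sqrt{\eta_{\text{eq}}} - 1\right) = \left(n + \tfrac{1}{2}\right)\ln\eta_{\text{eq}} \\
&\Rightarrow\quad n + \tfrac{1}{2} = \frac{\sqrt{\eta_{\text{eq}}} - 1}{\ln\eta_{\text{eq}}}\,\sqrt{\tau_{D} k_{\text{off}}^{R}}.
\end{aligned} \tag{S22}$$

This exact result can be simplified into an approximate form when diffusion is slow and ηeq ≫ 1, yielding the expression reported in Equation 7 of the main text, namely,

$$n \approx \frac{\sqrt{\eta_{\text{eq}}}\,\sqrt{\tau_{D} k_{\text{off}}^{R}}}{\ln\eta_{\text{eq}}} = \frac{\sqrt{\tau_{D} k_{\text{off}}^{W}}}{\ln\eta_{\text{eq}}}. \tag{S23}$$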
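To see how these expressions relate numerically, the following sketch evaluates the fidelity of Equation S20 together with the effective number of proofreading realizations from Equations S22 and S23. The function names and the test values are illustrative choices made here; `tau_kR` denotes the dimensionless product τD koffR.

```python
import math

def fidelity_ideal(tau_kR, eta_eq):
    """Fidelity from Equation S20 (ideal substrate localization).

    Uses math.sinh, which overflows for arguments above roughly 710.
    """
    a_R = math.sqrt(tau_kR)
    a_W = math.sqrt(tau_kR * eta_eq)      # since k_off^W = eta_eq * k_off^R
    return math.sqrt(eta_eq) * math.sinh(a_W) / math.sinh(a_R)

def n_from_fidelity(tau_kR, eta_eq):
    """Solve eta / eta_eq = eta_eq**n for n, using Equation S20 directly."""
    return math.log(fidelity_ideal(tau_kR, eta_eq) / eta_eq) / math.log(eta_eq)

def n_slow_diffusion(tau_kR, eta_eq):
    """Equation S22, valid once sinh(z) is well approximated by exp(z)/2."""
    return (math.sqrt(eta_eq) - 1.0) / math.log(eta_eq) * math.sqrt(tau_kR) - 0.5

def n_large_eta(tau_kR, eta_eq):
    """Equation S23, the further simplification for eta_eq >> 1."""
    return math.sqrt(eta_eq * tau_kR) / math.log(eta_eq)
```

In the fast-diffusion limit (`tau_kR` close to zero) `fidelity_ideal` falls back to ηeq, while for slow diffusion `n_from_fidelity` and `n_slow_diffusion` agree; `n_large_eta` only becomes accurate when ηeq is large.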

4. Fidelity in an intermediate substrate localization regime

The generic expression for complex density at the rightmost boundary (x=L) can be written using Equation S13 as

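$$\rho_{ES}(L) = \frac{k_{\text{on}}\,\rho_{S}(0)\,\rho_{E}}{k_{\text{off}}^{S}\left(1 - \lambda_{ES}^{2}/\lambda_{S}^{2}\right)}\left(\frac{\lambda_{ES}}{\lambda_{S}\sinh(L/\lambda_{ES})}\left[\cosh\!\left(\frac{L}{\lambda_{ES}}\right)e^{-L/\lambda_{S}} - 1\right] + e^{-L/\lambda_{S}}\right). \tag{S24}$$

For the system to proofread, substrates need to be sufficiently localized (λS<L) and diffusion needs to be sufficiently slow (τDkoffS>1 or, λES<L). Under these conditions, the substrate profile can be approximated using Equation S3 as ρS(x) ≈ λS⁻¹ Stotal e^(−x/λS), while the hyperbolic sine and cosine functions used above can be approximated as sinh(L/λES) ≈ cosh(L/λES) ≈ 0.5 e^(L/λES). With these approximations, the complex density expression simplifies into

$$\begin{aligned}
\rho_{ES}(L) &= \frac{k_{\text{on}}\,S_{\text{total}}\,\rho_{E}}{k_{\text{off}}^{S}\,\lambda_{S}\left(1 - \lambda_{ES}^{2}/\lambda_{S}^{2}\right)}\left(\frac{\lambda_{ES}}{\lambda_{S}}\left[e^{-L/\lambda_{S}} - 2e^{-L/\lambda_{ES}}\right] + e^{-L/\lambda_{S}}\right) \\
&= \frac{k_{\text{on}}\,S_{\text{total}}\,\rho_{E}}{k_{\text{off}}^{S}\left(\lambda_{S}^{2} - \lambda_{ES}^{2}\right)}\left((\lambda_{S} + \lambda_{ES})\,e^{-L/\lambda_{S}} - 2\lambda_{ES}\,e^{-L/\lambda_{ES}}\right).
\end{aligned} \tag{S25}$$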

Now, depending on how λS compares with λES, there can be two qualitatively different regimes for the complex density, namely,

$$\rho_{ES}(L) = \bar{\rho}_{ES} \times \begin{cases} \dfrac{2L}{\lambda_{ES}}\,e^{-L/\lambda_{ES}}, & \text{if } \lambda_{S} \ll \lambda_{ES} \;\;\left(L/\lambda_{S} \gg \sqrt{\tau_{D} k_{\text{off}}^{S}}\right) \\[8pt] \dfrac{L}{\lambda_{S}}\,e^{-L/\lambda_{S}}, & \text{if } \lambda_{ES} \ll \lambda_{S} \;\;\left(\sqrt{\tau_{D} k_{\text{off}}^{S}} \gg L/\lambda_{S}\right) \end{cases} \tag{S26}$$

where we used the equilibrium complex density ρ̄ES defined in Equation S14.

Notably, the first regime effectively corresponds to the case of ideal substrate localization where complex density is independent of the precise value of λS. The dimensionless number √(τD koffS) sets the scale for the minimum L/λS value beyond which ideal localization can be assumed. Conversely, the second regime corresponds to the case where the distance traveled by a complex before dissociating is so short that the complex profile is dictated by the substrate profile itself. Because of that, the complex density reduction from its equilibrium limit is independent of the precise values of τD and koffS, as long as the condition λES ≪ λS is met.

The scheme yields its highest fidelity when both right and wrong complex densities are in the first regime (ideal localization). When both densities are in the second regime, fidelity is reduced down to its equilibrium value ηeq (Appendix 1—table 1). The transition between these two extremes happens when the density profiles of right and wrong complexes fall under different regimes. Fidelity can be navigated in the transition zone by tuning the substrate gradient length scale λS. This is demonstrated in Appendix 1—figure 1 for three different choices of ηeq. In all three cases, the dimensionless numbers √(τD koffR) and √(τD koffW) set the approximate range in which the bulk of fidelity enhancement occurs, as stated in the main text.
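The two regimes of Equation S26 can be compared against the full expression of Equation S24 with a few lines of code. The sketch below is an illustration under assumptions made here: lengths are measured in units of L, densities in units of ρ̄ES, and the function names are hypothetical. The exponentials grow quickly, so very large values of τD koffS would overflow `math.sinh` and `math.cosh`.

```python
import math

def density_at_L(L_over_lam_S, tau_k):
    """rho_ES(L) / rho_bar_ES from Equation S24, with L = 1.

    L_over_lam_S is the degree of substrate localization and
    tau_k = tau_D * k_off, so that lam_ES = 1 / sqrt(tau_k).
    """
    lam_S = 1.0 / L_over_lam_S
    lam_ES = 1.0 / math.sqrt(tau_k)
    rho_S0 = 1.0 / (lam_S * (1.0 - math.exp(-1.0 / lam_S)))  # rho_S(0) / rho_bar_S, Eq. S3
    bracket = math.cosh(1.0 / lam_ES) * math.exp(-1.0 / lam_S) - 1.0
    shape = (lam_ES / (lam_S * math.sinh(1.0 / lam_ES)) * bracket
             + math.exp(-1.0 / lam_S))
    return rho_S0 / (1.0 - (lam_ES / lam_S) ** 2) * shape

def fidelity(L_over_lam_S, tau_kR, eta_eq):
    """eta = eta_eq * [rho_ER(L)/rho_bar_ER] / [rho_EW(L)/rho_bar_EW]."""
    right = density_at_L(L_over_lam_S, tau_kR)
    wrong = density_at_L(L_over_lam_S, tau_kR * eta_eq)
    return eta_eq * right / wrong
```

With ηeq = 10, `fidelity(5, 1e4, 10)` lands near the equilibrium value because both complexes are in the second regime, whereas `fidelity(500, 25, 10)` approaches the ideal-localization result of Equation S20.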

Appendix 1—table 1
Fidelity of the scheme in different regimes of right and wrong complex densities.

The upper-right cell is empty because the two conditions on λS cannot be simultaneously met, since λER>λEW by construction (follows from koffR<koffW).

Appendix 1—figure 1
The effective number of proofreading realizations (neff) as a function of L/λS.

The shaded region represents the range of L/λS values set by the key dimensionless numbers √(τD koffR) and √(τD koffW). τD values chosen for the demonstration were 60, 40, and 20 (in 1/koffR units) for the three different choices of ηeq, respectively.

5. Optimal diffusion time scale for maximum fidelity

Figure 3c of the main text illustrated the non-monotonic dependence of fidelity on the diffusion time scale τD for different fixed values of λS. Here, we further explore this feature by asking what sets the optimal τD. To gain analytical insights, we focus on the case where the system can proofread, which, as we argued in the previous section, happens when λS,λES<L. Under this condition, we identified two qualitatively different regimes of complex density reduction (Equation S26). Namely, we found that for sufficiently fast diffusion the system acted as if the substrates were localized ideally, whereas for sufficiently slow diffusion the complex density reduction was dictated solely by λS and did not discriminate between the two substrate kinds. These two limiting behaviors are indeed reflected in Figure 3c where in the low τD limit (fast diffusion) the family of curves matches the dotted ideal localization curve, while in the high τD limit (slow diffusion) all curves decay to 1, corresponding to the loss of error correction.

An intuitive approach for identifying the optimal τD is to slow down diffusion up to the point where the density of wrong complexes at x=L approaches a plateau and effectively stops decreasing. Going past this threshold would only reduce the density of right complexes at x=L and thereby, reduce the fidelity. We know from Equation S26 that plateauing for wrong complexes happens when λEW ≪ λS (equivalently, √(τD koffW) ≫ L/λS). Hence, our first guess for the optimal diffusion time scale τD* is

$$\sqrt{\tau_{D}^{*}\,k_{\text{off}}^{W}} \approx \frac{L}{\lambda_{S}} \quad\Rightarrow\quad \tau_{D}^{*} \approx \frac{1}{k_{\text{off}}^{W}}\left(\frac{L}{\lambda_{S}}\right)^{2}. \tag{S29}$$


To test the soundness of this expression, we compared its predictions to the optimal τD values in Figure 3 that were identified numerically for different choices of λS. The results of the comparison are shown in Appendix 1—figure 2. As can be seen, for sufficiently high degrees of substrate localization (L/λS), the prediction of Equation S29 provides a good approximation of the true optimum. However, it is apparent that the prediction consistently underestimates the true τD*, which was expected since plateauing of ρEW(L) happens not under equality but a strict inequality condition (i.e. √(τD* koffW) ≫ L/λS). Because an exact analytical expression for τD* is not available, we performed different approximations to the fidelity formula and found an empirical correction term for our earlier estimate given by 2(L/λS)/ηeq. The prediction for τD* with the correction term is now accurate starting from a much lower value of L/λS, corresponding to a regime where the system proofreads once (neff ≈ 1). Overall, these analytical results provide good initial guesses for τD*, which should be refined using a numerical approach for higher accuracy.

Appendix 1—figure 2
Optimal diffusion time scale for different choices of λS.

Blue dots represent the exact values obtained numerically for the data in Figure 3c. Dashed and solid lines represent the analytical estimates with and without the correction term. Vertical lines correspond to those values of L/λS that yield an integer number of effective proofreading realizations.

Appendix 2

Energetics of the scheme

We start this section by deriving an analytical expression for the minimum dissipated power, which was used in making Figure 4 of the main text. Then, we calculate the upper limit on fidelity enhancement available to our model for a finite substrate gradient length scale and compare this limit with the fundamental thermodynamic bound. We end the section by providing an estimate for the baseline cost of setting up gradients and compare this cost with the maintenance cost reported in the main text. Similar to our treatment of Appendix 1, here too our calculations are based on the ρE ≈ constant assumption to allow for intuitive analytical results.

1. Derivation of the minimum dissipated power

As stated in the main text, we calculate the minimum rate of energy dissipation necessary for maintaining the substrate profiles as

$$P = \sum_{S=R,W}\int_{0}^{L} j_{S}(x)\,\mu(x)\,dx, \tag{S30}$$

where jS(x) = kon ρS(x) ρE − koffS ρES(x) is the net local substrate binding flux and μ(x) = μ(0) + kBT ln(ρS(x)/ρS(0)) = μ(0) − kBT x/λS is the local chemical potential.

Our choice for the expression of power at steady state is motivated by the fact that the enzyme transport is passive and therefore, energy needs to be spent only on counteracting the local binding/unbinding events that tend to homogenize the substrate profile. To demonstrate the validity of our proposed expression more formally, we invoke the standard approaches for calculating power (Hill, 1977; Zhang et al., 2012). In particular, for a system that is described through discrete states with transition rates kij between them, the rate of energy dissipation at steady state is given by

$$P = k_{B}T\sum_{i>j}\left(J_{ij} - J_{ji}\right)\ln\frac{k_{ij}}{k_{ji}}, \tag{S31}$$

where Jij is the flux from state i into state j. We note here that a similar expression for the rate of total entropy production involves a ln(Jij/Jji) term (statistical forces) instead of the ln(kij/kji) term (deterministic driving forces). At steady state, however, these two expressions are mathematically equivalent (Zhang et al., 2012). Our choice for Equation S31 stems from the better physical intuition that it provides in our context.

So far, the description of our system has been in terms of continuous density functions. To apply Equation S31 for calculating power, we consider the discrete-state representation of enzyme dynamics shown in Appendix 2—figure 1. There, space is discretized into intervals of size δx and diffusion is represented through jumps between neighboring sites with a rate D/δx². What keeps the system out of equilibrium is the spatially varying substrate profile ρS(x).

Appendix 2—figure 1
Discrete-state representation of diffusive transport and substrate binding/unbinding events.

Transparent clusters of different numbers of substrates illustrate the spatial variation of substrate concentration.

Because forward and backward diffusive transitions have identical rates, according to Equation S31 they will not contribute to energy dissipation (since ln(1)=0). The contribution from the remaining substrate binding/unbinding events can then be written as

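$$P = k_{B}T\sum_{S=R,W}\sum_{i}\left(k_{\text{on}}\,\rho_{S}(x_{i}) \times \delta n_{i}^{E} - k_{\text{off}}^{S} \times \delta n_{i}^{ES}\right)\ln\frac{k_{\text{on}}\,\rho_{S}(x_{i})}{k_{\text{off}}^{S}}, \tag{S32}$$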

where δniE=ρEδx and δniES=ρES(xi)δx are the numbers of free and substrate–bound enzymes, respectively, in the [xi,xi+δx] interval. In the limit of a large number of discrete spatial intervals, the sum over i in Equation S32 can be rewritten as an integral over the coordinate x, namely,

$$P = k_{B}T\sum_{S=R,W}\int_{x=0}^{L}\underbrace{\left(k_{\text{on}}\,\rho_{S}(x)\,\rho_{E} - k_{\text{off}}^{S}\,\rho_{ES}(x)\right)}_{j_{S}(x)}\ln\frac{k_{\text{on}}\,\rho_{S}(x)}{k_{\text{off}}^{S}}\,dx. \tag{S33}$$

Comparing the form of Equation S33 to that of Equation S30 (with μ(x) substituted), one can notice a difference in the terms that multiply jS(x). Specifically, in Equation S30 we have μ(x) = μ(0) − kBT ln ρS(0) + kBT ln ρS(x) while the corresponding term in Equation S33 is kBT ln(kon/koffS) + kBT ln ρS(x). The difference between them, however, is in the parts that do not depend on x, while the spatially varying parts (namely, the kBT ln ρS(x) contributions) are identical. Now, since the number of bound complexes is constant at steady state, we have ∫_0^L jS(x) dx = 0. This means that the x-independent parts discussed earlier all integrate to zero, making the power estimates by Equation S30 and Equation S33 identical, thereby justifying our proposed expression.

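To estimate power, we substitute the analytical expression for ρES(x) found earlier (Equation S13) into jS(x) and, after performing a somewhat cumbersome integral, obtain

$$\beta P = J_{\text{bind}}\sum_{S=R,W}\frac{1}{1 - \lambda_{S}^{2}/\lambda_{ES}^{2}}\left(\frac{\lambda_{ES}}{\lambda_{S}}\,\frac{\tanh(L/2\lambda_{ES})}{\tanh(L/2\lambda_{S})} - 1\right), \tag{S34}$$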

where β⁻¹ = kBT, and Jbind = kon Stotal ρE is the net binding rate of each substrate. Figure 4 in the main text was made using this expression for power.
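Equation S34 can be cross-checked by integrating Equation S30 numerically. The sketch below does this for a single substrate type, in units where L = 1 and Jbind = 1. It is an illustration under assumptions made here: the function names are hypothetical, Simpson's rule stands in for the analytical integral, and the constant μ(0) is dropped because the net binding flux integrates to zero at steady state.

```python
import math

def binding_flux(x, lam_S, lam_ES, L=1.0):
    """Net local binding flux j_S(x) / J_bind, using the profile of Equation S13."""
    norm = 1.0 / (lam_S * (1.0 - math.exp(-L / lam_S)))   # rho_S(0) / S_total, Eq. S3
    a = lam_ES / (lam_S * math.sinh(L / lam_ES))
    shape = (a * (-math.cosh((L - x) / lam_ES)
                  + math.cosh(x / lam_ES) * math.exp(-L / lam_S))
             + math.exp(-x / lam_S))
    bound = shape / (1.0 - (lam_ES / lam_S) ** 2)          # unbinding term
    return norm * (math.exp(-x / lam_S) - bound)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def power_numeric(lam_S, lam_ES, L=1.0):
    """beta * P / J_bind from Equation S30 with beta * mu(x) = -x / lam_S."""
    return simpson(lambda x: binding_flux(x, lam_S, lam_ES, L) * (-x / lam_S), 0.0, L)

def power_closed_form(lam_S, lam_ES, L=1.0):
    """Single-substrate term of Equation S34, in units of J_bind."""
    ratio = (lam_ES / lam_S) * math.tanh(L / (2 * lam_ES)) / math.tanh(L / (2 * lam_S))
    return (ratio - 1.0) / (1.0 - (lam_S / lam_ES) ** 2)
```

For a choice such as λS = 0.2 and λES = 0.5 the two estimates should coincide to numerical precision, and the integral of `binding_flux` alone should vanish.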

To get additional insights about this result, let us consider the case where substrates are highly localized (λS ≪ L) and diffusion is slow (λES ≪ L) – conditions needed for effective proofreading. Under these conditions, the hyperbolic tangent terms approach 1 and the expression for the power expenditure simplifies into

$$\beta P = J_{\text{bind}}\sum_{S=R,W}\frac{\lambda_{ES}^{2}}{\lambda_{S}\left(\lambda_{ES} + \lambda_{S}\right)}. \tag{S35}$$

The monotonic increase of power with λES suggests that energy is primarily spent on maintaining the concentration gradient of right substrates. This is not surprising, since typically right complexes travel a much greater distance into the low concentration region of the compartment before releasing the bound substrate (i.e. λER ≫ λEW). Therefore, neglecting the contribution from wrong substrates and considering the range of λS values where the bulk of power–fidelity trade-off takes place (λER>λS>λEW), we further simplify the power expression into

$$\beta P \approx J_{\text{bind}}\,\frac{\lambda_{ER}}{\lambda_{S}} = J_{\text{bind}}\,\frac{\beta\Delta\mu}{\sqrt{\tau_{D} k_{\text{off}}^{R}}}, \tag{S36}$$

where we used the identities βΔμ = L/λS and λER = L/√(τD koffR). This simple linear relation suggests that in order to maintain the exponential substrate profile, the minimum energy spent per substrate binding event should be at least P/Jbind ≈ kBT λER/λS > 1 kBT (since λER>λS).

We can also use Equation S36 to estimate the minimum dissipation per substrate binding event at λS ≈ λEW where the logarithmic power–fidelity scaling regime ends (see Figure 4 of the main text). Substituting the value of λS, we obtain βP/Jbind ≈ (λER/λEW) = √ηeq, which is the result illustrated in Figure 4.

2. Limits on fidelity enhancement

The error reduction capacity of the spatial proofreading scheme improves with a greater difference in substrate off-rates, as was demonstrated in Figure 2 of the main text. At the same time, Figure 3c showed that the finite length scale of substrate localization (or, finite driving force) sets an upper limit on fidelity enhancement for substrates with fixed off-rates. It is therefore of interest to consider these two features together to find the absolute limit on fidelity enhancement available to our model and then compare it with the fundamental bound set by thermodynamics.

Intuitively, fidelity will be enhanced the most if the density of right complexes does not decay across the compartment, while that of wrong complexes decays maximally. The first condition can be met if diffusion is fast or if the unbinding rate of right substrates is low, in which case we have

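$$\rho_{ER}(L) \approx \bar{\rho}_{ER}, \tag{S37}$$

where ρ̄ER is the equilibrium density of right complexes. Conversely, when the unbinding rate of wrong substrates is very large, the density of wrong complexes is maximally reduced at the rightmost boundary and can be obtained from Equation S24 by taking the λES → 0 limit, namely,

$$\rho_{EW}(L) \approx \frac{k_{\text{on}}\,\rho_{E}\,\rho_{S}(0)\,e^{-L/\lambda_{S}}}{k_{\text{off}}^{W}} = \frac{k_{\text{on}}\,\rho_{E}\,S_{\text{total}}\,e^{-L/\lambda_{S}}}{\lambda_{S}\left(1 - e^{-L/\lambda_{S}}\right)k_{\text{off}}^{W}} = \frac{k_{\text{on}}\,\rho_{E}\,S_{\text{total}}}{k_{\text{off}}^{W}\,L} \times \frac{L\,e^{-L/\lambda_{S}}}{\lambda_{S}\left(1 - e^{-L/\lambda_{S}}\right)} = \bar{\rho}_{EW} \times \frac{\beta\Delta\mu\,e^{-\beta\Delta\mu}}{1 - e^{-\beta\Delta\mu}}. \tag{S38}$$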

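Here, ρ̄EW is the equilibrium density of wrong complexes, and βΔμ = L/λS is the effective driving force of the scheme. Taking the ratio of Equations S37 and S38, we obtain the largest fidelity enhancement of the scheme for the given driving force, namely,

$$\eta^{\max} = \frac{\rho_{ER}(L)}{\rho_{EW}(L)} = \underbrace{\frac{\bar{\rho}_{ER}}{\bar{\rho}_{EW}}}_{\eta_{\text{eq}}} \times \frac{e^{\beta\Delta\mu} - 1}{\beta\Delta\mu} \tag{S39}$$
$$\Rightarrow\quad \left(\eta/\eta_{\text{eq}}\right)^{\max} = \frac{e^{\beta\Delta\mu} - 1}{\beta\Delta\mu}. \tag{S40}$$

When βΔμ ≫ 1 (or, λS ≪ L), the limit above gets further simplified into

$$\left(\eta/\eta_{\text{eq}}\right)^{\max} \approx \frac{e^{\beta\Delta\mu}}{\beta\Delta\mu}. \tag{S41}$$
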
Appendix 2—figure 2
Fidelity enhancement as a function of the effective driving force for varying choices of koffW.

The red dashed line indicates the thermodynamic bound given by e^(βΔμ). The black dashed line corresponds to the model’s upper limit on fidelity enhancement given by Equation S40.

Now, thermodynamics imposes an upper bound on fidelity enhancement by any proofreading scheme operating with a finite chemical potential Δμ. This bound is equal to e^(βΔμ) and is reached when the entire chemical potential is used to increase the free energy difference between right and wrong substrates (Qian, 2006). Comparing it with the result in Equation S41, we can see that fidelity enhancement in the spatial proofreading model has the same exponential scaling term, but with an additional linear factor. Since the dominant contribution comes from the exponential term (as captured also in Appendix 2—figure 2), we can claim that our proposed model can operate very close to the fundamental thermodynamic limit.
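The comparison between the model's limit and the thermodynamic bound can be made concrete with a short calculation. The sketch below evaluates Equation S40 and the bound e^(βΔμ), and reports what share of the exponent the model realizes; the function names are illustrative and `beta_dmu` stands for βΔμ = L/λS.

```python
import math

def max_enhancement(beta_dmu):
    """(eta / eta_eq)^max from Equation S40."""
    return math.expm1(beta_dmu) / beta_dmu

def thermodynamic_bound(beta_dmu):
    """Upper bound exp(beta * delta_mu) on fidelity enhancement."""
    return math.exp(beta_dmu)

def realized_exponent_fraction(beta_dmu):
    """ln[(eta / eta_eq)^max] divided by beta * delta_mu."""
    return math.log(max_enhancement(beta_dmu)) / beta_dmu
```

The fraction returned by `realized_exponent_fraction` grows toward 1 as the driving force increases, which is the sense in which the linear factor in Equation S41 is subdominant.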

3. Energetic cost to set up a concentration gradient

Earlier in the section, we calculated the rate at which energy needs to be dissipated to counteract the homogenizing effect that enzyme activity has on the substrate gradient. In addition to this cost, however, there is also a baseline cost for setting up a gradient in the absence of any enzyme. Here, we calculate this cost in the case where the gradient formation mechanism needs to work against diffusion that tends to flatten the substrate profile.

As before, we consider an exponentially decaying substrate gradient with a decay length scale λS and a total number of substrates Stotal. We write the minimum power PD required for counteracting the diffusion of substrates as

$$P_{D} = -\int_{0}^{L} J_{D}(x)\,\nabla\mu(x)\,dx, \tag{S42}$$

where JD = −DS ∇ρS(x) is the diffusive flux, with DS being the substrate diffusion constant. The rationale for writing this form is that diffusion moves substrates from a higher chemical potential region into a neighboring lower chemical potential region. The gradient maintaining mechanism would need to spend at least this chemical potential difference (δμ = −∇μ(x) δx) per each substrate diffusing a distance δx down the chemical potential gradient. Adding up the contribution from all local neighborhoods with a local diffusive flux JD(x) results in Equation S42.

Now, substituting ρS(x) ∼ e^(−x/λS) for the substrate profile and μ(x) = μ(0) + kBT ln(ρS(x)/ρS(0)) for the chemical potential, we obtain

$$\beta P_{D} = \int_{0}^{L} D_{S}\,\nabla\rho_{S}(x)\,\nabla\!\left(\ln\rho_{S}(x)\right)dx = D_{S}\int_{0}^{L}\frac{\left(\nabla\rho_{S}(x)\right)^{2}}{\rho_{S}(x)}\,dx = D_{S}\int_{0}^{L}\frac{\rho_{S}(x)}{\lambda_{S}^{2}}\,dx = \frac{D_{S}\,S_{\text{total}}}{\lambda_{S}^{2}}, \tag{S43}$$

where in the third step we used the relation ∇ρS(x) = −ρS(x)/λS. This suggests that the minimum dissipated power required for setting up an exponential gradient increases quadratically with decreasing localization length scale λS.
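The closed form in Equation S43 can be verified numerically by integrating the second expression in the chain for a normalized exponential profile. The following is a minimal sketch with illustrative names and default values chosen here; the gradient is taken by central differences and the integral by Simpson's rule.

```python
import math

def rho_S(x, lam_S, S_total=1.0, L=1.0):
    """Exponential substrate profile normalized to S_total (Equations S2 and S3)."""
    return S_total * math.exp(-x / lam_S) / (lam_S * (1.0 - math.exp(-L / lam_S)))

def baseline_power(lam_S, D_S=1.0, S_total=1.0, L=1.0, n=2000, h=1e-6):
    """beta * P_D = D_S * integral of (grad rho_S)^2 / rho_S over [0, L]."""
    def integrand(x):
        grad = (rho_S(x + h, lam_S, S_total, L)
                - rho_S(x - h, lam_S, S_total, L)) / (2.0 * h)
        return D_S * grad ** 2 / rho_S(x, lam_S, S_total, L)

    step = L / n
    total = integrand(0.0) + integrand(L)
    for i in range(1, n):                      # composite Simpson weights
        total += (4 if i % 2 else 2) * integrand(i * step)
    return total * step / 3.0
```

Halving λS should quadruple the returned value, in line with the quadratic scaling noted above.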

It is informative to also make a comparison between this result and the earlier calculated minimum dissipation needed to counteract the enzyme’s homogenizing activity. Recall that when substrates were sufficiently localized and when diffusion was sufficiently slow, proofreading power could be approximated as (Equation S35)

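$$\beta P \approx J_{\text{bind}}\,\frac{\lambda_{ES}^{2}}{\lambda_{S}\left(\lambda_{ES} + \lambda_{S}\right)}, \tag{S44}$$

where Jbind = kon Stotal ρE is the total substrate binding flux. Using the identities λES = √(D/koffS) and KdS = koffS/kon, we can calculate the ratio of the proofreading power to baseline power as

$$\frac{P}{P_{D}} = \frac{k_{\text{on}}\,S_{\text{total}}\,\rho_{E}\,\lambda_{ES}^{2}}{D_{S}\,S_{\text{total}}} \times \frac{\lambda_{S}^{2}}{\lambda_{S}\left(\lambda_{ES} + \lambda_{S}\right)} = \frac{D}{D_{S}} \times \frac{\rho_{E}}{K_{d}^{S}} \times \frac{\lambda_{S}/\lambda_{ES}}{1 + \lambda_{S}/\lambda_{ES}}. \tag{S45}$$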

Presuming for simplicity that the enzyme and substrate diffusion constants are the same, we see that two factors determine the power ratio: (1) the amount of free enzyme in the system (ρE/KdS) and (2) the substrate localization length scale relative to the characteristic length scale of complex diffusion (λS/λES). Now, recall that the proofreading cost is spent largely on counteracting the homogenizing activity of the enzyme on right substrates (Appendix 2.1) and that the bulk of fidelity enhancement takes place when λS ≲ λER (Appendix 1.4). Therefore, when tuning λS down, initially the power ratio would only depend on the amount of free enzyme in the system (ρE/KdS) and then, with tighter substrate localization, the relative contribution of the proofreading power would start to decrease.
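The algebra leading to Equation S45 can be checked by computing the two powers separately and comparing their ratio with the compact form. The sketch below uses illustrative function names and arbitrary parameter values picked here purely for the check.

```python
import math

def proofreading_power(k_on, S_total, rho_E, lam_S, lam_ES):
    """beta * P for one substrate type from Equation S44."""
    return k_on * S_total * rho_E * lam_ES ** 2 / (lam_S * (lam_ES + lam_S))

def baseline_power(D_S, S_total, lam_S):
    """beta * P_D from Equation S43."""
    return D_S * S_total / lam_S ** 2

def power_ratio(D, D_S, rho_E, K_d, lam_S, lam_ES):
    """P / P_D from the last form of Equation S45."""
    s = lam_S / lam_ES
    return (D / D_S) * (rho_E / K_d) * s / (1.0 + s)
```

When λS is much larger than λES the ratio tends to (D/DS)(ρE/KdS), and it drops to half of that value at λS = λES.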

In the end, we would like to note that spatial gradients can also be set up using an external potential without a continuous dissipation of energy. In an in vivo setting, gravity can give rise to spatial structures in oocytes (Feric and Brangwynne, 2013), while in an in vitro setting, electric fields can create gradients and power the transport of the complex (Hansen et al., 2017). We leave the investigations of such alternative strategies to future work.

Appendix 3

Studies on the effect of catalysis on the model performance

In Appendix 1, we considered the rate of catalysis at the right boundary to be very small for the analytical simplicity of our derivations. This resulted in expressions for fidelity that were independent of the rate of catalysis r and allowed us to use the complex density at the right boundary as a proxy for speed. In this section, we relax this assumption and explore the consequences of having non-negligible catalysis rates on the model’s fidelity and on the speed–fidelity trade-off.

1. Derivation of the complex density profile ρES(x)

Accounting for catalysis in our model should be done through a boundary condition for the complex density equation (Equation S1). Earlier, we imposed a no-flux boundary condition at x=L under the slow catalysis assumption. With non-negligible catalysis, this assumption is no longer valid, and the boundary condition is modified into

$$-D\left.\frac{d\rho_{ES}}{dx}\right|_{x=L} = \underbrace{r\,\rho_{ES}(L)}_{\text{catalysis flux}}. \tag{S46}$$

Recall from Equations S4, S6 and S8 that the general solution for the complex profile had the form

$$\rho_{ES}(x) = C_{1}\,e^{-x/\lambda_{ES}} + C_{2}\,e^{x/\lambda_{ES}} + C_{p}\,e^{-x/\lambda_{S}}, \tag{S47}$$
$$C_{p} = \frac{k_{\text{on}}\,\rho_{S}(0)\,\rho_{E}}{k_{\text{off}}^{S}\left(1 - \lambda_{ES}^{2}/\lambda_{S}^{2}\right)}. \tag{S48}$$


Imposing the no-flux boundary condition at x=0 allows us to eliminate one of the integration constants, namely,

$$C_{2} = C_{1} + \frac{\lambda_{ES}}{\lambda_{S}}\,C_{p} \quad\Rightarrow\quad \rho_{ES}(x) = 2C_{1}\cosh(x/\lambda_{ES}) + \frac{C_{p}}{\lambda_{S}}\left(\lambda_{ES}\,e^{x/\lambda_{ES}} + \lambda_{S}\,e^{-x/\lambda_{S}}\right). \tag{S51}$$


Next, we impose the new boundary condition at x=L (Equation S46), which yields

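$$\begin{aligned}
&-D\left(\frac{2C_{1}}{\lambda_{ES}}\sinh(L/\lambda_{ES}) + \frac{C_{p}}{\lambda_{S}}\left(e^{L/\lambda_{ES}} - e^{-L/\lambda_{S}}\right)\right) = r\left(2C_{1}\cosh(L/\lambda_{ES}) + \frac{C_{p}}{\lambda_{S}}\left(\lambda_{ES}\,e^{L/\lambda_{ES}} + \lambda_{S}\,e^{-L/\lambda_{S}}\right)\right) \\
&\Rightarrow\quad -2C_{1}\sinh(L/\lambda_{ES}) - \frac{C_{p}}{\lambda_{S}}\,\lambda_{ES}\left(e^{L/\lambda_{ES}} - e^{-L/\lambda_{S}}\right) = \underbrace{\frac{\lambda_{ES}\,r}{D}}_{\varepsilon}\left(2C_{1}\cosh(L/\lambda_{ES}) + \frac{C_{p}}{\lambda_{S}}\left(\lambda_{ES}\,e^{L/\lambda_{ES}} + \lambda_{S}\,e^{-L/\lambda_{S}}\right)\right).
\end{aligned} \tag{S52}$$

Note that we have introduced the dimensionless variable ε, which, as we will see later, will define the extent to which the presence of catalysis affects the fidelity. For convenience, here we write different equivalent forms for ε as

$$\varepsilon = \frac{\lambda_{ES}\,r}{D} = \frac{r}{\sqrt{D\,k_{\text{off}}^{S}}} = \frac{r}{L\,k_{\text{off}}^{S}}\sqrt{\tau_{D} k_{\text{off}}^{S}}. \tag{S53}$$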

Solving for the remaining unknown coefficient C1 in Equation S52, we find

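$$C_{1} = -\frac{C_{p}}{2\lambda_{S}}\,\frac{\lambda_{ES}\left(e^{L/\lambda_{ES}} - e^{-L/\lambda_{S}}\right) + \varepsilon\left(\lambda_{ES}\,e^{L/\lambda_{ES}} + \lambda_{S}\,e^{-L/\lambda_{S}}\right)}{\sinh(L/\lambda_{ES}) + \varepsilon\cosh(L/\lambda_{ES})}. \tag{S54}$$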

Lastly, we substitute this result for C1 into Equation S51 and obtain a general expression for the complex density profile as

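$$\rho_{ES}(x) = -\frac{C_{p}}{\lambda_{S}}\,\frac{\lambda_{ES}\left(e^{L/\lambda_{ES}} - e^{-L/\lambda_{S}}\right) + \varepsilon\left(\lambda_{ES}\,e^{L/\lambda_{ES}} + \lambda_{S}\,e^{-L/\lambda_{S}}\right)}{\sinh(L/\lambda_{ES}) + \varepsilon\cosh(L/\lambda_{ES})}\cosh(x/\lambda_{ES}) + \frac{C_{p}}{\lambda_{S}}\left(\lambda_{ES}\,e^{x/\lambda_{ES}} + \lambda_{S}\,e^{-x/\lambda_{S}}\right). \tag{S55}$$

One can show in a straightforward way that this result reduces to Equation S13 in the ε → 0 limit.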
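Both the ε → 0 reduction and the new boundary condition can be confirmed numerically. The sketch below implements Equation S55 alongside Equation S13 and checks them against each other; the function names, the finite-difference helper and the default `c_p = 1` (standing for the coefficient Cp of Equation S8) are assumptions made here for illustration.

```python
import math

def profile_with_catalysis(x, L, lam_S, lam_ES, eps, c_p=1.0):
    """rho_ES(x) from Equation S55, with eps = lam_ES * r / D."""
    growth = math.exp(L / lam_ES)
    decay = math.exp(-L / lam_S)
    numerator = (lam_ES * (growth - decay)
                 + eps * (lam_ES * growth + lam_S * decay))
    denominator = math.sinh(L / lam_ES) + eps * math.cosh(L / lam_ES)
    return (c_p / lam_S) * (-numerator / denominator * math.cosh(x / lam_ES)
                            + lam_ES * math.exp(x / lam_ES)
                            + lam_S * math.exp(-x / lam_S))

def profile_no_catalysis(x, L, lam_S, lam_ES, c_p=1.0):
    """rho_ES(x) from Equation S13 (slow-catalysis limit)."""
    bracket = (-math.cosh((L - x) / lam_ES)
               + math.cosh(x / lam_ES) * math.exp(-L / lam_S))
    return c_p * (lam_ES / (lam_S * math.sinh(L / lam_ES)) * bracket
                  + math.exp(-x / lam_S))

def slope(f, x, h=1e-5):
    """Central-difference derivative of a one-argument function."""
    return (f(x + h) - f(x - h)) / (2.0 * h)
```

Multiplying Equation S46 by λES/D gives the dimensionless form −λES ρES′(L) = ε ρES(L), which is the relation the profile should satisfy at the right boundary.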

2. Effects on fidelity in low and high substrate localization regimes

Accounting for the catalysis flux has made the general expression for the complex density profile even more incomprehensible. In order to gain insights about the qualitative as well as quantitative changes introduced by catalysis, we will focus on two characteristic limits of substrate localization – uniform substrate profile (λS → ∞) and ideal substrate localization (λS → 0).

2.1. Uniform substrate profile

In this case, no mechanism for localizing substrates is in play. Let us start off by evaluating the coefficient Cp (Equation S48) in the λS → ∞ limit. Recalling from Equation S3 that ρS(0) = Stotal/(λS(1 − e^(−L/λS))), we find

$$\rho_{S}(0) \approx \frac{S_{\text{total}}}{L}, \qquad C_{p} \approx \frac{k_{\text{on}}\,S_{\text{total}}\,\rho_{E}}{L\,k_{\text{off}}^{S}} = \frac{J_{\text{bind}}}{L\,k_{\text{off}}^{S}},$$


where Jbind=konStotalρE is the total substrate binding flux.

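Substituting the expression for Cp into Equation S55 and eliminating all the terms that vanish upon taking the λS → ∞ limit, we obtain

$$\rho_{ES}(x) \approx C_{p}\left(1 - \frac{\varepsilon\cosh(x/\lambda_{ES})}{\sinh(L/\lambda_{ES}) + \varepsilon\cosh(L/\lambda_{ES})}\right) = \frac{J_{\text{bind}}}{L\,k_{\text{off}}^{S}} \times \frac{\sinh(L/\lambda_{ES}) + \varepsilon\left(\cosh(L/\lambda_{ES}) - \cosh(x/\lambda_{ES})\right)}{\sinh(L/\lambda_{ES}) + \varepsilon\cosh(L/\lambda_{ES})}. \tag{S58}$$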

Ultimately, we are interested in knowing the rate of product formation defined via vS=rρES(L). We therefore evaluate the complex density at x=L and multiply it by r, which yields

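$$v_{S} = r\,\rho_{ES}(L) = J_{\text{bind}} \times \left(\frac{r}{L\,k_{\text{off}}^{S}}\right) \times \frac{\sinh(L/\lambda_{ES})}{\sinh(L/\lambda_{ES}) + \varepsilon\cosh(L/\lambda_{ES})} = J_{\text{bind}} \times \left(\frac{r}{L\,k_{\text{off}}^{S}}\right) \times \frac{\tanh(L/\lambda_{ES})}{\tanh(L/\lambda_{ES}) + \varepsilon} = J_{\text{bind}} \times \left(\frac{r}{L\,k_{\text{off}}^{S}}\right) \times \frac{\tanh\!\left(\sqrt{\tau_{D} k_{\text{off}}^{S}}\right)}{\tanh\!\left(\sqrt{\tau_{D} k_{\text{off}}^{S}}\right) + \varepsilon}, \tag{S59}$$

where in the last step we wrote an equivalent expression using the L/λES = √(τD koffS) identity. To analyze this result further, we will consider two limiting cases.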

Case 1: Fast diffusion (τD koffS ≪ 1). If diffusion is fast, we can approximate the hyperbolic tangent functions as the arguments themselves (i.e. tanh(z) ≈ z for z ≪ 1). Then, using the last form of ε in Equation S53, we simplify the expression for speed as

$$v_{S} \approx J_{\text{bind}} \times \left(\frac{r}{L\,k_{\text{off}}^{S}}\right) \times \frac{1}{1 + \dfrac{r}{L\,k_{\text{off}}^{S}}} = J_{\text{bind}} \times \frac{\tilde{r}}{k_{\text{off}}^{S} + \tilde{r}}, \qquad \tilde{r} = \frac{r}{L}. \tag{S61}$$


This is an intuitive result, suggesting that an enzyme that diffuses fast acts like a standard Michaelis–Menten enzyme with an effective catalysis rate r~. For such an enzyme, the probability of catalysis for a bound substrate is r~/(koffS+r~). Multiplying this probability by the net substrate binding flux yields the expression for speed in Equation S61.

Fidelity of the model in this fast diffusion setting can be written as

$$\eta = \frac{v_{R}}{v_{W}} = \frac{k_{\text{off}}^{W} + \tilde{r}}{k_{\text{off}}^{R} + \tilde{r}}. \tag{S63}$$

In the limit where catalysis is very slow (r̃ ≪ koffR), the equilibrium fidelity given by the ratio of off-rates is recovered. And in the opposite limit of very fast catalysis (r̃ ≫ koffW), the discriminatory capacity of the enzyme disappears altogether (Appendix 3—figure 1a).

Appendix 3—figure 1
Dependence of fidelity on the catalysis rate in the case where the substrate profile is uniform.

(a) Fast diffusion setting (τD koffR ≪ 1). The highest fidelity reduction is a factor of ηeq. (b) Slow diffusion setting (τD koffR ≫ 1). The highest fidelity reduction is a factor of √ηeq. In both cases, ηeq = 10 was used.

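Case 2: Slow diffusion (τD koffS ≫ 1). A more interesting case is when diffusion is slow. Now, the hyperbolic tangent functions in Equation S59 are approximately 1, allowing us to simplify the expression for speed into

(S64) $$v_S = J_{\text{bind}} \times \left(\frac{r}{L k_{\text{off}}^{S}}\right) \times \frac{1}{1 + \dfrac{r}{L k_{\text{off}}^{S}}\sqrt{\tau_D k_{\text{off}}^{S}}} = J_{\text{bind}} \times \frac{\tilde{r}}{k_{\text{off}}^{S} + \tilde{r}\sqrt{\tau_D k_{\text{off}}^{S}}}.$$

Drawing an analogy between the above result and Equation S61, one can notice the presence of an extra √(τD koffS) factor multiplying r̃ in the denominator.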

Evaluating the speeds of right and wrong product formation, we can write fidelity in this slow diffusion setting as

(S65) $$\eta = \frac{v_R}{v_W} = \frac{k_{\text{off}}^{W} + \tilde{r}\sqrt{\tau_D k_{\text{off}}^{W}}}{k_{\text{off}}^{R} + \tilde{r}\sqrt{\tau_D k_{\text{off}}^{R}}}.$$

Like the fast diffusion case, when catalysis is very slow (r̃ ≪ √(koffR/τD) or, equivalently, r ≪ √(D koffR)), the equilibrium fidelity is recovered. Unlike the fast diffusion case, however, if catalysis is very fast (r ≫ √(D koffW)), the enzyme partly preserves its discriminatory capacity (Appendix 3—figure 1b). In this limit, a fidelity equal to the square root of the equilibrium fidelity is still attainable, namely,

(S66) $$\eta \approx \sqrt{\frac{k_{\text{off}}^{W}}{k_{\text{off}}^{R}}} = \sqrt{\eta_{\text{eq}}}.$$

This unexpected result suggests a potential advantage of localizing fast catalytic reactions instead of having them occur in a well-mixed solution.
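The limits above can be checked numerically with a minimal sketch of Equation S59. The function names and the parameter values (unit length, unit binding flux, ηeq = 10) are illustrative assumptions rather than part of the original analysis; ε is written in the form ε = (r/(L koffS))√(τD koffS).

```python
import math

def speed_uniform(r, koff, tau_d, length=1.0, j_bind=1.0):
    """Speed of product formation for a uniform substrate profile (Equation S59)."""
    z = math.sqrt(tau_d * koff)                 # L / lambda_ES
    eps = (r / (length * koff)) * z             # epsilon = lambda_ES * r / D
    prefactor = j_bind * r / (length * koff)
    return prefactor * math.tanh(z) / (math.tanh(z) + eps)

def fidelity_uniform(r, koff_r, koff_w, tau_d):
    """Fidelity as the ratio of right to wrong product formation speeds."""
    return speed_uniform(r, koff_r, tau_d) / speed_uniform(r, koff_w, tau_d)

if __name__ == "__main__":
    koff_r, koff_w = 1.0, 10.0                  # eta_eq = 10 (illustrative values)
    for label, tau_d in (("fast diffusion", 1e-4), ("slow diffusion", 1e4)):
        for r in (1e-6, 1e6):
            eta = fidelity_uniform(r, koff_r, koff_w, tau_d)
            print(f"{label:15s} r = {r:g}: fidelity = {eta:.4f}")
```

With these illustrative values, slow catalysis should recover ηeq in both diffusion regimes, whereas fast catalysis should leave a fidelity near 1 for fast diffusion and near √ηeq for slow diffusion.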

2.2. Ideal substrate localization

We next consider the effect of catalysis on model fidelity in the ideal substrate localization limit (λS → 0). We begin by evaluating the Cp/λS ratio that appears in the density profile expression (Equation S55). Using Equations S48 and S3, we find


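where in the last step we invoked the identities λES² = D/koffS and Jbind = kon Stotal ρE. We then substitute our result for Cp/λS into Equation S55 and simplify the complex density expression into

(S69) $$\rho_{ES}(x) = \frac{J_{\text{bind}}}{D} \times \lambda_{ES}\left(\frac{e^{L/\lambda_{ES}} + \varepsilon e^{L/\lambda_{ES}}}{\sinh(L/\lambda_{ES}) + \varepsilon\cosh(L/\lambda_{ES})}\cosh(x/\lambda_{ES}) - e^{x/\lambda_{ES}}\right) = J_{\text{bind}} \times \frac{\lambda_{ES}}{D}\,\frac{\cosh((L-x)/\lambda_{ES}) + \varepsilon\sinh((L-x)/\lambda_{ES})}{\sinh(L/\lambda_{ES}) + \varepsilon\cosh(L/\lambda_{ES})}.$$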

To obtain the speed, we evaluate ρES(x) at the right boundary (x=L) and multiply it by r, namely,

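(S70) $$v_S = r\rho_{ES}(L) = J_{\text{bind}}\,\underbrace{\frac{\lambda_{ES}\,r}{D}}_{\varepsilon}\,\frac{1}{\sinh(L/\lambda_{ES}) + \varepsilon\cosh(L/\lambda_{ES})} = J_{\text{bind}} \times \frac{\varepsilon}{\sinh\left(\sqrt{\tau_D k_{\text{off}}^{S}}\right) + \varepsilon\cosh\left(\sqrt{\tau_D k_{\text{off}}^{S}}\right)}.$$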

To evaluate the effect of catalysis further, we again consider two special limits – those of fast and slow diffusion.

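Case 1: Fast diffusion (τD koffS ≪ 1). In this limit, the hyperbolic sine function can be approximated by its argument (i.e. sinh(z) ≈ z for z ≪ 1), while the hyperbolic cosine function is approximately 1. Making these approximations and substituting the expression for ε, we obtain

(S71) $$v_S \approx J_{\text{bind}} \times \frac{\dfrac{r}{L k_{\text{off}}^{S}}\sqrt{\tau_D k_{\text{off}}^{S}}}{\sqrt{\tau_D k_{\text{off}}^{S}} + \dfrac{r}{L k_{\text{off}}^{S}}\sqrt{\tau_D k_{\text{off}}^{S}}} = J_{\text{bind}} \times \frac{\dfrac{r}{L k_{\text{off}}^{S}}}{1 + \dfrac{r}{L k_{\text{off}}^{S}}} = J_{\text{bind}} \times \frac{\tilde{r}}{k_{\text{off}}^{S} + \tilde{r}}.$$

This result is identical to what we found in the fast diffusion limit for the λS → ∞ setting (Equation S61), which is reasonable, since the location of substrate binding is irrelevant if diffusion is very fast (Appendix 3—figure 2a).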

Appendix 3—figure 2
Fidelity as a function of the catalysis rate in an ideal substrate localization setting.

(a) Fast diffusion case, where the behavior of the system is identical to that in Appendix 3—figure 1a. (b) Slow diffusion case, where efficient proofreading is achieved. Catalysis can reduce the fidelity by up to a factor of √ηeq. In both cases, ηeq = 10 was used.

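Case 2: Slow diffusion (τD koffS ≫ 1). In this limit, the hyperbolic sine and cosine functions can be approximated as exponentials with a 1/2 prefactor, simplifying the expression of speed into

(S72) $$v_S \approx J_{\text{bind}} \times \frac{2\varepsilon}{1 + \varepsilon}\,e^{-\sqrt{\tau_D k_{\text{off}}^{S}}}.$$

Recalling the identity ε = r/√(D koffS) (note that ε depends on the substrate kind), we evaluate the speed for right and wrong product formation and, dividing them, obtain the fidelity as

(S73) $$\eta = \frac{v_R}{v_W} = \frac{1 + r/\sqrt{D k_{\text{off}}^{W}}}{1 + r/\sqrt{D k_{\text{off}}^{R}}} \times \sqrt{\frac{k_{\text{off}}^{W}}{k_{\text{off}}^{R}}}\;e^{\sqrt{\tau_D k_{\text{off}}^{W}} - \sqrt{\tau_D k_{\text{off}}^{R}}} = \frac{1 + r/\sqrt{D k_{\text{off}}^{W}}}{1 + r/\sqrt{D k_{\text{off}}^{R}}} \times \sqrt{\eta_{\text{eq}}}\;e^{\sqrt{\tau_D k_{\text{off}}^{R}}\left(\sqrt{\eta_{\text{eq}}} - 1\right)}.$$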

In the case where catalysis is slow (r ≪ √(D koffR)), the first term in the fidelity expression becomes approximately 1, and our earlier result obtained with no account of catalysis is recovered (Equation S21). In the opposite limit of fast catalysis (r ≫ √(D koffW)), the first term is no longer 1, and we find

(S74) $$\eta \approx \underbrace{\sqrt{\frac{k_{\text{off}}^{R}}{k_{\text{off}}^{W}}}\,\sqrt{\eta_{\text{eq}}}}_{1}\;e^{\sqrt{\tau_D k_{\text{off}}^{R}}\left(\sqrt{\eta_{\text{eq}}} - 1\right)} = e^{\sqrt{\tau_D k_{\text{off}}^{R}}\left(\sqrt{\eta_{\text{eq}}} - 1\right)}.$$

As we can see, fast catalysis in the slow diffusion regime reduces the fidelity by a factor of √ηeq or, equivalently, reduces the effective number of proofreading realizations by one half, without affecting the exponential amplification term (Appendix 3—figure 2b).
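As a rough numerical check of Equations S70, S73 and S74, the sketch below evaluates the ideal localization speed directly and compares the slow and fast catalysis limits. The function names and the chosen values (τD koffR = 25, ηeq = 10, unit length and binding flux) are assumptions made for illustration only.

```python
import math

def speed_localized(r, koff, tau_d, length=1.0, j_bind=1.0):
    """Speed in the ideal substrate localization limit (Equation S70)."""
    z = math.sqrt(tau_d * koff)                 # L / lambda_ES
    eps = (r / (length * koff)) * z             # epsilon = lambda_ES * r / D
    return j_bind * eps / (math.sinh(z) + eps * math.cosh(z))

def fidelity_localized(r, koff_r, koff_w, tau_d):
    """Fidelity as the ratio of right to wrong product formation speeds."""
    return speed_localized(r, koff_r, tau_d) / speed_localized(r, koff_w, tau_d)

if __name__ == "__main__":
    koff_r, koff_w, tau_d = 1.0, 10.0, 25.0     # slow diffusion: tau_d * koff_r >> 1
    eta_slow = fidelity_localized(1e-6, koff_r, koff_w, tau_d)
    eta_fast = fidelity_localized(1e9, koff_r, koff_w, tau_d)
    print("slow catalysis fidelity:", eta_slow)
    print("fast catalysis fidelity:", eta_fast)
    print("ratio (expected near sqrt(eta_eq)):", eta_slow / eta_fast)
```

The ratio of the two fidelities is expected to approach √ηeq only when τD koffR ≫ 1; for weaker separation of time scales the hyperbolic functions no longer reduce to pure exponentials.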

To conclude, our study demonstrated the expected reduction of fidelity with increasing catalysis rate. In the case of fast diffusion, up to a factor of ηeq reduction is possible, as is the case for the original proofreading scheme (Hopfield, 1974; Wong et al., 2018). In the case of slow diffusion, however, the cap on the amount of reduction is lowered to √ηeq. The advantage of this feature is most notable in the limit of a non-localized (i.e. uniform) substrate profile and fast catalysis, where a diffusing enzyme is still capable of discriminating between substrates. This behavior would not be possible for a Michaelis–Menten enzyme in a well-mixed solution.

3. Effects on the speed–fidelity trade-off

In Figure 3a of the main text we explored the speed–fidelity trade-off in the slow catalysis limit. This trade-off arose in response to tuning the substrate localization length scale (λS) and the diffusion time scale (τD). Here, we explore the changes to this trade-off behavior in the case where the effects of catalysis are not negligible. For concreteness, we focus on alterations to the Pareto front of the trade-off achieved in the λS → 0 limit.

Appendix 3—figure 3a compares the Pareto fronts in the cases of slow and fast catalysis limits. In each case, speed is normalized by the corresponding effective Michaelis–Menten speed that is reached in the fast diffusion limit and is given by vMM = Jbind × r̃/(koffR + r̃), where r̃ = r/L. One can notice a shift of the fast catalysis front toward the low-fidelity region, which was expected since earlier we observed the complete loss of substrate discrimination when diffusion and catalysis were both fast (Appendix 3—figure 2a).

Appendix 3—figure 3
Pareto front of the speed–fidelity trade-off at different levels of catalytic activity.

(a) Cases of slow and fast catalysis limits, with the y-axis for speed normalized to the [0,1] interval. (b) Family of Pareto fronts for different choices of the catalysis rate. Speed on the y-axis is reported relative to the substrate binding flux Jbind.

Appendix 3—figure 3a may leave an impression that faster catalysis leads to a less favorable speed–fidelity trade-off. Note, however, that the speed vMM(r̃) used to normalize the y-axis is itself a function of the catalysis rate and penalizes the fast catalysis case more than its slow counterpart. To eliminate this ambiguity, we plotted a family of Pareto fronts for increasing values of the catalysis rate, this time normalizing the y-axis by the r-independent quantity Jbind (Appendix 3—figure 3b). As can be seen, faster catalysis in fact improves the speed–fidelity trade-off, meaning that in order to maximize fidelity at a given speed level, the best strategy would be to increase the catalysis rate and correspondingly slow down the diffusion.

A trade-off between speed and fidelity also arises in response to the sole alteration of the catalysis rate, while keeping the rest of the model parameters fixed. To explore this trade-off for an arbitrary fixed choice of λS and τD, we begin by evaluating speed from Equation S55, namely,

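(S75) $$\begin{aligned} v_S = r\rho_{ES}(L) &= r \times \left(-\frac{C_p}{\lambda_S}\,\frac{\lambda_{ES}\left(e^{L/\lambda_{ES}} - e^{-L/\lambda_S}\right) + \varepsilon\left(\lambda_{ES}e^{L/\lambda_{ES}} + \lambda_S e^{-L/\lambda_S}\right)}{\sinh(L/\lambda_{ES}) + \varepsilon\cosh(L/\lambda_{ES})}\cosh(L/\lambda_{ES}) + \frac{C_p}{\lambda_S}\left(\lambda_{ES}e^{L/\lambda_{ES}} + \lambda_S e^{-L/\lambda_S}\right)\right) \\ &= r \times \frac{C_p}{\lambda_S}\left(\frac{-\lambda_{ES}\left(e^{L/\lambda_{ES}} - e^{-L/\lambda_S}\right)\cosh(L/\lambda_{ES}) + \left(\lambda_{ES}e^{L/\lambda_{ES}} + \lambda_S e^{-L/\lambda_S}\right)\sinh(L/\lambda_{ES})}{\sinh(L/\lambda_{ES}) + \varepsilon\cosh(L/\lambda_{ES})} + \underbrace{\varepsilon \times \frac{-\left(\lambda_{ES}e^{L/\lambda_{ES}} + \lambda_S e^{-L/\lambda_S}\right)\cosh(L/\lambda_{ES}) + \left(\lambda_{ES}e^{L/\lambda_{ES}} + \lambda_S e^{-L/\lambda_S}\right)\cosh(L/\lambda_{ES})}{\sinh(L/\lambda_{ES}) + \varepsilon\cosh(L/\lambda_{ES})}}_{=0}\right) \\ &= r \times \frac{C_p}{\lambda_S}\,\frac{\left(\lambda_S\sinh(L/\lambda_{ES}) + \lambda_{ES}\cosh(L/\lambda_{ES})\right)e^{-L/\lambda_S} - \lambda_{ES}}{\sinh(L/\lambda_{ES}) + \varepsilon\cosh(L/\lambda_{ES})} = \frac{a_S\,r}{1 + b_S\,r}. \end{aligned}$$

In the last step, we introduced coefficients aS and bS that are independent of r, and used the fact that ε ∝ r.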

Now, using the definition of fidelity and the result obtained above, we can write

(S76) $$\eta = \frac{v_R}{v_W} = \frac{a_R}{a_W}\,\frac{1 + b_W r}{1 + b_R r}.$$

Notice that the ratio aR/aW ≡ η0 is the fidelity in the limit of very slow catalysis (r → 0). Substituting it, we write


where Δb = bR − bW. Recalling that ε = λES r/D and noting the functional form of the denominator in Equation S75, one can show that bS = D^(−1) λES/tanh(L/λES). This is an increasing function of λES and hence a decreasing function of koffS, implying that Δb > 0.

With this condition in mind, we can see from Equation S78 that speed and fidelity are anticorrelated with a linear slope when tuning the catalysis rate, unlike the more sophisticated trade-off relations when tuning the other model parameters. The peak fidelity η0 is attained in the limit of vanishing speed. And conversely, speed is the highest when fidelity is the lowest for the given fixed values of λS and τD (Appendix 3—figure 4).
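The linear relation can be made explicit with a small numerical sketch. Combining Equations S75 and S76 gives η = η0(1 − Δb vR/aR), which we take here as the form of the trade-off (the displayed Equations S77 and S78 are not reproduced in this section, so this form is a reconstruction). For concreteness, the coefficients are taken in the ideal localization limit, where Equation S70 implies aS = Jbind λES/(D sinh(L/λES)); the function names and parameter values are illustrative assumptions.

```python
import math

def tradeoff_coefficients(koff, diff=1.0, length=1.0, j_bind=1.0):
    """Coefficients a_S and b_S of v_S = a_S r / (1 + b_S r) for lambda_S -> 0."""
    lam = math.sqrt(diff / koff)                          # lambda_ES
    a = j_bind * lam / (diff * math.sinh(length / lam))   # from Equation S70
    b = lam / (diff * math.tanh(length / lam))            # b_S quoted in the text
    return a, b

def speed(r, koff):
    a, b = tradeoff_coefficients(koff)
    return a * r / (1.0 + b * r)

def fidelity_linear(r, koff_r, koff_w):
    """Fidelity from the linear relation eta = eta_0 * (1 - delta_b * v_R / a_R)."""
    a_r, b_r = tradeoff_coefficients(koff_r)
    a_w, b_w = tradeoff_coefficients(koff_w)
    eta_0 = a_r / a_w
    return eta_0 * (1.0 - (b_r - b_w) * speed(r, koff_r) / a_r)

if __name__ == "__main__":
    for r in (0.01, 1.0, 100.0):
        direct = speed(r, 1.0) / speed(r, 10.0)
        print(r, direct, fidelity_linear(r, 1.0, 10.0))
```

Both columns of the printed output should agree, since the linear form is an algebraic rearrangement of Equation S76 rather than an additional approximation.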

Appendix 3—figure 4
Linear trade-off between speed and fidelity when tuning the rate of catalysis.

ηmin is the fidelity in the fast catalysis limit and is lower than η0 by up to a factor of ηeq (based on the results of the previous section). Linear scale is used for both axes.

Overall, our result illustrates the simple speed–fidelity trade-off that can be navigated by tuning the catalysis rate. This, for instance, can be achieved by changing the concentration of effectors that activate the enzyme for catalysis.

Appendix 4

Proofreading for substrates with different localization conditions

Following the original treatment by Hopfield, 1974, we have performed the studies of our model under the assumption that discrimination between right and wrong substrates is solely based on their off-rates (koffW > koffR). Although this is often the signature difference between substrates, in a cellular setting substrate discrimination may also occur through other factors. For example, substrates may be present in different amounts or they may have non-identical on-rates. These differences, however, have a multiplicative effect on the fidelity (i.e. η ∝ (konR[R])/(konW[W])) and do not highlight the proofreading capacity of a particular model.

Unlike these two features, differences in the degree to which right and wrong substrates are localized can have a non-trivial effect on the proofreading performance. In this Appendix, we generalize our study of the model fidelity to cases where right and wrong substrates have unequal localization length scales λR and λW, respectively.

1. Limiting cases

We start off by exploring the limiting cases first. From the earlier derived Equation S14 and Equation S15, we know that the complex density at x = L in very low (λS ≫ L) and very high (λS ≪ L) substrate localization regimes is given by


respectively. Note that the complex density in the ideal localization case is necessarily lower than that in the case of a uniform profile, since the inequality sinh(L/λES) > L/λES holds for all choices of λES. If λR and λW are not constrained to be equal, then the highest fidelity for a given τD will be attained when the right substrates are distributed uniformly while the wrong substrates are highly localized (λR ≫ L and λW ≪ L, respectively). We obtain the fidelity in this case as


Notably, this result for maximum fidelity enhancement is independent of koffR. Furthermore, it exceeds the ideal localization fidelity reported in the main text (Equation 5, derived in the λS → 0 limit), which was expected since now the right complexes on average travel a shorter distance to reach the activation site than the wrong complexes.

In the opposite scenario where the wrong substrates are uniformly distributed and the right ones are highly localized (λR ≪ L and λW ≫ L, respectively), the system attains its lowest fidelity for a given τD, namely,


Since L/λER<sinh(L/λER), the lowest fidelity is less than the equilibrium fidelity itself (ηmin<ηeq), suggesting that the enzyme may in fact do anti-proofreading (Murugan et al., 2014) if the wrong substrates are generally closer to the catalytic site.
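To illustrate these limits, the sketch below evaluates the highest and lowest fidelities under the assumption that catalysis is slow (ε → 0). In that limit Equation S58 gives ρES(L) = Jbind/(L koffS) for a uniform profile, and Equation S69 gives ρES(L) = (Jbind/(L koffS)) × (L/λES)/sinh(L/λES) for an ideally localized one. The resulting expressions and the function names are reconstructions for illustration, with Jbind assumed identical for both substrates; the values τD = 40/koffR and ηeq = 10 follow the caption of Appendix 4—figure 1.

```python
import math

def eta_max(koff_r, koff_w, tau_d):
    """Right substrates uniform, wrong substrates ideally localized."""
    z_w = math.sqrt(tau_d * koff_w)             # L / lambda_EW
    return (koff_w / koff_r) * math.sinh(z_w) / z_w

def eta_min(koff_r, koff_w, tau_d):
    """Right substrates ideally localized, wrong substrates uniform."""
    z_r = math.sqrt(tau_d * koff_r)             # L / lambda_ER
    return (koff_w / koff_r) * z_r / math.sinh(z_r)

def eta_both_localized(koff_r, koff_w, tau_d):
    """Both substrates ideally localized (same form as Equation S106)."""
    z_r = math.sqrt(tau_d * koff_r)
    z_w = math.sqrt(tau_d * koff_w)
    return math.sqrt(koff_w / koff_r) * math.sinh(z_w) / math.sinh(z_r)

if __name__ == "__main__":
    koff_r, koff_w, tau_d = 1.0, 10.0, 40.0
    print("eta_min           :", eta_min(koff_r, koff_w, tau_d))
    print("eta_eq            :", koff_w / koff_r)
    print("eta_both_localized:", eta_both_localized(koff_r, koff_w, tau_d))
    print("eta_max           :", eta_max(koff_r, koff_w, tau_d))
```

In this sketch the enhancement ηmax/ηeq depends only on τD koffW, which is consistent with the statement that the maximum enhancement is independent of koffR.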

2. Intermediate levels of substrate localization

In Figure 3 inset as well as in Appendix 1.4, we explored the dependence of fidelity on the substrate localization length scale λS when it was the same for the two substrate kinds. Here, we expand this study to the case where this constraint is relaxed.

In particular, using Equation S24, we calculate complex densities and corresponding fidelity values as a function of λR for different fixed choices of the length scale ratio λR/λW. The results of the study are captured in Appendix 4—figure 1. In the special case where the two length scales are equal (λR = λW, solid black line), fidelity depends monotonically on L/λR, and in the limit of ideal localization (very large L/λR) the result in Equation 5 of the main text is recovered.

When λR ≠ λW, the dependence of fidelity on L/λR is no longer monotonic. If right substrates are more localized than the wrong ones (red curves), then the fidelity curves have a minimum where the enzyme does anti-proofreading (i.e. η < ηeq). The proofreading portion of the curves (when η > ηeq) is shifted to the right, suggesting that much higher substrate localization is needed for the enzyme to proofread.

The opposite case is when the right substrates have a shallower gradient than the wrong ones (blue curves). The fidelity curves are now shifted to the left and have a peak that is greater than the large L/λR limit of fidelity. This means that there is an optimal degree of substrate localization, going beyond which makes the model performance worse in terms of both error correction and energy consumption.

Appendix 4—figure 1
Fidelity as a function of L/λR for different choices of the ratio λR/λW.

The solid black line corresponds to the earlier studied regime where substrates had identical localization length scales. The blue curves represent the cases where λR>λW, while the red curves represent the cases where λR<λW. Numbers next to the curves correspond to the λR/λW ratios used for generating them. Expressions for the highest and lowest fidelity values, as well as the fidelity expression in the limit where both substrates are highly localized are shown on the right side of the figure. τD=40τoffR and ηeq=10 were used for demonstration.

Over the course of its diffusive transport, a bound enzyme is more likely to deposit a right substrate in a substrate-depleted region than a wrong one, because right substrates stay attached to the enzyme for a longer time. Therefore, if the gradient-maintaining mechanism does not discriminate between substrates (which we assume is the case for the kinase/phosphatase-based one), then it will be easier for it to maintain the wrong ones localized since they tend to get deposited closer to the localization site (see Appendix 6—figure 1c as an example). This means that in a realistic setting the spatial organization of substrates is more likely to be in the advantageous blue region of Appendix 4—figure 1 where λR>λW, facilitating the realization of spatial proofreading.

Appendix 5

Studies on the validity of the uniform free enzyme profile assumption

In our treatment of the model so far, we have assumed for mathematical convenience that free enzymes are in excess, which suggested the approximation ρE(x) ≈ constant. Example enzyme density profiles shown in Appendix 5—figure 1, however, demonstrate that this assumption does not hold in general. Specifically, there is a depletion of free enzymes near the substrate localization site and an abundance near the catalysis site. Because of this depletion at the leftmost edge, we expect a reduction in speed in comparison with our earlier treatment where a flat profile was assumed. In addition, if substrates have a weak gradient, we expect the fidelity to also be reduced, since more enzymes will bind substrates at intermediate positions, reducing the average travel distance to the catalytic site. In what follows, we discuss in greater detail the consequences of having a nonuniform free enzyme distribution on the model performance.

Appendix 5—figure 1
Example profiles of free and substrate-bound enzymes.

Enzyme profiles are normalized so that the sum of areas under the curves is unity. The substrate profile (rescaled on the y-axis) is shown in transparent gray.

1. Effects that relaxing the ρE(x) ≈ constant assumption has on the Pareto front

We begin by studying the effects of relaxing the uniform free enzyme profile assumption on the Pareto front of the speed–fidelity trade-off (Figure 3a of the main text). This front is reached in the ideal substrate localization limit (λS → 0). Though in general enzyme profiles need to be obtained using numerical methods due to the nonlinearity of reaction–diffusion equations, in this particular limit (λS → 0) an analytical solution is available. To obtain it, we write the reaction–diffusion equations in the bulk region of space as


Substrate binding reactions did not enter the above equations, as they occur at the leftmost boundary only. They are instead accounted for via boundary conditions, which read


where Stotal is the total amount of free substrate of each kind concentrated at x=0.

Relating local enzyme concentrations

Considering the system at steady state, we add Equations S85-S87 and obtain

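(S91) $$0 = D\frac{d^2\rho_{ER}}{dx^2} + D\frac{d^2\rho_{EW}}{dx^2} + D\frac{d^2\rho_{E}}{dx^2},$$

where we replaced the partial derivatives with total derivatives since the profiles are time-independent. Dividing Equation S91 by D and integrating once, we find

(S92) $$\frac{d\rho_{ER}}{dx} + \frac{d\rho_{EW}}{dx} + \frac{d\rho_{E}}{dx} = C_1.$$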

The above relation must hold for arbitrary position x. Choosing x=0 and noting that from Equations S88-S90 the sum of fluxes should be zero, we can claim that C1=0. Integrating for the second time, we obtain

(S93) ρER(x)+ρEW(x)+ρE(x)=C2,

where C2 is now a different constant. To find it, we perform an integral for the last time across the entire compartment, namely,

(S94) $$\int_0^L \left(\rho_{ER}(x) + \rho_{EW}(x) + \rho_{E}(x)\right)dx = E_{\text{total}} = C_2 L.$$

Here, we introduced the parameter Etotal as the total number of enzymes in the system (in free or bound forms). The constant C2, which we will rename ρ0, is then the average enzyme density, that is,

(S95) ρ0=Etotal/L.

Substituting this result into Equation S93, we find an insightful relation between free and bound enzyme densities at an arbitrary position, namely,

(S96) $$\rho_E(x) = \rho_0 - \rho_{ER}(x) - \rho_{EW}(x).$$

This relation suggests that whenever the local concentration of bound enzymes is high, the local concentration of free enzymes should be correspondingly low, as we see reflected in the profiles of Appendix 5—figure 1.

Deriving the fidelity expression

Next, we consider Equations S85 and S86 separately at steady state, written in the form

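(S97) $$D\frac{d^2\rho_{ES}}{dx^2} - k_{\text{off}}^{S}\rho_{ES} = 0.$$

The general solution to this ODE reads

(S98) $$\rho_{ES}(x) = C_{1S}\,e^{-x/\lambda_{ES}} + C_{2S}\,e^{x/\lambda_{ES}},$$

where λES = √(D/koffS), and C1S and C2S (S = R, W) are constants which are different for right and wrong complexes. The no-flux boundary condition at x = L can be used to relate these constants (C2S = C1S e^(−2L/λES)) and simplify the complex profile expression, namely,

(S101) $$\rho_{ES}(x) = C_{1S}\left(e^{-x/\lambda_{ES}} + e^{(x - 2L)/\lambda_{ES}}\right) = \tilde{C}_{1S}\cosh\left(\frac{L - x}{\lambda_{ES}}\right),$$

where C̃1S = 2C1S e^(−L/λES) is a new constant coefficient introduced for convenience.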

Now, the fidelity of the scheme is the ratio of right and wrong complex densities at x=L. Using the result above, the fidelity can be written as

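(S102) $$\eta = \frac{\rho_{ER}(L)}{\rho_{EW}(L)} = \frac{\tilde{C}_{1R}}{\tilde{C}_{1W}}.$$

The ratio of these constant coefficients can be obtained by noting that the diffusive fluxes of right and wrong complexes at x = 0 are identical (from Equations S88 and S89), that is,

(S105) $$-D\frac{d\rho_{ER}}{dx}\bigg|_{x=0} = -D\frac{d\rho_{EW}}{dx}\bigg|_{x=0} \;\Rightarrow\; \frac{\tilde{C}_{1R}}{\lambda_{ER}}\sinh(L/\lambda_{ER}) = \frac{\tilde{C}_{1W}}{\lambda_{EW}}\sinh(L/\lambda_{EW}) \;\Rightarrow\; \frac{\tilde{C}_{1R}}{\tilde{C}_{1W}} = \frac{\lambda_{ER}}{\lambda_{EW}}\,\frac{\sinh(L/\lambda_{EW})}{\sinh(L/\lambda_{ER})}.$$

Substituting this result into Equation S102, and recalling the equality L/λES = √(τD koffS), we obtain

(S106) $$\eta = \frac{\sqrt{\tau_D k_{\text{off}}^{W}}}{\sqrt{\tau_D k_{\text{off}}^{R}}}\,\frac{\sinh\left(\sqrt{\tau_D k_{\text{off}}^{W}}\right)}{\sinh\left(\sqrt{\tau_D k_{\text{off}}^{R}}\right)} = \sqrt{\eta_{\text{eq}}}\,\frac{\sinh\left(\sqrt{\tau_D k_{\text{off}}^{W}}\right)}{\sinh\left(\sqrt{\tau_D k_{\text{off}}^{R}}\right)}.$$

This expression is identical to that in Equation S20 which was derived under the ρE(x) ≈ constant assumption, suggesting that when substrates are highly localized, the shape of the free enzyme profile does not dictate the fidelity.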

Deriving the speed expression

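To keep the expression of speed compact while still illustrating the key consequences of relaxing the ρE(x) ≈ constant assumption, we will assume moving forward that the density of wrong complexes is much lower than that of the right complexes, that is, ρEW(x) ≪ ρER(x). This assumption holds as long as the right and wrong complexes have sufficiently different off-rates. To see why this is the case, note that the ratio ρEW(x)/ρER(x) is the highest at x = 0. We therefore calculate an upper bound for the ratio using Equation S101 and Equation S105 as

(S107) $$\frac{\rho_{EW}(x)}{\rho_{ER}(x)} \le \frac{\rho_{EW}(0)}{\rho_{ER}(0)} = \frac{\lambda_{EW}}{\lambda_{ER}}\,\frac{\tanh(L/\lambda_{ER})}{\tanh(L/\lambda_{EW})} < \frac{\lambda_{EW}}{\lambda_{ER}} = \sqrt{\frac{k_{\text{off}}^{R}}{k_{\text{off}}^{W}}} = \frac{1}{\sqrt{\eta_{\text{eq}}}}.$$

As long as ηeq ≳ 10, it is fair to assume that the right complexes greatly outnumber the wrong ones, which allows us to approximate the free enzyme density from Equation S96 as ρE(x) ≈ ρ0 − ρER(x).

The specification of the right complex density profile requires the knowledge of the unknown coefficient C̃1R. To find this coefficient, we use the boundary condition in Equation S88 and the approximation ρE(x) ≈ ρ0 − ρER(x) to write

(S108) $$\frac{D\,\tilde{C}_{1R}}{\lambda_{ER}}\sinh(L/\lambda_{ER}) = k_{\text{on}}S_{\text{total}}\left(\rho_0 - \tilde{C}_{1R}\cosh(L/\lambda_{ER})\right) \;\Rightarrow\; \tilde{C}_{1R} = \frac{k_{\text{on}}S_{\text{total}}\,\rho_0}{\dfrac{D}{\lambda_{ER}}\sinh(L/\lambda_{ER}) + k_{\text{on}}S_{\text{total}}\cosh(L/\lambda_{ER})} = \frac{k_{\text{on}}S_{\text{total}}\,\rho_0}{\lambda_{ER}k_{\text{off}}^{R}\sinh(L/\lambda_{ER}) + k_{\text{on}}S_{\text{total}}\cosh(L/\lambda_{ER})}$$

(S109) $$= \rho_0 \times \frac{\dfrac{k_{\text{on}}S_{\text{total}}}{k_{\text{off}}^{R}L}}{1 + \dfrac{L}{\lambda_{ER}}\dfrac{\cosh(L/\lambda_{ER})}{\sinh(L/\lambda_{ER})}\dfrac{k_{\text{on}}S_{\text{total}}}{k_{\text{off}}^{R}L}} \times \frac{L/\lambda_{ER}}{\sinh(L/\lambda_{ER})}.$$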

With the constant coefficient known, the right complex density then becomes

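(S110) $$\rho_{ER}(x) = \rho_0 \times \frac{\bar{\rho}_S/K_{\text{d}}^{R}}{1 + \dfrac{L}{\lambda_{ER}}\dfrac{\cosh(L/\lambda_{ER})}{\sinh(L/\lambda_{ER})}\,\bar{\rho}_S/K_{\text{d}}^{R}} \times \frac{L/\lambda_{ER}}{\sinh(L/\lambda_{ER})}\cosh\left(\frac{L - x}{\lambda_{ER}}\right),$$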

where we used the definitions of the mean substrate density ρ¯S=Stotal/L and the dissociation constant KdR=koffR/kon.

To enable a direct parallel between this general treatment and the earlier one with the ρE(x) ≈ constant approximation, let us introduce ρER^∞ as the uniform right complex density when diffusion is very fast (λER ≫ L) and calculate it from Equation S110 as

(S111) $$\rho_{ER}^{\infty} = \rho_0 \times \frac{\bar{\rho}_S/K_{\text{d}}^{R}}{1 + \bar{\rho}_S/K_{\text{d}}^{R}}.$$

Now, using the ρER^∞ expression, we rewrite Equation S110 as

(S112) $$\rho_{ER}(x) = \frac{1 + \bar{\rho}_S/K_{\text{d}}^{R}}{1 + \dfrac{L}{\lambda_{ER}}\dfrac{\cosh(L/\lambda_{ER})}{\sinh(L/\lambda_{ER})}\,\bar{\rho}_S/K_{\text{d}}^{R}} \times \rho_{ER}^{\infty} \times \frac{L/\lambda_{ER}}{\sinh(L/\lambda_{ER})}\cosh\left(\frac{L - x}{\lambda_{ER}}\right) = \frac{1 + \bar{\rho}_S/K_{\text{d}}^{R}}{1 + \gamma\,\bar{\rho}_S/K_{\text{d}}^{R}} \times \rho_{ER}^{\text{const}}(x), \qquad \gamma \equiv \frac{L}{\lambda_{ER}}\,\frac{\cosh(L/\lambda_{ER})}{\sinh(L/\lambda_{ER})},$$

where ρER^const(x) is the complex density obtained under the ρE(x) ≈ constant assumption (Equation S15). The extra factor that appears in front does not exceed 1 since γ ≥ 1, indicating a reduction in speed, as we anticipated in our more qualitative discussion at the beginning of the section. The presence of the extra factor suggests two possibilities for the approximation to hold true: first, γ ≈ 1, which happens when λER ≫ L, that is, when the right complex does not decay noticeably across the compartment; and second, γ > 1 and ρ̄S ≪ γ^(−1) KdR, which is when right complexes do decay but their fraction is low compared with free enzymes because of low substrate concentration.

Let us demonstrate the last statement more explicitly. Specifically, let us show that the validity of the approximation ρE(x) ≈ constant is indeed linked directly to the fraction of bound enzymes. To that end, we evaluate ρE(0)/ρE(L) as a metric that quantifies the degree to which ρE(x) ≈ constant holds. If there is a large depletion of free enzymes near the substrate-binding site, then the metric will be significantly less than 1; conversely, if the free enzyme profile is practically flat, then the metric will be close to 1. Invoking the relation ρE(x) ≈ ρ0 − ρER(x) and using our result for the complex density (Equation S110) as well as the definition of γ in Equation S112, we evaluate this metric as

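(S113) $$\frac{\rho_E(0)}{\rho_E(L)} \approx \frac{\rho_0 - \rho_{ER}(0)}{\rho_0 - \rho_{ER}(L)} = \frac{1 - \dfrac{\gamma\,\bar{\rho}_S/K_{\text{d}}^{R}}{1 + \gamma\,\bar{\rho}_S/K_{\text{d}}^{R}}}{1 - \dfrac{\gamma\,\bar{\rho}_S/K_{\text{d}}^{R}}{\cosh(L/\lambda_{ER})\left(1 + \gamma\,\bar{\rho}_S/K_{\text{d}}^{R}\right)}} = \frac{1}{1 + \left(1 - \dfrac{1}{\cosh(L/\lambda_{ER})}\right)\gamma\,\bar{\rho}_S/K_{\text{d}}^{R}}.$$

Next, we calculate the fraction of bound enzymes pbound from Equation S110 as

(S114) $$p_{\text{bound}} \approx E_{\text{total}}^{-1}\int_0^L \rho_{ER}(x)\,dx = \frac{\rho_0 L}{E_{\text{total}}}\,\frac{\bar{\rho}_S/K_{\text{d}}^{R}}{1 + \gamma\,\bar{\rho}_S/K_{\text{d}}^{R}} = \frac{\bar{\rho}_S/K_{\text{d}}^{R}}{1 + \gamma\,\bar{\rho}_S/K_{\text{d}}^{R}}.$$

Note that γ^(−1) emerges as the highest fraction of bound enzymes (pbound^max) reached in the large substrate concentration limit.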

To link the metric ρE(0)/ρE(L) to the fraction of bound enzymes, we express ρ¯S/KdR in terms of pbound and substitute it into Equation S113, namely,


Now, when the complexes do not decay appreciably across the compartment (λER ≫ L and thus cosh(L/λER) ≈ 1), the metric becomes roughly equal to 1, suggesting that the free enzyme profile is practically flat. A more interesting case is when the complexes do decay (λER < L), as in Appendix 5—figure 1. In this case, applying the condition cosh(L/λER) ≫ 1, we find

(S117) $$\frac{\rho_E(0)}{\rho_E(L)} \approx 1 - \frac{p_{\text{bound}}}{p_{\text{bound}}^{\max}}.$$

The anti-correlation between ρE(0)/ρE(L) and pbound in the above result demonstrates that the degree to which the approximation ρE(x) ≈ constant is violated is indeed dictated by the fraction of bound enzymes.
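The link between the flatness metric and the bound fraction can be verified with a short sketch of Equations S113, S114 and S117. Here s stands for ρ̄S/KdR and z for L/λER; these names, the function names and the test values are assumptions for illustration. The intermediate relation used in the check, ρE(0)/ρE(L) = (1 − γ pbound)/(1 − γ pbound/cosh z), follows from eliminating s between Equations S113 and S114.

```python
import math

def gamma(z):
    """gamma = (L / lambda_ER) * cosh(L / lambda_ER) / sinh(L / lambda_ER)."""
    return z / math.tanh(z)

def flatness_metric(s, z):
    """rho_E(0) / rho_E(L) from Equation S113, with s = rho_S_mean / K_d^R."""
    return 1.0 / (1.0 + (1.0 - 1.0 / math.cosh(z)) * gamma(z) * s)

def bound_fraction(s, z):
    """Fraction of bound enzymes from Equation S114."""
    return s / (1.0 + gamma(z) * s)

if __name__ == "__main__":
    s = 3.0                                     # illustrative substrate level
    for z in (0.01, 10.0):                      # weak and strong decay of complexes
        p = bound_fraction(s, z)
        p_max = 1.0 / gamma(z)
        print(f"z = {z}: metric = {flatness_metric(s, z):.5f}, "
              f"1 - p/p_max = {1.0 - p / p_max:.5f}")
```

For small z the metric stays close to 1 regardless of the bound fraction, while for large z it tracks 1 − pbound/pbound^max, in line with Equation S117.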

Pareto front shift

The previous calculations showed that, in the ideal substrate localization limit, relaxing the ρE(x) ≈ constant assumption keeps the fidelity the same while reducing the speed, and that this reduction is greater for higher substrate concentrations. We therefore expect a shift in the Pareto front when going to the high substrate concentration limit, as is illustrated in Appendix 5—figure 2a. To get more intuition about the effect of this shift caused by tuning the amount of substrates, we consider the effective number of proofreading realizations at half-maximum speed (n50) and study how this number changes as a function of the fraction of enzymes bound (pbound), which increases monotonically with Stotal as suggested by Equation S114. Appendix 5—figure 2b shows this dependence. As can be seen, n50 decreases roughly linearly with pbound; for example, if 10% of the enzymes are bound, then a 10% reduction in n50 is expected. This suggests that as long as the fraction of bound enzymes is low, our findings related to the Pareto front made under the ρE(x) ≈ constant assumption will generally hold true.

Appendix 5—figure 2
Consequences of relaxing the ρE(x) ≈ constant assumption on the Pareto front.

(a) Pareto fronts in the low and high substrate concentration limits. (b) Reduction in the effective number of proofreading realizations at half-maximum speed as a function of the fraction of enzymes bound. ηeq=10 was used in making the plots.

2. Effects that relaxing the ρE(x) ≈ constant assumption has on fidelity in a weak substrate gradient setting

In this section, we study how accounting for the spatial distribution of free enzymes affects our results on the model’s fidelity in the setting where substrates have a finite localization length scale λS. In this setting, Equations (1–3) (in the main text) describing the system’s dynamics become a system of nonlinear equations, which we solve at steady state using numerical methods.

An example curve of how fidelity changes with tuning diffusion time scale in a finite λS setting is shown in Appendix 5—figure 3. As expected, the nonuniform free enzyme profile leads to a reduction in fidelity. This reduction is not significant when diffusion is relatively fast as in that case the free enzyme profile manages to flatten out rapidly. The reduction is not significant also in the very slow diffusion limit where binding events that lead to production primarily take place in the proximity of the activation region and hence, the nonuniform profile of free enzymes across the compartment has little impact on fidelity. The greatest reduction happens at intermediate diffusion time scales; in particular, when the system achieves its peak fidelity.

Appendix 5—figure 3
Fidelity as a function of diffusion time scale calculated with and without making the ρE(x) ≈ constant approximation.

The total number of free substrates is chosen so that ρ¯S/KdR=3. The substrate localization length scale used for generating the solid curves is λS/L=0.04.

To quantify the extent of this highest reduction, we calculated the peak value of the effective number of proofreading realizations (nmax) for different free substrate amounts which regulate the fraction of bound enzymes (pbound). The results obtained for different choices of λS are summarized in Appendix 5—figure 4. As can be seen, for the high substrate localization case (λS/L=0.04), there is a roughly linear dependence between nmax and pbound. The initial decrease in nmax with growing pbound is even slower when substrates are less tightly localized (λS/L=0.10,0.30).

Appendix 5—figure 4
Reduction in the peak effective number of proofreading realizations as a function of pbound.

nmaxlow represents the peak value of neff in the limit of low substrate concentration (the maximum of the solid blue curve in Appendix 5—figure 3).

Taken together, these results suggest that if the substrate concentration is low enough to leave most of the enzymes unbound, then our proposed scheme will proofread efficiently. And this requirement on substrate amount will be further relaxed if diffusion is fast, or if substrates are not very tightly localized.

Appendix 6

Proofreading on a kinase/phosphatase-induced gradient

In this section, we introduce the mathematical modeling setup for the kinase/phosphatase-based gradient formation scheme and describe how its fidelity is calculated numerically. In the end, we discuss the energetics of setting up the substrate concentration gradient and link our calculations to the lower bounds on energy cost obtained earlier in Appendix 2.

1. Setup and estimation of fidelity

In the analysis thus far, we have imposed a gradient of free substrates and analyzed the proofreading capability of an enzyme acting on this gradient. In a living cell, gradients themselves are maintained by active cellular processes. However, the action of the enzyme – that is, binding a substrate in one spatial location, diffusing away, and releasing the substrate elsewhere – can destroy the gradient, and thereby lead to a loss of proofreading. Here, we analyze the consequences of free substrate depletion and gradient flattening caused by the enzyme.

We model the formation of a substrate gradient by a combination of localized activation and delocalized deactivation. We suppose that substrates can exist in phosphorylated or dephosphorylated forms, and that only the phosphorylated form is capable of binding to the enzyme. The substrates are phosphorylated by a kinase with rate kkin = 0.2 s⁻¹, and dephosphorylated by a phosphatase with rate kp = 5 s⁻¹. Crucially, we assume that phosphatases are found everywhere in the domain of size L ∼ 10 μm (a typical length scale in a eukaryotic cell), while kinases are localized to one end of the domain (at x = 0), as may occur naturally if kinases are bound to one of the membranes enclosing the domain.

The minimal dynamics of phosphorylated substrates and enzyme–substrate complexes is then given by

(S118) $$\frac{\partial\rho_S}{\partial t} = D\nabla^2\rho_S - k_b\rho_S + k_{\text{off}}^{S}\rho_{ES} - k_p\rho_S, \qquad \frac{\partial\rho_{ES}}{\partial t} = D\nabla^2\rho_{ES} + k_b\rho_S - k_{\text{off}}^{S}\rho_{ES},$$

augmented by the boundary conditions

(S119) $$\text{Substrate phosphorylation: } -D\nabla\rho_S\big|_{x=0} = k_{\text{kin}}, \qquad \text{No-flux: } -D\nabla\rho_S\big|_{x=L} = -D\nabla\rho_{ES}\big|_{x=L} = -D\nabla\rho_{ES}\big|_{x=0} = 0.$$

Here, we have supposed that the densities of free enzymes, dephosphorylated substrates, and phosphatases are fixed and uniform, and have absorbed them into the relevant rate constants (kb = kon ρE, kkin, and kp, respectively). For simplicity, we have also assumed that the free substrates and enzyme–substrate complexes have the same diffusion coefficient D = 1 μm²/s. We note that accounting for distinct diffusivities of phosphorylated and unphosphorylated substrate forms (Kholodenko, 2009) would affect the speed, while accounting for the slower diffusion of the enzyme–substrate complex would alter the estimates of both speed and fidelity of the model. One or several of these effects can be considered when studying a specific biological system where these microscopic details are known.

We numerically solve Equations S118 and S119 at steady state to obtain the concentration profiles. First, the equations of dynamics are made dimensionless by setting the units of length and time to L (x̄ = x/L) and τD ≡ L²/D (t̄ = t/τD), respectively. At steady state, the dimensionless equations read

(S120) $$\bar{\nabla}^2\bar{\rho}_S = (\bar{k}_b + \bar{k}_p)\bar{\rho}_S - \bar{k}_{\text{off}}^{S}\bar{\rho}_{ES}, \qquad \bar{\nabla}^2\bar{\rho}_{ES} = -\bar{k}_b\bar{\rho}_S + \bar{k}_{\text{off}}^{S}\bar{\rho}_{ES},$$

with boundary conditions

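(S121) $$-\bar{\nabla}\bar{\rho}_S\big|_{\bar{x}=0} = \bar{k}_{\mathrm{kin}}, \qquad \bar{\nabla}\bar{\rho}_S\big|_{\bar{x}=1} = \bar{\nabla}\bar{\rho}_{ES}\big|_{\bar{x}=1} = \bar{\nabla}\bar{\rho}_{ES}\big|_{\bar{x}=0} = 0,$$

where concentrations have been rescaled as $\bar{\rho} = \rho L$, and kinetic rates as $\bar{k} = k\tau_D$.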
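As a quick illustration of this rescaling, the following minimal sketch computes the diffusion time scale and the dimensionless rates from the parameter values quoted in the text. The variable and function names (`tau_D`, `nondimensionalize`, `rates_bar`) are chosen here for illustration and are not taken from the authors' scripts.

```python
# Parameter values quoted in the text (micrometres and seconds).
L = 10.0        # domain size, um
D = 1.0         # diffusion coefficient, um^2/s
k_kin = 0.2     # kinase rate entering the boundary condition at x = 0, 1/s
k_p = 5.0       # phosphatase rate, 1/s
k_off_R = 0.1   # off-rate of the right (cognate) substrate, 1/s
k_off_W = 1.0   # off-rate of the wrong (non-cognate) substrate, 1/s

tau_D = L**2 / D  # diffusion time scale tau_D = L^2 / D, in seconds


def nondimensionalize(rate):
    """Rescale a first-order rate as k_bar = k * tau_D."""
    return rate * tau_D


rates = {"k_kin": k_kin, "k_p": k_p, "k_off_R": k_off_R, "k_off_W": k_off_W}
rates_bar = {name: nondimensionalize(value) for name, value in rates.items()}

print(f"tau_D = {tau_D} s")
for name, value in rates_bar.items():
    print(f"{name}_bar = {value:g}")
```

With these values the diffusion time is 100 s, so the dimensionless phosphatase rate is 500 and the dimensionless off-rates of the right and wrong substrates are 10 and 100.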

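We discretize the steady-state equations on a grid with spacing $\Delta\bar{x} = 0.01$, approximating the second derivative as

(S122) $$\bar{\nabla}^2\bar{\rho} \approx \frac{1}{\Delta\bar{x}^2}\Big(\bar{\rho}(\bar{x}+\Delta\bar{x}) + \bar{\rho}(\bar{x}-\Delta\bar{x}) - 2\bar{\rho}(\bar{x})\Big).$$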

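This is ill-defined at the boundaries $\bar{x} = 0$ and $\bar{x} = 1$, which is addressed by incorporating the boundary conditions. For illustration, consider the left boundary, $\bar{x} = 0$, and suppose that our domain also included a point at $\bar{x} = -\Delta\bar{x}$. Then, we could approximate the boundary condition $\bar{\nabla}\bar{\rho}_S\big|_{\bar{x}=0} = -\bar{k}_{\mathrm{kin}}$ by a centred difference scheme, and solve for the fictional point at $\bar{x} = -\Delta\bar{x}$, namely,

$$\frac{\bar{\rho}_S(\Delta\bar{x}) - \bar{\rho}_S(-\Delta\bar{x})}{2\Delta\bar{x}} = -\bar{k}_{\mathrm{kin}} \quad\Rightarrow\quad \bar{\rho}_S(-\Delta\bar{x}) = \bar{\rho}_S(\Delta\bar{x}) + 2\Delta\bar{x}\,\bar{k}_{\mathrm{kin}},$$

which, when inserted into Equation S122, specifies $\bar{\nabla}^2\bar{\rho}_S$ at $\bar{x} = 0$, that is,

(S123) $$\bar{\nabla}^2\bar{\rho}_S\big|_{\bar{x}=0} = \frac{1}{\Delta\bar{x}^2}\Big(2\bar{\rho}_S(\Delta\bar{x}) - 2\bar{\rho}_S(0)\Big) + \frac{2}{\Delta\bar{x}}\,\bar{k}_{\mathrm{kin}}.$$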
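The boundary stencil of Equation S123 can be sanity-checked on a profile whose derivatives are known. The sketch below is an illustration only: the test profile $e^{-a\bar{x}}$ and the names `boundary_second_derivative`, `a` and `profile` are assumptions made here, not part of the original analysis. For this profile the slope at the origin is $-a$, so $a$ plays the role of $\bar{k}_{\mathrm{kin}}$, and the exact second derivative at the origin is $a^2$.

```python
import math


def boundary_second_derivative(rho_0, rho_1, k_kin_bar, dx):
    """Ghost-point estimate of the second derivative at x_bar = 0 (Equation S123).

    rho_0 and rho_1 are the values at x_bar = 0 and x_bar = dx; the boundary
    condition d(rho)/dx = -k_kin_bar at x_bar = 0 is built into the formula.
    """
    return (2.0 * rho_1 - 2.0 * rho_0) / dx**2 + 2.0 * k_kin_bar / dx


# Hypothetical test profile rho(x) = exp(-a x): slope -a at x = 0, second derivative a^2.
a = 3.0


def profile(x):
    return math.exp(-a * x)


exact = a**2
for dx in (0.01, 0.005):
    estimate = boundary_second_derivative(profile(0.0), profile(dx), k_kin_bar=a, dx=dx)
    print(f"dx = {dx}: estimate = {estimate:.4f}, exact = {exact:.4f}")
```

A Taylor expansion shows that the leading error of this boundary stencil is proportional to $\Delta\bar{x}$, so halving the grid spacing should roughly halve the discrepancy at the boundary.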

For the boundary at the right (x¯=1) as well as for the boundary conditions for ρ¯ES, we similarly implement no-flux boundary conditions. After discretizing, Equation S120 can then be written in a matrix form as

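(S124) $$\begin{aligned} \underbrace{\left(\frac{1}{\Delta\bar{x}^2}\begin{pmatrix} -2 & 2 & 0 & 0 & \cdots \\ 1 & -2 & 1 & 0 & \cdots \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ \cdots & 0 & 1 & -2 & 1 \\ \cdots & 0 & 0 & 1 & -1 \end{pmatrix} - (\bar{k}_b + \bar{k}_p)\,\mathbb{I}\right)}_{\mathbb{M}_S}\vec{\rho}_S &= -\bar{k}^S_{\mathrm{off}}\,\vec{\rho}_{ES} + \underbrace{\begin{pmatrix} -\frac{2}{\Delta\bar{x}}\bar{k}_{\mathrm{kin}} \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}}_{\vec{b}}, \\ \underbrace{\left(\frac{1}{\Delta\bar{x}^2}\begin{pmatrix} -1 & 1 & 0 & 0 & \cdots \\ 1 & -2 & 1 & 0 & \cdots \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ \cdots & 0 & 1 & -2 & 1 \\ \cdots & 0 & 0 & 1 & -1 \end{pmatrix} - \bar{k}^S_{\mathrm{off}}\,\mathbb{I}\right)}_{\mathbb{M}_{ES}}\vec{\rho}_{ES} &= -\bar{k}_b\,\vec{\rho}_S, \end{aligned}$$

where $\vec{\rho}_S$, $\vec{\rho}_{ES}$ are column vectors of the nondimensionalized concentration profiles evaluated at the spatial grid points, that is, $[\bar{\rho}(0), \bar{\rho}(\Delta\bar{x}), \ldots]^{\mathrm{T}}$. Solving these matrix equations yields

(S125) $$\vec{\rho}_S = \left(\mathbb{M}_S - \bar{k}^S_{\mathrm{off}}\bar{k}_b\,\mathbb{M}_{ES}^{-1}\right)^{-1}\vec{b}, \qquad \vec{\rho}_{ES} = -\bar{k}_b\left(\mathbb{M}_S\mathbb{M}_{ES} - \bar{k}^S_{\mathrm{off}}\bar{k}_b\,\mathbb{I}\right)^{-1}\vec{b}.$$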

We compute Equation S125 numerically for two substrates: a cognate (‘R’) and a non-cognate (‘W’), which differ in their off-rates ($k^R_{\mathrm{off}} = 0.1\ \mathrm{s}^{-1}$ and $k^W_{\mathrm{off}} = 1\ \mathrm{s}^{-1}$, respectively). Having the density profiles, the fidelity of the model becomes $\eta \equiv \bar{\rho}_{ER}(\bar{x}=1)/\bar{\rho}_{EW}(\bar{x}=1)$. We calculate the fidelity for different choices of the first-order rate of enzyme–substrate binding ($k_b = k_{\mathrm{on}}\rho_E$); this may be thought of as varying the concentration of free enzyme in the cell. The results are shown in Figure 5 of the main text.
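The procedure of Equations S124 and S125 can be sketched in a few lines of NumPy. This is a minimal illustration under the parameter values quoted in the text, not the authors' supporting script: the function names (`second_difference`, `steady_state_profiles`, `fidelity`) are hypothetical, the boundary rows follow the reading of Equation S124 in which the matrix for the free substrate starts with the row (-2, 2) and both matrices end with the row (1, -1), and the sampled values of $k_b$ are arbitrary.

```python
import numpy as np


def second_difference(n, dx, ghost_point_left):
    """Second-derivative matrix on n grid points with no-flux ends.

    ghost_point_left=True gives the first row (-2, 2), used for rho_S together
    with the source vector b; False gives the first row (-1, 1), used for rho_ES.
    The last row is (1, -1) in both cases.
    """
    lap = np.zeros((n, n))
    i = np.arange(1, n - 1)
    lap[i, i - 1] = 1.0
    lap[i, i] = -2.0
    lap[i, i + 1] = 1.0
    lap[0, 0], lap[0, 1] = (-2.0, 2.0) if ghost_point_left else (-1.0, 1.0)
    lap[-1, -2], lap[-1, -1] = 1.0, -1.0
    return lap / dx**2


def steady_state_profiles(k_b, k_off, k_p=5.0, k_kin=0.2, L=10.0, D=1.0, dx=0.01):
    """Return dimensionless (rho_S, rho_ES) on the grid x_bar = 0, dx, ..., 1."""
    tau_D = L**2 / D
    kb, koff, kp, kkin = (rate * tau_D for rate in (k_b, k_off, k_p, k_kin))
    n = int(round(1.0 / dx)) + 1
    identity = np.eye(n)
    M_S = second_difference(n, dx, True) - (kb + kp) * identity
    M_ES = second_difference(n, dx, False) - koff * identity
    b = np.zeros(n)
    b[0] = -2.0 * kkin / dx  # kinase source at x_bar = 0
    # Equation S125: rho_S = (M_S - koff*kb*M_ES^-1)^-1 b, then M_ES rho_ES = -kb rho_S.
    rho_S = np.linalg.solve(M_S - koff * kb * np.linalg.inv(M_ES), b)
    rho_ES = -kb * np.linalg.solve(M_ES, rho_S)
    return rho_S, rho_ES


def fidelity(k_b, k_off_R=0.1, k_off_W=1.0):
    """Ratio of right to wrong complex densities at the far end, x_bar = 1."""
    _, rho_ER = steady_state_profiles(k_b, k_off_R)
    _, rho_EW = steady_state_profiles(k_b, k_off_W)
    return rho_ER[-1] / rho_EW[-1]


if __name__ == "__main__":
    for k_b in (0.1, 1.0, 10.0):  # illustrative binding rates in 1/s
        print(f"k_b = {k_b:5.1f} 1/s  ->  fidelity = {fidelity(k_b):.3g}")
```

The right and wrong substrates are solved independently here, which mirrors the assumption that the free enzyme density is fixed and absorbed into $k_b$. For comparison, a scheme without any spatial separation would be limited to the ratio of the off-rates, which is 10 for the values above.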

2. Energy dissipation

In Appendices 2.1 and 2.3, we estimated lower bounds on the minimum power that needs to be dissipated in order to counter the homogenizing effect that enzyme activity and substrate diffusion respectively have on localized substrate profiles. Here, we calculate the energy dissipation required to run the kinase/phosphatase-based mechanism and compare it with these lower bounds estimated earlier.

Let us assume that phosphorylation and dephosphorylation reactions by kinases and phosphatases are nearly irreversible with associated free energy costs of $\Delta\varepsilon_{\mathrm{kin}}$ and $\Delta\varepsilon_{\mathrm{phosph}}$ per reaction, respectively. The net rate at which active substrates get dephosphorylated is $k_p S_{\mathrm{phosphorylated}}$ and it needs to be identical to the net phosphorylation rate of inactive substrates in order for $S_{\mathrm{phosphorylated}}$ to remain constant. With the costs of each reaction known, we can write the rate of energy dissipation $P_{k/p}$ as

(S126) $$P_{k/p} = k_p\,S_{\mathrm{phosphorylated}}\,(\Delta\varepsilon_{\mathrm{kin}} + \Delta\varepsilon_{\mathrm{phosph}}).$$

To gain analytical intuition, we first consider the case where the enzyme activity is very low, so that the kinase/phosphatase–based mechanism maintains an exponential profile of active substrates with a decay length scale $\lambda_S = \sqrt{D_S/k_p}$. Expressing the rate of dephosphorylation in terms of $\lambda_S$ and $D_S$ (i.e., $k_p = D_S/\lambda_S^2$), and substituting it into Equation S126, we obtain

(S127) $$P_{k/p} = \frac{D_S\,S_{\mathrm{phosphorylated}}}{\lambda_S^2}\,(\Delta\varepsilon_{\mathrm{kin}} + \Delta\varepsilon_{\mathrm{phosph}}).$$

Comparing this result with the lower dissipation bound found earlier (Equation S43), we can note the presence of an extra factor $\beta(\Delta\varepsilon_{\mathrm{kin}} + \Delta\varepsilon_{\mathrm{phosph}})$. Since the free energy consumption during ATP hydrolysis is $\sim 10\,k_{\mathrm{B}}T$, we can say that the power dissipated by the kinase/phosphatase system for setting up an exponential gradient surpasses the lower limit necessary for counteracting diffusion roughly by an order of magnitude.
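To put numbers on Equations S126 and S127, the short sketch below evaluates the decay length, the dissipation per phosphorylated substrate, and the extra factor relative to the diffusion bound. It assumes $D_S = 1\ \mu\mathrm{m}^2/\mathrm{s}$, $k_p = 5\ \mathrm{s}^{-1}$ and $\Delta\varepsilon_{\mathrm{kin}} = \Delta\varepsilon_{\mathrm{phosph}} = 10\,k_{\mathrm{B}}T$, the values used elsewhere in this appendix; the variable names are illustrative.

```python
import math

D_S = 1.0            # substrate diffusion coefficient, um^2/s
k_p = 5.0            # dephosphorylation rate, 1/s
d_eps_kin = 10.0     # cost per phosphorylation, in units of kBT (assumed value)
d_eps_phosph = 10.0  # cost per dephosphorylation, in units of kBT (assumed value)

# Decay length of the exponential substrate profile.
lambda_S = math.sqrt(D_S / k_p)  # um

# Equation S126 per phosphorylated substrate (i.e. with S_phosphorylated = 1).
power_per_substrate = k_p * (d_eps_kin + d_eps_phosph)  # kBT/s

# Equation S127 gives the same number when k_p is written as D_S / lambda_S^2.
power_via_lambda = D_S / lambda_S**2 * (d_eps_kin + d_eps_phosph)  # kBT/s

# Extra factor beta * (d_eps_kin + d_eps_phosph); energies are already in kBT.
excess_factor = d_eps_kin + d_eps_phosph

print(f"lambda_S = {lambda_S:.3f} um")
print(f"P per phosphorylated substrate = {power_per_substrate:.0f} kBT/s")
print(f"excess over the diffusion bound = {excess_factor:.0f}x")
```

Under these assumptions the decay length is about 0.45 μm, each phosphorylated substrate costs about 100 $k_{\mathrm{B}}T$ per second to maintain, and the extra factor is 20, consistent with the order-of-magnitude statement above.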

Next, we explore the energetics of the kinase/phosphatase-based mechanism in the context of the power–fidelity trade-off. Our study of the trade-off in Figure 4 of the main text was performed under the assumption that substrate profiles were exponentially decaying in the entire spatial domain. In Appendix 6—figure 1a, we show the trade-off curves obtained under this assumption and compare them with the trade-off curve for the kinase/phosphatase-based mechanism that arises in response to changing the substrate localization by tuning kp. As can be seen, the predicted lower bound (sum of the minimum powers needed to counteract the enzyme action and substrate diffusion) is roughly an order of magnitude lower than the total dissipation of the mechanism, and this difference increases with higher fidelity.

Note, however, that the assumption about an exponential substrate localization is not generally valid for the kinase/phosphatase-based mechanism because substrates can be deposited in low–concentration regions and not get immediately dephosphorylated (Appendix 6—figure 1c). We therefore refine our lower bounds on the dissipated power by estimating them numerically using their generic definitions, namely, Equation S30 for counteracting enzymatic action, and Equation S42 for counteracting substrate diffusion. These refined estimates suggest a factor of ∼10 difference between the total cost and its lower bound consistently across a wide region of the trade-off curve. This means that substrate gradient maintenance through practically irreversible phosphorylation and dephosphorylation reactions has low energetic efficiency for doing spatial proofreading, which, however, may be sustainable depending on the energy budget of the cell.

Appendix 6—figure 1
Energetic performance of the kinase/phosphatase–based mechanism.

(a) Total dissipation and calculated lower bounds under the assumption of exponential substrate localization. (b) Total dissipation and lower bounds estimated without assuming exponential substrate profiles. In both (a) and (b), $k_b = 1\ \mathrm{s}^{-1}$ and $\Delta\varepsilon_{\mathrm{kin}} = \Delta\varepsilon_{\mathrm{phosph}} = 10\,k_{\mathrm{B}}T$ were used. (c) Example profiles of right and wrong substrates for the physiologically relevant dephosphorylation rate $k_p = 5\ \mathrm{s}^{-1}$. Exponential decay of the substrate profile with the predicted length scale $\lambda_S = \sqrt{D_S/k_p}$ holds in the first 2 μm of the compartment.

Data availability

All scripts used to generate the data for making the plots are provided in supporting files.


    1. Kunkel TA
    (2004) DNA replication fidelity
    Journal of Biological Chemistry 279:16895–16898.

Article and author information

Author details

  1. Vahe Galstyan

    Biochemistry and Molecular Biophysics Option, California Institute of Technology, Pasadena, United States
    Conceptualization, Formal analysis, Investigation, Methodology, Writing - original draft, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-7073-9175
  2. Kabir Husain

    Department of Physics and the James Franck Institute, University of Chicago, Chicago, United States
    Conceptualization, Formal analysis, Investigation, Methodology, Writing - original draft, Writing - review and editing
    Competing interests
    No competing interests declared
  3. Fangzhou Xiao

    Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
    Conceptualization, Formal analysis, Investigation, Methodology, Writing - review and editing
    Competing interests
    No competing interests declared
  4. Arvind Murugan

    Department of Physics and the James Franck Institute, University of Chicago, Chicago, United States
    Conceptualization, Supervision, Funding acquisition, Investigation, Methodology, Writing - original draft, Project administration, Writing - review and editing
    For correspondence
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-5464-917X
  5. Rob Phillips

    1. Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
    2. Department of Physics, California Institute of Technology, Pasadena, United States
    Conceptualization, Supervision, Funding acquisition, Investigation, Methodology, Writing - original draft, Project administration, Writing - review and editing
    For correspondence
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-3082-2809


James S. McDonnell Foundation

  • Kabir Husain

Simons Foundation

  • Arvind Murugan

John Templeton Foundation

  • Rob Phillips

National Institute of General Medical Sciences

  • Rob Phillips

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.


We thank Anatoly Kolomeisky, Shu-ou Shan and Erik Winfree for insightful discussions, and Soichi Hirokawa and Avi Flamholz for providing useful feedback on the manuscript. We also thank Alexander Grosberg, whose idea of a compartmentalized ‘rotary demon’ motivated the development of our model. This work was supported by the NIH Grant 1R35 GM118043-01, the John Templeton Foundation Grants 51250 and 60973 (to RP), a James S. McDonnell Foundation postdoctoral fellowship (to KH), and the Simons Foundation (to AM).

Version history

  1. Received: June 26, 2020
  2. Accepted: December 24, 2020
  3. Accepted Manuscript published: December 24, 2020 (version 1)
  4. Version of Record published: January 18, 2021 (version 2)


© 2020, Galstyan et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.



  1. Vahe Galstyan
  2. Kabir Husain
  3. Fangzhou Xiao
  4. Arvind Murugan
  5. Rob Phillips
Proofreading through spatial gradients
eLife 9:e60415.
