A simple regulatory architecture allows learning the statistical structure of a changing environment
Abstract
Bacteria live in environments that are continuously fluctuating and changing. Exploiting any predictability of such fluctuations can lead to an increased fitness. On longer timescales, bacteria can ‘learn’ the structure of these fluctuations through evolution. However, on shorter timescales, inferring the statistics of the environment and acting upon this information would need to be accomplished by physiological mechanisms. Here, we use a model of metabolism to show that a simple generalization of a common regulatory motif (endproduct inhibition) is sufficient both for learning continuousvalued features of the statistical structure of the environment and for translating this information into predictive behavior; moreover, it accomplishes these tasks nearoptimally. We discuss plausible genetic circuits that could instantiate the mechanism we describe, including one similar to the architecture of twocomponent signaling, and argue that the key ingredients required for such predictive behavior are readily accessible to bacteria.
eLife digest
Associations inferred from previous experience can help an organism predict what might happen the next time it faces a similar situation. For example, it could anticipate the presence of certain resources based on a correlated environmental cue.
The complex neural circuitry of the brain allows such associations to be learned and unlearned quickly, certainly within the lifetime of an animal. In contrast, the subcellular regulatory circuits of bacteria are only capable of very simple information processing. Thus, in bacteria, the ‘learning’ of environmental patterns is believed to mostly occur by evolutionary mechanisms, over many generations.
Landmann et al. used computer simulations and a simple theoretical model to show that bacteria need not be limited by the slow speed of evolutionary trial and error. A basic regulatory circuit could, theoretically, allow a bacterium to learn subtle relationships between environmental factors within its lifetime. The essential components for this simulation can all be found in bacteria – including a large number of ‘regulators’, the molecules that control the rate of biochemical processes. And indeed, some organisms often have more of these biological actors than appears to be necessary. The results of Landmann et al. provide new hypothesis for how such seemingly ‘superfluous’ elements might actually be useful.
Knowing that a learning process is theoretically possible, experimental biologists could now test if it appears in nature. Placing bacteria in more realistic, fluctuating conditions instead of a typical stable laboratory environment could demonstrate the role of the extra regulators in helping the microorganisms to adapt by ‘learning’.
Introduction
Organisms that live in changing environments evolve strategies to respond to the fluctuations. Many such adaptations are reactive, for example sensory systems that allow detecting changes when they occur and responding to them. However, adaptations can be not only reactive, but also predictive. For example, circadian clocks allow photosynthetic algae to reorganize their metabolism in preparation for the rising sun (BellPedersen et al., 2005; Husain et al., 2019). Another example is the anticipatory behavior in E. coli, which allows it to prepare for the next environment under its normal cycling through the mammalian digestive tract (Savageau, 1983); similar behaviors have been observed in many species (Tagkopoulos et al., 2008; Mitchell et al., 2009).
All these behaviors effectively constitute predictions about a future environment: the organism improves its fitness by exploiting the regularities it ‘learns’ over the course of its evolution (Mitchell and Lim, 2016). Learning such regularities can be beneficial even if they are merely statistical in nature. A prime example is bet hedging: even if the environment changes stochastically and without warning, a population that learns the statistics of switching can improve its longterm fitness, for example, by adopting persistor phenotypes with appropriate probability (Kussell and Leibler, 2005; Veening et al., 2008). The seemingly limitless ingenuity of evolutionary trialanderror makes it plausible that virtually any statistical structure of the environment that remains constant over an evolutionary timescale could, in principle, be learnt by an evolving system, and harnessed to improve its fitness (Watson and Szathmáry, 2016).
However, the statistical structure of the environment can itself change, and this change can be too quick to be learned by evolution (Figure 1A). For example, an organism might experience a period of stability followed by a period of large fluctuations, or an environment where two resources are correlated, and then another where they are not. Note that there are two key timescales here – that of the fluctuations themselves (which we assume to be fast), and the slower timescale on which the structure of those fluctuations changes. One expects such scenarios to be common in an ecoevolutionary context. As an example, consider a bacterium in a small pool of water. Its immediate environment, shaped by local interactions, is fluctuating on the timescale at which the bacterium changes neighbors. The statistical properties of these fluctuations depend on the species composition of the pool. As such, the fast fluctuations are partially predictable, and learning their structure could help inform the fitnessoptimizing strategy: a neighbor encountered in a recent past is likely to be seen again in the near future. However, these statistics change on an ecological timescale, and such learning would therefore need to be accomplished by physiological, rather than evolutionary, mechanisms.
On a physiological timescale, this problem is highly nontrivial: the organism would have to perform inference from prior observations, encode them in memory, and act upon this knowledge (Figure 1B). It is clear that solutions to this problem do exist: such behaviors, common in neural systems, can be implemented by neuralnetworklike architectures; and these known architectures can be translated into biochemical networks (Hjelmfelt et al., 1991; Kobayashi, 2010; Fages et al., 2017; Katz and Springer, 2016; Katz et al., 2018). But singlecelled organisms operate in a severely hardwarelimited regime rarely probed by neuroscience. Streamlined by evolution, bacterial genomes quickly shed any unused complexity. Whether we could expect learninglike behaviors from bacteria depends on whether useful networks could be simple enough to plausibly be beneficial.
Known examples of phenotypic memory, for example, when the response is mediated by a longlived protein, can be interpreted as a simple form of learning (Lambert et al., 2014; Hoffer et al., 2001); circuits capable of adapting to the current mean of a fluctuating signal, as in bacterial chemotaxis (Barkai and Leibler, 1997), also belong in this category. Prior theory work has also proposed that simple genetic circuits could learn more subtle binary features, such as a (transient) presence or absence of a correlation between two signals (Sorek et al., 2013).
Here, we show that a simple generalization of a ubiquitous regulatory motif, the endproduct inhibition, can learn, store, and ‘act upon’ the information on continuousvalued features such as timescales and correlations of environmental fluctuations, and moreover, can do so nearoptimally. We identify the key ingredients giving rise to this behavior, and argue that their applicability is likely more general than the simple metabolically inspired example used here.
Results
The setup
For a simple model capturing some of the challenges of surviving in a fluctuating environment, consider a situation where some internal physiological quantities $\overrightarrow{P}=({P}_{1},\mathrm{\dots},{P}_{N})$ must track fluctuating external variables $\overrightarrow{D}=({D}_{1},\mathrm{\dots},{D}_{N})$. For example, the expression of a costly metabolic pathway would ideally track the availability of the relevant nutrient, or the solute concentration in the cytoplasm might track the osmolarity of the environment. In abstract terms, we describe these environmental pressures by the timedependent $\overrightarrow{D}(t)$, and postulate that the organism fitness is determined by the average mismatch $\sqrt{\u27e8{\sum}_{i=1}^{N}{({P}_{i}{D}_{i})}^{2}\u27e9}$, a quantity we will henceforth call ‘performance’. Here and below, angular brackets denote averaging over time.
In this simple model, a given static $\overrightarrow{D}$ clearly defines a unique optimal state $\overrightarrow{P}$; the regulatory challenge is entirely due to $\overrightarrow{D}$ being a fluctuating quantity. The challenges faced by real organisms are certainly vastly more rich: even in the static case, the optimal behavior may not be unique, or even welldefined (optimal under what constraints?); and in the dynamic case, the future state of the environment could be affected by past actions of the organism. These considerations can add layers of complexity to the problem, but our minimal model is sufficient to focus on the basic issues of sensing, learning and responding to changing fluctuation statistics of external factors.
If $\overrightarrow{D}$ changes sufficiently slowly, the organism can sense it and adapt $\overrightarrow{P}$ accordingly. We, instead, are interested in the regime of rapid fluctuations. When changes in $\overrightarrow{D}$ are too rapid for the organism to match $\overrightarrow{P}$ to $\overrightarrow{D}$ exactly, it can rely on statistical structure. At the simplest level, the organism could match the mean, setting $\overrightarrow{P}\equiv \u27e8\overrightarrow{D}\u27e9$. However, information on higherorder statistics, for example correlations between D_{1} and D_{2}, can further inform the behavior and improve fitness.
To see this, in what follows, we will consider the minimal case of such structured fluctuations, namely a $N$dimensional vector $\overrightarrow{D}=({D}_{1},\mathrm{\dots},{D}_{N})$ undergoing a random walk in a quadratic potential (the Ornstein—Uhlenbeck process):
with mean $\overrightarrow{\overline{D}}$, fluctuation strength Γ, independent Gaussian random variables $\overrightarrow{\eta}$ with zero mean and unit variance, and the matrix $M$ defining the potential.
In this system, the relevant ‘fluctuation structure’ is determined by $M$ and Γ. In one dimension, Equation (1) gives $D$ a variance of $\mathrm{\Gamma}/M$. In two dimensions, denoting the eigenvalues of $M$ as ${\lambda}_{1,2}$, the stationary distribution of the fluctuating $\overrightarrow{D}$ is a Gaussian distribution with principal axes oriented along the eigenvectors of $M$, and standard deviations along these directions given by $\sqrt{\mathrm{\Gamma}/{\lambda}_{1}}$ and $\sqrt{\mathrm{\Gamma}/{\lambda}_{2}}$. Intuitively, we can think of the fluctuating $\overrightarrow{D}$ as filling out an ellipse (Figure 1C). Going forward, when we refer to learning fluctuation structure, we mean learning properties of $M$ and Γ.
If $M$ and Γ are known, the optimal strategy minimizing $\u27e8{(\overrightarrow{P}\overrightarrow{D})}^{2}\u27e9$, where $\overrightarrow{D}(t)$ is set by Equation (1), can be computed exactly, as a function of the maximum allowed rate of change ${\parallel \dot{P}\parallel}^{2}$ (Liberzon, 2011). (If we do not constrain ${\parallel \dot{P}\parallel}^{2}$, the optimal behavior is of course $\overrightarrow{P}=\overrightarrow{D}$.) Briefly, the optimal behavior is to steer $\overrightarrow{P}$ toward the best guess of the expected future $\overrightarrow{D}$ (see Appendix 1, section 'Control theory calculation'). This best guess depends on the fluctuation structure, as illustrated by the comparison between Figure 1D and E for an isotropic and an anisotropic $M$.
However, in our problem, we will assume that $M$ and Γ do not stay constant long enough to be learned by evolution, and thus are unknown to the system. In this regime, it is not clear that the behavior of an $M$ and Γoptimized system is relevant. Nevertheless, we will describe a regulatory architecture consisting of common regulatory elements that will adapt its responsiveness to the fluctuation structure of its input (‘learn’); for example, in the twodimensional case, it will indeed develop the anisotropic response shown in Figure 1E. Moreover, we will find the steadystate performance of our architecture to be nearoptimal, when compared to the theoretical ceiling of a system that knows $M$ and Γ perfectly.
Proposed architecture: endproduct inhibition with an excess of regulators
The section above was intentionally very general. To discuss solutions available to cells, it is convenient to restrict the scope from this general formulation to a more specific metabolicallyinspired case. From here onwards, let ${D}_{i}$ be the instantaneous demand in metabolite x_{i} (determined by external factors), and ${P}_{i}$ be the rate at which the metabolite is produced, with both defined in units of metabolite concentration per unit time. The number of components of the vector $\overrightarrow{D}$ now has the meaning of the number of metabolites, and we will denote it as ${N}_{x}$. The cell needs to match $\overrightarrow{P}$ to $\overrightarrow{D}$ (or, equivalently, maintain the homeostasis of the internal metabolite concentrations x_{i}).
The simplest way to solve this problem is via feedback inhibition. Consider first the case of a single metabolite $x$. If an accumulation of $x$ inhibits its own synthesis, a decreased demand will automatically translate into a decreased production. For our purposes, we will model this scenario by placing the synthesis of metabolite $x$ under the control of a regulator $a$ (e.g. a transcription factor), which is, in turn, inhibited by $x$ (Figure 2A). For simplicity, we will measure regulator activity $a$ directly in units of equivalent production of $x$. The dynamics of this system, linearized for small fluctuations of metabolite concentration $x$, can be written in the following form (see Appendix 1, section 'Simple endproduct inhibition):
Here, we introduced P_{0} with dimension of production (concentration per time) to render $a$ dimensionless. In Equation 2c, λ has the units of concentration × time, and setting $\lambda \equiv {x}_{0}{\tau}_{a}$ defines a time scale for changes in regulator activity. Assuming the dynamics of metabolite concentrations $x$ are faster than regulatory processes, and choosing the units so that ${x}_{0}=1$ and ${P}_{0}=1$, we simplify the equations to:
We will refer to this architecture as simple endproduct inhibition (SEPI). For two metabolites $\overrightarrow{x}=({x}_{1},{x}_{2})$, the straightforward generalization is to have two independent copies of this circuit, with two regulators a_{1}, a_{2} (Figure 2B). Denoting the number of regulators as ${N}_{a}$, we note that in the SEPI architecture, there are as many regulators as there are metabolites: ${N}_{a}={N}_{x}$.
The architecture we will describe builds on this widely used regulatory motif, and relies on three added ingredients:
An excess of regulators: ${N}_{a}>{N}_{x}$;
Selfactivation of regulators;
Nonlinear activation/repression of the regulators ${a}_{\mu}$ by the metabolite concentrations x_{i}.
Here and below, we use index μ for regulators ($\mu =1\mathrm{\dots}{N}_{a}$) and index $i$ for metabolites ($i=1\mathrm{\dots}{N}_{x}$).
These three ingredients, we claim, will be sufficient for the circuit to both learn higher order statistics and to use this information appropriately when matching the production to demand. It is important to emphasize that all three are readily accessible to cells. In fact, there are multiple ways to build regulatory circuits exhibiting the proposed behavior using common regulatory elements. To focus on the general mechanism rather than any one particular implementation, we will defer describing these example circuits until later in the text (Figure 6); here, we will consider a minimal modification of Equation (3) that contains the required ingredients:
This architecture bears a similarity to neural networks, and, as we will see, the familiar intuition about the value of extra ‘hidden nodes’ indeed holds. However, we caution the reader not to rely too heavily on this analogy. For example, here ${\sigma}_{\mu i}$ is a constant matrix describing how the activities of regulators ${a}_{\mu}$ control the synthesis of metabolites x_{i}.
For two metabolites (${N}_{x}=2$) as in Figure 2C, each regulator is summarized by a 2component vector ${\overrightarrow{\sigma}}_{\mu}=({\sigma}_{\mu 1},{\sigma}_{\mu 2})$; its components can be of either sign (or zero) and specify how strongly the regulator ${a}_{\mu}$ is activating or repressing the synthesis of metabolite x_{i}. For simplicity, below, we will choose these vectors to be of unit length. Then, each regulator ${\overrightarrow{\sigma}}_{\mu}$ is fully characterized by an angle in the $({x}_{1},{x}_{2})$ plane, which allows for a convenient visualization of the regulatory systems (Figure 2D). The ${\sigma}_{\mu i}$ defines the regulatory logic of our system and does not change with time. The parameter $d\le 0$ allows us to tune the strength of the simple nonlinearity (Figure 2E); below we set $d=0$ (strong nonlinearity) unless explicitly stated otherwise. As we will show later, the learning behavior is also observed for more realistic functions such as the Hill function, but the simple piecewise linear form of Equation (4) will help us relate the observed behavior to specifically nonlinearity as opposed to, for example, cooperativity (the Hill parameter tunes both simultaneously). Finally, the parameter κ reflects degradation and is assumed to be small: $\kappa \ll {x}_{0}$. Previously, for SEPI, it could be neglected, but here, it will matter due to the nonlinearity; for more details, see Appendix 1, section 'Simple endproduct inhibition'. The parameters used in simulations are all listed in Appendix 1, section 'Parameters used in figures'.
Just like simple endproduct inhibition in Equation (3), the modified system Equation (4) will correctly adapt production to any static demand (see Appendix 1, section 'Adaptation to static demand'). In the following, we will show that the added ingredients also enable learning the structure of fluctuating environments. For this purpose, we expose our system to demands $D(t)$ with fixed means (${\overline{D}}_{i}=1$) but a changing fluctuation structure.
The regulatory architecture described above outperforms simple endproduct inhibition by learning environment statistics
To show that our system is able to adapt to different fluctuation structures, we probe it with changing environmental statistics, and show that it, first, learns these statistics, and, second, is able to make use of this information in its behavior.
For simplicity, we start with the 1dimensional case (Figure 3A–F). In dimension ${N}_{x}=1$, an excess of regulators means we have both an activator ${a}_{+}$ and a repressor ${a}_{}$ for the production of $x$ (Figure 3A). This is reminiscent of paradoxical regulation (Hart et al., 2012). We probe our system with changing environmental statistics by exposing it to a demand $D(t)$ with an increasing variance (Figure 3B,C). As a reminder, here and below, the mean demand is fixed at 1.
Faced with a faster fluctuating input, our system upregulates both ${a}_{+}$ and ${a}_{}$ while keeping ${a}_{+}{a}_{}$ constant (${a}_{+}{a}_{}\approx \overline{D}=1$; Figure 3D). In this way, the two activity levels ${a}_{+}$ and ${a}_{}$ encode both the mean and the variance of fluctuations. Crucially, the system makes use of the information it stores: The increased regulator activities allow future changes in $P$ to be faster. The system’s responsiveness, which we can define as $\mathcal{R}\equiv \frac{d\dot{P}}{dD}$, increases as ${a}_{+}+{a}_{}$ (Figure 3E; see also Appendix 1, section 'Defining the system's responsiveness'). As a result, as shown in Figure 3F, our system is able to perform approximately equally well (after adaptation time) in each environment, unlike a system like simple endproduct inhibition, which is unable to adapt its sensitivity. In summary, Figure 3D–F show that the simple architecture of Figure 3A can not only learn the statistics of environment fluctuations, but also ‘act upon this knowledge,’ effectively performing both computations of Figure 1B.
The idea of learning the fluctuation structure is perhaps clearer in dimension ${N}_{x}=2$, since the two demands can now be correlated with each other, and it seems intuitive that a system able to learn the typical direction of fluctuations (the angle α in Figure 3H) should be able to track the input better. Indeed, as we saw in Figure 1D–E, when environment fluctuations are anisotropic, the responsiveness of a welladapted strategy must be anisotropic as well: the preferred direction must elicit a stronger response. Mathematically, the responsiveness $\mathcal{R}$ is now a matrix ${\mathcal{R}}_{ij}=\frac{d{\dot{P}}_{i}}{d{D}_{j}}$, and for a welladapted system we expect its eigenvectors to align with the principal directions of $M$. In Figure 3G–L, Figure 4 and Figure 5, our discussion will focus on this twodimensional case.
Figure 3G–L show the behavior of our system (Equation 4) with ${N}_{a}=5$ regulators (Figure 3G), exposed to an input structured as shown in Figure 3H, where we vary α. As represented pictorially in Figure 3I, we rotate the fluctuation structure matrix $M$ in Equation (1), keeping its eigenvalues ${\lambda}_{1,2}$ fixed with $\sqrt{{\lambda}_{1}/{\lambda}_{2}}=10$ (this fixes the ratio of major to minor semiaxis lengths).
With ${N}_{a}=5$ regulators, matching the mean value of $\overrightarrow{D}$ would leave ${N}_{a}2=3$ degrees of freedom that can be influenced by other parameters (such as variance in each dimension and correlation between different demands). And indeed, changing environment statistics induces strong changes in the regulator state adopted by the system, with regulators better aligned to the input fluctuations reaching higher expression (Figure 3J; note the diagrams at the top, where dot size reflects the activity reached by the corresponding regulator at the end of each epoch; compare to the diagrams in Figure 3I). This activity pattern shapes the responsiveness matrix $\mathcal{R}$. Figure 3K plots the ‘learned angle’, defined as the direction of the dominant eigenvector of $\mathcal{R}$; we find that it tracks the stimulus angle. Finally, Figure 3L demonstrates that our architecture is able to make use of this learning, outperforming the SEPI system, whose responsiveness is isotropic and fixed.
The performance is nearoptimal
In the previous section, we have shown by example (Figure 3) that the proposed regulatory architecture can learn the statistics of the environment. We now characterize systematically the conditions under which learning improves performance and compare our system to the theoretical performance ceiling. Note that unlike the general statement that learning correlations improves performance, the optimal performance ceiling is necessarily specific to a given model of the environmental fluctuations. Nevertheless, this comparison is informative.
The fluctuation structure in our model is defined by Γ and $M$. We first investigate the dependence of performance on Γ (Figure 4A), exposing our system to a twodimensional input structured as in Figure 3H with $\sqrt{{\lambda}_{1}/{\lambda}_{2}}=10$ as before, $\alpha =\pi /4$, and a changing Γ.
Although the input is twodimensional, changing Γ scales the overall magnitude of fluctuations, and the behavior is analogous to the simpler onedimensional example shown in the first column of Figure 3. At $\mathrm{\Gamma}=0$ (static input), and by extension, for Γ finite but small, examining the steady state of Equation (4) shows that only ${N}_{x}=2$ out of ${N}_{a}$ regulators can be active. In this regime, our system is essentially identical to SEPI—the extra regulators, though available, are inactive—and in fact performs slightly worse. This is because at nonzero κ, the steady state of Equation (4) is slightly offset from the ideal state $\u27e8{x}_{i}\u27e9=1$. (While this effect can be corrected, it is only relevant in the parameter regime where no learning occurs, so we chose to keep Equation (4) as simple as possible; for additional discussion, see Appendix 1, section 'Performance penalty from the degradation term').
When Γ becomes sufficiently large, the first term in Equation (4) (proportional to the fluctuation size $\sqrt{\mathrm{\Gamma}}$) for one of the inactive regulators finally exceeds, on average, the degradation term. At this point, the system enters the regime where the number of active regulators exceeds ${N}_{x}$, and its performance deviates from the SEPI curve. Beyond this point, further changes to the stimulus no longer affect performance, as our system is able to adapt its responsiveness to the changing fluctuation magnitude (compare to Figure 3F). The threshold value of Γ satisfies $\sqrt{\mathrm{\Gamma}}\propto \kappa $; the proportionality coefficient of order 1 depends on the specific arrangement of regulators but can be estimated analytically (see Appendix 1, section 'The minimal Γ needed to initiate adaptation'). The theoretically predicted deviation points are indicated with arrows, and are in agreement with the simulation results. When a regulator in the system is particularly wellaligned with the dominant direction of fluctuations, the deviation occurs sooner, explaining the better performance of our system when the regulators are more numerous.
To better assess the performance of our system, we compare it to the theoretical optimum derived from control theory, which we represent with dotted lines in Figure 4A. For given $M$ and Γ, the family of optimal behaviors is parameterized by Control Input Power (CIP), defined as $\int {\parallel \dot{P}\parallel}^{2}\mathit{d}t$. If $\overrightarrow{P}$ could react infinitely fast, it would track $\overrightarrow{D}$ perfectly, but increasing response speed necessarily comes at a cost (of making more sensors, or more enzymes for faster synthesis / degradation of x_{i}); constraining the CIP is thus a proxy for specifying the maximum tolerable cost. In order to compare our system with the optimal family of solutions, we compute $\frac{1}{T}{\int}_{0}^{T}{\parallel \dot{P}\parallel}^{2}\mathit{d}t$ of our system at each Γ ($T$ is the simulation time), and compare to the performance of the optimally steered solution with a matched CIP; details of the calculation can be found in Appendix 1, section 'Control theory calculation'. Figure 4A demonstrates that the simple architecture we described not only benefits from matching its responsiveness to its input, but is in fact nearoptimal when compared to any system of equivalent responsiveness.
It is important to note that for low Γ, the performance of the SEPI architecture also tracks the optimal curve. Throughout this work, our intention is not to demonstrate that SEPI is a ‘poor’ architecture. To the contrary, the surprising efficiency of SEPI has been noted before (Goyal et al., 2010; Pavlov and Ehrenberg, 2013), and Figure 4 similarly shows that at its own CIP, its performance is virtually optimal. The advantage of our learningcapable architecture derives from its ability to increase responsiveness when necessary, in the correct direction. Our simplified treatment of the SEPI architecture is not a strawman we seek to dismiss, but an example of a system that exhibits no learning.
Having investigated the effect of fluctuation variance (changing Γ), we turn to the effect of their correlation. Up to now, we subjected our system to a strongly correlated twodimensional input with anisotropy $\sqrt{{\lambda}_{1}/{\lambda}_{2}}=10$ (illustrated, to scale, in Figure 1E). We will now consider a range of anisotropy values, down to anisotropy 1 (uncorrelated fluctuations, Figure 1D), keeping the variances of D_{1} and D_{2} constant, $\alpha =\pi /4$ as before, and $\mathrm{\Gamma}=0.05$.
The result is presented in Figure 4B. With ${N}_{a}=5$ or larger, our system is able to take advantage of the correlation, assuming it is strong enough to activate the learning mechanism. (In fact, its performance can reach values that exceed the theoretical ceiling achievable by any system that assumes the two dimensions of $\overrightarrow{D}$ to be independent, and thus must be exploiting the correlation in its inputs; see Appendix 1, section 'The system makes use of correlations in the input' and Appendix 1—figure 1). For ${N}_{a}=4$, the performance curve remains flat. This is because the four regulators are arranged as two independent copies of the system shown in Figure 3A (one $\{{a}_{+},{a}_{}\}$ pair for each of the two inputs); this architecture can take advantage of the learned variance, but not the correlation. Finally, the SEPI architecture can adapt to neither variance nor correlation; its performance curve is also flat, but is lower. As expected, the advantage of our architecture manifests itself in environments with periods of large and/or strongly correlated fluctuations.
The behavior is generalizable
The model described above was a proof of principle, showing that simple regulatory circuits could learn the fluctuation structure of their inputs. Given the simplicity of our model, it is not to be expected that the exact dynamics of Equation (4) are replicated in real cells. However, the benefit of this simplicity is that we can now trace this behavior to its key ingredients, which we expect to be more general than the model itself: an excess of regulators, nonlinearity, and selfactivation. In this section, we examine their role: first in our model (Figure 5), and then in more realistic circuits, relaxing our simplifying assumptions (Figure 6).
In Figure 5, the parameter $d$ on the horizontal axis is the strength of nonlinearity (see Figure 2E), from perfectly linear at $d=\mathrm{\infty}$, to strongly nonlinear at $d=0$. The vertical axis corresponds to an increasing number of regulators ${N}_{a}$, which we label as in Figure 2D; for completeness, we also include the simplest system with a single regulator coactivating both x_{1} and x_{2} (bottom row). Panel A examines the performance of our system as defined in Equation (4), that is, with selfactivation included. In panel B, we remove selfactivation by omitting the prefactor ${a}_{\mu}$ in front of the max function in Equation (4). The color scheme is chosen so that red indicates an improvement, and blue a deterioration, of the circuit performance relative to the SEPI architecture, which approximately corresponds to the point highlighted in Figure 5B. The difference between the labeled point and the SEPI architecture is that all models represented in Figure 5 include a small degradation term, which becomes important in the nonlinear regime. For the SEPIlike case, its effect on performance is negligible (see Appendix 1, section 'Performance penalty from the degradation term') . Performance is averaged over five angles α; see Appendix 1, section 'Parameters used in figures'.
Unsurprisingly, the performance of the simple SEPIlike architecture can be improved by adding extra regulators (pink region in Figure 5B): each new regulator allows the system to respond more quickly in a yet another direction of perturbation, with which it is ‘aligned’. However, such a strategy would have limited utility in a biological setting, since the marginal improvement per regulator must offset the cost of added complexity. The mechanism described here corresponds to the red area in Figure 5A. Importantly, in the presence of both nonlinearity and selfactivation, even a single extra regulator (${N}_{a}=3$) can already provide a significant benefit.
Figure 5A shows that in the context of our model, the reported behavior requires ${N}_{a}$ to exceed ${N}_{x}$, and $d$ to be sufficiently large. However, these ingredients are more general than the specific implementation in Equation (4). In our model, additional regulators were required because they supplied the slow degrees of freedom to serve as memory; such degrees of freedom could be implemented in other ways, for example, as phosphorylation or methylation (Barkai and Leibler, 1997). Similarly, while nonlinearity is essential (linear dynamics cannot couple to higherorder terms, such as fluctuation magnitude), its exact functional form may be changed while retaining the learning behavior (see Appendix 1, section 'Nonlinearity acts as a sensor of fluctuation variance'). Finally, the explicitly selfcatalytic behavior of ${a}_{\mu}$ in our model is only one possible strategy for translating the stored memory into a faster response.
To demonstrate the generality of these ingredients, we constructed two circuits with very different architectures (Figure 6A,B), both reproducing the results of Figure 3C–F. These are not the only ways that the logic described above can be implemented; rather, our intention is to show that as long as we keep the key elements, we can relax our simplifying assumptions, such as the form of the nonlinearity and selfactivation, while still retaining the ability to learn.
The first of these proposed circuits (Figure 6A) is based on a pair of allosteric enzymes with the toy nonlinearity of Figure 2E replaced by more realistic cooperative binding, and implements dynamics otherwise very similar to those shown above. In this circuit, the enzymes ${E}_{+}$ and ${E}_{}$ can be in an active or inactive state: The active form of ${E}_{+}$, which we denote ${E}_{+}^{*}$, catalyzes the production of $x$; similarly, ${E}_{}^{*}$ catalyzes degradation of $x$. In addition, the active enzymes can bind to molecules of the metabolite $x$ to control the selfcatalytic activity. The total concentration of ${E}_{+}^{*}$, bound and unbound, then plays the role of the activating regulator ${a}_{+}$ from above (${a}_{+}=[{E}_{+}^{*}]+[x{E}_{+}^{*}]$), while ${E}_{}^{*}$ plays the role of the inhibitor ${a}_{}$ (${a}_{}=[{E}_{}^{*}]+[x{E}_{}^{*}]$). The equations defining the dynamics are then:
Despite the extensive changes relative to Figure 3A, the system is still able to learn. Figure 6C compares its performance to a nonlearning version with only the activating branch ${a}_{+}$, which is analogous to the singleactivator SEPI architecture (compare to Figure 3F). For a detailed discussion of this more biologically plausible model, see Appendix 1, section 'A pair of allosteric enzymes'.
Our other proposed circuit (Figure 6B) differs significantly. Here, instead of seeking to match $P$ to $D$, the system maintains the homeostasis of a concentration $x$ perturbed by external factors. In this implementation, the production and degradation of $x$ are both catalyzed by a single bifunctional enzyme; the responsiveness of this circuit scales with the overall expression of the enzyme $E$, and larger fluctuations of $x$ lead to upregulation of $E$ due to the nonlinearity, as before. (For a detailed discussion, see e Appendix 1, section 'An architecture based on a bifunctional enzyme'.) Defining $A={a}_{+}+{a}_{}=[E]+[xE]$ as the total concentration of the enzyme $E$ in both its bound and unbound states, the bound and unbound fractions are described by Hill equations,
The dynamics of our system are:
Despite its compactness, this circuit is also able to learn (Figure 6D; compare to Figures 3F and 6C).
Interestingly, this particular logic is very similar to a small modification of the standard twocomponent signaling architecture (Figure 6E). In this architecture, the signal $s$ determines the concentration of the phosphorylated form ${Y}_{P}$ of the response regulator $Y$; the rapidity of the response is determined by the expression of the histidine kinase $X$, present at a much lower copy number. Although the signaling architecture of Figure 6E, at least in some parameter regimes, is known to be robust to the overall concentrations of $X$ and $Y$(Batchelor and Goulian, 2003), this robustness property applies only to the steadystate mapping from $s$ to ${Y}_{P}$, not the kinetics. Thus, much like in Figure 6B,a nonlinear activation of $X$ by ${Y}_{P}$ (known as autoregulation [Goulian, 2010] or autoamplification [Hoffer et al., 2001], and shown as a dashed arrow in Figure 6E) would endow this signaling system with selftuned reactivity that learns the statistics of the input.
Discussion
In this paper, we have studied a regulatory architecture which is able to infer higherorder statistics from fluctuating environments and use this information to inform behavior. For concreteness, we phrased the regulatory task as seeking to match the production $\overrightarrow{P}$ of one or two metabolites to a rapidly fluctuating demand $\overrightarrow{D}$. Alternatively, and perhaps more generally, the circuits we constructed can be seen as maintaining the homeostasis in a quantity $\overrightarrow{x}$ that is continually perturbed by external factors. We demonstrated that a simple architecture was capable of learning the statistics of fluctuations of its inputs and successfully using this information to optimize its performance. We considered onedimensional and twodimensional examples of such behavior.
In one dimension, learning the statistics of the input meant our circuit exhibited a selftuned reactivity, learning to become more responsive during periods of larger fluctuations. Importantly, we have shown that this behavior can be achieved by circuits that are highly similar to known motifs, such as feedback inhibition (Figure 2A–C) or twocomponent signaling (Figure 6B,E). The latter connection is especially interesting: There are at least a few examples of twocomponent systems where autoamplification, a necessary ingredient for the learning behavior discussed here, has been reported (Shin et al., 2006; Williams and Cotter, 2007). Moreover, in the case of the PhoR/PhoB twocomponent system in E. coli, such autoamplification has been experimentally observed to allow cells to retain memory of a previously experienced signal (phosphate limitation; Hoffer et al., 2001), a behavior the authors described as learninglike. As reported, this behavior constitutes a response to the signal mean and is similar to other examples of simple phenotypic memory (e.g. Lambert et al., 2014); however, our analysis demonstrates that a similar architecture may also be able to learn more complex features. Such a capability would be most useful in contexts where the timescale of sensing could plausibly be the performance bottleneck. Since transcriptional processes are generally slower than the twocomponent kinetics, we expect our discussion to be more relevant for twocomponent systems with nontranscriptional readout, such as those involved in chemotaxis or efflux pump regulation.
In the twodimensional case, our simple circuit was able to learn and unlearn transient correlation structures of its inputs, storing this information in expression levels of different regulators. Our argument was a proof of principle that, for example, the gut bacteria could have the means to not only sense, but also predict nutrient availability based on correlations learned from the past, including correlations that change over fasterthanevolutionary timescales, such as the life cycle (or dietary preferences) of the host. Importantly, we showed that this ability could come cheaply, requiring only a few ingredients beyond simple endproduct inhibition.
The mechanism described here could suggest new hypotheses for the functional role of systems with an excess of regulators, as well as new hypotheses for bacterial function in environments with changing structure.
Materials and methods
All simulations performed in Python 3.7.4. Simulation scripts reproducing all figures are included as Source code 1.
Appendix 1
This text contains supplementary information on modeling details, the analytical calculations, and the exact parameters used in figures.
S1 Simple endproduct inhibition
As the starting point of our regulatory architecture, we consider a basic form of endproduct inhibition (SEPI). The environment is modeled by a timedependent (externally induced) demand $D(t)$ for metabolite $x$ which is produced at a rate $P$ (controlled by the system); both $D$ and $P$ are defined in units of metabolite concentration per unit time. The depletion of the metabolite depends on its concentration $x$ and the demand $D$. Assuming firstorder kinetics (or, alternatively, expanding a general function to linear order in small fluctuations of $x$) the dynamics of $x$ is:
Further, we consider the temporal evolution of the regulator activity $a$
By linearizing $h(x,a)$ around the stationary values $({x}_{0},{a}_{0})$ we get
To examine this equation, we introduce the Fourier transforms $\stackrel{~}{a}(\omega )=\mathcal{F}[a(t){a}_{0}]$ and $\stackrel{~}{x}(\omega )=\mathcal{F}[x(t){x}_{0}]$ and get
Equation (S4) shows that if the support of $\stackrel{~}{x}(\omega )$ is restricted to high frequencies, $\omega \gg {\lambda}_{a}$, then the degradation term ${\lambda}_{a}({a}_{0}a)$ in Equation (S3) is negligible. Including it would only add a restoring force, reducing the amplitude of fluctuations of $a$, and decreasing the performance of the system. Since we are interested in the regime of fast fluctuations of $x$, we choose to omit this term in the SEPI system. With ${\lambda}_{x}=1/\lambda $ we arrive at the dynamics used in the main text:
In the nonlinear system (Equation (3) of the main text), however, fast fluctuations of $x$ can cause the growth of $a$ (as discussed in the section ‘The nonlinearity as a sensor of the fluctuation variance’), thereby inducing slow frequency modes to its dynamics. Thus, in the nonlinear case, we cannot omit the degradation term.
S2 Performance penalty from the degradation term
The model considered in the main text modifies the SEPI architecture as follows:
Consider the case of a static input. We observe that if x_{0} is set to 1, as in the main text, the presence of the degradation term causes the equilibrium point of these dynamics to be displaced away from ${x}_{i}=1$. Therefore, for a static input, the performance of this system—the degree of mismatch between ${P}_{i}$ and ${D}_{i}$, or, equivalently, the deviation of x_{i} from 1—is actually worse than the performance of the original SEPI.
While the case of a static input is irrelevant for the discussion in the main text, this slight offset leads to a performance penalty also for a fluctuating input. Indeed, timeaveraging Equation (S7) shows that for any active regulator, we must have
where $f$ is the nonlinearity in our equation, $f(\gamma )=\mathrm{max}(d,\gamma )$. Clearly, we will in general again have $\u27e8{x}_{i}\u27e9\ne 1$; this is particularly obvious in the limit of small fluctuations when the nonlinearity is ‘not active’, such that $f(\gamma )=\gamma $.
This effect could be corrected by shifting x_{0}. In the interest of keeping our model as close to SEPI as possible, we chose not to do so: this penalty is only significant in the regime where no learning occurs, and is otherwise outweighed by the performance increase due to selftuned responsiveness, with the additional benefit of simplifying the discussion in the main text. Even if optimizing x_{0} could make the performance slightly closer to the optimal bound, this kind of finetuning seems irrelevant in a biological context.
S3 Defining the system’s responsiveness
In the main text, we use a measure for the ‘responsiveness’ of our system to changes in the demand $D(t)$. In this section it is shown in detail how this measure is defined. The central aim of the considered regulatory model is to match the timedependent demand $\overrightarrow{D}$ with the regulated production $\overrightarrow{P}$. The temporal evolution of $\overrightarrow{P}$ is given by:
with
For a static demand ${D}_{i}=1$ the production relaxes to a constant value ${P}_{i}\approx 1$ (where we assumed small κ) and consequently ${\dot{P}}_{i}=0$. A deviation $\delta {D}_{i}$ from the static demand will lead to a change of the production ${P}_{i}$  the larger ${\dot{P}}_{i}$, the faster the response to the changed demand. Therefore, we define the responsiveness of the production ${P}_{i}$ to the demand ${D}_{j}$ as ${\mathcal{R}}_{ij}=\frac{d{\dot{P}}_{i}}{d{D}_{j}}$. When assuming small fluctuations the explicit expression for the responsiveness is then given by:
As an example we consider the onedimensional system studied in Figure 3AF in the main text for which the responsiveness is
where we used that ${\sigma}_{1+}=1$ and ${\sigma}_{1}=1$.
S4 Control theory calculation
The calculation below follows the standard procedure known as LinearQuadratic Optimal Control (LQC). For more details, we refer the reader to standard sources (e.g. Liberzon, 2011).
The problem we set is minimizing $\u27e8{(\overrightarrow{P}\overrightarrow{D})}^{2}\u27e9$, where $\overrightarrow{D}$ follows the dynamics of Equation (1) in the main text,
We then wish to calculate the optimal way of steering $\overrightarrow{P}$. For simplicity, we will set the mean $\overline{D}=0$ (in the context of this abstract problem, this is equivalent to working with meanshifted variables $\delta \overrightarrow{D}\equiv \overrightarrow{D}\overrightarrow{\overline{D}}$ and similarly $\delta \overrightarrow{P}\equiv \overrightarrow{P}\overrightarrow{\overline{D}}$). We can start by discretizing the above equation,
where $\stackrel{~}{M}\equiv M\mathrm{\Delta}t$, and ξ has variance $2\mathrm{\Gamma}\mathrm{\Delta}t$. We seek to determine the optimal way to steer $P$; in other words, the function ${\varphi}_{t}({P}_{t},{D}_{t})$ (‘control policy’) dictating how $P$ should be changed in a given time step:
We then can define our cost function, which combines a cost for the magnitude of u_{t} (how quickly we can change $\overrightarrow{P}$), and the difference between $\overrightarrow{P}$ and $\overrightarrow{D}$:
The $\varphi ({P}_{t},{D}_{t})$ describing the optimal behavior is the one that minimizes this cost. In order to solve for $\varphi ({P}_{t},{D}_{t})$, we follow standard control theory techniques and define the ‘costtogo’ function,
This function defines the smallest cost of all remaining steps; in particular, the total cost that we are trying to minimize is ${V}_{0}(0,0)$. The costtogo satisfies the boundary condition
and the following recursive relation:
Since our system is Gaussian, this recursive relation can be solved by adopting a quadratic ansatz:
Solving for the matrices ${A}_{t}$, ${B}_{t}$, ${C}_{t}$, and ${Q}_{t}$, gives us the following recursive relations:
Since our problem is to minimize the cost at steady state (known in control theory as an ‘infinite horizon’ problem, $N\mapsto \mathrm{\infty}$), we are interested in the fixed point of this mapping, specifically the matrices ${A}_{\mathrm{\infty}},{B}_{\mathrm{\infty}}$ to which this mapping converges when we start from $A}_{N}={B}_{N}={C}_{N}=\mathbb{\U0001d7d9$ and ${Q}_{N}=0$ (as defined by Equation (S19)).
Since ${A}_{N}$ is the identity matrix, all ${A}_{t}$ are proportional to the identity matrix as well: $A}_{t}={\alpha}_{t}\phantom{\rule{thinmathspace}{0ex}}\mathbb{\U0001d7d9$, where ${\alpha}_{t}=1+\frac{\rho {\alpha}_{t+1}}{\rho +{\alpha}_{t+1}}$. The fixed point of this mapping is $\alpha =\frac{1+\sqrt{1+4\rho}}{2}\ge 1$. Similarly, the fixed point of ${B}_{t}$ is $B=[\mathbb{\U0001d7d9}\frac{\rho}{\rho +\alpha}(\mathbb{\U0001d7d9}\stackrel{~}{M}){]}^{1}$. Expressing this in terms of α only:
With these expressions, the optimal ‘control policy’ is defined by the value of $v$ that minimizes Equation S20. This defines the optimal way to change $\overrightarrow{P}$ for a given observed $\overrightarrow{D}$:
or, restoring the notations of the main text, including a nonzero $\overline{D}$:
This $u$ is the exact solution to the discrete version of the problem we considered. Since our simulations in this work use a discrete timestep, this is the form we use. Nevertheless, it is instructive to consider the small$\mathrm{\Delta}t$, largeCIP limit such that $\mathrm{\Delta}t$ and $(\alpha 1)\mathrm{\Delta}t$ are both small compared to inverse eigenvalues of $M$. In this case we have, to first order in $\mathrm{\Delta}t$:
This leads to the following, and very intuitive, form of the optimal control dynamics:
In other words, at every step the change in $\overrightarrow{P}$ mirrors the average expected change in $\overrightarrow{D}$, with an extra term seeking to reduce their deviation. Note also that setting $\alpha =1$ (infinite CIP) corresponds to steering $\overrightarrow{P}$ directly to the expected value of $\overrightarrow{D}$ at the next timestep, as expected.
S5 The system makes use of correlations in the input
Figure 4B in the main text demonstrated that, as the fluctuating inputs become increasingly correlated, our architecture is able to outperform SEPI by an increasingly wide margin. The natural interpretation of this result is that the system is able to learn and exploit this correlation. Technically, however, one might note that this observation alone does not yet prove that our architecture is able to appropriately use the information it learned about specifically correlation. For example, it could be that strongly correlated inputs are somehow inducing a stronger increase in reactivity, causing the system to be generally faster, but without benefiting specifically from the correlated nature of its inputs.
Rather than focusing on excluding this specific scenario (which could be done by comparing the CIP values along the curves shown in Figure 4B), we will show that with a sufficient number of regulators, our architecture can perform better than the theoretical ceiling achievable by any strategy that assumes the inputs to be independent. This will formally prove that, at least for some parameters, our system’s improved performance must necessarily make use of the correlation of its inputs. Although the argument is somewhat academic in nature (we will prove our point using ${N}_{a}=25$ regulators), it is theoretically pleasing, and so we present it here.
Specifically, we consider a system subjected to inputs structured as shown in Figure 3H, with angle $\alpha =\pi /4$ so that the two inputs have the same variance. Appendix 1—figure 1 shows the performance of our architecture for several values of the number of regulators ${N}_{a}$, plotted as curves parameterized by the degradation rate κ. The degradation rate controls how large the ${a}_{\mu}$ can become: a high value of κ leads to lower average steadystate values of the regulator activities, causing the system to be less responsive to changes in $D$. Thus, κ can be used to set the CIP of the regulatory system, allowing us to plot these performance curves in the ‘performance vs. CIP’ axes traditional for control theory.
Reassuringly, all these performance curves are located below the optimal controltheory ceiling computed for the true correlation structure of the input. However, the plot also shows the ‘best independent’ curve, defined as follows. Consider all possible matrices $M$ corresponding to uncorrelated inputs: $M=\left(\begin{array}{cc}{\lambda}_{1}& 0\\ 0& {\lambda}_{2}\end{array}\right)$. Each such $M$ defines a family of control strategies (that would have been optimal if this $M$ were the true $M$ governing the dynamics of the input); this family is indexed by a parameter ρ as described above. A system following an (independenceassuming) strategy $M=\left(\begin{array}{cc}{\lambda}_{1}& 0\\ 0& {\lambda}_{2}\end{array}\right)$ while faced with the actual (partially correlated) inputs will exhibit a certain performance $\mathcal{P}({\lambda}_{1},{\lambda}_{2},\rho )$ and a certain CIP, which we denote $\mathrm{CIP}({\lambda}_{1},{\lambda}_{2},\rho )$. With these notations, the ‘best independent’ curve is defined as
We note that the correlatedinput CIP is different from the independentinput CIP that a given strategy would have incurred if faced by the input for which it is optimal. In particular, while the latter can be computed analytically, the former has to be evaluated in simulations. This makes the optimization procedure computationally costly; thankfully, the symmetry ensured by choosing $\alpha =\pi /4$ allows restricting the search to isotropic strategies $M=\left(\begin{array}{cc}\lambda & 0\\ 0& \lambda \end{array}\right)$, reducing the problem dimensionality from three parameters $\{{\lambda}_{1},{\lambda}_{2},\rho \}$ to more manageable two $\{\lambda ,\rho \}$.
The result is shown in Appendix 1—figure 1 as a dashed line. As highlighted in the inset, with enough regulators, our architecture is indeed able to outperform the theoretical ceiling of the best independenceassuming strategy. Although $N=25$ regulators is of course a regime irrelevant for biological applications, the aim of this argument was to formally prove a theoretical point, namely that the system as constructed must necessarily be making use of the correlation in the input signal, at least for some values of the parameters; by construction, the ‘best independent’ curve is a high bar to clear.
S6 Model predictive control
When framing the problem in the main text, we discussed it as consisting of two tasks, learning the fluctuation structure of the environment and ‘applying that knowledge’ (Figure 1B), and treated the two as conceptually separate. In particular, the calculation in section ‘S4 Control theory calculation’ is known as linearquadratic Gaussian control (LQG) that assumes the correct model of the environment to already be known. This separation was done to simplify presentation, allowing us to encapsulate the control theory discussion, potentially less familiar to our audience, to a smaller portion of the narrative. In addition, the LQG calculation is simple, exact, and provides an upper bound on how well any other control strategy could perform.
Mechanistically, however, the two tasks are inseparable. In the language of control theory, our architecture implements an example of model predictive control: a strategy where the system response is governed by a continually updated internal model of the environment (here, the activity levels of the regulators, which encode the learned correlations between the inputs).
How could one distinguish a predictive vs. nonpredictive control scheme in practice, when measuring all internal variables to decipher whether or not they encode an ‘internal model of the environment’ is infeasible? For our scheme, its ‘predictive’ ability manifests itself as follows. Imagine exposing our system to two inputs $\overrightarrow{D}(t)=\u27e8\overrightarrow{D}\u27e9+(\delta {D}_{1}(t),\delta {D}_{2}(t))$ which for a period of time are strongly correlated, with $\delta {D}_{1}(t)\approx \delta {D}_{2}(t)$. The learning process will drive the responsiveness matrix from one that was initially isotropic to one aligned with the correlated direction (in the notation of the main text, $\alpha =\pi /4$). Compare now the response of the system to an increase in D_{1}: $(\delta {D}_{1},\delta {D}_{2})=(a,0)$. The naÃ¯ve (untrained) system would respond by increasing P_{1} only, as would the SEPI architecture. In contrast, the ‘trained’ system, having learned the correlation between D_{1} and D_{2}, responds by upregulating both P_{1} and P_{2} together. In this hypothetical example, our analysis predicts that deleting seemingly superfluous regulators would hinder or remove this ability (depending on the implementation, possibly without even affecting fitness in a static environment).
This is the behavior expected of a predictive controller: Under a model where $\delta {D}_{1}$ and $\delta {D}_{2}$ are strongly correlated, one expects this state of the environment to relax back to the diagonal $\delta {D}_{1}=\delta {D}_{2}$ in the near future. This kind of associative learning is experimentally measurable and, on an evolutionary scale, is wellknown. Our scheme provides a mechanism for implementing this behavior on a physiological timescale. Another mechanism focusing on binary associations was previously described by Sorek et al., 2013.
S7 Nonlinearity acts as a sensor of fluctuation variance
In the main text, we argue that the nonlinearity in the dynamics of the regulator concentrations acts as a senor for the variance of the fluctuations. To see this, we consider the dynamics of one regulator that is controlling the production of one metabolite:
To simplify notation, we define $\gamma :=1P/D$. Since the dynamics of $a$ are slow compared to $D$, the fluctuations of γ are on a faster timescale than the regulator dynamics. If the fluctuations of γ are small, the nonlinearity in the $\mathrm{max}$ function is ‘not activated’: $\mathrm{max}(d,\gamma )=\gamma $. In this case, the temporal average of $\mathrm{max}(d,\gamma )$ is zero. In contrast, if the fluctuations are strong enough, the nonlinearity is activated (see Appendix 1—figure 2). Then, the temporal average is positive, leading to an additional growth of $a$. Due to the resulting larger values of $a$, the response of the system becomes faster, making the match between $P$ and $D$ better and thus serving to decrease the variance of γ. As a result, the final average steadystate regulator concentration is reached if the system has decreased the variance of γ sufficiently by increasing the rapidity of its response.
This argument makes it clear why the specific choice of nonlinearity is not particularly important. If $\dot{a}=f(x)$, then the static steady state satisfies $f({x}_{0})=0$. For a fastfluctuating input this becomes
For any nonlinear $f$, as long as $f({x}_{0})=0$, the displacement of the original steady state will be determined by higher order statistics of the input. In particular, the rectifiedlinear nonlinearity in our equations can be replaced, for example, by a Hill function. Note that for the argument presented here, the eventual saturation of the response at large argument is irrelevant: the system will retain all the relevant behavior as long as the new nonlinearity is cupconvex in a sufficiently large vicinity of its static fixed point; see section ‘S9.1 A pair of allosteric enzymes’ for an explicit example.
The assumption $f({x}_{0})=0$ is not innocuous; in general, of course, the value of $\u27e8f(x)\u27e9$ is sensitive not only to the variance of $x$ (or other higherorder terms), but also to its mean, and building a system that is sensitive to specifically the variance requires adapting to the mean first. In our model, this is automatically accomplished by the underlying endproduct inhibition architecture, which adapts the mean $P$ to mean $D=\overline{D}$, after which $x$ fluctuates around 1, no matter the value of $\overline{D}$.
S8 The minimal Γ needed to initiate adaptation
Figure 4A in the main text includes arrows indicating theoretically derived threshold values of Γ above which our system (with a given ${\sigma}_{\mu i}$) will begin to adapt its timescale of response, deviating from SEPI in its behavior. Here, we show how this threshold value of Γ can be determined.
As discussed in the main text, at static input ($\mathrm{\Gamma}=0$) only ${N}_{x}$ out of ${N}_{a}$ regulators can be active. Consider the regulators that remain inactive in the static case, and imagine gradually increasing the fluctuation magnitude. Recall the equation for regulator activity dynamics:
After introducing ${\gamma}_{\mu}={\sum}_{i}{\sigma}_{\mu i}(1{P}_{i}/{D}_{i})$ we can write
If we chose ${a}_{\mu}$ as one of the regulators that remained inactive in the static case, we have ${\mathrm{\Delta}}_{\mu}<0$ at $\mathrm{\Gamma}=0$; as we begin increasing the fluctuation magnitude, the timeaveraged ${\mathrm{\Delta}}_{\mu}$ will at first remain negative. The threshold Γ we seek is the one where the first (timeaveraged) ${\mathrm{\Delta}}_{\mu}$ crosses into positive values. It is clear that if the fluctuations of ${\gamma}_{\mu}$ are so small that $\mathrm{max}(d,{\gamma}_{\mu})={\gamma}_{\mu}$ at all times, the system does not adapt. On the other hand, if the fluctuations are large enough and fast compared to the response of the system, they generate an effective additional growth of ${a}_{\mu}$. To first order, this additional growth term is proportional to the standard deviation $\sqrt{{\omega}_{\mu}}$ of ${\gamma}_{\mu}$. Therefore, we expect the fluctuations to cause a growth of ${a}_{\mu}$ if the additional growth term is large compared to κ, i.e. $\sqrt{\omega}\u2a86\text{c}\cdot \kappa $, with $c$ a constant of order 1.
The approximate value of $c$ can be determined using the following argument. With $d=0$, and assuming that ${\gamma}_{\mu}$ is, first, fluctuating on a fast timescale compared to ${\tau}_{a}$ and, second, is Gaussian with mean ${\overline{\gamma}}_{\mu}$ and variance ${\omega}_{\mu}$, we can average over the fluctuations in Equation (S28):
The system is in a stable steady state if either $\u27e8{\mathrm{\Delta}}_{\mu}\u27e9=0$ and ${a}_{\mu}\ge 0$ or $\u27e8{\mathrm{\Delta}}_{\mu}\u27e9<0$ and ${a}_{\mu}=0$. In the nontrivial first case we get the condition ${\overline{\gamma}}_{\mu}\le \kappa $. Approximating ${\overline{\gamma}}_{\mu}\approx 0$ one sees that the average growth rate $\u27e8{\mathrm{\Delta}}_{\mu}\u27e9$ is positive if $\sqrt{{\omega}_{\mu}}>\sqrt{2\pi}\kappa $, so that $c=\sqrt{2\pi}$. If this condition is satisfied, ${a}_{\mu}$ continues its growth until the separation of timescales between ${\gamma}_{\mu}$ and ${\tau}_{a}$ becomes invalid and ${\omega}_{\mu}$ decreases; this is the mechanism by which the system adapts to fast fluctuations.
The variance ${\omega}_{\mu}$ can be derived from the statistical properties of $D$. If the fluctuations of the demand $D$ are small it holds that ${\omega}_{\mu}\approx \delta {\widehat{D}}^{T}{\overrightarrow{\sigma}}_{\mu}\delta \widehat{D}$ where $\delta \widehat{D}$ is the covariance matrix of the stationary probability distribution of the fluctuations $\delta \overrightarrow{D}$ with $\u27e8\delta {D}_{1}^{2}\u27e9=\mathrm{\Gamma}\left(\frac{{\mathrm{cos}}^{2}\alpha}{{\lambda}_{1}}+\frac{{\mathrm{sin}}^{2}\alpha}{{\lambda}_{2}}\right)$ and $\u27e8\delta {D}_{1}\delta {D}_{2}\u27e9=\mathrm{\Gamma}\mathrm{cos}\alpha \mathrm{sin}\alpha \left(\frac{{\lambda}_{1}{\lambda}_{2}}{{\lambda}_{1}{\lambda}_{2}}\right)$. The variance ${\omega}_{\mu}$ is then given by ${\omega}_{\mu}={\overrightarrow{\sigma}}_{\mu}^{T}\delta \widehat{D}{\overrightarrow{\sigma}}_{\mu}$ and the minimal value of Γ is set by the largest ${\omega}_{\mu}$ of the considered system.
S9 Realistic biochemical implementations
In the main text, we proposed a simple model of metabolic regulation which highlighted the necessary properties for learning environment statistics, namely an excess of regulators ${a}_{\mu}$, selfactivation, and a nonlinear regulation of ${a}_{\mu}$ by the metabolite concentrations x_{i}. To show how these properties can enable more realistic systems to learn the statistics of a fluctuating environment, here we present two biochemical implementations. The first of these implements dynamics nearly identical to those described in the main text, and the second, illustrated in Figure 5b, bears resemblance to twocomponent systems. As described in the main text, we do not necessarily expect either of these networks to be implemented in real biological systems ‘as is’. Instead, we use these to illustrate the diversity of systems that could use the logic described in this paper to learn statistics of their environment. For simplicity, we consider the onedimensional case (first column of Figure 3 in the main text).
S9.1 A pair of allosteric enzymes
The circuit is shown in Appendix 1—figure 3. The enzymes ${E}_{+}$ and ${E}_{}$ can be in an active or inactive state: The active form of ${E}_{+}$, which we denote ${E}_{+}^{*}$, catalyzes the production of $x$; similarly, ${E}_{}^{*}$ catalyzes degradation of $x$. In addition, we postulate that active enzymes can bind to molecules of the metabolite $x$, which controls selfcatalytic activity (see diagram). The total concentration of ${E}_{+}^{*}$, bound and unbound, plays the role of the activating regulator ${a}_{+}$ from the main text (${a}_{+}=[{E}_{+}^{*}]+[x{E}_{+}^{*}]$), while ${E}_{}^{*}$ plays the role of the inhibitor ${a}_{}$ (${a}_{}=[{E}_{}^{*}]+[x{E}_{}^{*}]$).
The same regulatory structure could be realized with transcription factor regulation, with the role of the active enzymes (${E}_{+}$ and ${E}_{}$) played by two transcription factors. In this version, activation/deactivation of enzymes is replaced by the simpler process of transcription factor synthesis/degradation. For concreteness, here, we focus on the enzymatic case, largely because we expect architectures like ours to be more relevant in fastresponding circuits, which tend to be nontranscriptional. However, except for the difference of timescales, the dynamics of the two versions would otherwise be identical; in particular, both implementations would ‘learn’ in the same way.
For this discussion, it is convenient to have timescales of dynamics of ${a}_{\mu}$ and x_{i} encoded as explicit parameters. Assuming firstorder kinetics, the dynamics of the network can then be described by:
Here, we assume that the metabolite $x$ is much more abundant than the active enzymes ${E}_{+}^{*}$ and ${E}_{}^{*}$, meaning that the relative amount of bound $x$ is very small. This allows us to neglect, in the dynamics of $x$, the source and sink terms due to binding and unbinding of $x$ to the enzymes. We also assume that this binding and unbinding occurs on a much faster timescale than all other processes.
Appendix 1—figure 4 shows an example of simulation results for these dynamics (for the full list of parameters used, see section ‘S11 Parameters used in figures’). We see that the system reacts to an increasing variance of environmental fluctuations (A) by increasing regulator activities (B). The figure also shows the behavior of a SEPI system which only consists of one ${a}_{+}$ regulator described by the dynamics in Equation (S30). Appendix 1—figure 4C shows that the response strength, defined as discussed in the SI section ‘Defining the system’s responsiveness,’
is increasing due to the changed regulator activities. Finally, Appendix 1—figure 4D compares the performance of the system Equation (S30) with the corresponding SEPI system (which, again, we define by the same equations as Equation (S30), except without the ${a}_{}$ regulator). Similar to Figure 3F in the main text, the performance of the adapting system does not change as the variance of the stimulus increases, while the SEPI system becomes worse.
S9.2 An architecture based on a bifunctional enzyme
For the reader’s convenience, we reproduce this circuit in Appendix 1—figure 5 (identical to Figure 6B in the main text). As described in the main text, for greater generality, we will here rephrase the task: instead of matching production to demand, we will think of maintaining the homeostasis of a quantity $x$ perturbed by external factors. For example, instead of being a metabolite concentration, $x$ could be osmolarity mismatch, and our circuit a hypothetical architecture for active control of osmotic pressure. In this interpretation, the enzyme $E$ might be a mechanosensor triggered by tension in the membrane or cell wall, while ‘production’ and ‘degradation’ of $x$ could be activities of opposing pumps, or regulators of glycerol export or metabolism.
To simplify our language when describing the terms in the specific equations we simulate, we will still talk of a metabolite $x$ being produced and degraded. However, to better accommodate alternative interpretations, the regulator activities will now be defined so that ${a}_{+}$ and ${a}_{}$ would be equal on average (for example, the activities of pumps in opposite directions should, on average, balance). This homeostasismaintaining formulation is in contrast to Figure 3D in the main text, where regulators satisfied the constraint $\u27e8{a}_{+}{a}_{}\u27e9=\overline{D}=1$.
The production and degradation of $x$ are catalyzed by a bifunctional enzyme that changes its activity when bound to $x$ forming the compound $xE$. The concentration of the unbound form $E$ corresponds to the activating regulator, ${a}_{+}=[E]$, and increases the production $P$ of $x$, while $xE$ plays the role of the inhibiting regulator, ${a}_{}=[xE]$, and promotes the degradation of $x$.
As above, we assume firstorder kinetics for the production and degradation of $x$, and that the binding kinetics are fast compared to the other timescales in the problem. Defining $A={a}_{+}+{a}_{}=[E]+[xE]$ as the total concentration of the enzyme $E$ in both its bound and unbound states, the bound and unbound fractions are described by Hill equations:
These expressions make it explicit that a small change in the concentration $x$ induces a change in ${a}_{+}$ and ${a}_{}$ that is proportional to their sum, $A={a}_{+}+{a}_{}$. In this way, the circuit shown in Appendix 1—figure 6 does include an element of selfactivation (one of our ‘key ingredients’), even though no interaction in the diagram is explicitly selfcatalytic.
With these notations, the dynamics of our system are:
where we assumed that modifying the enzyme $E$ does not significantly affect the quantity $x$. (In the metabolic formulation, this corresponds to the assumption that $x$ is much more abundant than $E$, so that the sequestration of $x$ by $E$ has negligible effect on the free concentration of $x$; in the osmotic formulation, the act of triggering a mechanosensor has negligible instantaneous effect on pressure). In the second equation, the synthesis of the enzyme $E$ depends nonlinearly on the metabolite concentration $x$. The specific form of nonlinearity does not significantly affect the results, as long as it is sufficiently cupconvex in the vicinity of the operating point: As described in the section ‘S7 Nonlinearity acts as a sensor of fluctuation variance’, we can think of the nonlinearity $f(x)$ as a ‘sensor’ for the variance of environmental fluctuations. Whenever fluctuations in $D(t)$ increase such that the current responsiveness of the system fails to maintain the homeostasis of $x$ within previous bounds of fluctuation magnitude, the fluctuations of $x$ will lead to growth of $A$, increasing the responsiveness until it is again able to reduce the fluctuations of $x$. In our simulations we chose a Hill function with cooperativity 4 (see section ‘S11 Parameters used in figures’).
Appendix 1—figure 6 shows simulation results for this system. As in the first column of Figure 3, the variance of $D$ is increased and the response of the system to this change is monitored. We see that the regulator concentrations correspondingly increase, causing a larger response strength $\frac{d\dot{x}}{dx}\approx 1+\frac{2\gamma E{c}^{m}m}{{(1+{c}^{m})}^{2}}$. The increase in response strength is able to compensate for most of the performance loss, which shows that the system successfully adapts its timescale of response. This is in contrast to the ‘SEPIlike’ system with a fixed value $A=1$, which cannot adapt its responsiveness and whose performance drops with every increase in fluctuation variance.
S10 Adaptation to static demand
In the main text, we argue that the production $\overrightarrow{P}$ of the proposed system Equation (3) adapts to any static demand $\overrightarrow{D}$. The full dynamics of the system is
With a static demand, Equation (S34) possesses the same fixed points as the simplified dynamics:
These dynamics have a Lyapunovfunction
This can be verified by considering the temporal change of $F$
with ${\mathrm{\Delta}}_{\mu}={\sum}_{i}{\sigma}_{\mu i}(1{P}_{i}/{D}_{i})\kappa $. Thus, $F$ always increases and is obviously bound from above. For small κ, the maximum of $F$ is reached for $\overrightarrow{P}\approx \overrightarrow{D}$, showing that the system adapts to any static demand.
S11 Parameters used in figures
If not stated otherwise, we use the following parameters: $\overline{D}=1$, $\mathrm{\Gamma}=0.35$, $d=0$, $\kappa =0.015$, ${\tau}_{a}=3$, $\alpha ={45}^{\circ}$, ${\lambda}_{1}=8.75$, ${\lambda}_{2}=875$, $dt=1/{\lambda}_{2}\approx 1.14\cdot {10}^{3}$. Since the demand ${D}_{i}$ is modeled by a stochastic process which is, in principle, unbounded, there is a nonzero probability that the demand ${D}_{i}$ becomes negative. To prevent this behavior in the simulations we set ${D}_{i}=0.01$ if ${D}_{i}<0.01$.
Figure 3C–F
Fluctuations: In 1D the matrix $M$ only has one element which we set to $M=7.5$, $\mathrm{\Gamma}=[0.048,0.082,0.16,0.3]$.
System: $\kappa =0.03$.
Timescales: ${\tau}_{a}={\tau}_{a}^{SEPI}=1$.
Figure 3F shows a running average over 100 steps.
Figure 3I–L
Fluctuations: $\alpha =[60,30,30,60]$
System: ${N}_{a}=5$, ${\tau}_{a}=1/{\lambda}_{1}$, $\kappa =0.02$.
SEPI: For a fair comparison, the timescale of SEPI is chosen such that its responsiveness matches the faster responsiveness of the ${N}_{a}=5$ adapting system (measured in an environment with an isotropic $M$ with the same determinant as used in Figure 3J–L): ${\tau}_{a}^{SEPI}={\tau}_{a}/4.9$.
For visualization purposes, to prevent long transients after changes of the activation angle, the regulator activities were capped from below at ${a}_{\mu}=0.1$.
Figure 3L shows a running average over 100 steps.
Figure 4A
Fluctuations: Γ from 0 to 0.1 in 40 equidistant steps.
Timescale SEPI: ${\tau}_{a}^{SEPI}={\tau}_{a}=3$.
Simulation length: 5 · 10^{7} steps.
Figure 4B
Fluctuations: $\mathrm{\Gamma}=0.05$, anisotropy A=[ 1, 1.05,1.1, 1.15, 1.2 , 1.25, 1.3 , 1.35, 1.4, 1.45, 1.5, 1.55, 1.6 , 1.65, 1.7, 1.75, 1.8, 1.85, 1.9, 1.95, 2, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3.25, 3.5, 4, 4.5, 5, 6, 7, 8, 9, 10]. For each value of $A{\lambda}_{1}$ and ${\lambda}_{2}$ are chosen as: ${\lambda}_{1}=\frac{1+{A}^{2}}{r{A}^{2}}$, ${\lambda}_{2}={A}^{2}{\lambda}_{1}$ with $r=1/8.75+1/875$.
Timescale SEPI: ${\tau}_{a}^{SEPI}={\tau}_{a}=3$.
Simulation length: 5 · 10^{7} steps.
Figure 5A and B
Fluctuations: Results averaged over activation angles $\alpha =[45,85,125,165,205]$.
System: $\kappa =0.02$, ${\tau}_{a}=1/{\lambda}_{1}$.
Simulation length: 10^{7} steps.
Figure 6C and Appendix 1—figure 4
The parameters in the simulation were chosen so as to ensure that, first, ${\tau}_{x}\ll {\tau}_{a}$; and second, the steadystate $x$ stays at 1 (this is analogous to setting ${x}_{0}=1$ in the main text). Specifically, we used: ${\tau}_{x}=0.01$, ${\tau}_{a}=1$, ${\gamma}_{+}={\gamma}_{}=1$, $n=2$, $m=2$, ${c}_{+}^{n}=0.5$, ${c}_{}^{m}=2$, ${\kappa}_{+}=1.0025\frac{1}{{c}_{+}^{n}+1}$, ${\kappa}_{}=1.0025\frac{{c}_{}^{m}}{{c}_{}^{m}+1}$. The parameters describing the fluctuations of $D$ are chosen as: $\overline{D}=1$, $M=1$, $\mathrm{\Gamma}=[0.015,\mathrm{\hspace{0.17em}0.025},\mathrm{\hspace{0.17em}0.04},\mathrm{\hspace{0.17em}0.055}]$.
A brief explanation: While the parameter list is long, there is a simple reasoning which sets most of these choices, and explains how the parameters of this model need to be related to each other in order for the adaptation of responsiveness to occur. First of all, we assume that the concentration $x$ changes on a much faster timescale than the regulator concentrations $a$; here we choose ${\tau}_{a}\mathrm{=}\mathrm{1}$ and ${\tau}_{x}\mathrm{=}\mathrm{0.01}$. Further, the average of $D\mathrm{}\mathrm{(}t\mathrm{)}$ is chosen to be equal to one. Then, for small fluctuations of $D$ we have $x\mathrm{\approx}\gamma \mathrm{}\mathrm{(}{a}_{\mathrm{+}}\mathrm{}{a}_{\mathrm{}}\mathrm{)}$. On the other hand, the nontrivial fixed points of the regulator concentration dynamics are reached if
Thus, we can set the equilibrium point of $x$ by choosing ${\kappa}_{+}$, ${\kappa}_{}$, ${c}_{+}$ and ${c}_{}$. Here, without loss of generality, we choose that the fixed point is reached at $x={x}_{0}=1$ by setting
For the soughtafter effect to occur, the fast fluctuations of $x$ around ${x}_{0}=1$ need to result in an effective additional growth of ${a}_{+}$ and ${a}_{}$, providing a ‘sensor’ for the variance of $D$. One possibility to get this behavior is to set ${c}_{}^{m}=2$ and ${c}_{+}^{n}=0.5$. To avoid the regulator concentrations to grow indefinitely, ${\kappa}_{+}$ and ${\kappa}_{}$ need to be a little larger than the determined values in Equation (S39). Finally, the parameter γ can be chosen rather freely; here we choose $\gamma =1$. Simulation length: 3.2 · 10^{7} steps with $dt=3\cdot {10}^{3}$. Panels 6C and S4 D show a running average over 100 timesteps.
Figure 6D and Appendix 1—figure 6:
We used the following parameters for the simulations: ${\tau}_{x}=1$, ${\tau}_{A}=25$, ${P}_{0}=1$, ${\gamma}_{+}={\gamma}_{}=5$, $\kappa ={10}^{3}$, ${c}^{m}=1$, $m=1$. The nonlinearity was chosen as: $f(x)=d\frac{{x}^{n}}{{x}^{n}+{c}_{1}^{n}}\beta $ with $d=10$, ${c}_{1}^{n}=10$, $n=4$, $\beta =5\cdot {10}^{4}$. The parameters describing the fluctuations of $D$ are set to: $M=3$, $\mathrm{\Gamma}=[0.025,\mathrm{\hspace{0.17em}0.05},\mathrm{\hspace{0.17em}0.1},\mathrm{\hspace{0.17em}0.2}]$. For the mechanism to work, the timescales need to fulfill ${\tau}_{A}\gg {\tau}_{x}$. The parameters P_{0}, $m$ and $c$ are set by the choice of the fixedpoint x_{0} (here ${x}_{0}=1$). Simulation length: 3.2 · 10^{7} steps with $dt=3\cdot {10}^{3}$. Panels 6D and S6 D show a running average over 100 timesteps.
Appendix 1—figure 1
System: κ from 0.01 to 0.025 in steps of size 1.25 · 10^{−4}.
Simulation length = 5 · 10^{7} steps.
For each simulation, the performance was averaged over the last 10^{7} time steps.
Appendix 1—figure 1 inset
All system parameters as in Figure S1 except for: κ from 0.013 to 0.014 in steps of size 2.5 · 10^{−5}.
Simulation length: 10^{8} steps.
For each simulation, the performance was averaged over the last 2 · 10^{7} steps.
The results are binned in 20 equalwidth bins.
Data availability
Python simulation scripts reproducing all figures in the paper from scratch, as well as precomputed simulation data files for faster plotting, are provided as Source code 1 (and are also publicly available at https://doi.org/10.17632/5xngwv5kps.2).
References

Circadian rhythms from multiple oscillators: lessons from diverse organismsNature Reviews Genetics 6:544–556.https://doi.org/10.1038/nrg1633

ConferenceStrong turing completeness of continuous chemical reaction networks and compilation of mixed analogdigital programsInternational Conference on Computational Methods in Systems Biology. pp. 108–127.

Twocomponent signaling circuit structure and propertiesCurrent Opinion in Microbiology 13:184–189.https://doi.org/10.1016/j.mib.2010.01.009

Achieving optimal growth through product feedback inhibition in metabolismPLOS Computational Biology 6:e1000802.https://doi.org/10.1371/journal.pcbi.1000802

Autoamplification of a twocomponent regulatory system results in "learning" behaviorJournal of Bacteriology 183:4914–4917.https://doi.org/10.1128/JB.183.16.49144917.2001

Implementation of dynamic Bayesian decision making by intracellular kineticsPhysical Review Letters 104:228104.https://doi.org/10.1103/PhysRevLett.104.228104

BookCalculus of Variations and Optimal Control Theory: A Concise IntroductionPrinceton University Press.

Escherichia coli Habitats, Cell Types, and Molecular Mechanisms of Gene ControlThe American Naturalist 122:732–744.https://doi.org/10.1086/284168

Bistability, epigenetics, and bethedging in bacteriaAnnual Review of Microbiology 62:193–210.https://doi.org/10.1146/annurev.micro.62.081307.163002

How Can Evolution Learn?Trends in Ecology & Evolution 31:147–157.https://doi.org/10.1016/j.tree.2015.11.009

Autoregulation is essential for precise temporal and steadystate regulation by the Bordetella BvgAS phosphorelayJournal of Bacteriology 189:1974–1982.https://doi.org/10.1128/JB.0168406
Decision letter

Aleksandra M WalczakSenior Editor; École Normale Supérieure, France

Arvind MuruganReviewing Editor; University of Chicago, United States

Rami PugatchReviewer; BenGurion University of the Negev, Israel

Guillaume LambertReviewer; The University of Chicago, United States
In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.
Acceptance summary:
This paper shows how a common biochemical motif enables an organism to learn parameters describing the statistical structure of environmental fluctuations. The inferred parameters (here, the mean and variance) can be used to mount an adaptive response. The work shows a simple design principle to achieve control over phenotype, homeostasis and growth.
Decision letter after peer review:
Thank you for submitting your article "A simple regulatory architecture allows learning the statistical structure of a changing environment" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Aleksandra Walczak as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Rami Pugatch (Reviewer #1); Guillaume Lambert (Reviewer #2).
The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.
Essential Revisions:
1) How would results change if the thresholdlinear model is replaced by a Hill function? Some discussion is currently in the SI. More discussion is needed in the main text of how all results would change if a hill function is used throughout.
2) A unified compact discussion of selfactivation, nonlinearity and regulator excess in the main text.
3) Discuss the cost/benefits of the proposed learning mechanisms if the environment doesn't fit the internal model discussed here – e.g., requires more than a binary onvsoff response or environmental statistics varies very fast.
Reviewer #1:
My recommendation is a to reorganize the paper according to the following comments:
Weaknesses:
The first weakness I wish to discuss is the oversimplification of the model. The authors assume a set of parameters D_{i} (demand) has to be tracked by a set of internal parameters P_{i} and define a quadratic penalty term, (which happens to be the only choice that is amenable to analytical treatment in control theory). To motivate this modeling approach they give two examples, one of which is the expression of a costly metabolic pathway that ideally should track the relevant nutrient. The problem with this model is the assumption that there is only one optimal solution $\overrightarrow{P}=\overrightarrow{D}$. This I think is too simplistic since (i) the dimension of the internal set of parameters π at least in the metabolic example is typically larger than the state parameters D_{i} (even if they represent the demand for a metabolite there are typically multiple alternative supply pathways) ; (ii) The possibility of multiple solutions P_{1},…, P_{m} which are equally optimal is ignored.
For example, absence aminoacid requires expression of proteins that can metabolize these aminoacids from an inorganic nitrogen source. However, presence of both aminoacids and inorganic nitrogen source poses a dilemma to the cell, and a solution that optimize say growth rate depends subtly on both the relative concentration of the inorganic nitrogen source vs. the concentration and availability of external aminoacids, as well as the protein investment in transporting vs. making the aminoacids from scratch. If these relative concentrations fluctuate, an optimal solution in terms of growth rate can either be to ignore the aminoacids, or to ignore the inorganic nitrogen source or to express a mixture of both and this is not captured by the model presented by the authors. And there are other cases where multiple alternative pathways in metabolism are present to satisfy a given demand.
As the great William Feller used to tell his students, the best in science consists of the general embodied in the concrete. I like the general result but I urge the authors to take Feller's challenge and find a concrete example to demonstrate the merit of their approach and use it in the paper.
This biological concreteness will also solve a second weakness I found in the paper, namely the form of the nonlinear response used by the authors in the main text and in preparing the main figures (to my understanding). The nonlinearity is of the form of 'thresholdlinear' response. This is not biologically feasible as there is always some upper saturation in the response. I expected to see the results presented in the main text in terms of a MichaelisMenten (MM) or Hill function form which saturate both at low and at high concentrations. Although such cases were analyzed in the SI, it wasn't clear to me whether the main conclusions would change if such functions were used instead throughout the paper.
Since the author emphasize the role of (i) excess of regulators, (ii) selfactivation, and (iii) nonlinearity of the regulators, I think it is appropriate to analyze their importance in the main text. For some reason the role of selfactivation was deferred to the SI.
Also, I would have liked to understand their relative importance (with realistic MM or Hill nonlinearity) keeping in mind that proteins are costly for the cell to make, so a minimal excess is perhaps better.
Finally, I would expect a discussion of an hypothetical experiment, where such excess is deliberately removed by say gene deletion by the experimenter – what would be the simplest experiment that will ascertain the effect you predict be? Will a simple upshift / downshift will do the job?
Another more technical weakness is the statistical model being a Gaussian process. The authors claim their result is more general so I wonder how their result change for processes without time scale or to processes with powerlaw correlations?
A weakness in the analysis of the benefit of having such mechanisms for 'learning' the statistics of the environment requires some attention. If the environments statistics changes rapidly, or if the environment poses more complex challenges that require a response which is not allornone I would expect a reduction in the benefit of such a mechanism. An example that was discussed, is the case where the environment statistics varies too fast compared to the timescale for learning a new environment statistic. It would be nice to see a simple summary of when the authors predict such a system will be advantageous for cells, considering e.g. the excess cost of building and maintaining such machinery. This naturally leads to a discussion of possible experimental verification as mentioned before.
More importantly, if we consider an ensemble of cells with such control mechanisms, their mutual interaction can lead to complex dynamics, and it is not clear if in such a setting where each cell response to the global change caused by all other cells, there is still benefit for learning the statistics which now become intertwined with the behavior of the average of all cells.
Finally, I find a weakness in the modeling of the "nominal" product feedback inhibition (PFI). It is not fair to compare to a naïve singlestep PFI since actual PFI's are not single step and there is much work on the topic, see in particular the set of papers by Ned Wingreen from Princeton and his collogues.
I like the paper, and I think my comments benefits the paper. I also understand that doing it all over again might require too much effort, so I leave it to you and to the editor to think what it the best way to address my concerns. Perhaps (hopefully) some of my comments are already taken care of, and the only change is to better explain what you already did. I apologize in advance if I misread or misunderstood parts of the paper and because of that made an unnecessary comment.
I suggest you take a look at the following work:
Goyal, Sidhartha, et al., "Achieving optimal growth through product feedback inhibition in metabolism.". PLoS Comput Biol 66 (2010):, 6, 6, e1000802. Web and the PRL they cite.
I also vaguely recall Stanislas Leibler had a PNAS paper on learning with Bin Kan Xue as first author, perhaps it is relevant too.
I too had a paper that is somewhat relevant (That Kobayashi which you cite, cites in his paper)  https://arxiv.org/abs/1308.0623
Please do not cite it unless you find it relevant, I only mention it because after reading your work I realized that I can use your method to test my formula..
Regarding the prediction issue (that such a system predicts the future and respond to the prediction which it constantly updates). This sound to me a lot like model predictive control. So I think a comparison with model predictive control literature might be in place. In particular, a clear discussion of how do you experimentally differ between nonpredictive to predictive control schemes, given that most if not all of the internal variables are not measured in a typical experiment seems necessary.
Reviewer #2:
Overall, I recommend the publication of this manuscript as long as the authors address the following minor comments and suggestions.
1. A piecewise linear model for the regulator activity like the one presented in Eqn. 2c and 4c may not be biologically realistic. Although it is clear that the linear model was chosen because of the ubiquity of linear activation functions in neural networks, the authors may need to better justify why they chose a linear function, or whether it matters at all what the activity function looks like. Indeed, the authors do look into using a Hilllike function to describe the regulation of biochemical activity in section S8.2, it would be illuminating to emphasize this observation and discuss the role of choosing a specific activity function.
2. Line 389: "...but maintaining such a perfect regulatory system would presumably be costly;". Please explain why or provide a citation supporting this claim.
3. Line 777: (typo) "...acts as a senor"
Fig. 1A: Not sure what this panel is trying to show (even from the figure caption). Perhaps show a third panel above the other two where the information about the exterior is removed but the cell "sees" the same environmental composition?
Fig. 2C: The subscripts for a_{1} and a_{2} are missing.
Fig. 2E: It is unclear what the xaxis and yaxis labels are. X = and Y =
Fig. 3H: the "H" label for panel 4H is not there.
Fig. 3I+J: It is unclear how the regulator levels map back to the environment. Perhaps show a small ellipse with the correct orientation alpha (ie. 60, +30, 30, +60) in each sector, at the top of the graph, with the corresponding magnitude of each vector \sigma?.
Fig. 3J: Similarly, the main message of this panel is not very clear: several lines overlap and it is difficult to track a single one over time. Split that panel into 5 graphs arranged vertically, each showing the trace of a single regulator?
Fig. 4: Panel labels "A" and "B" are missing
https://doi.org/10.7554/eLife.67455.sa1Author response
Essential Revisions:
1) How would results change if the thresholdlinear model is replaced by a Hill function? Some discussion is currently in the SI. More discussion is needed in the main text of how all results would change if a hill function is used throughout.
This is indeed a very important point. To exhibit the generality of the learning behavior we describe, we used a minimal model which allowed us to characterize the minimal set of key ingredients without obscuring them by the (inevitably large) parameter space of any realistic implementation. However, the fact that the behavior is retained in a realistic implementation is, of course, a crucial part of the argument. Relegating this discussion to the supplement was a poor decision.
A literal interpretation of the reviewers’ suggestions here might have been to regenerate the plots of Figure 3 & Figure 4 using a Hill function instead of the thresholdlinear model. With an appropriate choice of the Hill parameters, this could be done so that the panels barely change (for reasons explained in SI lines 663670). However, we felt that doing so would have addressed only the letter of the comment, not the spirit; in fact, including such plots could almost be construed as misleading. Our minimal model includes several simplifications. Adding figures with “partial realism” while retaining other simplifications would implicitly suggest that the form of nonlinearity is the biggest concern one might have; but in fact, the simplistic treatment of regulator selfactivation in Eq. (4c) is no less important.
The questions raised by the reviewers – namely, robustness of the behavior to the simplifications of our model – are issues we thought a lot about while writing the paper, and we felt that any claims about generalizability of the behavior must be validated in a reasonably realistic setting – with not only Hillshaped activation curves, but also a biochemically plausible selfactivation. We completely agree that this discussion belongs in the main text, not the supplement. Thus, to address the reviewers’ concerns, we added a figure 6 to the main text (“Realistic implementations”, which includes two panels that were previously in the supplement), and greatly expanded the corresponding section of Results (lines 341374; see also lines 174177).
2) A unified compact discussion of selfactivation, nonlinearity and regulator excess in the main text.
We have merged the relevant material, previously fragmented between the main text and the SI, into a single section of the main text (revised Figure 5; lines 305329).
3) Discuss the cost/benefits of the proposed learning mechanisms if the environment doesn't fit the internal model discussed here – e.g., requires more than a binary onvsoff response or environmental statistics varies very fast.
To address this point we, first, changed Figure 1A, replacing the (unclear) original cartoon by a concise summary identifying the context where the problem we are considering / the mechanism we are presenting are relevant (new Figure 1A and its legend; also lines 4548). Second, the revised manuscript identifies the control theory calculation (and thus our findings of nearoptimality) as being modelspecific; of biological relevance is the ability of the circuit to learn correlations of fluctuations, which is not specific to the Gaussian model (lines 235238). Additional suggestions from Reviewer #1 related to this point were incorporated in lines 8894 and 266268.
Reviewer #1:
My recommendation is a to reorganize the paper according to the following comments:
Weaknesses:
The first weakness I wish to discuss is the oversimplification of the model. The authors assume a set of parameters D_{i} (demand) has to be tracked by a set of internal parameters P_{i} and define a quadratic penalty term, (which happens to be the only choice that is amenable to analytical treatment in control theory). To motivate this modeling approach they give two examples, one of which is the expression of a costly metabolic pathway that ideally should track the relevant nutrient. The problem with this model is the assumption that there is only one optimal solution $\overrightarrow{P}=\overrightarrow{D}$. This I think is too simplistic since (i) the dimension of the internal set of parameters π at least in the metabolic example is typically larger than the state parameters D_{i} (even if they represent the demand for a metabolite there are typically multiple alternative supply pathways) ; (ii) The possibility of multiple solutions P_{1},…, P_{m} which are equally optimal is ignored.
For example, absence aminoacid requires expression of proteins that can metabolize these aminoacids from an inorganic nitrogen source. However, presence of both aminoacids and inorganic nitrogen source poses a dilemma to the cell, and a solution that optimize say growth rate depends subtly on both the relative concentration of the inorganic nitrogen source vs. the concentration and availability of external aminoacids, as well as the protein investment in transporting vs. making the aminoacids from scratch. If these relative concentrations fluctuate, an optimal solution in terms of growth rate can either be to ignore the aminoacids, or to ignore the inorganic nitrogen source or to express a mixture of both and this is not captured by the model presented by the authors. And there are other cases where multiple alternative pathways in metabolism are present to satisfy a given demand.
The biological question of determining the optimal behavior in a fluctuating environment is subtle. The optimal behavior may indeed not be unique, or even welldefined (optimal given what constraints?) The challenges faced by real organisms are certainly vastly more rich than our simplistic setup.
Nevertheless, generically, the most appropriate behavior will depend on statistics of fluctuations; and any organism attempting to make use of this would at a minimum need (1) a mechanism for sensing these statistics. Moreover, such a mechanism would (2) need to be simple enough to have plausibly evolved and be retained by evolution. Our minimal model allows us to focus on these two basic challenges.
In light of this, we agree with the reviewer that introducing our setup as a way to "model the general challenges of surviving in a fluctuating environment" in the original manuscript was poor phrasing. Our aim was to show that a simple circuit is capable of learning environment statistics, in a setup that could have plausibly benefited fitness. For this aim, we believe the simplifications of our setup (such as a unique optimum) are appropriate, and we added a paragraph (lines 8894) to explicitly address this important point.
As the great William Feller used to tell his students, the best in science consists of the general embodied in the concrete. I like the general result but I urge the authors to take Feller's challenge and find a concrete example to demonstrate the merit of their approach and use it in the paper.
We certainly do agree with the sentiment. The metabolic motivation of the problem (line 128132, lines 385388 in discussion), the osmotic example (line 8384) and the possible relation to twocomponent system architecture (lines 375381, 395400) represent our best attempt to include plausible concreteness without straying too far from the domain of our expertise. For example, we were quite intrigued by a possible implementation of our scheme in the context of protein synthesis and tRNA regulation, but we worried our distance from the subject would make it too cartoonish and naive. Similarly, our discussion of the twocomponent systems goes as far as we were comfortable taking it after a literature review and conversation with an expert (M. Goulian in acknowledgments). We therefore opted for explaining the idea in a theoretical language, explicitly separating it from examples. While a concrete and realistic example would have helped with making the message more broadly accessible, the fact that a reader like yourself saw possible parallels with e.g. the regulation of ribosome production  a topic we ourselves did not consider  appears to show that we did overall succeed at our intention.
This biological concreteness will also solve a second weakness I found in the paper, namely the form of the nonlinear response used by the authors in the main text and in preparing the main figures (to my understanding). The nonlinearity is of the form of 'thresholdlinear' response. This is not biologically feasible as there is always some upper saturation in the response. I expected to see the results presented in the main text in terms of a MichaelisMenten (MM) or Hill function form which saturate both at low and at high concentrations. Although such cases were analyzed in the SI, it wasn't clear to me whether the main conclusions would change if such functions were used instead throughout the paper.
This is an excellent point; please see the “essential revisions” section above (point #1) for a detailed list of changes we made to address it. For specifically the point on upper saturation in the response, see the edit in lines 667668.
Since the author emphasize the role of (i) excess of regulators, (ii) selfactivation, and (iii) nonlinearity of the regulators, I think it is appropriate to analyze their importance in the main text. For some reason the role of selfactivation was deferred to the SI.
Another excellent point; thank you. For the list of edits, see “essential revisions” section, point #2.
Also, I would have liked to understand their relative importance (with realistic MM or Hill nonlinearity) keeping in mind that proteins are costly for the cell to make, so a minimal excess is perhaps better.
For relative importance: none of the three ingredients are sufficient on their own (Figure 5); in that sense, all are equally important. For minimal excess being better: indeed! The main feature of our scheme that (hopefully) makes it of plausible relevance to real living systems is that in the presence of nonlinearity & selfactivation, a single additional regulator can already provide substantial benefit. We added a more nuanced discussion of these points (new panel Figure 5B; lines 317329, especially 328329).
For the purposes of this work, we postulated that sensing/learning fluctuations was desirable, and asked if it was possible. In any specific context, the full costbenefit analysis would indeed weigh the benefit against the cost of making the necessary regulatory machinery. We did not perform this analysis; this is acknowledged in line 269 stating we use CIP as only a “proxy” for specifying the maximum tolerable cost.
Finally, I would expect a discussion of an hypothetical experiment, where such excess is deliberately removed by say gene deletion by the experimenter – what would be the simplest experiment that will ascertain the effect you predict be? Will a simple upshift / downshift will do the job?
One feature of the mechanism we described is that the excess regulators would appear fully redundant / dispensable in any static environment: the fitness penalty from their deletion would only manifest itself in a complex fluctuating environment, as a failure to learn. Of course, there are a great many functions whose benefits are masked in the simplified lab environment, so this feature is hardly a decisive identifying criterion. We added a discussion of a hypothetical experiment that could detect specifically the predictive nature of a regulatory scheme (lines 631642). Our hope with this paper is that experts in any given system are better positioned to map the general idea onto their domain than we could ourselves.
Another more technical weakness is the statistical model being a Gaussian process. The authors claim their result is more general so I wonder how their result change for processes without time scale or to processes with powerlaw correlations?
Another good point; please see our response to the “essential revisions”, point #3.
A weakness in the analysis of the benefit of having such mechanisms for 'learning' the statistics of the environment requires some attention. If the environments statistics changes rapidly, or if the environment poses more complex challenges that require a response which is not allornone I would expect a reduction in the benefit of such a mechanism. An example that was discussed, is the case where the environment statistics varies too fast compared to the timescale for learning a new environment statistic. It would be nice to see a simple summary of when the authors predict such a system will be advantageous for cells, considering e.g. the excess cost of building and maintaining such machinery. This naturally leads to a discussion of possible experimental verification as mentioned before.
To address this point, we made a new Figure 1A; the legend specifies that “In this work, we consider an environment undergoing epochs that differ in their fluctuation structure. Epochs are long compared to the physiological timescale, but switch faster than the evolutionary timescale.” The reviewer is correct in saying that, in principle, the discussion of the benefits would need to be offset by the cost of building/maintaining the relevant machinery. We did not perform this analysis here, and do not know how to do it in a way that is not systemspecific; this limitation is acknowledged in lines 267269.
More importantly, if we consider an ensemble of cells with such control mechanisms, their mutual interaction can lead to complex dynamics, and it is not clear if in such a setting where each cell response to the global change caused by all other cells, there is still benefit for learning the statistics which now become intertwined with the behavior of the average of all cells.
This is a very interesting comment. We edited the framing of the problem to specify we consider the fluctuations structure of externally imposed conditions (lines 82, 94). However, as the reviewer points out, the organisms also influence their own environment. Whether this fact makes the problem more relevant or less relevant is something that would be interesting to think about. Still, we feel that such a discussion lies outside of the scope of this manuscript. (However, we added a mention to lines 9192.)
Finally, I find a weakness in the modeling of the "nominal" product feedback inhibition (PFI). It is not fair to compare to a naïve singlestep PFI since actual PFI's are not single step and there is much work on the topic, see in particular the set of papers by Ned Wingreen from Princeton and his colleagues.
This is a good point. Although we keep comparing the performance of our scheme with product feedback inhibition, our intention is not to claim that this scheme is “worse” than ours; we simply use it as an example of a circuit that manifestly does not learn input statistics, in order to show that our circuit does. Our simplified treatment of PFI is not unfair to this scheme: if anything, our results actually reinforce prior findings (Pavlov et al., 2013, Goyal et al. 2010) that despite its simplicity, PFI is a remarkably efficient scheme  in our Figure 4A, up until the separation indicated by the arrow, the SEPI curve is in fact optimal given its CIP! Our scheme is not generally better at the same CIP (note that before the arrow, our performance is worse than the blue “SEPI” curve), it just “knows” to upregulate the CIP to a larger value when fluctuations become large enough to activate the “learning”, and therefore performs better in some parameter regimes.
We added a paragraph to the main text stating the above (lines 276282); and both references.
I like the paper, and I think my comments benefits the paper. I also understand that doing it all over again might require too much effort, so I leave it to you and to the editor to think what it the best way to address my concerns. Perhaps (hopefully) some of my comments are already taken care of, and the only change is to better explain what you already did. I apologize in advance if I misread or misunderstood parts of the paper and because of that made an unnecessary comment.
Thank you for your careful read of our manuscript and for the valuable suggestions that helped improve the work.
I suggest you take a look at the following work:
Goyal, Sidhartha, et al., "Achieving optimal growth through product feedback inhibition in metabolism.". PLoS Comput Biol 66 (2010):, 6, 6, e1000802. Web and the PRL they cite.
I also vaguely recall Stanislas Leibler had a PNAS paper on learning with Bin Kan Xue as first author, perhaps it is relevant too.
I too had a paper that is somewhat relevant (That Kobayashi which you cite, cites in his paper)  https://arxiv.org/abs/1308.0623
Please do not cite it unless you find it relevant, I only mention it because after reading your work I realized that I can use your method to test my formula..
Thank you for the pointers. We read them all with great interest. The Goyal et al., is particularly relevant and we should have included it from the start; we added this reference as well as another one that seemed the most relevant (Pavlov et al., 2013).
Regarding the prediction issue (that such a system predicts the future and respond to the prediction which it constantly updates). This sound to me a lot like model predictive control. So I think a comparison with model predictive control literature might be in place. In particular, a clear discussion of how do you experimentally differ between nonpredictive to predictive control schemes, given that most if not all of the internal variables are not measured in a typical experiment seems necessary.
We added a paragraph commenting on this to the SI section on control theory (section “Model predictive control”; lines 618649).
Reviewer 2:
Overall, I recommend the publication of this manuscript as long as the authors address the following minor comments and suggestions.
1. A piecewise linear model for the regulator activity like the one presented in Eqn. 2c and 4c may not be biologically realistic. Although it is clear that the linear model was chosen because of the ubiquity of linear activation functions in neural networks, the authors may need to better justify why they chose a linear function, or whether it matters at all what the activity function looks like. Indeed, the authors do look into using a Hilllike function to describe the regulation of biochemical activity in section S8.2, it would be illuminating to emphasize this observation and discuss the role of choosing a specific activity function.
This is a very good point. Please see the “essential revisions” section above (point #1) for a detailed list of changes we made to address it.
2. Line 389: "...but maintaining such a perfect regulatory system would presumably be costly;". Please explain why or provide a citation supporting this claim.
Done (line 268).
Fig. 1A: Not sure what this panel is trying to show (even from the figure caption). Perhaps show a third panel above the other two where the information about the exterior is removed but the cell "sees" the same environmental composition?
Yes; we had tried many versions of this cartoon but in retrospect, it’s still confusing. In the revised manuscript, we replaced it by a concise summary identifying the context where the problem we are considering / the mechanism we are presenting are relevant. (New Fig. 1A)
Fig. 3I+J: It is unclear how the regulator levels map back to the environment. Perhaps show a small ellipse with the correct orientation alpha (ie. 60, +30, 30, +60) in each sector, at the top of the graph, with the corresponding magnitude of each vector \sigma?.
Fig. 3J: Similarly, the main message of this panel is not very clear: several lines overlap and it is difficult to track a single one over time. Split that panel into 5 graphs arranged vertically, each showing the trace of a single regulator?
This was an excellent suggestion! We followed the reviewer’s advice to improve the labeling in these two panels, and find them to be much improved. (Updated panels 3I, 3J.)
Fig. 3H: the "H" label for panel 4H is not there.
Fig. 2C: The subscripts for a_{1} and a_{2} are missing.
This may have been due to issues with PDF rendering on the reviewer’s machine; we do see these labels correctly on ours. If the paper is accepted for publication, we will make sure all labels are correctly displayed in the proofs.
3. Line 777: (typo) "...acts as a senor"
Fig. 2E: It is unclear what the xaxis and yaxis labels are. X = and Y =
Fig. 4: Panel labels "A" and "B" are missing
Corrected; thank you for pointing these out.
https://doi.org/10.7554/eLife.67455.sa2Article and author information
Author details
Funding
Deutsche Forschungsgemeinschaft (EN 278/101)
 Stefan Landmann
National Science Foundation (PHY1734030)
 Caroline M Holmes
National Science Foundation (Graduate Research Fellowship)
 Caroline M Holmes
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
We thank M Goulian, A Murugan, and B Weiner for helpful discussions. SL was supported by the German Science Foundation under project EN 278/10–1; CMH was supported by the National Science Foundation, through the Center for the Physics of Biological Function (PHY1734030) and the Graduate Research Fellowship Program.
Senior Editor
 Aleksandra M Walczak, École Normale Supérieure, France
Reviewing Editor
 Arvind Murugan, University of Chicago, United States
Reviewers
 Rami Pugatch, BenGurion University of the Negev, Israel
 Guillaume Lambert, The University of Chicago, United States
Publication history
 Received: February 11, 2021
 Accepted: July 30, 2021
 Version of Record published: September 7, 2021 (version 1)
Copyright
© 2021, Landmann et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics

 1,124
 Page views

 173
 Downloads

 0
 Citations
Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading

 Cancer Biology
 Computational and Systems Biology
Tumorinitiating cells with reprogramming plasticity or stemprogenitor cell properties (stemness) are thought to be essential for cancer development and metastatic regeneration in many cancers; however, elucidation of the underlying molecular network and pathways remains demanding. Combining machine learning and experimental investigation, here we report CD81, a tetraspanin transmembrane protein known to be enriched in extracellular vesicles (EVs), as a newly identified driver of breast cancer stemness and metastasis. Using protein structure modeling and interface predictionguided mutagenesis, we demonstrate that membrane CD81 interacts with CD44 through their extracellular regions in promoting tumor cell cluster formation and lung metastasis of triple negative breast cancer (TNBC) in human and mouse models. Indepth global and phosphoproteomic analyses of tumor cells deficient with CD81 or CD44 unveils endocytosisrelated pathway alterations, leading to further identification of a qualitykeeping role of CD44 and CD81 in EV secretion as well as in EVassociated stemnesspromoting function. CD81 is coexpressed along with CD44 in human circulating tumor cells (CTCs) and enriched in clustered CTCs that promote cancer stemness and metastasis, supporting the clinical significance of CD81 in association with patient outcomes. Our study highlights machine learning as a powerful tool in facilitating the molecular understanding of new molecular targets in regulating stemness and metastasis of TNBC.

 Cell Biology
 Computational and Systems Biology
Chromatin instability and protein homeostasis (proteostasis) stress are two wellestablished hallmarks of aging, which have been considered largely independent of each other. Using microfluidics and singlecell imaging approaches, we observed that, during the replicative aging of S. cerevisiae, a challenge to proteostasis occurs specifically in the fraction of cells with decreased stability within the ribosomal DNA (rDNA). A screen of 170 yeast RNAbinding proteins identified ribosomal RNA (rRNA)binding proteins as the most enriched group that aggregate upon a decrease in rDNA stability induced by inhibition of a conserved lysine deacetylase Sir2. Further, loss of rDNA stability induces agedependent aggregation of rRNAbinding proteins through aberrant overproduction of rRNAs. These aggregates contribute to ageinduced proteostasis decline and limit cellular lifespan. Our findings reveal a mechanism underlying the interconnection between chromatin instability and proteostasis stress and highlight the importance of celltocell variability in aging processes.