Theory for the optimal detection of time-varying signals in cellular sensing systems
Abstract
Living cells often need to measure chemical concentrations that vary in time, yet how accurately they can do so is poorly understood. Here, we present a theory that fully specifies, without any adjustable parameters, the optimal design of a canonical sensing system in terms of two elementary design principles: (1) there exists an optimal integration time, which is determined by the input statistics and the number of receptors; and (2) in the optimally designed system, the number of independent concentration measurements as set by the number of receptors and the optimal integration time equals the number of readout molecules that store these measurements and equals the work to store these measurements reliably; no resource is then in excess and hence wasted. Applying our theory to the Escherichia coli chemotaxis system indicates that its integration time is not only optimal for sensing shallow gradients but also necessary to enable navigation in these gradients.
Introduction
Living cells continually have to respond and adapt to changes in their environment. They often do so on a timescale that is comparable to that of the environmental variations. Examples are cells that during their development differentiate in response to time-varying morphogen gradients (Durrieu et al., 2018) or cells that navigate through their environment (Tostevin and ten Wolde, 2009; Sartori and Tu, 2011; Long et al., 2016). These cells shape, via their movement, the statistics of the input signal, such that the timescale of the input fluctuations becomes comparable to that of the response. In all these cases, it is important to understand how accurately the cell can estimate chemical concentrations that vary in time.
Cells measure chemical concentrations via receptors on their surface. These measurements are inevitably corrupted by the stochastic arrival of the ligand molecules by diffusion and by the stochastic binding of the ligand to the receptor. Wiener and Kolmogorov (Extrapolation, 1950; Kolmogorov, 1992) and Kalman, 1960 have developed theories for the optimal strategy to estimate signals in the presence of noise. Their filtering theories have been employed widely in engineering, and in recent years they have also been applied to cell signaling. They have been used to show that time integration can improve the sensing of time-varying signals by reducing receptor noise, although it cannot remove this input noise completely because of signal distortion (Andrews et al., 2006; Hinczewski and Thirumalai, 2014; Becker et al., 2015). It has been shown that circadian systems can adapt their response to the statistics of the input signal, as predicted by Kalman filtering theory (Husain et al., 2019). Moreover, Wiener–Kolmogorov filtering theory has been employed to derive the optimal topology of the cellular network depending on the statistics of the input signal (Becker et al., 2015). Negative feedback and incoherent feedforward, which are common motifs in cell signaling (Alon, 2007), make it possible to predict future signal values via signal extrapolation, which is useful when the past signal contains information about the future in addition to the current signal (Becker et al., 2015).
The precision of sensing depends not only on the topology of the cellular sensing network but also on the resources required to build and operate it. Receptors and time are needed to take the concentration measurements (Berg and Purcell, 1977), downstream molecules are necessary to store the ligand-binding states of the receptor in the past, and energy is required to store these states reliably (Govern and Ten Wolde, 2014a). Many studies have addressed the question of how receptors and time limit the precision of sensing static concentrations that do not vary on the timescale of cellular response (Berg and Purcell, 1977; Bialek and Setayeshgar, 2005; Wang et al., 2007; Rappel and Levine, 2008; Endres and Wingreen, 2009; Hu et al., 2010; Mora and Wingreen, 2010; Govern and Ten Wolde, 2012; Mehta and Schwab, 2012; Govern and Ten Wolde, 2014a; Govern and Ten Wolde, 2014b; Kaizu et al., 2014; Ten Wolde et al., 2016; Mugler et al., 2016; Fancher and Mugler, 2017). In addition, progress has been made in understanding how the number of readout molecules and energy set the precision of sensing static signals (Mehta and Schwab, 2012; Govern and Ten Wolde, 2014a; Govern and Ten Wolde, 2014b). Yet the resource requirements for sensing time-varying signals remain a wide-open question. In particular, it is not known how the number of receptor and readout molecules, time, and power required to maintain a desired sensing precision depend on the strength and the timescale of the input fluctuations.
In this article, we present a theory for the optimal design of cellular sensing systems as set by resource constraints and the dynamics of the input signal. The theory applies to one of the most common motifs in cell signaling, a receptor that drives a push–pull network, which consists of a cycle of protein activation and deactivation (Goldbeter and Koshland, 1981, see Figure 1). These systems are omnipresent in prokaryotic and eukaryotic cells (Alon, 2007). Examples are GTPase cycles, as in the Ras system, phosphorylation cycles, as in MAPK cascades, and two-component systems like the chemotaxis system of Escherichia coli. Push–pull networks constitute a simple exponential filter (Hinczewski and Thirumalai, 2014; Becker et al., 2015), in which the current output depends on the current and past input (with past input values contributing to the output with a weight that decays exponentially with time back into the past). Wiener–Kolmogorov filtering theory (Extrapolation, 1950; Kolmogorov, 1992) shows that these networks are optimal for estimating signals that are memoryless (Becker et al., 2015), meaning that the past input does not contain information that is not already present in the current input. These networks are useful because they act as low-pass filters, removing the high-frequency receptor–ligand-binding noise (Andrews et al., 2006; Hinczewski and Thirumalai, 2014; Becker et al., 2015). Push–pull networks thus enable the cell to employ the mechanism of time integration, in which the cell infers the concentration not from the instantaneous number of ligand-bound receptors, but rather from the average receptor occupancy over an integration time (Berg and Purcell, 1977). Our theory gives a unified description in terms of all the cellular resources (protein copies, time, and energy) that are necessary to implement this mechanism of time integration.
It does not address the sensing strategy of maximum-likelihood estimation (Endres and Wingreen, 2009; Mora and Wingreen, 2010; Lang et al., 2014; Hartich and Seifert, 2016; Ten Wolde et al., 2016) or Bayesian filtering (Mora and Nemenman, 2019).
While filtering theories are powerful tools for predicting the optimal topology and response dynamics of the cellular sensing network (Andrews et al., 2006; Hinczewski and Thirumalai, 2014; Becker et al., 2015), they do not naturally reveal the resource requirements for sensing. Our theory therefore employs the sampling framework of Govern and Ten Wolde, 2014a and extends it here to time-varying signals. This framework is based on the observation that the cell estimates the current ligand concentration not from the current number of active readout molecules directly, but rather via the receptor: the cell uses its push–pull network to estimate the receptor occupancy from which the ligand concentration is then inferred (see Figure 2). To elucidate the resource requirements for time integration, the push–pull network is viewed as a device that employs the mechanism of time integration by discretely sampling, rather than continuously integrating, the state of the receptor via collisions of the readout molecules with the receptor proteins (see Figure 2). During each collision, the ligand-binding state of the receptor protein is copied into the activation state of the readout molecule (Ouldridge et al., 2017). The readout molecules thus constitute samples of the receptor state, and the fraction of active readout molecules provides an estimate of the average receptor occupancy. The readout activation states have, however, a finite lifetime, which means that this is an estimate of the (running) average receptor occupancy over this lifetime, which indeed sets the receptor integration time ${\tau}_{\mathrm{r}}$. The cell can estimate the current ligand concentration L from this estimate of the average receptor occupancy ${p}_{{\tau}_{\mathrm{r}}}$ over the past integration time ${\tau}_{\mathrm{r}}$ because there is a one-to-one mapping between ${p}_{{\tau}_{\mathrm{r}}}$ and L.
This mapping ${p}_{{\tau}_{\mathrm{r}}}(L)$ is the dynamic input–output relation and differs from the conventional static input–output relations used to describe the sensing of static concentrations that do not vary on the timescale of the response (Berg and Purcell, 1977; Bialek and Setayeshgar, 2005; Kaizu et al., 2014; Ten Wolde et al., 2016) in that it depends not only on the response time of the system but also on the dynamics of the input signal.
Our theory reveals that the sensing error can be decomposed into two terms, which each depend on collective variables that reveal the resource requirements for sensing. One term, the sampling error, describes the sensing error that arises from the finite accuracy by which the receptor occupancy is estimated. This error depends on the number of receptor samples, as set by the number of receptors, readout molecules, and the integration time; their independence, as given by the receptor-sampling interval and the timescale of the receptor–ligand-binding noise; and their reliability, as determined by how much the system is driven out of thermodynamic equilibrium via fuel turnover. The other term is the dynamical error and is determined by how much the concentration in the past integration time reflects the current concentration that the cell aims to estimate; it depends on the integration time and timescale of the input fluctuations.
Our theory gives a comprehensive view on the optimal design of a cellular sensing system. Firstly, it reveals that the resource allocation principle of Govern and Ten Wolde, 2014a can be generalized to time-varying signals. There exist three fundamental resource classes – receptors and their integration time, readout molecules, and power and integration time – which each fundamentally limit the accuracy of sensing; and, in an optimally designed system, each resource class is equally limiting so that none of them is in excess and thus wasted. However, in contrast to sensing static signals, time cannot be freely traded against the number of receptors and the power to achieve a desired sensing precision: there exists an optimal integration time that maximizes the sensing precision, which arises as a trade-off between the sampling error and the dynamical error. Together with the resource allocation principle, it completely specifies, without any adjustable parameters, the optimal design of the system in terms of its resources: protein copies, time, and energy.
Our theory also makes a number of specific predictions. The optimal integration time decreases as the number of receptors is increased because this allows for more instantaneous measurements. Moreover, the allocation principle reveals that when the input varies more rapidly both the number of receptors and the power must increase to maintain a desired sensing precision, while the number of readout molecules does not.
Finally, we apply our theory to the chemotaxis system of E. coli. This bacterium searches for food via a run-and-tumble strategy (Berg and Brown, 1972), yielding a fluctuating input signal. In small gradients, the timescale of these input fluctuations is set by the typical run time of the bacterium, which is on the order of a few seconds (Berg and Brown, 1972; Taute et al., 2015), while the strength of these fluctuations is determined by the steepness of the gradient. Interestingly, experiments have revealed that E. coli can sense extremely shallow gradients, with a length scale of approximately $10^{4}$ µm (Shimizu et al., 2010), raising the question of how accurately E. coli can measure the concentration and whether this is sufficient to determine whether during a run it has changed, even in these shallow gradients. To measure the concentration, the chemotaxis system employs a push–pull network to filter out the high-frequency receptor–ligand-binding noise (Sartori and Tu, 2011). Applying our theory to this system predicts that the measured integration time, on the order of 100 ms (Sourjik and Berg, 2002), is not only sufficient to enable navigation in these shallow gradients but also necessary. This suggests that this system has evolved to optimally sense shallow concentration gradients.
Results
Theory: model
We consider a single cell that needs to sense a time-varying ligand concentration $L(t)$ (see Figure 1a). The ligand concentration dynamics is modeled as a stationary memoryless, or Markovian, signal specified by the mean (total) ligand concentration $\overline{L}$, the variance ${\sigma}_{L}^{2}$, and the correlation time ${\tau}_{\mathrm{L}}={\lambda}^{-1}$, which determines the timescale on which input fluctuations decay. It obeys Gaussian statistics (Tostevin and ten Wolde, 2010).
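A stationary, Gaussian, Markovian signal with a given mean, variance, and correlation time can be generated as an Ornstein-Uhlenbeck process. The sketch below is only an illustration of such an input, not code from this work; the function name, parameter values, and the choice of an exact one-step update are assumptions made for the example. Note that a Gaussian signal can in principle become negative, so the example keeps the standard deviation small relative to the mean.

```python
import math
import random


def simulate_ligand_signal(mean, sigma, tau_l, dt, n_steps, seed=0):
    """Sample a stationary Gaussian Markov (Ornstein-Uhlenbeck) trajectory.

    Uses the exact one-step update, so dt need not be small relative to tau_l.
    """
    rng = random.Random(seed)
    decay = math.exp(-dt / tau_l)                # correlation between successive points
    kick = sigma * math.sqrt(1.0 - decay ** 2)   # keeps the variance at sigma**2
    level = rng.gauss(mean, sigma)               # start in the stationary distribution
    trajectory = [level]
    for _ in range(n_steps - 1):
        level = mean + (level - mean) * decay + kick * rng.gauss(0.0, 1.0)
        trajectory.append(level)
    return trajectory


# Illustrative (hypothetical) parameters: mean 10, standard deviation 1, tau_L = 1.
signal = simulate_ligand_signal(mean=10.0, sigma=1.0, tau_l=1.0, dt=0.1, n_steps=20000)
sample_mean = sum(signal) / len(signal)
sample_var = sum((v - sample_mean) ** 2 for v in signal) / len(signal)
print(f"mean ~ {sample_mean:.2f}, variance ~ {sample_var:.2f}")
```

The sample mean and variance should lie close to the prescribed values, with residual scatter set by the finite length of the trajectory relative to the correlation time.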
The concentration is measured via ${R}_{\mathrm{T}}$ receptor proteins on the cell surface, which independently bind the ligand (Ten Wolde et al., 2016), $L+R\underset{{k}_{2}}{\overset{{k}_{1}}{\rightleftharpoons}}RL$. The correlation time of the receptor state, which is the timescale on which fluctuations in the number of ligand-bound receptors regress to the mean, is given by ${\tau}_{\mathrm{c}}=1/({k}_{1}\overline{L}+{k}_{2})$ (Berg and Purcell, 1977; Bialek and Setayeshgar, 2005; Kaizu et al., 2014; Ten Wolde et al., 2016). It determines the timescale on which independent concentration measurements can be made.
The ligand-binding state of the receptor is read out via a push–pull network (Goldbeter and Koshland, 1981). The most common scheme is phosphorylation fueled by the hydrolysis of adenosine triphosphate (ATP) (see Figure 1b). The receptor, or an enzyme associated with it such as CheA in E. coli, catalyzes the modification of the readout, $x+RL+\mathrm{ATP}\rightleftharpoons {x}^{\ast}+RL+\mathrm{ADP}$. The active readout proteins ${x}^{*}$ can decay spontaneously or be deactivated by an enzyme, such as CheZ in E. coli, ${x}^{\ast}\rightleftharpoons x+\mathrm{Pi}$. Inside the living cell the system is maintained in a nonequilibrium steady state by keeping the concentrations of ATP, adenosine diphosphate (ADP), and inorganic phosphate (Pi) constant. We absorb their concentrations and the activities of the kinase and, if applicable, phosphatase in the (de)phosphorylation rates, coarse-graining the (de)modification reactions into instantaneous second-order reactions: $x+RL\underset{{k}_{-\mathrm{f}}}{\overset{{k}_{\mathrm{f}}}{\rightleftharpoons}}{x}^{\ast}+RL$, ${x}^{\ast}\underset{{k}_{-\mathrm{r}}}{\overset{{k}_{\mathrm{r}}}{\rightleftharpoons}}x$. This system has a relaxation time ${\tau}_{\mathrm{r}}=1/[({k}_{\mathrm{f}}+{k}_{-\mathrm{f}})\overline{RL}+{k}_{\mathrm{r}}+{k}_{-\mathrm{r}}]$ (Govern and Ten Wolde, 2014a), which describes how fast fluctuations in ${x}^{*}$ relax. It determines how long ${x}^{*}$ can carry information on the ligand-binding state of the receptor; ${\tau}_{\mathrm{r}}$ thus sets the integration time of the receptor state.
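The two timescales defined in this section follow directly from the rate constants. The sketch below evaluates them for made-up parameter values; the function names and numbers are hypothetical, the reverse rate constants of the activation and deactivation reactions are written as `k_minus_f` and `k_minus_r`, and the mean number of bound receptors is assumed to be the steady-state occupancy times ${R}_{\mathrm{T}}$.

```python
def receptor_correlation_time(k1, k2, ligand_mean):
    """tau_c = 1 / (k1 * L_mean + k2) for the binding reaction L + R <-> RL."""
    return 1.0 / (k1 * ligand_mean + k2)


def mean_occupancy(k1, k2, ligand_mean):
    """Assumed steady-state probability that a receptor is ligand bound."""
    return k1 * ligand_mean / (k1 * ligand_mean + k2)


def integration_time(k_f, k_minus_f, k_r, k_minus_r, bound_receptors_mean):
    """tau_r = 1 / [(k_f + k_-f) * RL_mean + k_r + k_-r]."""
    return 1.0 / ((k_f + k_minus_f) * bound_receptors_mean + k_r + k_minus_r)


# Hypothetical parameter values, in arbitrary but consistent units.
k1, k2, ligand_mean, n_receptors = 1.0, 1.0, 1.0, 100
tau_c = receptor_correlation_time(k1, k2, ligand_mean)
p = mean_occupancy(k1, k2, ligand_mean)
tau_r = integration_time(k_f=0.01, k_minus_f=0.0, k_r=1.0, k_minus_r=0.0,
                         bound_receptors_mean=p * n_receptors)
print(f"tau_c = {tau_c:.3f}, p = {p:.2f}, tau_r = {tau_r:.3f}")
```

Setting the reverse rates to zero, as done here, corresponds to the strongly driven limit; nonzero reverse rates shorten ${\tau}_{\mathrm{r}}$.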
Theory: inferring concentration from receptor occupancy
The central idea of our theory is illustrated in Figure 2a: the cell employs the push–pull network to estimate the average receptor occupancy ${p}_{{\tau}_{\mathrm{r}}}$ over the past integration time ${\tau}_{\mathrm{r}}$. It then uses this estimate ${\widehat{p}}_{{\tau}_{r}}$ to infer the current concentration L via the dynamic input–output relation ${p}_{{\tau}_{\mathrm{r}}}(L)$, which provides a one-to-one mapping between ${p}_{{\tau}_{\mathrm{r}}}$ and L.
Dynamic input–output relation
The mapping ${p}_{{\tau}_{\mathrm{r}}}(L)$ is the dynamic input–output relation. It gives the average receptor occupancy over the past integration time ${\tau}_{\mathrm{r}}$, given that the current value of the input signal is $L=L(t)$ (see Figure 2a). Here, the average is not only over the noise in receptor–ligand binding and readout activation (Figure 2b) but also over the subensemble of past input trajectories that each end at the same current concentration L (Figure 2c; Tostevin and ten Wolde, 2010; Hilfinger and Paulsson, 2011; Bowsher et al., 2013). In contrast to the conventional static input–output relation $p({L}_{\mathrm{s}})$, which gives the average receptor occupancy p for a steady-state ligand concentration ${L}_{\mathrm{s}}$ that does not vary in time, the dynamic input–output relation takes into account the dynamics of the input and the finite response time of the system. It depends on all timescales in the problem: the timescale of the input, ${\tau}_{\mathrm{L}}$, the receptor–ligand correlation time ${\tau}_{\mathrm{c}}$, and the integration time ${\tau}_{\mathrm{r}}$. Only when ${\tau}_{\mathrm{L}}\gg {\tau}_{\mathrm{c}},{\tau}_{\mathrm{r}}$ does the dynamic input–output relation ${p}_{{\tau}_{\mathrm{r}}}(L)$ become equal to the static input–output relation $p({L}_{\mathrm{s}})$.
Sensing error
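Linearizing the dynamic input–output relation ${p}_{{\tau}_{\mathrm{r}}}(L)$ around the mean ligand concentration $\overline{L}$ (see Figure 2a) and using the rules of error propagation, the expected error in the concentration estimate is
$${(\delta \widehat{L})}^{2}=\frac{{\sigma}_{{\widehat{p}}_{{\tau}_{r}}}^{2}}{{\tilde{g}}_{L\to {p}_{{\tau}_{\mathrm{r}}}}^{2}}. \tag{1}$$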
Here, ${\sigma}_{{\widehat{p}}_{{\tau}_{r}}}^{2}$ is the variance in the estimate ${\widehat{p}}_{{\tau}_{r}}$ of the average receptor occupancy over the past ${\tau}_{\mathrm{r}}$, given that the current input signal is L (see Figure 2a). The quantity ${\stackrel{~}{g}}_{L\to {p}_{{\tau}_{\mathrm{r}}}}$ is the dynamic gain, which is the slope of the dynamic input–output relation ${p}_{{\tau}_{\mathrm{r}}}(L)$; it determines how much an error in the estimate of ${p}_{{\tau}_{\mathrm{r}}}$ propagates to that in L. Equation 1 generalizes the expression for the error in sensing static concentrations (Berg and Purcell, 1977; Bialek and Setayeshgar, 2005; Wang et al., 2007; Mehta and Schwab, 2012; Kaizu et al., 2014; Govern and Ten Wolde, 2014a; Ten Wolde et al., 2016) to that of timevarying concentrations.
Signal-to-noise ratio
Together with the distribution of input states, the sensing error ${(\delta \widehat{L})}^{2}$ determines how many distinct signal values the cell can resolve. The latter is quantified by the signal-to-noise ratio (SNR), which is defined as
$$\mathrm{SNR}\equiv \frac{{\sigma}_{L}^{2}}{{(\delta \widehat{L})}^{2}}. \tag{2}$$
Here, ${\sigma}_{L}^{2}$ is the variance of the ligand concentration $L(t)$; because the system is stationary and time-invariant, we can omit the argument in $L(t)$ and write $L=L(t)$. The variance ${\sigma}_{L}^{2}$ is a measure for the total number of input states, such that the SNR gives the number of distinct ligand concentrations the cell can measure. Using Equation 1, it is given by
$$\mathrm{SNR}=\frac{{\tilde{g}}_{L\to {p}_{{\tau}_{\mathrm{r}}}}^{2}}{{\sigma}_{{\widehat{p}}_{{\tau}_{r}}}^{2}}\,{\sigma}_{L}^{2}. \tag{3}$$
The SNR also yields the mutual information $I({x}^{*};L)=1/2\mathrm{ln}(1+\mathrm{SNR})$ between the input L and output ${x}^{*}$ (Tostevin and ten Wolde, 2010).
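The chain from occupancy error to sensing error, SNR, and mutual information can be made concrete with a short numerical sketch. It assumes the standard linear error-propagation form, in which the variance of the occupancy estimate is divided by the squared dynamic gain; the function names and input numbers are hypothetical and serve only to illustrate the arithmetic.

```python
import math


def sensing_error(var_p_hat, dynamic_gain):
    """Variance of the concentration estimate from linear error propagation."""
    return var_p_hat / dynamic_gain ** 2


def signal_to_noise(var_ligand, var_p_hat, dynamic_gain):
    """SNR: variance of the input divided by the sensing error."""
    return var_ligand / sensing_error(var_p_hat, dynamic_gain)


def mutual_information_nats(snr):
    """I = 1/2 ln(1 + SNR); the natural logarithm gives the result in nats."""
    return 0.5 * math.log1p(snr)


# Hypothetical numbers: input variance 4, occupancy-estimate variance 0.01, gain 0.1.
snr = signal_to_noise(var_ligand=4.0, var_p_hat=0.01, dynamic_gain=0.1)
info = mutual_information_nats(snr)
print(f"SNR = {snr:.2f}, I = {info:.3f} nats")
```

Dividing the result by $\mathrm{ln}\,2$ converts the mutual information from nats to bits.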
Readout system samples receptor state
Receptor time averaging is typically conceived as a scheme in which the receptor state is averaged via the mathematical operation of an integral: ${p}_{{\tau}_{\mathrm{r}}}=\frac{1}{{\tau}_{\mathrm{r}}}{\int}_{0}^{{\tau}_{\mathrm{r}}}p({t}^{\prime})\,d{t}^{\prime}$. Yet, readout proteins are discrete components that interact with the receptor in a discrete and stochastic fashion. To derive the dynamic gain ${\stackrel{~}{g}}_{L\to {p}_{{\tau}_{\mathrm{r}}}}$ and error in estimating ${p}_{{\tau}_{\mathrm{r}}}$, ${\sigma}_{{\widehat{p}}_{{\tau}_{r}}}^{2}$ (Equation 3), we therefore view the push–pull network as a device that discretely samples the receptor state (see Figure 2b; Govern and Ten Wolde, 2014a). The principle is that cells employ the activation reaction $x+RL\to {x}^{*}+RL$ to store the state of the receptor in stable chemical modification states of the readout molecules. Readout molecules that collide with a ligand-bound receptor are modified, while those that collide with an unbound receptor are not (Figure 2b). The readout molecules serve as samples of the receptor at the time they were created, and collectively they encode the history of the receptor: the fraction of samples that correspond to ligand-bound receptors is the cell’s estimate for ${p}_{{\tau}_{\mathrm{r}}}$. Indeed, this is the discrete and stochastic implementation of the mechanism of time integration. The effective number of independent samples depends not only on the creation of samples, $x+RL\to {x}^{*}+RL$, but also on their decay and accuracy. Samples decay via the deactivation reaction ${x}^{*}\to x$, which means that they only provide information on the receptor occupancy over the past ${\tau}_{\mathrm{r}}$. In addition, both the activation and the deactivation reaction can happen in their microscopic reverse direction, which corrupts the coding, that is, the mapping between the ligand-binding states of the receptor proteins and the activation states of the readout molecules.
Energy is needed to break time reversibility and protect the coding. Furthermore, for time-varying signals, we also need to recognize that the samples correspond to the ligand concentration over the past integration time ${\tau}_{\mathrm{r}}$, which will in general differ from the current concentration L that the cell aims to estimate (see Figure 2c). While a finite ${\tau}_{\mathrm{r}}$ is necessary for time integration, it will, as we show below, also lead to a systematic error in the estimate of the concentration that the cell cannot reduce by taking more receptor samples.
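The idea that discrete snapshots of the receptor state yield an estimate of its occupancy can be illustrated with a toy simulation. The sketch below is a deliberately simplified stand-in, not the model analyzed here: it follows a single receptor at fixed ligand concentration, takes snapshots at regular rather than stochastic intervals, and ignores sample decay and erroneous copying. All names and parameter values are hypothetical.

```python
import random


def estimate_occupancy(k_on, k_off, total_time, sample_interval, seed=0):
    """Estimate receptor occupancy from discrete snapshots of one receptor."""
    rng = random.Random(seed)
    t = 0.0
    bound = False          # start with an unbound receptor
    next_sample = 0.0
    samples = []
    while t < total_time:
        # Exponentially distributed dwell time in the current state.
        rate = k_off if bound else k_on
        t_switch = t + rng.expovariate(rate)
        # Record the state at every sampling time before the next switch.
        while next_sample < min(t_switch, total_time):
            samples.append(1 if bound else 0)
            next_sample += sample_interval
        t = t_switch
        bound = not bound
    return sum(samples) / len(samples)


# k_on plays the role of k1 * L at fixed ligand concentration, k_off that of k2.
k_on, k_off = 1.0, 1.0
true_p = k_on / (k_on + k_off)
estimate = estimate_occupancy(k_on, k_off, total_time=5000.0, sample_interval=1.0)
print(f"true occupancy {true_p:.2f}, sampled estimate {estimate:.3f}")
```

With these rates the receptor correlation time is 0.5, so snapshots taken one time unit apart are only weakly correlated and the fraction of bound snapshots approaches the true occupancy as their number grows.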
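This analysis reveals that the dynamic gain is (see Appendix 1)
$${\tilde{g}}_{L\to {p}_{{\tau}_{\mathrm{r}}}}=\frac{{g}_{L\to p}}{(1+{\tau}_{\mathrm{c}}/{\tau}_{\mathrm{L}})(1+{\tau}_{\mathrm{r}}/{\tau}_{\mathrm{L}})}. \tag{4}$$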
Only when ${\tau}_{\mathrm{L}}\gg {\tau}_{\mathrm{r}},{\tau}_{\mathrm{c}}$ is the average ligand concentration over the ensemble of trajectories ending at $\delta L(t)$ equal to the current concentration $\delta L(t)$ (Figure 2c) and does ${\stackrel{~}{g}}_{L\to {p}_{{\tau}_{\mathrm{r}}}}$ become equal to its maximal value, the static gain ${g}_{L\to p}=p(1-p)/\overline{L}$, where p is the receptor occupancy averaged over all values of $\delta L(t)$. The analysis also reveals that the error in ${p}_{{\tau}_{\mathrm{r}}}$ can be written as (see Appendix 1, Equation 29)
$${\sigma}_{{\widehat{p}}_{{\tau}_{r}}}^{2}={\sigma}_{{\widehat{p}}_{{\tau}_{r}}}^{2,\mathrm{samp}}+{\sigma}_{{\widehat{p}}_{{\tau}_{r}}}^{2,\mathrm{dyn}}, \tag{5}$$
where ${\sigma}_{{\widehat{p}}_{{\tau}_{r}}}^{2,\mathrm{samp}}$ is a statistical error due to the stochastic sampling of the receptor and ${\sigma}_{{\widehat{p}}_{{\tau}_{r}}}^{2,\mathrm{dyn}}$ is a systematic error arising from the dynamics of the input, as elucidated in Figure 2b, c.
Central result
To know how the error ${\sigma}_{{\widehat{p}}_{{\tau}_{r}}}^{2}$ in the estimate of ${p}_{{\tau}_{\mathrm{r}}}$ propagates to the error ${(\delta \widehat{L})}^{2}$ in the estimate of the current ligand concentration, we divide ${\sigma}_{{\widehat{p}}_{{\tau}_{r}}}^{2}$ by the square of the dynamic gain ${\stackrel{~}{g}}_{L\to {p}_{{\tau}_{\mathrm{r}}}}$ given by Equation 4 (see Equation 1). For the full system, the reversible push–pull network, this yields via Equation 3 the central result of our article, the SNR in terms of the total number of receptor samples, their independence, their accuracy, and the timescale on which they are generated:
This expression shows that the sensing error ${\mathrm{SNR}}^{-1}$ can be decomposed into two distinct contributions, which each have a clear interpretation: the sampling error, arising from the stochasticity in the sampling of the receptor state, and the dynamical error, arising from the dynamics of the input.
When the timescale of the ligand fluctuations ${\tau}_{\mathrm{L}}$ is much longer than the receptor correlation time ${\tau}_{\mathrm{c}}$ and the integration time ${\tau}_{\mathrm{r}}$, ${\tau}_{\mathrm{L}}\gg {\tau}_{\mathrm{r}},{\tau}_{\mathrm{c}}$, the dynamical error reduces to zero and only the sampling error remains. Here, ${\overline{N}}_{\mathrm{eff}}$ is the total number of effective samples and ${\overline{N}}_{\mathrm{I}}$ is the number of these that are independent (Govern and Ten Wolde, 2014a). For the full system, they are given by
The quantity $\dot{n}={k}_{\mathrm{f}}p{R}_{\mathrm{T}}\overline{x}-{k}_{-\mathrm{f}}p{R}_{\mathrm{T}}{\overline{x}}^{*}$ is the net flux of x around the cycle of activation and deactivation, with ${R}_{\mathrm{T}}$ the total number of receptor proteins and $\overline{x}$ and ${\overline{x}}^{*}$ the average number of inactive and active readout molecules, respectively. It equals the rate at which x is modified by the ligand-bound receptor; the quantity $\dot{n}/p$ is thus the sampling rate of the receptor, be it ligand-bound or not. Multiplied by the relaxation time ${\tau}_{\mathrm{r}}$, it yields the total number of receptor samples $\overline{N}$ obtained during ${\tau}_{\mathrm{r}}$. However, not all these samples are reliable. The effective number of samples is ${\overline{N}}_{\mathrm{eff}}=q\overline{N}$, where $0<q<1$ quantifies the quality of the sample. Here, $\beta =1/({k}_{\mathrm{B}}T)$ is the inverse temperature, $\mathrm{\Delta}{\mu}_{1}$ and $\mathrm{\Delta}{\mu}_{2}$ are the free-energy drops over the activation and deactivation reaction, respectively, with $\mathrm{\Delta}\mu =\mathrm{\Delta}{\mu}_{1}+\mathrm{\Delta}{\mu}_{2}$ the total drop, determined by the fuel turnover (see Figure 1b). If the system is in thermodynamic equilibrium, $\mathrm{\Delta}{\mu}_{1}=\mathrm{\Delta}{\mu}_{2}=\mathrm{\Delta}\mu =0$, $q\to 0$ and the system cannot sense because the ligand-binding state of the receptor is equally likely to be copied into the correct modification state of the readout as into the incorrect one. In contrast, if the system is strongly driven out of equilibrium and $\mathrm{\Delta}{\mu}_{1},\mathrm{\Delta}{\mu}_{2}\to \mathrm{\infty}$, then, during each receptor–readout interaction, the receptor state is always copied into the correct activation state of the readout; the sample quality parameter q thus approaches unity and ${\overline{N}}_{\mathrm{eff}}\to \overline{N}$.
Yet, even when all samples are reliable, they may contain redundant information on the receptor state. The factor ${f}_{\mathrm{I}}$ is the fraction of the ${\overline{N}}_{\mathrm{eff}}$ samples that are independent. It reaches unity when the receptor sampling interval $\mathrm{\Delta}=2{\tau}_{\mathrm{r}}/({\overline{N}}_{\mathrm{eff}}/{R}_{\mathrm{T}})$ becomes larger than the receptor correlation time ${\tau}_{\mathrm{c}}$.
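The bookkeeping of samples described above can be sketched numerically using only the relations stated in the text: the total number of samples is the sampling rate $\dot{n}/p$ times ${\tau}_{\mathrm{r}}$, the effective number is q times the total, and the sampling interval is $\mathrm{\Delta}=2{\tau}_{\mathrm{r}}/({\overline{N}}_{\mathrm{eff}}/{R}_{\mathrm{T}})$. The function name and input values are hypothetical, q is treated as a given input rather than computed from the free-energy drops, and the fraction ${f}_{\mathrm{I}}$ itself is not evaluated.

```python
def sampling_summary(net_flux, occupancy, tau_r, quality, n_receptors, tau_c):
    """Count receptor samples taken during one integration time tau_r."""
    total = net_flux * tau_r / occupancy              # N = (n_dot / p) * tau_r
    effective = quality * total                       # N_eff = q * N
    interval = 2.0 * tau_r / (effective / n_receptors)  # spacing of samples per receptor
    return {
        "total_samples": total,
        "effective_samples": effective,
        "sampling_interval": interval,
        "interval_exceeds_tau_c": interval > tau_c,
    }


# Hypothetical values in arbitrary but consistent units.
summary = sampling_summary(net_flux=100.0, occupancy=0.5, tau_r=0.1,
                           quality=0.8, n_receptors=10, tau_c=0.05)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this example each receptor is sampled less often than once per correlation time, which is the regime in which the samples are expected to be largely independent.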
When the number of samples becomes very large, ${\overline{N}}_{\mathrm{I}},{\overline{N}}_{\mathrm{eff}}\to \mathrm{\infty}$, the sampling error reduces to zero. However, the sensing error still contains a second contribution, which, following Bowsher et al., 2013, we call the dynamical error. This contribution only depends on timescales. It arises from the fact that the samples encode the receptor history and hence the ligand concentration over the past ${\tau}_{\mathrm{r}}$, which will, in general, deviate from the quantity that the cell aims to predict – the current concentration L. This contribution yields a systematic error, which cannot be eliminated by increasing the number of receptor samples, their independence, or their accuracy. It can only be reduced to zero by making the integration time ${\tau}_{\mathrm{r}}$ much smaller than the ligand timescale ${\tau}_{\mathrm{L}}$ (assuming ${\tau}_{\mathrm{c}}$ is typically much smaller than ${\tau}_{\mathrm{r}},{\tau}_{\mathrm{L}}$). Only in this regime will the ligand concentration in the past ${\tau}_{\mathrm{r}}$ be similar to the current concentration and can the latter be reliably inferred from the receptor occupancy, provided the latter has been estimated accurately by taking enough samples.
Importantly, the dynamics of the input signal not only affects the sensing precision via the dynamical error but also via the sampling error. This effect is contained in the prefactor of the sampling error, ${(1+{\tau}_{\mathrm{c}}/{\tau}_{\mathrm{L}})}^{2}{(1+{\tau}_{\mathrm{r}}/{\tau}_{\mathrm{L}})}^{2}$, which has its origin in the dynamic gain ${\stackrel{~}{g}}_{L\to {p}_{{\tau}_{\mathrm{r}}}}$ (Equation 4). It determines how the sampling error ${\sigma}_{{\widehat{p}}_{{\tau}_{r}}}^{2,\mathrm{samp}}$ in the estimate of ${p}_{{\tau}_{\mathrm{r}}}$ propagates to the error in the estimate of L (see Equation 3). Only when ${\tau}_{\mathrm{c}},{\tau}_{\mathrm{r}}\ll {\tau}_{\mathrm{L}}$ can the readout system closely track the input signal and does ${\stackrel{~}{g}}_{L\to {p}_{{\tau}_{\mathrm{r}}}}$ reach its maximal value, the static gain ${g}_{L\to p}$, thus minimizing the error propagation from ${p}_{{\tau}_{\mathrm{r}}}$ to L.
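To get a feeling for how quickly this prefactor grows, the sketch below evaluates it for a few integration times. It also reports the ratio of the dynamic gain to the static gain under the assumption that the prefactor is the inverse square of that ratio, as suggested by its origin in the dynamic gain. Function names and parameter values are hypothetical.

```python
def sampling_error_prefactor(tau_c, tau_r, tau_l):
    """(1 + tau_c/tau_L)^2 * (1 + tau_r/tau_L)^2, the factor multiplying the sampling error."""
    return (1.0 + tau_c / tau_l) ** 2 * (1.0 + tau_r / tau_l) ** 2


def relative_dynamic_gain(tau_c, tau_r, tau_l):
    """Assumed ratio of dynamic to static gain: the inverse square root of the prefactor."""
    return 1.0 / ((1.0 + tau_c / tau_l) * (1.0 + tau_r / tau_l))


# Hypothetical timescales: fast receptor (tau_c = 0.001) and input timescale tau_L = 1.
tau_c, tau_l = 0.001, 1.0
for tau_r in (0.01, 0.1, 1.0, 10.0):
    factor = sampling_error_prefactor(tau_c, tau_r, tau_l)
    gain = relative_dynamic_gain(tau_c, tau_r, tau_l)
    print(f"tau_r = {tau_r:5.2f}: prefactor = {factor:8.3f}, gain ratio = {gain:.3f}")
```

For integration times well below the input timescale the prefactor stays close to one, whereas an integration time equal to the input timescale already amplifies the sampling error roughly fourfold.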
Fundamental resources
We can use Equation 6 to identify the fundamental resources for cell sensing (Govern and Ten Wolde, 2014a) and derive Pareto fronts that quantify the trade-offs between the maximal sensing precision and these resources. A fundamental resource is a (collective) variable ${Q}_{i}$ that, when fixed to a constant, puts a nonzero lower bound on ${\mathrm{SNR}}^{-1}$, no matter how the other variables are varied. It is thus mathematically defined as ${\mathrm{min}}_{{Q}_{i}=\mathrm{const}}\left({\mathrm{SNR}}^{-1}\right)=f(\mathrm{const})>0$. To find these collective variables, we numerically or analytically minimized ${\mathrm{SNR}}^{-1}$, constraining (combinations of) variables yet optimizing over the other variables. This reveals that the SNR is bounded by (see Appendix 2)
where
Equations 8 and 9 show that the fundamental resources are the number of receptors ${R}_{\mathrm{T}}$, the integration time ${\tau}_{\mathrm{r}}$, the number of readouts ${X}_{\mathrm{T}}$, and the power $\dot{w}=\dot{n}\mathrm{\Delta}\mu $.
Figure 3a, b illustrates that ${R}_{\mathrm{T}},{\tau}_{\mathrm{r}},{X}_{\mathrm{T}},\dot{w}$ are indeed fundamental: the sensing precision is bounded by the limiting resource and cannot be enhanced by increasing another resource. Panel (a) shows that when ${X}_{\mathrm{T}}$ is small, the maximum mutual information ${I}_{\mathrm{max}}$ cannot be increased by raising ${R}_{\mathrm{T}}$: no matter how many receptors the system has, the sensing precision is limited by the pool of readout molecules and only increasing this pool can raise ${I}_{\mathrm{max}}$. Yet, when ${X}_{\mathrm{T}}$ is large, ${I}_{\mathrm{max}}$ becomes independent of ${X}_{\mathrm{T}}$. In this regime, the number of receptors ${R}_{\mathrm{T}}$ limits the number of independent concentration measurements and only increasing ${R}_{\mathrm{T}}$ can raise ${I}_{\mathrm{max}}$. Similarly, panel (b) shows that when the power $\dot{w}$ is limiting, ${I}_{\mathrm{max}}$ cannot be increased by ${R}_{\mathrm{T}}$ but only by increasing $\dot{w}$. Clearly, the resources receptors, readout molecules, and energy cannot compensate each other: the sensing precision is bounded by the limiting resource.
Importantly, while for sensing static concentrations the products ${R}_{\mathrm{T}}{\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}$ and $\dot{w}{\tau}_{\mathrm{r}}$ are fundamental (Govern and Ten Wolde, 2014a), for time-varying signals ${R}_{\mathrm{T}}$, $\dot{w}$, and ${\tau}_{\mathrm{r}}$ separately limit sensing. Consequently, neither receptors ${R}_{\mathrm{T}}$ nor power $\dot{w}$ can be traded freely against time ${\tau}_{\mathrm{r}}$ to reach a desired precision, as is possible for static signals. In line with the predictions of signal filtering theories (Extrapolation, 1950; Kolmogorov, 1992; Kalman, 1960), there exists an optimal integration time ${\tau}_{\mathrm{r}}^{\mathrm{opt}}$ that maximizes the sensing precision (Andrews et al., 2006; Hinczewski and Thirumalai, 2014; Becker et al., 2015; Monti et al., 2018b; Mora and Nemenman, 2019). Interestingly, its value depends on which of the resources ${R}_{\mathrm{T}}$, ${X}_{\mathrm{T}}$, and $\dot{w}$ is limiting (Figure 3c–f). We now discuss these three regimes in turn.
Receptors
Berg and Purcell, 1977 pointed out that cells can reduce the sensing error by either increasing the number of receptors or taking more measurements per receptor via the mechanism of time integration. However, Equation 8 reveals that for sensing timevarying signals time integration can never eliminate the sensing error completely, as predicted also by filtering theories (Extrapolation, 1950; Kolmogorov, 1992; Kalman, 1960). Equation 8 shows that in the Berg–Purcell regime, where receptors and their integration time are limiting and $h={R}_{\mathrm{T}}{\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}$, the sensing precision does not depend on ${R}_{\mathrm{T}}{\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}$, as for static signals (Govern and Ten Wolde, 2014a), but on ${R}_{\mathrm{T}}$ and ${\tau}_{\mathrm{r}}$ separately, such that an optimal integration time $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ emerges that maximizes the sensing precision (see Figure 3c). Increasing ${\tau}_{\mathrm{r}}$ improves the mechanism of time integration by increasing the number of independent samples per receptor, ${\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}$, thus reducing the sampling error (Equation 6). However, increasing ${\tau}_{\mathrm{r}}$ raises the dynamical error. Moreover, it lowers the dynamical gain ${\stackrel{~}{g}}_{L\to {p}_{{\tau}_{\mathrm{r}}}}$, which increases the propagation of the error in the estimate of the receptor occupancy to that of the ligand concentration. The optimal integration time $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ arises as a tradeoff between these three factors.
Figure 3c also shows that the optimal integration time $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ decreases with the number of receptors ${R}_{\mathrm{T}}$. The total number of independent concentration measurements is the number of independent measurements per receptor, ${\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}$, times the number ${R}_{\mathrm{T}}$ of receptors, ${\overline{N}}_{\mathrm{I}}={R}_{\mathrm{T}}{\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}$. As ${R}_{\mathrm{T}}$ increases, fewer measurements ${\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}$ per receptor have to be taken to remove the receptor–ligand binding noise, explaining why $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ decreases as ${R}_{\mathrm{T}}$ increases – time integration becomes less important.
Interestingly, $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ depends nonmonotonically on the receptor–ligand correlation time ${\tau}_{\mathrm{c}}$ (Figure 3d). When ${\tau}_{\mathrm{c}}$ increases at fixed ${\tau}_{\mathrm{r}}$, the receptor samples become more correlated. To keep the mechanism of time integration effective, ${\tau}_{\mathrm{r}}$ must increase with ${\tau}_{\mathrm{c}}$. However, to avoid too strong signal distortion the cell compromises on time integration by decreasing the ratio ${\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}$ (see inset). When ${\tau}_{\mathrm{r}}$ becomes too large, the benefit of time integration no longer pays off the cost of signal distortion. Now not only the ratio ${\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}$ decreases but also ${\tau}_{\mathrm{r}}$ itself. The sensing system switches to a different strategy: it no longer employs time integration but becomes an instantaneous sensor.
Readout molecules
To implement time integration, the cell needs to store the receptor states in the readout molecules. When the number of readout molecules ${X}_{\mathrm{T}}$ is limiting, the sensing precision is given by Equation 8 with $h={X}_{\mathrm{T}}$. This bound is saturated when ${\tau}_{\mathrm{r}}\to 0$, in marked contrast to the nonzero optimal integration time $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ in the Berg–Purcell regime (see Figure 3c).
To elucidate the nontrivial behavior of $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$, Figure 3e shows $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ as a function of ${X}_{\mathrm{T}}$. When ${X}_{\mathrm{T}}$ is smaller than ${R}_{\mathrm{T}}$, the average number of samples per receptor is less than unity. In this regime, the system cannot time integrate the receptor, and to minimize signal distortion $\tau _{\mathrm{r}}{}^{\mathrm{opt}}\approx 0$. Yet, when ${X}_{\mathrm{T}}$ is increased, the likelihood that two or more readout molecules provide a sample of the same receptor molecule rises, and time averaging becomes possible. Yet to obtain receptor samples that are independent, the integration time ${\tau}_{\mathrm{r}}$ must be increased to make the sampling interval $\mathrm{\Delta}\sim {\tau}_{\mathrm{r}}{R}_{\mathrm{T}}/{X}_{\mathrm{T}}$ larger than the receptor correlation time ${\tau}_{\mathrm{c}}$. As ${X}_{\mathrm{T}}$ and hence the total number of samples $\overline{N}$ are increased further, the number of samples that are independent, ${\overline{N}}_{\mathrm{I}}$, only continues to rise when ${\tau}_{\mathrm{r}}$ increases with ${X}_{\mathrm{T}}$ further. However, while this reduces the sampling error, it also increases the dynamical error. When the decrease in the sampling error no longer outweighs the increase in the dynamical error, $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ and the mutual information no longer change with ${X}_{\mathrm{T}}$ (see Figure 3a). The system has entered the Berg–Purcell regime in which $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ and the mutual information are given by the optimization of Equation 8 with $h={R}_{\mathrm{T}}{\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}$ (gray dashed line). In this regime, increasing ${X}_{\mathrm{T}}$ merely adds redundant samples: the number of independent samples remains ${\overline{N}}_{\mathrm{I}}={R}_{\mathrm{T}}\tau _{\mathrm{r}}{}^{\mathrm{opt}}/{\tau}_{\mathrm{c}}$.
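The crossover from the readout-limited regime to the Berg–Purcell regime can be illustrated with a minimal sketch. It assumes, as a rough heuristic that is not part of the theory itself, that the number of independent samples is capped by the smaller of ${X}_{\mathrm{T}}$ and ${R}_{\mathrm{T}}{\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}$. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
def sampling_interval(tau_r, R_T, X_T):
    """Average time between samples of the same receptor, Delta ~ tau_r * R_T / X_T."""
    return tau_r * R_T / X_T


def independent_samples(tau_r, tau_c, R_T, X_T):
    """Crude estimate (an assumption, not the paper's formula):
    N_I ~ min(X_T, R_T * tau_r / tau_c)."""
    return min(X_T, R_T * tau_r / tau_c)


# Illustrative values: tau_c = 10 ms, tau_r = 100 ms, 1000 receptors.
tau_r, tau_c, R_T = 0.1, 0.01, 1000
for X_T in (500, 50_000):
    delta = sampling_interval(tau_r, R_T, X_T)
    n_ind = independent_samples(tau_r, tau_c, R_T, X_T)
    # Samples of one receptor are independent only if Delta exceeds tau_c.
    regime = "readout-limited" if delta > tau_c else "Berg-Purcell"
    print(f"X_T={X_T:6d}  Delta={delta:.4f} s  N_I~{n_ind:.0f}  ({regime})")
```

In this sketch, adding readout molecules beyond ${R}_{\mathrm{T}}{\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}$ shortens the sampling interval below ${\tau}_{\mathrm{c}}$ without raising the estimated number of independent samples.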
Power
Time integration relies on copying the ligand-binding state of the receptor into the chemical modification states of the readout molecules (Mehta and Schwab, 2012; Govern and Ten Wolde, 2014a). This copy process correlates the state of the receptor with that of the readout, which requires work input (Ouldridge et al., 2017).
The free energy $\mathrm{\Delta}\mu $ provided by the fuel turnover drives the readout around the cycle of modification and demodification (Figure 1). The rate at which the fuel molecules do work is the power $\dot{w}=\dot{n}\mathrm{\Delta}\mu $, and the total work performed during the integration time ${\tau}_{\mathrm{r}}$ is $w\equiv \dot{w}{\tau}_{\mathrm{r}}$. This work is spent on taking samples of receptor molecules that are bound to ligand because only they can modify the readout. The total number of effective samples of ligand-bound receptors during ${\tau}_{\mathrm{r}}$ is $p{\overline{N}}_{\mathrm{eff}}$ (Equation 7), which means that the work per effective sample of a ligand-bound receptor is $w/(p{\overline{N}}_{\mathrm{eff}})=\mathrm{\Delta}\mu /q$ (Govern and Ten Wolde, 2014a).
To understand how energy limits the sensing precision, we can distinguish between two limiting regimes (Govern and Ten Wolde, 2014a). When $\mathrm{\Delta}\mu >4{k}_{\mathrm{B}}T$, the quality parameter $q\to 1$ (Equation 7) and the work per sample of a ligand-bound receptor is $w/(p{\overline{N}}_{\mathrm{eff}})=\mathrm{\Delta}\mu $ (Govern and Ten Wolde, 2014a). In this irreversible regime, the SNR bound is given by Equation 8 with $h=\dot{w}{\tau}_{\mathrm{r}}/(\mathrm{\Delta}\mu /4)$. The power limits the sensing accuracy not because it limits the reliability of each sample but because it limits the rate $\dot{n}=\dot{w}/\mathrm{\Delta}\mu $ at which the receptor is sampled.
When $\mathrm{\Delta}\mu <4{k}_{\mathrm{B}}T$, the system enters the quasiequilibrium regime in which the quality parameter $q\to \beta \mathrm{\Delta}\mu /4$ (see Equation 7, noting that in the optimal system $\mathrm{\Delta}{\mu}_{1}=\mathrm{\Delta}{\mu}_{2}=\mathrm{\Delta}\mu /2$). The sensing bound is now given by Equation 8 with $h=\beta \dot{w}{\tau}_{\mathrm{r}}$, which is larger than $h=\dot{w}{\tau}_{\mathrm{r}}/(\mathrm{\Delta}\mu /4)$ in the irreversible regime (where $\mathrm{\Delta}\mu >4{k}_{\mathrm{B}}T$). The quasiequilibrium regime minimizes the sensing error for a given power constraint (Figure 3b) because this regime maximizes the number of effective measurements per work input $p{\overline{N}}_{\mathrm{eff}}/w=q/\mathrm{\Delta}\mu =\beta /4$ (Govern and Ten Wolde, 2014a).
While the sensing precision for a given power and time constraint is higher in the quasiequilibrium regime, more readout molecules are required to store the concentration measurements in this regime. Noting that the flux $\dot{n}=f(1-f){X}_{\mathrm{T}}q/{\tau}_{\mathrm{r}}=\dot{w}/\mathrm{\Delta}\mu $, it follows that in the irreversible regime ($q\to 1$) the number of readout molecules consuming energy at a rate $\dot{w}$ is
$${X}_{\mathrm{T}}^{\mathrm{irr}}=\frac{\dot{w}{\tau}_{\mathrm{r}}}{f(1-f)\mathrm{\Delta}\mu },\qquad (10)$$
while in the quasiequilibrium regime ($q\to \beta \mathrm{\Delta}\mu /4$) it is
$${X}_{\mathrm{T}}^{\mathrm{qeq}}=\frac{4\dot{w}{\tau}_{\mathrm{r}}}{f(1-f)\beta {(\mathrm{\Delta}\mu )}^{2}}.\qquad (11)$$
Since in the quasiequilibrium regime $\mathrm{\Delta}\mu <4{k}_{\mathrm{B}}T$, ${X}_{\mathrm{T}}^{\mathrm{qeq}}>{X}_{\mathrm{T}}^{\mathrm{irr}}$.
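The comparison between the two regimes follows directly from the flux relation $\dot{n}=f(1-f){X}_{\mathrm{T}}q/{\tau}_{\mathrm{r}}=\dot{w}/\mathrm{\Delta}\mu $. The sketch below solves this relation for ${X}_{\mathrm{T}}$ in the two limits of $q$. Energies are expressed in units of ${k}_{\mathrm{B}}T$ (so that $\beta =1$); the function names and the example numbers are hypothetical.

```python
def readouts_irreversible(w_dot, tau_r, dmu, f=0.5):
    """X_T from n_dot = f(1-f) X_T q / tau_r = w_dot / dmu with q -> 1."""
    return w_dot * tau_r / (f * (1.0 - f) * dmu)


def readouts_quasiequilibrium(w_dot, tau_r, dmu, f=0.5):
    """Same flux relation with q -> dmu / 4 (dmu in units of k_B T)."""
    q = dmu / 4.0
    return w_dot * tau_r / (f * (1.0 - f) * q * dmu)


# Illustrative numbers only: power in k_B T per second, time in seconds.
w_dot, tau_r = 1.0e4, 0.1
for dmu in (1.0, 2.0):  # both below 4 k_B T
    ratio = (readouts_quasiequilibrium(w_dot, tau_r, dmu)
             / readouts_irreversible(w_dot, tau_r, dmu))
    print(f"dmu = {dmu} k_B T: quasiequilibrium/irreversible expression = {ratio:.1f}")
```

Evaluated at the same $\mathrm{\Delta}\mu $, the ratio of the two expressions is $4{k}_{\mathrm{B}}T/\mathrm{\Delta}\mu $, which exceeds unity whenever $\mathrm{\Delta}\mu <4{k}_{\mathrm{B}}T$.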
Equation 8 shows that the sensing precision is fundamentally bounded not by the work $w=\dot{w}{\tau}_{\mathrm{r}}$, as observed for static signals (Govern and Ten Wolde, 2014a), but rather by the power $\dot{w}$ and the integration time ${\tau}_{\mathrm{r}}$ separately such that an optimal integration time $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ emerges. Figure 3f shows how $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ depends on $\dot{w}$. Since the system cannot sense without any readout molecules, in the lowpower regime the system maximizes ${X}_{\mathrm{T}}$ subject to the power constraint $\dot{w}\sim {X}_{\mathrm{T}}/{\tau}_{\mathrm{r}}$ (see Equations 10 and 11) by making ${\tau}_{\mathrm{r}}$ as large as possible, which is the signal correlation time ${\tau}_{\mathrm{L}}$ – increasing $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ further would average out the signal itself. As $\dot{w}$ is increased, ${X}_{\mathrm{T}}$ rises and the sampling error decreases. When the sampling error becomes comparable to the dynamical error (Equation 6), the system starts to trade a further reduction in the sampling error for a reduction in the dynamical error by decreasing $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$. The sampling error and dynamical error are now reduced simultaneously by increasing ${X}_{\mathrm{T}}$ and decreasing $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$. This continues until the sampling interval $\mathrm{\Delta}\sim {R}_{\mathrm{T}}{\tau}_{\mathrm{r}}/{X}_{\mathrm{T}}$ becomes comparable to the receptor correlation time ${\tau}_{\mathrm{c}}$, as marked by the yellow bar. Beyond this point, $\mathrm{\Delta}<{\tau}_{\mathrm{c}}$ and the sampling error is no longer limited by ${X}_{\mathrm{T}}$ but rather by ${\tau}_{\mathrm{r}}$ since ${\tau}_{\mathrm{r}}$ bounds the number of independent samples per receptor, ${\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}$. 
The system has entered the Berg–Purcell regime, where $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ is determined by the tradeoff between the dynamical error and the sampling error as set by the maximum number of independent samples, ${R}_{\mathrm{T}}{\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}$ (Figure 3c).
Optimal design
In sensing timevarying signals, a tradeoff between time averaging and signal tracking is inevitable. Moreover, the optimal integration time depends on which resource is limiting, being zero when ${X}_{\mathrm{T}}$ is limiting and finite when ${R}_{\mathrm{T}}$ or $\dot{w}$ is limiting (Figure 3). It is therefore not obvious whether these sensing systems still obey the optimal resource allocation principle as observed for systems sensing static concentrations (Govern and Ten Wolde, 2014a).
However, Equation 8 shows that when for a given integration time ${\tau}_{\mathrm{r}}$, ${R}_{\mathrm{T}}{\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}={X}_{\mathrm{T}}=\beta \dot{w}{\tau}_{\mathrm{r}}$, the bounds on the sensing precision as set by, respectively, the number of receptors ${R}_{\mathrm{T}}$, the number of readout molecules ${X}_{\mathrm{T}}$, and the power $\dot{w}$ are equal. Each of these resources is then equally limiting sensing and no resource is in excess. We thus recover the optimal resource allocation principle:
$${R}_{\mathrm{T}}{\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}\approx {X}_{\mathrm{T}}\approx \beta \dot{w}{\tau}_{\mathrm{r}}.\qquad (12)$$
Irrespective of whether the concentration fluctuates in time, the number of independent concentration measurements at the receptor level is ${R}_{\mathrm{T}}{\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}$, which in an optimally designed system also equals the number of readout molecules ${X}_{\mathrm{T}}$ and the energy $\beta \dot{w}{\tau}_{\mathrm{r}}$ that are both necessary and sufficient to store these measurements reliably.
The design principle ${X}_{\mathrm{T}}\approx \beta \dot{w}{\tau}_{\mathrm{r}}$ (Equation 12) predicts that there exists a driving force $\mathrm{\Delta}{\mu}^{\mathrm{opt}}$ that optimizes the tradeoff between the number of samples and their accuracy. Noting that $\beta \dot{w}{\tau}_{\mathrm{r}}=\beta \dot{n}\mathrm{\Delta}\mu {\tau}_{\mathrm{r}}=\beta qf(1-f){X}_{\mathrm{T}}\mathrm{\Delta}\mu $ reveals that the principle ${X}_{\mathrm{T}}\approx \beta \dot{w}{\tau}_{\mathrm{r}}$ (Equation 12) specifies $\mathrm{\Delta}\mu $ for the optimal system in which $f\to 1/2$ and $\mathrm{\Delta}{\mu}_{1}=\mathrm{\Delta}{\mu}_{2}=\mathrm{\Delta}\mu /2$ via the equation $q(\mathrm{\Delta}{\mu}^{\mathrm{opt}})=4{k}_{\mathrm{B}}T/\mathrm{\Delta}{\mu}^{\mathrm{opt}}$, where $q(\mathrm{\Delta}\mu )$ is defined in Equation 7. A numerical inspection shows that, to a good approximation, the solution of this equation is given by the crossover from the quasiequilibrium regime to the irreversible one: $\mathrm{\Delta}{\mu}^{\mathrm{opt}}\approx 4{k}_{\mathrm{B}}T$. This can be understood by noting that in the quasiequilibrium regime ${X}_{\mathrm{T}}$ can, for a given power and time constraint, be reduced by increasing $\mathrm{\Delta}\mu $ (Equation 11) without compromising the sensing precision (Equation 8 with $h=\beta \dot{w}{\tau}_{\mathrm{r}}$); in this regime, increasing $\mathrm{\Delta}\mu $ increases the reliability of each sample, and a smaller number of more reliable samples precisely compensates for a larger number of less reliable ones. Yet, when $\mathrm{\Delta}\mu $ becomes larger than $4{k}_{\mathrm{B}}T$, the system enters the irreversible regime. 
Here, ${X}_{\mathrm{T}}$ corresponding to a given $\dot{w}$ and ${\tau}_{\mathrm{r}}$ constraint still decreases with $\mathrm{\Delta}\mu $ (Equation 10), but the sensing error now increases (Equation 8 with $h=\dot{w}{\tau}_{\mathrm{r}}/(\mathrm{\Delta}\mu /4)$) because each sample has become (essentially) perfect in this regime – hence, the samples’ accuracy cannot (sufficiently) increase further to compensate for the reduction in the sampling rate $\dot{n}\sim {X}_{\mathrm{T}}/{\tau}_{\mathrm{r}}$.
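The condition $q(\mathrm{\Delta}{\mu}^{\mathrm{opt}})=4{k}_{\mathrm{B}}T/\mathrm{\Delta}{\mu}^{\mathrm{opt}}$ can be solved numerically once a form for $q$ is fixed. Equation 7 is not reproduced in this section, so the sketch below assumes $q=\mathrm{tanh}(\beta \mathrm{\Delta}\mu /4)$ for $\mathrm{\Delta}{\mu}_{1}=\mathrm{\Delta}{\mu}_{2}=\mathrm{\Delta}\mu /2$. This assumed form reproduces both limits quoted in the text ($q\to \beta \mathrm{\Delta}\mu /4$ for small $\mathrm{\Delta}\mu $ and $q\to 1$ for large $\mathrm{\Delta}\mu $), but it should be checked against Equation 7 before being relied upon. Energies are in units of ${k}_{\mathrm{B}}T$, and the function names are hypothetical.

```python
import math


def quality(dmu):
    """Assumed quality parameter q for dmu1 = dmu2 = dmu/2 (dmu in k_B T)."""
    return math.tanh(dmu / 4.0)


def optimal_drive(lo=1.0, hi=10.0, tol=1e-10):
    """Bisection on g(dmu) = q(dmu) - 4/dmu, which is negative at lo and positive at hi."""
    def g(x):
        return quality(x) - 4.0 / x

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0.0:
            hi = mid  # root lies in [lo, mid]
        else:
            lo = mid  # root lies in [mid, hi]
    return 0.5 * (lo + hi)


dmu_opt = optimal_drive()
print(f"dmu_opt ~ {dmu_opt:.2f} k_B T")
```

Under this assumption the root lies between $4{k}_{\mathrm{B}}T$ and $5{k}_{\mathrm{B}}T$, that is, close to the crossover between the quasiequilibrium and irreversible regimes.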
Equation 12 holds for any integration time ${\tau}_{\mathrm{r}}$, yet it does not specify ${\tau}_{\mathrm{r}}$. The cell membrane is highly crowded, and many systems employ time integration (Berg and Purcell, 1977; Bialek and Setayeshgar, 2005; Govern and Ten Wolde, 2014a). This suggests that these systems employ time integration and accept the signal distortion that comes with it simply because there is not enough space on the membrane to increase ${R}_{\mathrm{T}}$. Our theory then allows us to predict the optimal integration time $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ based on the premise that ${R}_{\mathrm{T}}$ is limiting. As Equation 8 reveals, in this limit $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ depends not only on ${R}_{\mathrm{T}}$ but also on ${\tau}_{\mathrm{c}}$, ${\tau}_{\mathrm{L}}$, and ${\sigma}_{L}/\overline{L}$: $\tau _{\mathrm{r}}{}^{\mathrm{opt}}=\tau _{\mathrm{r}}{}^{\mathrm{opt}}({R}_{\mathrm{T}},{\tau}_{\mathrm{c}},{\tau}_{\mathrm{L}},{\sigma}_{L}/\overline{L})$. The optimal design of the system is then given by Equation 12 but with ${\tau}_{\mathrm{r}}$ given by $\tau _{\mathrm{r}}{}^{\mathrm{opt}}=\tau _{\mathrm{r}}{}^{\mathrm{opt}}({R}_{\mathrm{T}},{\tau}_{\mathrm{c}},{\tau}_{\mathrm{L}},{\sigma}_{L}/\overline{L})$:
$${R}_{\mathrm{T}}\tau _{\mathrm{r}}{}^{\mathrm{opt}}/{\tau}_{\mathrm{c}}\approx {X}_{\mathrm{T}}\approx \beta \dot{w}\tau _{\mathrm{r}}{}^{\mathrm{opt}}.\qquad (13)$$
This design principle maximizes for a given number of receptors ${R}_{\mathrm{T}}$ the sensing precision and minimizes the number of readout molecules ${X}_{\mathrm{T}}$ and power $\dot{w}$ needed to reach that precision.
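As a minimal sketch of how this principle fixes the remaining resources, the function below takes ${R}_{\mathrm{T}}$, ${\tau}_{\mathrm{c}}$, and an integration time as given and returns the matching number of readout molecules and power. It treats the approximate equalities of the allocation principle as exact, which is a simplifying assumption; the function name and the input values are hypothetical.

```python
def optimal_resources(R_T, tau_r_opt, tau_c):
    """Return (X_T, beta_w_dot) implied by
    R_T * tau_r / tau_c = X_T = beta * w_dot * tau_r, taken as exact equalities."""
    n_independent = R_T * tau_r_opt / tau_c  # independent receptor measurements
    X_T = n_independent                      # one readout molecule per measurement
    beta_w_dot = X_T / tau_r_opt             # power in k_B T per second (= R_T / tau_c)
    return X_T, beta_w_dot


# Illustrative inputs: 10^4 receptors, tau_c = 10 ms, tau_r = 100 ms.
X_T, beta_w_dot = optimal_resources(R_T=1.0e4, tau_r_opt=0.1, tau_c=0.01)
print(f"X_T ~ {X_T:.3g} readout molecules, power ~ {beta_w_dot:.3g} k_B T / s")
```

Note that the power depends only on ${R}_{\mathrm{T}}/{\tau}_{\mathrm{c}}$ in this sketch, whereas the number of readout molecules scales with the integration time.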
Comparison with experiment
To test our theory, we turn to the chemotaxis system of E. coli. This system contains a receptor that forms a complex with the kinase CheA. This complex, which is coarse-grained into R (Govern and Ten Wolde, 2014a), can bind the ligand L and activate the intracellular messenger protein CheY (x) by phosphorylating it. Deactivation of CheY is catalyzed by CheZ, the effect of which is coarse-grained into the deactivation rate. This push–pull network allows E. coli to measure the current concentration, and the relaxation time of this network sets the integration time for the receptor (Sartori and Tu, 2011). The system also exhibits adaptation on longer timescales due to receptor methylation and demethylation. The push–pull network and the adaptation system together allow the cell to measure concentration gradients via a temporal derivative, taking the difference between the current concentration and the past concentration as set by the adaptation time (Segall et al., 1986). A lower bound for the error in the estimate of this difference is given by the error in the estimate of the current concentration, the central quantity of our theory. Here, we ask how accurately E. coli can estimate the latter and whether the sensing precision is sufficient to determine whether during a run the concentration has changed.
Our theory predicts that if the number of receptors is limiting then the optimal integration time $\tau _{\mathrm{r}}{}^{\mathrm{opt}}({R}_{\mathrm{T}},{\tau}_{\mathrm{c}},{\tau}_{\mathrm{L}},{\sigma}_{L}/\overline{L})$ is given by minimizing Equation 8 with $h={R}_{\mathrm{T}}{\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}$. The number of receptor–CheA complexes depends on the growth rate and varies between ${R}_{\mathrm{T}}\approx {10}^{3}$ and ${R}_{\mathrm{T}}\approx {10}^{4}$ (Li and Hazelbauer, 2004). The receptor correlation time for the binding of aspartate to the Tar receptor can be estimated from the measured dissociation constant (Vaknin and Berg, 2007) and the association rate (Danielson et al., 1994), ${\tau}_{\mathrm{c}}\approx 10\,\mathrm{ms}$ (Govern and Ten Wolde, 2014a). The timescale ${\tau}_{\mathrm{L}}$ of the input fluctuations is set by the typical run time, which is on the order of a few seconds, ${\tau}_{\mathrm{L}}\approx 1\mathrm{s}$ (Berg and Brown, 1972; Taute et al., 2015).
This leaves one parameter to be determined, ${({\sigma}_{L}/\overline{L})}^{2}$. This is set by the spatial ligand–concentration profile and the typical length of a run. We have a good estimate of the latter. In shallow gradients, it is on the order of $l\approx 50\mu \mathrm{m}$ (Berg and Brown, 1972; Taute et al., 2015; Jiang et al., 2010; Flores et al., 2012); specifically, Figure 4 of Taute et al., 2015 shows that the typical run times are 1–2 s while the typical run speeds are $20\text{–}60\mu \mathrm{m}\,{\mathrm{s}}^{-1}$, yielding a run length that is indeed on the order of 50 µm. We do not know the spatial concentration profiles that E. coli has experienced during its evolution. We can however get a sense of the scale by considering an exponential ligand–concentration gradient. For a profile $\overline{L}(x)={L}_{0}{e}^{x/{x}_{0}}$ with length scale ${x}_{0}$, the relative change in the signal over the length of a run is ${\sigma}_{L}/\overline{L}\simeq (d\overline{L}/dx)l/\overline{L}=l/{x}_{0}$. We consider the range ${\sigma}_{L}/\overline{L}\approx l/{x}_{0}<1$, where ${\sigma}_{L}/\overline{L}<0.1$ corresponds to shallow gradients with ${x}_{0}\gtrsim 500\mu \mathrm{m}$ in which cells move with a constant drift velocity (Shimizu et al., 2010; Flores et al., 2012).
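The arithmetic behind these estimates is short enough to write out. The sketch below computes the run length from the run speed and run time and the relative signal change $l/{x}_{0}$ for an exponential profile; the function names are hypothetical, and the inputs are the ranges quoted above.

```python
def run_length(speed_um_per_s, run_time_s):
    """Distance covered during one run, in micrometres."""
    return speed_um_per_s * run_time_s


def relative_signal_change(l_um, x0_um):
    """sigma_L / L_bar ~ l / x0 for an exponential profile L0 * exp(x / x0)."""
    return l_um / x0_um


# Extremes of the quoted ranges: 20-60 um/s and 1-2 s.
shortest = run_length(20.0, 1.0)
longest = run_length(60.0, 2.0)
print(f"run length between {shortest:.0f} and {longest:.0f} um")

# A 50 um run in a gradient with x0 = 500 um.
print(f"l / x0 = {relative_signal_change(50.0, 500.0):.2f}")
```

The quoted run length of about 50 µm falls inside the interval spanned by these extremes, and ${x}_{0}=500\mu \mathrm{m}$ corresponds to the threshold ${\sigma}_{L}/\overline{L}=0.1$.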
Figure 4a shows that as the gradient becomes steeper and ${\sigma}_{L}/\overline{L}\approx l/{x}_{0}$ increases, the optimal integration time $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ decreases. This can be understood by noting that the relative importance of the dynamical error compared to the sampling error scales with ${\left({\sigma}_{L}/\overline{L}\right)}^{2}$ (Equation 6). Shallow gradients thus allow for a larger integration time while steep gradients necessitate a shorter one.
Experiments indicate that the relaxation rate of CheY is $\tau _{\mathrm{r}}{}^{-1}\approx 2{\mathrm{s}}^{-1}$ for the attractant response and $\approx 20{\mathrm{s}}^{-1}$ for the repellent response (Sourjik and Berg, 2002), such that the integration time ${\tau}_{\mathrm{r}}\approx 50\text{–}500\,\mathrm{ms}$ (Sourjik and Berg, 2002; Govern and Ten Wolde, 2014a). Figure 4a shows that this integration time is optimal for detecting shallow gradients. Our theory thus predicts that the E. coli chemotaxis system has been optimized for sensing shallow gradients.
To navigate, the cells must be able to resolve the signal change over a run. During a run of duration ${\tau}_{\mathrm{L}}$, the system performs ${\tau}_{\mathrm{L}}/{\tau}_{\mathrm{r}}$ independent concentration measurements. The effective error for these measurements is the instantaneous sensing error ${(\delta \widehat{L})}^{2}$ divided by the number of independent measurements ${\tau}_{\mathrm{L}}/{\tau}_{\mathrm{r}}$: ${(\delta \widehat{L})}^{2}/({\tau}_{\mathrm{L}}/{\tau}_{\mathrm{r}})$. Hence, the SNR for these concentration measurements is ${\text{SNR}}_{{\tau}_{\mathrm{L}}}\equiv {({\sigma}_{L}/\delta \widehat{L})}^{2}{\tau}_{\mathrm{L}}/{\tau}_{\mathrm{r}}$.
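A minimal sketch of this definition is given below. It also evaluates the instantaneous SNR that is needed to reach ${\text{SNR}}_{{\tau}_{\mathrm{L}}}=1$, which follows from setting the definition equal to one. The function names and the example values are hypothetical.

```python
def snr_over_run(sigma_L, delta_L, tau_L, tau_r):
    """SNR accumulated over a run: (sigma_L / delta_L)^2 * tau_L / tau_r."""
    return (sigma_L / delta_L) ** 2 * tau_L / tau_r


def min_instantaneous_snr(tau_L, tau_r):
    """Instantaneous SNR (sigma_L / delta_L)^2 at which the run SNR equals 1."""
    return tau_r / tau_L


# Illustrative values: a 1 s run with a 100 ms integration time, and an
# instantaneous error twice the signal standard deviation.
print(snr_over_run(sigma_L=1.0, delta_L=2.0, tau_L=1.0, tau_r=0.1))
print(min_instantaneous_snr(tau_L=1.0, tau_r=0.1))
```

With ten independent measurements per run, an instantaneous SNR of only 0.1 suffices in this example to resolve the signal change over the run.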
Figure 4b shows that our theory predicts that when ${R}_{\mathrm{T}}={10}^{3}$, the shallowest gradient that cells can resolve, defined by ${\text{SNR}}_{{\tau}_{\mathrm{L}}}=1$, is $l/{x}_{0}\approx {\sigma}_{L}/\overline{L}\approx 7\times {10}^{-3}$, corresponding to ${x}_{0}\approx 7500\mu \mathrm{m}$, while when ${R}_{\mathrm{T}}={10}^{4}$, $l/{x}_{0}\approx 2\times {10}^{-3}$ and ${x}_{0}\approx 25000\mu \mathrm{m}$. The shallowest gradient is thus on the order of ${x}_{0}\approx {10}^{4}\mu \mathrm{m}$. Shimizu et al., 2010 show that E. coli cells are indeed able to sense such very shallow gradients: Figure 2A of Shimizu et al., 2010 shows that E. coli cells can detect exponential up ramps with rate $r=0.001/\mathrm{s}$; using $r={v}_{\mathrm{r}}/{x}_{0}$, where ${v}_{\mathrm{r}}\approx 10\mu \mathrm{m}/\mathrm{s}$ is the run speed (Jiang et al., 2010), this corresponds to ${x}_{0}\approx {10}^{4}\mu \mathrm{m}$. Importantly, the predictions of our theory (Figure 4) concern the shallowest gradient that the system with the optimal integration time can resolve. These observations indicate that the optimal integration time is not only sufficient to make navigation in these very shallow gradients possible but also necessary.
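The conversions between the relative signal change, the gradient length scale, and the experimental ramp rate can be checked with a few lines. The sketch below uses the run length $l\approx 50\mu \mathrm{m}$ and the values of ${x}_{0}$, $r$, and ${v}_{\mathrm{r}}$ quoted in the text; the function names are hypothetical.

```python
def relative_change_from_length_scale(l_um, x0_um):
    """l / x0 for a run of length l in an exponential gradient of length scale x0."""
    return l_um / x0_um


def length_scale_from_ramp(v_run_um_per_s, ramp_rate_per_s):
    """x0 = v_r / r for a cell that experiences an exponential ramp of rate r."""
    return v_run_um_per_s / ramp_rate_per_s


l = 50.0  # run length in micrometres
for x0 in (7500.0, 25000.0):
    print(f"x0 = {x0:.0f} um  ->  l/x0 = {relative_change_from_length_scale(l, x0):.1e}")

x0_ramp = length_scale_from_ramp(10.0, 0.001)
print(f"ramp rate 0.001 /s at 10 um/s  ->  x0 = {x0_ramp:.0f} um")
```

The length scale inferred from the ramp experiments lies between the two predicted values, which is the sense in which the predicted and measured shallowest gradients agree to within an order of magnitude.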
Figure 4 also shows that $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ decreases as the number of receptor–CheA complexes, ${R}_{\mathrm{T}}$, increases because the latter allows for more instantaneous measurements, reducing the need for time integration (Figure 3c). Interestingly, the data of Li and Hazelbauer, 2004 show that the copy numbers of the chemotaxis proteins vary with the growth rate. Clearly, it would be of interest to directly measure the response time in different strains under different growth conditions.
Discussion
Here, we have integrated ideas from Tostevin and ten Wolde, 2010; Hilfinger and Paulsson, 2011; and Bowsher et al., 2013 on information transmission via timevarying signals with the sampling framework of Govern and Ten Wolde, 2014a to develop a unified theory of cellular sensing. The theory is founded on the concept of the dynamic input–output relation ${p}_{{\tau}_{\mathrm{r}}}(L)$. It allows us to develop the idea that the cell employs the readout system to estimate the average receptor occupancy ${p}_{{\tau}_{\mathrm{r}}}$ over the past integration time ${\tau}_{\mathrm{r}}$ and then exploits the mapping ${p}_{{\tau}_{\mathrm{r}}}(L)$ to estimate the current ligand concentration L from ${p}_{{\tau}_{\mathrm{r}}}$. The theory reveals that the error in the estimate of L depends on how accurately the cell samples the receptor state to estimate ${p}_{{\tau}_{\mathrm{r}}}$, and on how much ${p}_{{\tau}_{\mathrm{r}}}$, which is determined by the concentration in the past ${\tau}_{\mathrm{r}}$, reflects the current ligand concentration. These two distinct sources of error give rise to the sampling error and dynamical error in Equation 6, respectively.
While the system contains no less than 11 parameters, Equation 6 provides an intuitive expression for the sensing error in terms of collective variables that have a clear interpretation. The dynamical error depends only on the timescales in the problem, most notably ${\tau}_{\mathrm{r}}/{\tau}_{\mathrm{L}}$. The sampling error depends on how accurately the readout system estimates ${p}_{{\tau}_{\mathrm{r}}}$, which is determined by the number of receptor samples, their independence, and their accuracy; yet it also depends on ${\tau}_{\mathrm{r}}/{\tau}_{\mathrm{L}}$ via the dynamic gain, which determines how the error in the estimate of ${p}_{{\tau}_{\mathrm{r}}}$ propagates to that of L. The tradeoff between the sampling error and dynamical error yields an optimal integration time.
Our study reveals that the optimal integration time $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ depends in a nontrivial manner on the design of the system. When the number of readout molecules ${X}_{\mathrm{T}}$ is smaller than the number of receptors ${R}_{\mathrm{T}}$, time integration is not possible and the optimal system is an instantaneous responder with $\tau _{\mathrm{r}}{}^{\mathrm{opt}}\approx 0$. When the power $\dot{w}\sim {X}_{\mathrm{T}}/{\tau}_{\mathrm{r}}$, rather than ${X}_{\mathrm{T}}$, is limiting, $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ is determined by the tradeoff between the sampling error and dynamical error. In both scenarios, however, one resource, ${X}_{\mathrm{T}}$ or $\dot{w}$, is limiting the sensing precision. In an optimally designed system, all resources are equally limiting so that no resource is wasted. This yields the resource allocation principle (Equation 12), first identified in Govern and Ten Wolde, 2014a, for sensing static concentrations. The reason it can be generalized to timevarying signals is that the principle concerns the optimal design of the readout system for estimating the receptor occupancy over a given integration time ${\tau}_{\mathrm{r}}$, which holds for any type of input: the number of independent concentration measurements at the receptor level is ${R}_{\mathrm{T}}{\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}$, irrespective of how the input varies, and in an optimally designed system this also equals the number of readout molecules ${X}_{\mathrm{T}}$ and energy $\beta \dot{w}{\tau}_{\mathrm{r}}$ to store these measurements reliably. We thus expect that the design principle also holds for systems that sense signals that vary more strongly in time (Mora and Nemenman, 2019).
While the allocation principle Equation 12 holds for any ${\tau}_{\mathrm{r}}$, it does not specify the optimal integration time $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$. However, our theory predicts that if the number of receptors ${R}_{\mathrm{T}}$ is limiting, then there exists a $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ that maximizes the sensing precision for that ${R}_{\mathrm{T}}$ (Equation 8 with $h={R}_{\mathrm{T}}{\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}$). Via the allocation principle Equation 13, ${R}_{\mathrm{T}}$ and $\tau _{\mathrm{r}}{}^{\mathrm{opt}}$ then together determine the minimal number of readout molecules ${X}_{\mathrm{T}}$ and power $\dot{w}$ to reach that precision. The resource allocation principle, together with the optimal integration time, thus completely specifies the optimal design of the sensing system.
Applying our theory to the E. coli chemotaxis system shows that this system not only obeys the resource allocation principle (Govern and Ten Wolde, 2014a) but also that the predicted optimal integration time to measure shallow gradients is in agreement with that measured experimentally (Figure 4a). This is remarkable because there is not a single fit parameter in our theory. Moreover, Figure 4b shows that the optimal integration time is not only sufficient to enable the sensing of these shallow gradients but also necessary. This is interesting because the sensing precision could also be increased by increasing the number of receptors, readout molecules, and energy devoted to sensing – but this would be costly. Our results thus demonstrate not only that the chemotaxis system obeys the design principles as revealed by our theory but also that there is a strong selection pressure to design sensing systems optimally, that is, to maximize the sensing precision given the resource constraints.
Our theory is based on a Gaussian model and describes the optimal sensing system that minimizes the mean-square error in the estimate of the ligand concentration (see Equation 1). The latter is precisely the performance criterion of Wiener–Kolmogorov (Extrapolation, 1950; Kolmogorov, 1992) and Kalman, 1960 filtering theories, which, moreover, become exact for systems that obey Gaussian statistics. In fact, since our system (including the input signal) is stationary, they predict the same optimal filter, which is an exponential filter for signals that are memoryless. The signals studied here belong to this class, and the push–pull network forms an exponential filter (Hinczewski and Thirumalai, 2014; Becker et al., 2015). This underscores the idea that our theory gives a complete description, in terms of all the required resources, for the optimal design of cellular sensing systems that need to estimate this type of signal. Furthermore, because our model is Gaussian, the goal of minimizing the mean-square error in the estimate of the input signal is equivalent to maximizing the mutual information between the input (the ligand concentration) and the output (the readout ${x}^{*}$) (Becker et al., 2015).
In recent years, filtering theories and information theory have been applied increasingly to neuronal and cellular systems (Laughlin, 1981; Brenner et al., 2000; Fairhall et al., 2001; Andrews et al., 2006; Ziv et al., 2007; Nemenman et al., 2008; Cheong et al., 2011; Nemenman, 2012; Hinczewski and Thirumalai, 2014; Becker et al., 2015; Husain et al., 2019; Tkacik et al., 2008; Tkačik and Walczak, 2011; Dubuis et al., 2013; Monti and Wolde, 2016; Monti et al., 2018a). A key concept in these theories is that optimal sensing systems match the response to the statistics of the input. When the noise is weak, maximizing the entropy of the output distribution becomes paramount, which entails matching the shape of the input–output relation to the shape of the input distribution to generate a flat output distribution (Laughlin, 1981; Tkacik et al., 2008; Monti et al., 2018a). Yet, when the noise is large, the optimal response is also shaped by the requirement to tame the propagation of noise in the input signal (Andrews et al., 2006; Hinczewski and Thirumalai, 2014; Becker et al., 2015; Monti et al., 2018a; Monti et al., 2018b; Mora and Nemenman, 2019) or to lift the signal above the intrinsic noise in the response system (Tostevin and ten Wolde, 2010; Bowsher et al., 2013). In Appendix 3, we show that estimating the concentration from ${p}_{{\tau}_{\mathrm{r}}}$ is equivalent to that via readout ${x}^{*}$. This makes it possible to connect our sampling framework, which is based on ${p}_{{\tau}_{\mathrm{r}}}(L)$, to filtering and information theory, which are based on ${x}^{*}(L)$. In particular, we show in this appendix how the optimal integration and dynamic gain can be understood from these ideas on matching the response to the input. We also briefly discuss in Appendix 3 the concepts from information theory that are beyond the scope of the Gaussian model considered here.
Yet, our discrete sampling framework gives a detailed description of how the optimal design of sensing systems depends on the statistics of the input signal in terms of all the required cellular resources: protein copies, time, and energy. In an optimal system, each receptor is sampled once every receptor–ligand correlation time ${\tau}_{\mathrm{c}}$, $\mathrm{\Delta}\approx {\tau}_{\mathrm{c}}$, and the number of samples per receptor is $\tau_{\mathrm{r}}^{\mathrm{opt}}/\mathrm{\Delta}\approx \tau_{\mathrm{r}}^{\mathrm{opt}}/{\tau}_{\mathrm{c}}$. The optimal integration time $\tau_{\mathrm{r}}^{\mathrm{opt}}$ for a given ${R}_{\mathrm{T}}$ is determined by the tradeoff between the age of the samples and the number required for averaging the receptor state. When the input varies more rapidly, the samples need to be refreshed more regularly: to keep the dynamical error and the dynamic gain constant, $\tau_{\mathrm{r}}^{\mathrm{opt}}$ must decrease linearly with ${\tau}_{\mathrm{L}}$ (see Equation 6). Yet, only decreasing $\tau_{\mathrm{r}}^{\mathrm{opt}}$ would inevitably increase the sampling error ${\sigma}_{{\widehat{p}}_{{\tau}_{\mathrm{r}}}}^{2,\mathrm{samp}}$ in estimating the receptor occupancy because the sampling interval $\mathrm{\Delta}\sim {R}_{\mathrm{T}}\tau_{\mathrm{r}}^{\mathrm{opt}}/{X}_{\mathrm{T}}^{\mathrm{opt}}$ would become smaller than ${\tau}_{\mathrm{c}}$, creating redundant samples. To keep the sensing precision constant, the number of receptors ${R}_{\mathrm{T}}$ needs to be raised in proportion to $\tau_{\mathrm{L}}^{-1}$, such that the sampling interval $\mathrm{\Delta}\sim {R}_{\mathrm{T}}\tau_{\mathrm{r}}^{\mathrm{opt}}/{X}_{\mathrm{T}}^{\mathrm{opt}}$ remains of order ${\tau}_{\mathrm{c}}$ and the decrease in the number of samples per receptor, $\tau_{\mathrm{r}}^{\mathrm{opt}}/{\tau}_{\mathrm{c}}$, is precisely compensated for by the increase in ${R}_{\mathrm{T}}$.
The total number of independent concentration measurements, ${R}_{\mathrm{T}}\tau_{\mathrm{r}}^{\mathrm{opt}}/{\tau}_{\mathrm{c}}$, and hence the number of readout molecules ${X}_{\mathrm{T}}^{\mathrm{opt}}$ to store these, indeed does not change. In contrast, the required power $\beta {\dot{w}}^{\mathrm{opt}}\approx {R}_{\mathrm{T}}/{\tau}_{\mathrm{c}}$ rises (Equation 12): each receptor molecule is sampled every ${\tau}_{\mathrm{c}}$ at $\mathrm{\Delta}{\mu}^{\mathrm{opt}}\approx 4{k}_{\mathrm{B}}T$, and the increase in ${R}_{\mathrm{T}}$ raises the sampling rate $\dot{n}={\dot{w}}^{\mathrm{opt}}/\mathrm{\Delta}{\mu}^{\mathrm{opt}}\sim {X}_{\mathrm{T}}^{\mathrm{opt}}/\tau_{\mathrm{r}}^{\mathrm{opt}}$. Our theory thus predicts that when the input varies more rapidly the number of receptors and the power must rise to maintain a required sensing precision, while the number of readout molecules does not.
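The scaling argument above can be turned into a small bookkeeping sketch. The Python function below only encodes the proportionalities stated in the text (integration time proportional to ${\tau}_{\mathrm{L}}$, receptor number proportional to ${\tau}_{\mathrm{L}}^{-1}$, readout number unchanged, power of order ${R}_{\mathrm{T}}/{\tau}_{\mathrm{c}}$ in units of ${k}_{\mathrm{B}}T$); the function name `rescale_design`, the `speedup` parameter, and all numerical values are hypothetical and serve purely as an illustration, with every prefactor set to one.

```python
def rescale_design(R_T, X_T, tau_r, tau_c, speedup):
    """Rescale an optimal design when the input varies `speedup` times faster.

    Scaling rules taken from the text: tau_r ~ tau_L, R_T ~ 1 / tau_L,
    X_T unchanged, power (in k_B T per unit time) ~ R_T / tau_c.
    """
    new_R_T = R_T * speedup
    new_tau_r = tau_r / speedup
    return {
        "R_T": new_R_T,
        "tau_r": new_tau_r,
        "X_T": X_T,
        "measurements": new_R_T * new_tau_r / tau_c,  # R_T * tau_r / tau_c
        "power": new_R_T / tau_c,                     # order-of-magnitude estimate
    }

# Hypothetical numbers, chosen so that R_T * tau_r / tau_c equals X_T initially.
base = dict(R_T=100.0, X_T=1000.0, tau_r=0.1, tau_c=0.01)
fast = rescale_design(speedup=2.0, **base)
```

With these assumed numbers, doubling the speed of the input doubles the receptor number and the power, while the number of independent measurements and the number of readout molecules stay at 1000.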
The fitness benefit of a sensing system depends not only on the sensing precision but also on the energetic cost of maintaining and running the system. In principle, the cell can reduce the sensing error arbitrarily by increasing ${R}_{\mathrm{T}}$ and decreasing ${\tau}_{\mathrm{r}}$. Our resource allocation principle (Equation 12) shows that then not only the number of readout molecules needs to be raised but also the power. Clearly, improving the sensing precision comes at a cost: more copies of the components of the sensing system need to be synthesized every cell cycle, and more energy is needed to run the system. Our theory (i.e., Equation 6) makes it possible to derive the Pareto front that quantifies the tradeoff between the maximal sensing precision and the cost of making the sensing system (see Figure 5). Importantly, the design of the optimal system at the Pareto front obeys, to a good approximation, our resource allocation principle (Equation 12). This is because this principle specifies the optimal ratios of ${R}_{\mathrm{T}}$, ${X}_{\mathrm{T}}$, $\dot{w}$, and ${\tau}_{\mathrm{r}}$ given the input statistics, and these ratios are fairly insensitive to the costs of the respective resources: resources that are in excess cannot improve sensing and are thus wasted, no matter how cheap they are. This probably explains why our theory, without any fit parameters, predicts not only the integration time that allows E. coli to sense shallow gradients (Figure 4) but also the number of receptor and readout molecules (Govern and Ten Wolde, 2014a).
In our study, we have limited ourselves to a canonical push–pull motif. Yet, the work of Govern and Ten Wolde, 2014a indicates that our results hold more generally, pertaining also to systems that employ cooperativity, negative or positive feedback, or multiple layers, such as the MAPK cascade. While multiple layers and feedback change the response time, they do not make time integration more efficient in terms of readout molecules or energy (Govern and Ten Wolde, 2014a). And provided it does not increase the input correlation time (Skoge et al., 2011; Ten Wolde et al., 2016), cooperative ligand binding can reduce the sensing error per sample, but the resource requirements in terms of readout molecules and energy per sample do not change (Govern and Ten Wolde, 2014a). In all these systems, time integration requires that the history of the receptor is stored, which demands protein copies and energy.
Lastly, in this article we have studied the resource requirements for estimating the current concentration via the mechanism of time integration. However, to understand how E. coli navigates in a concentration gradient, we have to understand not only how the system filters the high-frequency ligand-binding noise via time averaging but also how, on longer timescales, the system adapts to changes in the ligand concentration (Sartori and Tu, 2011). This adaptation system also exhibits a tradeoff between accuracy, speed, and power (Lan et al., 2012; Sartori and Tu, 2015). Intriguingly, simulations indicate that the combination of sensing and adaptation allows E. coli to accurately estimate not only the current concentration but also the future ligand concentration (Becker et al., 2015). It will be interesting to see whether an optimal resource allocation principle can be formulated for systems that need to predict future ligand concentrations.
Materials and methods
Methods are described in Appendices 1–3. Appendix 1 derives the central result of our article (Equation 6). Appendix 2 derives the fundamental resources and the corresponding sensing limits (Equations 8 and 9). Appendix 3 describes how the optimal gain and integration time can be understood using ideas from filtering and information theory.
Appendix 1
Signal-to-noise ratio
Here, we provide the derivation of the central result of this article, Equation 6 of the main text. The derivation starts from the SNR, given in Equation 2. Here, ${\sigma}_{L}^{2}$ is the width of the input distribution, while ${(\delta \widehat{L})}^{2}$ is the error in the estimate of the concentration. The latter is derived from the dynamic input–output relation ${p}_{{\tau}_{\mathrm{r}}}(L)$, which is the mapping between the average receptor occupancy over the past integration time ${\tau}_{\mathrm{r}}$ and the current ligand concentration L (see Figure 2). Concretely, the error ${(\delta \widehat{L})}^{2}$ is given by Equation 1, where ${\sigma}_{{\widehat{p}}_{{\tau}_{r}}}^{2}$ is the error in the estimate of the average receptor occupancy over the past integration time ${\tau}_{\mathrm{r}}$ and ${\stackrel{~}{g}}_{L\to {p}_{{\tau}_{\mathrm{r}}}}$ is the dynamic gain, which is the slope of the dynamic input–output relation ${p}_{{\tau}_{\mathrm{r}}}(L)$. Below, we first derive the dynamic gain ${\stackrel{~}{g}}_{L\to {p}_{{\tau}_{\mathrm{r}}}}$ and then the error in the estimate of the receptor occupancy ${\sigma}_{{\widehat{p}}_{{\tau}_{r}}}^{2}$.
Dynamic input–output relation
The dynamic input–output relation $p_{\tau_{\mathrm{r}}}(L)$ is the average receptor occupancy $p_{\tau_{\mathrm{r}}}$ over the past integration time $\tau_{\mathrm{r}}$, given that the current ligand concentration is $L(t)=L$. The cell estimates $p_{\tau_{\mathrm{r}}}$ via its receptor readout system, which is a device that takes samples of the receptor: the readout molecules at time $t$ constitute samples of the ligand-binding state of the receptor at earlier sampling times $t_i$ (see Figure 2). More specifically, the cell estimates $p_{\tau_{\mathrm{r}}}$ from the number of active readout molecules $x^{*}(L(t))=x^{*}(L)$:
$$\hat{p}_{\tau_{\mathrm{r}}}=\frac{x^{*}(L)}{\overline{N}}, \tag{14}$$
where $\overline{N}$ is the average of the number of samples $N$ taken during the integration time $\tau_{\mathrm{r}}$. Hence, the dynamic input–output relation is
$$p_{\tau_{\mathrm{r}}}(L)=\frac{E\langle x^{*}\rangle_{L}}{\overline{N}}=E\langle n(t_i)\rangle_{L(t)}, \tag{15}$$
where $n(t_i)=0,1$ is the receptor occupancy at time $t_i$, $E$ denotes the expectation over the sampling times $t_i$, and $\langle\dots\rangle_{L(t)}$ denotes an average over receptor–ligand binding noise and the subensemble of ligand trajectories that each end at $L(t)$ (see Figure 2c); the quantity $\langle n(t_i)\rangle_{L(t)}$ is indeed the average receptor occupancy at time $t_i$, given that the ligand concentration at time $t$ is $L(t)=L$. Importantly, the receptor samples can also decay via the deactivation of $x^{*}$. Taking this into account, the probability that a readout molecule at time $t$ provides a sample of the receptor at an earlier time $t_i$ is $p(t_i|\mathrm{sample})=e^{-(t-t_i)/\tau_{\mathrm{r}}}/\tau_{\mathrm{r}}$ (Govern and Ten Wolde, 2014a). Averaging the receptor occupancy over the sampling times $t_i$ then yields
$$p_{\tau_{\mathrm{r}}}(L)=\int_{-\infty}^{t}dt_i\,\langle n(t_i)\rangle_{L(t)}\,\frac{e^{-(t-t_i)/\tau_{\mathrm{r}}}}{\tau_{\mathrm{r}}}. \tag{16}$$
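The exponential sampling weight $e^{-(t-t_i)/\tau_{\mathrm{r}}}/\tau_{\mathrm{r}}$ quoted above is a normalized probability density over the sample age $a=t-t_i$, and its mean is the integration time $\tau_{\mathrm{r}}$. The short Python sketch below checks both properties by numerical integration. The function name `sample_age_moments`, the midpoint rule, the cutoff, and the value of `tau_r` are assumptions made for this illustration only.

```python
import math

def sample_age_moments(tau_r, n=200_000, cutoff=40.0):
    """Midpoint-rule integrals of the sampling weight exp(-a / tau_r) / tau_r.

    `a = t - t_i` is the age of a sample; returns (normalization, mean age).
    The integration runs from a = 0 to a = cutoff * tau_r.
    """
    h = cutoff * tau_r / n
    norm = 0.0
    mean_age = 0.0
    for k in range(n):
        a = (k + 0.5) * h
        w = math.exp(-a / tau_r) / tau_r * h  # probability mass in this bin
        norm += w
        mean_age += a * w
    return norm, mean_age

# Hypothetical integration time (arbitrary time units).
norm, mean_age = sample_age_moments(tau_r=0.1)
```

In this sketch `norm` comes out close to 1 and `mean_age` close to `tau_r`, which is the sense in which the readout reports the receptor occupancy averaged over the past integration time.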
Dynamic gain
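When the current ligand concentration $L(t)$ deviates from its mean $\overline{L}$ by $\delta L(t)\equiv L(t)-\overline{L}$, then $p_{\tau_{\mathrm{r}}}$ deviates on average from its mean $p$ (the average receptor occupancy over all $\delta L(t)$) by
$$\delta p_{\tau_{\mathrm{r}}}(\delta L(t))=E\langle\delta n(t_i)\rangle_{\delta L(t)}. \tag{17}$$
Here, $E$ denotes again the expectation over the sampling times $t_i$, and $\langle\delta n(t_i)\rangle_{\delta L(t)}\equiv\langle n(t_i)\rangle_{\delta L(t)}-p$ is the average deviation in the receptor occupancy $n(t_i)$ at time $t_i$ from its mean $p$, given that the ligand concentration at time $t$ is $\delta L(t)$ (see Figure 2c). We can compute it within the linear-noise approximation (Gardiner, 2009):
$$\langle\delta n(t_i)\rangle_{\delta L(t)}=\rho_n\int_{-\infty}^{t_i}dt'\,\langle\delta L(t')\rangle_{\delta L(t)}\,e^{-(t_i-t')/\tau_{\mathrm{c}}}, \tag{18}$$
where $\rho_n=p(1-p)/(\overline{L}\tau_{\mathrm{c}})$ and $\langle\delta L(t')\rangle_{\delta L(t)}$ is the average deviation of the ligand concentration at time $t'$, given that the deviation at time $t$ is $\delta L(t)$. The latter is given by (Bowsher et al., 2013)
$$\langle\delta L(t')\rangle_{\delta L(t)}=\delta L(t)\,e^{-|t-t'|/\tau_{\mathrm{L}}}. \tag{19}$$
Combining Equations 17–19 and averaging over the sampling times $t_i$ with the weight $e^{-(t-t_i)/\tau_{\mathrm{r}}}/\tau_{\mathrm{r}}$ yields the following expression for the average change in the average receptor occupancy $p_{\tau_{\mathrm{r}}}$, given that the ligand concentration at time $t$ is $\delta L(t)$:
$$\delta p_{\tau_{\mathrm{r}}}=\frac{p(1-p)}{\overline{L}}\left(1+\frac{\tau_{\mathrm{c}}}{\tau_{\mathrm{L}}}\right)^{-1}\left(1+\frac{\tau_{\mathrm{r}}}{\tau_{\mathrm{L}}}\right)^{-1}\delta L(t).$$
Hence, the dynamic gain is
$$\tilde{g}_{L\to p_{\tau_{\mathrm{r}}}}\equiv\frac{\delta p_{\tau_{\mathrm{r}}}}{\delta L(t)}=\frac{p(1-p)}{\overline{L}}\left(1+\frac{\tau_{\mathrm{c}}}{\tau_{\mathrm{L}}}\right)^{-1}\left(1+\frac{\tau_{\mathrm{r}}}{\tau_{\mathrm{L}}}\right)^{-1}. \tag{22}$$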
The dynamic gain is the slope of the dynamic input–output relation ${p}_{{\tau}_{\mathrm{r}}}(L)$ (see Figure 2a). It yields the average change in the receptor occupancy ${p}_{{\tau}_{\mathrm{r}}}$ over the past integration time ${\tau}_{\mathrm{r}}$ when the change in the ligand concentration at time $t$ is $\delta L(t)$. It depends on all the timescales in the problem and only reduces to the static gain ${g}_{L\to p}=p(1-p)/\overline{L}$ when the integration time ${\tau}_{\mathrm{r}}$ and the receptor correlation time ${\tau}_{\mathrm{c}}$ are both much shorter than the ligand correlation time ${\tau}_{\mathrm{L}}$. The dynamic gain determines how much an error in the estimate of ${p}_{{\tau}_{\mathrm{r}}}$ propagates to the estimate of $L(t)$.
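The dependence of the dynamic gain on the three timescales can be checked numerically. The sketch below assumes that the receptor responds to past ligand deviations through the linear-response kernel $\rho_n e^{-(t_i-t')/\tau_{\mathrm{c}}}$ with $\rho_n=p(1-p)/(\overline{L}\tau_{\mathrm{c}})$, that the conditional mean of the input decays as $\delta L(t)e^{-|t-t'|/\tau_{\mathrm{L}}}$, and that samples are weighted by $e^{-(t-t_i)/\tau_{\mathrm{r}}}/\tau_{\mathrm{r}}$. Under these assumptions the double integral over $u=t-t_i$ and $s=t_i-t'$ factorizes, because $t-t'=u+s$, and the result should equal the static gain divided by $(1+\tau_{\mathrm{c}}/\tau_{\mathrm{L}})(1+\tau_{\mathrm{r}}/\tau_{\mathrm{L}})$. All function names and parameter values are hypothetical.

```python
import math

def kernel_integral(tau_k, tau_L, n=100_000, cutoff=40.0):
    """Midpoint-rule integral over x >= 0 of exp(-x/tau_k)/tau_k * exp(-x/tau_L)."""
    h = cutoff * tau_k / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += math.exp(-x / tau_k - x / tau_L)
    return total * h / tau_k

def dynamic_gain_numeric(p, L_mean, tau_c, tau_r, tau_L):
    """Static gain times the two separable time-averaging integrals."""
    static_gain = p * (1.0 - p) / L_mean
    return static_gain * kernel_integral(tau_c, tau_L) * kernel_integral(tau_r, tau_L)

def dynamic_gain_closed_form(p, L_mean, tau_c, tau_r, tau_L):
    """Static gain divided by (1 + tau_c/tau_L) * (1 + tau_r/tau_L)."""
    return p * (1.0 - p) / L_mean / ((1.0 + tau_c / tau_L) * (1.0 + tau_r / tau_L))

# Hypothetical parameter values (arbitrary units).
params = dict(p=0.5, L_mean=1.0, tau_c=0.01, tau_r=0.1, tau_L=1.0)
g_num = dynamic_gain_numeric(**params)
g_exact = dynamic_gain_closed_form(**params)
```

In this sketch `g_num` and `g_exact` agree to numerical precision, and both approach the static gain `p * (1 - p) / L_mean` when `tau_L` is made much longer than `tau_c` and `tau_r`.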
Error in receptor occupancy
We can derive the variance in the estimate of the receptor occupancy over the past integration time $\tau_{\mathrm{r}}$, $\sigma^{2}_{\hat{p}_{\tau_{\mathrm{r}}}}$, directly from Equation 14 for the system in the irreversible limit (Malaguti and Ten Wolde, 2019). While this derivation is illuminating, it is also lengthy. For the fully reversible system studied here, we follow a simpler route. Since the average number of samples $\overline{N}$ over the integration time $\tau_{\mathrm{r}}$ is constant, it follows from Equation 14 that
$$\sigma^{2}_{\hat{p}_{\tau_{\mathrm{r}}}}=\frac{\sigma^{2}_{x^{*}|L}}{\overline{N}^{2}}, \tag{24}$$
where $\sigma^{2}_{x^{*}|L}$ is the variance in the number of phosphorylated readout molecules, conditioned on the signal at time $t$ being $L(t)=L$. The conditional variance (Tostevin and ten Wolde, 2010)
$$\sigma^{2}_{x^{*}|L}=\sigma^{2}_{x^{*}}-\tilde{g}^{2}_{L\to x^{*}}\sigma^{2}_{L} \tag{25}$$
is the full variance $\sigma^{2}_{x^{*}}$ of $x^{*}$ minus the variance $\tilde{g}^{2}_{L\to x^{*}}\sigma^{2}_{L}$ that is due to the signal variations, given by the square of the dynamic gain $\tilde{g}_{L\to x^{*}}$ from $L$ to $x^{*}$ times the signal variance $\sigma^{2}_{L}$.
The full variance of the readout ${\sigma}_{{x}^{*}}^{2}$ in Equation 25 can be obtained from the linear-noise approximation (Gardiner, 2009), see Malaguti and Ten Wolde, 2019:
In this expression, $\mu=\tau_{\mathrm{c}}^{-1}=k_{1}\overline{L}+k_{2}$ is the inverse of the receptor correlation time $\tau_{\mathrm{c}}$; $p=\overline{RL}/R_{\mathrm{T}}=k_{1}\overline{L}/(k_{2}+k_{1}\overline{L})=k_{1}\overline{L}\tau_{\mathrm{c}}$ is the probability that a receptor is bound to ligand; $\rho=R_{\mathrm{T}}k_{1}(1-p)=p(1-p)R_{\mathrm{T}}\mu/\overline{L}$; $\mu'=\tau_{\mathrm{r}}^{-1}=(k_{\mathrm{f}}+k_{-\mathrm{f}})pR_{\mathrm{T}}+k_{\mathrm{r}}+k_{-\mathrm{r}}$ is the inverse of the integration time $\tau_{\mathrm{r}}$; $f=\overline{x^{*}}/X_{\mathrm{T}}=(k_{\mathrm{f}}pR_{\mathrm{T}}+k_{-\mathrm{r}})\tau_{\mathrm{r}}$ is the fraction of phosphorylated readout; and $\rho'=k_{\mathrm{f}}X_{\mathrm{T}}(1-f)-k_{-\mathrm{f}}X_{\mathrm{T}}f=\dot{n}/(pR_{\mathrm{T}})$ is the total flux $\dot{n}$ around the cycle of readout activation and deactivation divided by the total number $pR_{\mathrm{T}}$ of ligand-bound receptors: it is the rate at which each receptor is sampled, be it ligand bound or not. For what follows below, we note that the quality parameter $q=(e^{\mathrm{\Delta}\mu_{1}}-1)(e^{\mathrm{\Delta}\mu_{2}}-1)/(e^{\mathrm{\Delta}\mu}-1)=\rho'pR_{\mathrm{T}}\tau_{\mathrm{r}}/(f(1-f)X_{\mathrm{T}})=\dot{n}\tau_{\mathrm{r}}/(f(1-f)X_{\mathrm{T}})$.
To get $\sigma^{2}_{\hat{p}_{\tau_{\mathrm{r}}}}$ from Equations 24 and 25, we need not only $\sigma^{2}_{x^{*}}$ (Equation 26) but also the average number of samples $\overline{N}$ and the dynamic gain $\tilde{g}_{L\to x^{*}}$. The average number of samples taken during the integration time $\tau_{\mathrm{r}}$ is $\overline{N}=\dot{n}\tau_{\mathrm{r}}/p=f(1-f)X_{\mathrm{T}}q/p=\rho'R_{\mathrm{T}}/\mu'$, and the effective number of reliable samples is $\overline{N}_{\mathrm{eff}}=q\overline{N}$. Since $p_{\tau_{\mathrm{r}}}(L)=E\langle x^{*}\rangle_{L}/\overline{N}$, where $E\langle x^{*}\rangle_{L}$ is the average number of active readout molecules for a given input $L(t)=L$ and $\overline{N}$ is a constant independent of $L$, it follows that
$$\tilde{g}_{L\to x^{*}}=\overline{N}\,\tilde{g}_{L\to p_{\tau_{\mathrm{r}}}}, \tag{27}$$
with $\tilde{g}_{L\to p_{\tau_{\mathrm{r}}}}$ the dynamic gain from $L$ to $p_{\tau_{\mathrm{r}}}$, given by Equation 22. Equation 27 can be verified via another route that does not rely on the sampling framework because we also know that $\tilde{g}_{L\to x^{*}}=\sigma^{2}_{L,x^{*}}/\sigma^{2}_{L}$ (Tostevin and ten Wolde, 2010), where the covariance $\sigma^{2}_{L,x^{*}}$ can be obtained from the linear-noise approximation (Malaguti and Ten Wolde, 2019; Gardiner, 2009). Combining Equations 24–27 yields
This can be rewritten using the expression for the fraction of independent samples, which, assuming that ${\tau}_{\mathrm{r}}\gg {\tau}_{\mathrm{c}}$, is ${f}_{I}=1/(1+2{\tau}_{\mathrm{c}}/\mathrm{\Delta})$, with $\mathrm{\Delta}=2{\tau}_{\mathrm{r}}{R}_{\mathrm{T}}/{\overline{N}}_{\text{eff}}$ the effective spacing between the samples (Govern and Ten Wolde, 2014a):
Here, ${\sigma}_{{\widehat{p}}_{{\tau}_{\mathrm{r}}}}^{2,\mathrm{samp}}$ is the sampling error in the estimate of ${p}_{{\tau}_{\mathrm{r}}}$ (Malaguti and Ten Wolde, 2019); it is a statistical error, which arises from the finite cellular resources available to sample the state of the receptor: protein copies, time, and energy (see Figure 2b). The other contribution, ${\sigma}_{{\widehat{p}}_{{\tau}_{\mathrm{r}}}}^{2,\mathrm{dyn}}$, is the dynamical error in the estimate of ${p}_{{\tau}_{\mathrm{r}}}$ (Malaguti and Ten Wolde, 2019); it is a systematic error that arises from the input dynamics and depends only on the average receptor occupancy and the timescales of the input, receptor, and readout (see Figure 2c); it depends neither on the number of protein copies nor on the energy necessary to sample the receptor.
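The sample-counting relations used in this derivation can be collected in a few lines of Python. The sketch below evaluates $\overline{N}=\dot{n}\tau_{\mathrm{r}}/p$, $\overline{N}_{\mathrm{eff}}=q\overline{N}$, $\mathrm{\Delta}=2\tau_{\mathrm{r}}R_{\mathrm{T}}/\overline{N}_{\mathrm{eff}}$, and $f_I=1/(1+2\tau_{\mathrm{c}}/\mathrm{\Delta})$ as given in the text (the last under the stated assumption $\tau_{\mathrm{r}}\gg\tau_{\mathrm{c}}$). The function name `sampling_statistics` and all numbers are hypothetical and serve only as an illustration.

```python
def sampling_statistics(n_dot, tau_r, tau_c, p, q, R_T):
    """Sample counts and spacing from the relations quoted in the text."""
    N_bar = n_dot * tau_r / p                 # average number of samples
    N_eff = q * N_bar                         # effective number of reliable samples
    delta = 2.0 * tau_r * R_T / N_eff         # effective spacing between samples
    f_I = 1.0 / (1.0 + 2.0 * tau_c / delta)   # fraction of independent samples
    return {"N_bar": N_bar, "N_eff": N_eff, "delta": delta,
            "f_I": f_I, "N_I": f_I * N_eff}

# Hypothetical numbers (arbitrary units): spacing equal to 2 * tau_c.
moderate = sampling_statistics(n_dot=5e3, tau_r=0.1, tau_c=0.01, p=0.5, q=1.0, R_T=100.0)
# Same system sampled a million times faster.
oversampled = sampling_statistics(n_dot=5e9, tau_r=0.1, tau_c=0.01, p=0.5, q=1.0, R_T=100.0)
```

With these assumed values, half of the samples are independent in the first case. In the oversampled case the number of independent samples saturates near $R_{\mathrm{T}}\tau_{\mathrm{r}}/\tau_{\mathrm{c}}=1000$: taking samples faster than the receptor decorrelates adds redundant samples only.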
Final result: SNR
Combining Equations 29 and 22 with Equation 3 yields the principal result of our work (Equation 6) of the main text.
Appendix 2
Fundamental resources
To identify the fundamental resources limiting the sensing accuracy and derive the corresponding sensing limits (Equations 8 and 9), it is helpful to rewrite the SNR in terms of collective variables that illuminate the cellular resources. For that, we start from Equation 6 of the main text, split the first term on the right-hand side, and exploit the expression for the effective number of independent samples ${\overline{N}}_{\mathrm{I}}=1/(1+2{\tau}_{\mathrm{c}}/\mathrm{\Delta}){\overline{N}}_{\mathrm{eff}}$ with $\mathrm{\Delta}=2{\tau}_{\mathrm{r}}{R}_{\mathrm{T}}/{\overline{N}}_{\mathrm{eff}}$. We then sum up the last two terms on the right-hand side and use that ${\overline{N}}_{\mathrm{eff}}=q\overline{N}=q\dot{n}{\tau}_{\mathrm{r}}/p$:
The second term in between the square brackets describes the contribution to the sensing error that comes from the stochasticity in the concentration measurements at the receptor level. The first term in between the square brackets, the coding noise, describes the contribution that arises in storing these measurements into the readout molecules.
From Equation 30, the fundamental resources and the corresponding sensing limits (Equations 8 and 9) can be derived. Specifically, when the number of receptors and their integration time are limiting, the coding noise in Equation 30 is zero; exploiting that typically ${\tau}_{\mathrm{c}}\ll {\tau}_{\mathrm{r}},{\tau}_{\mathrm{L}}$ and that the contribution to the sensing error from the receptor input noise is minimized for $p\to 1/2$, this yields Equation 8 with $h={R}_{\mathrm{T}}{\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}}$. When the number of readout molecules ${X}_{\mathrm{T}}$ is limiting, the receptor input noise is zero and $q\to 1$; noting that $\dot{n}=f(1-f){X}_{\mathrm{T}}q/{\tau}_{\mathrm{r}}$ and that the contribution from the coding noise is minimized when $f\to 1/2$ and $p\to 0$, and again exploiting that ${\tau}_{\mathrm{c}}\ll {\tau}_{\mathrm{r}},{\tau}_{\mathrm{L}}$, this yields Equation 8 with $h={X}_{\mathrm{T}}$. When the power $\dot{w}=\dot{n}\mathrm{\Delta}\mu $ is limiting, then the receptor input noise is (again) zero. The coding noise is minimized for a given power constraint $\dot{w}$ when $\mathrm{\Delta}{\mu}_{1}=\mathrm{\Delta}{\mu}_{2}=\mathrm{\Delta}\mu /2$, but two regimes can be distinguished based on the total free-energy drop $\mathrm{\Delta}\mu $. When $\mathrm{\Delta}\mu >4{k}_{\mathrm{B}}T$, the system is in the irreversible regime and $q\to 1$ (see Equation 7); Equation 30 shows that the error is then bounded by Equation 8 with $h=\dot{w}{\tau}_{\mathrm{r}}/(\mathrm{\Delta}\mu /4)$, using ${\tau}_{\mathrm{c}}\ll {\tau}_{\mathrm{r}},{\tau}_{\mathrm{L}}$ and $p\to 0$. Yet, the sensing error is minimized in the quasi-equilibrium regime, where $\mathrm{\Delta}{\mu}_{1}=\mathrm{\Delta}{\mu}_{2}=\mathrm{\Delta}\mu /2\to 0$ and $q\to \beta \mathrm{\Delta}\mu /4$, yielding Equation 8 with $h=\beta \dot{w}{\tau}_{\mathrm{r}}$.
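The two regimes of the quality parameter $q$ can be illustrated numerically. The sketch below evaluates $q=(e^{\mathrm{\Delta}\mu_{1}}-1)(e^{\mathrm{\Delta}\mu_{2}}-1)/(e^{\mathrm{\Delta}\mu}-1)$ under two assumptions that are made here for illustration: free energies are measured in units of $k_{\mathrm{B}}T$, and the total drop is $\mathrm{\Delta}\mu=\mathrm{\Delta}\mu_{1}+\mathrm{\Delta}\mu_{2}$. The function name `quality_parameter` and the chosen values are hypothetical.

```python
import math

def quality_parameter(dmu1, dmu2):
    """q = (e^dmu1 - 1)(e^dmu2 - 1) / (e^(dmu1 + dmu2) - 1), energies in k_B T.

    math.expm1 keeps the result accurate when the free-energy drops are small.
    """
    return math.expm1(dmu1) * math.expm1(dmu2) / math.expm1(dmu1 + dmu2)

q_irreversible = quality_parameter(20.0, 20.0)   # strongly driven cycle
q_four_kT = quality_parameter(2.0, 2.0)          # total drop of 4 k_B T
q_quasi_eq = quality_parameter(0.005, 0.005)     # total drop of 0.01 k_B T
```

For the symmetric choice $\mathrm{\Delta}\mu_{1}=\mathrm{\Delta}\mu_{2}=\mathrm{\Delta}\mu/2$ the expression simplifies algebraically to $q=\tanh(\mathrm{\Delta}\mu/4)$ in these units, which makes both limits explicit: $q\to 1$ for a large drive and $q\to\mathrm{\Delta}\mu/4$ close to equilibrium, with the crossover around $4{k}_{\mathrm{B}}T$.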
Appendix 3
The optimal gain and optimal integration time
The theory of the main text (Equation 6) is based on the idea that the cell uses its push–pull network to estimate the receptor occupancy ${p}_{{\tau}_{\mathrm{r}}}(L)$ from which the current ligand concentration L is then inferred by inverting the dynamic input–output relation ${p}_{{\tau}_{\mathrm{r}}}(L)$. Yet, as we show here, this framework is equivalent to the idea that the cell estimates the concentration from the output ${x}^{*}$, using the dynamic input–output relation ${x}^{*}(L)$. Here, we use this observation to analyze our system using ideas from filtering and information theory. But first we demonstrate this correspondence.
To show that estimating the concentration from $\hat{p}_{\tau_{\mathrm{r}}}$ is equivalent to estimating it from $x^{*}$, we first note that because the average number of samples $\overline{N}$ is constant, $\sigma^{2}_{x^{*}|L}=\sigma^{2}_{\hat{p}_{\tau_{\mathrm{r}}}}\overline{N}^{2}$ while the gain from $L$ to $x^{*}$ is $\tilde{g}^{2}_{L\to x^{*}}=\tilde{g}^{2}_{L\to p_{\tau_{\mathrm{r}}}}\overline{N}^{2}$. Consequently, the absolute error $(\delta\hat{L})^{2}$ in estimating the concentration via $x^{*}$, $(\delta\hat{L})^{2}=\sigma^{2}_{x^{*}|L}/\tilde{g}^{2}_{L\to x^{*}}$, is the same as that of Equation 1: because the instantaneous number of active readout molecules $x^{*}$ reflects the average receptor occupancy $p_{\tau_{\mathrm{r}}}$ over the past $\tau_{\mathrm{r}}$, estimating the ligand concentration from $x^{*}$ is no different from inferring it from the average receptor occupancy $\hat{p}_{\tau_{\mathrm{r}}}=x^{*}/\overline{N}$.
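Written out as a single chain, the equivalence is just a cancellation of the constant factor $\overline{N}^{2}$. The following lines are a sketch that uses only the two relations stated in the preceding paragraph; the final expression has the form that the text attributes to Equation 1.

```latex
(\delta\hat{L})^{2}
  = \frac{\sigma^{2}_{x^{*}|L}}{\tilde{g}^{2}_{L\to x^{*}}}
  = \frac{\overline{N}^{2}\,\sigma^{2}_{\hat{p}_{\tau_{\mathrm{r}}}}}
         {\overline{N}^{2}\,\tilde{g}^{2}_{L\to p_{\tau_{\mathrm{r}}}}}
  = \frac{\sigma^{2}_{\hat{p}_{\tau_{\mathrm{r}}}}}
         {\tilde{g}^{2}_{L\to p_{\tau_{\mathrm{r}}}}}
```

The cancellation relies on $\overline{N}$ being a constant that is independent of $L$; it would not hold if the average number of samples itself varied with the input.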
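To make the connection with information and filtering theory, we note that in our Gaussian model the conditional distribution of $\delta x^{*}$ given $\delta L$ is given by (Tostevin and ten Wolde, 2010)
$$p(\delta x^{*}|\delta L)=\frac{1}{\sqrt{2\pi\sigma^{2}_{x^{*}|L}}}\exp\left[-\frac{(\delta x^{*}-\tilde{g}_{L\to x^{*}}\delta L)^{2}}{2\sigma^{2}_{x^{*}|L}}\right],$$
where $\tilde{g}_{L\to x^{*}}\delta L=\langle\delta x^{*}\rangle_{\delta L}$ is the average value of $\delta x^{*}$ given that $\delta L(t)=\delta L$, and $\sigma^{2}_{x^{*}|L}$ is the variance of this distribution (see also Equation 25).
The relative error, the inverse of the SNR (see Equation 2), is
$$\mathrm{SNR}^{-1}=\frac{(\delta\hat{L})^{2}}{\sigma^{2}_{L}}=\frac{\sigma^{2}_{x^{*}|L}}{\tilde{g}^{2}_{L\to x^{*}}\sigma^{2}_{L}}.$$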
As mentioned in the main text, the SNR also yields the mutual information $I({x}^{*};L)=\frac{1}{2}\mathrm{ln}(1+\mathrm{SNR})$ between the input $L$ and output ${x}^{*}$ (Tostevin and ten Wolde, 2010).
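As a small worked example of these two relations, the Python sketch below computes the SNR from a gain, an input variance, and a conditional output variance, and converts it into a mutual information. The function names `snr` and `mutual_information_bits` and the numerical values are hypothetical; the conversion from nats to bits is added here for readability and is not part of the original text.

```python
import math

def snr(gain, var_L, var_cond):
    """SNR = gain^2 * var_L / var_cond, the inverse of the relative error."""
    return gain ** 2 * var_L / var_cond

def mutual_information_bits(snr_value):
    """I = 0.5 * ln(1 + SNR) in nats, converted to bits."""
    return 0.5 * math.log1p(snr_value) / math.log(2.0)

# Hypothetical values: gain 2, input variance 3, conditional variance 4.
example_snr = snr(gain=2.0, var_L=3.0, var_cond=4.0)
example_bits = mutual_information_bits(example_snr)
```

With these assumed numbers the SNR is 3 and the mutual information is one bit, which illustrates that the information grows only logarithmically with the SNR.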
The notion of an optimal integration time or optimal dynamic gain is well known from filtering and information theory (Andrews et al., 2006; Hinczewski and Thirumalai, 2014; Becker et al., 2015; Monti et al., 2018a; Monti et al., 2018b; Mora and Nemenman, 2019). To elucidate the optimal gain and integration time in our system, we combine the above equation with Equations 25 and 26 to write the relative error as
where ${g}_{RL\to {x}^{*}}={\rho}^{\prime}/{\mu}^{\prime}$ is the static gain from $RL$ to ${x}^{*}$. Written in this form, the tradeoffs in maximizing the mutual information $I({x}^{*};L)$ (and minimizing the relative error in estimating the concentration) become apparent: increasing the dynamic gain $\tilde{g}_{L\to {x}^{*}}$ by decreasing the integration time ${\tau}_{\mathrm{r}}$ raises the slope of the input–output relation ${x}^{*}(L)$, which helps to lift the transmitted signal above the intrinsic binomial switching noise of the readout, $f(1-f){X}_{\mathrm{T}}$. Also, the dynamical error is minimized by minimizing ${\tau}_{\mathrm{r}}$ and maximizing $\tilde{g}_{L\to {x}^{*}}$. Yet, for the second term, which describes how noise in the input signal arising from receptor switching, $p(1-p){R}_{\mathrm{T}}$, is propagated to the output ${x}^{*}$, there exists an optimal integration time that minimizes this term: while decreasing ${\tau}_{\mathrm{r}}$ increases the dynamic gain, which helps to raise the signal above the noise, it also impedes time averaging of this switching noise, described by the factor $1/(1+{\tau}_{\mathrm{r}}/{\tau}_{\mathrm{c}})$.
The mutual information is $I({x}^{*};L)=H({x}^{*})-H({x}^{*}|L)$, with $H({x}^{*})$ the entropy of the marginal output distribution and $H({x}^{*}|L)$ the entropy of the output distribution conditioned on the input. Hence, information theory shows that in the weak-noise limit, information transmission is optimal when the entropy of the output distribution is maximized (Laughlin, 1981; Tkacik et al., 2008). Our system obeys this principle. Since the dynamic gain $\tilde{g}_{L\to {x}^{*}}=\rho {\rho}^{\prime}\tau_{\mathrm{L}}^{2}{\tau}_{\mathrm{c}}{\tau}_{\mathrm{r}}/[({\tau}_{\mathrm{c}}+{\tau}_{\mathrm{L}})({\tau}_{\mathrm{r}}+{\tau}_{\mathrm{L}})]\propto {R}_{\mathrm{T}}{X}_{\mathrm{T}}$, the amplification of the signal rises with ${R}_{\mathrm{T}}$ and ${X}_{\mathrm{T}}$. Since the standard deviation of the noise added to the transmitted signal coming from the stochastic receptor and readout activation scales with $\sqrt{{R}_{\mathrm{T}}}$ and $\sqrt{{X}_{\mathrm{T}}}$, respectively, it is clear that the SNR increases with $\sqrt{{R}_{\mathrm{T}}}$ and $\sqrt{{X}_{\mathrm{T}}}$. In the limit that ${R}_{\mathrm{T}},{X}_{\mathrm{T}}\to \mathrm{\infty}$, the relative error ${\mathrm{SNR}}^{-1}$ is only set by the dynamical error, which can be reduced to zero by ${\tau}_{\mathrm{r}}\to 0$, exploiting that typically ${\tau}_{\mathrm{c}}\ll {\tau}_{\mathrm{L}}$. This is the weak-noise limit in which the mutual information $I({x}^{*};L)$ is maximized by maximizing the entropy of the output distribution $H({x}^{*})$. Indeed, ${\tau}_{\mathrm{r}}\to 0$ corresponds to maximizing the gain, which maximizes the width of the output distribution, in this limit equal to ${\sigma}_{{x}^{*}}^{2}=\tilde{g}_{L\to {x}^{*}}^{2}{\sigma}_{L}^{2}$ (see Equation 25), and thereby the entropy of the output distribution $H({x}^{*})=\frac{1}{2}\mathrm{ln}(2\pi e{\sigma}_{{x}^{*}}^{2})$.
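For a Gaussian model the entropy decomposition and the SNR formula for the mutual information give the same number, which the following Python sketch checks. It assumes that the total output variance is the sum of the transmitted signal variance $\tilde{g}_{L\to {x}^{*}}^{2}{\sigma}_{L}^{2}$ and the conditional variance, as implied by Equation 25; the function names and numerical values are hypothetical.

```python
import math

def gaussian_entropy(variance):
    """Differential entropy (in nats) of a Gaussian with the given variance."""
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)

def information_from_entropies(gain, var_L, var_cond):
    """I = H(x*) - H(x*|L) with var(x*) = gain^2 * var_L + var_cond."""
    var_out = gain ** 2 * var_L + var_cond
    return gaussian_entropy(var_out) - gaussian_entropy(var_cond)

def information_from_snr(gain, var_L, var_cond):
    """I = 0.5 * ln(1 + SNR) with SNR = gain^2 * var_L / var_cond."""
    return 0.5 * math.log1p(gain ** 2 * var_L / var_cond)

# Hypothetical values: the same noise level at a low and at a high gain.
low_gain = information_from_entropies(gain=1.0, var_L=2.0, var_cond=0.5)
high_gain = information_from_entropies(gain=3.0, var_L=2.0, var_cond=0.5)
check = information_from_snr(gain=3.0, var_L=2.0, var_cond=0.5)
```

In this sketch `high_gain` equals `check` up to rounding, and it exceeds `low_gain`: at fixed conditional variance, a larger gain widens the output distribution and thereby raises the transmitted information.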
Finally, we note that our Gaussian model is linear such that the central control parameter, besides protein copies and energy, is the integration time or the dynamic gain, which sets the slope of the linear input–output relation. While Wiener–Kolmogorov and Kalman filtering are exact only for these Gaussian models, information theory also applies to nonlinear systems with non-Gaussian statistics. It has been used to show that neuronal systems (Laughlin, 1981; Brenner et al., 2000; Fairhall et al., 2001; Nemenman et al., 2008; Tkacik et al., 2010), signaling and gene networks (Segall et al., 1986; Tkacik et al., 2008; Tkačik and Walczak, 2011; Nemenman, 2012; Dubuis et al., 2013), and circadian systems (Monti and Wolde, 2016; Monti et al., 2018a) can maximize information transmission by optimizing the shape of the input–output relation (Laughlin, 1981; Brenner et al., 2000; Fairhall et al., 2001; Tkacik et al., 2008; Monti et al., 2018a); by desensitization, that is, adapting the output to the mean input via incoherent feedforward or negative feedback (Segall et al., 1986); by gain control, that is, adapting the output to the variance of the input by capitalizing on a steep response function and temporal correlations in the input (Nemenman, 2012); by removing coding redundancy via temporal decorrelation (Nemenman et al., 2008); by optimizing the tiling of the output space via the topology of the network (Tkačik and Walczak, 2011; Dubuis et al., 2013); or by exploiting cross-correlations between the signals (Tkacik et al., 2010; Monti and Wolde, 2016).
Data availability
All data generated or analysed during this study are included in the manuscript and supporting files.
References

Introduction to Systems Biology: Design Principles of Biological Networks. Boca Raton, FL: CRC Press. https://doi.org/10.1016/j.mbs.2008.07.002

Optimal noise filtering in the chemotactic response of Escherichia coli. PLOS Computational Biology 2:e154. https://doi.org/10.1371/journal.pcbi.0020154

Optimal prediction by cellular signaling networks. Physical Review Letters 115:258103. https://doi.org/10.1103/PhysRevLett.115.258103

Physics of chemoreception. Biophysical Journal 20:193–219. https://doi.org/10.1016/S0006-3495(77)85544-6

Physical limits to biochemical signaling. PNAS 102:10040–10045. https://doi.org/10.1073/pnas.0504321102

The fidelity of dynamic signaling by noisy biomolecular networks. PLOS Computational Biology 9:e1002965. https://doi.org/10.1371/journal.pcbi.1002965

Bicoid gradient formation mechanism and dynamics revealed by protein lifetime analysis. Molecular Systems Biology 14:e8355. https://doi.org/10.15252/msb.20188355

Maximum likelihood and the single receptor. Physical Review Letters 103:158101. https://doi.org/10.1103/PhysRevLett.103.158101

Extrapolation, Interpolation, and Smoothing of Stationary Time Series: With Engineering Applications. MIT Press.

Fundamental limits to collective concentration sensing in cell populations. Physical Review Letters 118:078101. https://doi.org/10.1103/PhysRevLett.118.078101

Signaling noise enhances chemotactic drift of E. coli. Physical Review Letters 109:148101. https://doi.org/10.1103/PhysRevLett.109.148101

Stochastic Methods: A Handbook for the Natural and Social Sciences. Berlin: Springer-Verlag.

Fundamental limits on sensing chemical concentrations with linear biochemical networks. Physical Review Letters 109:218103. https://doi.org/10.1103/PhysRevLett.109.218103

Energy dissipation and noise correlations in biochemical sensing. Physical Review Letters 113:258102. https://doi.org/10.1103/PhysRevLett.113.258102

Physical limits on cellular sensing of spatial gradients. Physical Review Letters 105:048104. https://doi.org/10.1103/PhysRevLett.105.048104

Quantitative modeling of Escherichia coli chemotactic motion in environments varying in space and time. PLOS Computational Biology 6:e1000735. https://doi.org/10.1371/journal.pcbi.1000735

The Berg–Purcell limit revisited. Biophysical Journal 106:976–985. https://doi.org/10.1016/j.bpj.2013.12.030

A new approach to linear filtering and prediction problems. Journal of Basic Engineering 82:35–45. https://doi.org/10.1115/1.3662552

Probability theory and mathematical statistics. In: Watanabe S, Prokhorov J. V, editors. Selected Works of A. N. Kolmogorov. Netherlands: Springer Science & Business Media. pp. 8–14. https://doi.org/10.1007/BFb0078455

The energy–speed–accuracy trade-off in sensory adaptation. Nature Physics 8:422–428. https://doi.org/10.1038/nphys2276

Thermodynamics of statistical inference by cells. Physical Review Letters 113:148103. https://doi.org/10.1103/PhysRevLett.113.148103

A simple coding procedure enhances a neuron's information capacity. Zeitschrift Für Naturforschung C 36:910–912. https://doi.org/10.1515/znc-1981-9-1040

Cellular stoichiometry of the components of the chemotaxis signaling complex. Journal of Bacteriology 186:3687–3694. https://doi.org/10.1128/JB.186.12.3687-3694.2004

Feedback between motion and sensation provides nonlinear boost in run-and-tumble navigation. PLOS Computational Biology 13:e1005429. https://doi.org/10.1371/journal.pcbi.1005429

Energetic costs of cellular computation. PNAS 109:17978–17982. https://doi.org/10.1073/pnas.1207814109

Optimal entrainment of circadian clocks in the presence of noise. Physical Review E 97:032405. https://doi.org/10.1103/PhysRevE.97.032405

Robustness of clocks to input noise. Physical Review Letters 121:078101. https://doi.org/10.1103/PhysRevLett.121.078101

The accuracy of telling time via oscillatory signals. Physical Biology 13:035005–035014. https://doi.org/10.1088/1478-3975/13/3/035005

Physical limit to concentration sensing in a changing environment. Physical Review Letters 123:198101. https://doi.org/10.1103/PhysRevLett.123.198101

Limits of sensing temporal concentration changes by single cells. Physical Review Letters 104:248101. https://doi.org/10.1103/PhysRevLett.104.248101

Neural coding of natural stimuli: information at submillisecond resolution. PLOS Computational Biology 4:e1000025. https://doi.org/10.1371/journal.pcbi.1000025

Gain control in molecular information processing: lessons from neuroscience. Physical Biology 9:026003–026008. https://doi.org/10.1088/1478-3975/9/2/026003

Thermodynamics of computational copying in biochemical systems. Physical Review X 7:021004. https://doi.org/10.1103/PhysRevX.7.021004

Receptor noise and directional sensing in eukaryotic chemotaxis. Physical Review Letters 100:228101. https://doi.org/10.1103/PhysRevLett.100.228101

Noise filtering strategies in adaptive biochemical signaling networks: application to E. coli chemotaxis. Journal of Statistical Physics 142:1206–1217. https://doi.org/10.1007/s10955-011-0169-z

Free energy cost of reducing noise while maintaining a high sensitivity. Physical Review Letters 115:118102. https://doi.org/10.1103/PhysRevLett.115.118102

Dynamics of cooperativity in chemical sensing among cell-surface receptors. Physical Review Letters 107:178101. https://doi.org/10.1103/PhysRevLett.107.178101

Fundamental limits to cellular sensing. Journal of Statistical Physics 162:1395–1424. https://doi.org/10.1007/s10955-015-1440-5

Information transmission in genetic regulatory networks: a review. Journal of Physics: Condensed Matter 23:153102. https://doi.org/10.1088/0953-8984/23/15/153102

Mutual information between input and output trajectories of biochemical networks. Physical Review Letters 102:218101. https://doi.org/10.1103/PhysRevLett.102.218101

Mutual information in time-varying biochemical systems. Physical Review E 81:061917. https://doi.org/10.1103/PhysRevE.81.061917

Physical responses of bacterial chemoreceptors. Journal of Molecular Biology 366:1416–1423. https://doi.org/10.1016/j.jmb.2006.12.024

Quantifying noise levels of intercellular signals. Physical Review E 75:061905. https://doi.org/10.1103/PhysRevE.75.061905
Decision letter

Raymond E Goldstein, Reviewing Editor; University of Cambridge, United Kingdom

Detlef Weigel, Senior Editor; Max Planck Institute for Developmental Biology, Germany
In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.
Acceptance summary:
Malaguti and ten Wolde combine analysis of sensing limits for time-varying signals with previous work on resource allocation to minimize the costs of sensing. The work makes a solid contribution towards understanding the principles of resource allocation, and the conclusion that E. coli chemotaxis is optimized for shallow gradients should stimulate further discussion and work.
Decision letter after peer review:
Thank you for submitting your article "Theory for the optimal detection of time-varying signals in cellular sensing systems" for consideration by eLife. Your article has been reviewed by two peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Detlef Weigel as the Senior Editor. The reviewers have opted to remain anonymous.
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
Summary:
Malaguti and ten Wolde combine the analysis of Mora-Nemenman (PRL 2019) on Berg-Purcell-type sensing limits for time-varying signals with previous work of one of the authors (ten Wolde, PNAS 2014) on resource allocation to minimize the costs of sensing. The previous work discussed trade-offs for the costs of sensing a constant signal. Here it is extended to time-varying signals (modeled as colored-noise Gaussian). Although the authors exaggerate the novelty of their analysis, they make a solid contribution towards understanding the principles of resource allocation. The conclusion that E. coli chemotaxis is optimized for shallow gradients seems reasonable and should stimulate further discussion and work. Despite flaws in establishing context, the paper deserves publication.
Revisions for this paper:
While enthusiastic about the results in the manuscript, the reviewers found that there are issues involving claims of novelty, and the overall presentation of mathematical results that will need considerable revision.
1) The novelty of their analysis is exaggerated.
The basic problem of optimally estimating the state of a time-varying signal probed by noisy measurements is textbook material for engineers. Here, because everything is linearized about an operating point and because noise is approximated as Gaussian, there is an exact analytic theory going back to Kalman and Bucy in 1961. The result underlies the Mora-Nemenman analysis (although they, too, did not reference that work for the Gaussian approximation used for their concrete results) and that done here. In engineering, the general problem of optimally estimating a stochastic signal via noisy measurements was already considered by Kolmogorov and Wiener in the 1940s and formulated in the time domain, for linear systems driven by white noise (the approximation used here), by Kalman and Bucy.
It is a weakness that the connections are not mentioned specifically by referencing. Appealing to known results not only underlines how different disciplines often need to tackle the same problems, it allows for the use of textbook results and can shorten a paper. As a corollary, authors who map their problem onto known results should not be penalized for doing so, when the application is new and important (as here).
2) Specific sentences have language that seems exaggerated, in the light of historical work on filtering theory:
"Our theory is based on a new concept, the dynamic input-output relation p_{τr}(L)."
 Dynamical models (with dynamical input-output relations) are a central element of filtering theory.
"Our theory reveals that the sensing error can be decomposed into two terms [sampling (sensing) error and dynamical error]."
 The framework of filtering theory assumes noisy measurements of stochastic signals.
"Our theory illuminates how the optimal design depends on the timescale of the input τ_{L}."
 The statement is true but should be framed in the context of other work, including in systems biology / biophysics, which comes to the same conclusion. For example, the work of Laughlin in 1981, and later Nemenman and Bialek, show that the input-output "gain function" should be adapted to the statistics of the input (including time scales of variation). Bialek's 2012 Biophysics book discusses many examples.
3) In its present form, the paper will be essentially unreadable by the vast majority of the eLife readership; the writing style is more appropriate to the Physical Review than to eLife. Indeed, much of the paper is written like a technical note that builds on previous work (Berg/Purcell, etc.) without sufficient explanation for the general reader. This criticism extends to the explanations of the underlying model, the lack of definition of terminology, and the notation.
Here are but a few examples of the points above. The authors are encouraged to rethink the entire presentation de novo for maximum improvement.
a) The definition of a "push-pull" network should be given in the paper.
b) For the purpose of readability by a diverse audience some care should be taken to define technical terms explicitly (e.g. "Markovian"), and to explain various statements more clearly. Appearing so early in the paper, such condensed statements will be off-putting to the general reader.
c) The whole notation of "inverting the mapping" that is central to the theory is not well-explained.
d) Regarding notation: consider, for example, the caption of Figure 2 and Equation 1, in which there is the quantity $\sigma^{2}_{\hat{p}_{\tau_r L}}$. This is simply too complicated, and its meaning will be utterly opaque to the general reader.
4) In many ways it seems that if the authors presaged the development of the theory by indicating the run-and-tumble context early on it would help the reader to understand the precise motivations behind their lengthy calculations, and to give an idea of the time scales.
https://doi.org/10.7554/eLife.62574.sa1
Author response
Revisions for this paper:
While enthusiastic about the results in the manuscript, the reviewers found that there are issues involving claims of novelty, and the overall presentation of mathematical results that will need considerable revision.
1) The novelty of their analysis is exaggerated.
The basic problem of optimally estimating the state of a time-varying signal probed by noisy measurements is textbook material for engineers. Here, because everything is linearized about an operating point and because noise is approximated as Gaussian, there is an exact analytic theory going back to Kalman and Bucy in 1961. The result underlies the Mora-Nemenman analysis (although they, too, did not reference that work for the Gaussian approximation used for their concrete results) and that done here. In engineering, the general problem of optimally estimating a stochastic signal via noisy measurements was already considered by Kolmogorov and Wiener in the 1940s and formulated in the time domain, for linear systems driven by white noise (the approximation used here), by Kalman and Bucy.
It is a weakness that the connections are not mentioned specifically by referencing. Appealing to known results not only underlines how different disciplines often need to tackle the same problems, it allows for the use of textbook results and can shorten a paper. As a corollary, authors who map their problem onto known results should not be penalized for doing so, when the application is new and important (as here).
We fully acknowledge the point that is made here and we agree that we should have described the connection with filtering theory much more clearly, especially given the fact that we have applied filtering theory ourselves to biochemical signalling before (Becker, Mugler, Ten Wolde, 2015).
We indeed employ a Gaussian model, and for such a Gaussian model Kalman and Wiener-Kolmogorov filtering theory become exact, as the reviewers correctly point out. In our previous work, we studied different types of input signals (Markovian and non-Markovian), and then used Wiener-Kolmogorov filtering theory to derive the optimal topology of the sensing network for the different types of input signals (Becker et al., 2015). In our current work, we consider one type of input signal: a stationary Markovian signal. The optimal filter for this type of input signal is a low-pass, exponential filter, and the canonical network motif studied in our current study implements precisely such a filter (see Becker et al., 2015). This filter is useful because it allows the system to time-integrate the input signal and hence filter out the high-frequency input noise arising here from receptor-ligand binding. We agree with the reviewers that these ideas are well established and that we should describe them more clearly in our manuscript.
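As a purely illustrative aside, the role of the exponential filter can be sketched numerically. The sketch below is not the model of the manuscript: the function names, the discrete-time update, the white measurement noise standing in for receptor-ligand binding noise, and all parameter values are assumptions chosen for illustration. It filters a noisy, exponentially correlated Gaussian signal with three integration times and compares the resulting mean-square errors.

```python
import numpy as np

def exponential_filter(x, dt, tau_r):
    """First-order low-pass (exponential) filter with integration time tau_r."""
    x = np.asarray(x, dtype=float)
    alpha = 1.0 - np.exp(-dt / tau_r)    # weight given to the newest sample
    y = np.empty_like(x)
    y[0] = x[0]                          # start from the first measurement
    for i in range(1, x.size):
        y[i] = y[i - 1] + alpha * (x[i] - y[i - 1])
    return y

def markovian_signal(n, dt, tau_L, sigma_L, rng):
    """Stationary Gaussian signal with exponentially decaying correlations."""
    a = np.exp(-dt / tau_L)
    kicks = sigma_L * np.sqrt(1.0 - a * a) * rng.standard_normal(n)
    s = np.empty(n)
    s[0] = sigma_L * rng.standard_normal()
    for i in range(1, n):
        s[i] = a * s[i - 1] + kicks[i]
    return s

# Hypothetical parameters in arbitrary units (not taken from the manuscript).
rng = np.random.default_rng(0)
dt, tau_L = 0.01, 1.0
signal = markovian_signal(200_000, dt, tau_L, 1.0, rng)
noisy = signal + 5.0 * rng.standard_normal(signal.size)  # white measurement noise

errors = {
    tau_r: float(np.mean((exponential_filter(noisy, dt, tau_r) - signal) ** 2))
    for tau_r in (0.05, 0.5, 5.0)
}
for tau_r, err in errors.items():
    print(f"tau_r = {tau_r:4.2f}  mean-square error = {err:.3f}")
```

Under these assumptions, a short integration time leaves most of the measurement noise in place, a long one averages over the signal itself, and the intermediate value should give the smallest error of the three.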
Yet, while filtering theory provides a powerful approach to elucidating the optimal topology and response dynamics of the sensing network, as determined by the statistics of the input signal, it does not naturally reveal the resource requirements for sensing. To this end, we have generalised the sampling framework that we have developed previously (Govern and Ten Wolde, 2014). This framework is particularly well suited for elucidating the resource requirements for time integration, because it starts from the observation that the receptor-readout system consists of discrete molecules that interact with the receptor in a stochastic fashion: our description is thus based on the idea that the readout system is a sampling device that implements the mechanism of time integration not by continuously integrating the state of the receptor, but rather by discretely and stochastically sampling it, via the collisions of the readout molecules with the receptor proteins. The novelty of this manuscript is that we have now extended this sampling framework, for the first time, to time-varying signals.
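To give a feel for why discrete sampling has a resource cost, the following sketch evaluates a textbook expression for the variance of the average of N samples of a stationary signal whose autocorrelation decays exponentially. It is a simplified stand-in, not the sampling framework itself: the samples are assumed equally spaced rather than stochastic, and the function name, `tau_c`, `total_time`, and the numbers are hypothetical.

```python
import numpy as np

def variance_of_sample_mean(n_samples, total_time, tau_c, sigma2=1.0):
    """Variance of the mean of n equally spaced samples of a stationary signal
    with variance sigma2 and autocorrelation exp(-|t| / tau_c)."""
    spacing = total_time / n_samples
    k = np.arange(1, n_samples)              # lags between pairs of samples
    weights = 1.0 - k / n_samples            # fraction of pairs at each lag
    correlated_part = 2.0 * np.sum(weights * np.exp(-k * spacing / tau_c))
    return sigma2 / n_samples * (1.0 + correlated_part)

# Hypothetical numbers: a window of ten correlation times.
tau_c, total_time = 1.0, 10.0
for n in (10, 100, 1000):
    v = variance_of_sample_mean(n, total_time, tau_c)
    print(f"N = {n:5d}  variance = {v:.4f}  naive sigma^2/N = {1.0 / n:.4f}")
```

In this sketch, once the samples are spaced more closely than the correlation time, taking more of them barely reduces the variance, because they are no longer independent. The number of independent measurements, rather than the raw number of samples, is what limits the precision.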
In the revised manuscript, we have thoroughly rewritten the Introduction and added two new paragraphs to the Discussion. In the Introduction, we now mention that the problem of optimally predicting time-varying signals is a classic problem, for which different theories have been developed. We then also briefly review the application of filtering theories to biochemical signalling, adding relevant references. We then emphasise that we consider one class of input signals and one canonical network motif, which is known to implement a filter that is optimal for this type of signal, and raise the central question of our manuscript, which is what the cellular resource requirements are for optimally implementing such a network. We end the Introduction with a brief discussion of the main ideas of our theory and our findings.
In the revised Introduction, we also note that our theory applies to systems that employ the mechanism of time integration, and not to systems that employ maximum-likelihood sensing or Bayesian filtering, which underlie the analysis of Mora and Nemenman, 2019.
In the new paragraphs in the Discussion section, we first emphasise that our model is Gaussian and that the performance criterion of our theory, minimizing the mean-square error, is identical to that of Wiener and Kalman filtering theory, which are exact for Gaussian models. We then mention that for our Gaussian model minimizing the mean-square error is equivalent to maximizing the mutual information, thus making a connection between our theory, filtering theories, and information theory. In the next paragraph, we then discuss the concepts that have emerged from these theories and we describe how our findings relate to these concepts; here we have also added a new Appendix 3, where we work out this connection in more mathematical detail.
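For orientation, the equivalence can be written out in the simplest setting of a scalar signal $L$ and a scalar output $x$ that are jointly Gaussian with correlation coefficient $\rho$. The symbols $x$, $\rho$, $a$, and $b$ are notation chosen for this sketch and are not taken from the manuscript, whose Appendix 3 treats the connection in detail.

```latex
\begin{align}
\min_{a,b}\,\big\langle (L - a x - b)^2 \big\rangle &= \sigma_L^2\,(1-\rho^2), \\
I(L;x) &= -\tfrac{1}{2}\ln\!\left(1-\rho^2\right)
        = \tfrac{1}{2}\ln\frac{\sigma_L^2}{\min_{a,b}\big\langle (L - a x - b)^2 \big\rangle}.
\end{align}
```

Because the mutual information is a monotonically decreasing function of the minimal mean-square error in this setting, a design that minimizes one maximizes the other.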
We thank the reviewers and the editor for this important comment, because we agree that addressing it puts our work into a broader context.
2) Specific sentences have language that seems exaggerated, in the light of historical work on filtering theory:
"Our theory is based on a new concept, the dynamic input-output relation p_{τr}(L)."
 Dynamical models (with dynamical input-output relations) are a central element of filtering theory.
We fully agree that dynamic input-output relations are a key concept in filtering theory and we acknowledge that similar input-output relations have been studied before in the biophysics literature (Bialek et al., IEEE, 2006; Tostevin and Ten Wolde, 2010; Nemenman, 2012; Hinczewski and Thirumalai, 2014; Becker et al., 2015). We merely wanted to emphasise that our dynamic input-output relation p_{τr}(L) differs fundamentally from the conventional static input-output relations that are typically considered in the context of sensing static signals (Berg and Purcell, 1977; Bialek and Setayeshgar, 2005; Kaizu et al., 2014; Mugler et al., 2016).
We have completely rewritten this paragraph. As we describe in our response to the previous point, we now first introduce filtering theory, then describe the novel aspect of the current manuscript, which is the extension of our sampling framework to time-varying signals, and then mention the dynamic input-output relation, contrasting it with the static input-output relation used to describe the sensing of static signals.
"Our theory reveals that the sensing error can be decomposed into two terms [sampling (sensing) error and dynamical error]."
 The framework of filtering theory assumes noisy measurements of stochastic signals.
We agree with this, but our decomposition is more specific, showing how the sampling error and the dynamical error depend on the cellular resources protein copies, time, and energy. Our decomposition identifies the combinations of cellular resources, the “collective variables”, that control the sensing precision. It is this decomposition that allows us to make specific predictions on the optimal design of sensing systems that maximize the sensing precision given resource constraints. This is the novel contribution of our paper.
We now emphasise this more clearly.
"Our theory illuminates how the optimal design depends on the timescale of the input τ_{L}."
 The statement is true but should be framed in the context of other work, including in systems biology / biophysics, which comes to the same conclusion. For example, the work of Laughlin in 1981, and later Nemenman and Bialek, show that the input-output "gain function" should be adapted to the statistics of the input (including time scales of variation). Bialek's 2012 Biophysics book discusses many examples.
The idea of an optimal integration time, and the related ideas of matching the gain and the input-output relation to the statistics of the input, are indeed well known. In fact, we had cited a number of papers that also show the existence of an optimal integration time (Becker et al., 2015; Monti et al., 2018; Mora and Nemenman, 2019). The aim of our paragraph in the Discussion was to discuss how the input timescale governs the optimal design in terms of the cellular resources receptors, readout molecules, and energy; this is the novel aspect of our work. But we agree with the reviewers that it is important to connect our observations to previous ideas on the optimal gain and input-output relations that are matched to the statistics of the input. We therefore now first discuss here, in the Discussion section, these ideas and then describe how our findings relate to them. After we have discussed this connection, we emphasise that our sampling framework gives a detailed description of the optimal design in terms of the required cellular resources, and then discuss how this depends on the statistics of the input signal.
More concretely, we first point out that our model is Gaussian and that the performance criterion of our theory is minimizing the mean-square error, which is precisely the performance criterion of Wiener and Kalman filtering. We emphasise that the push-pull network is an exponential filter and that this is the optimal filter for the memoryless (Markovian) signals as studied here. We also point out that for our Gaussian model minimizing the mean-square error is equivalent to maximizing the mutual information, thus making the connection between our work, filtering theory, and information theory. In the next paragraph, we then describe concepts that have emerged from filtering and information theory, and describe how our findings relate to these concepts. Here, we also refer to a new Appendix 3, in which we work this connection out in more (mathematical) detail. In this appendix we also discuss concepts for optimizing information transmission in nonlinear systems, which are beyond the scope of the current model.
3) In its present form, the paper will be essentially unreadable by the vast majority of the eLife readership; the writing style is more appropriate to the Physical Review than to eLife. Indeed, much of the paper is written like a technical note that builds on previous work (Berg/Purcell, etc.) without sufficient explanation for the general reader. This criticism extends to the explanations of the underlying model, the lack of definition of terminology, and the notation.
Here are but a few examples of the points above. The authors are encouraged to rethink the entire presentation de novo for maximum improvement.
We acknowledge that the audience of eLife is broader than that of Physical Review and we have rewritten the manuscript drastically. We now try to avoid technical terms like “Markovian”, “mapping”, and “push-pull” network as much as possible, and where this cannot be avoided we have explained them for a broader audience. Indeed, we have carefully verified whether all the necessary concepts that are needed to follow our analysis are explained for a broad audience. For example, in the description of our model and our theory we now describe in more detail the concepts of the input correlation time, receptor correlation time, and readout relaxation time, and the idea of time integration via discrete sampling of the receptor state. We have also simplified notation. Most importantly, perhaps, we have tried to rewrite the manuscript in a more lucid style so that it is easier to follow for a broad audience (see, for example, the description of the optimal design principle, Equation 12). With these improvements, we strongly believe that the revised manuscript is accessible to the broad readership of eLife, also because the previous, well-cited paper on which the current manuscript builds is written with a similar level of detail and published in a journal with a similarly broad audience (Govern and Ten Wolde, 2014).
a) The definition of a "push-pull" network should be given in the paper.
While we could replace “push-pull network” with “cellular network” or simply “network”, we think it does help to keep this term, because it specifically refers to the cycle of readout phosphorylation and dephosphorylation downstream of the receptor. We have therefore decided to keep this term. We do, however, now explain it in the Introduction when we introduce it for the first time.
b) For the purpose of readability by a diverse audience some care should be taken to define technical terms explicitly (e.g. "Markovian"), and to explain various statements more clearly. Appearing so early in the paper, such condensed statements will be off-putting to the general reader.
Since “Markovian” only appears twice in the manuscript, we have replaced “Markovian” with “memoryless” and added an explanation. The concepts of “correlation time”, “integration time”, and “relaxation time” are important and used widely in the manuscript; we have therefore added explanations the first time we introduce these, see Section “Theory: Model”. We have also clarified other technical terms, such as time averaging, sampling device, and coding.
c) The whole notation of "inverting the mapping" that is central to the theory is not well-explained.
We have clarified that.
d) Regarding notation: consider, for example, the caption of Figure 2 and Equation 1, in which there is the quantity $\sigma^{2}_{\hat{p}_{\tau_r L}}$. This is simply too complicated, and its meaning will be utterly opaque to the general reader.
We have now simplified the notation as much as possible, while trying to keep clear to what specific quantity the symbol actually refers. For example, we have simplified $\sigma^{2}_{\hat{p}_{\tau_r L}}$ to $\sigma^{2}_{\hat{p}_{\tau_r}}$; we needed to retain the subscript τ_{r} because this quantity denotes the variance not of the receptor occupancy p but rather that of the receptor occupancy p_{τr} over the integration time τ_{r}.
4) In many ways it seems that if the authors presaged the development of the theory by indicating the run-and-tumble context early on it would help the reader to understand the precise motivations behind their lengthy calculations, and to give an idea of the time scales.
We would like to emphasise that the analysis is much more generic than the specific application to the E. coli chemotaxis system. We therefore would like to maintain the distinction between the general theory and the application of it to E. coli chemotaxis. However, we also agree that introducing this system earlier helps to understand the motivation and main ideas behind our theory. We have therefore expanded the discussion of this system in the Introduction.
https://doi.org/10.7554/eLife.62574.sa2
Article and author information
Author details
Funding
Nederlandse Organisatie voor Wetenschappelijk Onderzoek
 Giulia Malaguti
 Pieter Rein ten Wolde
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
We wish to acknowledge Bela Mulder, Tom Shimizu, and Tom Ouldridge for many fruitful discussions and a careful reading of the manuscript. This work is part of the research program of the Netherlands Organisation for Scientific Research (NWO) and was performed at the research institute AMOLF.
Senior Editor
 Detlef Weigel, Max Planck Institute for Developmental Biology, Germany
Reviewing Editor
 Raymond E Goldstein, University of Cambridge, United Kingdom
Publication history
 Received: September 24, 2020
 Accepted: February 12, 2021
 Accepted Manuscript published: February 17, 2021 (version 1)
 Version of Record published: March 10, 2021 (version 2)
Copyright
© 2021, Malaguti and ten Wolde
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.