Pathway dynamics can delineate the sources of transcriptional noise in gene expression
Abstract
Singlecell expression profiling opens up new vistas on cellular processes. Extensive celltocell variability at the transcriptomic and proteomic level has been one of the standout observations. Because most experimental analyses are destructive we only have access to snapshot data of cellular states. This loss of temporal information presents significant challenges for inferring dynamics, as well as causes of celltocell variability. In particular, we typically cannot separate dynamic variability from within cells (‘intrinsic noise’) from variability across the population (‘extrinsic noise’). Here, we make this nonidentifiability mathematically precise, allowing us to identify new experimental setups that can assist in resolving this nonidentifiability. We show that multiple generic reporters from the same biochemical pathways (e.g. mRNA and protein) can infer magnitudes of intrinsic and extrinsic transcriptional noise, identifying sources of heterogeneity. Stochastic simulations support our theory, and demonstrate that ‘pathwayreporters’ compare favourably to the wellknown, but often difficult to implement, dualreporter method.
eLife digest
In biology, seemingly random variation within or between cells can have significant effects on a number of cellular processes, like how cells divide and develop. For example, how often a gene is switched on, or ‘expressed’, can randomly fluctuate over time. This ‘noise’ may lead to a cell having slightly more of a particular molecule, causing it to behave differently to other cells in the population. However, it is currently unclear how this random variation is created and controlled in cells, and what effect this has on biological systems as a whole.
When a gene is expressed, its sequence typically gets copied in to a molecule called mRNA, which is then processed and used to build the protein encoded by the gene. By measuring the levels of mRNA molecules in individual cells, researchers have been able to investigate how gene expression varies within populations. These experiments are carried out on dead cells at a single point in time, and mathematical models are then applied to detect noise in the molecular data.
This approach, however, precludes how noise changes over time, making it difficult to determine the source of celltocell variability. In particular, whether the variation detected is the result of genuine random molecular changes (intrinsic noise), or external factors – such as temperature and pH – fluctuating in the cells environment (extrinsic noise).
Here, Ham et al. have built on previous mathematical models to identify a new approach for investigating the source of molecular noise. They found that for any given gene it is impossible to understand what causes its activity levels to vary just from data on its mRNA levels. Instead, information on other molecules that are affected by expression of the gene (termed ‘pathway reporters’) can provide a clearer picture of whether molecular variability is the result of intrinsic or extrinsic noise.
The mathematical models developed by Ham et al. reveal what can and cannot be learned about noise from gene expression data. Furthermore, pathwayreporters are easier to measure experimentally than other reporters that are typically used to study the origins and effects of celltocell variability. These findings could help researchers design single cell experiments that are better for studying noise, leading to a deeper understanding of how different types of variation impact cell biology.
Introduction
Noise is a fundamental aspect of every cellular process Shahrezaei and Swain, 2008b. Frequently it is even of functional importance, for example in driving cellfate transitions. Sometimes it can afford evolutionary advantages, for example, in the context of bethedging strategies. Sometimes, it can be a nuisance, for example, when it makes cellular signal processing more difficult. But noise is nearly ubiquitous at the molecular scale, and its presence has profoundly shaped cellular life. Analysing and understanding the sources of noise, how it is propagated, amplified or attenuated, and how it can be controlled, has therefore become a cornerstone of modern molecular cell biology.
Noise arising in gene expression has arguably attracted most of the attention so far (but see e.g. Filippi et al., 2016 and Jetka et al., 2018 for the analysis of noise at the signalling level). Generally speaking, gene expression noise is separable into two sources of variability, as pioneered by Swain et al., 2002. Intrinsic noise is generated by the dynamics of the gene expression process itself. The process, however, is often influenced by other external factors, such as the availability of promoters and of RNA polymerase, the influence of long noncoding RNA as a transcriptional regulator Goodrich and Kugel, 2006, as well as differences in the cellular environment. Such sources of variability contribute extrinsic noise, and reflect the variation in gene expression and transcription activity across the cell population. As such, understanding extrinsic noise lies at the heart of understanding cellpopulation heterogeneity.
So far, identifying the sources of gene expression noise from transcriptomic measurements alone has proven difficult Paulsson, 2004; Pedraza and Paulsson, 2008. The fundamental hindrance lies in the fact that singlecell RNA sequencing, which provides most of the available data, is destructive, so that datasets reflect samples from across a population, rather than samples taken repeatedly from the same cell. As temporal information is lost in such measurements Komorowski et al., 2011, it may be impossible to distinguish temporal variability within individual cells (e.g. burstiness), from ensemble variability across the population (i.e. extrinsic noise). A number of numerical and experimental studies have suggested this confounding effect Jones and Elf, 2018; Jones et al., 2014; Zopf et al., 2013, showing that systems with intrinsic noise alone exhibit behaviour that is indistinguishable from systems with both extrinsic and intrinsic noise. This is examined more formally in Ham et al., 2020a, where we show that the moment scaling behaviour and transcript distribution may be indistinguishable from situations with purely intrinsic noise. The limitations in inferring dynamics from population data are becoming increasingly evident, and a number of studies that seek to address some of these problems have emerged Skinner et al., 2016; Gorin and Pachter, 2020a.
Here we provide adetailed analysis of the extent to which sources of variability are identifiable from population singlecell omics data. We are able to prove rigorously that it is in general impossible to identify the sources of variability, and consequently, the underlying transcription dynamics, from observed transcript abundance distributions alone. Systems with intrinsic noise alone can always present identically to similar systems with extrinsic noise. For practical purposes, the effect does not appear to depend on the precise choice of distribution, but holds more generally. Moreover, we demonstrate that extrinsic noise invariably distorts the apparent degree of burstiness of the underlying system: data which seems ‘bursty’ is not necessarily generated by a bursty process, if there is celltocell variability across the population. This is a stronger nonidentifiability result than has previously been obtained Ingram et al., 2008; Shahrezaei and Swain, 2008a; Khammash, 2009; Munsky et al., 2009; Komorowski et al., 2011, and has important ramifications for our analysis of experimental data: we cannot assess causes of transcriptional variability, if we do not simultaneously assess celltocell variability in the transcriptional machinery. Our results highlight (in fact prove mathematically) the requirement for additional information, beyond the observed copy number distribution, in order to constrain the space of possible dynamics that could give rise to the same distribution.
This seemingly intractable problem can at least partially be resolved with a brilliantly simple approach: the dualreporter method Swain et al., 2002. In this approach, noise can be separated into extrinsic and intrinsic components, by observing correlations between the expression of two independent, but identically distributed fluorescent reporter genes. Dualreporter assays have been employed experimentally to study the noise contributed by both global and genespecific effects Elowitz et al., 2002; Raser and O'Shea, 2005; Raj et al., 2006. A particular challenge, however, is that dual reporters are rarely identically regulated Raj et al., 2006; Quarton et al., 2020, and are not straightforward to set up in every system. More recently, it has been shown that the independence of dual reporters cannot be guaranteed in some systems Naik et al., 2021. As a result, there have been efforts in developing alternative methods for decomposing noise Quarton et al., 2020; Singh, 2014; Lin and Amir, 2021. Here we develop a widely applicable generalisation (and simplification) of the original dualreporter approach Swain et al., 2002. We demonstrate that nonidentical and notnecessarily independent reporters can provide an analogous noise decomposition. Our result shows that measurements taken from the same biochemical pathway (e.g. mRNA and protein) can replace dual reporters, enabling the noise decomposition to be obtained from a single gene. This completely circumvents the requirement of conditionally independent and identically regulated reporter genes. The results obtained from our ‘pathwayreporter’ method are also borne out by stochastic simulations, and compare favourably with the dualreporter method. In the case of constitutive expression, the results obtained from our decomposition are identical to those obtained from dual reporters. For bursty systems, we show that our approach provides a satisfactorily close approximation, except in extreme cases. Our methodology is verified mathematically for the most common models of gene transcription, but holds independently of the specific nature of the gene expression dynamics, as we verify in silico across a range of more detailed models.
Materials and methods
A simple model for stochastic mRNA dynamics is the Telegraph model: a twostate model describing promoter switching, transcription, and mRNA degradation. In this model, all parameters are fixed, and gene expression variability arises due to the inherent stochasticity of the transcription process. As discussed above, this process will often be influenced by extrinsic sources of variability, and so modifications to account for this additional source of variability are required.
The telegraph model
Request a detailed protocolThe Telegraph model was first introduced in Ko, 1991, and has since then been widely employed in the literature to model bursty gene expression in eukaryotic cells Bahar Halpern et al., 2015; So et al., 2011; Suter et al., 2011; Larsson et al., 2019. In this model, the gene switches probabilistically between an active state and an inactive state, at rates λ (onrate) and μ (offrate), respectively. In the active state, mRNAs are synthesised according to a Poisson process with rate $K$, while in the inactive state, transcription does either not occur, or possibly occurs at some lower Poisson rate, ${K}_{0}\ll K$. Degradation of mRNA molecules occurs independently of the gene promoter state at rate δ. Figure 1A shows a schematic of the Telegraph model. Throughout the discussion here, we will rescale all parameters of the Telegraph model by the mRNA degradation rate, so that $\delta =1$. The steadystate distribution for the mRNA copy number can be explicitly calculated as Peccoud and Ycart, 1995,
Here, θ denotes the parameter vector $(\mu ,\lambda ,K,\delta )$, the function ${}_{1}F_{1}$ is the confluent hypergeometric function Abramowitz and Stegun, 1965, and, for real number $x$ and positive integer $n$, the notation ${x}^{(n)}$ abbreviates the rising factorial of $x$ (also known as the Pochhammer function). Throughout, we refer to the probability mass function ${\stackrel{~}{p}}_{T}(n;\theta )$ as the Telegraph distribution with parameters θ.
Constitutive gene expression is a limiting case of the Telegraph model, which arises when the offrate μ is 0, so that the gene remains permanently in the active state. In this case, the Telegraph distribution simplifies to a Poisson distribution with rate $K$; the distribution $\mathrm{P}\mathrm{o}\mathrm{i}\mathrm{s}(K)$.
At the opposite extreme is instantaneously bursty gene expression in which mRNA are produced in very short bursts with potentially prolonged periods of inactivity in between. This mode of gene expression has been frequently reported experimentally, particularly in mammalian genes Raj et al., 2006; Bahar Halpern et al., 2015; Suter et al., 2011; Larsson et al., 2019. Transcriptional bursting may be treated as a limit of the Telegraph model, where the offrate, μ, tends to infinity, while the onrate λ remains fixed. In this limit, it can be shown Jones and Elf, 2018; Ham et al., 2020a that the Telegraph distribution converges to the negative binomial distribution $\mathrm{N}\mathrm{e}\mathrm{g}\mathrm{B}\mathrm{i}\mathrm{n}(\lambda ,\frac{K}{\mu +K})$.
The compound distribution
Request a detailed protocolWe can account for random celltocell variation across a population by way of a compound distribution Ham et al., 2020b
where $\stackrel{~}{p}(n;\theta )$ is the stationary probability distribution of a system with fixed parameters θ and $f(\theta ;\eta )$ denotes the multivariate distribution for θ with hyperparameters η. Often we will take $\stackrel{~}{p}(n;\theta )$ to be the stationary probability distribution of the Telegraph model ((1)), and refer to (2) as the compound Telegraph distribution. Sometimes $\stackrel{~}{p}(n;\theta )$ will be the Poisson distribution or the negative binomial distribution, depending on the underlying mode of gene activity. Figure 1B gives a pictorial representation of the compound distribution.
The compound distribution is valid in the case of ensemble heterogeneity, that is, when parameter values differ between individual cells according to the distribution $f(\theta ;\eta )$, but remain constant over time Filippi et al., 2016. This model is also a valid approximation for individual cells with dynamic parameters, provided these change sufficiently slower than the transcriptional dynamics Lenive et al., 2016. In general, the compound Telegraph distribution $\stackrel{~}{q}(n;\eta )$ will be more dispersed than a Telegraph distribution to account for the uncertainty in the parameters; see Figure 1C. Such dispersion is widely observed experimentally, and as demonstrated in Ham et al., 2020a, reflects the presence of extrinsic noise.
In the context of gene expression, it has been shown experimentally that some of the primary sources of extrinsic noise have an autocorrelation time comparable to the cell cycle Rosenfeld et al., 2005. It is these slow changes in variability that justify the assumptions of the compound model. Typical sources of extrinsic variability for each parameter in the Telegraph model are summarised in Table 1 of Ham et al., 2020a. A further significant source of heterogeneity arises from the differences in cellcycle phases across the population Skinner et al., 2016. Such effects have been shown to obscure the precise underlying transcriptional dynamics Zopf et al., 2013; Huh and Paulsson, 2011, and impede the inference of transcriptional parameters from experimental data Beentjes et al., 2020. The compound model we consider here is able to capture this variability, provided that the parameter change within each cell phase is relatively slow, and any dynamic parameter changes during the transition between phases can be considered as ephemeral. A more explicit treatment of the mechanisms and changes during the cell cycle is challenging to study analytically, and theoretical modelling is only in its infancy Cao and Grima, 2020; Beentjes et al., 2020; PerezCarrasco et al., 2020. Later, we verify our proposed noise decomposition on detailed models of gene transcription, incorporating salient features of the cellcycle, such as gene replication, dosage compensation, binomial partitioning of products due to cell division, and cellcycle length variability.
Results
Identifiability considerations
Decoupling the effects of extrinsic noise from experimental measurements has been notoriously challenging. In the context of (2), the distribution $f(\theta ;\eta )$ reflects the population heterogeneity, but experimental data provides samples only of $\stackrel{~}{q}(n;\eta )$. How much can we deduce of the underlying dynamics (that is, $\stackrel{~}{p}(n;\theta ))$, and the population heterogeneity ($f(\theta ;\eta )$), from measurements of transcripts from across the cell population ($\stackrel{~}{q}(n;\eta )$)?
Of course, even though we may be able to infer the parameter η from experimental data, the expression $\stackrel{~}{p}(n;\theta )$ is really a family of distributions, parameterised by θ. This presents two fundamental challenges. The first is the possibility that there are different families of distributions $\stackrel{~}{p}(n;\theta )$ that can yield the same compound distribution, $\stackrel{~}{q}(n;\eta )$, but which are generated by different mechanisms, that is noise distributions, $f(\theta ;\eta )$. The second, perhaps more subtle, challenge is that, even for a fixed family of distributions $\stackrel{~}{p}(n;\theta )$ it may be possible that different choices of the noise distribution $f(\theta ;\eta )$ could still yield the same compound distribution $\stackrel{~}{q}(n;\eta )$. In these situations, we cannot hope to infer a unique solution for the noise distribution. This belongs to the important class of identifiability problems Villaverde, Villaverde and Banga, 2017; it has important ramifications for the interpretability of parameter estimates obtained from experimental data Ingram et al., 2008. Indeed, if two or more model parameterisations are observationally equivalent (in this case, in the form of the transcript abundance distribution $\stackrel{~}{q}(n;\eta )$), then not only does this cast doubt upon the suitability of the model to represent (and subsequently predict) the system, but it also obstructs our ability to infer mechanistic insight from experimental data.
An example of the first identifiability problem arises from a wellknown example of a compound distribution, (2): when $f(\theta ;\eta )$ is a gamma distribution and $\stackrel{~}{p}(n;\theta )$ is a Poisson distribution, corresponding to constitutive gene expression, the resulting compound distribution $\stackrel{~}{q}(n;\eta )$ is a negative binomial distribution Greenwood and Yule, 1920. But this is the same distribution as that arising from instantaneously bursty gene expression Ham et al., 2020a. Such identifiability instances may be circumvented if there is confidence in the basic mode of gene activity, that is, if there is reason to believe that a gene is not constitutively active, for example. We find, however, that there are numerous instances that can present insurmountable identifiability problems.
Bursty gene expression
We first observe that any Telegraph distribution with fixed parameters can be identically obtained from a Telegraph distribution with parameter variation. As shown in the supplementary material (Appendix Pathway dynamics can delineate the sources of transcriptional noise in gene expression), any Telegraph distribution ${\stackrel{~}{p}}_{T}(n;\lambda ,{\mu}^{\prime},{K}^{\prime})$ can be written as,
where $\mu <{\mu}^{\prime}$ and ${f}_{{K}^{\prime}}(t;\lambda +\mu ,{\mu}^{\prime}\mu )$ is the probability density function for a scaled beta distribution ${\mathrm{B}\mathrm{e}\mathrm{t}\mathrm{a}}_{K}(\lambda +\mu ,{\mu}^{\mathrm{\prime}}\mu )$ with support $[0,{K}^{\prime}]$. Thus, any Telegraph distribution can be obtained by varying the transcription rate parameter on a narrower Telegraph distribution (i.e. with a smaller offrate) according to a scaled beta distribution. Figure 2A (top panel) compares the representation obtained in (3) with the corresponding fixedparameter Telegraph distribution for two different sets of parameters. When $\mu =0$ the representation given in (3) simplifies to the wellknown Poisson representation of the Telegraph distribution in terms of the scaled beta distribution Srividya et al., 2009.
Instantaneously bursty gene expression
The previous result extends to instantaneously bursty systems. The copy number distribution of an instantaneously bursty system can be obtained from both bursty and instantaneously bursty dynamics, provided that there is appropriate parameter variation. The supplementary material contains the relevant derivations; refer to Appendix Pathway dynamics can delineate the sources of transcriptional noise in gene expression. In the following, we let ${\stackrel{~}{p}}_{\text{NB}}(n;r,\beta )$ denote the probability mass function of a $\mathrm{N}\mathrm{e}\mathrm{g}\mathrm{B}\mathrm{i}\mathrm{n}(r,\frac{\beta}{\beta +1})$ distribution, where $\beta >0$. Then for any negative binomial distribution of the form $\mathrm{N}\mathrm{e}\mathrm{g}\mathrm{B}\mathrm{i}\mathrm{n}(\lambda ,\frac{\beta}{\beta +1})$ we have,
where $f(t;\lambda +\mu ,\beta )$ is the probability density function of a $\mathrm{G}\mathrm{a}\mathrm{m}\mathrm{m}\mathrm{a}(\lambda +\mu ,\beta )$ distribution. This result generalises the aforementioned wellknown representation of the negative binomial distribution Greenwood and Yule, 1920, which corresponds to the particular case of $\mu =0$. In Figure 2A (middle panel), we compare the representation obtained in (4) with the corresponding fixedparameter negative binomial distribution for two different sets of parameters.
We also obtain the following representation for a negative binomial distribution in terms of a scaled beta prime distribution,
where ${f}_{b}(b\theta 1;\lambda {\lambda}^{\prime},{\lambda}^{\prime})$ is the probability mass function of a scaled beta prime ${\mathrm{B}\mathrm{e}\mathrm{t}\mathrm{a}\mathrm{P}\mathrm{r}\mathrm{i}\mathrm{m}\mathrm{e}}_{b}(\lambda {\lambda}^{\mathrm{\prime}},\lambda )$ distribution, where $b>0$ and $\lambda >{\lambda}^{\prime}$. This equivalently corresponds to scaled beta noise ${\mathrm{B}\mathrm{e}\mathrm{t}\mathrm{a}}_{b}(\lambda {\lambda}^{\mathrm{\prime}},{\lambda}^{\mathrm{\prime}})$ on the inverse of the expected burst intensity. Thus, the distribution of any instantaneously bursty system with mean burst intensity $b$ can be obtained from one with greater burst frequency, by varying the mean burst intensity θ according to a shifted beta prime distribution. Figure 2A (bottom panel), compares the representation obtained in (5) with the associated fixedparameter negative binomial distribution for two different sets of parameters.
An exception: constitutive expression
It has long been known Feller, 1943 that a compound Poisson distribution uniquely determines the compounding distribution. In the context of (2), this means the full extrinsic noise distribution $f(\theta ,\eta )$ is identifiable from $\stackrel{~}{q}(n;\eta )$. As we will demonstrate in related work (Ham et al., in preparation) , it is therefore possible to extract the extrinsic noise distribution, $f(\theta ,\eta )$, from transcript copy number measurements.
Implications for parameter inference
Estimates of kinetic parameters from experimental data suggest that gene expression is often either bursty or instantaneously bursty (i.e., $\mu \gg \lambda $). In turn, the assumption that geneinactivation events occur far more frequently than geneactivation events is often used to derive other models of stochastic mRNA dynamics Jia and Grima, 2020; Cao and Grima, 2020; Beentjes et al., 2020. The representations given in (3 – 5), however, show that both estimating parameters and the underlying dynamics from the form of the copy number distribution alone can be misleading. Noise on the transcription rate will invariably produce copy number data that is suggestive of a more bursty model. To illustrate this, consider an example in which the underlying process is a (mildly) bursty Telegraph system with distribution ${\stackrel{~}{p}}_{T}(n;2,3,K)$. Now assume that noise on the transcription rate parameter $K$ follows the scaled Beta distribution on the interval $[0,100]$ with $\alpha =\lambda +\mu =2+3=5$ and $\beta ={\mu}^{\prime}\mu =123=9$. The shape of this noise distribution closely resembles a slightly skewed Gaussian distribution, with the majority of transcription rates between around 10 and 60. This noise on the transcription rate $K$ within the Telegraph system ${\stackrel{~}{p}}_{T}(n;2,3,K)$ will present identically to the significantly burstier system ${\stackrel{~}{p}}_{T}(n;2,12,100)$.
It is of practical importance to recognise that, while the nonidentifiability results (summarised in Table 1) are dependent on specific noise distributions, for practical purposes any similar distribution will produce a similar effect. To demonstrate this, we replace the various noise distributions required for the representations in (3 – 5), with suitable normal distributions truncated at 0. In each case, we sample 1000 data points from the corresponding compound distribution, and compared this with the associated fixedparameter copy number distribution. The results are shown in Figure 2B. The truncated normal distribution is not chosen on the basis of biological relevance, but rather to demonstrate that even a symmetric noise distribution (except for truncation at 0) produces qualitatively similar results to the distributions used in the precise nonidentifiability results. In every case, the effect of varying the transcription rate or burst intensity parameter according to a unimodal noise distribution is to produce copy number distributions that are generally consistent with systems that appear burstier.
Finally, we note that our results explain previous empirical observations that static measurements of mRNA are not always sufficient to infer the underlying dynamics of gene activity. Skinner et al., 2016 address some of these limitations by quantifying both nascent and mature mRNA in individual cells, as well incorporating cellcycle effects into their analysis of two mammalian genes. A more developed treatment of model identifiability is given in Gorin and Pachter, 2020a, where it shown how a stochastic model incorporating the downstream processing of mRNA can be used to distinguish a particular instance of nonidentifiability. More specifically, the authors consider the nonidentifiability problem noted in Ham et al., 2020a, arising from the GammaPoisson compound representation of the negative binomial distribution; a particular case of (4) above. Despite identical distributions at the nascent level, the marginal distributions of the processed (mature) mRNA are found to be substantially different. It is likely that a similar analysis will be valuable in the context of the other identifiability problems we have given in Table 1. In the next section, we take a more general approach to resolving nonidentifibiality and exploit the properties of complex gene expression dynamics to determine not only the presence of extrinsic noise, but also estimate its magnitude.
Resolving nonidentifiability
The results of the previous section show that additional information, beyond the observed copy number distribution, is required to constrain the space of possible dynamics that could give rise to the same distribution. One way to narrow this space of possibilities, is to determine the intrinsic and extrinsic contributions to the total variation in the system.
The dualreporter method
The total gene expression noise, as measured by the squared coefficient of variation ${\eta}^{2}$, can be decomposed exactly into a sum of intrinsic and extrinsic noise contributions Swain et al., 2002. The decomposition applies to dynamic noise Hilfinger and Paulsson, 2011, and generalises to higher moments in Hilfinger et al., 2012. Sets of dual reporters at multiple levels of the transcriptional pathway has been shown to achieve a finer breakdown of noise into subcategories Bowsher and Swain, 2012. As shown in Hilfinger and Paulsson, 2011, the noise decomposition is equivalent to the normalised Law of Total Variance (Ross, 2014). Indeed, if $X$ is the random variable denoting the number of molecules of a certain species (e.g. mRNA or protein) in a given cell, then we can decompose the total noise by conditioning $X$ on the state $\mathbf{\mathbf{Z}}=({Z}_{1},\mathrm{\dots},{Z}_{n})$ of the environmental variables Z_{1},…,Zn,
It has been shown Swain et al., 2002; Hilfinger and Paulsson, 2011 that if X_{1} and X_{2} are random variables denoting the expression levels of independent (conditional on $\mathbf{\mathbf{Z}}$) and identically distributed gene reporters, then the extrinsic noise contribution ${\eta}_{\text{ext}}^{2}$ in (6) can be identified by the normalised covariance between X_{1} and X_{2},
Decomposing noise with nonidentical reporters
The dualreporter method requires distinguishable measurements of transcripts or proteins from two conditionally independent and identically distributed reporter genes integrated into the same cell. In practice, however, dual reporters rarely have identical dynamics, which is widely considered to be a significant challenge to interpreting experimental results Quarton et al., 2020. We show that, under certain conditions, the decomposition in (6) can alternatively be obtained from nonidentically distributed and notnecessarily independent reporters.
Our result relies on the observation that the covariance of any two variables can be decomposed into the expectation of a conditional covariance and the covariance of two conditional expectations (the Law of Total Covariance Ross, 2014). If $X$ and $Y$ denote, for example, the numbers of molecules of two chemical species (eg. mRNA and protein) in a given cell, then the covariance of $X$ and $Y$ can be decomposed by conditioning on the state $\mathbf{\mathbf{Z}}=({Z}_{1},\mathrm{\dots},{Z}_{n})$ of the environmental variables Z_{1},…,Z_{n},
We will see that in many cases of interest the random variable $\mathrm{E}(X;\mathbf{Z})$ (as a function of $\mathbf{\mathbf{Z}}$) splits across common variables with $\mathrm{E}(Y;\mathbf{Z})$. By this we mean that $\mathrm{E}(X;\mathbf{Z})=f({\mathbf{Z}}_{X}){h}_{X}({\mathbf{Z}}^{\mathrm{\prime}})$ and $\mathrm{E}(Y;\mathbf{Z})=g({\mathbf{Z}}_{Y}){h}_{Y}({\mathbf{Z}}^{\mathrm{\prime}})$, where $\mathbf{Z}}_{X$ are the variables of $\mathbf{\mathbf{Z}}$ that appear in $\mathrm{E}(X;\mathbf{Z})$ but not in $\mathrm{E}(Y;\mathbf{Z})$, and dually, ${\mathbf{\mathbf{Z}}}_{Y}$ are those in $\mathrm{E}(Y;\mathbf{Z})$ that are not in $\mathrm{E}(X;\mathbf{Z})$. The variables ${\mathbf{\mathbf{Z}}}^{\prime}$ are those variables from $\mathbf{\mathbf{Z}}$ not in $\mathbf{Z}}_{X}\cup {\mathbf{Z}}_{Y$. In these cases, the component of $\mathrm{C}\mathrm{o}\mathrm{v}(X,Y)$ that is contributed by the variation in $\mathbf{\mathbf{Z}}$ (the extrinsic component) may be written as the covariance of the functions ${h}_{X}({\mathbf{\mathbf{Z}}}^{\prime})$ and ${h}_{Y}({\mathbf{\mathbf{Z}}}^{\prime})$. Conveniently, in the cases of interest here, the two functions ${h}_{X}$ and ${h}_{Y}$ coincide, and this is the form we use in the following decomposition principle. The supplementary material (Appendix A) contains the proof of this result.
The noise decomposition principle (NDP)
Assume that there are measurable functions $f$, $g$, and $h$ such that $\mathrm{E}(X;\mathbf{Z})$ and $\mathrm{E}(Y;\mathbf{Z})$ split across common variables by way of $\mathrm{E}(X;\mathbf{Z})=f({\mathbf{Z}}_{X})h({\mathbf{Z}}^{\mathrm{\prime}})$ and $\mathrm{E}(Y;\mathbf{Z})=g({\mathbf{Z}}_{Y})h({\mathbf{Z}}^{\mathrm{\prime}})$. Then, provided that the variables ${Z}_{1},\mathrm{\dots},{Z}_{m}$ are mutually independent, the normalised covariance of $\mathrm{E}(X;\mathbf{Z})$ and $\mathrm{E}(Y;\mathbf{Z})$ will identify the total noise on $h({\mathbf{\mathbf{Z}}}^{\prime})$ (i.e. ${\eta}_{h({\mathbf{\mathbf{Z}}}^{\prime})}^{2}$).
As we show in the next section, there are many situations where the random variable $\mathrm{E}(X;\mathbf{Z})$ is precisely the common part of $\mathrm{E}(Y;\mathbf{Z})$ and $\mathrm{E}(X;\mathbf{Z})$ (i.e., $h({\mathbf{Z}}^{\mathrm{\prime}})=\mathrm{E}(X;\mathbf{Z})$), and the normalised intrinsic contribution to the covariance is either zero or negligible. In these cases, the normalised covariance of $X$ and $Y$ will identify precisely the extrinsic noise contribution ${\eta}_{\text{ext}}^{2}$ to the total noise ${\eta}_{X}^{2}$. To see this, consider the situation where $\mathrm{E}(Y;\mathbf{Z})=f({\mathbf{Z}}_{Y})\mathrm{E}(X;\mathbf{Z})$. Then provided $f({\mathbf{Z}}_{Y})$ and $E(X;\mathbf{\mathbf{Z}})$ are independent random variables, the extrinsic contribution to the covariance of $X$ and $Y$ is given by,
If the normalised intrinsic contribution to the covariance is either zero or is negligible, it follows from (8) that
Thus, under certain conditions, measuring the covariance between two nonidentically distributed and notnecessarily independent reporters can replace dual reporters.
The pathwayreporter method
We show that for some reporters $X$ and $Y$ belonging to the same biochemical pathway, the covariance of $X$ and $Y$ continues to identify the extrinsic, and subsequently intrinsic, noise contributions to the total noise. The basis of the pathwayreporter method depends on the emergent covariances between the various species (e.g. nascent/mature mRNA and protein) in the gene expression pathway. Qualitatively, this effect can be seen in Figure 3, which compares simulated sample distributions of a simple fourstage model of gene transcription (refer to the model ${\mathbf{\mathbf{M}}}_{4}$ below) in the case of moderate extrinsic noise to the case with no extrinsic noise. The plots are the bivariate distributions for nascentmature, nascentprotein, and matureprotein levels, respectively. This will be made more precise below, where we find that it is possible to extract quantitative information about the extrinsic noise distribution itself from this data.
Throughout this section, we assume that extrinsic noise sources act independently i.e., the environmental variables ${Z}_{1},\mathrm{\dots},{Z}_{n}$ of $\mathbf{\mathbf{Z}}$ are mutually independent. Additionally, our modelling focuses only on a single gene copy, although the same analysis applies to multiple but indistinguishable gene copies; we refer to the supplementary material (Appendix B) for more details.
Measuring noise from a constitutive gene
We consider first the simplest case where the underlying process is constitutive. We begin with a stochastic model of mRNA maturation, which we denote by ${\mathbf{\mathbf{M}}}_{1}$; Figure 4A (top left) gives a schematic of the constitutive model. In this model, the gene continuously produces nascent mRNA according to a Poisson process at constant rate ${K}_{N}$, which are subsequently spliced into mature mRNA according at rate ${K}_{M}$. Degradation of mature mRNA occurs as a firstorder Poisson process with rate ${\delta}_{M}$. The model ${\mathbf{\mathbf{M}}}_{1}$, together with its extensions, has been considered in a number of recent studies Gorin and Pachter, 2020a; Cao and Grima, 2020; La Manno et al., 2018; Gorin and Pachter, 2020b; Bergen2020, and has a known solution for the stationary joint probability distribution Jahnke and Huisinga, 2007 given by,
where $n$ is the number of nascent mRNA, $m$ is the number of mature mRNA, and the parameter $\theta =({K}_{N},{K}_{M},{\delta}_{M})$. We use ${X}_{N}$ to denote the number of nascent mRNA, ${X}_{M}$ the number of mature mRNA produced from the same constitutive gene, and $\mathbf{\mathbf{Z}}=({K}_{N},{K}_{M},{\delta}_{M})$. To simplify notation, we abbreviate the variables in ${\mathbf{\mathbf{Z}}}_{{X}_{N}}$ as ${\mathbf{\mathbf{Z}}}_{N}$, and similarly for ${\mathbf{\mathbf{Z}}}_{{X}_{M}}$. It follows immediately from (11) that ${X}_{N}$ and ${X}_{M}$ are independent conditional on $\mathbf{\mathbf{Z}}$, and so the intrinsic contribution to the covariance of ${X}_{N}$ and ${X}_{M}$ (the first term of (8)) is 0. It is also easy to see from (11) that $E({X}_{N};\mathbf{\mathbf{Z}})=f({\mathbf{\mathbf{Z}}}_{N}){K}_{N}$ and $E({X}_{M};\mathbf{\mathbf{Z}})=g({\mathbf{\mathbf{Z}}}_{M}){K}_{N}$, where $f({\mathbf{\mathbf{Z}}}_{N})=\frac{1}{{K}_{M}}$ and $g({\mathbf{\mathbf{Z}}}_{M})=\frac{1}{{\delta}_{M}}$. Since the extrinsic noise sources are assumed to act independently, it follows that the Noise Decomposition Principle (NDP) of the previous section holds. We then have that $\mathrm{C}\mathrm{o}\mathrm{v}({X}_{N},{X}_{M})={\eta}_{{K}_{N}}^{2}$, where ${\eta}_{{K}_{N}}^{2}$ is the total noise on the transcription rate parameter ${K}_{N}$. Thus, measuring $\mathrm{C}\mathrm{o}\mathrm{v}({X}_{N},{X}_{M})$ can replace dual reporters to decompose the gene expression noise at the transcriptional level.
To support our mathematical results, we simulate the model ${\mathbf{\mathbf{M}}}_{1}$ subject to parameter variation using the stochastic simulation algorithm (SSA). Table 2 compares the extrinsic noise contributions found from various simulations with the corresponding theoretical values. In each simulation, the degradation rate ${\delta}_{m}$ is fixed at 1, with the other parameters scales accordingly. The maturation rate ${K}_{M}$ is sampled according to a $\mathrm{G}\mathrm{a}\mathrm{m}\mathrm{m}\mathrm{a}(8,0.0125)$ distribution, which has coefficient of variation 0.125. We consider different noise distributions on ${K}_{N}$, producing a range of noise strengths. Our theory predicts that pathwayreporters will identify the total noise on ${K}_{N}$. Overall, we observe an excellent agreement between the results obtained by the pathwayreporter method, the dualreporter method and the theoretical noise. There is consistently slightly more variation in the pathwayreporter results compared with the dualreporter results.
To explore the pathwayreporter method more widely, we consider 60 different parameter combinations to produce a range of mean copy numbers consistent with those reported experimentally. We also consider different noise distributions taken from the scaled Beta distribution family in order to produce a range of noise strengths; refer to Supplementary file 1. The pathwayreporter method performs favourably compared to the dualreporter method calculated from mature mRNA, and consistently outperforms the dualreporter method on nascent mRNA.
Next, we consider the simplest stochastic model of gene expression that includes both mRNA and protein dynamics: the wellknown ‘twostage model’ of gene expression, which, together with its threestage extension to include promoter switching has been widely studied Raser and O'Shea, 2005; Raj et al., 2006; Thattai and van Oudenaarden, 2001; Friedman et al., 2006; Singh and Hespanha, 2007; Shahrezaei and Swain, 2008a; Bokes et al., 2012; Molina et al., 2013. We denote this model by ${\mathbf{\mathbf{M}}}_{2}$; see Figure 4A (top right) for a schematic of this model. In this model, mRNA are synthesised according a Poisson process at rate ${K}_{m}$, which are then later translated into protein at rate ${K}_{p}$. Degradation of mRNA and protein occur as firstorder Poisson processes with rates ${\delta}_{m}$ and ${\delta}_{p}$, respectively. If ${X}_{m}$ denotes the number of mRNA, ${X}_{p}$ the number of proteins produced from the same constitutive gene, and if $\mathbf{\mathbf{Z}}=({K}_{m},{K}_{p},{\delta}_{m},{\delta}_{p})$, then the stationary means and covariance are given by Thattai and van Oudenaarden, 2001; Singh and Hespanha, 2007:
It is easily verified that $\mathrm{E}({X}_{p};\mathbf{Z})=f({\mathbf{Z}}_{p})\mathrm{E}({X}_{m};\mathbf{Z})$, where $f({\mathbf{\mathbf{Z}}}_{p})=\frac{{K}_{p}}{{\delta}_{p}}$. Thus, it follows from the NDP that the normalised contribution of $\mathrm{C}\mathrm{o}\mathrm{v}({X}_{m},{X}_{p})$ contributed by $\mathbf{\mathbf{Z}}$ will identify the extrinsic noise contribution to the total noise on ${X}_{m}$. Now, if we assume that ${\delta}_{m}$ is fixed across the cellpopulation, and all parameters are scaled so that ${\delta}_{m}=1$, we have the following expression for the intrinsic contribution to the covariance of ${X}_{m}$ and ${X}_{p}$; refer to the supplementary material (Appendix B) for details.
Since mRNA tends to be less stable than protein, we have ${\delta}_{p}<1$, and often ${\delta}_{p}\ll 1$Bernstein et al., 2002; Schwanhäusser et al., 2011. So, we can expect $\alpha \ll 1$. Further, for many genes we can expect the number of mRNA per cell (${K}_{m}$) to be in the order of tens, so $1/E({K}_{m})<1$. It follows that $\mathrm{E}(\mathrm{C}\mathrm{o}\mathrm{v}({X}_{m},{X}_{p};\mathbf{Z}))\ll \text{}1$, so that $\mathrm{C}\mathrm{o}\mathrm{v}({X}_{m},{X}_{p})$ will closely approximate the extrinsic noise at the transcriptional level.
We test our theory using stochastic simulations of the model ${\mathbf{\mathbf{M}}}_{2}$ subject to parameter variation. Table 3 gives a comparison of the results of the mRNAprotein reporters and dual reporters. In each case, we varied ${K}_{p}$ according to a $\mathrm{G}\mathrm{a}\mathrm{m}\mathrm{m}\mathrm{a}(5,0.4)$ distribution and ${\delta}_{p}$ according to a $\mathrm{G}\mathrm{a}\mathrm{m}\mathrm{m}\mathrm{a}(8,0.125)$ distribution; the corresponding noise strengths are 0.20 and 0.125, respectively. We consider different noise distributions on ${K}_{m}$, which produce a range of noise strengths, and the noise distribution parameters are selected to produce a mean mRNA of approximately 50 and a mean number of approximately 1000 proteins in each simulation. As our theory predicts, the mRNAprotein reporters identify the extrinsic noise contribution to the total noise on ${X}_{m}$. Again, we can see from Table 3 that there is excellent agreement between the results of the pathway reporters and the dual reporters, with slightly more variation in the pathwayreporter results. A larger exploration of the parameter space reveals similar results; these are provided in Supplementary file 1. Thus, despite mRNA and protein numbers not being strictly independent, they can, for practical purposes, replace dual reporters to decompose the noise at the transcriptional level.
We note that both the pathwayreporter (nascentmature or matureprotein) and dualreporter methods show slower convergence when copy numbers are low. Pathway reporters usually show fractionally slower convergence and fractionally more variation than a dual reporter, as suggested by the standard deviations in Tables 2 and 3. A more detailed exploration of convergence is given in the supplementary material (Appendix C).
Measuring noise from a facultative (bursty) gene
The most common mode of gene expression that is reported experimentally is burstiness Jones and Elf, 2018; Raj et al., 2006; Bahar Halpern et al., 2015; Suter et al., 2011; Larsson et al., 2019; Golding et al., 2005, in which mRNA are produced in short bursts with periods of inactivity in between. One example is gene regulation via repression, which naturally leads to periods of gene inactivity. Here, we consider a fourstage model of bursty gene expression, which incorporates both promoter switching and mRNA maturation; we denote this model by ${\mathbf{\mathbf{M}}}_{4}$; see Figure 4A (bottom left). This model has recently been considered in Cao and Grima, 2020, where the marginal distributions are solved in some limiting cases. In this model, the gene switches probabilistically between an active state (A) and an inactive state (I), at rates λ (onrate) and μ (offrate), respectively. In the active state, nascent mRNA is synthesised according to a Poisson process at rate ${K}_{N}$, while in the inactive state transcription does not occur. Nascent mRNA is spliced into mature mRNA at rate ${K}_{M}$; this in turn is later translated into protein, with rate ${K}_{P}$. Degradation of mRNA and protein occur independently of the promoter state at rates ${\delta}_{M}$ and ${\delta}_{P}$, respectively.
For this model, we have three natural candidates for pathway reporters: (a) nascent and mature mRNA (b) mature mRNA and protein, and (c) nascent mRNA and protein reporters. Below we show that nascent mRNA–protein reporters yield consistently good estimates of the extrinsic noise contribution ${\eta}_{\mathrm{ext}}^{2}$ to the total gene expression noise, while nascent–mature and mature RNA–protein reporters are reliable in some restricted cases. We begin by showing that each of the reporter pairs (a), (b), and (c) satisfy the Noise Decomposition Principle. We then demonstrate computationally, that despite a lack of independence between these reporter pairs, the pathwayreporter method can still be used to decompose the total gene expression noise at the transcriptional level. Throughout, we let ${X}_{N}$ denote the number of nascent mRNA, we let ${X}_{M}$ denote the number of mature mRNA, and let ${X}_{P}$ denote the number of proteins produced from the same gene. We also let $\mathbf{\mathbf{Z}}=\{\lambda ,\mu ,{K}_{N},{K}_{M},{K}_{P},{\delta}_{M},{\delta}_{P}\}$.
Following Raj et al., 2006, we assume that the transcription rate ${K}_{N}$ is large relative to the other parameters. We further assume that the maturation rate ${K}_{M}$ is large (i.e. ${K}_{M}>{\delta}_{M}$), which is supported by experiments Cao and Grima, 2020. Then, using the results of Raj et al., 2006 and arguments similar to those in Cao and Grima, 2020, it can be shown that the stationary averages for the nascent mRNA, mature mRNA and protein levels are given by,
respectively. The supplementary material (Appendix B) provides more detail on how these expressions can be obtained.
We consider first the nascentmature pathway reporters, case (a). From (14), it is easily seen that $\mathrm{E}({X}_{N};\mathbf{Z})=f({\mathbf{Z}}_{N}){K}_{N}\frac{\lambda}{(\lambda +\mu )}$ and $\mathrm{E}({X}_{M};\mathbf{Z})=g({\mathbf{Z}}_{M}){K}_{N}\frac{\lambda}{(\lambda +\mu )}$, where $f({\mathbf{Z}}_{N})=\frac{1}{{K}_{M}}$ and $g({\mathbf{\mathbf{Z}}}_{M})=\frac{1}{{\delta}_{M}}$. So the NDP holds, and the normalised covariance of $\mathrm{E}({X}_{N};\mathbf{Z})$ and $\mathrm{E}({X}_{M};\mathbf{Z})$ will identify the extrinsic noise on the transcription component $K}_{N}\frac{\lambda}{(\lambda +\mu )$. Consider now the matureprotein reporters, case (b). Again, we can see from (14) that $\mathrm{E}({X}_{M};\mathbf{Z})=f({\mathbf{Z}}_{P})\mathrm{E}(X;\mathbf{Z})$, where $f({\mathbf{Z}}_{P})=\frac{{K}_{P}}{{\delta}_{P}}$. Thus, the NDP holds, and so the normalised covariance of $\mathrm{E}({X}_{M};\mathbf{Z})$ and $\mathrm{E}({X}_{P};\mathbf{Z})$ will identify the total noise on $\mathrm{E}({X}_{M};\mathbf{Z})$ (the extrinsic noise on ${X}_{M}$). For the nascentprotein reporters, case (c), it is easy to see that $\mathrm{E}({X}_{N};\mathbf{Z})=f({\mathbf{Z}}_{N}){K}_{N}\frac{\lambda}{(\lambda +\mu )}$, where $f({\mathbf{Z}}_{N})=\frac{1}{{K}_{M}}$, and $\mathrm{E}({X}_{P};\mathbf{Z})=g({\mathbf{Z}}_{P}){K}_{N}\frac{\lambda}{(\lambda +\mu )}$, where $g({\mathbf{Z}}_{P})=\frac{{K}_{P}}{{\delta}_{M}{\delta}_{P}}$. Thus, again the NDP holds, and the normalised covariance of $\mathrm{E}({X}_{N};\mathbf{Z})$ and $\mathrm{E}({X}_{P};\mathbf{Z})$ will identify the noise on the transcriptional component $K}_{N}\frac{\lambda}{(\lambda +\mu )$.
In order for the pathwayreporter method to provide a close approximation to the extrinsic noise in cases (a), (b), and (c), we require that the normalised intrinsic contribution to the covariance is either zero or negligible. This condition will hold provided there is sufficiently small correlation between the reporter pairs. In the case of (prokaryotic) mRNA and protein, this lack of correlation has been been verified experimentally in E. coli (Taniguchi et al., 2010). More generally, it is possible to provide an intuition about the conditions under which the lack of correlation might hold. The time series of copy numbers for each of nascent mRNA, mature mRNA and protein broadly follow each other, each with delay from its predecessor (Figure 4B). Parameter values that reduce this delay will tend to increase correlation, and thereby increase the normalised intrinsic contribution to the covariance. The primary example of this effect is seen when ${\delta}_{p}$ approaches, or even exceeds ${\delta}_{m}$ (or for nascentmature reporters, when ${\delta}_{M}$ approaches the maturation rate). A further contributor to high correlation between mRNA and protein, is when the system undergoes long timescale changes. In this situation, the copy numbers tend to drop to very low values for extended periods. The primary parameter influencing this type of behaviour is the activerate λ, specifically, when λ tends to 0 (Figure 4C). An illustrative example of this can be seen by considering a Telegraph system in the limit of slow switching, which produces a copy number distribution that converges to a scaled PoissonBernoulli compound distribution: even without any extrinsic noise, the pathway reporter method will identify the ${\eta}^{2}$ value of the corresponding scaled Bernoulli distribution.
An extensive computational exploration of the parameter space (Supplementary file 2) supports our intuition, though the strength of the effect varies across the three different reporter pairs. This is further corroborated by the heatmaps shown in Figure 5, which for three fixed values of μ and a broad spectrum of ${\delta}_{p}$ and λ values, give the intrinsic term ${\eta}_{\mathrm{int}}^{2}$ in (8), for fixed $\mathbf{\mathbf{Z}}$. Thus, the heatmaps provide an estimate for the overshoot error in the pathwayreporter approach. Note that blue pixels represent an overshoot estimate of less than 0.05.
For nascentprotein reporters, the normalised intrinsic contribution to the covariance is satisfactorily small (less than 0.05) except for (a) high values of ${\delta}_{p}$ in unison with (b) low values of λ (less than 1, although lower values are acceptable if ${\delta}_{p}$ is small). The assumption (a) that ${\delta}_{P}<{\delta}_{M}$ is known to be true for a large number of genes, and is justified by the difference in the mRNA and protein lifetimes. While there is of course variation across genes and organisms, values of ${\delta}_{P}\le 0.5{\delta}_{M}$ and even ${\delta}_{P}\le 0.2{\delta}_{M}$ seem reasonable for the majority of genes. In E. coli Taniguchi et al., 2010 and yeast Belle et al., 2006, for example, mRNA are typically degraded within a few minutes, while most proteins have lifetimes at the level of 10 s of minutes to hours. For mammalian genes Schwanhäusser et al., 2011, it has been reported that the median mRNA decay rate ${\delta}_{M}$ is (approximately) five times larger than the median protein decay rate ${\delta}_{P}$, determined from 4200 genes. Assumption (b) requires that the gene is sufficiently active. In a recent paper by Larsson et al., 2019, the promoterswitching rates λ, μ, and the transcription rate $K$ of the Telegraph model are estimated from singlecell data for over 7000 genes in mouse and human fibroblasts. Of those genes with mean mRNA levels greater than 5, we found that over 90% have a value for λ of at least 0.5, and over 65% have a value for λ greater than 1. In Cao and Grima, 2020, a comprehensive list of genes (ranging from yeast cells through to human cells) with experimentally inferred parameter values are sourced from across the literature (see Table 1 in Cao and Grima, 2020). After scaling the parameter values of the 26 genes reported there, we find that around 88% have a value for λ of at least 0.5, and approximately 58% have a value for λ greater than 1. Thus, the heatmaps given in Figure 5 (top panel) suggest that nascentprotein reporters will provide a satisfactory estimate of the extrinsic noise level for a substantial fraction of genes.
The mature nRNA–protein reporters are less reliable, with the requirement of slow protein decay and higher onrate being more pronounced than for the nascent mRNA–protein reporters; this is evident from Figure 5 (second panel). The performance of the nascent–mature reporter is of course independent of ${\delta}_{p}$, but is only viable in the case of a large onrate (see Figure 5—figure supplement 1).
We test our approach for each of the pathway reporter pairs (a), (b), and (c) against dual reporters using stochastic simulations. Table 4 shows the results from six simulations across a spectrum of behaviours from moderately slow switching, to fast switching as well as a range of levels of burstiness. For each of the parameters $\mu ,\lambda ,{K}_{P},{\delta}_{P}$, we selected a scaled $\mathrm{B}\mathrm{e}\mathrm{t}\mathrm{a}(5,6)$ distribution, with squared coefficient of variation ${\eta}^{2}=0.1$; the scaling is chosen in each case to achieve a mean value equal to the parameter value given in Table 4. It is routinely verified that scaling these distributions does not change the value of ${\eta}^{2}$. The parameter ${K}_{N}$ is given the noise distribution $\mathrm{B}\mathrm{e}\mathrm{t}\mathrm{a}(3,6)$, which has a slightly higher coefficient of variation ${\eta}^{2}=0.2$. In order to achieve direct benchmarking against the dual reporters, the parameter ${K}_{M}$ is fixed; we select ${K}_{M}=10$. This is because the nascentprotein pathway reporter estimates noise on the value of ${K}_{N}\frac{\lambda}{\lambda +\mu}$, while the mature mRNA dualreporter measures noise on $\frac{{K}_{N}}{{K}_{M}}\frac{\lambda}{\lambda +\mu}$, and these coincide only when ${K}_{M}$ is fixed. The mean values of ${K}_{N}$ and ${K}_{P}$ are chosen to achieve approximate average nascent mRNA levels, mature mRNA levels and protein levels at 5, 50, and 1000 respectively, given the chosen values of $\lambda ,\mu ,{\delta}_{P}$.
The results for the nascent mRNA–protein reporters, case (c), given in Table 4 show comparable performance to dual reporters, with only modest overshoot; even in the worst performing case of $\lambda =0.5$, $\mu =1$ the result of the pathway reporters is within one standard deviation, in a very tight distribution. The error heatmaps of Figure 5 provide an accurate estimate of the overshoot in the nascentprotein results in Table 4. As an example, the first row is most closely matched by the heatmap at top left of Figure 5, which at $\lambda =0.5$ and ${\delta}_{P}=0.1$ is suggestive of an error around the boundary between blue and red (around 0.06). The same accuracy is obtained for the other rows. As predicted, the matureprotein reporters show significantly more overshoot, especially with the less active genes. Improved accuracy can again be obtained by subtracting the estimated overshoot given in the error heatmaps from the obtained value. Thus for example, the error heatmap for $\mu =2$ (Figure 5 lower left) gives an error approximately 0.07 for $\lambda =1,{\delta}_{P}=0.1$, which agrees very closely to the actual overshoot of 0.07 shown in the corresponding row of Table 4. An overshoot of approximately 0.06 is suggested by the heatmap for $\mu =2$, when $\lambda =2,{\delta}_{P}=0.3$, which leads to a correction from 0.35 in Table 4 to a value of 0.29. This is quite consistent with the dual reporter benchmark of 0.27. As expected (based on Figure 5—figure supplement 1), nascentmature reporters do not perform well on bursty systems except for high λ and so the values are not included in Table 4; only in the case of $\lambda =\mu =10$ does the result begin to approach the dual reporter value, returning $0.32\pm 0.03$.
Generality of the pathway reporter method
To test the robustness of our pathway reporter approach, we validate our theoretical results on various other gene expression dynamics. (1): We begin by considering a more detailed model of the mRNA maturation process, where the nascent mRNA maturate after a fixed amount of time. The assumption of a fixed maturation time is sometimes taken to approximate the combined effect of intermediate maturation processes such as initiation, elongation, and release Xu et al., 2016. More specifically, we consider the model ${\mathbf{\mathbf{M}}}_{4}$ above (Figure 4D) in the case of constitutive expression ($\lambda =1$, $\mu =0$), and replacing the firstorder reaction ${K}_{M}$ by a fixed interval of time. Additionally, we explore maturation times sampled from Erlang distributions, to account for the fact that maturation can involve several shorter stochastic processes. We find that the extrinsic noise contribution obtained using the nascent and mature mRNA reporters match closely to the true (dual reporter) values across a range of maturation times; refer to the supplementary material for details.
(2): Next we consider an extension of a model of transcriptional bursting given in Bartman et al., 2019; Cao et al., 2020. The model we consider is the same as in Bartman et al., 2019; Cao et al., 2020, however, is extended to include protein synthesis and degradation. This model captures the transcriptional process at a finer level of detail, and is argued in Cao et al., 2020 to be the model most closely matching experimental observations. In this more nuanced ‘multiscale’ model, the gene stochastically switches between three states: two active states S_{10} and S_{11}, and one inactive state S_{0}. The activation of the gene occurs in two steps, initially by the binding of transcriptional factors (transition from S_{0} to S_{10} at rate ${\lambda}_{1}$, and reversible at rate ${\mu}_{1}$), and then as a secondary step, by the binding and pause of the mRNA polymerase (transition from S_{10} to S_{11} at rate ${\lambda}_{2}$). Transitions from S_{11} to S_{0} also occur at rate ${\mu}_{1}$, due to detachment of both the transcriptional factors and polymerase. Transcription of nascent mRNA (at rate ${K}_{N}$) occurs only in state S_{11} and results in immediate transition to state S_{10}. Nascent mRNA maturate at rate ${K}_{M}$, and are subsequently translated into protein at rate ${K}_{p}$. Degradation of mRNA and protein occur with rates ${\delta}_{m}$ and ${\delta}_{p}$, respectively. All reactions are considered to be firstorder with exponentially distributed waiting times between successive reactions. A schematic of the model is given in Figure 6 (inner rectangle). In this case, we are again able to prove that the Noise Decomposition Principle holds for all reporter pairs taken from this pathway using existing formulæ derived in Cao et al., 2020 for the marginal means. For details refer to the supplementary material (Appendix Convergence of Pathway and Dual Reporters).
(3): We combine models (1) and (2) above, incorporating the fixed time maturation of (1) with the multiscale approach of (2).
(4): The cell cycle is a major contributing factor to extrinsic noise, introducing population heterogeneity (as cells are at different stages of the cell cycle), as well as internal dynamics to parameter values. Here we incorporate the salient features of the cell cycle into model (3), which is measurable as extrinsic noise by our methodology. Specifically, we model the effects of (i) gene replication, (ii) dosage compensation, (iii) binomial partitioning of products due to cell division, as well as (iv) cellcycle length variability. Refer to Figure 6 for a schematic. This model is an extension of that solved in Cao and Grima, 2020. Our model further incorporates the multiscale dynamics of model (3) and the Erlangdistributed maturation times of model (1). As far as we are aware, this model has not been explored even by stochastic simulations before. A detailed description of how the above cellcycle effects are captured in our model is given as follows. (i) Replication results in a doubling of the gene from one to two at the replication time, t_{r}. This replication occurs over a period which is much shorter than the length of the cell cycle, and we follow the assumptions in Cao and Grima, 2020 by considering it to occur instantaneously. (ii) Dosage compensation changes the rate at which the gene switches from inactive to active (${\lambda}_{1}$) upon replication at time t_{r}. Again following Cao and Grima, 2020, the activation rate after replication is 70% of the rate prior to replication. (iii) Binomial partitioning of nascent mRNA, mature mRNA and protein at cell division. We assume that nascent mRNA, mature mRNA, and protein segregate into the two daughter cells, with independent probability $1/2$. We follow just one of the daughter cells, chosen at random with equal probability. (iv) Cellcycle length variability. The time between successive cell divisions is stochastic, and is assumed to be sampled from an Erlang distribution. This has been shown to be consistent with experimental data Cao and Grima, 2020. The time to replication, and subsequently to cell division, are both chosen from an Erlang distribution with shape parameter 12, which produces a total cell cycle length distributed according to $\mathrm{E}\mathrm{r}\mathrm{l}\mathrm{a}\mathrm{n}\mathrm{g}(24,\lambda )$; this matches the $\mathrm{E}\mathrm{r}\mathrm{l}\mathrm{a}\mathrm{n}\mathrm{g}(24,\lambda )$ cell cycle length in Cao and Grima, 2020, where replication is at the exact halfway point. We similarly choose a rate parameter $\lambda ={\lambda}_{1}$ chosen at a commensurate proportion to our mRNA decay rate of $\delta =1$.
In each of the cases (1 – 4) above, we find that the correlations between reporter pairs is negligible, and the predicted contribution of extrinsic noise matches those obtained from the dual reporter method across a range of parameter combinations. Details of the simulation methods and results can be found in the supplementary material (Appendix Convergence of Pathway and Dual Reporters). In summary, the results show that our pathway reporter approach is remarkably insensitive to the specific dynamics of mRNA and protein synthesis. In particular, the correlations between reporter pairs do not strongly depend on the details of the gene expression model used.
Discussion
Despite the proliferation of experimental methods for singlecell profiling, the ability to extract transcriptional dynamics from measured distributions of mRNA copy numbers is limited. In particular, the multiple factors that contribute to mRNA heterogeneity can confound the measured distribution, which hinders analysis. Theoretical innovations that allow us to quantify and help in identifying the causes of these observable effects are therefore of great importance. In this work, we have demonstrated, through a series of mathematical results, that it is impossible to delineate the relative sources of heterogeneity from the measured transcript abundance distribution alone: multiple possible dynamics can give rise to the same distribution. Our approach involves establishing integral representations for distributions that are commonly encountered in singlecell data analysis, such as the negative binomial distribution and the stationary probability distribution of the Telegraph model. We show that a number of wellknown representations can be obtained from our results. A particular feature of our nonidentifiability results is that population heterogeneity inflates the apparent burstiness of the system. It is therefore necessary to collect further information, beyond measurements of the transcripts alone, in order to constrain the number of possible theoretical models of gene activity that could represent the system. In particular, additional work may be required to determine the true level of burstiness of the underlying system.
We have developed a theoretical framework for estimating levels of extrinsic noise, which can assist in resolving nonidentifiability problems. The dual reporter method of Swain et al., 2002 already provides one such approach; but it is experimentally challenging to set up in many systems, and requires strictly identical and conditionally independent pairs of gene reporters. Our Noise Decomposition Principle directly generalises the theoretical underpinnings of the dual reporter method and related approaches Bowsher and Swain, 2012; Jetka et al., 2018; here we have used it to identify a practical approach—the pathwayreporter method—for obtaining an effective and experimentally more easily implementable noise decomposition. Our approach allows us to use measurements of two different species from the transcriptional pathway of a single gene copy instead of having to set up a more cumbersome dual reporter assay. The accuracy of the pathwayreporter method is provably identical for constitutive gene expression, and in the case of nascentmature mRNA reporters, the measurements are readily obtainable from current singlecell data Shah et al., 2018; La Manno et al., 2018; Skinner et al., 2016. For bursty systems, the method in general provides only an approximation of the extrinsic noise. We are, however, able to demonstrate computationally, that one of the proposed pathway reporters provides a satisfactory estimate of the extrinsic noise for most genes. The other pathway reporters also provide viable estimates of the extrinsic noise in some cases. We further validate our theoretical framework on synthetic data for genes with various underlying gene expression dynamics. Our results show that the pathway reporter method is independent of the specific dynamics of mRNA and protein synthesis, and therefore should be applicable to a broad range of experimental scenarios.
Despite the generality of our theoretical contribution, our pathwayreporter approach has some caveats. In particular, the approach relies on the assumption that extrinsic noise sources act independently. Experimentally, however, these may be correlated. For example, it has been suggested Hilfinger et al., 2016; Cole and LutheySchulten, 2017 that the transcription and translation rates in E. coli anticorrelate. Additional work is required to determine degree to which the independence of noise sources is a reasonable assumption.
Recent developments in singlecell profiling now allow simultaneous measurements of transcripts and proteins in thousands of single cells Stoeckius et al., 2017; Peterson et al., 2017; Reimegård et al., 2019. As discussed in Gorin and Pachter, 2020a, experimental improvements that would additionally allow measurements of nascent transcripts, coupled with theoretical developments such as those presented here, will allow for identification of noise sources on a genomewide scale. Our work reveals that extrinsic noise distorts the multivariate copy number distribution of the different species in the gene expression pathway. We have exploited this to yield reliable estimates of noise strength, which we are confident will assist in setting better practices for model fitting and inference in the analysis of singlecell data. A more nuanced analysis of this multivariate distribution may offer even further insight into model and noise identifiability.
Appendix 1
Derivations of the nonidentifiability results
In the main text we provide a number of examples of when the compound distribution does not have a unique representation. We here provide the full details of the derivations of these results.
Bursty expression: telegraph representation
Consider first a Telegraph distribution ${\stackrel{~}{p}}_{T}(n;\lambda ,{\mu}^{\prime},{K}^{\prime})$ and the probability density function of the scaled beta distribution ${\mathrm{B}\mathrm{e}\mathrm{t}\mathrm{a}}_{{K}^{\mathrm{\prime}}}(\lambda +\mu ,{\mu}^{\mathrm{\prime}}\mu )$, given by
Note that this distribution has support $[0,{K}^{\prime}]$. In the main text, we claim that the Telegraph distribution ${\stackrel{~}{p}}_{T}(n;\lambda ,{\mu}^{\prime},{K}^{\prime})$ can be obtained from compounding a Telegraph distribution by a scaled beta distribution with pdf $f(t;\lambda +\mu ,{\mu}^{\prime}\mu )$. In other words, that ${\stackrel{~}{p}}_{T}(n;\lambda ,{\mu}^{\prime},{K}^{\prime})$ can be written as:
This is Equation (3) in the main text. Starting from the right hand side of (15), we have
Substituting $t={K}^{\prime}T$ and simplifying, the right hand side of (16) becomes
Now, using the integral representation Olver et al., 2010 [13.4.2] of the confluent hypergeometric function (with $a=\lambda +n$, $b=\lambda +{\mu}^{\prime}+1$ and $c=\lambda +\mu +1$), this becomes
which is the left hand side of (15). Hence, we have that (15) holds.
Instantaneously bursty expression: negative binomial representations
We consider first the representation given in Equation (4) in the main text. Recall that we let ${\stackrel{~}{p}}_{\text{NB}}(n;r,\beta )$ denote the probability mass function of a $\mathrm{N}\mathrm{e}\mathrm{g}\mathrm{B}\mathrm{i}\mathrm{n}(r,\frac{\beta}{\beta +1})$ distribution, where $\beta >0$. In the main text, we claim that for any negative binomial distribution of the form $\mathrm{N}\mathrm{e}\mathrm{g}\mathrm{B}\mathrm{i}\mathrm{n}(\lambda ,\frac{\beta}{\beta +1})$ (where $\beta >0$) we have,
where $f(t;\lambda +\mu ,\beta )$ is the probability mass function of a $\mathrm{G}\mathrm{a}\mathrm{m}\mathrm{m}\mathrm{a}(\lambda +\mu ,\beta )$ distribution. Beginning with the right hand side of (17) we have,
We now apply the following identity given in Saad and Hall, 2003 [Appendix 1] with $a=\lambda +n$, $b=\lambda +\mu +n$, $k=1$, $d=\lambda +\mu +n$ and $h=\beta $.
So the left hand side of (18) becomes
which is the right hand side of (17). Hence, we have that (17) holds. We now consider the representation given in Equation (5) of the main text. Here we claim that any negative binomial distribution of the form $\mathrm{N}\mathrm{e}\mathrm{g}\mathrm{B}\mathrm{i}\mathrm{n}({\lambda}^{\mathrm{\prime}},\frac{b}{1+b})$ (where $b>0$) can be written as,
where ${f}_{b}(b\theta 1;\lambda {\lambda}^{\prime},{\lambda}^{\prime})$ is the probability mass function of a scaled beta prime ${\mathrm{B}\mathrm{e}\mathrm{t}\mathrm{a}\mathrm{P}\mathrm{r}\mathrm{i}\mathrm{m}\mathrm{e}}_{b}(\lambda {\lambda}^{\mathrm{\prime}},\lambda )$ distribution, where $b>0$ and $\lambda >{\lambda}^{\prime}$. Starting from the right hand side of (20), we have
Now substituting $H=b\theta 1$ and simplifying, we obtain
and letting $y=b+1$ and simplifying we have that
Here, we used the fact that the integral in the variable $y$ is the probability density function of a $\mathrm{B}\mathrm{e}\mathrm{t}\mathrm{a}\mathrm{P}\mathrm{r}\mathrm{i}\mathrm{m}\mathrm{e}(\lambda {\lambda}^{\mathrm{\prime}},{\lambda}^{\mathrm{\prime}}+n)$ distribution. Thus, Equation (21) simplifies to
which is the right hand side of (20). Thus, we have that (20) holds.
Appendix 2
Proof of the Noise Decomposition Principle
Here we provide the proof of the Noise Decomposition Principle given in the main text. For convenience we restate the principle below.
The Noise Decomposition Principle (NDP)
Assume that there are measurable functions $f$, $g$ and $h$ such that $\mathrm{E}(X;\mathbf{Z})$ and $\mathrm{E}(Y;\mathbf{Z})$ split across common variables by way of $\mathrm{E}(X;\mathbf{Z})=f({\mathbf{Z}}_{X})h({\mathbf{Z}}^{\mathrm{\prime}})$ and $\mathrm{E}(Y;\mathbf{Z})=g({\mathbf{Z}}_{Y})h({\mathbf{Z}}^{\mathrm{\prime}})$. Then provided that the variables ${Z}_{1},\mathrm{\dots},{Z}_{m}$ are mutually independent, the normalised covariance of $\mathrm{E}(X;\mathbf{Z})$ and $\mathrm{E}(Y;\mathbf{Z})$ will identify the total noise on $h({\mathbf{Z}}^{\mathrm{\prime}})$ (i.e., ${\eta}_{h({Z}^{\prime})}^{2}$).
Consider first the covariance of $\mathrm{E}(X;\mathbf{Z})$ and $E(Y;\mathbf{\mathbf{Z}})$. We have
Note that here we used the fact that the variables in $\mathbf{Z}=({Z}_{1},\dots ,{Z}_{n})$ are mutually independent. Now using the fact that $\mathrm{E}(X)=\mathrm{E}(\mathrm{E}(X;\mathbf{Z}))$ and $\mathrm{E}(Y)=\mathrm{E}(\mathrm{E}(Y;\mathbf{Z}))$ (the Law of Total Expectation), and then normalising we obtain
Hence, under certain conditions the normalised covariance of $\mathrm{E}(X;\mathbf{Z})$ and $\mathrm{E}(Y;\mathbf{Z})$ will identify the total noise on $h({\mathbf{\mathbf{Z}}}^{\prime})$.
Appendix 3
Pathway reporters
Justification for simulating only one copy of the gene
Our simulations and theory have been based over reporters from a single gene copy, whereas in practice there may be multiple copies of the gene that contribute to the overall mRNA. If there are mechanisms in place to distinguish the mRNA or protein of one gene copy from another then the theory and analysis we have developed in main paper holds without change. In the case where it is not possible to distinguish mRNA and protein (respectively) from the multiple gene copies, then we now observe that the general theory continues to hold, provided there is independence between the gene copies; this assumption has been verified experimentally Skinner et al., 2016. Here, we demonstrate this in the case of two gene copies, though the general case for more than two genes is essentially identical but is notationally cumbersome. We are considering a situation where the variable $X$ in the Noise Decomposition Principle is the sum of two independent variables ${X}_{1},{X}_{2}$ and $Y$ is the sum of two independent variables ${Y}_{1},{Y}_{2}$. We assume common dependence on the environmental variables $\mathbf{\mathbf{Z}}$ so that $E({X}_{1};\mathbf{\mathbf{Z}})=E({X}_{2};\mathbf{\mathbf{Z}})$, $E({Y}_{1};\mathbf{\mathbf{Z}})=E({Y}_{2};\mathbf{\mathbf{Z}})$. Using these equalities and the independence of $X}_{1},{X}_{2$ and ${Y}_{1},{Y}_{2}$ in $X={X}_{1}+{X}_{2}$, $Y={Y}_{1}+{Y}_{2}$, we find the numerator of
is simply $4\mathrm{C}\mathrm{o}\mathrm{v}(\mathrm{E}({X}_{1};\mathbf{Z}),\mathrm{E}({Y}_{1};\mathbf{Z}))$, while the denominator is $4\mathrm{E}({X}_{1})\mathrm{E}({Y}_{1})$. Thus the noise decomposition coincides with that for the single copy $X}_{1},{Y}_{1$. Further work may be required to consider systems where there is independence between, or there is significant deviations in the gene copies.
Upper bound for the intrinsic contribution to the covariance: constitutive expression
In the main text, we claim that the error in the pathwayreporter approach in the case of mRNAprotein reporters is negligible (i.e. the error is $\ll 1$); refer to Equation (13) in the main text. We here provide full details of the derivation of this expression. First let ${X}_{m}$ and ${X}_{p}$ be the number of mRNA and protein produced from the same constitutive gene modelled by the 'twostage' model, ${\mathbf{\mathbf{M}}}_{2}$ (see Figure 4A (top right) of the main text). Also, let $\mathbf{Z}=\{{K}_{m},{\delta}_{m},{K}_{p},{\delta}_{p}\}$.
We restate here the expression for the intrinsic contribution to the covariance of ${X}_{m}$ and ${X}_{p}$ given as Equation (13) of the main text.
We require the following expressions for the stationary mean mRNA level and protein level of the twostage model Thattai and van Oudenaarden, 2001; Singh and Hespanha, 2007.
Assuming that ${\delta}_{m}$ is fixed across the cellpopulation, and all parameters are scaled so that ${\delta}_{m}=1$, it follows from (28), that
Using the Total Law of Expectation, we also have,
Thus, it follows that
Determination of the marginal means for model ${\mathbf{\mathbf{M}}}_{4}$
In order to establish the Noise Decomposition principle in the case of bursty gene expression, we rely on expressions for the marginal stationary means of the model ${\mathbf{\mathbf{M}}}_{4}$; see Figure 4A (bottom left) and the associated caption for more details. To derive the marginal means for nascent and mature mRNA and for protein (Equation (14) of the main text), we first observe that the nascent mRNA population may treated identically to that of mRNA in general (that is, no distinction between nascent and mature), as in Peccoud and Ycart, 1995, except that mRNA decay is replaced by the sum of decay and maturation. As in the work of Cao and Grima, 2020, the assumption of fast maturation allows us to ignore decay completely in the nascent phase, so that the marginal distribution is identical to that of Peccoud and Ycart, 1995, except with decay replaced by maturation. This leads to a marginal nascent mRNA mean of
The marginal means for mature mRNA and protein are derived in Raj et al., 2006 under the assumption that the transcription rate parameter ${K}_{N}$ is large relative to the other parameters. The expressions are given by
Formally, the marginal means in Raj et al., 2006 are for the threestage model ${\mathbf{\mathbf{M}}}_{3}$, which ignores the downstream processing of mRNA, such as splicing. The assumption of fast maturation however, justifies the treatment of the nascent phase of mRNA as an ephemeral step within the Poissonian modelling of mRNA transcription.
There are a number of possibly compounding assumptions on the parameters here, but simulations show that there is a lot of tolerance, with even only moderate maturation and transcription still returning sample means consistent with the formulas.
Measuring noise from an instantaneously bursty gene
Here we use the simple stochastic model of Singh and Bokes, 2012 that includes both instantaneous transcriptional bursting and mRNA maturation. This model can be obtained as a limiting case of the model ${\mathbf{\mathbf{M}}}_{4}$ (Figure 4A (top left) in the main text), where the offrate μ has tended toward infinity, while the onrate λ remains fixed. Under this condition, transcription is rare enough to be considered instantaneous, leading to ‘bursts’ of transcriptional activity. The intensity of the bursts $M$ is known to be geometrically distributed with mean burst size $B=\frac{{K}_{N}}{\mu}$.
If we let ${X}_{N}$ and ${X}_{M}$ denote the number of nascent and mature mRNA, respectively, then according to Singh and Bokes, 2012, the steadystate marginal means and covariance are given by
It is easily seen from (32) that $\mathrm{E}({X}_{N};\mathbf{Z})=f({\mathbf{Z}}_{N})B\lambda$ and $\mathrm{E}({X}_{M};\mathbf{Z})=g({\mathbf{Z}}_{M})B\lambda$, where $f({\mathbf{Z}}_{N})=\frac{1}{{K}_{M}}$ and $g({\mathbf{Z}}_{M})=\frac{1}{{\delta}_{M}}$. Thus, the Noise Decomposition Principle holds. Using arguments similar to those given in the above section, it is straightforward to derive the following expression for the error in the pathway reporter approach:
We can expect that $\alpha \approx 1$, so that unless the burst frequency is substantial, that is $\lambda \gg 1$, the overshoot in the the nascentmature pathwayreporter approach is significant.
Appendix 4
Convergence of pathway and dual reporters
Convergence of both pathway reporters and dual reporters in low output genes is slower than for high output genes, and this affect is most pronounced for reporters taken from nascent mRNA. Fig. Convergence of Pathway and Dual Reporters compares convergence in some low output genes and high output genes for nascentmature pathway reporters and the maturemature dual reporter, in the case of constitutive expression. From Equation (11) of the main text and the NDP, these should measure precisely the extrinsic noise on the transcription rate parameter ${K}_{N}$. In the low output gene, both the dual reporter and pathway reporters are yet to show accurate measurement of the overall extrinsic noise after 1000 samples (Figure Convergence of Pathway and Dual ReportersA), although they are providing rough estimates after as little as a few hundred samples. Pathway reporters for nascentmature typically exhibit slightly slower convergence than maturemature dual reporter; the difference is marginal, but can be seen for both the high output gene (compare the values after 500 samples) and the low output gene (compare the values after 1000 samples). In this figure, all parameters (except the reference parameter ${\delta}_{M}$) were given scaled Beta distribution noise. The noise on ${K}_{N}$ is a $\mathrm{B}\mathrm{e}\mathrm{t}\mathrm{a}(3,6)$ distribution scaled to achieve $E({K}_{N})=5$ in the low output case and $E({K}_{N})=50$ in the high output case. The squared coefficient of variation is 0.2, and is shown in green in the figure.
Convergence for the matureprotein reporters is considered in Figure Convergence of Pathway and Dual Reporters. Here, Equation (13) of the main text shows that we should expect an overshoot in comparison to the maturemature dual reporter. We have again compared a low output gene with a high output gene, and with all parameters (except ${\delta}_{M}$) experiencing noise. The transcription is again given scaled $\mathrm{B}\mathrm{e}\mathrm{t}\mathrm{a}(3,6)$ distribution, having squared coefficient of variation 0.2, with scaling to achieve $E({K}_{N})=5$ in low output and $E({K}_{N})=50$ in the high output. The noise distribution for ${\delta}_{P}$ is a $\mathrm{B}\mathrm{e}\mathrm{t}\mathrm{a}(8,6)$ distribution, scaled to achieve a mean value of 0.2. Computational sampling from 10^{7} samples finds the value of $\frac{\mathrm{E}(1/({\delta}_{p}+1))}{\mathrm{E}(1/{\delta}_{p})}$ as approximately 0.1573. The high output gene then expects an overshoot of 0.003146 from the matureprotein reporters, while the low output gene expects an overshoot of approximately 0.03146. Figure Convergence of Pathway and Dual ReportersA shows comparison of convergence over the first 2000 samples, with the theoretical values accommodating the calculated overshoot. Figure Convergence of Pathway and Dual ReportersB shows the same over 20,000 samples; this time two lowoutput two highoutput genes are considered, so that the variation around the expected long term value can be seen. It is evident that reasonable estimates are given after a relatively modest number of samples, but there is a very long delay to a highly accurate convergence for the pathway reporters in the low output gene.
Appendix 5
Generality of the pathway reporter method: model and simulation details
We verify our pathway reporter approach on synthetic data for genes with various underlying gene expression dynamics. In the main text, we consider four different scenarios, beyond the standard fourstage model of gene transcription. Below we give details of these models, and their method of simulation.
(1): First, we consider a more detailed model of the mRNA maturation process, where the nascent mRNA maturate after a more finely controlled time interval. Specifically we adjust the constitutive model of gene transcription (${\mathbf{\mathbf{M}}}_{4}$ of the main text with $\lambda =1$ and $\mu =0$) to include a fixedduration of maturation. We also explore timeintervals sampled from Erlang distributions, to account for the fact that maturation can involve several shorter stochastic processes. The process is simulated using an adaptation of SSA, whereby the time of maturation of an individual nascent mRNA is calculated (by sample from the chosen distribution) at the point it is transcribed, and this timepoint is queued in a register of maturation times; this can be trivially adapted to other distributions; generalisedErlang for example. At each iteration of the simulation, determination of the next event involves taking the smallest of the various exponentially sampled Markov processes (on, off, mRNA transcription, etc) and the next maturation event (the minimum of the maturation register). In the case that maturation is the next event, the corresponding maturation time is removed from the maturation register.
(2): We next consider a model of transcriptional bursting given in Bartman et al., 2019; Cao et al., 2020. In this ‘multiscale’ model, gene activation occurs in two steps, initially by the binding of transcription factors, and then as a secondary step by the binding and pause of the mRNA polymerase. For details of the model refer to Figure 6 and the surrounding paragraphs in the main text. We here show that our Noise Decomposition Principle holds for all reporters taken from this gene pathway, and verify computationally that correlations between some of the reporter pairs are negligible; the process is simulated by the SSA. In the following, we assume all reactions are firstorder, characterised by exponential waiting times between successive reactions. Let ${X}_{N}$ denote the number of nascent mRNA, let ${X}_{M}$ denote the number of mature mRNA, and let ${X}_{P}$ denote the number of proteins produced from the same gene. Also let $\mathbf{\mathbf{Z}}=\{{\lambda}_{1},{\mu}_{1},{\lambda}_{2},{K}_{N},{K}_{M},{K}_{P},{\delta}_{M},{\delta}_{P}\}$. We assume that the maturation rate ${K}_{M}$ is large (i.e. ${K}_{M}>{\delta}_{M}$), which is supported by experiments Cao and Grima, 2020. Then using the results of Cao et al., 2020, the stationary averages for the nascent mRNA, mature mRNA and protein levels are given by,
respectively. Here $\gamma}_{1}={\lambda}_{1}+{\mu}_{1$ and ${\gamma}_{2}={K}_{N}+{\lambda}_{2}+{\mu}_{1}$. We begin by considering the nascentmature pathway reporters. From (34), it is easily seen that $\mathrm{E}({X}_{N};\mathbf{Z})=f({\mathbf{Z}}_{N}){K}_{N}\frac{{\lambda}_{1}{\lambda}_{2}}{{\gamma}_{1}{\gamma}_{2}}$ and $\mathrm{E}({X}_{M};\mathbf{Z})=g({\mathbf{Z}}_{M}){K}_{N}\frac{{\lambda}_{1}{\lambda}_{2}}{{\gamma}_{1}{\gamma}_{2}}$, where $f({\mathbf{Z}}_{N})=\frac{1}{{K}_{M}}$ and $g({\mathbf{\mathbf{Z}}}_{M})=\frac{1}{{\delta}_{M}}$. So the NDP holds, and the normalised covariance of $\mathrm{E}({X}_{N};\mathbf{Z})$ and $\mathrm{E}({X}_{M};\mathbf{Z})$ will identify the extrinsic noise on the transcriptional component ${K}_{N}\frac{{\lambda}_{1}{\lambda}_{2}}{{\gamma}_{1}{\gamma}_{2}}$. For the matureprotein reporters, we can again see from (34) that $\mathrm{E}({X}_{M};\mathbf{Z})=f({\mathbf{Z}}_{P})\mathrm{E}(X;\mathbf{Z})$, where $f({\mathbf{Z}}_{P})=\frac{{K}_{P}}{{\delta}_{P}}$. Thus the NDP holds, and so the normalised covariance of $\mathrm{E}({X}_{M};\mathbf{Z})$ and $\mathrm{E}({X}_{P};\mathbf{Z})$ will identify the strength of extrinsic noise in the mRNA copy number distribution. For the nascentprotein reporters, it is easy to see that $\mathrm{E}({X}_{N};\mathbf{Z})=f({\mathbf{Z}}_{N}){K}_{N}\frac{{\lambda}_{1}{\lambda}_{2}}{{\gamma}_{1}{\gamma}_{2}}$, where $f({\mathbf{\mathbf{Z}}}_{N})=\frac{1}{{K}_{M}}$, and $\mathrm{E}({X}_{P};\mathbf{Z})=g({\mathbf{Z}}_{P}){K}_{N}\frac{{\lambda}_{1}{\lambda}_{2}}{{\gamma}_{1}{\gamma}_{2}}$, where $g({\mathbf{\mathbf{Z}}}_{P})=\frac{{K}_{P}}{{\delta}_{M}{\delta}_{P}}$. Thus, again the NDP holds, and the normalised covariance of $\mathrm{E}({X}_{N};\mathbf{Z})$ and $\mathrm{E}({X}_{P};\mathbf{Z})$ will identify the noise on the transcriptional component $K}_{N}\frac{{\lambda}_{1}{\lambda}_{2}}{{\gamma}_{1}{\gamma}_{2}$.
(3): We combine models (1) and (2) above, incorporating the fixed time maturation of (1) with the finelevel transcriptional approach of (2). Again, we employ the adaptation of the SSA described in (1).
(4): We incorporate the salient features of the cellcycle into model (3), as well as introducing Erlangdistributed maturation times. Specifically, the model accounts for the effects of gene replication, dosage compensation, binomial partitioning of products due to cell division, as well as cellcycle length variability. Within each phase of the cell cycle, the handling of maturation times for a given gene copy is exactly as in Case (1) above, but within the overall system described in Case (3). To capture the cell cycle, the SSA is now further modified to handle the two key sporadic changes: gene replication and cell division. After gene replication, we perform two independent modified SSA simulations, with common parameter values. The final copy numbers are obtained as a sum of those from each gene copy. At cell division, we follow just one daughter cell, selecting the inherited copy numbers by way of binomial partitioning from the mother cell values at the point of division (including the maturation register). The length of these cellcycle phases is sampled from the chosen Erlang distribution, on commencement of the simulation of the phase.
The results of pathway reporters for each of the Cases (1 – 4) above are given in Appendix 5—tables 1–8 below. In all cases, the values given are the average of 20 simulations, each calculated from 500 copy number samples, and the errors are ± one standard deviation. For each model, we consider two parameter sets, each without extrinsic noise (Subcase A) and with extrinsic noise (Subcase B). For Case (1), the model parameters are chosen to produce an average nascent mRNA copy number of 10, and an average number of 2000 proteins; the average mature mRNA copy number varies according to the maturation time. We find that all of the pathway reporters accurately predict the true extrinsic noise levels (as given by dual reporters) across a range of maturation times; refer to Appendix 5—table 1, 2. For Cases (2 – 4), the model parameters are chosen to produce an average nascent mRNA copy number of 5 and an average mature mRNA copy number of 100, and an average number of 2000 proteins. We verify computationally that the matureprotein and nascentprotein reporter pairs provide accurate predictions of extrinsic noise contributions across a range of noise levels. For these results refer to Appendix 5—tables 3–8. We mention that the results of Case (4)B (Appendix 5—tables 8) may suggest a slight undershoot in the pathway reporter values in comparison to dual reporters. The difference is, however, within one standard deviation, and are calculated from a relatively small sample size due to the complexity of the simulation and corresponding run time. Investigating the presence and possible causes of this marginal effect for this model and other complicated models may be of interest in further work.
Data availability
All methods and simulation results are shared via a github site https://github.com/leham/PathwayReporters (copy archived at https://archive.softwareheritage.org/swh:1:rev:269e0fffe4fc716db6991ccf78ad2191e509c2e1). There is no original data associated with this manuscript.
References

BookHandbook of Mathematical Functions with Formulas, Graphs, and Mathematical TablesMassachusetts, United States: Courier Corporation.

Bursty gene expression in the intact mammalian liverMolecular Cell 58:147–156.https://doi.org/10.1016/j.molcel.2015.01.027

Exact and approximate distributions of protein and mRNA levels in the lowcopy regime of gene expressionJournal of Mathematical Biology 64:829–854.https://doi.org/10.1007/s0028501104335

A stochastic model of gene expression with polymerase recruitment and pause releaseBiophysical Journal 119:1002–1014.https://doi.org/10.1016/j.bpj.2020.07.020

Stochastic gene expression in a single cellScience 297:1183–1186.https://doi.org/10.1126/science.1070919

On a general class of "Contagious" DistributionsThe Annals of Mathematical Statistics 14:389–400.https://doi.org/10.1214/aoms/1177731359

NoncodingRNA regulators of RNA polymerase II transcriptionNature Reviews. Molecular Cell Biology 7:612–616.https://doi.org/10.1038/nrm1946

Special function methods for bursty models of transcriptionPhysical Review. E 102:022409.https://doi.org/10.1103/PhysRevE.102.022409

Extrinsic noise and HeavyTailed laws in gene expressionPhysical Review Letters 124:108101.https://doi.org/10.1103/PhysRevLett.124.108101

Exactly solvable models of stochastic gene expressionThe Journal of Chemical Physics 152:144106.https://doi.org/10.1063/1.5143540

Nonidentifiability of the source of intrinsic noise in gene expression from singleburst dataPLOS Computational Biology 4:e1000192.https://doi.org/10.1371/journal.pcbi.1000192

Solving the chemical master equation for monomolecular reaction systems analyticallyJournal of Mathematical Biology 54:1–26.https://doi.org/10.1007/s002850060034x

Small protein number effects in stochastic models of autoregulated bursty gene expressionThe Journal of Chemical Physics 152:084115.https://doi.org/10.1063/1.5144578

Bursting onto the scene? exploring stochastic mRNA production in bacteriaCurrent Opinion in Microbiology 45:124–130.https://doi.org/10.1016/j.mib.2018.04.001

Stochastic gene expression: modeling, analysis, and identificationIFAC Proceedings Volumes 42:1022–1028.https://doi.org/10.3182/200907063FR2004.00170

A stochastic model for gene inductionJournal of Theoretical Biology 153:181–194.https://doi.org/10.1016/s00225193(05)804217

Disentangling intrinsic and extrinsic gene expression noise in growing cellsPhysical Review Letters 126:078101.https://doi.org/10.1103/PhysRevLett.126.078101

Listening to the noise: random fluctuations reveal gene network parametersMolecular Systems Biology 5:318.https://doi.org/10.1038/msb.2009.75

BookNIST Handbook of Mathematical FunctionsCambridge, United Kingdom: Cambridge University Press.

Markovian modeling of GeneProduct synthesisTheoretical Population Biology 48:222–234.https://doi.org/10.1006/tpbi.1995.1027

Effects of cell cycle variability on lineage and population measurements of messenger RNA abundanceJournal of the Royal Society Interface 17:20200360.https://doi.org/10.1098/rsif.2020.0360

Multiplexed quantification of proteins and transcripts in single cellsNature Biotechnology 35:936–939.https://doi.org/10.1038/nbt.3973

Uncoupling gene expression noise along the central dogma using genome engineered human cell linesNucleic Acids Research 48:9406–9413.https://doi.org/10.1093/nar/gkaa668

Gene regulation at the singlecell levelScience 307:1962–1965.https://doi.org/10.1126/science.1106914

Integrals containing confluent hypergeometric functions with applications to perturbed singular potentialsJournal of Physics A: Mathematical and General 36:7771–7788.https://doi.org/10.1088/03054470/36/28/307

The stochastic nature of biochemical networksCurrent Opinion in Biotechnology 19:369–374.https://doi.org/10.1016/j.copbio.2008.06.011

Consequences of mRNA transport on stochastic variability in protein levelsBiophysical Journal 103:1087–1096.https://doi.org/10.1016/j.bpj.2012.07.015

ConferenceStochastic analysis of gene regulatory networks using moment closure2007 American Control Conference.

General properties of transcriptional time series in Escherichia coliNature Genetics 43:554–560.https://doi.org/10.1038/ng.821

Stochasticity of gene products from transcriptional pulsingPhysical Review. E, Statistical, Nonlinear, and Soft Matter Physics 79:031911.https://doi.org/10.1103/PhysRevE.79.031911

Simultaneous epitope and transcriptome measurement in single cellsNature Methods 14:865–868.https://doi.org/10.1038/nmeth.4380

Stochastic kinetics of nascent RNAPhysical Review Letters 117:128101.https://doi.org/10.1103/PhysRevLett.117.128101

Cellcycle dependence of transcription dominates noise in gene expressionPLOS Computational Biology 9:e1003161.https://doi.org/10.1371/journal.pcbi.1003161
Article and author information
Author details
Funding
University of Melbourne (DRM)
 Lucy Ham
 Michael PH Stumpf
Volkswagen Foundation (93 062)
 Michael PH Stumpf
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
The authors gratefully acknowledge Rowan D Brackston for helpful discussions in the early stages of this research. We thank Lior Pachter and Gennady Gorin for fruitful discussions on noise identifiability. We also wish to thank Arjun Raj for providing valuable feedback on this work. LH. and MPHS. were supported by the University of Melbourne Driving Research Momentum initiative.
Version history
 Preprint posted: September 30, 2020 (view preprint)
 Received: April 12, 2021
 Accepted: October 11, 2021
 Accepted Manuscript published: October 12, 2021 (version 1)
 Version of Record published: November 22, 2021 (version 2)
Copyright
© 2021, Ham et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics

 2,670
 views

 370
 downloads

 28
 citations
Views, downloads and citations are aggregated across all versions of this paper published by eLife.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading

 Chromosomes and Gene Expression
 Developmental Biology
Developmental programming involves the accurate conversion of signalling levels and dynamics to transcriptional outputs. The transcriptional relay in the Notch pathway relies on nuclear complexes containing the coactivator Mastermind (Mam). By tracking these complexes in real time, we reveal that they promote the formation of a dynamic transcription hub in Notch ON nuclei which concentrates key factors including the Mediator CDK module. The composition of the hub is labile and persists after Notch withdrawal conferring a memory that enables rapid reformation. Surprisingly, only a third of Notch ON hubs progress to a state with nascent transcription, which correlates with polymerase II and core Mediator recruitment. This probability is increased by a second signal. The discovery that targetgene transcription is probabilistic has farreaching implications because it implies that stochastic differences in Notch pathway output can arise downstream of receptor activation.

 Cancer Biology
 Chromosomes and Gene Expression
The chromatinassociated protein WD Repeat Domain 5 (WDR5) is a promising target for cancer drug discovery, with most efforts blocking an argininebinding cavity on the protein called the ‘WIN’ site that tethers WDR5 to chromatin. WIN site inhibitors (WINi) are active against multiple cancer cell types in vitro, the most notable of which are those derived from MLLrearranged (MLLr) leukemias. Peptidomimetic WINi were originally proposed to inhibit MLLr cells via dysregulation of genes connected to hematopoietic stem cell expansion. Our discovery and interrogation of smallmolecule WINi, however, revealed that they act in MLLr cell lines to suppress ribosome protein gene (RPG) transcription, induce nucleolar stress, and activate p53. Because there is no precedent for an anticancer strategy that specifically targets RPG expression, we took an integrated multiomics approach to further interrogate the mechanism of action of WINi in human MLLr cancer cells. We show that WINi induce depletion of the stock of ribosomes, accompanied by a broad yet modest translational choke and changes in alternative mRNA splicing that inactivate the p53 antagonist MDM4. We also show that WINi are synergistic with agents including venetoclax and BETbromodomain inhibitors. Together, these studies reinforce the concept that WINi are a novel type of ribosomedirected anticancer therapy and provide a resource to support their clinical implementation in MLLr leukemias and other malignancies.