Exploiting fluctuations in gene expression to detect causal interactions between genes
Figures
Causal interactions between genes can be inferred when non-genetic cell-to-cell variability violates the covariance invariant of Equation 2.
(A) Consider an arbitrary network of interacting cellular components in which an engineered reporter is introduced to act as a passive read-out of the transcriptional signal that regulates a gene of interest geneX, but itself does not regulate other cellular components. Any cellular component belongs to one of two groups: components affected by (red circles), and components not affected by (blue squares), where the arrows indicate directed causal biochemical effects. Specifically, an arrow from to denotes the existence of a reaction that changes with a rate that depends on (see Appendix 1 for details). (B) The dashed line of Equation 2 constrains the normalized covariance between and any cellular component not affected by (blue squares). In contrast, components affected by (red circles) are not constrained by Equation 2. A violation of Equation 2 thus implies the existence of a causal interaction from to . Data points are numerical simulations of specific example networks (see Appendix 1 for details) to illustrate the analytically proven theorem of Equation 2. (C) The invariant of Equation 2 applies not only to transcriptional reporters but also when and correspond to co-regulated fluorescent proteins with translation rates that scale with transcript abundances.
Numerical simulations of example systems confirm computationally that the analytically proven Equation 2 constrains cellular abundances and concentrations in growing and dividing cells under fairly general assumptions as long as does not affect .
(A) To numerically demonstrate the validity of Equation 2 we consider ten example birth-death processes covering six different network topologies with non-linear rates, closed-loop feedback, time-varying upstream signals, and fluctuating degradation rates. In all cases, chemical species and are co-regulated but do not affect a third chemical species of interest. See Appendix 1—table 1 for details of the simulated systems. (B) We consider three different cellular growth dynamics that affect molecular abundances through random partitioning of molecules at cell division and molecular concentrations through dilution. Additionally, system reaction rates depend in varying ways on the cell volume. At cell division, molecular abundances are assumed to be partitioned on average proportional to cell size. (C) Simulation results for each of the example topologies from panel A, subject to the different growth dynamics of panel B. Model parameters were varied over several orders of magnitude. The numerically obtained Pearson correlations do not satisfy a general relation, whereas normalized covariances satisfy Equation 2 for absolute abundances as well as concentrations. Colors correspond to the topologies indicated in panel A (results for individual topologies are compared in Appendix 1—figures 3 and 4). For each dot, single-cell trajectories of 20,000 cell divisions were simulated at least 40 times, with the center of the dot corresponding to the average of the simulation ensemble (see Materials and methods).
Experimental data from all but one synthetic regulatory circuit are consistent with the theory.
(A) In all synthetic circuits, we considered TetR as our protein of interest () fused to YFP to allow for quantification through fluorescence microscopy. CFP () was used as a passive read-out of the transcriptional control of tetR by placing it under the control of a copy of the same promoter as tetR. In all synthetic circuits and are thus co-regulated by the LacI protein. (B) We constructed two different types of synthetic circuits using the repressilator motif (Elowitz and Leibler, 2000) as a basis. Left: example circuit in which TetR () causally affects a RFP reporter (). Right: negative control example circuit in which RFP was expressed constitutively and thus expected to be independent of TetR levels. In this circuit, does not causally affect but and are correlated due to plasmid copy number fluctuations. (C) E. coli cells with the synthetic circuit encoded on the pSC101 plasmid were grown in a microfluidic device and observed over hundreds of cell divisions while daughter cells were washed away. Fluorescence levels of YFP, CFP, and RFP were measured simultaneously for hundreds of mother cells, along with cell area, cell length, and the growth rate. For each strain, the time-lapse data of all cells were combined into a population distribution from which the normalized covariances were computed. All temporal information was thus discarded and not used in the analysis. (D) Two out of four causal interactions were clearly detected through violations of Equation 2. Of the other two, one was clearly not detected, whereas one was situated right at the limit of experimental detectability but did not violate the null hypothesis test at a 2.5% significance level (see Appendix 1—figure 2). See Materials and methods for details of our error analysis. Numbers indicate the synthetic circuits as listed in Appendix 1—figure 17. (E) All but one negative control circuits led to data consistent with Equation 2 (dashed line). Numbered synthetic circuits are listed in Appendix 1—figures 18 and 19 . The two inconsistent data points corresponding to the same synthetic circuit (number 12), which is presented in detail in Figure 4. The false positives either imply that our method works imperfectly or that we detected an unexpected causal interaction from TetR onto growth rate in this circuit. We present evidence for the latter interpretation in the next section.
In the outlier strain, the bacterial stress response is triggered, which in turn regulates the ‘constitutive’ RFP reporter.
(A) The outlier strain consists of the repressilator circuit encoded on the pSC101 plasmid with an endogenous copy of the lacI gene encoded on the chromosome. This endogenous source of LacI disrupts the regular oscillations of the repressilator circuit. RFP was chromosomally expressed under the control of the pRNA1 promoter (Lin-Chao and Bremer, 1987). (B) The outlier strain exhibited large growth rate variability. (C) The outlier strain exhibited large variability in RFP levels with the highest peaks occurring after the slowest growth periods. (D) RFP levels were strongly correlated with RpoS activity as quantified through the known RpoS target gadX (Sampaio et al., 2022). (E) Deletion of the rpoS gene made the outlier strain consistent with Equation 2. (F) Expressing LacI endogenously leads to long periods of low TetR-YFP levels rather than regular repressilator oscillations. Because TetR is the only repressor of the cI gene in the synthetic circuit, we hypothesize that during periods of low TetR, CI expression is so high that resource competition with the also highly expressed RFP triggers the bacterial stress response. This interaction, hypothesized to be present in cells with high CI and RFP expression levels, is indicated with a dashed arrow. Solid arrows indicate interactions with direct experimental support (Patange et al., 2018).
The dynamics of the ‘dual reporters’ X and Y are not necessarily symmetric despite the fact that X and Y have the same production rate.
(A) Feedback in gene regulation can suppress or amplify fluctuations in the abundances of molecular species. Here we show an example system of the class of Equation 1, where X affects the shared transcription rate of X and,Y but does not affect the shared upstream variable Z. In this example, the intrinsic fluctuations of X are suppressed through negative feedback, whereas those from Y are not. This leads to unequal dynamics and cell-to-cell variability in X and Y levels. See Section 10 for details of the simulated example system. (B) The invariant Equation 2 does not hold in terms of Pearson correlation coefficients, i.e., even when X does not affect Z. In this example system, the divergence between and increases as the feedback strength increases (see Section 10). In contrast, Equation 2 constrains all systems in which X does not affect Z, even when the X and Y levels have vastly different dynamics and cell-to-cell variability. Note, the covariances and correlations can be determined from the static snapshots of cell-to-cell variability indicated in panel A. The time-trace to indicate the dynamics of the reporters is for illustration purposes only.
Using the null hypothesis test on the measured covariability ratios.
Using the null hypothesis test on the measured covariability ratios. (A) Plotted are the measured covariability ratios for synthetic circuits shown in Figure 3D and Appendix 1—figure 17 in which X affects Z. Error bars correspond to estimated 95% confidence intervals, which were determined with error propagation (using the uncertainties python package) using the 95% confidence intervals for and that were estimated as detailed in Section 17 and the Materials and methods of the main text. Under the null hypothesis that there is no causal interaction from X to Z, the expected value of . If the measured covariability ratio is above (below) the expected value of 1, then there is a 2.5% probability that the expected value of 1 falls below (above) the lower (upper) error bar. As a result, under a significance level of 2.5%, we say there is a causal interaction from X to Z if the value of does not fall within the measured 95% confidence intervals. Using this rule, our method detected two causal interactions (synthetic circuits #1 and #2) out of four. (B) Under the rule for detecting causality outlined in A, we detect a causal interaction in synthetic circuit #12 when we set the RFP concentration as the Z component of interest. Plotted are the measured covariability ratios for the synthetic circuits shown in Figure 3E and Appendix 1—figures 18 and 19. (C) Similarly to B, we detect a causal interaction in synthetic circuit # 12 when we set the growth rate as the Z component of interest. All other negative control circuits satisfy the null hypothesis test.
Percentage uncertainty distributions for individual simulated processes — part 1.
Plotted are the computed normalized covariances from the simulated concentrations of growing and dividing cells in processes 1–5 (as listed in Appendix 1—table 1). Each number in the top left corners corresponds to the process index listed in Appendix 1—table 1. For each process, a number of random parameters were chosen and simulated as described in the Materials and methods section of the main text. On the left of each subplot are the normalized covariances for each of the parameter sets simulated in the respective process. Error bars are present but are almost always too small to be visible. On the right of each subplot is the distribution of percentage uncertainty, defined as and , where is the estimated 95% confidence interval for . Most of the percentage errors fall below 5%. However, occasionally a parameter set is chosen that leads to either rare birth and death events in the stochastic process, which leads to low sampling, or the particular parameters lead to near zero normalized covariances, which can lead to large percentage errors. Colors of dots and error bars in the left plots correspond to the same colors plotted in the combined figure of Figure 2.
Percentage uncertainty distributions for individual simulated processes — part 2.
Same as Appendix 1—figure 3 but for processes 6 to 10 (as listed in Appendix 1—table 1).
Percentage of simulations that deviate from the invariant of Equation 2.
(A) Plotted is the percentage of simulated parameter sets for each simulated process (as listed in Appendix 1—table 1) that did not produce normalized covariances that satisfied Equation 2. Specifically, the 95% confidence interval for the ratio in these ‘outlier’ simulations did not capture the predicted value of 1. It is expected that not all simulations will agree with Equation 2 due to sampling errors. The largest outlier percentage is for process 5, which has 5.01% of simulations that did not satisfy Equation 2. As described in the Materials and Methods section of the main text, all outlier systems were simulated again with additional sampling which resulted in agreement with Equation 2. (B) Plotted is the number of parameter sets simulated for each process. Some processes took less time to simulate and, as a result, resulted in a larger number of parameter sets. All processes were simulated for at least 100 parameter sets.
Estimating uncertainty with a second method does not change the results of Figure 3D,E.
Shown are the measured normalized covariances for the synthetic positive control circuits (left) and negative control circuits (right), where error bars are computed by splitting the ensemble of single-cell traces into 7-10 disjoint sets. Final normalized covariances are taken as the averages of the sets, with 95% confidence intervals given by two times the standard error of the mean (plotted error bars). Data corrections such as non-even illumination correction and autofluorescence removal were done separately on each set in order to estimate the error of these corrections. Autofluorescence and imaging media fluorescence were corrected by analyzing cells that lost the synthetic plasmid due to random partitioning at cell division, as described in Materials and methods. If 10 or more cells from an experiment lost the plasmid, then the single-cell traces were randomly split into 10 disjoint sets, both with and without the plasmid. Otherwise, the number of disjoint sets was set as the number of cells that lost the plasmid (a minimum of 7), so that each set would contain at least one cell that lost the plasmid to estimate the autofluorescence. This is described in the Materials and methods section of the main text, as well as in Appendix 1—figure 16 and Section 17.
The outlier circuit does not satisfy Equation 2 after switching the fluorescent proteins.
(A) To rule out the possibility that the outlier in Figure 3E was caused by a systematic error from an asymmetry between measurements of the CFP and YFP fluorescence signals, we rebuilt the outlier circuit, but with cfp fused to tetR and yfp set as the transcriptional passive reporter. If the outlier was caused by such a systematic error, then the normalized covariance measurements would switch to the other side of the dashed line. (B) The normalized covariances still do not satisfy Equation 2, and they do not switch to the other side of the dashed line. Two repeats of the experiment are shown. Note that the numerical values for the normalized covariances are different than the outlier circuit in Figure 3E. This could be indicative that the CFP protein used changes the TetR function when used as a fusion.
Using a different cell segmentation pipeline does not alter the main result of Figure 3E .
For the results shown in Figure 3 we used the deep-learning cell segmentation pipeline DeLTA (O’Connor et al., 2022), see Materials and Methods. We used the DeLTA pipeline to segment cells with the bright field channel because the three fluorescence channels (CFP, YFP, and RFP) were not always bright enough in the positive control strains to use as a segmentation marker. However, in all the negative control strains, RFP is under the control of the strong pRNA1 promoter which has been used as a segmentation marker previously (Potvin-Trottier et al., 2016; Lord et al., 2019). We thus applied the same segmentation pipeline used in Potvin-Trottier et al., 2016; Lord et al., 2019 on the RFP channel of the negative control strains. The numerical values of the normalized covariances change slightly, but their relation to Equation 2 remains intact. This second segmentation pipeline determines the edges of the cells using a thresholding algorithm, and it estimates background fluorescence by taking the median of the median of each image; see Lord et al., 2019 for details. The autofluorescence of strains 6, 7, 8, 9, and 10 was not removed because these strains express RFP from the plasmid, so cells that lost the plasmid (which we use to estimate autofluorescence, see Materials and Methods) cannot be tracked with this pipeline.
In the outlier circuit, large RFP fluctuations are often negatively correlated with growth rate fluctuations.
We observed many instances in which RFP concentration was negatively correlated with fluctuations in the cellular growth rate. Shown are three sample single-cell traces. The top two are from the outlier circuit, whereas the bottom trace is from the outlier circuit with swapped fluorescent proteins (see Appendix 1—figure 7A). Dots correspond to times at which a cell division has been detected. (B) To quantify this trend over all lineages, we used the cross-correlation function between the RFP concentration and growth rate (g) in the outlier circuit. We observe a negative correlation time shifted by approximately 1 hr. quantifies how the RFP concentration is correlated with growth rate when the growth rate signal is shifted by time τ relative to the RFP signal. was computed using the python scipy function correlate on each single-cell trajectories while normalizing by . The dark line corresponds to the average at time τ over all trajectories, with the width of the shaded region given by the standard deviation about the mean.
The outlier circuit shows more variability in RFP concentrations and growth rate measurements as compared to other strains that express RFP from the chromosome with the pRNA1 promoter.
(A) The ‘regular’ repressilator circuit without endogenous LacI also exhibits negative correlation between RFP and growth rate, albeit with smaller magnitude. Shown are the cross correlations (defined in Appendix 1—figure 9B) for the outlier circuit (left), the strain with the regular repressilator (outlier circuit with lacI deletion, middle), and the background strain without a plasmid but with RFP still expressed (right). In all cases, RFP is expressed from the chromosome under the control of the same pRNA1 promoter. The outlier circuit exhibits slightly more negative correlation. Below each cross-correlation plot is a sample time-trace for a cell with the respective circuit. Dots correspond to times at which a cell division has been detected. (B) The ‘regular’ repressilator circuit without endogenous LacI does not exhibit the same degree of growth rate fluctuations and does not exhibit as pronounced RFP pulses. The size of the fluctuations of the growth rate and RFP concentration was measured as the coefficient of variability (CV), defined as . The CVs were measured for the outlier circuit, the strain with the regular repressilator (outlier circuit with lacI deletion), and the background strain without a plasmid but with RFP still expressed. The outlier circuit exhibits a larger CV in both cases, suggesting that the outlier circuit causes the increased growth rate fluctuations. Orange error bars correspond to 95% confidence intervals estimated using the same bootstrapping approach used for the normalized covariances described in the Materials and Methods of the main text.
RpoS activity often displayed pulses that are negatively correlated with growth rate.
We observed many instances in which RpoS activity, quantified by a transcriptional GFP reporter under the control of the gadX promoter, was negatively correlated with fluctuations in the cellular growth rate. Shown are three sample single-cell traces. (B) To quantify this trend over all lineages, we used the cross-correlation function between the GFP concentration and growth rate (g). We observe a negative correlation time shifted by approximately 1 hr. quantifies how the GFP concentration is correlated with growth rate when the growth rate signal is shifted by time relative to the GFP signal. It was computed as described in the Appendix 1—figure 9 caption.
Removing the pRNA1 controlled rfp gene changes the TetR dynamics of the outlier circuit and increases the growth rate.
One of the positive control circuits (4) contains the same synthetic circuit as the outlier circuit, but without the pRNA1 controlled rfp on the chromosome, and instead a pLtetO1 controlled rfp gene on the plasmid. We observed that the TetR dynamics, quantified by YFP, were different than in the outlier circuit as they displayed random large pulses. The RFP levels, in turn, also displayed random large pulses when the TetR levels were reduced. To quantify these observations over all lineages, we computed the CV for the TetR levels for both strains (top right). The strain 4 CV is larger. We also computed the average growth rate for both strains and found that the strain 4 growth rate is approximately 9% larger. These results suggest that the pRNA1-controlled rfp gene affected the rest of the circuit elements, along with the cellular growth rate, which we hypothesize is from burden caused by resource competition. Error bars correspond to 95% confidence intervals using the bootstrapping approach used for the normalized covariances as discussed in the Materials and methods section.
Correcting for temporal drift of the focal points.
(A) Individual average fluorescence measurements (taken over cell area) are binned according to their timeframe. The average of the measurements at each timeframe is plotted. A second-degree polynomial is fitted to the resulting time-traces using a least squares fit. Shown is data from the strain 6 in Appendix 1—figure 18. The YFP and CFP time-traces fluctuate more than the RFP time-trace due to the large pulses that they display from the Repressilator. (B) Temporal drift is corrected by multiplying all of the individual average fluorescence measurements with the reciprocal of the fits. Plotted is the average time-traces taken from the corrected data, along with new fits that display constant trends.
Estimating autofluorescence and media fluorescence with cells that lost plasmid.
(A) A single-cell time trace from a mother machine experiment in which the plasmid is kept for the entire duration of the experiment. On the top is the cell area over time displaying cellular growth and division, and on the bottom are the average YFP and CFP signals taken over the cell area. Note that autofluorescence and media fluorescence have not been corrected for here. (B) A single-cell time trace in which the plasmid containing the synthetic circuit is lost. The signals decay to constant values just above zero. The cell area continues to grow and divide, indicating that the decay is not the result of segmentation error or loss of cell.
Correcting for background and uneven illumination.
(A) Plotted is the fluorescence intensity profile across the dashed orange line shown in a single image of the CFP channel. The background is traced by the light blue solid line in the bottom plot, to illustrate the slight dependence. To estimate and remove the background, in each image the average fluorescence across a 50x50 pixel box located 15 pixels above each mother cell was removed from the fluorescence measurement of each respective cell in that image. Note also that an empty trench produces a significant amount of fluorescence when compared to cell-containing trenches. This indicates that fluorescence originating from the media itself needs to be considered. This will be discussed in the next section. The fluorescence profile was created using the plot profile function in Fiji ImageJ. (B) With the background removed, it remains to correct for . We bin all of the fluorescence measurement of all cells and over all time according to pixel position in bins of size 50 pixels and take the average of each bin (blue dots). A single frame has a width of 2048 pixels. This gives an estimate of , with deviations resulting from finite sampling. The binned averages are fit using the least squares method with a third-degree polynomial (solid orange line). The uneven illumination gain function is corrected by multiplying all the fluorescence measurements with the reciprocal of the fitted polynomial. Data shown is from the strain 6 system from Appendix 1—figure 18.
Estimating confidence intervals for normalized covariance measurements using two methods.
(A) Imaging error corrections and the normalized covariance are computed 100 times for 100 bootstrap samples of the experimental data. The final normalized covariance is given as the average over the samples, with confidence intervals given by two times the standard deviation. (B) The data is divided into disjoint sample sets, where M is the number of cells that lost the plasmid. Most experiments produced , but a few produced (with a minimum of 7). Imaging error corrections and the normalized covariance are computed for each sample. The final normalized covariance is given by the average over the samples, with confidence intervals given by two times the standard error of the mean.
Positive control synthetic circuits with sample single-cell time traces.
Plotted fluorescence units are not consistent from sample to sample.
Negative control synthetic circuits with sample single-cell time traces – part 1.
Plotted fluorescence units are not consistent from sample to sample.
Negative control synthetic circuits with sample single-cell time traces – part 2.
Plotted fluorescence units are not consistent from sample to sample.
Definition of the class of systems modeling a gene and its passive reporter.
(A) An example Markov chain with the state of the system given by . Nodes correspond to system variables, whereas arrows correspond to reaction rate dependencies. For instance, if changes under any transition with a reaction rate that depends explicitly on , then we draw an arrow from to . We say that a variable affects another variable if a path can be drawn in the network topology from the first variable to the second. (B) We consider all stochastic processes in which two cellular components, X and Y, are made with an identical, but unspecified, rate. This rate can depend in any way on a cloud of unknown time-varying components , which in turn can depend in arbitrary ways on the number of X and Y molecules, denoted by x and y. The birth and death reactions of X and Y in Equations 4.3; 4.4 are the only specified parts within this arbitrarily large network. We define two disjoint sets that make up all the unspecified components in the networks. The variables affected by X or Y, denoted by are all the components in the network that X or Y affects. All other components in the system are denoted by , which can affect the components affected by X in an arbitrary way.
Definition of the class of systems modeling a gene and its passive reporter.
Given a component that is not affected by X or Y, we define A to be all the components in the network that affect , along with all the components that affect the degradation rates of X and Y, along with the component itself. We define B to be all the components that are not affected by X or i and that are not in the set A. The union of A and B make up the cloud of components in Appendix 1—figure 21 that are not affected by X or Y. We show in this section that the proof of Equation 2 still holds when the production rates of X and Y differ by fluctuations that arise from the cloud of components B.
An example system shows that the invariant holds when the rates differ by additive fluctuations that average to zero.
(A) Simulations of the birth-death process of Equation 7.5 with , which ensures that the additive noise term averages to zero. Each dot corresponds to a single iteration of the Gillespie algorithm that was run until each reaction occurred at least 107 times. In each dot, the parameters are , , , . The K parameter quantifies the strength of the additive noise and was varied from 0 to 40 in increments of 2, with each dot having a particular K value. As can be seen, the dots all lie on or very near the line given by Equation 2. Small deviations from the line are inevitable and are due to numerical errors brought on by finite sampling of the stochastic process. (B) Same as in A except now , which means the additive noise term does not average to zero. Here as the K parameter is increased and more noise is added to the X production rate, there is further deviation from Equation 2 even as X and Y do not affect Z.
Process 1 — Negative feedback in X but not Y.
Here the X and Y production rate depends linearly on the Z abundances, but is also suppressed by X abundances through negative feedback. The Z component is produced according to a Poisson process. This could model, for example, a gene regulated by an upstream transcription factor that also has an auto-repression mechanism. All components in this model are degraded with a first-order reaction with constant degradation constants.
Process 2 — Asymmetric feedback.
Though we state in the main text that Y is a passive reporter that does not affect other components in the network, in the proof of the invariant of Equation 2 we only assume that Y does not affect the Z component of interest. Therefore, as long as Y does not affect Z, the invariant of Equation 2 should hold even if X and Y are involved in a feedback mechanism that depends on different ways on the X and Y abundances. To test this, we simulated the process above. Here, the X and Y production rate is a hill function that increases with X and Z abundances, but has noise suppression in Y abundances. The δ parameter is a small number that is added to stop the rate from ever reaching zero when x=0 (otherwise the rate would stay at zero at the first moment that the X abundances hit zero). There is also a δ in the denominator to stop the denominator from ever becoming zero which would lead to numerical error.
Process 3 — Negative feedback with a confounding variable.
Next, we test the invariant on a process in which Z is correlated with X and Y through a confounding variable W. Here, X, Y, and Z are produced with rates that are proportional to the W abundances. However, the X and Y production rate is also suppressed by the X abundances through negative feedback. This could model, for example, a transcription factor that regulates X, Y, and Z, in which X is involved in an auto-repression mechanism.
Process 4 — Negative feedback with concentration dependence 1.
In the previous systems, the rates depend on the abundances of the components. However, reaction rates can depend on the concentration of components instead of the abundances, which will exhibit different dynamics in growing and dividing cells. In order to demonstrate that the invariant of Equation 2 holds when the reaction rates depend on concentrations, we simulated the above process. Here, the production rate of X and Y increases linearly with the Z concentration, but is suppressed by the X concentration. The Z production rate is a Poisson process, and all components undergo first-order degradation with constant degradation constant.
Process 5 — Negative feedback with concentration dependence 2.
Here we simulated another process with rates that depend on the component concentrations. However, here the rates scale with the cell volume V. This is to model a transcription rate that scales with the cell volume. In growing and dividing cells, this scaling in the transcription rate can allow for the mRNA concentration to be on average constant throughout the cell cycle, which has been observed in many genes. The δ parameter is added so that the denominator does not ever go to zero, which would cause numerical error.
Process 6 — Shared time-varying upstream signal.
Here we consider a process where the X, Y, and Z production rates are driven by a common upstream signal. This signal is a deterministic sine function. The rates scale with the cell volume V so that the rate of production of the concentrations, given by the abundance production rates divided by the cell volume, are themselves sine functions. This could model, for example, three genes that are co-regulated by an oscillating signal.
Process 7 — Asymmetric feedback with confounding variable and concentration dependencies 1.
This process is similar to process 2 in that the production rate of X and Y depends on both components in different ways. In particular, the shared production rate of X and Y is a hill function that increases with Y concentration but is suppressed by X concentrations. There is also a confounding variable W that regulates the dual reporters by acting as the hill coefficient in the shared production rate of X and Y. The Z production is also regulated by the W component, through a linear dependency on the W concentration.
Process 8: Interaction mediated by a confounding variable.
Here we have a process where the X and Y production rate is a hill function that depends non-linearly on the abundance of component W. In addition, the production rate of the W component is repressed by the Z concentration, which itself is repressed by the W concentration. Therefore, X and Y are correlated with Z because they are affected by Z (through W), but also because W affects X, Y, and Z.
System 9 — Fluctuating degradation rates.
In the previous process, the X and Y components underwent first-order degradation with constant degradation time. The invariant of Eq. (LABEL:EQ:_causality_constraint_(manuscript)) holds when the degradation constants fluctuate as long as they are not affected by X or Y (see Section 4.1). To demonstrate that the invariant holds with fluctuations in the degradation constants, we simulated the above process where the degradation constant is given by the Z abundance, where the production of Z follows a Poisson process. This could model, for example, stochastic fluctuations in the abundances of an enzyme that degrades the X and Y components.
Process 10 — Oscillating degradation rates.
This process is similar to the previous process, but here all the components are driven by a shared upstream oscillating signal. This could model, for example, a system where a degradation enzyme is regulated by an oscillating signal.
Equation 2 is satisfied in an example network made of 10 components.
(A) The reaction network topology for the system given by Equations 14.1; 14.2; 14.3. Blue squares are components not affected by X. Red circles are affected by X. (B) Plotted are the normalized covariances between the concentrations of X, Y with those of the other components in the network. Components that are not affected by X (blue squares) satisfy the invariant of Equation 2. Components affected by X (red circles) violate the invariant. Note, the further down the cascade regulated by X, the closer the normalized covariances move towards the dashed line. This suggests that the distance from the dashed line is inversely proportional to the relative distances of components down regulatory cascades. Simulations were performed according to the algorithm described in Section 13. Normalized covariances were computed by integrating over the trajectories to obtain time averages for first and second moments, waiting for 10,000 cell divisions to occur before integrating to allow the system to reach stationarity. Each computed normalized covariance corresponds to the average of 40 independent simulations that ran for 106 cell divisions. Estimated 95% confidence intervals (which are too minuscule to see) for the normalized covariances correspond to two times the standard error of the mean over these 40 simulations. Using error propagation, in each of the components not affected by X, the ratio produced 95% confidence intervals that encompassed the predicted value of 1. See Materials and methods for additional simulation details.
Fluctuations in plasmid copy numbers reduce the degree of violation of Equation 2 in simulated test cases.
(A) Results of the simulations of the open-loop cascade system given by Equation 15.1. For each randomly sampled set of parameters, the system was simulated without plasmid fluctuations ( is constant) and with plasmid fluctuations (with dynamics given by Equation 15.3). The degree of violation of the invariant of Equation 2 was computed as the absolute value of the ratio of the normalized covariances . As X in this system represses the Z production rate, there can be a deviation from . We find that the degree of violation of the invariant tends to be bigger in the case without plasmid fluctuations. (B) Similar to A but for simulations of the closed-loop cascade system given by Equation 5.12. We find that the degree of violation of the invariant tends to be bigger in the case without plasmid fluctuations. For both the open-loop and closed-loop systems, 100 sets of parameters were chosen randomly, with , and where are uniformly distributed random numbers in (0,1). Plasmid copy number fluctuations were simulated with and which sets the average copy number to 5. Each simulation was performed using the Gillespie algorithm and ran until each reaction occurred at least 105 times.
Videos
Bright field images of a single imaging position of a mother machine experiment with colored cell boundaries produced from the DelTA segmentation pipeline.
Here, segmentation was done on bright field images.
Fluorescent images of a single imaging position of a mother machine experiment with colored cell boundaries produced from the DelTA segmentation pipeline.
Here, segmentation was done on the RFP channel, which produced similar results to those in Figure 3.
Tables
Simulated birth-death processes in growing and dividing cells.
Additional details in Section 11. Top box shows a general four-component birth-death process. Below the box shows the respective rates for the 10 processes simulated. For each process, three different single-cell volume trajectories were simulated according to the Materials and methods of the main text, and the rate parameters shown above were varied randomly multiple times. In processes 1–3, 8, and 9, λ was picked randomly from the set {10, 100, 1000}, whereas in processes 4, 5, and 7, it was picked randomly from the set {1,10, 100}. The parameter δ is set to 0.01 throughout to avoid divisions by 0. In processes 4, 5, 7, and 9, α is respectively set to (2, 0.1, 1),(2, 0.5, 0.1), (2, 1, 0.5), (1, 2, 0.5) for all simulations with respective volume dynamics (1, 2, 3). For example, in system 4 with volume dynamics 2, α = 0.1. All other parameters, namely, were picked randomly from the set {0.1, 1, 10}. In all simulations, the average division time is set to 1. Simulations were performed using the Gillespie algorithm, with adjustments to account for time-dependent rates, as described in Section 14.
| Process index | ||||||||
|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | ||||||
| 2 | 0 | 0 | ||||||
| 3 | ||||||||
| 4 | 0 | 0 | ||||||
| 5 | 0 | 0 | ||||||
| 6 | 0 | 0 | ||||||
| 7 | ||||||||
| 8 | ||||||||
| 9 | 0 | 0 | ||||||
| 10 | 0 | 0 |
Strains and plasmids.
See Appendix 1—figures 17–20 for circuits.
| Strain | Details |
|---|---|
| EJS1 | Synthetic circuit 5. Background strain is MG1655 E. coli attB::lacYA177C, ∆araCBAD, ∆lacIZYA, ∆araE, ∆araFGH, ∆rhaSRT, ∆rhaBADM, attn7::mCherry-mKate2-hybrid. Contains pEJS1. |
| EJS2 | Synthetic circuit 12. Background strain is MG1655 E. coli with glmS::PRNA1-mCherry-mKate2- hybrid and motility deletion ∆motA. Contains pEJS1. |
| EJS3 | Synthetic circuits 1 and 2. Background strain is MG1655 E. coli with attB::lacYA177C, ∆araCBAD, ∆lacIZYA, ∆araE, ∆araFGH, ∆rhaSRT, and ∆rhaBADM. Contains plasmid pEJS3. |
| EJS4 | Synthetic circuit 3. Background strain is MG1655 E. coli with attB::lacYA177C, ∆araCBAD, ∆lacIZYA, ∆araE, ∆araFGH, ∆rhaSRT, and ∆rhaBADM. Contains plasmid pEJS4. |
| EJS6 | Synthetic circuit 4. Background strain is MG1655 E. coli with motility deletion ∆motA. Contains plasmid pEJS4. |
| EJS7 | Synthetic circuit 11. Background strain is MG1655 E. coli with attB::lacYA177C, ∆araCBAD, ∆lacIZYA, ∆araE, ∆araFGH, ∆rhaSRT, and ∆rhaBADM. Contains plasmid pEJS5m. |
| EJS9 | Synthetic circuit 14. Background strain is MG1655 E. coli with glmS::PRNA1-mCherry-mKate2- hybrid and motility deletion ∆motA. Contains plasmid pEJS10. |
| PA12 PA52 | Strain with no plasmid in Appendix 1—figure 10. Background strain is MG1655 E. coli attB::lacYA177C, ∆araCBAD, ∆lacIZYA, ∆araE, ∆araFGH, ∆rhaSRT, ∆rhaBADM, attn7::mCherry-mKate2-hybrid. Synthetic circuits 6, 7, 8, and 9. Background strain is MG1655 E. coli with attB::lacYA177C, ∆araCBAD, ∆lacIZYA, ∆araE, ∆araFGH, ∆rhaSRT, and ∆rhaBADM. Contains plasmid pEJS5. |
| PA53 | Synthetic circuit 10. Background strain is MG1655 E. coli with attB::lacYA177C, ∆araCBAD, ∆lacIZYA, ∆araE, ∆araFGH, ∆rhaSRT, and ∆rhaBADM. Contains plasmids pEJS5 and pLPT41. |
| PA64 | Used in Figure 4D Background strain is MG1655 E. coli with glmS::PRNA1-mCherry-mKate2-hybrid and motility deletion ∆motA. Contains pSC101 plasmid with Kanamycin resistance and gfpmut2 reporter gene with gadX promoter taken from Zaslaver et al., 2006. |
| PA66 | Synthetic circuit 13. Background strain is MG1655 E. coli with glmS::PRNA1-mCherry-mKate2- hybrid, motility deletion ∆motA, and DRpos::FRT Kan frt. Contains plasmid pEJS1. |
| Plasmid | Details |
| pEJS1 | Used in circuits 5, 12, and 13. Repressilator with no degradation tags, with tetR replaced with tetR-mVenusNB fusion, along with an integrated SCFP3A reporter. pSC101 origin. Ampicillin resistance. |
| pEJS3 | Used in circuit 1 and 2. Repressilator with no degradation tags with cI replaced with mCherry-mKate2-hybrid, tetR replaced with tetR-mVenusNB fusion, and an integrated SCFP3A reporter. pSC101 origin. Ampicillin resistance. |
| pEJS4 | Used in circuit 3 and 4. Repressilator with no degradation tags, with tetR replaced with tetR-mVenusNB fusion, and integrated SCFP3A and mCherry-mKate2-hybrid reporters. pSC101 origin. Ampicillin resistance. |
| pEJS5 | Used in circuits 6, 7, 8, and 9. Repressilator with no degradation tags, with tetR replaced with tetR-mVenusNB fusion, and integrated SCFP3A and mCherry-mKate2-hybrid reporters. pSC101 origin. Ampicillin resistance. |
| pEJS5m | Used in circuit 11. Same as pEJS5, with a single missense point mutation on lacI that results in no oscillations forming (see 11 in Appendix 1—figure 19). LacI function must be affected by the mutation. pSC101 origin. Ampicillin resistance. |
| pEJS10 | Used in circuit 14. Same as pEJS1 but SCFP3A forms the fusion with tetR while mVenusNB is integrated as a transcriptional reporter. pSC101 origin. Ampicillin resistance. |
| pLPT41 | Used in circuit 10. ColE1 origin plasmid with PLtetO1 promoter, taken from Potvin-Trottier et al., 2016. Acts as a “titration sponge” for small numbers of TetR levels using repressor-binding sites, see Potvin-Trottier et al., 2016. Kanamycin resistance. |