Causal interactions between genes can be inferred when non-genetic cell-to-cell variability violates the covariance invariant of Eq. (2).

A) Consider an arbitrary network of interacting cellular components in which an engineered reporter Y is introduced to act as a passive read-out of the transcriptional signal that regulates a gene of interest geneX, but does not itself regulate other cellular components. Any cellular component belongs to one of two groups: components affected by X (red circles) and components not affected by X (blue squares), where the arrows indicate directed causal biochemical effects. B) The dashed line of Eq. (2) constrains the normalized covariance between X and any cellular component Zk not affected by X (indicated as blue squares in both panels). In contrast, components Zk affected by X (indicated as red circles in both panels) are not constrained by Eq. (2). A violation of Eq. (2) thus implies the existence of a causal interaction from X to Zk. Data points here are for illustration purposes but correspond to exact numerical simulations of specific example networks (see SI for details). C) The invariant of Eq. (2) applies not only to transcriptional reporters but also when X and Y correspond to co-regulated fluorescent proteins with translation rates that scale with transcript abundances.

Extensive numerical simulations of example systems confirm computationally that Eq. (2) constrains cellular abundances and concentrations in growing and dividing cells under fairly general assumptions.

A) To numerically verify Eq. (2) we consider ten stochastic birth-death processes covering six different network topologies with non-linear rates, closed-loop feedback, time-varying upstream signals, and fluctuating degradation rates. In all cases chemical species X and Y are co-regulated but do not affect a third chemical species Z of interest. See supplemental Table S1 for details of the simulated systems. B) We consider three different cellular growth dynamics that affect molecular abundances through random partitioning of molecules at cell division and molecular concentrations through dilution. Additionally, system reaction rates depend in varying ways on the cell volume. At cell division, molecular abundances are assumed to be partitioned on average in proportion to cell size. C) Simulation results for each of the example topologies from panel A, subject to the different growth dynamics of panel B. Model parameters were varied over several orders of magnitude. The numerically obtained Pearson correlations do not satisfy a general relation, whereas normalized covariances satisfy Eq. (2) for absolute abundances as well as concentrations. Colors correspond to the topologies indicated in panel A. For each dot, single-cell trajectories of 20,000 cell divisions were simulated 40–1000 times, with the center of the dot corresponding to the average of the simulation ensemble (see Materials and Methods).
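The kind of numerical check described in this caption can be illustrated with a minimal sketch. It is not one of the ten systems of Table S1: it is a single hypothetical linear birth-death network without cell growth or division, in which an upstream signal S drives X, Y, and Z, and X and Y share the same degradation rate. Eq. (2) is not reproduced in this section, so the sketch assumes that it equates the normalized covariances, η_xz = η_yz, with η_ab = Cov(a,b)/(⟨a⟩⟨b⟩). All rate constants and function names are illustrative assumptions.

```python
import numpy as np

# Hypothetical rate constants, chosen only for illustration.
LAM, BETA_S = 5.0, 1.0           # S: production rate, degradation rate per molecule
A_X, A_Y, BETA = 4.0, 10.0, 1.0  # X, Y: production per S molecule, shared degradation rate
G_Z, BETA_Z = 6.0, 0.5           # Z: production per S molecule, degradation rate

# State-change vectors for (S, X, Y, Z); one row per reaction.
STOICH = np.array([
    [+1, 0, 0, 0], [-1, 0, 0, 0],
    [0, +1, 0, 0], [0, -1, 0, 0],
    [0, 0, +1, 0], [0, 0, -1, 0],
    [0, 0, 0, +1], [0, 0, 0, -1],
])


def simulate(n_events=200_000, seed=0):
    """Gillespie simulation; returns time-averaged means and covariances of (X, Y, Z)."""
    rng = np.random.default_rng(seed)
    state = np.array([5, 20, 50, 60])  # start near the stationary means (no burn-in)
    m1 = np.zeros(3)        # time-weighted sum of (x, y, z)
    m2 = np.zeros((3, 3))   # time-weighted sum of outer products
    total_time = 0.0
    for _ in range(n_events):
        s, x, y, z = state
        rates = np.array([LAM, BETA_S * s, A_X * s, BETA * x,
                          A_Y * s, BETA * y, G_Z * s, BETA_Z * z])
        cum = np.cumsum(rates)
        dt = rng.exponential(1.0 / cum[-1])  # waiting time in the current state
        v = state[1:].astype(float)
        m1 += dt * v
        m2 += dt * np.outer(v, v)
        total_time += dt
        # Pick the next reaction with probability proportional to its rate.
        k = np.searchsorted(cum, rng.random() * cum[-1])
        state = state + STOICH[k]
    mean = m1 / total_time
    cov = m2 / total_time - np.outer(mean, mean)
    return mean, cov


def normalized_covariance(mean, cov, i, j):
    """eta_ij = Cov(i, j) / (<i><j>); the form assumed here for Eq. (2)."""
    return cov[i, j] / (mean[i] * mean[j])


def pearson(cov, i, j):
    """Pearson correlation coefficient, computed for comparison."""
    return cov[i, j] / np.sqrt(cov[i, i] * cov[j, j])


if __name__ == "__main__":
    mean, cov = simulate()
    print("eta_xz =", normalized_covariance(mean, cov, 0, 2))
    print("eta_yz =", normalized_covariance(mean, cov, 1, 2))
    print("rho_xz =", pearson(cov, 0, 2), " rho_yz =", pearson(cov, 1, 2))
```

In this toy network Z is not affected by X, so under the stated assumption the two normalized covariances should agree up to sampling error, whereas the two Pearson correlations need not, because X and Y have different mean abundances and therefore different intrinsic noise.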

Experimental data from all but one synthetic regulatory circuit are consistent with the theory.

A) In all synthetic circuits, we considered TetR as our protein of interest (X) fused to YFP to allow for quantification through fluorescence microscopy. CFP (Y) was used as a passive read-out of the transcriptional control of tetR by placing it under the control of a copy of the same PLlacO1 promoter as tetR. In all synthetic circuits X and Y are thus co-regulated by the LacI protein. B) We constructed two different types of synthetic circuits using the repressilator motif [35] as a basis. Left: example circuit in which TetR (X) causally affects an RFP reporter (Z). Right: negative control example circuit in which RFP was expressed constitutively and thus expected to be independent of TetR levels. In this circuit, X does not causally affect Z but X and Z are correlated due to plasmid copy number fluctuations. C) E. coli cells with the synthetic circuit encoded on the pSC101 plasmid were grown in a microfluidic device and observed over hundreds of cell divisions while daughter cells were washed away. Fluorescence levels of YFP, CFP, and RFP were measured simultaneously for hundreds of mother cells, along with cell area, cell length, and the growth rate. For each strain, the time-lapse data of all cells were combined into a population distribution from which the normalized covariances were computed. All temporal information was thus discarded and not used in the analysis. D) Three out of four causal interactions were detected through violations of Eq. (2). However, one violation occurred right at the limit of experimental accuracy. If our error bars underestimate experimental uncertainty, only two causal interactions would have been successfully detected. See Materials and Methods for details of our error analysis. Numbers indicate the synthetic circuits as listed in supplemental Table S2. E) All but one of the negative control circuits led to data consistent with Eq. (2) (dashed line). Numbered synthetic circuits are listed in supplemental Tables S3 and S4. The two inconsistent data points correspond to the same synthetic circuit (number 12), which is presented in detail in Fig. 4. The false positives either imply that our method works imperfectly or that we detected an unexpected causal interaction from TetR onto growth rate in this circuit. We present evidence for the latter interpretation in the next section.
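The pooling step of panel C can be sketched as follows. The data layout (one array per mother cell with columns YFP, CFP, RFP) and all function names are hypothetical, and the bootstrap over cells is a stand-in for illustration only; the actual error analysis is described in Materials and Methods and may differ.

```python
import numpy as np


def pool(traces):
    """Concatenate per-cell arrays of shape (timepoints, 3) into one (N, 3) sample.

    Time order is discarded: every frame of every cell becomes one sample.
    Assumed column layout (hypothetical): 0 = YFP (X), 1 = CFP (Y), 2 = RFP (Z).
    """
    return np.concatenate(traces, axis=0)


def eta(pooled, i, j):
    """Normalized covariance Cov(i, j) / (<i><j>) of two columns of the pooled sample."""
    a, b = pooled[:, i], pooled[:, j]
    return np.mean(a * b) / (np.mean(a) * np.mean(b)) - 1.0


def bootstrap_eta_std(traces, i, j, n_boot=200, seed=0):
    """Spread of eta under resampling of whole cells with replacement.

    This is an assumed error estimate, not the authors' procedure.
    """
    rng = np.random.default_rng(seed)
    n_cells = len(traces)
    samples = []
    for _ in range(n_boot):
        idx = rng.integers(0, n_cells, size=n_cells)
        samples.append(eta(pool([traces[k] for k in idx]), i, j))
    return float(np.std(samples))


if __name__ == "__main__":
    # Tiny made-up traces for two cells, used only to show the call pattern.
    traces = [
        np.array([[1.0, 1.0, 2.0], [3.0, 3.0, 2.0]]),
        np.array([[2.0, 2.0, 4.0]]),
    ]
    pooled = pool(traces)
    print("eta_xz =", eta(pooled, 0, 2), "eta_yz =", eta(pooled, 1, 2))
    print("bootstrap std of eta_xz =", bootstrap_eta_std(traces, 0, 2))
```

Resampling whole cells rather than individual frames keeps the frames of one lineage together, which matters because successive frames of the same mother cell are not independent samples.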

In the outlier strain, the bacterial stress response is triggered, which in turn regulates the “constitutive” RFP reporter.

A) The outlier strain consists of the repressilator circuit encoded on the pSC101 plasmid with an endogenous copy of the lacI gene encoded on the chromosome. This endogenous source of LacI disrupts the regular oscillations of the repressilator circuit. RFP was chromosomally expressed under the control of the pRNA1 promoter [38]. B) The outlier strain exhibited large growth rate variability. C) The outlier strain exhibited large variability in RFP levels with the highest peaks occurring after the slowest growth periods. D) RFP levels were strongly correlated with RpoS activity as quantified through the known RpoS target gadX [39]. E) Deletion of the rpoS gene made the outlier strain consistent with Eq. (2). F) Expressing LacI endogenously leads to long periods of low TetR-YFP levels rather than regular repressilator oscillations. Because TetR is the only repressor of the cI gene in the synthetic circuit, we hypothesize that during periods of low TetR, CI expression is so high that resource competition with the also highly expressed RFP triggers the bacterial stress response. This interaction, hypothesized to be present in cells with high CI and RFP expression levels, is indicated with a dashed arrow. Solid arrows indicate interactions with direct experimental support [40].