Introduction

The balance between the cost associated with the expression of genes and the demands for intracellular resources driven by growth is a tightly regulated process in bacteria, reflecting an evolutionary pressure to optimize fitness [1-4]. Recent studies have built on this principle to derive powerful, predictive growth laws [2, 5-7]. For example, at the population level, these laws predict that expressing unnecessary proteins reduces the availability of ribosomes that would otherwise be destined for the synthesis of other proteins, including those involved in ribosomal function and other cellular processes [6]. Consequently, the synthesis of unnecessary proteins is accompanied by a growth burden that increases with protein expression levels [1, 4, 7]. These laws have been derived under the assumption of balanced growth: a state where all cellular components, such as proteins, RNAs, DNA, and lipids, are synthesized at rates that are proportional to the constant growth rate of the cells measured at the population level [8, 9]. Balanced growth also ensures that cell growth has reached a steady-state and does not run into resource bottlenecks that could impair its physiology [10, 11]. While balanced growth is a practical assumption for predicting bacterial growth at the population level, it does not take into account out-of-balance processes [12-14], such as sudden and stochastic molecular events that inherently accompany growth and gene expression observed in single cells [15-22], even when environmental conditions are well-controlled and steady. Therefore, extracting the laws that govern the interplay between stochastic growth variations and protein synthesis is crucial for predicting and understanding the behavior and constraints of the single cell.

In recent years, studies in E. coli have reported that flagellar genes—responsible for the synthesis of one of the most energy-intensive bacterial organelles—exhibit large stochastic expression pulses [20]. The assembly of this complex organelle requires more than 30 different proteins that mostly constitute the basal body of a rotary motor, as well as a 20,000 subunit-long filament [23, 24] (Fig. S1A). During flagella synthesis, the cell directs a sizable portion of protein synthesis resources to this process [25-27]. While it has been demonstrated from bulk measurements that during balanced growth, the expression of this locomotion apparatus is regulated to compensate on average for changes in cell size across distinct growth conditions [28-30], far less is known about out-of-balance processes and their effect on gene expression, such as stochastic fluctuations of growth observed at the single-cell level. In view of the large protein synthesis demand that accompanies the stochastic transcriptional expression of flagellar genes, here we investigate the interplay between such gene activity and growth fluctuations within a single bacterium. Subsequently, we extend our analysis to that of synthetic expression systems and develop a computational model to validate the generality of our observations.

Results

Collectively, production of flagellar proteins represents a sizable proteome investment that may consume resources otherwise available for growth, presenting a potential resource allocation tradeoff which the cell must navigate [30, 31]. The activity of flagellar genes follows a well-characterized transcriptional cascade [23, 32]. Briefly, it begins with Class-1 genes that encode the master regulator FlhDC, which drives the transcriptional activation of Class-2 genes. These genes are involved in the synthesis of the flagellar basal body, the hook, and the alternative sigma factor, FliA. This sigma factor, in turn, activates Class-3 genes that encode the filament and signaling chemotaxis proteins (Fig. S1A). Recently, the transcriptional activity of Class-2 and Class-3 flagellar genes has been found to be well-delimited in time with alternating and distinct on and off states, each spanning several cellular divisions [20]. Operationally, this temporal pattern makes the system experimentally accessible for dissecting the relationship between growth fluctuations and gene activity in living individual cells.

We tracked single cells as they grew and divided inside a microfluidic device, called a ‘Mother Machine’ [33, 34], which allows bacteria to grow exponentially under controlled conditions for extended periods of time (Fig. S1B). This device confines cells within narrow channels, with one end exposed to a continuous flow of fresh medium while the other end is sealed. The temporal activity of flagellar promoters and the growth of individual cells are simultaneously monitored using fluorescence and phase-contrast microscopy (Materials and Methods). This experimental platform enables us to capture time-lapse images and extract various physiological parameters such as cell size, elongation rate, and fluorescence signals from reporter proteins with 5 min resolution over ∼30 generations (Fig. S1C). One key feature of this approach is its ability to isolate intrinsic fluctuations within the cellular system from those typically caused by uncontrolled environmental variations.

As previously observed by some of us, in the commonly used wild-type E. coli MG1655 strain, Class-2 and Class-3 promoters exhibit sparse stochastic pulses, whereas Class-1 displays low but sustained activity that resembles that of a constitutive promoter. The question then becomes how the growth rate varies during distinct periods of active or inactive flagellar gene activity. Here we use the elongation rate of cell size as monitored in the Mother Machine channels, as a measure of growth rate. We monitored the activity of the Class-2 promoter, fliFp, using the fluorescent reporter SCFP3A (see Materials and Methods) as it also encapsulates the dynamics of Class-3 as previously described [20]. Practically, to capture the relationship between instantaneous variations in elongation rate and those of Class-2 activity, we extracted these two quantities from time series at each time point using a Δt = 5 min time window (Fig. 1 Left). These pairs were then sorted and grouped into bins based on their activity levels (Fig. S2). In addition to the strain with wild-type flagellar expression, we tested three distinct strains whose master regulator, FlhDC, is driven by unregulated synthetic promoters (P1, P2, P3; [20, 35]) with increasing strength, resulting in progressively higher transcriptional activity of downstream flagellar genes.

Relationship between instantaneous growth rate and flagellar gene activity in single cells versus population.

Left: Time traces of Class-2 activity and Size (S) measured from the same mother cell across divisions (vertical ticks). We defined Class-2 activity as the time derivative of total fluorescence normalized by size, and the elongation rate as ER(t) = (1/S)(ΔS/Δt) . First, we used Class-2 activity to sort data points with Δt = 5 min and then bin the pair activity-elongation rate (see Fig. S2A). Finally, we average the data in each bin and display the result in the right panel. Because the averaged bins are built from pairs of activity– elongation rate values using a 5-minute time window, this binning allows us to determine the relationship between instantaneous flagellar activity and growth rate in single cells. Right: Instantaneous Class-2 activity versus elongation rate for different strains with varying mean levels of flagellar expression. In such strains, the native Class-1 promoter was replaced by synthetic promoters with different strengths, P1, P2 and P3 (color coded in top panel), P3 being threefold stronger than P1 [35]. The dashed line shows the linear fit to the population average of the pair Class-2 activity-elongation rate across strains, which illustrates the well-established relationship with a negative slope (cost) between growth rate and increasing mean flagellar activity across strains [30]. By contrast, single-cell binned data within each distinct strain shows a positive slope between instantaneous growth rate and flagellar activity, demonstrating that cells with higher flagellar activity are also the ones growing faster. Only the data exhibiting Class-2 activity (excluding off states) are considered in this plot; see Fig. S2B for all states. The number of time traces is n=(69, 107, 96, 67) for the strains with the promoter P1, WT, P2, and P4, respectively, all with a similar duration of 47 hours.

The well-known population result is recovered when we fit the whole bulk of the instantaneous single-cell data across the four different strains (Fig. 1 Right, dashed line), indicating that as cells, on average, increase expression of unnecessary proteins, their growth rate decreases. Surprisingly, when investigating growth relationships for individual cells within each strain, we found that bins with higher instantaneous transcriptional activity were also associated with higher elongation rates (Fig. 1 Right). This result is unexpected under the canonical growth laws and directly contrasts with population-level observations, where elevated flagellar gene activity correlates with reduced growth rates (Fig. S3, [28-30]; see Fig. S4 for an alternative growth rate proxy). The coexistence of these opposing trends—positive within strains but negative across strains—suggests that a phenomenon such as the Simpson’s paradox is at play.

The Simpson’s paradox is a classic statistical phenomenon that occurs when the way data are grouped or averaged hides underlying differences among the datasets being compared, causing the overall trend to reverse. In our study, although both analyses rely on instantaneous single-cell measurements, they differ fundamentally in how the data are aggregated, a distinction from which the observed paradox arises. In the population-level analysis (Fig. 1, dashed line), each point represents the mean instantaneous growth rate and mean promoter activity averaged across each distinct strain, which differ in their steady-state expression levels of the flagellar regulon. This aggregation collapses co-fluctuations of growth and activity within cells and compares averages across genetic backgrounds, thus reflecting between-strain differences governed by balanced growth constraints. In the binned analysis, each data point reflects paired fluctuations of growth and gene expression measured within the same cells, preserving their natural co-variation at a given time point. Hence, the binned analysis reveals dynamic coupling between growth and expression in single cells, whereas the aggregated data captures only the average trade-off observed across strains.

We reasoned that if this single-cell effect reflects dynamic coupling between growth and activity, it should also be observable along individual time series and not only in binned data. To examine this point further, we took advantage of the fact that flagellar genes show pulses of activity, which allow us to clearly track how growth rate changes during significant fluctuations in gene activity. We isolated time series segments of Class-2 activity featuring two consecutive pulses, rescaled their total duration to a unit interval (arbitrary units), and computed the average across a large collection of such pulse-to-pulse excerpts (Fig. 2, top; Materials and Methods).

Cells grow faster during pulses.

(A) The Class-2 activity time series were divided into intervals that consisted of a pulse of activity followed by an off state and another pulse. We selected all the intervals with similar durations (within a range of no more than 6 hours). The pulses were defined by the period for which the activity is above an empirical threshold (activity = 50 arbitrary units). Then, the intervals of similar duration were normalized such that they spanned from 0 to 1, in order to average them. (B) Top: Averaged pulse-to-pulse intervals (real time rescaled to 1) of the Class-2 activity. Middle and bottom: elongation rate and division time associated with the same time intervals as the top panel. The averages were calculated by applying a sliding window mean over the overlapping data consisting of n = 103 intervals.

Then, we jointly plotted the time series of elongation rate and division time that are directly associated with the pulse-to-pulse excerpts. We found that the aggregated time traces indicate that increases in flagellar gene activity coincide with higher elongation rates, while division time varies in the opposite phase (Fig. 2, middle and bottom). This relationship remains consistent across pulse-to-pulse intervals of different lengths (see Fig. S5). To further examine whether there is a phase delay between elongation rate and Class-2 activity, we calculated the cross-correlation function of the full time series (about 20 divisions). We found that the fluctuations in elongation rate temporally precede those of flagellar gene activity, and that they are positively correlated (Fig. S6). This result was unexpected, since one might intuitively assume, based on population measurements, that increased flagellar activity would cause a cost that reduces the elongation rate. If that were the case, we would expect to see a negative correlation where changes in gene activity precede decreases in elongation rate, rather than the positive correlation and temporal order we observed.

To expand this result, we repeated the correlation analysis using a reporter for the Class-3 promoter, fliCp, which drives expression of the energetically costly flagellin protein. We again observed a positive correlation (Fig. S7), with a slightly larger shift between elongation rate and activity for Class-3 compared to Class-2, consistent with Class-2 activity preceding and driving Class-3.

We also analyzed the three strains whose synthetic constitutive promoters (P1, P2, P3) control the expression of the flagellum master regulator and found similar temporal delay (Fig. S6). Since the constitutive promoters P1, P2 and P3 are synthetic, they do not respond to any input specific to flagellar regulation; rather they behave more like unregulated, constitutive promoters (Fig. S8, S9). This result suggests that the observed temporal shift in flagellar activity may be driven by global intracellular factors rather than flagellum-specific ones. Indeed, the signature observed in the correlation between elongation rate and Class-2 activity (see Fig. S9) is consistent with the scenario described by Kiviet and Nghe et al. [36], in which the dominant noise source fluctuations are global cellular processes.

Recent work has observed heterogeneity in ribosome density across populations of E. coli cells [37-40]. While Pavlou et al. found no immediate association between ribosome abundance and unequal division [39], several super-resolution studies suggest that ribosome heterogeneity arises from asymmetric partitioning during cell division [37, 38, 41]. They demonstrated such asymmetry stems from large clusters of ribosomes that are asymmetrically distributed at cell division, leading to differences in growth after division. Such a pronounced effect would not be possible if each ribosome was independently partitioned according to a binomial distribution, which, due to their large number, would result in a highly symmetrical distribution between daughter cells [15, 42]. With these results in mind and given that the ribosome synthesis is an upstream process relative to flagellar gene expression, we proceeded to analyze the symmetry between cells following cell division as well as the heritability of flagellar gene activity and growth rate fluctuations from one generation to the next.

First, we compared the flagellar gene activity between daughter cells, the two progenies that arise after the division of a mother cell (Fig. 3A). By plotting the flagellar activity of each daughter cell at the same time point in a scatter plot, we found considerable heterogeneity in activity levels between daughters. Specifically, when one daughter exhibits higher flagellar gene activity, it often shows a higher elongation rate as well. On one hand, this result is consistent with the positive correlation between gene expression and growth observed previously. However, this daughter-daughter analysis further suggests that growth factors and activity are not equally partitioned during cell division, leading to asymmetric inheritance of both flagellar activity and growth rate between the two daughters.

Short-timescale shifts in Class-2 activity correlate with immediate changes in growth.

(A) The schematic (top) depicts a mother cell dividing into two daughters with elongation rates ER1 and ER2. To compare their growth, we calculated the relative elongation rate as (ER2ER1)/(ER2 + ER1); values above 0 indicate faster elongation in Daughter 2, values below 0 indicate faster elongation in Daughter 1, and a value equal to 0 indicates equal rates. Class-2 activity for each daughter was determined by normalizing the change in fluorescence by division time. The scatter plot (bottom) shows Class-2 activity for both daughters, with the diagonal representing equal activity. Data points are color-coded for their relative elongation rate: blue for ER1 > ER2 and red for ER2 > ER1, demonstrating that higher flagellar expression in one daughter tends to coincide with a higher elongation rate. (n = 541, daughter cells; ER1 = 0.544 ± 0.176 h−1, ER2 = 0.539 ± 0.168 h-1). (B) elongation rates across two consecutive generations for cells experiencing an abrupt change in Class-2 activity: off to on (yellow) and on to off (green). For both cases, we plot the binned ER of the daughter cell (generation g + 1) against the ER of the previous generation (g). A positive change in activity (off to on) is accompanied by an increase in ER across generations compared to the dotted line, reflecting faster cell growth in generation g+1. A decrease in activity (on to off) is associated with a reduced ER compared to the dotted line in generation g+1 (ER is lower at g+1 than at g). The number of divisions for each subset is ∼300. The shaded regions correspond to the standard deviation for each bin, and the dotted line shows the linear fit for the entire dataset, irrespective of activity, representing a null model for ER across divisions.

We then focused exclusively on cells that exhibited changes in flagellar activity across division (from mother to daughter) and measured the corresponding changes in elongation rate. We first selected the cells that transitioned from off to on activity across division (Fig. 3B, upper right) and compared their elongation rates (Fig. 3B, yellow). Our results indicate that the increase in activity across divisions is associated with an increase in the elongation rate that is greater than for cells with no transition (off - off, dotted line). By contrast, cells showing a transition from on to off activity (Fig. 3B, green) fall below the dotted line. Overall, these results demonstrate that fluctuations in global noise affecting flagellar activity and growth occur predominantly on short time scales, with asymmetric division playing a major role in generating this variability.

Because direct causal links between instantaneous growth and protein synthesis in single cells are difficult to establish, we instead asked which growth-fluctuation timescales matter for flagellar expression. The strain MC4100, known to exhibit slow, long-period oscillations in growth, provided a useful comparison to MG1655. Consistent with prior reports, MG1655 showed rapid size-fluctuation dynamics (1.5 generations), whereas MC4100 fluctuated much more slowly (10 generations) ([42], Fig. S10A). Although asymmetric ribosome partitioning can explain short-term growth boosts in MG1655, pulse-to-pulse analysis showed no correlation between MC4100’s long-timescale growth oscillations and flagellar gene activity (Fig. S10B). After digitally filtering out these slow oscillations, however, MC4100 displayed a positive growth– expression correlation similar to MG1655 (Fig. S10C), suggesting that short-timescale growth variations, obscured in MC4100 by its slow oscillatory behavior, are the relevant drivers.

Next, we explored whether the single-cell behavior observed in Fig. 1, namely the violation of the population-level growth law, is specific to the flagellum system or if it constitutes a general phenomenon at the single-cell level. To this end, we repeated our experiments using an unregulated synthetic expression system, in which five different strains that express the yellow fluorescent protein, Venus, were under the control of synthetic constitutive promoters with various strengths. Although our bulk measurements (Fig. 4A, black markers) as in [5] experimentally confirm the standard growth law—where growth rate decreases as unnecessary protein expression increases—binning analyses of instantaneous growth rate measurements revealed a strikingly different behavior at the single-cell level. As with the flagellum system, our analysis shows that an instantaneous increase (decrease) in fluorescent protein synthesis is accompanied by a faster (slower) elongation rate (Fig. 4A; cross-correlation between elongation rate and activity in Fig. S11). The result in Fig. 4A appears to contradict the growth law, which posits that the increased synthesis of unnecessary proteins, such as the fluorescent proteins in our experiments, depletes ribosomal availability and should therefore reduce growth rate. However, our single-cell analyses reveal a contrasting behavior, where transient boosts of growth accompany increases in protein synthesis. Consequently, we hypothesized that the observed behavior likely arises from an out-of-balance growth process during which a surplus (or deficit) of beneficial growth factors are redistributed at division from the mother at no cost to the daughter cell. The process is out-of-balance because the post-division abundance of growth factors (such as ribosomes) reflects unequal inheritance rather than balanced synthesis.

Relationship between instantaneous growth and promoter activity in single cells versus population for different levels of constitutive expression of unnecessary proteins.

(A) For a set of strains expressing the fluorescent protein Venus, driven by promoters of varying strengths (P1–P6, as indicated at the top), we calculated the mean elongation rate as a function of promoter activity. The diamond markers represent the average for the entire population of each strain, and the black dotted line indicates the linear fit through these population averages. The colored dots show the binned single-cell data for each strain. In each strain, cells exhibit a positive correlation between elongation rate and constitutive promoter activity. The inset shows the same data with the x-axis plotted on a log scale. (B) (top) Single-cell simulations for increasing average allocation fraction to unnecessary protein expression, capturing both the decreased elongation rate with increasing activity at the population level across conditions (diamonds), as well as the binned data, that shows the increase in elongation rate with activity at the single-cell level within conditions (colored dots). (bottom) Single-cell simulations of elongation rate as a function of activity with identical parameters to (top), except without noise due to unequal partitioning of ribosomes at division (σR=0), showing that the positive slope between elongation rate and activity seen at the single-cell level (top) is driven by fluctuations in ribosome concentration at the single-cell level. (C) Simulations were performed for a fixed fraction of unnecessary proteins, fU= 0.073, and varying noise amplitudes of ribosomes due to unequal partitioning σR=(0,0.02,0.04,0.06) (light gray to black). For each condition, the resulting scattered data of activity versus elongation rate was binned as in the experiments. All simulations were implemented using rules for proteome allocation and division timing from [14], and other simulation parameters are described in SI Section 1.

To illustrate this point, we performed single-cell simulations using a computational model (Fig. 4B) that incorporates all the key elements that govern the growth laws as commonly defined from population measurements (Simulation details in the Supplementary Materials). Importantly, the model also includes the stochastic redistribution of growth factors such as ribosomes and possibly other factors from the mother to daughter cells at division. We find that our binning procedure shows the same positively correlated relationship between growth rate and synthesis of unnecessary proteins observed experimentally (Fig. 4B, top). Moreover, the slope of this relationship becomes steeper as the noise in the distribution of growth factors becomes larger (Fig. 4C). In the absence of noise, the relationship shows no correlation between growth rate and protein synthesis (Fig. 4B, bottom). In contrast, this correlation exhibits increased steepness with higher noise levels, indicating that stochastic partitioning of growth factors amplifies transient out-of-balance states.

This effect shows that instantaneous growth and expression changes are mainly dominated by inheritance noise rather than by the steady-state burden predicted by the growth law. In our model, single-cell growth rate is positively correlated with protein synthesis, so any mechanism generating molecular growth factors heterogeneity across generations will give rise to Simpson’s paradox when comparing single-cell and population-level trends.

Altogether, these simulations illustrate that both behaviors can coexist: the growth burden caused by the synthesis of unnecessary proteins as seen at the population level, and the out-of-balance growth observed at single-cell level, which mediates instantaneous, burden-free synthesis of unnecessary proteins.

Rather than identifying the precise molecular source of heterogeneity, our goal is to examine the consequences of fluctuations of unspecified growth factors for balanced growth models. We therefore propose that these fluctuations underlie an intergenerational mechanism for cost-free (“pre-paid”) protein production. We argue that mother cells can pass molecular growth factors to their daughters, either through excess of their production or asymmetric division, thereby giving these daughters an advantage in synthesizing costly proteins such as the flagellum.

Discussion

Kim et al. [20] found that flagellar genes in E. coli are activated in stochastic pulses. Subsequently, some of us proposed that these pulses are generated by means of an inhibitor of the Class-1 transcription factor, called RflP (formerly YdiV), which creates an ultrasensitive switch that acts as a digital filter to remove small-amplitude input temporal fluctuations in Class-1 expression [24]. While this molecular mechanism demonstrates that pulses can arise without feedback, it does not address how such pulses influence cellular growth.

Knowing that these pulses are not an essential feature of flagellar biosynthesis, especially since some strains can be genetically modified to produce flagella steadily [43], this raises the question of the biological advantage of such a pulsing dynamics in wild-type strains. Here, we address this question through time series analyses to demonstrate that pulses of flagellar gene activity accompany sudden growth boosts that transiently alleviate the growth burden associated with flagellar gene expression. By temporarily mitigating this burden, we speculate that some single cells may more effectively explore and colonize new environments as the cell does not have to immediately ‘pay’ for the of cost flagellum expression. In particular, it would be especially interesting to investigate whether this advantage is more pronounced in nutrient-depleted environments where seeder bacteria with a faster growth rate would also be the ones producing flagella. The fact that the relationship between growth rate and flagella expression or, more generally, fluorescent protein expression at the single-cell level contradicts the population-level trend across strains observed in bulk measurements (Fig. 1, Fig. 4) is reminiscent of Simpson’s paradox and arises from the separation of timescales inherent to these distinct analyses of the same sample [17, 18, 22]. Similar time series analyses based on the separation of timescales may also reveal that out-of-balance growth processes serve as a general mechanism for temporarily mitigating the cost of expressing other energetically demanding endogenous genes [21].

Materials and Methods

Strains

For a complete description of the strains see, Table 2 in the Supplementary Materials. Briefly: MG1655 flagella reporter strains (E98KFD and E98KFD-Pro) carry the gene for yellow fluorescent protein (mVenus NB) inserted downstream of the Class-1 native promoter of the flhDC operon. At the attB site, a Class-2 promoter, fliFp, controls the activity of the cyan fluorescent protein SCFP3A. The E98KFD-Pro strains differ from E98KFD only in that the native Class-1 promoter is replaced with constitutive synthetic promoters with various strengths (P1: Pro2, P2: Pro4, P3: Pro5, from [35]). The strain construction is described in detail in [20].

Strains with constitutive expression of fluorescent proteins (MGPro-Venus; [44]): each strain carries a low-copy plasmid with an SC101 origin [45], in which a synthetic promoter [35] controls the expression of the sequence of the mVenus NB yellow fluorescent protein [46]. We renamed the promoters according to their reported strength and measured fluorescence level (P1: Pro2, P2: Pro4, P4: ProB, P5: ProC, P6: ProD).

MC*: The classic strain MC4100 [42, 47] was modified to correct for its flh5301 point mutation by replacing flhD with the same gene sequence as in MG1655 in order to recover its flagellar expression. We also removed the insertion element IS5 upstream of flhDp as to keep Class-1 levels consistent with those in MG1655. We introduced the mutation (MotA E98K) to make the strain non-motile. Finally, the strain carries the Class-1 and Class-2 fluorescent reporters as in the MG1655 E98KFD strain. To implement mutations, we used a scarless chromosomal engineering strategy described in [20], and all transformations were performed via electroporation.

Growth Media

Microfluidic experiments were conducted using a modified version of Teknova’s MOPS-EZ Medium: MOPS and Supplement EZ at 1x concentration; 0.4% glycerol serving as the carbon source, phosphate at 1.32 mM. Additionally, we added surfactant (Pluronic F-108, Sigma-Aldrich) at a final concentration of 0.85 g/L. Overnight cultures were prepared using LB medium.

Microfluidic device

We used a Mother Machine device customized for the lab setup. The master mold chip has 4 independent main feed channels where growth media flows, feeding smaller microchannels with a closed end (length × width × height = 25 μm × 1.2 μm × 1.2 μm).

We fabricated the microfluidic devices by casting a 10:1 mixture of degassed PDMS and curing agent (Sylgard 184, Dow Corning) onto the mold, curing it overnight at 65 °C. The individual chips were then removed from the mold, and inlets and outlets were created using a 0.75 mm biopsy punch, then they were bonded to a glass slide (25 mm × 40 mm No. 1.5 coverglass (VWR)) using a plasma cleaner and kept at 65 °C for another hour.

Experimental Setup

Individual colonies were cultured overnight at 30°C, then concentrated by centrifugation for 1 minute at 8000 g and loaded into the device by pipetting. After loading the device, the sample was left for 1 hour at 30°C to increase the likelihood of the cells entering the channels by diffusion. Then the trapped cells in the channels began to grow exponentially. We flowed the media through the microfluidic chamber using Tygon tubing (VWR; inside diameter 0.02 inch × outside diameter 0.06 inch), and the flow-through was collected from an outlet via a second tubing into a waste beaker. Growth medium was first pumped at a rate of 20 µl/min for at least 1 hour to allow the inlets and outlets to be cleared.

Subsequently, the flow rate was decreased to 9 µl/min, and the temperature was maintained at 30°C. We allowed approximately 3 hours for the cells to adapt before initiating the measurements. The time-lapse images were acquired every 5 minutes on a Zeiss Axiovert 200M microscope equipped with a Plan-Apochromat 40x/1.3 Oil Ph3 Objective, a CCD camera (Hamamatsu C4742-98-24ERG) and a fluorescence excitation LED illumination source (SOLA SE II, Lumencor). We collected images from the phase-contrast and epifluorescent channels (yellow, blue, and/or red protein filters, depending on the strains). The typical duration of the experiment was 30-60 hours.

Microscopy Image Processing

We used the software BACMMAN (BACteria in Mother Machine Analyzer) [48] for cell segmentation and tracking. Using this tool, we customized various parameters to process images according to each specific strain and condition, and then manually corrected any segmentation errors.

Elongation rate and promoter activity calculations

To obtain the elongation rate ER(t), we determined the first derivative of individual cell sizes S(t) (from birth to division) using a Savitzky–Golay filter (from Python’s SciPy library) with a window of 7 data points (35 minutes). This derivative was then normalized by dividing by the cell size, resulting in the expression ER(t) = 1/S(t) × dS(t)/dt.

As a proxy for the promoter activity, we used PA(t) = 1/S(t) dF(t)/dt, where S(t) is the cell size (number of pixels at time t), and F(t) is the total fluorescence. As in previous literature [17, 20, 49-51], we considered the rate normalized by cell size, thereby adjusting the production rate so that it is expressed per chromosomal equivalent. By normalizing in this way, we focus on the gene’s specific activity, rather than the cell’s total output.

Given that the total fluorescence F(t) is the mean intensity of the cell, C(t), multiplied by its size S(t), one can show that PA can also be written as PA(t) = C(t) × ER(t) + dC(t)/dt. We used this equation to calculate the PA, as the mean fluorescence effectively represents the intracellular concentration of the fluorescent signal, which changes more continuously and is less affected by the sudden halving associated with cell division, making the calculation of the derivative less prone to artifacts. We smoothed C(t) with a window of 35 minutes, which we slid over the time series with an increment of 5 min. We calculated the time derivative using the Savitzky–Golay filte, also with with a window of 35 minutes.

Binning

To analyze the single-cell relationship between ER(t) and Activity(t) across all acquired time points, we first visualized the data using a scatter plot. We then binned the data based on Activity(t) values, as the series displayed a clear pulsatile behavior and distinct mean values across strains. Each bin contained an equal number of data points. Finally, for each bin, we calculated the mean values of both ER(t) and Activity(t).

Pulse-to-pulse analysis

We defined Class-2 pulse-to-pulse intervals by empirically choosing a threshold to determine the on (above the threshold, the pulses) and off states (below the threshold). Then, for each sequence of on-off-on states, we identified the two maxima within the Class-2 activity time series and included the 15 timepoints before and after the maxima for each time interval.

We grouped those intervals so that they had similar durations and rescaled them from 0 to 1. Following the rescaling, we applied a sliding window to compute the average of the Class-2 activity within these normalized intervals, which allowed us to move systematically over the data points. This method provided us with a smoothed representation of the Class-2 activity, highlighting the significant changes. To examine the ‘instantaneous’ relationship between the pulses and other quantities, we analyzed the same time intervals within the same cells initially defined by the Class-2 activity for the elongation rate, division time, etc. We then rescaled the intervals and used the same window size as for Class-2 activity to average them together. The averages of these quantities for different pulse-to-pulse duration intervals are presented in Fig. S5.

Autocorrelation and Cross-Correlation Calculations

To measure the cross-correlation between ER(t) and Activity(t), we used the function ‘correlate’ from the SciPy library in Python, which evaluates how well one series matches another as a function of time displacement called ‘lag’. Prior to performing the calculation, we removed the mean from each time series. To average the cross-correlation for different lineages from the same strain, we adjusted the lag by dividing it by the average division time of the series. This rescaling ensures that fluctuations due to cell division are aligned across the different series. We applied the same methodology for the autocorrelation function.

Filtering frequencies from ‘Size(t)’ time series

To filter out the slow frequency oscillations from the Size S(t) time series in MC*, we first calculated the Power Spectral Density using Welch’s method from Python’s SciPy signal library. For every mother cell’s size time series, we identified the frequency corresponding to the highest power, which we find to be ω = 0.0937 h-1 (∼10.67 h) for most of them. We then applied a notch filter at ω, with a bandwidth of 0.01.

Next, wecalculated a new elongation rate from the filtered series, by fitting an exponential function of the size of each birth-to-division event, S(t) = Sbirth(t) × 2ER * t.

Simulation details: Stochastic Proteome Allocation Model with Division-Coupled Noise

Single-cell simulations were implemented using rules for proteome allocation and division timing from ref. [14], allowing for analysis of instantaneous proteome allocation and growth rate at the level of the individual cell. At each division event, a single daughter cell was simulated which inherited the intracellular composition of the mother cell, maintaining the overall population size. Previous experimental work has shown that the major driver of heterogeneity in ribosome concentration across single cells is not stochastic production, but rather unequal partitioning of ribosomes at division due to the formation of larger clusters [37]. Motivated by this, we modeled the redistribution of ribosomes at division through a stochastic differential equation in which perturbations to the mean maximum allocation to ribosomes,, only occurred at division times, td, then decayed slowly with a mean-reversion timescale τR ≫ 0. Specifically:

where δ(·) is the Dirac delta function and denotes random jump magnitudes drawn from a Gaussian distribution of mean 0 and variance . In the absence of clustering, σR = 0. In contrast, we assumed the main driver of heterogeneity in unnecessary protein allocation is due to stochastic gene expression. As such, we modeled noise in unnecessary protein expression by assuming it continuously fluctuated around its mean allocation value, with timescale τU and with volatility coefficient σU, yielding:

where Wt denotes a Wiener process. Assuming constant cellular density, protein mass fraction is proportional to protein concentration, allowing us to rewrite our previous definition of promoter activity as:where ϕU (t) is the mass fraction of unnecessary proteins and κ(t) is the single-cell growth rate at time t . Using this expression, along with our definition of sector dynamics [14], the activity of unnecessary protein expression could then be calculated as:

The single-cell growth rate and corresponding activity for 3000 individual cells was computed at intervals of Δt = 3min for 75 h total.

Parameters

All other parameters are taken from [14].

Data availability

All the time series (Cell Size, Elongation Rate, Fluorescence, Promoter Activity, etc.) were deposited in Dataverse at https://doi.org/10.7910/DVN/T56JYW and are publicly available. The code to reproduce all the simulations is available on GitHub: https://github.com/josiah-k/Simpsons-paradox.

Acknowledgements

We thank Maximino Aldana and Mark Kim for early discussions and advice, the Cluzel lab members and Fotios Avgidis for critical reading of the manuscript. AI-assisted technologies were used to prepare the manuscript. Funding: This work was performed, in part, at the Center for Nanoscale Systems, a member of the National Nanotechnology Infrastructure Network, which is supported by the National Science Foundation under award no. ECS-0335765. The Center for Nanoscale Systems is part of Harvard University. This work was supported in part by grant 1615487 from the National Science Foundation and by the United States Government, ARPA-H. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the United States Government.

Additional files

Supplemental Material

Additional information

Funding

National Science Foundation (NSF) (1615487)

  • Philippe Cluzel