Speed variations of bacterial replisomes

  1. Deepak Bhat
  2. Samuel Hauf
  3. Charles Plessy
  4. Yohei Yokobayashi
  5. Simone Pigolotti  Is a corresponding author
  1. Biological Complexity Unit, Okinawa Institute of Science and Technology, Japan
  2. Department of Physics, School of Advanced Sciences, Vellore Institute of Technology, India
  3. Nucleic Acid Chemistry and Engineering Unit, Okinawa Institute of Science and Technology, Japan
  4. Genomics and Regulatory Systems Unit, Okinawa Institute of Science and Technology, Japan


Replisomes are multi-protein complexes that replicate genomes with remarkable speed and accuracy. Despite their importance, their dynamics is poorly characterized, especially in vivo. In this paper, we present an approach to infer the replisome dynamics from the DNA abundance distribution measured in a growing bacterial population. Our method is sensitive enough to detect subtle variations of the replisome speed along the genome. As an application, we experimentally measured the DNA abundance distribution in Escherichia coli populations growing at different temperatures using deep sequencing. We find that the average replisome speed increases nearly fivefold between 17 °C and 37 °C. Further, we observe wave-like variations of the replisome speed along the genome. These variations correlate with previously observed variations of the mutation rate, suggesting a common dynamical origin. Our approach has the potential to elucidate replication dynamics in E. coli mutants and in other bacterial species.

Editor's evaluation

This manuscript combines theory with experiments to characterize the replication speed of bacteria chromosomes through the cell cycle. The authors show oscillatory patterns in the replication speed of E. coli, which they relate to the heterogeneity of mutation rates along the genome, suggesting a tradeoff between the speed and accuracy of replication. This work presents an elegant approach for investigating bacterial growth from a systems biology perspective.



Every cell must copy its genome in order to reproduce. This task is carried out by large protein complexes called replisomes. Each replisome separates the two DNA strands and synthesizes a complementary copy of each of them, thereby forming a Y–shaped DNA junction called a replication fork. The speed and accuracy of replisomes is impressive (Baker and Bell, 1998). They proceed at several hundreds to one thousand base pairs per second (Pham et al., 2013; Elshenawy et al., 2015), with an inaccuracy of about one mis-incorporated monomer every 10 billion base pairs (Schaaper, 1993). In bacteria, two replisomes initiate replication at a well-defined origin site on the circular genome, progress in opposite directions, and complete replication upon encountering each other in a terminal region.

The initiation and the completion of DNA replication conventionally delimit the three stages of the bacterial cell cycle (Dewachter et al., 2018; Wang and Levin, 2009). The first stage, B, spans the period from cell birth until the initiation of DNA replication. The second stage, C, encompasses the time needed for replication. The last phase, D, begins at the end of DNA replication and concludes with cell division. While it is established that DNA replication and the cell cycle must be coordinated, their precise relation has been a puzzle for decades (Willis and Huang, 2017). A classic study by Cooper and Helmstetter, 1968 finds that, upon modifying the growth rate by changing the nutrient composition in Escherichia coli, the durations of stages C and D remain constant at about 40 min and 20 min, respectively. This means that the replisome speed must be unaffected by the nutrient composition, at least on average. When the cell division time is shorter than one hour, DNA replication is initiated in a previous generation. This implies that, in fast growth conditions, multiple pairs of replisomes simultaneously replicate the same genome (Fossum et al., 2007). Tuning the growth rate by changing the temperature has a radically different effect on bacterial physiology. For example, in vivo (Pierucci, 1972) and in vitro (Yao et al., 2009) studies show that the speed of replisomes is affected in this case.

More precise features of replisome dynamics, such as whether their speed is approximately constant or varies along the genome, are important to determine the location of their encounter point and the distribution of replication errors on the genome (Niccum et al., 2019; Dillon et al., 2018). However, this detailed information is hard to obtain (Pham et al., 2013). One way for inferring it is to measure the DNA abundance distribution, that is the frequency of DNA fragments along the genome in an exponentially growing cell population. In fact, the frequency of these fragments in the population depends on the proportions of synthesizing genomes of different lengths, which in turn depend on the replisome dynamics. Previous studies have used the DNA abundance distribution to understand the functioning of bacterial replication and how different proteins assist completion of DNA replication (Wendel et al., 2014; Wendel et al., 2018; Rudolph et al., 2013; Midgley-Smith et al., 2019; Midgley-Smith et al., 2018). However, these studies focused on qualitative analysis of the observed changes of the DNA distribution in knockout mutants with respect to the wild type, and did not attempt to predict the shape of the distribution using quantitative theoretical models. The DNA abundance distribution has also been used to identify actively growing species in a microbiome (Korem et al., 2015).

In this paper, we introduce a method to infer the replisome dynamics from the DNA abundance distribution. As an application, we experimentally measured the DNA abundance distribution of E. coli growing at different temperatures between 17Co and 37Co using high-throughput sequencing. Our approach, combined with our experiments, shows that the average speed of replisomes exhibits an Arrhenius dependence on the temperature, with an almost fivefold variation in the range we considered. Moreover, the precision of our experiments reveals that the speed of replisomes varies along the genome in a seemingly periodic and highly repeatable fashion around this average value. We find that this pattern is highly correlated with previously observed wave-like variations of the single base pair mutation rate along the bacterial genome (Niccum et al., 2019; Dillon et al., 2018). We discuss possible common causes for these two patterns.


Distribution of genome types

We consider a population of bacteria that grow exponentially in a steady environment. Each cell in the growing population can encompass three types of genomes, see Figure 1a and Figure 1b: (i) one template genome, that is, the genome that the cell inherited at its birth. (ii) incomplete genomes, that is, genomes which are being synthesized. (iii) post-replication genomes that will be passed to new cells and become their templates.

Dynamics of genome types and DNA abundance distribution in an exponentially growing bacterial population.

(a) Cell cycle. In slow growth conditions (top panel), newborn cells contain a template (stage B, red). As the cell cycle progresses, two replisomes synthesize a new genome (stage C, blue) starting from the origin on the template (yellow spot). When replication terminates, cells contain the original template and a post-replication genome (stage D, green). Upon subsequent cell division, the post replication genome becomes the template for the newborn cell. In fast growth conditions (bottom panel), newborn cells acquire a template which is already undergoing synthesis. In subsequent stages, multiple replicating genomes may exist in the same cell. (b) Composition of genomes in an exponentially growing population of cells. Each cell may contain a different number of genomes, depending on its stage in the cell cycle and growth conditions. (c) Dynamics of genome types. Dashed blue arrow represent initiation of replication. The dotted green arrow represents completion of replication. The solid red arrow represents cell division. (d) Replisome dynamics. Two replisomes begin replication at an origin and progress in opposite directions. Their speed may depend, in general, on their genome coordinate. (e) Sketch of the DNA abundance distribution as a function of the genome coordinate. All three types of genomes contribute to the DNA abundance distribution. Because of incomplete genomes, the DNA abundance is largest at the origin and smallest at the terminal region (i.e., towards the periphery). (f) Experimental DNA abundance distribution at different temperatures.

In nutrient-rich conditions, bacteria replicate their genome in parallel, so that the numbers of incomplete genomes and post-replication genomes per cell are variable, see Figure 1b. The classic Cooper-Helmstetter model (Cooper and Helmstetter, 1968) describes the dynamics of these genomes in a given cell through generations. We adopt a different approach and focus on the abundance dynamics of the three types of genomes in the whole population. We call NT(t), NS(t), NP(t) the total number of template genomes, incomplete (synthesizing) genomes, and post-replication genomes, respectively, that are present in the population at time t. Our first aim is to quantify the relative fractions of these three types of genomes.

The total number of genomes is N(T)=NT(t)+NS(t)+NP(t). Since each cell contains exactly one template, the total number of cells is equal to NT(t). The total number of genomes evolves as effect of: (a) replication initiation, which creates new synthesizing genomes at a rate k; (b) completion of replication, which transforms synthesizing genomes into post-replication ones at rate β; and (c) cell division, which turns post-replication genomes into templates at a rate α, see Figure 1c. This dynamics is described by the set of equations:

(1) ddtNT(t)=αNP
(2) ddtNS(t)=kN-βNS
(3) ddtNP(t)=βNS-αNP.

It follows from Equations 1–3 that, in steady growth, the total number of genomes grows exponentially at a rate equal to the fork firing rate k. In this exponential regime, the fractions of the three genome types are constant:

(4) NT(t)N(t)=αβ(k+β)(k+α)
(5) NS(t)N(t)=k(k+β)
(6) NP(t)N(t)=βk(k+β)(k+α).

The ratio N/NT can be interpreted as the average number of genomes per cell. Since this ratio is constant, the fork firing rate k can also be identified as the exponential growth rate of the number of cells. For this reason, from now on, we refer to k as the ‘fork firing rate’ or the ‘growth rate’ interchangeably.

In principle, the rates α, β, and k should depend on the ‘age’ of each genome, that is the time spent by the genome in each stage. In Appendix 1, we generalize our model to an age-dependent model to account for this fact. We find that this age-dependent model is equivalent to Equations 1–3 in the exponential growth regime. This result supports the use of our simple model of genome-type dynamics.

We now analyze the incomplete genomes in more detail. We call x1 and x2 the portions of a given incomplete genome copied by the two replisomes at a given time, with 0x1,x2L, see Figure 1d. Replication initiates at x1=x2=0 and completes once the replisomes meet each other, that is, x1+x2=L. The replisome dynamics proceeds as follows. Each replisome is characterized by a speed which depends, in principle, on the replisome position (be it x1 or x2) and by a diffusion coefficient representing random fluctuations of the speed. The coordinates x1,x2 of the two replisomes evolve according to the stochastic differential equations:

(7) ddtx1(t)=v(x1)+2Dξ1(t)ddtx2(t)=v(x2)+2Dξ2(t),

where ξ1(t) and ξ2(t) are white noise variables satisfying ξ1(t)=ξ2(t)=0, ξ1(t)ξ1(t)=ξ2(t)ξ2(t)=δ(t-t), and ξ1(t)ξ2(t)=0. Here, denotes an average over realizations.

Close to thermodynamic equilibrium, the diffusion coefficient D can be estimated by the Stokes-Einstein relation (Hynes, 1977). However, since replisomes are driven far from equilibrium by hydrolysis of dNTPs, their diffusion coefficient could deviate from this estimate. In the absence of fluctuations (D=0), each of the two replisomes copies exactly half of the genome, whereas for D>0 their meeting point is characterized by a certain degree of uncertainty.

In steady exponential growth, we call pst(x1,x2) the stationary probability distribution of finding an incomplete genome with copied portions x1 and x2. This probability distribution satisfies the equation:

(8) [vpst]+D2pst-kpst=0,

where we introduce the vector notation v=(v(x1),v(x2)) and =(/x1,/x2). The last term in the right hand side of Equation 8 is a dilution term that accounts for the exponential increase in newborn cells. We formally derive Equation 8 and discuss its boundary conditions in Methods.

DNA abundance distribution

The DNA abundance distribution A(y) is the probability that a small DNA fragment randomly picked from the population originates from genome position y, see Figure 1e. We define the genome coordinate y[-L/2,L/2], where y=0 corresponds to the origin of replication and L is the genome length. Templates and post-replication genomes yield a uniform contribution to the distribution A(y) (red and green in Figure 1e). In contrast, incomplete genomes contribute in a way that depends on the replisome positions along the genomes (blue in Figure 1e). Our experiments permit to measure the distribution A(y) with high accuracy, see Figure 1f and Methods.

To express the distribution A(y) in our model, we first introduce the probability P(y) that a randomly chosen genome (either complete or incomplete) in the population includes the genome location y. This probability is expressed by

(9) P(y)=kβ+k[|y|Ldx10Lx1dx2 pst(x1,x2)ycopiedby1streplisome+L|y|Ldx20Lx2dx1 pst(x1,x2)ycopiedby2ndreplisome]+βk+βyinacompletegenome .

where the two terms in square brackets reflect the fact that either of the replisomes can in principle have copied position y, and we used that a randomly chosen genome is complete with probability (1-NS/N)=β/(k+β), see Equation 5. The DNA abundance distribution A(y) is proportional to P(y), up to a normalization constant:

(10) A(y)=P(y)L/2L/2P(y)dy.

For given choices of v(x), D, and k, our theory permits to compute the distribution of incomplete genomes pst(x1,x2) via Equation 8. From this solution, we can also calculate β as the steady rate at which replisomes complete replication (see Methods). This information can be used to compute the DNA abundance distribution A(y) using Equation 10. Therefore, by experimentally measuring the DNA abundance distribution, we can test our hypotheses on the speed function v(x) and the diffusion coefficient D.

Constant speed model

We first consider a scenario in which replisomes progress at a constant speed v¯ and without fluctuations, D=0. We find that, in this case, the DNA abundance distribution is expressed by

(11) A(y)=k2v¯[1ekL/2v¯]ek|y|/v¯ ,

see Methods. We fit this distribution to the experimental data using maximum likelihood, see Figure 2a. The speed v¯ is the only fitting parameter, because we independently measure the exponential growth rate k from the optical density, see Methods.

Figure 2 with 3 supplements see all
Results of the constant speed model.

(a) DNA abundance distribution for T=37Co. Orange circles represent experimental data. The solid black line is the prediction of our model assuming constant speed and D=0. Fits are performed using a maximum likelihood method, see Appendix 2 for details. The quality of fits for replicates and other temperatures is comparable, see Figure 2—figure supplement 1, Figure 2—figure supplement 2, and Figure 2—figure supplement 3. In particular, fits of replicates yield similar values of the speed v¯. (b) Replisome speed as a function of temperature. Error bars represent sample-to-sample variations. (c) Comparison of the temperature-dependence of speed and growth rate (see Methods for details on the growth rate estimation). The solid curves are fits of Arrhenius laws to the data. The fitted parameters are A=(2.5±5.3)×108bps-1, ΔR=(50±5)kJmol-1, B=(6.0±24.9)×1012hr-1 and ΔC=(74±10) kJmol1. We exclude the data point for T=17Co in both fits. (d) Estimated number of replisomes per complete genome at different temperatures. The red triangles represents the estimate from Equation 19 in which we use the expression of β for the constant speed model, Equation 22. The black circles are the estimates from Equation 20.

We find that the speed increases nearly fivefold with temperature in the range we considered and appears to follow an Arrhenius law, see Figure 2b. This behavior resembles that of the growth rate. The effective activation energy characterizing the cell cycle is larger than that characterizing the replisome speed, see Figure 2c, possibly due to the large number of molecular reactions involved in the cell cycle. The data point at 17°C appears to deviate from the Arrhenius law for both the speed and the growth rate (Roy et al., 2021), see Figure 2c.

As seen in Figure 2c, the replisome speed does not increase as fast as the growth rate at increasing temperature. This observation suggests that the number of replisomes per genome should increase with temperature to compensate for this gap. Indeed, in the temperature range we studied, our model predicts that the average number of replisomes per complete genome increases from two to almost six, see Figure 2d and Methods.

We now focus on the average genome content per cell. Since the model assumes that genomes evolve independently, the average DNA content per cell C is the product of the average genome length times the average number of genomes per cell N/NT. Computing the average genome length in the model (see Methods) and using Equation 4 for the average number of genomes per cell, we obtain that the average DNA content per cell is expressed by

(12) C=2v¯kk+αα[ekL/(2v¯)1].

The classic Cooper-Helmstetter model (Cooper and Helmstetter, 1968) predicts the DNA content per cell assuming constant durations of stages B, C, and D of the cell cycle. Since we assumed constant speed and D=0, the duration of the replication cycle L/(2v¯) is constant in our case as well. As a consequence, the prediction of Equation 12 is equivalent to that of the Cooper-Helmstetter model (see Appendix 3).

It is generally believed that the ratio between the average DNA content per cell C and the protein content per cell should be maintained approximately constant at varying physiological conditions. This implies that C should be proportional to the cell size. If the growth rate is varied by changing the nutrient composition, v¯ remains constant (Cooper and Helmstetter, 1968). Equation 12 then predicts an approximately exponential growth of C with k, which is consistent with observations. In this case, the Schaechter–Maaloe–Kjeldgaard growth law states that the cell size grows exponentially with k (Schaechter et al., 1958), thereby ensuring DNA-protein homeostasis. In the case of varying temperature, we find that v¯ and k present a similar dependence on T (see Figure 2c), so that their ratio and thereby C weakly depends on k (see Appendix 3—figure 1). Our result is consistent with observations showing that, at increasing temperature, the cell size remains approximately constant (Shehata and Marr, 1975) or possibly slightly increases (Trueba et al., 1982).

Oscillating speed model

The assumption of constant speed leads to a rather good fit of our DNA abundance data. However, the precision of our data permits us to appreciate systematic deviations from the model predictions under the constant speed hypothesis, see Figure 3a-e. These deviations appear as regular oscillations as a function of the genome coordinate. They are evident at all the temperatures we studied except for 17Co, where they are barely visible. They are highly repeatable (see Figure 3—figure supplement 1, Figure 3—figure supplement 2 and Figure 3—figure supplement 3) and approximately symmetric with respect to the origin of replication. We also analyzed previous experimental data from Midgley-Smith et al., 2018. We observed clear oscillations for experiments in nutrient-rich LB medium, but not for experiments in M9 minimal medium, see Figure 3—figure supplement 4. This analysis further supports that this phenomenon is robust, at least in fast growth conditions.

Figure 3 with 5 supplements see all
Wave-like deviations from the predictions of the constant speed model.

(a–e) Colored lines: ratios of the experimental DNA abundance over the corresponding prediction assuming constant speed and D=0. The solid black lines represent the ratios of the predictions assuming oscillatory speed, Equation 13 and D0, over constant speed and D=0. Corresponding plots for replicates and other temperatures are presented in Figure 3—figure supplement 1, Figure 3—figure supplement 2 and Figure 3—figure supplement 3. (f–j) Experimental DNA abundance distribution at different temperatures. The solid black lines are the fits of the oscillatory speed model. Tests based on the Akaike information criterion show that the oscillatory speed model should be chosen over the constant speed model for all the replicates and at all temperatures, see Figure 3—figure supplement 5. The fitted parameters are reported in Table 1.

To account for these observations, we introduce a more refined model in which the replisome speed oscillates along the genome:

(13) v(x)=v¯[1+δcos(ωx+ϕ)],

where δ represents the relative amplitude of oscillations; ω their angular frequency along the genome; and ϕ their initial phase. We also take into account random speed fluctuations in this case, D0. In this case, we predict the DNA abundance distribution using stochastic simulations, see Methods. By fitting the DNA abundance, we estimate the parameters v¯, δ, ω, ϕ, and D, see Figure 3(f–j) and Table 1.

Table 1
Parameters of the oscillatory speed model.

Temperatures are expressed in Co, v¯ in bps-1, ω in radMbp-1, ϕ in rad, and D in kbp2s-1. Reported values are averages and standard deviations over experimental replicates. The oscillatory and constant speed models yield estimates of the parameter v¯ that are consistent with each other, see Appendix 4—table 1.


Our fitted speed oscillations are reminiscent of a previously observed wave-like pattern in the mutation rate along the genome of different bacterial species (Dillon et al., 2018; Niccum et al., 2019). For a quantitative comparison, we analyze this pattern in a mutant E. coli strain lacking DNA mismatch repair (Niccum et al., 2019). We find that the oscillations in mutation rate and speed are highly correlated, see Figure 4a. The mutation rate appears approximately in phase with the speed, meaning that regions where replisomes proceed at higher speed are characterized by a higher mutation rate. This observation leads to the hypothesis that the two phenomena have a common cause.

Figure 4 with 2 supplements see all
Results of the oscillatory speed model.

(a) Solid lines: relative speeds v(|y|)/v¯ along the genome (Red: T = 22°C, sky blue: T = 27°C, brown: T = 32°C, and orange: T = 37°C). We omitted the curve for T = 17°C as the oscillations are less evident in this case (see Figure 4—figure supplement 1). The wave-like pattern of the speed is quantitatively similar to the variations of the mutation rate along the genome (green triangles, from Niccum et al., 2019; Pearson correlation coefficients between speed and mutation rate: r22C=0.42; r27C=0.84;r32C=0.80 and r37C=0.69). The mutation rate is defined as the number of base pair substitutions per generation per kilo base pairs. The solid green line is a fit to the mutation rate data with the same function as in Equation 13. The fit parameters are v¯=2.4kbp-1gen-1, δ=0.18, ω=4.9radMbp-1 and ϕ=1.93rad. (b) Temperature dependence of angular frequency of oscillation ω (squares). (c) phase ϕ (squares). Green triangles in (b) and (c) represent the angular frequency and phase, respectively, from the fit to the mutation rate data with Equation 13. (d) Diffusion coefficient D. Circles represent individual fitted values of diffusion coefficients. Blue circles represent cases in which the fitted value of D is either zero or not significant (see SI). This occurs in two out of nine cases for 37°C and three out of nine cases for each of the other temperatures.

We consider two possible explanations for these oscillations. The first is that the oscillations originate from a systematic process related with the cell cycle (Niccum et al., 2019). The second explanation is that the oscillations are caused by competition among replisomes for nucleotides or other molecules required for replication. Assuming approximately constant cell division times, we estimate the cell division time as τ=(ln2)/k. Since k is also equal to the fork firing rate per genome, the time between firing events in a cell is also approximately equal to τ, so that the two hypotheses lead to the same quantitative prediction. If the speed of replisomes was coupled to a factor oscillating with period τ, this would cause spatial oscillation of speed with angular frequency ω=2π/(v¯τ)=2πk/[(ln2)v¯]. This prediction qualitatively agrees with our fitted values of ω, see Figure 4b.

If the wave-like pattern were caused by competition among replisomes, one would expect either a minimum of the speed every time a new fork is fired (ϕ=π) or the speed to start decreasing when a new fork is fired (ϕ=π/2). Our fitted values of the phase ϕ are also compatible with this range, see Figure 4c.

Our results show that the diffusion coefficient D is quite small. For about one third of our experimental realizations at each temperature, our fitted value of D is not significant according to the Akaike information criterion (see Figure 4d and Figure 4—figure supplement 2). For comparison, we estimate the equilibrium diffusion constant of replisomes in the cytoplasm from the Stokes-Einstein relation as DSE6kbp2s1 (see Appendix 5), of the same order of magnitude as our fitted values, see Table 1 and Figure 4. These results suggest that, despite their high average speed, the fluctuations of the replisome position are remarkably similar to the equilibrium case.

The diffusion coefficient determines the uncertainty about the genome site where the two replisomes meet. In the absence of diffusion (D=0), replisomes would always meet at the midpoint on the circular genome. For D>0, we estimate the typical size lD of the region in which the two replisomes meet as follows. Since the fitted values of δ and D are both small, we approximate the replication time as τCL/(2v¯). In this time, the accumulated uncertainty due to diffusion is equal to lD22DτD. From our estimated diffusion coefficients and average velocities, we obtain values of lD on the order of 100200kbp, depending on temperature.

We remark that our bacterial cultures are grown in LB medium. Although the growth curves appear exponential before saturating (see Methods), the nutrient composition can be such that the assumption of steady growth made in our model is not valid. It is therefore important to scrutinize whether the oscillations can be a consequence of this issue. To this aim, we analyzed a version of the model in which the fork firing rate is not steady, but gradually declines with time, see Appendix 7. We find that the discrepancy between the DNA abundance distributions predicted by this model and by the steady-state model is very small compared to the oscillations we observe. This observation supports that a potential lack of steady-state in the LB medium is not a likely cause for the oscillations.


In this paper, we infer the dynamics of replisomes from the DNA abundance distribution in a growing bacterial population. Our theory can be seen as a generalization of the classic Cooper-Helmstetter theory (Cooper and Helmstetter, 1968; Bremer and Churchward, 1977), that permits to estimate the duration of the replication period from the abundance of certain genomic locations in a growing population, see e.g. (Zheng et al., 2016; Si et al., 2017). While the Cooper-Helmstetter theory assumes constant replisome speed, our approach allows for varying speeds. We test our method by measuring the DNA abundance distribution of E. coli populations growing at different temperatures. We thereby accurately estimate the average speed of replisomes in vivo, and their speed variations along the genome.

We find that the dependence of the average replisome speed on the temperature is well described by an Arrhenius law, similar to that governing the population growth rate. This quantitative dependence can be used to deduce other laws governing bacterial physiology at varying temperature. For example, we argue that precise DNA-protein homeostasis requires the cell size to mildly vary with temperature. This prediction is in qualitative agreement with previous observations (Shehata and Marr, 1975; Trueba et al., 1982) and calls for more systematic measurements of cell parameters at varying temperature, similar to what has been done in the case of varying nutrient composition (Si et al., 2017).

Our approach reveals a wave-like oscillation of the replisome speed along the E. coli genome. The relative amplitude of these oscillations ranges from 10% to 20% of the average replisome speed. A quantitatively similar pattern was observed in the bacterial mutation rate along the DNA of an E. coli mutant strain (Niccum et al., 2019) and of other bacterial species (Dillon et al., 2018). This similarity suggests that the two phenomena have a common dynamical origin. In particular, we hypothesize that this correlation could be a manifestation of the trade-off between accuracy and speed that characterizes DNA polymerases (Sartori and Pigolotti, 2013; Banerjee et al., 2017; Fitzsimmons et al., 2018). Because of this trade-off, any mechanism increasing the speed of a polymerase is expected to increase its error rate as well.

Our analysis of the frequency of these oscillations supports that this pattern may originate from a process synchronized with the cell cycle (Dillon et al., 2018), whose activity alters the replisome function. An alternative hypothesis is that the oscillations originate from competition among replisomes for shared resources, such as nucleotides. According to this idea, the firing of new forks can hinder the progression of existing replisomes. The frequency of oscillations is compatible with both explanations. The following additional evidence supports the latter hypothesis. We did not observe appreciable oscillations for our lowest temperature of 17°C, which according to our estimates falls outside the multi-fork replication regime. Further, we found that oscillations disappeared when analyzing previous data from a culture grown in a minimal medium (Midgley-Smith et al., 2018), where multi-fork replication is also not expected. Similarly, oscillations in the mutation rate are also less evident at lower temperature and disappear in minimal medium (Niccum et al., 2019). On one hand, these facts point to competition between multiple replisomes in the same cell as a likely source for the oscillations. On the other hand, given the difficulty of obtaining steady exponential growth in LB medium, further experiments will be important to assess an eventual effect of the growth medium on the DNA abundance shape.

Beside these regular and repeatable variations, our analysis shows that random fluctuations of replisome speed are quite small, leading to an uncertainty of about 100200kbp on the location of the replisome meeting point. In bacteria, the terminal region of replication is flanked by two groups of termination (Ter) sites having opposite orientations. Ter sites are the binding sites for the Tus protein and permit passage of replication forks in one direction only (Elshenawy et al., 2015), so that the two groups effectively trap the two forks in the terminal region (Duggin et al., 2008). Out of the ten Ter sequences in E. coli, only two of them (TerB and TerC) are within 100200kbp of the point diametrically opposite to the origin. These two sequences have the same orientation. Our result therefore implies that most Ter sequences are usually not needed to localize the replisome meeting point. This prediction is consistent with previous observations that the phenotypes of Tus- E. coli mutants (Roecklein et al., 1991) or mutants lacking Ter sequences (Duggin et al., 2008) do not appear distinct from that of the wild type.

Quantitative modeling of the DNA abundance distribution has the potential to shed light on aspects of replisome dynamics beyond those explored in this paper. For example, it was observed that the knockout of proteins involved in the completion of DNA replication leads to either over-expression or under-expression of DNA in the terminal region (Wendel et al., 2014; Wendel et al., 2018; Sinha et al., 2018). Incorporating the role of these proteins into our model will permit to validate possible explanations for these patterns. More in general, our approach is simple and general enough to be readily applied to other bacterial species, to unravel common principles and differences in their DNA replication dynamics.


Cultivation and DNA extraction

E. coli MG1655 was cultured in LB medium supplemented with 50 mM MOPS pH 7.2 and 0.2% glucose. Overnight cultures grown at 37 °C were diluted into fresh medium and grown until reaching an OD600 of about 1.0 at the target temperature. These cultures were used to inoculate 50 ml medium at the desired temperature in 500 ml Erlenmeyer flasks with baffles at a target OD of 0.01. Cultivation was performed with shaking at 250 rpm. OD was determined with a NanoDrop One in cuvette mode.

The growth curves of E. coli were highly repeatable (over three replicate experiments for each temperature), see Figure 5a and Figure 5b. We computed the growth rate at each temperature by fitting a logistic function to individual growth curves, see Figure 5c. When the time was rescaled by the average growth rate, OD of different temperatures collapsed along a single curve. The cultures (1.4 ml) were harvested by centrifugation at 21,000 g for 20 s after reaching an OD of around 0.5 (mid-exponential phase, dashed lines in Figure 5a). Cells were kept growing for at least 45 doubling times (as measured in exponential phase) to reach stationary phase. Samples of 0.2 ml from the stationary phase cultures grown at 17°C, 27°C, and 37°C were harvested for DNA extraction. The pellets were immediately frozen at –80 °C until DNA extraction. DNA was extracted in parallel using Genomic DNA Purification Kit from Thermo Fisher Scientific.

Figure 5 with 1 supplement see all
Growth curves at different temperatures.

(a) Optical density (OD) as a function of time at different temperatures. Each curve is averaged over three different replicates at the same temperature. Error bars represent standard deviations. Dashed lines mark the OD window in which the cells are harvested. Solid lines represents the exponential growth curve for each temperature. We computed the growth rate ki for each sample i=1,2,3 at a given temperature by fitting the optical density to a logistic function ai/[1+biexp(-kit)], where ai and bi are sample-specific constants (Zwietering et al., 1990). The growth rate k for each temperature is the average of the kis. (b) Same data as in (a), but the time in the x-axis is scaled by the growth rate k at each temperature. As a result of this rescaling, the growth curves collapse on each other. (c) Average growth rate as a function of temperature. The solid purple line is an Arrhenius fit to the data (see Figure 2), resulting in B=(6.0±24.9)×1012hr-1 and ΔC=(74±10)kJmol-1. We exclude the data point for T=17Co from the fit.


We sequenced three samples in the exponential phase from different experimental realizations for each temperature. In addition, we sequenced three stationary samples at three different temperatures. DNA samples were sheared by ultrasound using Covaris AFA technology. Libraries were prepared using the Illumina DNA PCR-Free Library Prep Kit. Sequencing was performed on a Novaseq6000 using paired-end 150 bp reads.

Alignment and bias elimination

We aligned reads from each sample using Bowtie2 (Langmead and Salzberg, 2012), using the MG1655 genome as a reference. We calculated the frequency of reads as a function of the genome coordinate with bin size 10kbp. To attenuate bias, we divided the frequency at each genome coordinate in a sample from the exponential phase by the frequency of the corresponding bin in a stationary sample (Wendel et al., 2014; Midgley-Smith et al., 2018). We alternatively used all of our three stationary samples to correct the bias of each sample in the exponential phase. Therefore, after bias elimination, we effectively have 3×3=9 different DNA abundance curves in the exponential phase at each temperature. See Figure 2—figure supplement 1, Figure 2—figure supplement 2 and Figure 2—figure supplement 3 for details.

Stationary distribution of replisome positions

In this section, we discuss how to compute the stationary distribution of incomplete replisome positions pst(x1,x2;t). We call nS(x1,x2;t) the number density of incomplete genomes at time t with replisome positions at x1 and x2. By definition 0Ldx10Lx1dx2nS(x1,x2;t)=NS(t). It follows from Equation 7 that this number density evolves according to

(14) tnS(x1,x2;t)=-[vnS]+D2ns,

where =(/x1,/x2) and v=(v(x1),v(x2)). We now introduce the normalized probability p(x1,x2;t)=nS(x1,x2;t)/NS(t). By substituting this definition into Equation 14, we obtain

(15) tp(x1,x2;t)=-[vp]+D2p-kp.

The stationary distribution pst(x1,x2) is a time-independent solution of Equation 15, see Equation 8.

Because of replication completion, the line x1+x2=L is an absorbing state for the dynamics described by Equation 15. Equation 15 must be consistent with Equation 2, which describes the dynamics of incomplete genomes regardless of the coordinates of their replisomes. This implies that the rate β at which replication completes (see Equation 2) must equal to the probability flux through the absorbing boundary:

(16) β=x1+x2=LJn^ dl,

where we introduce the probability current J(x1,x2)=vp-Dp, the unit vector n^=(1/2,1/2), and the infinitesimal line increment dl along the absorbing boundary. Similarly, the probability flux entering the system at (x1,x2)=(0,0) must match the rate of replication initiation as given by Equation 2.

Given a hypothesis on the speed function v(x) and the diffusion coefficient D, we solve Equation 15 at stationarity using the experimentally measured growth rate k. From the stationary solution pst(x1,x2), we obtain β using Equation 16. Our approach does not permit to determine the cell division rate α appearing in Equation 1-Equation 3. However, this rate is not necessary to compute the DNA abundance distribution, which is expressed by Equation 9 and Equation 10 in terms of pst(x1,x2) and β only.

Average genome length

To compute the average genome length, we first note that the integral of P(y) is equal to the average genome length in the population

(17) =L/2L/2P(y)dy.

Combining Equation 17, Equation 10, and the fact that P(0)=1, we obtain a simple relation between the DNA abundance distribution and the average genome length:

(18) =A(0)1.

Average number of replisomes per complete genome

We estimate the average number of replisomes per complete genome N in two alternative ways. On the one hand, using Equation 4-Equation 6 we find that

(19) N=2NSNP+NT=2kβ.

On the other hand, it can be seen in Figure 2d that the fraction of complete genome in the population is equal to the ratio A(L/2)/A(0) between the DNA abundance at the terminal and at the origin. It follows that

(20) N=2[A(0)A(L/2)]A(L/2).

Constant speed

We focus on the scenario with constant speed and D=0. In this case, the steady solution of Equation 15 is given by

(21) pst(x1,x2)=kek2v¯(x1+x2)v¯(1ekL/(2v¯))δ(x1x2).

The rate at which replication completes is equal to

(22) β=ke-kL/(2v¯)1-e-kL/(2v¯).

Substituting Equation 21 and Equation 22 into Equation 9, we obtain.

(23) P(y)=ek|y|/v¯,

from which Equation 11 follows by normalizing, see Equation 10.

We exactly solved Equation 15 also in the case where the speed depends on the genome coordinate, provided that the diffusion coefficient vanishes, see Appendix 6.

Stochastic simulations

In the case of oscillating speed and D>0, we compute the stationary solution of Equation 15 using numerical simulations. To this aim, we interpret Equation 15 as describing a stochastic process subject to stochastic resetting (Evans and Majumdar, 2011). Specifically, we perform stochastic simulations of Equation 7. In addition to the dynamics described by Equation 7, with a stochastic rate equal to the fork firing rate k, trajectories are reset to the origin, x1=x2=0 (blue trajectory in Figure 6a). Since the boundary x1+x2=L is an absorbing state, trajectories that reach this boundary are also reset to the origin (green trajectory in Figure 6a). The probability distribution associated with this dynamics evolves according to Equation 15. We simulate this stochastic dynamics to estimate the stationary distribution pst(x1,x2) in a computationally efficient way, see Figure 6b. We estimate from the same simulations the parameter β as the empirical rate at which the absorbing boundary is reached, see Equation 16.

Replisome dynamics in the (x1,x2) plane.

(a) Two different trajectories demonstrate two different types of resetting events in our simulations. Trajectories are reset to x1=0,x2=0 when the two replisomes complete replication (green trajectory) at the absorbing boundary (solid red line). Additionally, trajectories can be reset from any position to the origin at a rate k (sky blue) to take care of the dilution term in Equation 15. (b) Replisome position distribution pst(x1,x2) in the steady state. In both panels, parameters are v¯=973bps-1, δ=0.19, ω=4radMbp-1, ϕ=3.1rad and D=55kbp2s-1. These parameters are on the order of those fitted from experiments (see Table 1), except for D which is chosen to be larger for illustration purposes.

Appendix 1

Age-dependent dynamics of genome types

In this Appendix, we generalize the model embodied in Equations 1–3 to the more realistic case in which genomes transition from one stage to another with an age-dependent rate. Specifically, we call τ the time since a genome fired its last fork and define the time-dependent fork firing rate k(τ). The time-dependent fork firing rate can be expressed in terms of the fork-firing time-distribution f(τ) as k(τ)=f(τ)/[10τf(τ)dτ].

We also define the age a of genomes in synthesizing or post-replication stage, i.e. the time they spent in their stage. We define the age-dependent rate of completion β(a) and the age-dependent duration of post-replication stage α(a). These rates can be expressed in terms of the distribution of time to complete the synthesizing stage, g(a) and the distribution of time to complete the post-replication stage, h(a). The relations are: β(a)=g(a)/[10ag(a)da] and α(a)=h(a)/[10ah(a)da].

We call nT(τ;t) the number density of templates at time t that fired their last fork at time t-τ. We call nS(a,τ;t), and nP(a,τ;t) the number density of synthesizing and post-replication genomes (respectively) at time t that fired their last fork at time t-τ and with age a. These densities evolve according to the coupled equations:

(24) nTt=nTτk(τ)nT+δ(τ)0 dτ k(τ)nT(τ;t)+0α(a)nP(a,τ;t)da
(25) nSt=nSanSτk(τ)nS+δ(τ)0 dτk(τ)[δ(a)n(τ;t)+nS(a,τ;t)]β(a)nS
(26) nPt=nPanPτk(τ)nP+δ(τ)0 dτ k(τ)nP(a,τ;t)+δ(a)0β(a)nS(a,τ;t)daα(a)nP,

where n(τ;t)=nT(τ;t)+0nS(a,τ;t)da+0nP(a,τ;t)da is the total number density of genomes at time t that fired their last fork at time t-τ.

The total numbers of templates, synthesizing, and post-replication genomes are respectively given by NT(t)=0nT(τ;t)dτ, NS(t)=0nS(a,τ;t)dadτ, and NP(t)=0nP(a,τ;t)dadτ. Integrating Equation 24 with respect to τ and Equations 25; 26 with respect to a and τ we obtain

(27) dNTdt=0α(a)nP(a,τ;t)dadτ
(28) dNSdt=0 dτ k(τ)n(τ;t)0 β(a)nS(a,τ;t)dadτ
(29) dNPdt=0 β(a)nS(a,τ;t)dadτ0α(a)nP(a,τ;t)dadτ.

The total number of genomes, N(t)=NT+NS+NP grows with time as

(30) dNdt=0 dτ k(τ)n(τ;t).

If the fork firing rate is independent of τ, k(τ)=k, then the number of genomes grows as N=N(0)ekt. A computation of the growth exponent for general case requires knowledge of n(τ,t). Using the definition of n(τ,t) in Equation 24-Equation 26 leads to

(31) tn(τ;t)=τn(τ;t)k(τ)n(τ;t)+2δ(τ)0 dτ k(τ)n(τ;t).

We assume that the age dependent genome population scales with the total number of genomes, n(t;τ)=q(τ;t)N(t). This assumption for n(t;τ) yields an exponential growth for the number of genomes, N=N(0)ek^(t)t, with a time-dependent exponent,

(32) k^(t)=0k(τ)q(τ;t)dτ.

Substituting this along with the relation n(τ;t)=q(τ;t)N(t) in Equation 31, we find

(33) tq(τ;t)+k^(t)q(τ;t)=τq(τ;t)k(τ)q(τ;t)+2δ(τ)0 dτ k(τ)q(τ;t).

In the steady state, qst(τ)q(τ;t) is independent of time. In this limit, the growth rate in Equation 32 is also time-independent, k^=0k(τ)qst(τ)dτ. Using these conditions in Equation 33, we find

(34) qst(τ)=2k^ek^τ[10τf(τ)dτ],

where we used e0τk(τ)dτ=10τf(τ)dτ. The normalization condition, 0qst(τ)dτ=1, yields the Euler-Lotka equation for the relation between k^ and fork firing time distribution f(τ),

(35) 20ek^τf(τ)dτ=1.

In the steady state, we assume that nT(τ;t)=NT(t)qT(τ), nS(a,τ;t)=NS(t)qS(a,τ) and nP(a,τ;t)=NP(t)qP(a,τ), see, e.g., (Jafarpour et al., 2018). Because of the definitions of NT, NS and NP, we have the normalization conditions

(36) 0qT(τ)dτ=1,0qS(a,τ)dadτ=1,0qP(a,τ)dadτ=1.

These conditions permit to express the dynamics of genome types as

(37) dNTdt=α^NP,dNSdt=k^N-β^NS,dNPdt=β^NS-α^NP.

where α^=0α(a)qP(a,τ)dadτ and β^=0 β(a)qS(a,τ)dadτ. From Equation 37, we obtain in the long time limit:

(38) NTN=α^β^(k^+β^)(k^+α^),NSN=k^k^+β^,NPN=β^k^(k^+β^)(k^+α^).

These relations are equivalent to Equation 4-Equation 6 for age dependent rates.

To compute β^ and α^ more explicitly from Equation 25 and Equation 26, we solve for the marginals q¯S(a)=0qS(a,τ)dτ and q¯P(a)=0qP(a,τ)dτ respectively. We obtain

(39) q¯S(a)=(k^+β^)ek^a0aβ(a)da,q¯P(a)=(k^+α^)ek^a0aα(a)da.

Using these marginals in the definition of β^ and α^, we find

(40) β^=k^0ek^ag(a)da10ek^ag(a)da,α^=k^0ek^ah(a)da10ek^ah(a)da

In deriving Equation 40, we also used the relations e0aβ(a)da=10ag(a)da and e0aα(a)da=10ah(a)da. In the limiting case of age-independent rates α and β, then g(a)=βe-βa and h(a)=αe-αa, and therefore β^=β and α^=α as expected.

Appendix 2

Parameter estimation based on maximum likelihood

In this section, we outline the maximum likelihood method to fit the parameters of the model. We make a histogram of our data with bin size b=10 kbp. We call B the total number of bins. We empirically estimate the DNA abundance in each bin i as

(41) i=Nie/Nisbi=1B(Nie/Nis),

where Nie and Nis are the numbers of reads in the i th bin from the exponential and stationary culture, respectively. We assume that the number of reads in each bin follows a Poisson distribution (Aird et al., 2011). In the limit of large Nie and Nis, the distribution of the empirical DNA abundance i is Gaussian with a standard deviation

(42) σi=i1Nie+1Nis.

The assumption of large number of reads is well satisfied for our sequencing depth: Nie ranges from 2.8×104 to 19.9×104 and Nis from 4.6×104 to 9.0×104. We call AiA(ib) the DNA abundance predicted by our model at bin i for a given set of parameters. The joint likelihood of the empirical DNA abundance in all the bins is given by

(43) L=i=1B12πσi2e(EiAi)22σi2.

We fit the model parameters by maximizing the logarithm of the likelihood ln.

In the constant speed case, we have a single fitting parameter k/v¯. In the oscillatory model without and with diffusion we have four and five fitting parameters, respectively.

Appendix 3

Correspondence with the Cooper-Helmstetter Theory

In this section, we study our model in the case of constant speed, constant duration of the B and D stages, and slow growth condition, so that all stages B, C, and D of the cell cycle are present. Our goal is to prove that, under these additional assumptions, our model predicts the same total DNA abundance as the Cooper-Helmstetter model.

The expression for the total DNA abundance in the constant speed case reads

(44) C=NNTL/2L/2P(x)dx=2v¯kk+αα[ekL/(2v¯)1],

see Equation 12. The analogous expression for the Cooper-Helmstetter model is

(45) LkτCekτD[ekτC-1],

where τC is the time taken by replisomes to complete replication on the genome and τD is post-replication period (Cooper and Helmstetter, 1968).

Since we assumed constant replisome speed, we have τC=L/2v¯. Therefore, proving that Equation 44 and Equation 45 are equivalent boils down to showing that ekτD=(α+k)/α. We call τB the time spent by a bacterium in stage B. In the CH model the times τB, τC are constant. The division time is τdiv=τB+τC+τD. The population growth rate is k=ln2/τdiv.

Since the division time is constant, the steady-state age distribution is expressed by

(46) p(τ)=ke-kτ1-e-kτdiv=2ke-kτ,

see e.g. Powell, 1956

We now express the steady-state fractions fB, fC, fD of cells in stage B, C, D as

(47) fB=0τBp(τ)dτ=2(1ekτB)=2ek(τC+τD)
(48) fC=τBτB+τCp(τ)dτ=2ekτB(1ekτC)=ek(τC+τD)ekτD
(49) fD=τB+τCτdivp(τ)dτ=2ek(τB+τC)(1ekτD)=ekτD1.

The ratio of the number of post replication genomes to the number of templates is equal to the fraction of cells in stage D. Therefore, equating the fraction fD with NP/NT=k/α obtained from Equation 4 and Equation 6, we obtain ekτD=(α+k)/α as expected.

Appendix 3—figure 1
DNA content per cell as a function of the growth rate for the case of varying nutrients in (a) and of varying temperature in (b).

In (a), the experimental data (orange circles) are from Si et al., 2017. The solid red line is from Equation 45, in which we used τC=38 min and τD=37.1 min (Si et al., 2017). The curve in (b) is from Equation 44. In this case, we substituted the speed of replisomes and the growth rate of cells at each temperature in Equation 44. In addition, we assumed a linear temperature dependence for the post replication duration (α-1), see (c). The parameters of the linear fit are determined from the data (red squares) reported in Stokke et al., 2012 for the LB medium. We used this linear fit to extrapolate the value of α for different temperatures in Equation 44.

Appendix 4

Appendix 4—table 1
Comparison between average (over the sample to sample variations) speed estimated in the constant and oscillatory speed models.

Temperature are expressed in Celsius and speeds in bps-1. The last column shows the average growth rate (expressed in hr1) at different temperatures.

Temperaturev¯(constant speed)v¯(oscillatory speed)k(growth rate)

Appendix 5

Estimate of the diffusion coefficient from the Stokes-Einstein relation

The Stokes-Einstein relation expresses the diffusion coefficient of a spherical particle immersed in a fluid (Hynes, 1977). If r is the radius of the particle, η and T are viscosity and the temperature of the fluid respectively, then according to the Stokes-Einstein relation, the diffusion coefficient is

(50) DSE=kBT6πηr

where kB is the Boltzmann constant. The radius of an E. coli replisome is r50nm(Reyes-Lamothe et al., 2010) and the viscosity of water is η=0.7mPa s at room temperature T=310K. Using that kB=1.38×10-23JK-1, we estimate a diffusion coefficient of replisomes in water equal to DSE,W=6μm2s-1. The typical base pair distance is 3.4Ao. Therefore in terms of base-pair (bp), DSE,W56kbp2s-1. The diffusion constant of large macromolecules in the cytoplasm is found to be about 10 times smaller than in water (Verkman, 2002). This results in an estimate of the diffusion constant of replisomes in the cytoplasm of DSE,C6kbp2s-1, as reported in the Results.

Appendix 6

Exact solution of the oscillatory speed case for D=0

In this section, we exactly solve the oscillatory speed model in the absence of diffusion (D=0). For an arbitrary choice of the function v(x) and D=0, the steady state solution of Equation 15 reads

(51) pst(x1,x2)=δ(x2x1)Av(x1)e0x1dxk/v(x)

where A is a normalization constant that ensures 0Ldx10x1dx2pst(x1,x2)=1. The rate at which replication completes is β=Ae0L/2dxk/v(x). From Equation 9, we obtain

(52) P(y)=Ak+βe0|y|dyk/v(y).

For the specific form of v(x) given in Equation 13, the integral in Equation 52 is equal to

(53) 0|y|dyk/v(y)=2kv¯ω1δ2{arctan[1δ1+δtan(ω|y|+ϕ2)]arctan[1δ1+δtan(ϕ2)]+πω|y|+ϕ+π2ππϕ+π2π}.

where is the floor function. We use the expression of this integral in Equation 52 and substitute the result in Equation 10 to obtain the DNA abundance distribution. We computed the normalization factor of the DNA abundance distribution (i.e., the denominator of Equation 10) by numerical integration.

Appendix 7

Replisome dynamics with time-dependent fork firing rate

To verify the robustness of our results, we generalize our model to the case in which the fork firing rate k depends on time. For simplicity, we assume that replisomes move with constant speed and without diffusion (D=0).

We assume that the total number of genomes grows with time as

(54) N(t)=θ(t)N0e0tk^(t)dt=θ(t)a1+bexp(kt)

where N0=a/(1+b) is the initial number of genomes and we introduced the time-varying fork firing rate

(55) k^(t)=θ(t)bkexp(-kt)1+bexp(-kt).

The step function θ(t) in Equation 54 serves to impose that replication occurs for t0 only.

The corresponding dynamics of synthesizing genomes reads

(56) tnS(x1;t)=δ(x1)k^(t)N(t)-vnSx1,

where the delta function ensures that the replication initiates at x1=0. As we have set D=0, the coordinate x2 of the second replisome obeys the same dynamics expressed in Equation 56 and the DNA abundance is symmetric around the replication origin.

The general solution to Equation 56 is of the form ns(x1;t)=f(x1-vt). Because of the boundary term at x1=0, i.e. ns(0;t)=k^(t)N(t)/v, the general solution is

(57) ns(x1;t)=k^(t-x1v)N(t-x1v)v.

The probability that a randomly chosen genome (either complete or incomplete) includes the genome location y at time t is given by

(58) P(y;t)=NP(t)+NT(t)+|y|L/2nS(x1,t)dx1N(t).

where |y|L/2nS(x1,t)dx1 is the number of incomplete genomes which include the genome location y and NP(t)+NT(t) is the total number of complete genomes. From Equations 54; 55, and (Equation 56) , we obtain

(59) |y|L/2nS(x1;t)dx1=N(t|y|v)N(tL2v)N0θ(t|y|v)θ(L2vt).

Therefore, the total number of incomplete genomes is Ns(t)=0L/2nS(x1;t)dx1=N(t)N(tL2v)N0θ(t)θ(L2vt). Further, because of the conservation of the total number of genomes, the number of complete genomes at time t is

(60) NT+NP=N(t)Ns(t)=N(tL2v)+N0θ(t)θ(L2vt).

Substituting Equation 54, Equation 59 and Equation 60 in Equation 58, and using Equation 10, we compute the DNA abundance distribution for t>L/(2v) as

(61) A(y)=k/(2v)log[bexp(kt)+1]log[bexp(kt)+exp(kL2v)]11+bexp[k(t|y|v)].

We now use this result to test robustness of the steady-state assumption. To this aim, we assume that the number of genomes follows a similar dynamics as the optical density. We accordingly consider, for each sample at each temperature, the parameters ai, bi, ki obtained from the OD fits (see Figure 5—figure supplement 1), and the average speed v¯ estimated from the constant speed model (see Table 1). We substitute these in Equation 61 and compute the DNA abundance at time t=4/k. This value is chosen as, in our experiments, cells were extracted after approximately four generations, see Figure 5.

We then attempt to fit these DNA abundance distribution with the one predicted the constant speed model, Equation 11, by least-square minimization and thus estimating the speed again. The newly estimated speeds at different temperatures are comparable to those inputted in the time-varying fork firing rate model, see Appendix 7—table 1. Finally, we plotted the discrepancy ratio in each of the cases, see Appendix 7—figure 1. At all temperatures, the discrepancy ratio is much smaller than the oscillations we observed in the data. This result supports that the potential non-stationary nature of the fork firing rate is not likely to be the cause of the oscillations.

Appendix 7—figure 1
Discrepancy ratio (blue line) between DNA abundance computed from the varying fork firing rate model and the constant speed model exhibits significantly smaller variations compared to the discrepancy ratio between experimental DNA abundance and the constant speed model (pink for 17°C , red for 22°C, skyblue for 27°C, brown for 32°C and orange for 37°C).

Black line is the prediction of the oscillatory speed model. All plots in this figure are from Sample 1. The effect of relaxing the steady-state assumption is similarly small for the other samples.

We remark that, in the estimate presented in this section, we assumed that cells were not growing before the initial time t=0. In our culture, we expect the effect of saturation to be even milder since cells were inoculated from an already exponentially growing culture.

Appendix 7—table 1
The speed inputted to the time varying fork firing rate model to generate the DNA abundance distribution is comparable with the re-estimated speed by fitting the resulting DNA abundance distribution to the constant speed model.

Reported uncertainties in the re-estimated speed represent standard deviations over the replicates.

Temperature (°C)Speed inputted to the varyingfork firing rate model (bps-1)Re-estimated speed(bps-1)

Data availability

Sequence reads were deposited in the NCBI Sequence Read Archive with links to BioProject accession number PRJNA772106. Corresponding read frequencies along the genome were deposited in Zenodo (DOI:https://doi.org/10.5281/zenodo.5577986).

The following data sets were generated
    1. Pigolotti S
    2. Bhat D
    3. Hauf S
    4. Plessy C
    5. Yokobayashi Y
    (2022) Zenodo
    Escherichia coli DNA replication study: processed alignment data.


Article and author information

Author details

  1. Deepak Bhat

    1. Biological Complexity Unit, Okinawa Institute of Science and Technology, Onna, Japan
    2. Department of Physics, School of Advanced Sciences, Vellore Institute of Technology, Vellore, Tamil Nadu, India
    Conceptualization, Formal analysis, Methodology, Writing – original draft
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-9387-4951
  2. Samuel Hauf

    Nucleic Acid Chemistry and Engineering Unit, Okinawa Institute of Science and Technology, Onna, Japan
    Performed the experiments
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-3034-8441
  3. Charles Plessy

    Genomics and Regulatory Systems Unit, Okinawa Institute of Science and Technology, Onna, Japan
    Analysis of sequencing data
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-7410-6295
  4. Yohei Yokobayashi

    Nucleic Acid Chemistry and Engineering Unit, Okinawa Institute of Science and Technology, Onna, Japan
    Performed the experiments
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-2417-1934
  5. Simone Pigolotti

    Biological Complexity Unit, Okinawa Institute of Science and Technology, Onna, Japan
    Conceptualization, Formal analysis, Methodology, Writing – original draft
    For correspondence
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-6157-6906


Japan Society for the Promotion of Science (JP18K03473)

  • Simone Pigolotti

Deutsche Forschungsgemeinschaft (452628014)

  • Samuel Hauf

Deutsche Forschungsgemeinschaft (HA9374/1-1)

  • Samuel Hauf

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.


We are grateful for the help and support provided by the Scientific Computing and Data Analysis section of the Research Support Division at OIST. We thank the DNA Sequencing Section of OIST, in particular N Arasaki, for support with sequencing. This work was supported by JSPS KAKENHI Grant No. JP18K03473 (to SP) and a grant from Deutsche Forschungsgemeinschaft (DFG, grant number: 452628014 to SH). We thank M Cencini and P Sartori for feedback on a preliminary version of our manuscript.

Version history

  1. Preprint posted: October 15, 2021 (view preprint)
  2. Received: November 26, 2021
  3. Accepted: July 13, 2022
  4. Accepted Manuscript published: July 25, 2022 (version 1)
  5. Version of Record published: August 17, 2022 (version 2)
  6. Version of Record updated: April 19, 2023 (version 3)


© 2022, Bhat et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.


  • 1,734
  • 432
  • 9

Views, downloads and citations are aggregated across all versions of this paper published by eLife.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Deepak Bhat
  2. Samuel Hauf
  3. Charles Plessy
  4. Yohei Yokobayashi
  5. Simone Pigolotti
Speed variations of bacterial replisomes
eLife 11:e75884.

Share this article


Further reading

    1. Developmental Biology
    2. Physics of Living Systems
    Raphaël Clément

    Geometric criteria can be used to assess whether cell intercalation is active or passive during the convergent extension of tissue.