Overview of the choice of encoding and the choice of timescale for quantifying relative fitness.

(A) Example trajectory of relative abundance x (upper panel) for a mutant invading and eventually replacing a wild-type population. The same trajectory is plotted under the encoding log x (middle panel) and the logit-encoding log(x/(1 − x)) (lower panel). (B) The general flow-chart to predict the future relative abundance of a mutant given a relative fitness value s_m = dm/dt for some encoding m. The current relative abundance x_t is transformed into the new variable m_t = m(x_t), then projected into the future through a linear extrapolation using s_m (upper horizontal arrow) and finally converted back into a frequency x_{t+Δt} using the decoding function m^{−1}. (C) Four scenarios for positive mutant fitness with different underlying population dynamics. For each scenario, we show an example trajectory of absolute abundance (stacked) for the wild-type (dark grey) and mutant population (light grey). Each scenario is mapped as a single dot onto the fold-change diagram (center plot), and colored areas indicate positive (green area) and negative relative fitness per-cycle (blue area; compare Eq. (8)). (D) Basic constellation for misranking between relative fitness per-cycle and relative fitness per-generation. For a given competition (red dot), misranking occurs in a bow-tie area of the fold-change space (red shade). Any competition in the right half of this area (grey dot) will have higher mutant fitness under one statistic but lower mutant fitness under the other (right inset). As small plots on the left, we show possible population dynamics that generate this fold-change variation.
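The flow-chart in panel B can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the starting frequency, the slope and the time step are all hypothetical, and the log-encoding is included only for contrast with the logit-encoding.

```python
import math

def logit(x):
    """Logit-encoding m(x) = log(x / (1 - x)) for a relative abundance 0 < x < 1."""
    return math.log(x / (1.0 - x))

def inv_logit(m):
    """Decoding function m^{-1} for the logit-encoding."""
    return 1.0 / (1.0 + math.exp(-m))

def predict_abundance(x_t, s_m, dt, encode=logit, decode=inv_logit):
    """Encode x_t, extrapolate linearly with slope s_m over dt, then decode."""
    m_t = encode(x_t)
    m_future = m_t + s_m * dt
    return decode(m_future)

# Hypothetical numbers: mutant at 1% with slope 0.5 per unit time, projected 10 units ahead.
x_future_logit = predict_abundance(0.01, 0.5, 10.0)
x_future_log = predict_abundance(0.01, 0.5, 10.0, encode=math.log, decode=math.exp)

print(f"logit-encoding: {x_future_logit:.3f}")
print(f"log-encoding:   {x_future_log:.3f}")
```

With these illustrative numbers the logit-encoding decodes to a valid frequency, whereas the same linear extrapolation under the log-encoding decodes to a value above 1.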

Comparison of mutant fitness rankings with different statistics on empirical trait variation.

(A) Overview of the growth curve dataset and the estimated growth traits for the knockout library of Saccharomyces cerevisiae (Methods). (B) Covariation between estimated steady-state growth rate g and lag time λ across all mutant strains (grey dots; Pearson correlation coefficient r = −0.17, p = 7 × 10^−30) as well as wild-type replicates (orange dots; r = −0.16, p = 0.002). The reference wild-type strain for our pairwise co-culture simulations is defined by the median trait values (black cross) of all wild-type replicates. (C) Covariation between measured steady-state growth rate g and biomass yield Y across all mutant strains (grey dots; r = 0.21, p = 8 × 10^−44) as well as wild-type replicates (orange dots; r = −0.06, p = 0.25). (D) Overview of pairwise co-culture simulations. For each mutant strain (orange), we simulate a competition growth cycle against a reference wild-type strain using the estimated traits (panel A) and laboratory parameters for the initial condition (N0 = 0.05 OD, R0 = 111 mM glucose, x = 0.5; Methods) and quantify relative fitness of the mutant in different statistics (Eq. (8), Eq. (9)). (E) Rank disagreement between relative fitness per-generation and per-cycle. For each fitness statistic, we calculate the mutant ranking (higher rank means higher fitness and mutants with equal fitness are assigned the lowest rank in the group). The rank difference is defined as the difference between a mutant's ranks in the two statistics. (F) Covariation between wild-type and mutant fold-change across all simulated competitions, with mutant strains (grey dots) and wild-type replicates (orange dots). For each wild-type replicate, we simulate a pairwise co-culture competition against the reference wild-type strain. We highlight the mutant with the greatest rank difference (blue halo) in panels E and F, and its corresponding bow-tie area of misranking (compare Fig. 1D).
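The ranking rule in panel E (higher rank means higher fitness, ties share the lowest rank in the group) can be sketched as follows. The fitness values are made up for illustration, and the order of subtraction in the rank difference (per-generation minus per-cycle) is an assumption made here, since the caption does not fix it.

```python
def rank_lowest_ties(values):
    """Rank values from 1 (lowest) upwards; tied values share the lowest rank of their group."""
    first_rank = {}
    for position, value in enumerate(sorted(values), start=1):
        # setdefault keeps the first (lowest) position seen for each distinct value
        first_rank.setdefault(value, position)
    return [first_rank[value] for value in values]

# Hypothetical fitness values for four mutants under the two statistics.
s_cycle = [0.10, 0.30, 0.30, -0.20]
s_gen = [0.05, 0.02, 0.08, -0.10]

ranks_cycle = rank_lowest_ties(s_cycle)
ranks_gen = rank_lowest_ties(s_gen)

# Assumed sign convention: rank per-generation minus rank per-cycle.
rank_difference = [rg - rc for rg, rc in zip(ranks_gen, ranks_cycle)]
print(ranks_cycle, ranks_gen, rank_difference)
```

The two tied mutants in `s_cycle` both receive rank 3, so the next higher rank (4) stays unused, which is the behaviour described in the caption.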

Potential consequences of the choice of fitness statistic for the interpretation of evolutionary data.

(A) Hypothetical scenario for the trait evolution in a long-term evolution experiment. An evolving population decreases in lag time (orange line), increases in growth rate to a maximum (blue line) and keeps decreasing in biomass yield (green line). This trend is similar to initial observations from the LTEE [94]. (B) Corresponding long-term trend in relative fitness based on the trait evolution in panel A. We estimate relative fitness per-cycle (grey line) and per-generation (red line) every 250 generations. Dotted grey lines mark the end of trait evolution in lag time and growth rate in panel A. For the actual fitness trend in the LTEE, see Fig. S11. (C) Epistasis plot for lag time and yield using relative fitness per-generation. Colored dots show the fitness for a single mutant with shorter lag time (blue dot), a single mutant with higher biomass yield (red dot) and a double mutant with both mutations (purple dot). (D) Epistasis plot for lag time and yield in relative fitness per-cycle. (E) Epistasis plot for growth rate and yield in relative fitness per-generation. Colored dots show the fitness for a single mutant with higher growth rate (blue dot), a single mutant with higher biomass yield (red dot) and a double mutant with both mutations (purple dot). (F) Epistasis plot for growth rate and yield in relative fitness per-cycle. All epistasis plots are based on 50:50 competition growth cycles with the wild-type (compare panels C, D with Fig. S12A-C and panels E, F with Fig. S12D-F).

The choice of library abundance and reference group in bulk competition experiments.

(A) Overview of a pairwise competition experiment (upper row) and multiple scenarios for bulk competition experiments (middle and bottom row) with different initial fractions of the mutant library (colored ovals) in the inoculum (open box). For each scenario, we show a schematic growth cycle (log absolute abundance) in the inset on the right. (B) Schematic relative abundance trajectories for a mutant compared to two alternative subpopulations. We distinguish between the total relative abundance xi with respect to the population as a whole (height of green band in the top box) and the pairwise relative abundance xiwt with respect to the wild-type (height of green band in the bottom box; Eq. (18)). We indicate the sign of total relative fitness (Eq. (22)) and pairwise relative fitness (Eq. (23)) on the right. (C) The absolute error between bulk and pairwise competition experiments. We show the total relative fitness (grey dots; Eq. (22)) and the pairwise relative fitness (red dots; Eq. (23)) for empirical knockouts (Fig. 2A) in a bulk competition growth cycle with low mutant library abundance (panel A, case II; Methods). The absolute error is defined as the bulk fitness statistic minus the relative fitness in pairwise competition (Eq. (S81)). The inset shows the absolute error for pairwise relative fitness (Eq. (23)) for a bulk competition growth cycle with high mutant library abundance (blue dots; case III). The x-axis and the red dots in the inset are identical to the main plot. (D) The relative error in bulk competition experiments as a function of mutant library abundance in the inoculum. Each line corresponds to a knockout in our dataset, and represents the relative error between the pairwise relative fitness in bulk competition and the relative fitness in pairwise competition (Eq. (S91)). As black lines, we show the recommended mutant library abundance for our dataset based on Eq. (10) (xlib ≈ 24.6%) and based on Eq. (S109) (xlib ≈ 0.02%, Sec. S15).
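The distinction in panel B between total and pairwise relative abundance can be made concrete with a small sketch. The abundances below are invented, the strain labels are hypothetical, and fitness is taken here as the change in logit-encoded abundance over one growth cycle, which is an assumption standing in for Eq. (22) and Eq. (23).

```python
import math

def logit(x):
    return math.log(x / (1.0 - x))

def total_relative_abundance(abundances, strain):
    """Abundance of the strain relative to the whole population."""
    return abundances[strain] / sum(abundances.values())

def pairwise_relative_abundance(abundances, strain, reference="wt"):
    """Abundance of the strain relative to itself plus the reference (wild-type) only."""
    return abundances[strain] / (abundances[strain] + abundances[reference])

# Hypothetical bulk competition: wild-type, a focal mutant and a fast-growing library member.
start = {"wt": 10.0, "focal": 10.0, "fast": 80.0}
end = {"wt": 20.0, "focal": 30.0, "fast": 800.0}

s_total = (logit(total_relative_abundance(end, "focal"))
           - logit(total_relative_abundance(start, "focal")))
s_pairwise = (logit(pairwise_relative_abundance(end, "focal"))
              - logit(pairwise_relative_abundance(start, "focal")))

print(f"total relative fitness:    {s_total:+.3f}")
print(f"pairwise relative fitness: {s_pairwise:+.3f}")
```

In this example the focal mutant outgrows the wild-type but is outgrown by the rest of the population, so the two statistics have opposite signs.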

Flow-diagram for the quantification of relative fitness from time-series data.

Given the relative abundance of the mutant genotype at two consecutive timepoints (e.g. the start and end of a growth cycle), the user has to choose an encoding (Fig. 1A-B) and the time-scale for evaluating the change in relative abundance (Eq. (5)). In bulk competition experiments, multiple definitions of relative abundance are possible depending on the choice of the reference subpopulation (compare Fig. 4B; Methods). Each combination of these choices (dashed black lines) leads to a different fitness statistic and we summarize our recommendations (thick black line) in the discussion.

Parameter settings for Gaussian Process optimisation.

Predicting the absolute and relative abundance of microbial populations.

(A) Example timeseries of absolute abundance for a single microbial population (light grey). The observer (eye symbol) of the timeseries at time t can ask about the future trend in absolute abundance (arrows). (B) Example timeseries of absolute abundance for two microbial strains that may represent a population of wild-type and mutant. The absolute abundance of the wild-type strain (light grey) is stacked on top of the absolute abundance of the mutant strain (dark grey). Similarly, the observer (eye symbol) can ask about the future trend for the absolute abundance of the mutant population or the wild-type population in this co-culture. (C) The relative abundance for the mutant strain corresponding to panel (B). We sketch the mean relative abundance x over time (green line) inferred from multiple replicates (rectangles). An error band (light green) around the timeseries demonstrates the variation between replicates. The observer (eye symbol) can ask about the future trend for the relative abundance of the mutant population (arrows).

The effect of encodings on a non-logistic relative abundance trajectory.

Example trajectory of relative abundance x (top panel) for a mutant invading and eventually replacing a wild-type population, simulated with the Gompertz equation. Below, we show the Gompertz trajectory under the encoding log x and the logit-encoding log(x/(1 − x)). Compare to Fig. 1A for a trajectory simulated with the logistic equation.

The advantages of the logit-encoding for linear regression to relative abundance time-series data.

(A) Linear regression to the example trajectory for mutant relative abundance from Fig. 1A. Given the true relative abundance x(t) (green line), we sample the relative abundance at a set of intermediate timepoints (grey dots; top row) assuming a binomial distribution and fit a linear regression line (red line; top row). We transform samples of raw relative abundance x (grey dots; top row) into the log-encoded abundance log x (grey dots; middle row) and fit a linear regression (red line; middle row). Similarly, we fit a regression to the logit-encoded abundance logit x (bottom row). (B) Corresponding residuals plot for the regressions in panel A. The sampled relative abundance values (grey dots; panel A) are compared to the fitted regression (red line; panel A) to calculate the residuals (grey dots; this panel). A red line traces the mean value of the residuals. We observe that the residuals under the logit-encoding (bottom row) show no systematic trend in the mean residuals (red line) and have constant variation across timepoints (grey dots).
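The procedure in panel A can be sketched as below for the logit-encoding only. This is an illustration under assumed settings, not the analysis behind the figure: the true slope, starting frequency, sampling depth and timepoints are hypothetical, and the binomial sampling is done with plain Bernoulli draws.

```python
import math
import random

def logit(x):
    return math.log(x / (1.0 - x))

def linear_fit(t, y):
    """Ordinary least-squares slope and intercept for y against t."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    slope = (sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
             / sum((ti - t_mean) ** 2 for ti in t))
    return slope, y_mean - slope * t_mean

random.seed(0)
s_true, x0, depth = 0.5, 0.01, 10_000   # hypothetical slope, start frequency, sample size
times = list(range(0, 21, 2))

samples = []
for t in times:
    # Logistic trajectory: linear in logit space with slope s_true.
    x_true = 1.0 / (1.0 + math.exp(-(logit(x0) + s_true * t)))
    count = sum(random.random() < x_true for _ in range(depth))  # binomial draw
    count = min(max(count, 1), depth - 1)                        # keep logit finite
    samples.append(count / depth)

slope, intercept = linear_fit(times, [logit(x) for x in samples])
residuals = [logit(x) - (intercept + slope * t) for x, t in zip(samples, times)]
print(f"estimated slope: {slope:.3f} (true value {s_true})")
```

The fitted slope plays the role of a relative fitness estimate under the logit-encoding, and `residuals` corresponds to the quantity plotted in panel B.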

The variation of wild-type and mutant log fold-change under resource consumption constraints.

(A) Schematic fold-change variation under a perfect resource consumption constraint between wild-type and mutant. Each dot corresponds to a mutant strain in a 50:50 competition growth cycle with the wild-type strain (Sec. S5). The wild-type LFC is lower in competition with mutants that have very high LFC (red dots) because more resources are consumed by mutant cells. For our model of population dynamics, we can calculate this constraint exactly (black line: Eq. (S36)). We highlight the bow-tie area of misranking (red shading; compare Fig. 1D) for two mutants and show the correlation between relative fitness per-cycle and per-generation in the inset. (B) Schematic fold-change variation that deviates from the resource consumption constraint. Additional variation between the competition growth cycles, e.g., due to variation in the biomass yield of the mutant strains, means that the fold-change values do not fall on a single resource consumption constraint (black line; same as in panel A). This can give rise to rank differences in relative fitness per-cycle and per-generation (inset) for those mutants (orange, blue and green dots) that fall in the bow-tie area of the focal strain (red dot).

Replicate measurements for growth rate, lag time and yield in our empirical trait dataset.

(A) Covariation of growth rate between replicate measurements of the knockouts (grey dots; Pearson correlation coefficient r = 0.94, p = 0.00). Each dot represents a mutant genotype from the single-gene knockout collection in Saccharomyces cerevisiae [55] as measured by Warringer et al. [24]. For the vast majority of genotypes in our dataset (n = 4163 out of n = 4492 knockouts) we are able to fit two traits from two independent growth curve measurements (Fig. 2A; Methods). (B) Covariation of lag time between replicate measurements of the knockouts (grey dots; r = 0.90, p = 0.00). (C) Covariation of biomass yield between replicate measurements of the knockouts (grey dots; r = 0.43, p = 4.81 × 10^−188).

The distribution of fitness effects under relative fitness per-cycle and per-generation.

(A) Distribution of relative fitness per-cycle for the knockouts (grey) and wild-type replicates (orange) in our dataset (Fig. 2A), based on the fitness data from Fig. 2E. (B) Distribution of relative fitness per-generation, based on the fitness data in Fig. 2E. (C) Covariation in fitness ranks between relative fitness per-cycle and per-generation. For each fitness statistic, we calculate the mutant ranking (higher rank means higher fitness and mutants with equal fitness are assigned the lowest rank in the group), based on the rank data from Fig. 2F.

Exploring alternative conditions for misranking between fitness statistics in yeast gene-deletion data.

(A: top row) Small mutant frequency x = 0.01. (B: second row) Standard mutant frequency x = 0.5, but no variation in lag time. (C: third row) Standard mutant frequency x = 0.5, but no variation in growth rate. (D: fourth row) Standard mutant frequency x = 0.5, but no variation in biomass yield. (E: bottom row) Standard mutant frequency x = 0.5, but only positive variation in biomass yield. See Fig. 2 for the default case and panel legends.

Example dataset with anti-correlation between relative fitness per-cycle and per-generation.

(A) Covariation between growth rate and lag time for a synthetically generated dataset of mutants (grey dots) and a single wild-type (orange dot). The variation here may represent the standing variation in an evolved population with improved growth rate and lag time over the wild-type ancestor. The graphic overlap between points means they appear as a single line. (B) Covariation between growth rate and biomass yield for the mutants (grey dots) and wild-type (orange dot) in this synthetic dataset. The variation here is specifically chosen to generate anticorrelation between relative fitness per-cycle and per-generation. (C) Covariation between relative fitness per-cycle and per-generation for the mutants in panels A-B. We simulate pairwise competitions for all mutants against the wild-type with the exact same settings as for the empirical dataset (Fig. 2D; Methods). (D) Covariation between mutant and wild-type fold-change in the competitions. Based on the competition data in panel C (grey dots), we estimate the LFC values for all mutant-to-wild-type competitions (grey dots). We highlight the bow-tie area of misranking for the mutant with the lowest overall fold-change (red shading; compare Fig. 1D). A dashed black line shows the isocline for zero relative fitness per-cycle (Eq. (8)). (E) Distribution of relative fitness per-cycle for the mutants in panels A-B, based on the fitness data from panel C. (F) Distribution of relative fitness per-generation for the mutants in panels A-B, based on the fitness data from panel C.

Comparison of mutant fitness rankings across the complete LTEE competition dataset.

(A) Covariation between relative fitness per-cycle and per-generation for evolved populations of the LTEE. We calculate the fitness statistics based on the competition data published by Wiser et al. [29], who measured the fitness every 250-500 generations (color bar). Each line of the LTEE contributes roughly two competitions per time-point due to replicate measurements [29] (Sec. S8). (B) Rank disagreement between relative fitness per-cycle and per-generation for evolved populations of the LTEE. For each fitness statistic, we calculate a ranking (higher rank means higher fitness and mutants with equal fitness are assigned the lowest rank in the group; compare Fig. 2F). The rank difference is defined as the difference between a population's ranks in the two statistics. We highlight the evolved population with the greatest rank difference (blue halo). (C) Covariation between the wild-type and mutant fold-change for the competition data from the LTEE [29]. The term ’mutant’ (y-axis) refers to an evolved population at a given time-point from one of the 12 lines of the LTEE. We highlight the evolved population with the greatest rank difference (blue halo; compare panel B) and draw its corresponding bow-tie area of misranking between the two statistics (red shading; compare Fig. 1D). The coloring in all panels refers to the time-point at which the evolved population was sampled from the LTEE.

Rank difference between relative fitness per-cycle and per-generation as a function of time in the LTEE.

For each of the 12 lines in the LTEE, we compute the relative fitness per-cycle and per-generation as a function of time by averaging the data from the LTEE competition dataset [29] across replicate measurements (Sec. S8). For some lines, the time-series is truncated due to measurement difficulties [29]. Based on the quantitative fitness values in both statistics, we compute fitness rankings between the replicate lines at any given timepoint (higher rank means higher fitness and mutants with equal fitness are assigned the lowest rank in the group; compare Fig. 2F). The rank difference is defined as the difference between a line's ranks in the two statistics, and we combine the rank difference between all pairs of evolving lines to compute the maximum rank difference at each time-point. At three chosen time-points (t = 4000, t = 15000, t = 30000 generations), we show the correlation between relative fitness per-cycle and relative fitness per-generation (each dot corresponds to one of the 12 lines in the LTEE). We quantify the correlation between the two statistics with the Spearman rank correlation ρ (panel title). Colors indicate the time-point and correspond to the color bar in Fig. S9.

Long-term fitness trends in the LTEE under relative fitness per-cycle and per-generation.

(A) Fit of a hyperbolic model (pink line; Eq. (S45)) and a power-law model (cyan line; Eq. (S46)) to a pooled time-series of relative fitness per-generation. We pool the fitness per-generation from all 12 lines of the LTEE into a single dataset (grey dots) and repeat the fits of Wiser et al. [29] (Sec. S8). We compute the fraction of variance explained R2 as a measure for the quality of fit (hyperbolic model: R2 = 0.682, power-law model: R2 = 0.701). (B) Fit of hyperbolic (pink) and power-law models (cyan) to a pooled time-series of relative fitness per-cycle. Similar to panel A, we pool the relative fitness per-cycle from all 12 lines of the LTEE into a single dataset (grey dots) and identify the best fit for the hyperbolic model (R2 = 0.756) and the power-law model (R2 = 0.764). (C) Fit of hyperbolic (pink) and power-law models (cyan) to an averaged time-series of relative fitness per-generation. Following the ’grand-mean’ averaging strategy outlined in Wiser et al. [29], we compute the average relative fitness per-generation across all evolving lines (grey dots; Methods) and identify the best fit for the hyperbolic model (R2 = 0.938) and the power-law model (R2 = 0.964). Note that the fraction of variance explained is much higher compared to panel A, because the averaged time-series has fewer points (Sec. S8). (D) Fit of hyperbolic (pink) and power-law models (cyan) to an averaged time-series of relative fitness per-cycle. Similar to panel C, we compute the average relative fitness per-cycle across all evolving lines in the LTEE and identify the best fit for the hyperbolic model (R2 = 0.953) and the power-law model (R2 = 0.966). For a comparison between our results and the original fit by Wiser et al. [29], see Sec. S8.
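The contrast in R2 between pooled and averaged time-series can be illustrated with synthetic data. Everything in this sketch is hypothetical: the saturating trend is a stand-in rather than Eq. (S45) or Eq. (S46), the noise level is invented, and the model is evaluated at its true parameters instead of being fitted.

```python
import random

def r_squared(observed, predicted):
    """Fraction of variance explained: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def trend(t):
    """Hypothetical saturating fitness trend (illustration only)."""
    return 0.6 * t / (t + 5000.0)

random.seed(1)
timepoints = list(range(500, 50001, 2500))   # 20 timepoints in generations
n_lines, noise_sd = 12, 0.15                 # 12 replicate lines, assumed noise level

pooled_obs, pooled_pred, averaged_obs, averaged_pred = [], [], [], []
for t in timepoints:
    replicates = [trend(t) + random.gauss(0.0, noise_sd) for _ in range(n_lines)]
    pooled_obs.extend(replicates)                    # every line is its own data point
    pooled_pred.extend([trend(t)] * n_lines)
    averaged_obs.append(sum(replicates) / n_lines)   # one mean value per timepoint
    averaged_pred.append(trend(t))

r2_pooled = r_squared(pooled_obs, pooled_pred)
r2_averaged = r_squared(averaged_obs, averaged_pred)
print(f"pooled R2: {r2_pooled:.3f}, averaged R2: {r2_averaged:.3f}")
```

In this toy setting, averaging across lines removes most of the scatter between replicates before R2 is computed, so the same model explains a larger fraction of the remaining variance.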

Comparing magnitude epistasis between relative fitness per-cycle and per-generation.

(A) Simulated competition growth cycle for a single mutant with short lag time (blue line) in co-culture with a wild-type strain (grey line). We simulate the co-culture using the same settings as for our empirical dataset (Fig. 2D; Methods). At the top, we show the final relative abundance of the mutant xf, the relative fitness per-cycle (Eq. (8)) and per-generation (Eq. (9)). (B) Simulated competition growth cycle for a single mutant with higher biomass yield (red line), competing against the same wild-type as in panel A (grey line). (C) Simulated competition growth cycle for a double mutant (purple line) with shorter lag time (as in panel A) and higher biomass yield (as in panel B). Compare panels A-C to Fig. 3C-D for the epistasis plot. (D) Simulated competition growth cycle for a single mutant with higher growth rate (blue line). (E) Simulated competition growth cycle for a single mutant with higher biomass yield (red line), same as panel B. (F) Simulated competition growth cycle for a double mutant (purple line) with higher growth rate (as in panel D) and higher biomass yield (as in panel E). Compare panels D-F to Fig. 3E-F for the epistasis plot.

Predicting relative fitness with monoculture proxies under different scenarios of trait variation.

(AA) Covariation between growth rate and lag time in our empirical trait distribution (same as Fig. 2B). (AB) Covariation between growth rate and biomass yield in our empirical trait distribution (same as Fig. 2C). (AC) Quality of prediction for different monoculture proxies under the trait distribution in panels AA-AB. As the ground truth, we estimate the relative fitness per-cycle using a simulated 50:50 competition growth cycle (Fig. 2D; Methods). For each genotype i, we compute the growth rate difference Δg = g_i − g_wt, the lag time difference Δlag = λ_i − λ_wt and the difference in biomass yield Δyield = Y_i − Y_wt. Additionally, we compute the difference in log fold-change ΔLFC = LFC_i − LFC_wt and the difference in area under the curve ΔAUC = AUC_i − AUC_wt from simulated monoculture growth curves (Sec. S9). We quantify the agreement between the monoculture proxy and the relative fitness per-cycle using the Spearman correlation coefficient ρ, which reflects the agreement in ranking. (B: second row) Modified trait distribution with no variation in lag time. (C: third row) Modified trait distribution with no variation in growth rate. (D: fourth row) Modified trait distribution with no variation in biomass yield.
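The agreement measure in panel AC, the Spearman correlation between a monoculture proxy and relative fitness per-cycle, can be sketched without external libraries. The proxy and fitness values below are invented for four hypothetical genotypes, and ties are handled with average ranks, which is the usual convention for Spearman's ρ rather than something stated in the caption.

```python
import math

def average_ranks(values):
    """Ranks starting at 1; tied values receive the mean rank of their group."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank
        pos = end + 1
    return ranks

def pearson(a, b):
    a_mean, b_mean = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - a_mean) * (y - b_mean) for x, y in zip(a, b))
    var_a = sum((x - a_mean) ** 2 for x in a)
    var_b = sum((y - b_mean) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))

# Hypothetical values for four genotypes.
s_cycle = [0.3, -0.2, 0.9, 0.1]        # ground-truth relative fitness per-cycle
delta_g = [0.02, -0.01, 0.05, 0.00]    # growth rate difference to the wild-type
delta_lag = [1.0, -0.5, 0.2, 2.0]      # lag time difference to the wild-type

rho_g = spearman(delta_g, s_cycle)
rho_lag = spearman(delta_lag, s_cycle)
print(f"rho(delta g) = {rho_g:.2f}, rho(delta lag) = {rho_lag:.2f}")
```

Because ρ depends only on ranks, a proxy can score ρ = 1 even when its numerical values are on a completely different scale from the fitness statistic.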

The choice of the cut-off time for evaluating the area under the curve (AUC).

(A) Distribution of saturation times in mono-culture for the knockouts (grey bars) and wild-type replicates (orange bars; not visible) in our empirical dataset (Fig. 2A). The saturation time tsat is defined as the time when the limiting resource is depleted (R(tsat) = 0) and can be estimated numerically from the trait data (Methods). Three vertical lines indicate example choices for the cut-off time teval of the AUC: the most frequent saturation time (teval = 13 hours; purple line), an external timescale (teval = 24 hours; blue line) and a saturation time half-way in the decay of the distribution (teval = 16 hours; black line). (B) Covariation between relative fitness per-cycle and AUC with a short cut-off time. We compute the AUC from simulated monocultures of knockouts (grey dots) and wild-type replicates (orange dots) using the cut-off time teval = 13 hours (Sec. S9). As the ground truth, we take the relative fitness per-cycle from a pairwise competition (Fig. 2A). As a red line, we show the best fit from a linear regression of relative fitness per-cycle to the AUC. (C) Covariation between relative fitness per-cycle and AUC with an intermediate cut-off time (teval = 16 hours). (D) Covariation between relative fitness per-cycle and AUC with a longer cut-off time (teval = 24 hours).
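How the cut-off time enters the AUC can be sketched with a numerical integral. The logistic growth curve used here is a hypothetical stand-in for the simulated monoculture curves (it has no lag phase or explicit resource), and the trait values are invented; only the three cut-off times are taken from the caption.

```python
import math

def trapezoid_auc(curve, t_eval, n_steps=1000):
    """Area under curve(t) from t = 0 to the cut-off time t_eval (trapezoid rule)."""
    dt = t_eval / n_steps
    return sum(0.5 * (curve(k * dt) + curve((k + 1) * dt)) * dt for k in range(n_steps))

def logistic_curve(n0, capacity, growth_rate):
    """Hypothetical monoculture growth curve; not the model used in the paper."""
    def curve(t):
        return capacity / (1.0 + (capacity / n0 - 1.0) * math.exp(-growth_rate * t))
    return curve

wild_type = logistic_curve(n0=0.05, capacity=1.0, growth_rate=0.4)
mutant = logistic_curve(n0=0.05, capacity=1.0, growth_rate=0.5)

# Cut-off times from the caption, in hours.
delta_auc = {}
for t_eval in (13.0, 16.0, 24.0):
    delta_auc[t_eval] = trapezoid_auc(mutant, t_eval) - trapezoid_auc(wild_type, t_eval)
    print(f"t_eval = {t_eval:4.0f} h: delta AUC = {delta_auc[t_eval]:.3f}")
```

The same pair of curves yields a different ΔAUC for each cut-off, which is why the choice of teval has to be reported alongside the proxy.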

The error in mutant fitness rankings between bulk and pairwise competition experiments.

(A) Rank difference between total relative fitness and pairwise relative fitness in a bulk competition growth cycle with low mutant library abundance (Fig. 4A, case II). Based on the fitness data in Fig. 4C, we calculate a mutant ranking for total relative fitness in bulk (Eq. (22)) and a ranking for pairwise relative fitness in bulk (Eq. (23)) (higher rank means higher fitness and mutants with equal fitness are assigned the lowest rank in the group). The rank difference is defined as the rank in total relative fitness minus the rank in pairwise relative fitness. (B) Rank difference between pairwise relative fitness in bulk (Eq. (23)) and the relative fitness per-cycle in a pairwise competition (Eq. (8)). Based on the pairwise relative fitness at low mutant library abundance in Fig. 4C (red dots; case II), we calculate a mutant ranking for pairwise relative fitness in bulk (Eq. (23)) and a mutant ranking for the relative fitness per-cycle in pairwise competition (Eq. (8)). The rank difference (red dots) is defined as the rank in bulk competition minus the rank in pairwise competition. (C) Rank difference between pairwise relative fitness in bulk and the relative fitness per-cycle in a pairwise competition for two different cases of the bulk competition. Based on the fitness data in the inset of Fig. 4C, we estimate a mutant ranking for pairwise relative fitness (Eq. (23)) in the case of a bulk competition growth cycle with high mutant library abundance (blue dots; compare Fig. 4A, case III). For each case, the rank difference is defined as the rank in the bulk competition minus the rank in pairwise competition, and the rank difference for case II (red dots) is identical to the data in panel B. All fitness statistics in this figure are based on the logit-encoding; however, since the underlying relative abundances are small (x ≪ 1), it is approximately equivalent to the log-encoding (log x ≈ logit x).

A decomposition for the error from higher-order interactions in bulk competition experiments.

For a bulk competition growth cycle with low mutant library abundance (Fig. 4A, case II), we calculate the pairwise relative fitness (Eq. (23)) for the knockouts in our empirical dataset (Fig. 2A) using a previously established approximation [28] (Sec. S13). The error from higher-order interactions is defined as the pairwise fitness in bulk minus the fitness in pairwise competition. Based on the approximation for our model of population dynamics (Sec. S13), we derive a decomposition that separates the absolute error into two terms that we call the fitness-dependent error term (dark dots) and the fitness-independent error term (light dots). For full details on the decomposition, see Sec. S14.

The relative error between bulk and pairwise competition experiments.

(A) Comparing the relative error between total and pairwise relative fitness in bulk. Based on the absolute error in Fig. 4C, we estimate a relative error for the total relative fitness (grey dots; Eq. (22)) and pairwise relative fitness (red dots; Eq. (23)) in a bulk competition growth cycle with low mutant library abundance (Fig. 4A, case II). The relative error of each bulk statistic is defined as the absolute error (Fig. 4C), divided by the fitness in pairwise competition (Eq. (S91)). A dashed grey line indicates the threshold of 1% relative error. (B) Comparing the relative error between the components of higher-order interactions. Based on the absolute error from higher-order interactions in Fig. S16, we estimate the relative error for the fitness-dependent (dark dots) and the fitness-independent error component (light dots). Here the relative error is defined as the absolute error component (Fig. S16), divided by the relative fitness per-cycle in pairwise competition (see Sec. S14 for details). (C) Comparing the relative error between low and high mutant library abundance in bulk. Based on the absolute error in the inset of Fig. 4C, we estimate a relative error for the pairwise relative fitness at low mutant library abundance (red dots; Fig. 4A, case II) and high mutant library abundance (blue dots; Fig. 4A, case III). The relative error for each case is defined as the absolute error (Fig. 4C; inset), divided by the fitness in pairwise competition (Eq. (S91)). On the x-axis, we plot the absolute value of relative fitness per-cycle in the pairwise competition.
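The error definitions used in this figure reduce to two short functions. The fitness values below are hypothetical, and comparing the magnitude of the relative error against the 1% threshold is an assumption about how the dashed line in panel A is read.

```python
def absolute_error(s_bulk, s_pairwise):
    """Bulk fitness statistic minus the relative fitness in pairwise competition."""
    return s_bulk - s_pairwise

def relative_error(s_bulk, s_pairwise):
    """Absolute error divided by the fitness in pairwise competition.

    Undefined for a neutral mutant (s_pairwise == 0), where the division fails.
    """
    return absolute_error(s_bulk, s_pairwise) / s_pairwise

# Hypothetical fitness values for one knockout.
s_pairwise = 0.200   # relative fitness in the pairwise competition
s_bulk = 0.203       # pairwise relative fitness measured in bulk

rel = relative_error(s_bulk, s_pairwise)
exceeds_threshold = abs(rel) > 0.01   # assumed reading of the 1% threshold
print(f"relative error: {rel:.2%}, exceeds 1%: {exceeds_threshold}")
```

A small absolute error can still give a large relative error when the pairwise fitness is close to zero, which is why panel C plots the error against the absolute value of relative fitness per-cycle.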