Invasion scenario for Top row: one immune group; Middle row: M = 10 immune groups and fast mixing, Cij = 1/M; Bottom row: M = 10 immune groups and slow mixing, Cij = 1/10M. Other parameters are the same for all rows: in units of δ, we set α = 3 and γ = 5 · 10⁻³, and f = 0.65, b = 0.8, ε = 0.01. For all rows, graphs represent: Left: number of hosts infectious with the wild-type and the variant; Middle: number of hosts susceptible to the wild-type and the variant, with the equilibrium value δ/α as a gray line; Right: fraction of the infections due to the variant. The thick gray line shows the expected equilibrium frequency β in the case with one immune group, given in Eq. 6. The dashed line shows the trajectory of a constant fitness logistic growth with the same initial growth rate.
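
The dashed reference curve and the gray equilibrium line can be reproduced with a few lines of Python. This is a minimal sketch using the standard logistic solution for a constant fitness advantage; the function name `logistic_frequency` and the values of `x0` and `s` are illustrative assumptions, not values taken from the figure.

```python
import math

def logistic_frequency(t, x0, s):
    """Frequency at time t under a constant fitness advantage s, starting from x0."""
    growth = x0 * math.exp(s * t)
    return growth / (1.0 - x0 + growth)

# Parameters from the caption, in units of delta (so delta = 1).
alpha = 3.0
delta = 1.0
susceptible_equilibrium = delta / alpha  # gray line in the middle panels

# Hypothetical initial frequency and growth rate, for illustration only.
x0, s = 0.01, 0.5
for t in (0, 5, 10, 20, 40):
    print(t, round(logistic_frequency(t, x0, s), 4))
```

Unlike the simulated variant, which saturates at β, this curve always goes to frequency 1 because its growth advantage never decays.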

A: Simulation of SIR Equations 1 & 2 with additional strains appearing at regular time intervals. The fraction of infections (frequency) caused by each strain is shown as a function of time. The first strain to appear at t = 0 is the variant of interest, and curves are shown in shades of red if they appear on the background of this variant, and of blue if they appear on the background of the wild-type. B: Same as A but with frequencies stacked vertically. The black line delimiting the red and blue areas represents the frequency at which the mutations defining the original variant are found. C: Three realizations of the random walk of Equation 9, all starting at x ≃ 0.5. Two instances converge rapidly to frequencies 0 and 1, corresponding to apparent selective sweeps, while the remaining one oscillates for a longer time. D: Representation of a partial sweep using the expiring fitness parametrization of Equation 11. The frequency x of the variant is shown as a blue line saturating at value β (gray line). The thin dashed line shows a selective sweep with constant fitness advantage s0. The fitness s is shown as a red dashed line, plotted against the right axis.
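
Panel D can be illustrated with a short sketch. Equation 11 is not reproduced in this caption, so the code below assumes one common form of expiring fitness: an exponentially decaying advantage s(t) = s0 · exp(−νt) driving logistic dynamics dx/dt = s(t) x (1 − x). Under that assumption the frequency saturates at β with logit(β) = logit(x0) + s0/ν. All function names and the values of `x0` and `nu` are hypothetical.

```python
import math

def logit(x):
    return math.log(x / (1.0 - x))

def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))

def expiring_fitness(t, s0, nu):
    """ASSUMED form of the fitness advantage: exponential decay at rate nu."""
    return s0 * math.exp(-nu * t)

def partial_sweep_frequency(t, x0, s0, nu):
    """Closed-form solution of dx/dt = s(t) x (1 - x) for the assumed s(t)."""
    return sigmoid(logit(x0) + (s0 / nu) * (1.0 - math.exp(-nu * t)))

def saturation_frequency(x0, s0, nu):
    """Long-time limit of the frequency (the beta of the partial sweep)."""
    return sigmoid(logit(x0) + s0 / nu)

# s0 as used elsewhere in the captions; x0 and nu are illustrative assumptions.
x0, s0, nu = 0.01, 0.03, 0.006
beta = saturation_frequency(x0, s0, nu)
for t in (0, 100, 300, 1000):
    print(t, round(partial_sweep_frequency(t, x0, s0, nu), 4),
          round(expiring_fitness(t, s0, nu), 5))
print("saturation frequency:", round(beta, 4))
```

Letting ν go to 0 in this sketch recovers the constant-fitness sweep shown as the thin dashed line.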

Retrospective analysis of predictability of viral evolution: frequency trajectories of all amino acid substitutions that are observed to rise from frequency 0 to x* for Top: influenza virus A/H3N2 from 2000 to 2023, and Bottom: SARS-CoV-2 from 2020 to 2023. Left: all trajectories for x* = 0.4, with blue ones ultimately vanishing and red ones ultimately fixing. The average of all trajectories is shown as a thick black line. Right: average trajectories only, for different values of x* (grey lines).

Simulations under the Wright-Fisher model with expiring fitness. A: Average frequency dynamics of immune escape mutations that are found to cross the frequency threshold x* = 0.5, for four different rates of fitness decay. If the growth advantage is lost rapidly (high ν/s0), the trajectories crossing x* have little inertia, while a stable growth advantage (small ν/s0) leads to steadily increasing frequencies. B,C,D: Ultimate fixation probability Pfix(x) of trajectories found crossing frequency threshold x. Each panel corresponds to a different rate of emergence of immune escape variants, with four rates of fitness decay per panel. Increased clonal interference ρ/s0 and fitness decay ν/s0 both result in a gradual loss of predictability. We use s0 = 0.03. E: Time to most recent common ancestor TMRCA for the simulated population, as a function of the prediction obtained using the random walk, Ne = 1/(ρ〈β²〉). Points correspond to different choices of parameters ρ and Pβ, and a darker color indicates a higher probability of overlap as computed in the SI.

Left: Dynamics of the frequency of the variant for the SIR model from Equations A13 & A14 using the invasion scenario from the main text. Two 2 × 2 cross-immunity matrices are used, with off-diagonal parameters f and b chosen to give the same equilibrium. The gray line represents the equilibrium that would be obtained using the model of the main text. Right: Equilibrium frequency β for this new SIR model (y-axis) versus the β from the main text (x-axis). Each point corresponds to a given pair (f, b).

Distribution of partial sweep size β if 1 − f and 1 − b are exponentially distributed. Left: Probability distribution function P(β) for various values of μ/γ. Right: Mean and standard deviation of β as a function of μ/γ.

Probability of fixation Pfix(x) of mutation frequency trajectories found crossing the frequency threshold x. Fitness effects are exponentially distributed with fixed scale s0 = 0.03. The blue to red color gradient corresponds to the increasing rate ρ at which mutations are introduced. The strong clonal interference regime is obtained when ρ/s0 > 1, in which case good mutations are introduced in close succession and compete for fixation. At low ρ/s0, trajectories are very predictable and an increasing trajectory almost certainly fixes. Even for the highest ρ/s0, Pfix(x) remains significantly larger than x and the dynamics are visibly not neutral.

Probability of a strictly monotonic trajectory in the random walk of the main text, as a function of β (fixed) and the initial value x0. The “exact” solution is obtained by numerically computing the product in Equation B1 up to t = 100.

Average coalescence times 〈Tn〉 for a partial sweep coalescent with effective population size Ne and a Kingman coalescent with population size Ne. For simplicity, a constant β is used: Left: a high value β = 0.25; Right: a low value β = 0.05. For low β, the two coalescent processes are very similar up to high n. They differ considerably if β is larger. Note that for the partial sweep process, Tn never goes below ρ⁻¹.
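
The comparison in this figure can be sketched numerically. The code below takes Tn to be the mean time during which n lineages persist, and it assumes a simple partial sweep model in which sweeps occur at rate ρ and each lineage is independently caught in a sweep with probability β; both are assumptions made for illustration, and all function names are hypothetical. The Kingman expression Ne / C(n, 2) is the standard result for a haploid population.

```python
from math import comb

def kingman_mean_time(n, ne):
    """Mean time during which n lineages persist under the Kingman coalescent."""
    return ne / comb(n, 2)

def partial_sweep_mean_time(n, rho, beta):
    """Mean waiting time until a sweep merges at least two of n lineages.

    ASSUMED model: sweeps arrive at rate rho and each lineage is caught
    independently with probability beta.
    """
    p_none = (1.0 - beta) ** n
    p_one = n * beta * (1.0 - beta) ** (n - 1)
    return 1.0 / (rho * (1.0 - p_none - p_one))

def rho_for(ne, beta):
    """Sweep rate giving effective population size ne = 1 / (rho * beta**2)."""
    return 1.0 / (ne * beta ** 2)

ne = 1000.0
for beta in (0.25, 0.05):
    rho = rho_for(ne, beta)
    for n in (2, 5, 20, 100):
        print(beta, n,
              round(kingman_mean_time(n, ne), 2),
              round(partial_sweep_mean_time(n, rho, beta), 2))
```

In this sketch the two processes agree exactly at n = 2, and the partial sweep time is bounded below by 1/ρ because at most one merger event can happen per sweep.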

Realisations of different coalescence processes for 30 lineages (leaves). Left: Partial sweep coalescent, with constant β = 0.4 and ρ = 0.00625 such that Ne = (ρβ²)⁻¹ = 1000. Right: Kingman coalescent with population size N = Ne = 1000.
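
Realisations of this kind can be generated with a short simulation. The sketch below is illustrative only: it assumes that partial sweeps arrive as a Poisson process of rate ρ and that each lineage is independently caught with probability β, with all caught lineages merging into one. The function names and the random seed are hypothetical, and only merger times and lineage counts are tracked, not the tree topology.

```python
import random
from math import comb

def kingman_realization(n, pop_size, rng):
    """Return [(time, lineages_remaining), ...] for one Kingman coalescent run."""
    t, events = 0.0, []
    while n > 1:
        # With n lineages, pairs coalesce at total rate C(n, 2) / pop_size.
        t += rng.expovariate(comb(n, 2) / pop_size)
        n -= 1
        events.append((t, n))
    return events

def partial_sweep_realization(n, rho, beta, rng):
    """Same output format, for the ASSUMED partial sweep coalescent."""
    t, events = 0.0, []
    while n > 1:
        t += rng.expovariate(rho)  # waiting time to the next sweep
        caught = sum(rng.random() < beta for _ in range(n))
        if caught >= 2:  # all caught lineages merge into a single ancestor
            n -= caught - 1
            events.append((t, n))
    return events

rng = random.Random(0)  # arbitrary seed, for reproducibility only
rho, beta = 0.00625, 0.4
ne = 1.0 / (rho * beta ** 2)  # effective population size from the caption

sweep_events = partial_sweep_realization(30, rho, beta, rng)
kingman_events = kingman_realization(30, ne, rng)
print("Ne:", round(ne, 6))
print("partial sweep mergers:", len(sweep_events))
print("Kingman mergers:", len(kingman_events))
```

The Kingman run always has 29 binary mergers for 30 lineages, whereas the partial sweep run typically needs far fewer events because a single sweep can merge many lineages at once.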

Examples of mutation frequency trajectories that increase up to a frequency of 0.5 for H3N2/HA influenza and the expiring fitness model. For the latter, parameters used are α = s0 = 0.03 and three values of ρ to illustrate different clonal interference regimes. In each case, 10 randomly selected trajectories are plotted, with blue indicating final loss and red final fixation.