Abstract
The rate of acquisition of genomic changes in cancer has been the topic of much discussion, with several recent investigations finding evidence of punctuated evolution instead of gradual accumulation of such changes. Despite forays into the description and quantification of these punctuated events, the effects of such events on subsequent cancer evolution remain incompletely understood. Here we investigate how non-gradual mutagenesis affects the ability of tumor cells to acquire and retain fitness-enhancing adaptations. We find that punctuated mutagenesis significantly facilitates adaptation in scenarios where adaptation requires crossing a fitness valley, i.e. when multiple mutations are required which individually are maladaptive but jointly confer a fitness advantage. By increasing the probability that multiple mutations occur in close succession, punctuation increases the chance that mutants in a fitness valley mutate further to reach a fitness peak before going extinct. Analyzing data from The Cancer Genome Atlas, we find that tumors with signatures of APOBEC mutagenesis, which has been shown to proceed in episodic bursts, exhibit patterns consistent with higher rates of crossing fitness valleys. Lastly, we characterize how the interplay between this enhanced ability to cross fitness valleys and adaptation-limiting effects of clonal interference affects overall adaptability in complex fitness landscapes.
Introduction
Tumors evolve via the acquisition of randomly arising genetic and/or epigenetic alterations and their selection1. The rate at which tumor cells accumulate such genomic changes to produce potentially adaptive innovation is thus an important determinant of the capacity of tumor cells to adapt to diverse selection pressures2,3, affecting their propensity to progress locally, to metastasize, or to evolve resistance to therapeutic intervention. Several methods to quantify mutation rates have been developed4–6 and proxies for mutation rates such as tumor mutation burden often feature in prediction models of therapeutic outcomes7,8.
Most existing methods deal with constant mutation rates6,9, assuming gradual evolutionary change over time. However, accumulating evidence suggests that mutagenesis in cancer is fluctuating. Indeed, a recent study10 demonstrated in vitro that mutations associated with DNA-editing activity of APOBEC cytidine deaminases11 occur in episodic bursts, with more than 100-fold differences in the rate of APOBEC-associated mutations across otherwise identical cell culture replicates (Fig 1A). This episodicity observed in vitro aligned with patterns in previously published in vivo data investigating APOBEC mutagenesis in lung cancer12,13, and data from intestinal crypts14 later confirmed explicitly that this episodic pattern also occurs in patients. Other recent studies using genomic data to reconstruct evolutionary histories of tumors have found evidence of punctuated evolution across several cancer types15–17. These studies have demonstrated the existence of distinct phases of mutation bursts17,18 (Fig 1B), challenging the prevailing paradigm of gradual emergence of mutant lineages (Fig 1C). Various mutagenic processes such as chromothripsis19,20, chromoplexy21 and other drivers of chromosomal instability22 have been implicated in causing such punctuated patterns23.

Population dynamics under gradual vs punctuated mutagenesis.
(A) Evidence for punctuated APOBEC mutagenesis (Petljak et al. 2019) from successive cell culture expansions, seeded with single progenitors from the preceding expansion. Sequencing of the expanded populations revealed large fluctuations in APOBEC-associated mutagenic signatures SBS2 and SBS13. (B) Evidence for punctuated copy-number evolution (Gao et al. 2017, Minussi et al. 2021). Patterns in branch lengths of reconstructed phylogenetic trees reveal fluctuations in the rates of copy number alterations. Dynamics prior to most recent common ancestor (MRCA) are unidentifiable. (C) Gradual vs. punctuated evolution. In gradual evolution novelty emerges and spreads at a constant rate over time. In punctuated evolution novelty emerges and spreads during distinct burst phases. (D) Fitness schematic of two 1-step adaptations which each independently confer a fitness advantage. (E) Fitness schematic for a 2-step adaptation, in which carrying one mutation confers a fitness disadvantage but carrying two mutations confers a fitness advantage. (F,G) Schematic of possible evolutionary dynamics for two 1-step adaptations (F) and of modes of valley-crossing with or without prior fixation of the first mutant (G). Sequential fixation occurs at low mutation rates where emerging mutant lineages are likely to have fixated or gone extinct before the next mutation occurs. At higher mutation rates, the second mutation can occur in a multi-clonal population. (H) Sketch of failing two-step adaptation under a uniform (time-invariant) mutation rate. (I) Sketch of a successful two-step adaptation via stochastic tunneling under a punctuated mutation process with distinct clusters of high mutation rates
How punctuated mutagenesis affects the evolutionary dynamics of tumor cell adaptation across complex fitness landscapes remains incompletely understood. Addressing this question with existing experimental data is inconclusive, as only short time horizons are observed (single expansions10; time to a most recent common ancestor17,18), and as fitness differences between observed cell lineages are incompletely characterized. Mathematical modelling of human cancer genomics data, however, can elucidate the dynamics of adaptation during punctuated tumor evolution. Here, we set out to systematically investigate the evolutionary consequences of punctuated mutation acquisition during tumorigenesis using mathematical modeling and analysis of genomic data from The Cancer Genome Atlas (TCGA).
Results
Temporal clustering facilitates multi-step adaptation
We set out to investigate how temporal clustering of mutagenic events into distinct episodes affects the ability of a cell population to achieve multi-step evolutionary adaptations. We considered two scenarios: tumor evolution via a single advantageous step, such as an oncogenic adaptation24 (“1-step adaptations”, Fig 1D), and accumulation of multiple mutations with synergistic fitness effects (“multi-step adaptations”, Fig 1E). Indeed, the two hit-hypothesis25 for tumor suppressor genes (TGSs) suggests that deactivation of one copy of a TSG can be inconsequential or maladaptive, while bi-allelic deactivation confers a selective advantage26. While some TSGs exhibit (context-dependent) haploinsufficiency27–29, synergistic fitness effects for successive (epigenetic27,28,30) alterations in the same gene remain the prominent feature of TSGs. Analogously, there are ample examples of successive alterations across different genes acting synergistically31–33. High mutation rates are known to limit the rate of 1-step adaptations due to clonal interference34–36 (Fig 1F), which has been highlighted as a potential consequence of punctuated cancer evolution37. The dynamics of multi-step adaptation have been studied extensively26,38–44, elucidating two modes of evolution (Fig 1G): ‘sequential fixation’ refers to the scenario in which the second mutation only emerges after cells harboring the first mutant have taken over the population, while ‘stochastic tunneling’ refers to situations in which the second mutation arises sooner. In large populations, mutants with a selective disadvantage become vanishingly unlikely to reach fixation, so that stochastic tunneling becomes the dominant mode of evolution for multi-step adaptations.
To thus investigate how the rate of successful stochastic tunneling events – enabling multi-step adaptation – is affected by punctuated mutagenesis, we simulated selection dynamics of a population of cells which proliferate according to a Wright-Fisher process 45,46 and accumulate mutations according to a constant (Fig 1H) or temporally clustered (Fig 1I) rate. The Wright-Fisher process models evolution as successive, non-overlapping generations of constant size. Each new generation is populated by drawing with replacement from the cells in the previous generation, with probabilities proportional to their fitness (Fig 2A). Cells can accumulate mutations that change their fitness, i.e. increase their probability to survive to the next generation. We considered fitness landscapes in which two-step adaptations are the only way by which cells can increase their fitness, with fixed fitness ratios across subsequent two-step adaptations (Fig 2B,C).

Simulation results: valley-crossing under uniform vs. temporally clustered mutation rates.
(A) Schematic of a Wright Fisher Process. (B)-(C) Fitness landscapes used for the simulations in Panel (D) and (E) respectively. Mutations move a cell from left to right through the landscape. Each 2-step adaptation corresponds to a fitness increase by a factor of 1.5. Having an odd number of mutations comes at a multiplicative fitness disadvantage of 0.5 in (B) and an advantage of 1.01 in panel (D). (D)-(E) Simulation results for a Wright-Fisher process with 50 cells. The mutation rate trajectories in each panel are chosen such that the total expected number of mutations under the uniform trajectory is identical to that under the temporally clustered trajectory.
Interestingly, for two-step adaptations for which the intermediate mutants that carry only one of the two required mutations have a strong selective disadvantage (Fig 2B), we found that the population undergoes substantially (3.55 times) more two-step adaptations when accumulating mutations in a temporally clustered rather than in a uniform way (Fig 2D). This effect arises because intermediate mutants in “fitness valleys” have a high chance of going extinct before acquiring the next mutation. During mutation bursts, acquiring the next mutation in time before the disadvantageous mutant goes extinct becomes more likely. Similarly, intermediate mutants are more likely to emerge in a mutation burst. This temporal clustering of the likelihood of two succeeding mutation events increases the rate of successful two-step adaptations.
This effect also emerges for two-step adaptations that do not constitute proper fitness valleys, i.e. for which the intermediate mutant is not maladaptive (Fig 2E; 1.15-fold increase). Under stochastic selection dynamics, even mutants with a slight selective advantage (Fig 2C) have a high chance of going extinct due to random drift. Lineages acquiring sets of synergistic mutations, thus, often do so without prior fixation of each intermediate mutant – via stochastic tunneling26. In those regimes, temporal clustering of mutations therefore also increases the speed of adaptation, as confirmed with simulations (Fig S2A-D). Moreover, the proportion of such sets of synergistic mutations acquired via stochastic tunneling rather than sequential fixation also increases with temporal clustering (Fig S2E).
Our findings are not unique to models of constant population size, as confirmed with investigations of a branching process model (Methods, Fig S1) which yielded qualitatively similar results. Furthermore, our results also generalize to multi-step adaptations with arbitrarily many steps (SI1). To demonstrate this result, we compared a uniform mutation process with mutation rate μ to temporally clustered mutation processes in which mutations emerge exclusively during burst phases that constitute a fraction
Temporal clustering in an exploration-exploitation setting
The increased rate of valley crossing is driven by phases of high mutation rates, when the fold-increase in the chance of valley-crossing is larger than the fold-increase in the mutation rate. This effect improves adaptability while the fitness landscape offers scope for multi-step adaptation. However, in fitness landscapes which have global maxima, as the population adapts there is ever less scope for (further) multi-step adaptations, and excessive exploration of the landscape becomes costly.
To investigate the effect of temporal clustering in such an exploration-exploitation setting, we constructed a two-dimensional fitness landscape (Fig 3A) which is randomly re-drawn at regular intervals mimicking exposure to novel physical environments or drugs during cancer evolution and treatment (Methods). Between subsequent re-drawings, the population may move to a global fitness maximum where any further mutation decreases fitness. We considered mutation processes with different average mutation rates µ and different values of the clustering parameter k (Fig 3B) and investigated the average fitness of the cell population over long time horizons. As before, the clustering parameter k denotes the fold-increase of the mutation rate in burst phases relative to the average mutation rate µ. However, rather than fixing the out-of-burst mutation rate at zero, the burst duration is now held constant (Fig 3B) to control for differences in waiting time until a burst occurs.

Exploration and exploitation with temporally clustered mutation rates.
(A) Cells move through two-dimensional fitness landscapes. These fitness landscapes are randomly redrawn every 50714 divisions. (B) Mutation rate trajectories are parameterized with a mean mutation rate μ, and a clustering parameter k. (C) Average fitness in simulations of a population of 20 cells in a Wright Fisher Process. Simulations were run for 108 division events. (D)-(F) Average fitness trajectories for representative snippets of the simulations in (C).
We found that, in this scenario, relative to the uniform mutation process (k = 1) increasing k initially increases the average fitness of the population for any fixed µ (Fig 3C). Similarly, for a fixed k, increasing µ when starting at low µ-values increases the average fitness. However, moving further in either of these directions in the parameter space – towards increasing µ or towards increasing k – eventually brings about a decrease in average fitness (see SI2 and Fig 3D-F for detailed descriptions and plots of the dynamics). For any fixed k, average fitness peaks at intermediate µ, and the highest average fitness on the k-µ-plane is achieved by a k > 1. Maximizing average fitness, thus, involves temporal clustering even when there is a trade-off between exploration and exploitation.
Proxies for valley crossing and for temporal clustering found in patient data
We then sought evidence of this effect in patient data, leveraging whole exome sequencing (WES) data from TCGA. We reasoned that among tumors in which similar numbers of mutations had emerged, those tumors with more temporally clustered mutation processes in their evolutionary history would be more likely to have undergone a larger number of multi-step adaptations. We would, thus, expect the ratio of detectable indicators of multi-step adaptations, such as two deactivating mutations in a TSGs relative to the total number of mutations in a tumor sample, to correlate with indicators of fluctuations in the mutation rate history of a tumor such as those found for APOBEC mutagenesis.
Before considering the TCGA data, to confirm in silico that such a correlation would indeed emerge under biologically realistic parameters, we performed large-scale simulations of tumor growth from a single cell to up to realistically detectable cell numbers of 106−107 cells47 (Methods). We recorded the fraction of mutations that emerged during burst phases, constructed a TSG deactivation score (Fig 4A,B) by counting the number of TSGs (modelled as mutations with synergistic fitness effects) with at least two mutations across the population and divided by the total number of mutations across the genome (Methods). Our simulation results confirmed that these two readouts are correlated (Pearson correlation 0.42, p<0.01; Fig 4C), showing that in silico predictions about the relative ease of acquiring 2-step adaptations under mutation processes with more vs. less temporal clustering are well reflected in our score for TSG deactivation.

Proxies for valley crossing and temporal clustering in simulations and in TCGA data.
(A) Sketch of the simulation analysis workflow. (B) Distributions of the fraction of mutations acquired during mutation bursts and the TSG deactivation score in simulations. (C) Joint distribution of the quantities in (B) indicates strong correlation. (D) Schematic of analysis workflow for TCGA data (methods) and results for the four cancer types with highest APOBEC signature contribution. (E) Probabilities that single base substitutions (SBSs) in samples of the cancer types in (C) were caused by APOBEC associated signatures SBS2 and SBS13. On the left, these probabilities are averaged over all SBSs. On the right, only those SBSs are considered which are classified as “nonsense” or “missense”, and which appear in a TSG with at least 2 such deactivating SBSs. The average probabilities between both groups are significantly different (t-test, p<0.001). (F) Average probabilities analogous to those in panel (D) were constructed for each individual sample and then averaged across samples. Samples without deactivated TSGs were excluded. The average probabilities between groups are significantly different for the four cancer types (*, ** and *** respectively indicate that the p-value in a t-test lies below 0.05, 0.01 and 0.001).
Next, to construct similar readouts from the TCGA data, we first performed SBS signature decomposition on each sample and computed the relative contribution of APOBEC associated mutation signatures (SBS2 and SBS13) vs all other mutation signatures48 (Methods). Given the evidence for episodic APOBEC mutagenesis 10,14, we used this relative contribution as a score for temporal clustering. As a score for multi-step adaptations, a list of known TSGs (the TSGene 2.0 database49) was considered. For each TCGA sample, we counted the number of TSGs in this list which harbored at least two single nucleotide substitutions classified as a missense or nonsense mutation (Methods) normalized by the total number of single nucleotide substitutions in that sample. The TCGA WES data did not enable us to determine whether such mutations indeed deactivated different copies of a TSG, rather than appearing on the same copy. If the likelihood of appearing on the same copy varied with the contribution of the APOBEC signature, this association could confound our results. Since mutations that are interdependently generated on the same copy of a gene would likely appear in close spatial proximity11,50,51, we thus verified in a robustness check that our results remain unchanged when filtering out mutations with close-by genomic coordinates (Methods).
We then ranked the different tumor types in TCGA by the average contribution of SBS2 and SBS13 relative to all detected mutation signatures, and for each tumor type, investigated the correlation between our proxies for multi-step adaptations and for temporal clustering. For the top four categories with the highest mean contribution of APOBEC-driven mutagenesis to all mutations, we observed a positive correlation between the two proxies. This correlation is significant at the 5% level (t-test on the Pearson correlation coefficient) for three out of four of these top four categories (Fig 4D, further results in Fig S3).
Consistent with the hypothesis that this correlation arises because TSG deactivation occurs more readily with APOBEC mutagenesis, we found that mutations which contribute to the TSG-deactivation score on average have a strongly increased likelihood of being caused by APOBEC (Methods, Fig 4E). This finding is consistent across cancer types and independent of whether probabilities are averaged over samples or over individual mutations, suggesting that this result is not driven by outlier samples with many TSG deactivations and many APOBEC associated mutations (Methods, Fig 4F).
An alternative score for multi-step adaptations, in which the list of TSGs is replaced with a list of pairs of genes with synergistic fitness effects31(Methods) again exhibits a positive correlation with the relative contribution of APOBEC associated mutation signatures in all categories; this correlation is significant for two of these categories (Fig S4). Taken together, these results show that tumors with a larger share of APOBEC-associated mutations are enriched for pairs of mutations with synergistic fitness effects.
Stochastic Tunneling vs Clonal Interference during Punctuated Evolution
In tumors evolving at high mutation rates, clonal interference may limit the rate at which a cell population manages to acquire and retain 1-step adaptations (Fig 1F). Previous literature has quantified this limiting effect as a function of the mutation rate and the distribution of fitness values of emerging mutants52–54. However, the effects of non-constant mutation rates in this setting have not been elucidated. We thus set out to investigate clonal interference in the setting of temporally clustered mutation rates (Fig 5A).

Stochastic tunneling vs clonal interference under temporally clustered mutation rates:
(A) Schematic of the simulation approach. (B) Sketch of the mutation process. (C) Simulation results: measuring fixations of higher-fitness mutations per generation as a function of the clustering parameter and of the fitness landscape. In the 1-step adaptation setting fitness is defined as 1.5#mutations. Fitness in the 2-step adaptation setting is defined as in Fig 2B. Results are averaged over simulation runs with 107 generations. (D) Sketch of fitness landscape for simulations shown in (E). Cells start with an unmutated genome of 200 loci. Mutations on each locus have independent multiplicative effects on cell fitness. In the first 100 loci a single mutation confers a multiplicative fitness change of 1.5. In the remaining 100, 2 mutations are required to reach this multiplicative fitness change of 1.5, with the first mutation conferring a multiplicative fitness change of 0.5. (E) Simulation results for adaptation with genome sketched in (D). Results are averaged over 100 simulation runs per value of k. Shaded regions indicate 10th and 90th percentiles. Smaller plots (left) are zoomed in versions of the first 1200 generations of the larger plots (right), with identical color-coding.
We first considered a setting in which cells can only acquire 1-step adaptations and measured the rate at which novel 1-step adaptations reach fixation in the population across simulations with different clustering parameters k (Fig 5B). We found that fixation rates quickly decrease as k increases (Fig 5C). This pattern emerges because the extent to which clonal interference reduces fixation rates disproportionately increases with the mutation rate; for a fixed average mutation rate temporal clustering thus increases the effects of clonal interference.
These findings stand in contrast to our results for 2-step adaptations whose rate of fixation increases when temporal clustering is introduced. However, if in-burst mutation rates are sufficiently large such that multiple clones in the population may acquire a 2-step adaptation independently before one of these clones has fixated, effects of clonal interference also play a role for 2-step adaptations. Performing simulations for 2-step adaptations, we found that fixation rates are non-monotone in k. While at low k increasing k leads to a steep increase in the fixation rate, this trend eventually levels off and becomes negative, with further increases in k leading to a decrease in the fixation rate.
Having observed that temporal clustering increases effects of clonal interference but also facilitates stochastic tunneling, we next set out to investigate the relative contribution of these two effects on overall adaptation rates. We considered a setting in which cells can increase their fitness through both 1-step and 2-step adaptations (Fig 5D).
We found that 1-step adaptations are acquired more quickly under the uniform mutation process (k=1), while 2-step adaptations are acquired more quickly under a temporally clustered mutation process (k=5; Fig 5E). The relative magnitude of both effects varies with time. As 1-step adaptations spread more readily in the population, differences in the speed at which they fixate manifest early in the dynamics. Over time, cells gradually exhaust the possibilities for 1-step adaptations and the difference between the adaptation rate in both processes becomes dominated by differences in the speed of acquiring two-step adaptations. In this latter phase we observe large differences between the two mutation processes in the expected time until any given proportion of the possible two-step adaptations have spread in the population. Analogous time differences for reaching fixed proportions of one-step adaptations in the initial phase of the dynamics are much smaller (Fig 5E).
We also investigated how this pattern depends on the ruggedness of the fitness landscape (Fig S5A). To this end, we varied the relative proportions of loci with and without fitness valleys and quantified adaptability differences. We found that the duration of the initial phase in which the uniform process yields higher adaptability varies with the ruggedness of the fitness landscape: this phase is roughly three times as long when the proportion of loci with fitness valleys is 5% compared to when it is 95% (Fig S5B-D). However, over this range of proportions, we consistently observed that this initial phase only accounts for a small fraction of the time it takes to acquire all adaptations.
Discussion
Accumulating evidence suggests that mutagenesis in tumor cell populations proceeds in punctuated bursts rather than gradually; however, the effects of such punctuation on the ability of a tumor cell population to acquire and retain fitness-enhancing adaptations remains incompletely understood. Here we set out to investigate the evolutionary dynamics of these processes using mathematical modeling and genomics data analysis of human tumors.
We found that when a population acquires multi-step adaptations via stochastic tunneling, the rate of adaptation is substantially enhanced by punctuation. Stochastic tunneling is the predominant mode of evolution in large populations that traverse fitness valleys. However, stochastic tunneling also occurs when the necessary mutation steps to reach a substantial fitness advantage are not individually maladaptive, i.e. if steps do not constitute a fitness valley but confer a slight fitness advantage. Punctuated evolution therefore facilitates the accumulation of sets of mutations that jointly and synergistically confer a fitness advantage, while the fitness effect of carrying only a subset of these mutations can range from being strongly maladaptive to being moderately adaptive.
For the limit of low average mutation rates we show analytically that the rate of stochastic tunneling under a temporally clustered processes is kn times larger than under the time-invariant process, where k is the fold-increase in the mutation rate relative to the average mutation rate during mutation bursts in the temporally clustered process, and n is the number of mutation steps that is required to exit the fitness valley. Moving towards higher average mutation rates, this fold-change in the stochastic tunneling rates decreases, but the absolute difference between the rates increases.
While temporal clustering facilitates multi-step adaptations, it also impedes 1-step adaptations due to clonal interference. We examined the interplay between these contrasting effects in a setting in which cells can acquire both types of adaptations. We showed that the relative importance of both effects varies over time. Since 1-step adaptations tend to be acquired more readily, clonal interference matters most in the early phase of the process. As the share of possible 2-step adaptations relative to 1-step adaptations increases, differences in the tunneling rate become the more relevant determinant for the speed of adaptation. A uniform mutation rate, thus, makes a population more efficient at finding local fitness maxima. However, temporal clustering allows the population to more quickly move between local maxima, thus speeding up the search for a global maximum.
While our theoretical results are easily verified in simulation settings where we can choose a fitness landscape, applying these results to real data remains challenging because of the inherent complexities in determining fitness effects in biological systems. Estimating fitness effects of individual mutations requires either intricate experimental approaches55 or large patient cohorts to control for differences in the (epi-)genetic background against which these mutations emerge; and such estimations become considerably harder when considering joint fitness effects of sets of multiple mutations.
To circumvent these challenges, we focused on TSGs as a representative set of gene pairs with synergistic effects and on one mutation process known to tend to fluctuate over time – APOBEC-driven mutagenesis. Consequently, we observed only a subset of groups of mutations with synergistic fitness effects in the data and possibly only a small fraction of the variability in mutagenic punctuation between different samples. Nevertheless, we find that these scores significantly correlate, which aligns with our model predictions.
Our results have implications for both mechanistic and statistical modeling of tumor evolution. A key quantity in mechanistic models such as those used to optimize treatment schedules56 is the likelihood that resistance-conferring adaptations arise during therapy; our results show that this likelihood depends on the temporal dynamics of the mutation process. Analogously, in statistical models, incorporating measures of punctuated mutagenesis may improve the ability to predict treatment outcomes by accounting for how these temporal dynamics shape adaptability.
Methods
Simulations of a Wright-Fisher and a branching Process in an unbounded fitness landscape
In the simulations presented in Fig 2 and Fig S1, mutations occur between selection steps and are independent of divisions. In the Wright-Fisher process simulations (Fig 2 C) with a temporally clustered mutation rate, the out-of-burst mutation probability per cell and per update step (between two divisions) was set to 0.01 and was increased to 0.1 in bursts. For the corresponding uniform mutation rate trajectory, this probability was simply set equal to the total amount of mutations that occurred in the simulations under the temporally clustered mutation rate, divided by the total number of update steps.
Mutation rates in the branching model simulations (Fig S1) were chosen analogously. However, to achieve temporally equidistant mutation bursts in the branching process simulations, we scaled the burst duration and the time between bursts by the population size.
Simulations Wright-Fisher Process in exploration/exploitation setting
To produce the results shown in Fig 3, we ran agent-based stochastic simulations of a Wright-Fisher Process with 20 cells. Cells were characterized by their location on a discrete fitness landscape (a 30 by 30 two-dimensional lattice). The fitness values f associated with the positions on the lattice were drawn independently at random as f = 0.5 + x4 where x ∼ U[0,1] is distributed uniformly between zero and one. The lower bound of 0.5 ensures that cells at all positions have a non-negligible probability of dividing.
Raising x to the fourth power creates a landscape with few peak-positions surrounded by many valley-positions with little variation in fitness.
In each selection step, one cell was randomly chosen to divide with probability proportional to its fitness, and replaced a cell which was drawn uniformly at random. Mutations occurred between consecutive selection steps, and the mutating cell was chosen uniformly at random amongst all cells in the population (independently of the preceding selection step). If a cell mutated, it would move to a location in the fitness landscape chosen uniformly at random from the set of (at most 8) locations in the Moore neighborhood of its current location.
The rate at which mutation events occurred was governed by the parameters µ and k. Mutation bursts lasted 102 division events and started every 103 division events. Fitness landscapes were redrawn every 50714
Larger-scale simulations
We simulated tumor evolution as a branching process, starting from a single unmutated cell, up to a randomly drawn target cell number between 106 and 107 cells. We assume a constant death rate, so that over the course of a simulation the probability that the next event is a death event rather than a division event remains fixed (at a value of 0.35). In case of a death event, one cell picked uniformly at random is removed from the population. In case of a division event, one cell is chosen to divide with probabilities proportional to its fitness.
We assume that the number of mutations per division follows a binomial distribution Bin(105,µ), and for each simulation we randomly draw a baseline mutation probability µ uniformly between 0 and 3·10−5. In mutation bursts, this probability gets multiplied by a factor of 20 or 50, again chosen randomly for each simulation. The population enters a burst phase with probability
The fitness effect of mutations is modeled as follows. The starting cell has a fitness of one. Each new mutation has an additive effect on the cells fitness. We assume that mutations occur uniformly at random across the genome ant that multi-step adaptations can occur in a fraction θ = 0.01 of the genome, roughly aligning with the ratio of the number of TSGs vs the total number of genes in the human genome. We subdivide this part of the genome into 200 TSGs which are all hit by a mutation with equal probability. For each individual TSG, the first mutation reduces a cell’s fitness by 0.05. The second mutation increases fitness by 0.15, and all further mutations are fitness neutral. For the remaining (1 − θ) fraction of the genome, we assume that mutations are fitness-neutral with a probability of 0.9, and that fitness effects are otherwise drawn from a Gaussian with mean equal to −0.005 and standard deviation equal to 0.005.
To produce the results in Fig 4A-B, we keep track of all mutations that arise in a population, and remove all mutations with less than 1% variant allele frequency from the output, as those would be unlikely to be detected in WES. Moreover, we keep track of the fraction of mutations that a sample acquired during a mutation burst. For each simulation, we then count the number of TSGs for which there are at least two mutations found in the population and divide this by the total number of mutations in the sample.
Analysis of TCGA WES data
Whole-exome sequencing data was acquired from The Cancer Genome Atlas. We performed mutation signature decomposition using the cosmic fit() function in python from the package SigProfilerAssignment57 version 0.1.8 with cosmic version 3.4. Moreover, we used the mutation-level signature probabilities generated via SigProfilerAssignment to compute mean probabilities of signatures SBS2 and SBS13 for individual SBSs (Fig 4 D,E).
To compute our TSG deactivation score, we filtered the SNV data for missense and nonsense mutations in genes belonging to the TSGene 2.0 database49. For each sample we calculated the number of TSGs with at least two mutations and divided by the total number of SNVs.
Analogously, to construct our synergistic mutations score, we filtered for missense and nonsense mutations in frequently co-mutated gene pairs identified by Gu et al.31 [Enrichment for co-occurrence of mutations in pairs of genes, relative to the product of the frequencies of each individual gene being mutated suggests that the two mutations have a synergistic fitness effect.] and for each sample divided the number of pairs in this list with mutations in both genes by the number of SNVs. 29This enrichment for co-occurrences of mutations in both genes in a pair suggests that mutations in the two genes tend to have a synergistic fitness effect. For each sample in the TCGA WES data we thus computed the number of gene-pairs from this list for which there is at least one non-synonymous single nucleotide substitution mutation in each of the two genes, and divided this number by the total number of single nucleotide substitutions in the sample.
As a robustness-check, we investigated whether the results change if we require a minimum distance between the genomic locations of the mutations in TSGs that we count to our TSG deactivation score. APOBEC has been linked to clustered mutagenesis through processes of kataegis and omikli11,50,51. If samples with higher APOBEC activity have higher rates of having multiple close-by mutations on the same allele, and if such a sample has multiple deactivating mutations in a TSG, these deactivations might be more likely to lie on the same allele compared to samples with lower APOBEC activity. Since we use appearances of more than one deactivating mutation as a proxy for bi-allelic deactivation of TSGs, such a pattern would bias our results. To account for that, we explored filtering TSG mutations in the data based different minimum-distance-thresholds ranging from 1 to 100 bp and found no impact on our results.
Supplementary Information
Derivations of results on multi-step adaptation rates in the limit of infrequent mutations
We discussed differences in the rate of multi-step adaptations fk as a function of a temporal clustering parameter k. Here we formalize this discussion, and derive the result presented in the main text.
We consider processes in which multi-step adaptations occur via stochastic tunneling rather than sequential fixation, i.e. processes in which mutants with a selective disadvantage are negligibly unlikely to reach fixation. We assume that mutations happen sufficiently infrequently, so that we can neglect valley crossing scenarios in which the same mutation in a multi-step adaptation sequence occurs multiple times. Additionally, we require that valley crossing happens solely via stochastic tunneling, i.e. that the probability that a maladaptive mutant fixates is vanishingly low.
We denote the average rate at which cells acquire mutations by µ, and compare dynamics under different mutation processes which we index by a temporal clustering parameter k ≥ 1. Starting every d units of time, a process with parameter k undergoes a mutation burst lasting
As stated in the main text, with this setup we can show that for any population dynamics process for which the above limits can be motivated, the fold-increase in the rate fk at which (n + 1)-step adaptations occur in a process with clustering parameter k relative to the uniform process approaches kn.
Deriving this result for any selection process indeed becomes straightforward, if it can be motivated that the mutation rate does not affect the fate of any mutant lineage once the first mutant in this lineage has emerged.
Somewhat more formally, we index the steps in an n-step adaptation sequence by i ∈ {1,…,n}, and denote the expected size of the lineage descending from mutant i by Yi(t). This quantity Yi(t) of course depends on the specificities of the selection process, but for our derivations it suffices to only require that Yi(t) does not depend on the mutation rate.
Since all intermediate mutants have a selective disadvantage, and since we assume that we are in a parameter regime in which the likelihood that a disadvantageous mutant fixates is negligibly low, it follows that the integral
In the limit of a low (and for now time-invariant) mutation rate µ, the probability that the i’th mutant produces a further mutant P(i → i + 1) is simply the product of the mutation rate and this integral.
The probability that the first mutant spawns a sequence of n+1 mutants can, thus, be written as follows.
Lastly, we use µ̃ to denote the rate at which new mutants with only one mutation emerge. For a given selection process, this rate may vary with the population size. We arrive at the following formulation for the rate at which new advantageous mutants emerge f1.
Finally, we can introduce our clustering parameter k to scale both u and ũ, and multiply by a factor of
One consequence of our assumption that none of the maladaptive intermediate mutants reaches fixation is that the composition of the population once the final mutant emerges in the limit of µ → 0 does not depend on the mutation rate. For population dynamics models with a constant population size in which the probability that an emerging mutant (absent further mutation events) reaches fixation only depends on this composition, such as the Wright-Fisher process, we can interpret
Analogously, in certain models of branching evolution in which the prospects of one branch do not depend on other co-evolving branches, such as the Galton-Watson process58, whether the lineage of an emerging mutant with (n + 1) mutations survives is independent of the mutation rate in the limit of µ → 0. The ratio
The above result suggests that the effect of temporal clustering on relative rates of valley crossing is substantial, and gets exponentially stronger the wider the fitness valley is (n). We validated this result with simulations of a Wright-Fisher process for a range of small values for µ (Fig S2 B-E). As we move away from the limit of rare mutations, the relative effect of temporal clustering
Dynamics in exploration-exploitation setting
Our simulations show that for given µ the effect of increasing the clustering parameter k on the average fitness becomes negative at some k (Fig 3C). In the simulations in this section, this effect is driven by two main factors. Firstly, after a re-drawing, the population often is not in a local maximum of the fitness landscape, and hence may be able to increase its fitness by single mutations (1-step adaptations), without tunneling. Waiting for a burst to occur and thereby delaying such local explorations decreases average fitness. Second, once the population has found a peak in the fitness landscape, having very high mutation rates in bursts temporarily scatters cells to lower points in the landscape, and may even cause drift to points of lower fitness. Both of these effects become apparent when considering a representative snapshot of the simulations (Fig 3D). The misalignment of re-drawing and burst causes the population to remain in a valley for several divisions until the first burst occurs. Moreover, relative to similar snapshots of simulations with lower k (Fig 3E,F), once the population has reached a peak, the scattering during bursts at higher k leads to sharper decreases in fitness, and the population may even remain in a lower fitness point after a burst (Fig 3D).
This increased scattering can also be seen when going from k = 1 to k = 5 (Fig 3E,F). However, at k = 1, the population spends much time in local maxima, as large jumps in fitness only occur right after re-drawings, whereas at k = 5 jumps occur also long after the landscape was re-drawn and the population found a local maximum. These later jumps to higher points in the landscape correspond to tunneling events.

Simulation results: valley-crossing under uniform vs. temporally clustered mutation rates in branching process.
(A) Schematic of fitness landscape as in Fig 2. (B) Schematic of Branching Process model. (C) Simulation results for the Branching Process model. The mutation rate trajectories in (C) are chosen such that the total expected number of mutations under the uniform trajectory is identical to that under the temporally clustered trajectory.

Effect of temporal clustering on stochastic tunneling rates as a function of the mutation rate.
(A) Sketch of the mutation process with clustering parameter k. (B) Sketch of the fitness landscape as in Fig 2. (C) Description of the simulations. (D)-(E) Simulation results of a Wright-Fisher process with 100 cells over 106 generations with a fixed average mutation rate of10−5per cell per generation and varying temporal clustering k. The fitness is as in (B), with subsequent peaks differing in fitness by a factor of 1.5. However, on the y-axis in (F) and (G) we vary the fitness of the valleys relative to the preceding peak from 0.5 (as in (B)) to 1.5. Panel (D) shows the number of valleys crossed per generation (color bar in log10 values), where contrary to the previous panels it is no longer true that every valley crossing leads to a fixation – the next mutant might emerge before that. Panel (E) the share of these valley crossings that occur without fixation of the first mutant.(F) Simulation results of the process described in (A)-(C). Each simulation ran for105/(kμ) generations. The y-axis measures fixations of adaptive mutations per generation (see Fig 5D). (G) Fixations per generation relative to the uniform mutation process (k=1) for k=5 and k=10.

TSG deactivation scores in TCGA.
Distribution and correlation of APOBEC signature contribution and TSG deactivation scores as in Fig 4, sorted by cancer type.

Synergistic mutation scores in TCGA.
Distribution and correlation of APOBEC signature contribution and synergistic mutation scores, sorted by cancer type.

Temporal clustering and adaptation in fitness landscapes with varying ruggedness.
(A) Sketch of the setup. Like in Fig 5, we assume that cells have two types of loci on which one-step or two-step adaptations are possible respectively. Fitness effects per locus are as in Fig 5. Here, we run simulations in which we vary the relative shares of the two types of loci. (B) Simulation results over time, averaged for 100 simulation replicates per row and clustering parameter k. Colors indicate the fitness ratio between simulations with k=5 and k=1 on a log10 scale. (C)-(D) show the corresponding absolute numbers of loci with adaptative mutations (i.e. loci in the black and blue states in panel (A)).
References
- 1.Cancer as an evolutionary and ecological processNature Reviews Cancer 6:924–935https://doi.org/10.1038/nrc2013Google Scholar
- 2.Evolution of the cancer genomeNature Reviews Genetics 13:795–806https://doi.org/10.1038/nrg3317Google Scholar
- 3.The Genomic Processes of Biological Invasions: From Invasive Species to Cancer Metastases and Back AgainFront Ecol Evol 9:681100https://doi.org/10.3389/FEVO.2021.681100Google Scholar
- 4.A Quantitative Measurement of the Human Somatic Mutation RateCancer Res 65:8111–8117https://doi.org/10.1158/0008-5472.CAN-04-1198Google Scholar
- 5.Quantification of subclonal selection in cancer from bulk sequencing dataNature Genetics 50:895–903https://doi.org/10.1038/s41588-018-0128-6Google Scholar
- 6.Measuring single cell divisions in human tissues from multi-region sequencing dataNature Communications 11:1–9https://doi.org/10.1038/s41467-020-14844-6Google Scholar
- 7.Tumor mutation burden: From comprehensive mutational screening to the clinicCancer Cell Int 19:1–10https://doi.org/10.1186/S12935-019-0929-4Google Scholar
- 8.Tumor mutational burden as a predictive biomarker in solid tumorsCancer Discov 10:1808–1825https://doi.org/10.1158/2159-8290.CD-20-0522Google Scholar
- 9.The mathematics of cancer: integrating quantitative modelsNature Reviews Cancer 15:730–745https://doi.org/10.1038/nrc4029Google Scholar
- 10.Characterizing Mutational Signatures in Human Cancer Cell Lines Reveals Episodic APOBEC MutagenesisCell 176:1282–1294https://doi.org/10.1016/J.CELL.2019.02.012Google Scholar
- 11.Mechanisms of APOBEC3 mutagenesis in human cancer cellsNature 607:799–807https://doi.org/10.1038/s41586-022-04972-yGoogle Scholar
- 12.Tracking the Evolution of Non–Small-Cell Lung CancerNew England Journal of Medicine 376:2109–2121https://doi.org/10.1056/NEJMOA1616288Google Scholar
- 13.ClonalHistory & genetic predictors of transformation into small-cell carcinomas from lung adenocarcinomasJournal of Clinical Oncology 35:3065–3074https://doi.org/10.1200/JCO.2016.71.9096Google Scholar
- 14.APOBEC mutagenesis is a common process in normal human small intestineNature Genetics 55:246–254https://doi.org/10.1038/s41588-022-01296-5Google Scholar
- 15.Tumor Evolution Inferred by Single Cell SequencingNature 472:90https://doi.org/10.1038/NATURE09807Google Scholar
- 16.A Big Bang model of human colorectal tumor growthNat Genet 47:209https://doi.org/10.1038/NG.3214Google Scholar
- 17.Punctuated copy number evolution and clonal stasis in triple-negative breast cancerNature Genetics 48:1119–1130https://doi.org/10.1038/ng.3641Google Scholar
- 18.Breast Tumors Maintain a Reservoir of Subclonal Diversity During ExpansionNature 592:302https://doi.org/10.1038/S41586-021-03357-XGoogle Scholar
- 19.Massive genomic rearrangement acquired in a single catastrophic event during cancer developmentCell 144:27–40https://doi.org/10.1016/J.CELL.2010.11.055Google Scholar
- 20.Chromothripsis from DNA damage in micronucleiNature 522:179–184https://doi.org/10.1038/nature14493Google Scholar
- 21.Punctuated evolution of prostate cancer genomesCell 153:666–677https://doi.org/10.1016/J.CELL.2013.03.021Google Scholar
- 22.A pan-cancer compendium of chromosomal instabilityNature 606:976–983https://doi.org/10.1038/s41586-022-04789-9Google Scholar
- 23.Tumor evolution: Linear, branching, neutral or punctuated?Biochimica et Biophysica Acta (BBA) - Reviews on Cancer 1867:151–161https://doi.org/10.1016/J.BBCAN.2017.01.003Google Scholar
- 24.OncoKB: A Precision Oncology Knowledge BaseJCO Precis Oncol 2017:1–16https://doi.org/10.1200/PO.17.00011Google Scholar
- 25.Mutation and Cancer: Statistical Study of RetinoblastomaProceedings of the National Academy of Sciences 68:820–823https://doi.org/10.1073/PNAS.68.4.820Google Scholar
- 26.Stochastic tunnels in evolutionary dynamicsGenetics 166:1571–1579https://doi.org/10.1534/GENETICS.166.3.1571Google Scholar
- 27.Redefining tumour suppressor genes: exceptions to the two-hit hypothesisCell Mol Life Sci 60:2147https://doi.org/10.1007/S00018-003-3027-6Google Scholar
- 28.Loss of Tumor Suppressor Gene Function in Human Cancer: An OverviewCellular Physiology and Biochemistry 51:2647–2693https://doi.org/10.1159/000495956Google Scholar
- 29.Higher order genetic interactions switch cancer genes from two-hit to one-hit driversNature Communications 12:1–10https://doi.org/10.1038/s41467-021-27242-3Google Scholar
- 30.The Two-Hit Hypothesis Meets EpigeneticsCancer Res 82:1167–1169https://doi.org/10.1158/0008-5472.CAN-22-0405Google Scholar
- 31.Systematic interpretation of comutated genes in large-scale cancer mutation profilesMol Cancer Ther 9:2186–2195https://doi.org/10.1158/1535-7163.MCT-10-0022Google Scholar
- 32.Synergistic effects of polymorphisms in DNA repair genes and endogenous estrogen exposure on female breast cancer riskAnn Surg Oncol 17:760–771https://doi.org/10.1245/S10434-009-0802-0Google Scholar
- 33.PIK3CA and APC mutations are synergistic in the development of intestinal cancersOncogene 33:2245–2254https://doi.org/10.1038/onc.2013.167Google Scholar
- 34.The fate of competing beneficial mutations in an asexual populationGenetica 102:127–144https://doi.org/10.1023/A:1017067816551Google Scholar
- 35.The Speed of Adaptation in Large Asexual PopulationsGenetics 167:2045–2053https://doi.org/10.1534/GENETICS.104.027136Google Scholar
- 36.Clonal Interference, Multiple Mutations and Adaptation in Large Asexual PopulationsGenetics 180:2163https://doi.org/10.1534/GENETICS.108.090019Google Scholar
- 37.Big Bang Tumor Growth and Clonal EvolutionCold Spring Harb Perspect Med 8https://doi.org/10.1101/CSHPERSPECT.A028381Google Scholar
- 38.Mutation–selection networks of cancer initiation: tumor suppressor genes and chromosomal instabilityJ Theor Biol 223:433–450https://doi.org/10.1016/S0022-5193(03)00120-6Google Scholar
- 39.Evolutionary dynamics of tumor suppressor gene inactivationProceedings of the National Academy of Sciences 101:10635–10638https://doi.org/10.1073/PNAS.0400747101Google Scholar
- 40.Rapid evolutionary escape by large populations from local fitness peaks is likely in natureEvolution (N Y) 59:1175–1182https://doi.org/10.1111/J.0014-3820.2005.TB01769.XGoogle Scholar
- 41.The rate at which asexual populations cross fitness valleysTheor Popul Biol 75:286–300https://doi.org/10.1016/J.TPB.2009.02.006Google Scholar
- 42.The rate of multi-step evolution in Moran and Wright–Fisher populationsTheor Popul Biol 80:197–207https://doi.org/10.1016/J.TPB.2011.07.003Google Scholar
- 43.Stochastic Tunneling and Metastable States During the Somatic Evolution of CancerGenetics 199:1213–1228https://doi.org/10.1534/GENETICS.114.171553Google Scholar
- 44.Fitness variation in isogenic populations leads to a novel evolutionary mechanism for crossing fitness valleysCommunications Biology 1:1–9https://doi.org/10.1038/s42003-018-0160-1Google Scholar
- 45.Evolution in Mendelian PopulationsGenetics 16:97https://doi.org/10.1093/GENETICS/16.2.97Google Scholar
- 46.On the dominance ratioProc R Soc Edinb 42:321–341Google Scholar
- 47.How few cancer cells can be detected by positron emission tomography? A frequent question addressed by an in vitro studyEur J Nucl Med Mol Imaging 33:697–702https://doi.org/10.1007/S00259-005-0038-6Google Scholar
- 48.Signatures of mutational processes in human cancerNature 500:415–421https://doi.org/10.1038/nature12477Google Scholar
- 49.TSGene 2.0: an updated literature-based knowledgebase for tumor suppressor genesNucleic Acids Res 44:D1023–D1031https://doi.org/10.1093/NAR/GKV1268Google Scholar
- 50.Mutational processes molding the genomes of 21 breast cancersCell 149:979–993https://doi.org/10.1016/J.CELL.2012.04.024Google Scholar
- 51.DNA mismatch repair promotes APOBEC3-mediated diffuse hypermutation in human cancersNat Genet 52:958–968https://doi.org/10.1038/S41588-020-0674-6Google Scholar
- 52.Modelling Stochastic Clonal InterferencePublished online :21–38https://doi.org/10.1007/978-3-642-18734-6_2Google Scholar
- 53.Mutational effects on the clonal interference phenomenonEvolution 58:932–937https://doi.org/10.1111/J.0014-3820.2004.TB00427.XGoogle Scholar
- 54.Clonal interference in large populationsProc Natl Acad Sci U S A 104:18135–18140https://doi.org/10.1073/PNAS.0705778104Google Scholar
- 55.Clonal fitness inferred from time-series modelling of single-cell cancer genomesNature 595:585–590https://doi.org/10.1038/S41586-021-03648-3Google Scholar
- 56.Computational approaches to modelling and optimizing cancer treatmentNature Reviews Bioengineering 1:695–711https://doi.org/10.1038/s44222-023-00089-7Google Scholar
- 57.Assigning mutational signatures to individual samples and individual somatic mutations with SigProfilerAssignmentBioinformatics 39https://doi.org/10.1093/BIOINFORMATICS/BTAD756Google Scholar
- 58.On the Probability of the Extinction of FamiliesThe Journal of the Anthropological Institute of Great Britain and Ireland 4:138https://doi.org/10.2307/2841222Google Scholar
Article and author information
Author information
Version history
- Sent for peer review:
- Preprint posted:
- Reviewed Preprint version 1:
Cite all versions
You can cite all versions using the DOI https://doi.org/10.7554/eLife.108058. This DOI represents all versions, and will always resolve to the latest one.
Copyright
© 2025, Graser et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- views
- 0
- downloads
- 0
- citations
- 0
Views, downloads and citations are aggregated across all versions of this paper published by eLife.