Most cancers carry a substantial deleterious load due to Hill-Robertson interference

Abstract

Cancer genomes exhibit surprisingly weak signatures of negative selection (Martincorena et al., 2017; Weghorn and Sunyaev, 2017). This may be because selective pressures are relaxed or because genome-wide linkage prevents deleterious mutations from being removed (Hill-Robertson interference; Hill and Robertson, 1966). By stratifying tumors by their genome-wide mutational burden, we observe negative selection (dN/dS ~ 0.56) in low mutational burden tumors, while remaining cancers exhibit dN/dS ratios ~1. This suggests that most tumors do not remove deleterious passengers. To buffer against deleterious passengers, tumors upregulate heat shock pathways as their mutational burden increases. Finally, evolutionary modeling finds that Hill-Robertson interference alone can reproduce patterns of attenuated selection and estimates the total fitness cost of passengers to be 46% per cell on average. Collectively, our findings suggest that the lack of observed negative selection in most tumors is not due to relaxed selective pressures, but rather the inability of selection to remove deleterious mutations in the presence of genome-wide linkage.

Editor's evaluation

This is an important paper that shows most cancers unavoidably accumulate damaging mutations. Whilst the majority of claims are convincingly supported by the data, evidence that damaging changes are buffered by heat shock pathways is currently incomplete. The insights into selection efficiency are important for the understanding of cancer growth and response to therapy. A broader implication is that high mutation load tumors may use common strategies to tolerate accumulated deleterious mutations, providing a therapeutic target.

https://doi.org/10.7554/eLife.67790.sa0

Introduction

Tumor progression is an evolutionary process acting on somatic cells within the body. These cells acquire mutations over time that can alter cellular fitness by either increasing or decreasing the rates of cell division and/or cell death. Mutations which increase cellular fitness (drivers) are observed in cancer genomes more frequently because natural selection enriches their prevalence within the tumor population (Martincorena et al., 2017; Weghorn and Sunyaev, 2017). This increased prevalence of mutations across patients within specific genes is used to identify driver genes. Conversely, mutations that decrease cellular fitness (deleterious passengers) are expected to be observed less frequently. This enrichment or depletion is often measured by comparing the expected rate of nonsynonymous mutations (dN) accruing within a region of the genome to the expected rate of synonymous mutations (dS), which are presumed to be neutral. This ratio, dN/dS, is expected to be below 1 when the majority of nonsynonymous mutations are deleterious and removed by natural selection, be ~1 when all nonsynonymous mutations are neutral, and can be >1 when a substantial proportion of nonsynonymous mutations are advantageous.

Two recent analyses of dN/dS patterns in cancer genomes found that for most nondriver genes dN/dS is ~1 and that only 0.1–0.4% of genes exhibit detectable negative selection (dN/dS < 1) (Martincorena et al., 2017; Weghorn and Sunyaev, 2017). This differs substantially from patterns in human germline evolution where most genes show signatures of negative selection (dN/dS ~ 0.4) (Martincorena et al., 2017). Two explanations for this difference have been posited. First, the vast majority of nonsynonymous mutations may not be deleterious in somatic cellular evolution despite their deleterious effects on the organism. While most genes may be critical for proper organismal development and multicellular functioning, they may not be essential for clonal tumor growth. In this hypothesis, negative selection (dN/dS < 1) should be observed only within essential genes and absent elsewhere (dN/dS ~ 1). While appealing in principle, most germline selection against nonsynonymous variants appears to be driven by protein misfolding toxicity (Drummond and Wilke, 2008; Lobkovsky et al., 2010), in addition to gene essentiality. These damaging folding effects ought to persist in somatic evolution.

A second hypothesis is that even though many nonsynonymous mutations are deleterious in somatic cells, natural selection fails to remove them. One possible reason for this inefficiency is the unique challenge of evolving without recombination. Unlike sexually recombining germline evolution, tumors must evolve under genome-wide linkage that creates interference between mutations, known as Hill-Robertson interference, which reduces the efficiency of natural selection (Hill and Robertson, 1966). Without recombination to link and unlink combinations of mutations, natural selection must act on entire genomes – not individual mutations – and select for clones with combinations of mutations of better aggregate fitness. Thus, advantageous drivers may not fix in the population, if they arise on an unfit background, and conversely, deleterious passengers can fix, if they arise on fit backgrounds.

The inability of asexuals to eliminate deleterious passengers is driven by two Hill-Robertson interference processes: hitchhiking and Muller’s ratchet (Figure 1A). Hitchhiking occurs when a strong driver arises within a clone already harboring several passengers. Because these passengers cannot be unlinked from the driver under selection, they are carried with the driver to a greater frequency in the population. Muller’s ratchet is a process where deleterious mutations continually accrue within different clones in the population until natural selection is overwhelmed. Whenever the fittest clone in an asexual population is lost through genetic drift, the maximum fitness of the population declines to the next most fit clone (Figure 1B). The rates of hitchhiking and Muller’s ratchet both increase with the genome-wide mutation rate (Johnson, 1999; Neher and Shraiman, 2012). Therefore, the second hypothesis predicts that selection against deleterious passengers should be more efficient (dN/dS < 1) in tumors with lower mutational burdens.

Figure 1


Two Hill-Robertson interference processes that accumulate deleterious mutations at high mutation rates.

(A) Genetic hitchhiking. Each number identifies a different segment of a clone genome within a tumor. De novo beneficial driver mutations that arise in a clone can drive other mutations (passengers) in the clone to high frequencies (black dotted column). If the passenger is deleterious, both beneficial drivers and deleterious passengers can accumulate. (B) Muller’s ratchet. As the mutation rate within a tumor increases, deleterious passengers accumulate on more clones. If the fittest clone within the tumor is lost through genetic drift (black dotted row), the overall fitness of the population will decline.

Here, we leverage the 10,000-fold variation in tumor mutational burden across 33 cancer types to quantify the extent to which selection attenuates, and thus becomes more inefficient, as the mutational burden increases. Using dN/dS, we find that selection against deleterious passengers and in favor of advantageous drivers is most efficient in low mutational burden cancers. Furthermore, low mutational burden cancers exhibit efficient selection across cancer subtypes, as well as within subclonal mutations, homozygous mutations, somatic copy number alterations (CNAs), and essential genes. Additionally, high mutational burden tumors appear to mitigate this deleterious load by upregulating protein folding and degradation machinery. Finally, using evolutionary modeling, we find that Hill-Robertson interference alone can in principle explain these observed patterns of selection. Modeling predicts that most cancers carry a substantial deleterious burden (~46%) that necessitates the acquisition of multiple strong drivers (~5) in malignancies that together provide a benefit of ~119%. Collectively, these results explain why signatures of selection are largely absent in cancers with elevated mutational burdens and indicate that the vast majority of tumors harbor a large mutational load.

Results

Null models of mutagenesis in cancer

Mutational processes in cancer are heterogeneous, which can bias dN/dS estimates of selective pressures. dN/dS overcomes this issue by dividing observed mutation counts by what is expected under neutral evolution using null models. These null models must account for mutational biases that are often specific to cancer types and genomic regions.

To ensure our dN/dS calculations are robust and reproducible, we applied two different methods to account for mutational biases. The first approach uses a previously established parametric mutational model (dNdScv) that explicitly estimates the background mutational bias of each gene in its calculation of dN/dS (Martincorena et al., 2017). The second approach uses a permutation-based, non-parametric (parameter-free) estimation of dN/dS. In this approach, every observed mutation is permuted while preserving the gene, the patient sample, the specific base change (e.g. A>T), and its tri-nucleotide context. Note that permutations do not preserve the codon position of a mutation and thus can change its protein coding effect (nonsynonymous vs. synonymous). The permutations are then tallied for both nonsynonymous d_N^(permuted) and synonymous d_S^(permuted) substitutions (Figure 2—figure supplement 1) and used as expected proportional values for the observed number of nonsynonymous d_N^(observed) (or simply d_N) and synonymous d_S^(observed) (d_S) mutations in the absence of selection. The unbiased effect of selection on a gene, dN/dS, is then:

\frac{dN}{dS} = \frac{d_{N}^{(\mathrm{observed})} / d_{N}^{(\mathrm{permuted})}}{d_{S}^{(\mathrm{observed})} / d_{S}^{(\mathrm{permuted})}}

For all cancer types and patient samples, p-values and confidence intervals are determined by bootstrapping patient samples. Note that this permutation procedure will account for gene and tumor-level mutational biases (e.g. neighboring bases [Alexandrov and Stratton, 2014], transcription-coupled repair, S phase timing [Haradhvala et al., 2016], mutator phenotypes) and their covariation. We confirmed that this approach accurately measures selection even in the presence of simulated mutational biases (Materials and methods, Figure 2—figure supplement 2A). In addition, this approach also reliably measures the absence of selection (dN/dS = 1) in weakly expressed genes (Figure 2—figure supplement 2C).
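The ratio and the patient-level bootstrap described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the per-patient dictionary layout, the toy counts, and the choice of a percentile interval are all assumptions made for the example.

```python
import random

def dnds(n_obs, s_obs, n_perm, s_perm):
    """Permutation-normalized ratio: (N_obs / N_perm) / (S_obs / S_perm)."""
    return (n_obs / n_perm) / (s_obs / s_perm)

def pooled_dnds(patients):
    """Sum the four counts over patients, then take the ratio once."""
    keys = ("n_obs", "s_obs", "n_perm", "s_perm")
    return dnds(*(sum(p[k] for p in patients) for k in keys))

def bootstrap_ci(patients, n_boot=1000, alpha=0.05, seed=0):
    """Percentile interval from resampling patients with replacement
    (the percentile method is an assumption of this sketch)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        sample = [rng.choice(patients) for _ in patients]
        if sum(p["s_obs"] for p in sample) == 0:
            continue  # ratio is undefined without synonymous mutations
        estimates.append(pooled_dnds(sample))
    estimates.sort()
    lo = estimates[int((alpha / 2) * len(estimates))]
    hi = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return lo, hi

# Hypothetical toy counts: observed and permuted N/S tallies per patient.
patients = [
    {"n_obs": 3, "s_obs": 1, "n_perm": 30, "s_perm": 10},
    {"n_obs": 1, "s_obs": 1, "n_perm": 20, "s_perm": 10},
    {"n_obs": 2, "s_obs": 2, "n_perm": 45, "s_perm": 15},
]
point = pooled_dnds(patients)
lo, hi = bootstrap_ci(patients)
```

Resampling whole patients rather than individual mutations keeps each tumor's mutations together, which matches the statement that confidence intervals come from bootstrapping patient samples.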

We find that both the parametric and non-parametric approaches identify similar patterns of selection (Figure 2A). Since parametric mutational models can become very complex in cancer (exceeding 5000 parameters in some cases; Martincorena et al., 2017; Zapata et al., 2018), we elected to use the non-parametric approach, which makes fewer assumptions about underlying mutational processes, in subsequent calculations of dN/dS.

Figure 2 with 16 supplements


Attenuation of selection and increased protein folding stress in high mutation load tumors.

(A) dN/dS of passenger (red) and driver (green) gene sets within 10,288 tumors in TCGA stratified by total number of substitutions present in the tumor (d_N^(observed)+d_S^(observed)). dN/dS is calculated with error bars using a permutation-based null model (left) and *dNdScv* (right). A dN/dS of 1 (solid black line) is expected under neutrality. Solid gray line denotes pan-cancer genome-wide dN/dS. (B) Fraction of pathogenic missense mutations, annotated by PolyPhen2, in the same driver and passenger gene sets also stratified by total number of substitutions. Black line denotes the pathogenic fraction of missense mutations across the entire human genome. (C) Breakpoint frequency of copy number alterations (CNAs) that reside within exonic (dE) to intergenic (dI) regions within putative driver and passenger gene sets (identified by GISTIC 2.0, Materials and methods) in tumors stratified by the total number of CNAs present in each tumor and separated by CNA length. Solid black line of 1 denotes values expected under neutrality. (D) dN/dS of clonal (variant allele frequency [VAF] > 0.2; darker colors) and subclonal (VAF < 0.2; lighter colors) passenger and driver gene sets in tumors stratified by the total number of substitutions. A dN/dS of 1 (solid black line) is expected under neutrality. (**A–D**) Histogram counts of tumors within mutational burden bins are shown in the top panels. (E) Driver and passenger dN/dS values of the highest and lowest defined mutational burden bin in broad anatomical sub-categories. (F) Same as (E), except for all specific cancer subtypes with ≥500 samples. (G) Z-scores of median gene expression within all genes, HSP90, Chaperonin, and Proteasome gene sets averaged across patients (relative to an average tumor) stratified by the total number of substitutions. All shaded error bars are 95% confidence intervals determined by bootstrap sampling.

Attenuation of selection in drivers and passengers for elevated mutational burden tumors

We estimated dN/dS patterns in both driver and passenger gene sets across 10,288 tumors from TCGA aggregated over 33 cancer types (Ellrott et al., 2018) (Materials and methods). Since TCGA is composed of whole-exome data, which limits our ability to assess mutations in non-coding regions, we elected to use the total number of protein-coding mutations as our proxy for the mutational burden of tumors. To quantify the extent to which selection attenuates as the mutational burden increases, we stratified tumors into bins based on their total number of substitutions on a log-scale. For each bin of tumors, we pooled all of the variants together and estimated dN/dS jointly. Consistent with the inefficient selection model, whereby selection fails to eliminate deleterious mutations in high mutational burden tumors, we observe pervasive selection against passengers exclusively in tumors with low mutational burdens (dN/dS ~ 0.56 in tumors with ≤3 substitutions, while dN/dS ~ 0.93 in tumors with >10 substitutions, Figure 2A). We observed little negative selection in passenger genes when aggregating tumors across all mutational burdens (dN/dS ~ 0.93), which is broadly similar to previous estimates (Martincorena et al., 2017; Weghorn and Sunyaev, 2017; Zapata et al., 2018; Ostrow et al., 2014).
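The stratify-then-pool step can be sketched in a few lines. The bin width (two bins per decade), the function names, and the toy tumors are assumptions for illustration only; the actual bin edges used in Figure 2A are not specified here.

```python
import math
from collections import defaultdict

KEYS = ("n_obs", "s_obs", "n_perm", "s_perm")

def log_bin(total_substitutions, bins_per_decade=2):
    """Index of the log10-scale bin for a tumor with this many substitutions."""
    return math.floor(math.log10(total_substitutions) * bins_per_decade)

def pooled_dnds_by_bin(tumors, bins_per_decade=2):
    """Pool counts across all tumors in a bin, then estimate dN/dS once per bin."""
    pooled = defaultdict(lambda: [0, 0, 0, 0])
    for tumor in tumors:
        burden = tumor["n_obs"] + tumor["s_obs"]  # total observed substitutions
        if burden == 0:
            continue  # log-scale bins are undefined for zero burden
        counts = pooled[log_bin(burden, bins_per_decade)]
        for i, key in enumerate(KEYS):
            counts[i] += tumor[key]
    result = {}
    for b, (n_obs, s_obs, n_perm, s_perm) in sorted(pooled.items()):
        if s_obs and n_perm and s_perm:  # skip bins where the ratio is undefined
            result[b] = (n_obs / n_perm) / (s_obs / s_perm)
    return result

# Hypothetical tumors: two low-burden tumors and one high-burden tumor.
tumors = [
    {"n_obs": 1, "s_obs": 1, "n_perm": 3, "s_perm": 1},
    {"n_obs": 1, "s_obs": 1, "n_perm": 3, "s_perm": 1},
    {"n_obs": 75, "s_obs": 25, "n_perm": 75, "s_perm": 25},
]
by_bin = pooled_dnds_by_bin(tumors)
```

Pooling before taking the ratio matters most in the lowest bins, where a single tumor rarely carries enough synonymous mutations for a per-tumor estimate.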

We confirmed that negative selection on passengers is specific to low mutational burden tumors and not biased by small sample sizes (Figure 2—figure supplement 2B). We randomly sampled passengers from high mutational burden tumors (>10 substitutions) 1000 times using the same bin sizes as in Figure 2A and calculated dN/dS. Within the smallest bin size (N=168 somatic nucleotide variants [SNVs]), negative selection on passengers sampled from high mutational burden tumors was absent (average dN/dS ~ 0.96) compared to observed dN/dS in low mutational burden tumors (dN/dS ~ 0.56; p<2.2 × 10^–16). In fact, only 1.7% of randomly sampled sets of sites had similar signals of negative selection (dN/dS < 0.56).
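A down-sampling control of this kind can be sketched as below. The sketch simplifies the procedure: it treats each mutation as a bare nonsynonymous ("N") or synonymous ("S") label and assumes a single fixed neutral expectation (`n_perm`, `s_perm`) for every draw, whereas a full analysis would recompute permuted counts per sample. The function name and the toy pool are hypothetical.

```python
import random

def fraction_below(mutations, n_perm, s_perm, bin_size, threshold,
                   n_draws=1000, seed=0):
    """Fraction of random subsamples whose dN/dS falls below `threshold`.

    mutations: list of "N" / "S" labels pooled from high-burden tumors.
    n_perm, s_perm: assumed neutral expectation, held fixed across draws.
    """
    rng = random.Random(seed)
    below = valid = 0
    for _ in range(n_draws):
        draw = rng.sample(mutations, bin_size)  # sample without replacement
        n_obs = draw.count("N")
        s_obs = bin_size - n_obs
        if s_obs == 0:
            continue  # ratio undefined for this draw
        valid += 1
        if (n_obs / n_perm) / (s_obs / s_perm) < threshold:
            below += 1
    return below / valid

# Hypothetical neutral pool with a 3:1 nonsynonymous-to-synonymous ratio.
pool = ["N"] * 3000 + ["S"] * 1000
frac = fraction_below(pool, n_perm=3, s_perm=1, bin_size=168, threshold=0.56)
```

If small sample size alone produced low dN/dS values, a large share of these subsamples would fall below the threshold; under a neutral toy pool, almost none should.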

Also consistent with the inefficient selection model, drivers exhibit a similar but opposing trend of attenuated selection at elevated mutational burdens (dN/dS ~ 2.7 in tumors with ≤3 substitutions, gradually declining to ~1.16 in tumors with >100 substitutions). This pattern is not specific to drivers that are oncogenes or tumor suppressors (Figure 2—figure supplement 3). While the attenuation of selection against passengers in higher mutational burden tumors is a novel discovery, this pattern among drivers has been reported previously (Martincorena et al., 2017). Furthermore, we confirmed that these patterns are robust to the choices that we made in our analysis pipeline, including: (i) the effects of germline SNP contamination (Figure 2—figure supplement 4), (ii) the choice of driver gene set (Bailey et al., 2018; IntOGen, Gonzalez-Perez et al., 2013; and COSMIC, Tate et al., 2019; Forbes et al., 2008; Figure 2—figure supplement 5), (iii) differences in tumor purity and thresholding (Figure 2—figure supplement 6), and (iv) the null model of mutagenesis (dNdScv, Figure 2A and Figure 2—figure supplement 7; Martincorena et al., 2017) (Materials and methods).

If negative selection is more pronounced in low mutational burden tumors, then the nonsynonymous mutations observed should also be less functionally consequential. By annotating the functional effect of all missense mutations using PolyPhen2 (Adzhubei et al., 2010; Figure 2B), we indeed find that observed nonsynonymous passengers are less damaging in low mutational burden cancers. Similarly, driver mutations become less functionally consequential as mutational burden increases, as expected for mutations experiencing inefficient positive selection (Figure 2B). Together these two trends provide additional and orthogonal evidence that selective forces on nonsynonymous mutations are more efficient in low mutational burden cancers.

Since all mutational types experience Hill-Robertson interference, attenuated selection should also persist in CNAs. We used two previously published statistics to quantify selection in CNAs: breakpoint frequency (Korbel et al., 2007) and fractional overlap (Zack et al., 2013). For both measures, we compare the number of CNAs that either terminate (breakpoint frequency) within or partially overlap (fractional overlap) Exonic regions of the genome relative to non-coding (Intergenic and Intronic) regions (dE/dI, see Materials and methods). Like dN/dS, dE/dI is expected to be <1 in genomic regions experiencing negative selection, >1 in regions experiencing positive selection (e.g. driver genes), and ~1 when selection is absent or inefficient (Figure 2—figure supplement 8). Using dE/dI, we observe attenuating selection in both driver and passenger CNAs as the total number of CNAs increases for both breakpoint frequency (Figure 2C) and fractional overlap (Figure 2—figure supplement 9). While CNAs of all lengths experience attenuated selection, CNAs longer than the average gene length (>100 KB) experience greater selective pressures in drivers. Collectively, these results strongly support the inefficient selection model and argue that the observed patterns must be due to a universal force in tumor evolution. We find that selection consistently attenuates in both drivers and passengers across all cancers as mutational burden increases.

Strong selection in low mutational burden tumors cannot be explained by mutational timing, gene function, or tumor type

We next tested alternative hypotheses to the inefficient selection model. We considered the possibility that selection is strong only during normal tissue development, but absent after cells have transformed to malignancy. This would disproportionately affect low mutational burden tumors, as a greater proportion of their mutations arise prior to tumor transformation. If true, then attenuated selection should be absent in subclonal mutations, which must arise during tumor growth. However, selection clearly attenuates with increasing mutational burden for the subset of likely subclonal mutations with variant allele frequency (VAF) below 20% (Figure 2D and Figure 2—figure supplement 10). Although selection attenuates in drivers and passengers in both subclonal and clonal mutations, selection is weaker in both drivers and passengers with lower VAFs. Weaker efficiency of selection among less frequent variants is expected under a range of population genetic models (Messer, 2009) and especially so in rapidly expanding, spatially constrained cancers (Sottoriva et al., 2015). In addition, heterozygous mutations, to the extent they are only partially dominant (López et al., 2020), are also expected to exhibit lower VAFs and experience weaker selection.

Next, we considered and rejected the possibility that attenuated selection is limited to particular types of genes. We first annotated our observed mutations by different functional categories and Gene Ontology (GO) terms (Harris et al., 2004) and find that negative selection is not specific to any particular gene functional category expected to be under constraint, and specifically not limited to essential or housekeeping genes – a key prediction of the ‘weak selection’ model (Martincorena et al., 2017; Figure 2—figure supplement 11, p<0.05, Wilcoxon signed-rank test).

Finally, we found that these patterns of attenuated selection persist across cancer subtypes for both SNVs and CNAs. We calculated dN/dS in tumors grouped by nine broad anatomical sub-categories (e.g. neuronal) and 33 subtype classifications (Grossman et al., 2016; Figure 2E–F). We find that patterns of attenuated selection in SNVs persist in the broad and specific (drivers p=3.8 × 10^–5, passengers p=1.7 × 10^–2, Wilcoxon signed-rank test; Figure 2—figure supplement 12) classification schemes. Furthermore, dE/dI measurements of CNAs exhibit similar patterns of selection in broad (Figure 2—figure supplement 13) and specific subtypes (Figure 2F; drivers p<0.05 and passengers p<0.05).

Collectively, these results suggest that tumors with elevated mutational burdens carry a substantial deleterious load. Since nonsynonymous mutations are thought to be primarily deleterious by inducing protein misfolding (Drummond and Wilke, 2008; Lobkovsky et al., 2010), we tested whether an increase in the number of passenger mutations in tumors would lead to elevated protein folding stress, and, in turn, drive the upregulation of heat shock and protein degradation (McGrail et al., 2020) pathways in cancer (Santagata et al., 2011). Indeed, gene expression of HSP90, Chaperonins, and the Proteasome does increase across the whole range of SNV (weighted R² of 0.84, 0.78, and 0.78, respectively) and CNA burdens (weighted R² of 0.83, 0.88, and 0.85, respectively) (Figure 2G and Figure 2—figure supplement 14A). This trend persists across cancer types for SNVs and CNAs (Figure 2—figure supplement 14D-E). Importantly, expression of these gene sets increases across the whole range of mutational burdens, even after the dN/dS of passengers approaches 1. This result presents additional evidence that passengers continue to impart a substantial cost to cancer cells, even in high mutational burden tumors.

Evolutionary modeling estimates the fitness effects of drivers and passengers, and rate of Hill-Robertson interference processes

We next tested whether Hill-Robertson interference – a process where selection becomes inefficient due to interference between linked mutations with competing fitness effects – alone can generate these patterns of attenuated selection. Specifically, we modeled tumor progression as a simple evolutionary process with advantageous drivers and deleterious passengers. We then used approximate Bayesian computation (ABC) to compare these simulations to observed data and infer the mean fitness effects of drivers and passengers.

Our previously developed evolutionary simulations model a well-mixed population of tumor cells that can randomly acquire advantageous drivers and deleterious passengers during cell division (McFarland et al., 2013). The product of the individual fitness effects of these mutations determines the relative birth and death rate of each cell, which in turn dictates the population size N of the tumor. If the population size of a tumor progresses to malignancy (N>1,000,000) within a human lifetime (≤100 years), the accrued mutations and patient age are recorded. The mutation rate of each simulated tumor is randomly sampled from a broad range (10^–12–10^–7 mutations · nucleotide^–1 · generation^–1, Materials and methods). Although this model ignores a great deal of known tumor biology, we believe it constitutes the simplest evolutionary model that could possibly recapitulate observed selection for drivers and against passengers. Our question is not whether this model is correct in all details but rather whether even such a simple model can generate quantitatively similar patterns as observed in the data with sensible values of mutation rates and selection coefficients.
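To make the multiplicative fitness rule concrete, here is a minimal sketch of how a clone's relative fitness could be computed from its accumulated mutations. The specific functional form (each driver multiplies fitness by 1 + s, each passenger divides it by 1 + s) and the function name are assumptions of this example; the text above only states that fitness is the product of individual effects.

```python
def clone_fitness(driver_effects, passenger_effects):
    """Relative fitness of a clone under a multiplicative model.

    Assumed form: each driver scales fitness by (1 + s), each passenger
    by 1 / (1 + s). A clone with no mutations has fitness 1.0.
    """
    fitness = 1.0
    for s in driver_effects:
        fitness *= 1.0 + s
    for s in passenger_effects:
        fitness /= 1.0 + s
    return fitness

# Hypothetical clone: many individually weak passengers, no drivers.
weak_passengers = [0.01] * 50
load_only = clone_fitness([], weak_passengers)

# Adding one hypothetical strong driver to the same background.
with_driver = clone_fitness([0.5], weak_passengers)
```

Because effects multiply, a large number of passengers that are each close to neutral can still depress clone fitness appreciably, and a new driver's benefit is always expressed relative to the background it lands on.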

Figure 3A illustrates the ABC procedure. To compare our model to observed data, we simulated exponential distributions of fitness effects (DFEs) with mean fitness values that spanned a broad range (10^–2–10^0 for drivers and 10^–4–10^–2 for passengers, Materials and methods). We summarized observed and simulated data using statistics that capture three relationships: (i) the dependence of driver and passenger dN/dS rates on mutational burden, (ii) the rate of cancer age incidence (SEER database; National Cancer Institute, 2007), and (iii) the distribution of mutational burdens (summary statistics of (ii) and (iii) were based on theoretical parametric models; Frank, 2007; Materials and methods; Figure 3—figure supplements 1–2). We then inferred the posterior probability distribution of mean driver fitness benefit and mean passenger fitness cost using a rejection algorithm that we validated using leave-one-out cross validation (CV) (Materials and methods, Figure 3—figure supplement 3).
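The general shape of a rejection-based ABC step can be sketched as follows. Everything specific here is an assumption: the toy simulator simply returns noisy copies of the log10 parameters in place of the tumor simulations, the acceptance rule keeps the closest 5% of draws rather than applying the tolerance used in the study, and the "observed" summary is an arbitrary point chosen for the demonstration. Only the log-uniform prior ranges follow the text.

```python
import math
import random
import statistics

def abc_rejection(simulate, observed, sample_prior,
                  n_sims=2000, keep_fraction=0.05, seed=0):
    """Keep the prior draws whose simulated summaries lie closest to the data."""
    rng = random.Random(seed)
    scored = []
    for _ in range(n_sims):
        theta = sample_prior(rng)
        summary = simulate(theta, rng)
        scored.append((math.dist(summary, observed), theta))
    scored.sort(key=lambda pair: pair[0])  # smallest distance first
    n_keep = max(1, int(keep_fraction * n_sims))
    return [theta for _, theta in scored[:n_keep]]

def sample_prior(rng):
    """Log-uniform priors: log10(s_drivers) in [-2, 0], log10(s_passengers) in [-4, -2]."""
    return (rng.uniform(-2.0, 0.0), rng.uniform(-4.0, -2.0))

def toy_simulate(theta, rng):
    """Toy stand-in simulator: summaries are the parameters plus Gaussian noise."""
    return tuple(x + rng.gauss(0.0, 0.05) for x in theta)

observed = (-1.0, -3.0)  # arbitrary toy summary, not a value from the study
accepted = abc_rejection(toy_simulate, observed, sample_prior)
posterior_median = tuple(
    statistics.median(theta[i] for theta in accepted) for i in range(2)
)
```

The accepted draws approximate the posterior; in a real application the summary statistics would be the dN/dS curves, age-incidence fit, and burden distribution, each scaled so that no single statistic dominates the distance.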

Figure 3 with 6 supplements


Approximate Bayesian computation (ABC) procedure estimates the strength of selection in passengers and drivers.

(A) Schematic overview of the ABC procedure used. A model of tumor evolution with genome-wide linkage contains two parameters – s_drivers (mean fitness benefit of drivers) and s_passengers (mean fitness cost of passengers) – sampled over broad prior distributions of values. Simulations begin with an initiating driver event that establishes the initial population size of the tumor. The birth rate of each individual cell within the tumor is determined by the total accumulated fitness effects of drivers and passengers. If the final population size of the tumor exceeds 1 million cells within a human lifetime (100 years), patient age and accrued mutations are recorded. Summary statistics of four relationships are used to compare simulations to observed data: (i) dN/dS rates of drivers and (ii) passengers across mutational burden, (iii) rates of cancer incidence vs. age, and (iv) the distribution of mutational burdens. Simulations that excessively deviate from observed data are rejected (Materials and methods). (**B–C**) Inferred posterior probability distributions of s_drivers and s_passengers. The maximum likelihood estimate (MLE) of s_drivers is 53.0% (green, 95% CI [16.0, 111.4]), and the MLE of s_passengers is 1.03% (green, 95% CI [0.40, 3.98%]). (**D–F**) Comparison of the summary statistics of the best-fitting simulations (MLE parameters, dashed lines) to observed data (solid lines). (D) dN/dS rates of passengers (red) and drivers (light green) for simulated and observed data vs. mutational burden. A model where 6% of synonymous mutations within drivers experience positive selection (dark green) was also considered. (E) Cancer incidence rates for patients above 20 years of age. (F) Distribution of the mutational burdens of tumors.

Using this approach, the maximum likelihood estimate (MLE) of mean driver fitness benefit is 53% (Figure 3B), while the MLE of passenger mean fitness cost is 1.03% (Figure 3C). Simulations with these MLE values agree well with all observed data (Figure 3D–F, Pearson’s r=0.988 for combined driver/passenger dN/dS).

While Hill-Robertson interference alone explains dN/dS rates in the passengers well, the simulations most consistent with observed data still exhibited consistently higher dN/dS rates in drivers (Figure 3D). We tested whether positive selection on synonymous mutations within driver genes could explain this discrepancy. Indeed, we find that a model incorporating synonymous drivers agrees modestly better with observed statistics (3.5-fold relative likelihood, ABC posterior probability). The best-fitting model predicts that ~6% of synonymous mutations within driver genes experience positive selection, which is consistent with previous estimates for human oncogenes (Supek et al., 2014) (Materials and methods, Figure 3D and Figure 3—figure supplement 4). Furthermore, we observe additional evidence of selection and codon bias in synonymous drivers exclusive to low mutational burdens (TCGA samples, Materials and methods, Figure 3—figure supplement 4).

We note that although deleterious passengers are necessary to explain attenuation of negative selection with mutational burden in passengers, alternative explanations could also contribute to attenuation of positive selection in drivers. Specifically, high mutational burden tumors are more likely to contain mutations in pan-cancer driver gene sets which might not directly contribute to tumorigenesis in specific tumors, and thus might not be under direct positive selection in all tumors. Similarly, additional driver mutations might not directly contribute to tumor fitness beyond a certain number of driver mutations (e.g. 5-hit model). Nonetheless, it’s important to note that Hill-Robertson interference is capable of reproducing all the features of the data (steep attenuation of negative selection in passengers and gradual attenuation of positive selection in drivers).

Overall, our results indicate that rapid adaptation through natural selection – acting on entire genomes, rather than individual mutations – is pervasive in all tumors, including those with elevated mutational burdens. Given the quantity of drivers and passengers observed in a typical cancer (TCGA), our model implies that cancer cells are in total ~90% fitter than normal tissues (119% total benefit of drivers, 46% total cost of passengers). A median of five drivers each of which has a mean benefit of ~19% accumulate per tumor in these simulations – also consistent with estimates from age incidence curves (National Cancer Institute, 2007), known hallmarks of cancer (Hanahan and Weinberg, 2000), and estimates of the selective benefit of individual drivers (Dai et al., 2007). Lastly, the mutation rates of tumors that could progress to cancer in our model also recapitulate observed mutation rates in human cancer (Camps et al., 2007) (median 3.7×10^–9, 95% interval 1.1×10^–10–8.2×10^–8, Figure 3—figure supplement 5).

Most notably, under our modeling assumptions, all passengers together confer a fitness cost of ~46% per tumor. While this collective burden appears large, the individual fitness effects of accumulated passengers in these simulations (mean 0.8%) are similar to observed fitness costs in cancer cell lines (1–3%) (Williams et al., 2008) and the human germline (0.5%) (Cassa et al., 2017). Note that in our model, these passengers accumulated primarily via Muller’s ratchet, while only ~5% accumulated via hitchhiking (inferred using population genetics theory [McFarland et al., 2013] and MLE fitness effects; Materials and methods, Figure 3—figure supplement 6). These results suggest that Hill-Robertson interference is a plausible model for the empirical patterns of attenuated selection with mutational burden observed in the data.

Discussion

Here, we argue that signals of selection are largely absent in cancer because of the inefficiency of selection and not because of weakened selective pressures. In low mutational burden tumors (≤3 total substitutions per tumor), increased selection for drivers and against passengers is observed and ubiquitous: in SNVs and CNAs; in heterozygous, homozygous, clonal, and subclonal mutations; and in mutations predicted to be functionally consequential. These trends are not specific to essential or housekeeping genes. Importantly, these patterns persist across broad and specific tumor subtypes. Collectively, these results suggest that inefficient selection is generic to tumor evolution and that deleterious load is a nearly universal hallmark of cancer.

Importantly, these patterns of selection are missed when dN/dS rates are not stratified by mutational burden. Since <0.1% of mutations in TCGA reside within low mutational burden tumors (~1% of all tumors, N=83), dN/dS in passengers at low mutational burdens (~0.56) does not appreciably alter pan-cancer dN/dS of passengers (0.97 in our study, 0.82–0.98 in Martincorena et al., 2017; Weghorn and Sunyaev, 2017; Zapata et al., 2018; Ostrow et al., 2014). In fact, detecting negative selection on passengers at low mutational burdens is only possible by aggregating all mutations within these tumors and estimating dN/dS jointly. Thus, we believe that low mutational burden tumors are uniquely valuable for identifying genes and pathways under positive and negative selection. While only ~1% of tumors exhibit substantial negative selection, selection in drivers, selection on CNAs, and expression patterns of chaperones and proteasome components all show a continuous response to deleterious passenger load across a broad range of mutational burdens. Collectively, this suggests that passengers continue to be deleterious even in high mutational burden tumors.
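The dilution argument can be illustrated with a short calculation. This is a rough sketch, not the estimation procedure used in the study: it assumes pooled dN/dS behaves approximately like a mutation-weighted average of the per-stratum values, and it uses the pan-cancer value of 0.97 as a stand-in for the remaining tumors. The function name and the two-stratum split are hypothetical.

```python
def pooled_dnds(strata):
    """Mutation-weighted average dN/dS across mutational burden strata.

    strata: list of (share_of_mutations, dnds) pairs; shares must sum to 1.
    Assumption: pooling acts like a weighted average, which ignores
    differences in the neutral expectation between strata.
    """
    total = sum(share for share, _ in strata)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("mutation shares must sum to 1")
    return sum(share * dnds for share, dnds in strata)


low_share = 0.001  # upper bound from the text: <0.1% of TCGA mutations
pooled = pooled_dnds([(low_share, 0.56), (1 - low_share, 0.97)])
shift = 0.97 - pooled  # how far the low-burden stratum moves the pooled value
print(round(pooled, 5), round(shift, 5))
```

Under these assumptions, the low-burden stratum moves the pooled estimate by less than 0.001, which is well inside the range of published pan-cancer estimates quoted above.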

Using a simple evolutionary model, we show that Hill-Robertson interference alone can explain this ubiquitous trend of attenuated selection in both drivers and passengers. dN/dS rates attenuate in drivers because the background fitness of a clone becomes more important than the fitness effects of an additional driver at elevated mutation rates. Furthermore, these simulations indicate that, despite dN/dS patterns approaching 1 in tumors with elevated mutational burdens, passengers are not effectively neutral (Ns > 1). Instead, passengers confer an individually weak, but collectively substantial fitness cost of ~46% that measurably impacts tumor progression. Because this simple evolutionary model does not explicitly incorporate many known aspects of tumor biology (e.g. haploinsufficiency, see Supplementary file 1), these fitness estimates are highly provisional. Nonetheless, we note that selection’s efficiency in cancer is further reduced when spatial constraints are considered (Sottoriva et al., 2015).
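As a back-of-the-envelope check on how individually weak costs can compound into a substantial collective cost, the sketch below asks how many passengers of the mean simulated effect (0.8%) would be needed to produce a total cost of 46%. It assumes that passenger costs combine multiplicatively and that every passenger has the same effect; neither assumption is stated in the text here, and the function name is hypothetical.

```python
import math


def passengers_for_total_cost(total_cost, per_passenger_cost):
    """Number of equal-effect passengers whose costs compound to total_cost.

    Assumption: fitness is multiplied by (1 - per_passenger_cost) for each
    passenger, so (1 - per_passenger_cost) ** n == 1 - total_cost.
    """
    return math.log(1.0 - total_cost) / math.log(1.0 - per_passenger_cost)


n_passengers = passengers_for_total_cost(0.46, 0.008)
print(round(n_passengers, 1))
```

Under these assumptions, several dozen passengers of sub-1% effect are enough to reach a collective cost of this size, even though each would be difficult to detect individually.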

The functional explanation for why passengers in cancer are deleterious is unknown. In germline evolution, mutations are believed to be primarily deleterious because of protein misfolding (Drummond and Wilke, 2008; Lobkovsky et al., 2010). Deleterious passengers in somatic cells should confer similar effects. Indeed, we find that elevated mutational burden tumors may buffer the cost of deleterious mutations by upregulating multiple heat shock pathways. However, deleterious passengers may carry other costs to cancers or be buffered by additional mechanisms. Understanding and identifying how tumors manage this deleterious burden should identify new cancer vulnerabilities that enable new therapies and better target existing ones (Gorgoulis et al., 2018; Dai et al., 2007; Glaire and Church, 2017).

Two Hill-Robertson interference processes that accumulate deleterious mutations at high mutation rates.

Attenuation of selection and increased protein folding stress in high mutation load tumors.

Approximate Bayesian computation (ABC) procedure estimates the strength of selection in passengers and drivers.

Author details

Susanne Tilk (for correspondence)

Svyatoslav Tkachenko

Christina Curtis

Dmitri A Petrov

Christopher D McFarland (for correspondence)
