Variations and predictability of epistasis on an intragenic fitness landscape

Sarvesh Baheti; Namratha Raj; Supreet Saini

doi:10.7554/eLife.104848.1

eLife Assessment

This paper addresses the significant question of quantifying epistasis patterns, which affect the predictability of evolution, by reanalyzing a recently published combinatorial deep mutational scan experiment. The findings are that epistasis is fluid, i.e. strongly background dependent, but that fitness effects of mutations are predictable based on the wild-type phenotype. However, these potentially interesting claims are inadequately supported by the analysis, because measurement noise is not accounted for, arbitrary cutoffs are used, and global nonlinearities are not sufficiently considered. If the results continue to hold after these major improvements in the analysis, they should be of interest to all biologists working in the field of fitness landscapes.

https://doi.org/10.7554/eLife.104848.1.sa4

Strength of evidence

inadequate: Methods, data and analyses do not support the primary claims

During the peer-review process the editor and reviewers write an eLife assessment that summarises the significance of the findings reported in the article (on a scale ranging from landmark to useful) and the strength of the evidence (on a scale ranging from exceptional to inadequate).

Abstract

How epistasis hinders or facilitates movement on fitness landscapes has been a longstanding question of interest. Several high-throughput experiments have demonstrated that, despite their idiosyncrasy, epistatic effects exhibit global statistical patterns. Recently, Papkou et al. constructed a fitness landscape for a 9-base region in the folA gene, which encodes dihydrofolate reductase (DHFR), in E. coli, and demonstrated that despite being highly rugged, the landscape is highly navigable. In this work, using the folA landscape, we ask two questions: (1) How does the nature of epistatic interactions change as a function of the genomic background? (2) How predictable is epistasis within a gene? Our results show that epistasis is “fluid”: the nature of epistasis exhibited by a pair of mutations is strongly contingent on the genetic background. Mutations exhibit one of two binary “states”: a small fraction of mutations exhibit extremely strong patterns of global epistasis, while most do not. Despite these observations, we observe that the distribution of fitness effects (DFE) of a genotype is highly predictable based on its fitness. These results offer a new perspective on how epistasis operates within a gene, and how it can be predicted.

Significance Statement

How a mutation changes organismal fitness is dependent on the genome in which it occurs. This phenomenon is known as epistasis and makes evolution unpredictable. Recent efforts to understand epistasis have led to the identification of statistical patterns in its manifestations. To study how epistasis operates in protein evolution, we analyze a recently reported landscape which quantifies the fitness of ∼260,000 sequences of an E. coli gene. We show two previously unknown properties of epistasis: “fluid” (epistasis between mutations is controlled via other epistatic interactions) and “binary” (only a few mutations exhibit statistical patterns; most do not). This work sheds new light on how epistasis manifests in gene sequences. Our results have important consequences for protein and organismal evolution.

Introduction

Mutations decide the fitness of an organism in an environment-dependent fashion. However, the effect of a mutation also depends on the genetic background in which it occurs¹. This phenomenon is referred to as epistasis. Hence, epistasis influences adaptation^2,3. It is largely unpredictable, although a few statistical patterns based on macroscopic traits have been reported in the last few years^4–10. One way to study the genotype-phenotype relationship and patterns of epistasis is by using fitness landscapes^11–13.

A fitness landscape is a multidimensional surface, in which one dimension represents a fitness-related phenotype, and the others represent the genotype. Hence, it serves as a genotype-phenotype map, whose shape, in a given environment, is a consequence of genetic interactions or epistasis^14–17. If the landscape has many peaks, its structure is called rugged^18–20. Populations navigating a rugged landscape are likely to be trapped on local peaks, whose precise identity is dictated by chance and by the population’s starting point on the landscape^21–23. Alternatively, on smooth landscapes, populations starting from different points in the sequence space all converge to the same global peak^24–27. Thus, the structure of the landscape also dictates our ability to predict evolution, and this has wide-ranging implications.

Although the fitness landscape was conceptualized to explain the relationship between a population’s genotype and fitness, it has evolved to explain the relationship between fitness and functional protein-coding sequences^13,28. This effort to characterize fitness landscapes started by considering a handful of “important” biallelic sites in a protein^27,29–37. Although such landscapes could explain the pervasiveness of epistasis in protein sequences, they do not allow us to make wide-ranging statistical predictions of the characteristics of the fitness landscape³⁸. However, with advancing high-throughput technologies, it has become possible to construct high-dimensional landscapes^39–45.

In this study, we use one such high-dimensional fitness landscape constructed by Papkou and coworkers to understand how epistasis operates in a 9-base pair region of the folA gene in E. coli⁴⁰. folA encodes dihydrofolate reductase (DHFR), and mutations in the gene are known to confer resistance to the antibiotic trimethoprim^46–50.

Via analysis of the folA landscape, we ask three specific questions (Supplement Figure S1): (1) How does the nature of epistasis between two given sites change as a function of the genetic background? (2) Are these changes dependent on the fitness or genotype? (3) Does a given mutation follow already known patterns of global epistasis? If yes, what does it depend on? Our results show that epistasis is “fluid” – i.e., the nature of epistasis that two mutations exhibit is a function of the genetic background. We also show that only a small fraction of mutations follow global epistasis. In fact, mutations can be classified into two groups: those that show global epistasis and those (comprising a majority) that do not. We also propose a novel way to estimate and predict the distribution of fitness effects (DFE) of a given genotype.

Results

folA fitness landscape in E. coli

A fitness landscape of a 9-base pair region of folA was generated recently by Papkou and coworkers⁴⁰. The landscape explored a 9-bp region of the gene that has been shown to be important for resistance to the antibiotic trimethoprim^47,51. All 4⁹ variants of folA were generated, and grown in media containing the antibiotic. Deep sequencing was used to quantify fitness, and relative fitness of ∼99.7% of all variants was obtained. A striking feature of the landscape reported by Papkou and coworkers was that despite being highly rugged with 514 peaks, a majority of the landscape had adaptive access to high fitness peaks. This was because, compared to lower peaks, high fitness peaks had a large basin of attraction.

Our analysis using this dataset shows that with increasing size of the landscape, the number of peaks increases; however, the density of the peaks decreases (Supplement Figure S2 and S3). With this increasing size of the landscape, however, the accessibility of the global peak decreases (Supplement Figure S4). Globally, only small regions of the landscape could be represented as a maximally rugged NK landscape (Supplement Figure S5).

By scanning all mutations on a 9-dimensional landscape, Papkou et al. have created a dataset that allows us to ask specific questions about epistasis and its manifestations on a fitness landscape. Papkou et al. divided the genotypes on the folA landscape into two categories: functional (∼7% of all points) and non-functional (∼93% of all points). This distinction is based on a statistical segregation of the 4⁹ points. Since this segregation does not have any functional basis, we call the two groups “high fitness” and “low fitness”.

Nature of epistasis between two mutations is context dependent or “fluid” nature of epistasis

Epistasis between two mutations, A and B, can manifest as no, positive, negative, or sign epistasis⁵². While we know several examples of pairs of mutations exhibiting epistasis of each kind^9,52, we do not know if or how often the nature of epistasis exhibited by a pair of mutations changes with the genomic background.

To answer this question, we pick a pair of mutations and compute the fraction of genomes in which these mutations exhibit (a) Positive epistasis (PE), (b) Negative epistasis (NE), (c) Sign epistasis, and (d) No epistasis (No). Sign epistasis was further classified into (i) Reciprocal Sign epistasis (RSE), (ii) Single Sign epistasis (SSE) and (iii) Other Sign epistasis (OSE) based on the number of paths restricted by Darwinian evolution (Figure 1A).

Nature of epistasis exhibited by two mutations is contingent on the genetic background.
**(A)** The nature of epistasis exhibited by two mutations (indicated by red circle and green square) in all 4⁷ genetic backgrounds was quantified. Numbers f₁ to f₆ indicate the fraction of genomes in which the two mutations exhibit Positive epistasis (PE), Negative epistasis (NE), Reciprocal Sign epistasis (RSE), Single Sign epistasis (SSE), Other Sign epistasis (OSE), and No epistasis (No), and were calculated for this pair of mutations. The process was repeated for every possible pair of mutations. Distribution of the six fractions f₁ to f₆ in **(B)** high and **(C)** low fitness backgrounds is plotted.

For example, the mutation pair (G → A at position 3 and T → C at position 7) exhibits PE in 26.08%; NE in 34.36%; RSE in 0.67%; SSE in 2.31%; OSE in 5.00%; and No Epistasis in 31.57% of all high fitness backgrounds. The corresponding figures for low fitness backgrounds are: PE: 19.41%; NE: 22.61%; RSE: 7.65%; SSE: 5.71%; OSE: 13.16%; and No Epistasis: 30.92%. The exact numbers for all mutation pairs are provided in Supplement Data-File1.

We repeat this process for all possible pairs of mutations in the 9-base pair region of folA. The frequency distribution of the fraction of all genotypes where a pair exhibits each type of epistasis is shown in Figure 1B (for high fitness backgrounds) and Figure 1C (for low fitness backgrounds).
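The tallying step described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the dictionary representation of the landscape, and 0-indexed positions are assumptions. The classification rule is passed in by the caller, and `additive_deviation` is a hypothetical toy classifier that only looks at the sign of the deviation from additivity; it does not reproduce the six-way PE/NE/RSE/SSE/OSE/No scheme used in the paper.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def mutate(seq, pos, base):
    """Return seq with the base at 0-indexed position pos replaced by base."""
    return seq[:pos] + base + seq[pos + 1:]


def epistasis_fractions(landscape, mut_a, mut_b, classify):
    """Fraction of backgrounds in which a mutation pair falls in each class.

    landscape: dict mapping sequence -> fitness (unmeasured variants absent).
    mut_a, mut_b: (position, from_base, to_base) at two different positions.
    classify: callable (s1, s2, s12) -> label, supplied by the caller.
    """
    (pos_a, from_a, to_a), (pos_b, from_b, to_b) = mut_a, mut_b
    length = len(next(iter(landscape)))
    free = [i for i in range(length) if i not in (pos_a, pos_b)]
    counts = Counter()
    # For a 9-base landscape this loop visits 4**7 backgrounds.
    for combo in product(BASES, repeat=len(free)):
        bases = [""] * length
        bases[pos_a], bases[pos_b] = from_a, from_b
        for i, b in zip(free, combo):
            bases[i] = b
        background = "".join(bases)
        single_a = mutate(background, pos_a, to_a)
        single_b = mutate(background, pos_b, to_b)
        double = mutate(single_a, pos_b, to_b)
        variants = (background, single_a, single_b, double)
        if any(v not in landscape for v in variants):
            continue  # skip backgrounds that involve an unmeasured variant
        f0, fa, fb, fab = (landscape[v] for v in variants)
        counts[classify(fa - f0, fb - f0, fab - f0)] += 1
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


def additive_deviation(s1, s2, s12, tol=1e-9):
    """Toy classifier (hypothetical): sign of the deviation from additivity."""
    eps = s12 - (s1 + s2)
    if eps > tol:
        return "positive"
    if eps < -tol:
        return "negative"
    return "none"
```

With a landscape loaded as a dictionary, a call such as `epistasis_fractions(landscape, (2, "G", "A"), (6, "T", "C"), my_classifier)` would correspond to the example pair above, with positions shifted to 0-indexing and `my_classifier` standing in for the paper's classification rules.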

Barring a few pairs of mutations which exhibit positive epistasis in all/nearly all high fitness backgrounds, the nature of epistasis between a pair of mutations is strongly dependent on the genetic background (Figure 1 and Supplement Table 1 and Supplement Data-File2). In high fitness backgrounds, mutation pairs exhibit positive epistasis most frequently (median 41% of the genotypes), followed by negative epistasis (median 23%) and no epistasis (median 16%), with sign epistasis being relatively rare (Figure 1B). In low fitness backgrounds, mutation pairs exhibit no epistasis most frequently (median 30%), followed by negative epistasis (median 22%), positive epistasis (median 21%) and other sign epistasis (median 13%) (Figure 1C).

Because of this contingency of the nature of epistasis between two mutations on the genetic background, we propose that epistasis is “fluid”.

Functionally important sites cause switch in epistasis more frequently

It has previously been shown that both the secondary structure and location of a residue in a protein dictate the nature of epistasis exhibited by residues⁵³. In Figure 1, we saw that epistasis changes with genomic background. But, are certain positions more robust to changes in the nature of epistasis than others? In other words, does the location of a mutation in the genetic background dictate the likelihood of epistasis change?

The effect of a locus X on changing the nature of epistasis between two mutations was quantified as shown in Figure 2A. As shown in Figure 2B and Figure 2C, upon the introduction of a single mutation at locus X, the nature of epistasis switches in more than 50% of cases. The contributions of different loci differ. Positions 4 and 5 on the landscape, which are critical for protein function, are more responsible for changing the nature of epistasis between two mutations than other sites on the landscape (Supplement Table 2 and Supplement Table 3). This control of the nature of epistasis between other sites by functionally important sites is likely an important factor controlling protein evolution.

Different sites on the landscape dictate change in epistasis differently.
**(A)** To determine the impact of mutation at locus X in changing the nature of epistasis between two mutations (indicated by a red circle and a green square), we took the following approach. For a particular pair of mutations, the six sites (indicated by N) were fixed, and the nature of epistasis recorded before and after a mutation at X. This process was repeated for all 4⁶ backgrounds, and the fraction of genomes in which, upon introduction of a mutation at X, epistasis between the mutations (red circle and green square) changed, was noted. This gives us f, as indicated in the Figure. This process was repeated for all possible pairs of mutations (red circle and green square); giving a distribution of f for locus X. This distribution of f is shown in **(B)** for high fitness backgrounds and **(C)** for low fitness backgrounds. Functionally important sites (4 and 5) cause change in epistasis more frequently than others.

The switch of the nature of epistasis is most frequent to positive, negative, or no epistasis (Supplement Figure S6 and S7) (also see Supplement Data-File3). Switch to sign epistasis is relatively infrequent. Interestingly, in high fitness backgrounds, mutations at functionally important positions (4 and 5, on the landscape) cause a switch to sign-epistasis more frequently, as compared to a mutation at any of the other seven positions. This pattern is not seen in low fitness backgrounds.

The pervasiveness of the change in epistatic interactions is surprising, and makes prediction of evolutionary trajectories harder. However, we also note that a change in the nature of epistasis due to a mutation is not unique to folA. An analysis of a previously published five-point landscape of the beta-lactamase gene²⁹ shows that a change in the nature of epistasis is common in intramolecular landscapes (Supplement Figure S8). The above results further emphasize that epistasis is “fluid”, and that different sites control the switch of epistasis differently.

Synonymous mutations can cause change of nature of epistasis

While historically synonymous mutations were thought to be neutral^54,55, several studies have demonstrated that they can have a wide range of effects on cellular fitness^56–59. However, their role in changing epistasis between two mutations has not been studied. To study this in the folA landscape, we note that any two mutations lie in either one or two codons in the folA sequence. Depending on the location of these two mutations, one or two codons remain unaffected by them. We introduce all possible synonymous mutations in the unaffected codon(s) and ask how frequently the nature of epistasis changes. Through this, we seek the likelihood that a synonymous mutation will change the nature of epistasis between two mutations (Figure 3A).

Introduction of a synonymous mutation changes the nature of epistasis between two interacting mutations.
**(A)** For a given sequence, two mutations (red circle and green square) could occur in either one or two codons, leaving two or one unaltered codons, respectively (underlined in figure). All possible synonymous mutations were introduced in the unaltered codon(s) and the change in the nature of epistasis (if any) between the two mutations was noted. This was done for all backgrounds and all pairs of mutations. Probability that a synonymous mutation at locus X leads to a change in the nature of epistasis between two mutations in **(B)** high and **(C)** low fitness backgrounds is shown. Mutations resulting in TAA ↔ TGA and TAA ↔ TAG are considered synonymous mutations.

In our analysis, we consider TAA ↔ TGA and TAA ↔ TAG as synonymous mutations (see discussion). All three codons encode a stop signal for translation.

Our results show that synonymous changes can frequently change the nature of epistasis (Figures 3B and 3C). In high fitness backgrounds, the likelihood of a change in the nature of epistasis upon introduction of a synonymous mutation is ∼0.45 (Figure 3B). In low fitness backgrounds, this number is ∼0.75 (Figure 3C).

These results clearly demonstrate that change of nature of epistasis can take place via the acquisition of even a synonymous mutation, reaffirming “fluidity” in the nature of epistasis. This property of epistasis makes predicting evolution difficult. However, in recent years, epistasis has been shown to exhibit several statistical patterns, collectively termed as global epistasis¹. We next check how these patterns hold on the folA landscape.

Mutations on the landscape exhibit diminishing returns and increasing costs

Global epistasis suggests that the fitness effect of a mutation is a decreasing function of the background fitness^7,60,61 (Figure 4A). In Figure 4B, a point represents the fitness effect of a single mutation (y-axis) against the fitness of the genotype in which the mutation is introduced (x-axis) on the folA landscape. This is done for all mutations in all backgrounds. The line in red indicates the linear fit between the two variables. In both high and low fitness backgrounds, the negative slope of the line indicates the presence of global epistasis, although the correlation is not very strong (R² = 0.1284 for high fitness backgrounds, and R² = 0.1236 for low fitness backgrounds).

The *folA* fitness landscape exhibits weak patterns of global epistasis.
**(A)** Cartoon to illustrate statistical patterns of global epistasis. The beneficial effect of a mutation decreases with an increase in background fitness (Diminishing Returns Epistasis). Beyond a certain background fitness (Pivot Point), the mutation becomes deleterious and its deleterious effects increase as the fitness of the genetic background increases (Increasing Costs Epistasis). **(B)** Mutational effects vs. the background fitness show weak correlation for low fitness (R² = 0.1236) (left of dotted line) and high fitness (R² = 0.1284) (right of dotted line) backgrounds. The red lines show the best linear fits for the two groups.

Only a small fraction of mutations exhibit global epistasis/ “binary” nature of global epistasis

We next investigate the effect on fitness of each of the 108 (12 possible mutations at each of the 9 sites) mutations on all possible genotypes. Figure 5 shows that only a few mutations exhibit strong patterns of global epistasis. These mutations are primarily (14 out of 16) at positions (nucleotide 4 and 5) which are functionally most important for folA⁴⁶. An overwhelming majority of mutations (77/108) exhibit no correlation (R² < 0.2) between their fitness effect and the fitness of the background on which they occur (Supplement Figure S9 and Supplement Table S4).

A small fraction of mutations follow global epistasis.
Only 16 (14 of which are at position 4 and 5) (highlighted in red boxes) out of 108 possible mutations exhibit statistically significant patterns (R² > 0.4) of global epistasis. See Supplement Table 1 for statistics for each mutation.

Therefore, mutations exhibit one of two states depending on whether they follow global epistasis, or not. This indicates the “binary” nature of mutations. In the case of folA landscape, only mutations at nucleotide positions critical for function exhibit global epistasis.

A recent work⁶¹ demonstrated that the growth rate at which any mutation switches from being beneficial to deleterious is conserved for all mutations. This growth rate, around which the nature of a mutation changes from being beneficial to deleterious, was referred to as the pivot growth rate. While the mechanistic origins of the pivot growth rate are yet unknown, this phenomenon likely represents a deep fundamental “rule” of how a cell works. We test the existence of the pivot growth rate for each of the 108 mutations in the folA landscape. Most (∼80%) mutations pivot from being beneficial to deleterious at a growth rate of −0.657 ± 0.0657 (Figure 6 and Supplement Table 4). The presence of a pivot growth rate despite poor statistical correlations between background fitness and the fitness effect of a mutation is a surprising observation.

Most mutations switch from being beneficial to deleterious at the pivot point.
Background fitness (y-axis) at which each of the 108 mutations (x-axis) switches from being beneficial to deleterious. The black solid line shows the average fitness (−0.657) (or pivot) at which a mutation changes from being beneficial to deleterious. Individual bars show the deviation from the mean pivot point for each mutation. About 80% (86 out of 108) of the mutations exhibit the pivot fitness in the range −0.657 ± 0.0657. The dotted line in red indicates the growth rate used in Papkou et al⁴⁰ to differentiate between high and low fitness variants.

Predicting DFE from phenotype

The “binary” and “fluid” nature of epistasis discussed in this work makes prediction of evolutionary trajectories difficult. As shown in Figure 5, only functionally important sites exhibit global epistasis. For most mutations, fitness effects are idiosyncratic. Hence, in the absence of knowledge of the sites which exhibit global epistasis, predictions are likely to be inaccurate. Regarding the “fluidity” of epistasis, in the absence of complete knowledge of the genetic background, it is barely possible to comment on the nature of epistasis between two mutations. Thus, these two features of epistasis make evolution unpredictable. We now ask if there exists any statistical pattern at all which would enable the prediction of evolution.

In this context, we define a quantity called “phenotypic DFE”, which represents the collective DFE of all genotypes exhibiting near identical fitness (Figure 7A). To compute the phenotypic DFE, we binned all genotypes in non-overlapping narrow fitness intervals (see methods). DFE of each genotype was computed by introducing all possible 27 (by introducing all three mutations at each of the 9 positions for a given genotype) mutations. DFE of all genotypes in one interval was averaged to obtain the phenotypic DFE. We next ask how robustly the DFE of a particular genotype of near identical fitness can be predicted solely from the phenotypic DFE.

“Phenotypic DFE” is a predictor of DFE.
**(A)** Backgrounds with near identical fitness were randomly distributed in two groups comprising 90% and 10% of all backgrounds. The first group was used to define a “phenotypic DFE” (a mean DFE of all genomes in the group). The phenotypic DFE was compared with individual DFEs in the 10% group, and the p-value distribution of this comparison was obtained. The process was repeated for genotypes with other fitness. **(B)** Phenotypic DFE of high fitness backgrounds exhibited two peaks. The first peak was at fitness effect ∼ 0. The second peak comprised deleterious mutations, whose magnitude increased with increasing background fitness. **(C)** Phenotypic DFE of low fitness backgrounds exhibited a single peak, whose mean decreased with increasing background fitness. **(D)** Percent of DFEs in a fitness window with p-value < 0.05 when compared with the Phenotypic DFE. As fitness increases, Phenotypic DFE becomes a better predictor of DFE of a genotype.

To test this, we compute the phenotypic DFE from randomly sampled 90% of the genotypes with fitness between f_o and f_o + Δ. The DFE of the remaining 10% of the genotypes in a fitness window was computed separately, and each DFE was compared with the phenotypic DFE. The likelihood of equivalence of each genotype’s DFE with the corresponding phenotypic DFE was estimated as the p-value of the Mann-Whitney U test. This comparison gives us a distribution of p-values, for each background fitness.
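A minimal sketch of this 90%/10% comparison for a single fitness window is given below. The function names, the random seed, and the choice of a two-sided alternative are assumptions made for illustration; the text does not state which alternative was used. The reference distribution is formed by pooling the fitness effects of the 90% group, and `scipy.stats.mannwhitneyu` is the test named in the Methods.

```python
import random

from scipy.stats import mannwhitneyu


def compare_with_phenotypic_dfe(dfes, train_fraction=0.9, seed=0):
    """Return one p-value per held-out genotype in a fitness window.

    dfes: list of per-genotype DFEs (each a list of fitness effects), all
    taken from genotypes whose fitness lies in the same window.
    """
    rng = random.Random(seed)
    order = list(range(len(dfes)))
    rng.shuffle(order)
    cut = round(train_fraction * len(order))
    # Pool the effects of the 90% group into the phenotypic DFE.
    reference = [s for i in order[:cut] for s in dfes[i]]
    p_values = []
    for i in order[cut:]:
        # Two-sided alternative is an assumption of this sketch.
        result = mannwhitneyu(dfes[i], reference, alternative="two-sided")
        p_values.append(result.pvalue)
    return p_values


def fraction_significant(p_values, alpha=0.05):
    """Fraction of held-out DFEs that differ from the phenotypic DFE."""
    return sum(p < alpha for p in p_values) / len(p_values)
```

Repeating this for every fitness window and plotting `fraction_significant` against the window's fitness would give a curve of the kind summarised in Figure 7D.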

The phenotypic DFE of high fitness backgrounds comprised two peaks. The first peak corresponds to mutations with large deleterious effects, whose magnitude increases with increasing background fitness (Figure 7B). The second peak is roughly centered at fitness effect ∼ 0. For low fitness backgrounds, the DFE comprised only one peak, whose mean decreased as the background fitness increased (Figure 7C).

The background fitness of a genotype predicts its phenotypic DFE better than it predicts the fitness effects of individual mutations. The fraction of genomes for which the phenotypic DFE is statistically significantly different from the actual DFE reduces as the background fitness increases (Figure 7D). This effect can also be seen from the distribution of p-values of comparisons between phenotypic DFE and actual DFEs, for different background fitness (Supplement Figure S10).

Discussion

Genotype-phenotype mapping, especially of proteins, is of great interest in evolutionary biology, cell biology, and genetics^{11,46,62–66}. The underlying rules that dictate this mapping are a combination of individual mutation effects, epistatic interactions between the mutations, and between the mutations and the genetic background they occur in^{7,29,35,60,61,67–70}. The extent and ubiquity of epistatic interactions is of particular interest, because they are mostly unpredictable and have a direct effect on the shape of the fitness landscape, and consequently, adaptation^{34,38,71–76}. Therefore, in order to understand the statistical rules which govern epistasis on a landscape, we analyzed a recently published folA landscape which was constructed by quantifying fitness of more than 260,000 sequences in a 9-base pair region⁴⁰. Mutations in these nine base pairs have been reported to be adaptive^47,49.

Navigability of fitness landscapes decides the likelihood of a population to reach the global fitness maximum. Most fitness landscapes indicate that protein landscapes are rugged, and hence, protein evolution is constrained^{47,70,77–79}. But, fitness landscapes are constructed by considering a handful of sites, and there is theoretical evidence to suggest that mutations in other “dimensions” could enable populations to navigate valleys in the landscape¹¹. Fisher suggested the same in his correspondence with Wright⁸⁰.

A particularly confounding mode of epistasis is sign epistasis, which alters the qualitative nature of a mutational effect, and creates valleys in fitness landscapes. In the folA landscape, sign epistasis frequently changes to positive or negative epistasis, indicating “fluidity” in its effects. We also report that synonymous mutations can change the nature of epistasis between two existing mutations. Hence, even if valleys existed, they are not that difficult to navigate. Similar findings have been reported in the past - deleterious mutations making peaks accessible, and neutral mutations being adaptive^81,82. Additionally, synonymous mutations are known to have fitness effects by changing protein amount (by altering mRNA stability or translation rates)^56,83–87 or structure^88–92. Our results provide a novel mechanism via which synonymous mutations are relevant, for driving evolutionary change and controlling disease states^93–96.

Interestingly, our analysis shows that the nature of epistasis between two mutations can change by simply changing the stop codon (TAA ↔ TGA or TAA ↔ TAG). This observation holds even when the stop codon is encoded by the first three bases of the 9-bp landscape, and the interacting mutations whose nature of epistasis changes are in the subsequent two codons (which are presumably not even translated). This indicates that the change in the nature of epistasis likely has nothing to do with protein synthesis, but with the mRNA and its effect on cellular fitness.

Premature termination of translation has been known to destabilize the entire transcript⁹⁷. In eukaryotes, elaborate mechanisms are present to deal with potentially toxic effects of truncated proteins^98–101. However, prokaryotes lack these mechanisms. In fact, a recent study shows that in E. coli under stress, despite a premature stop codon in the gene sequence, stop codon read-through rates may be as high as 80%, due to a high probability of a mismatch at a premature stop codon¹⁰². This is a likely explanation for change of epistasis (or change of fitness) due to alternate premature stop codons in folA, when E. coli is grown in antibiotic stress.

Neutral mutations can improve evolvability of a sequence, and hence aid the movement on a fitness landscape^103–108. However, we do not see any evolvability enhancing mutations in the 9-bp folA landscape (Supplement Figure S11).

Predicting evolution is a longstanding goal in the field, and global epistasis patterns offered a tool to predict epistatic effects. However, our analyses show that global epistasis often does not hold. Instead, we observe that sequences with similar fitness have similar DFEs, offering some predictability, based on a macroscopic trait, of the adaptive potential of a population^109–111. The means of these DFEs decreased linearly with an increase in background fitness, despite the mutations on this landscape not following global epistasis patterns (Supplement Figure S12).

We show that the navigability of a landscape can change via mutations in the same protein. Can it also change via mutations elsewhere on the genome? High-dimensional landscapes are necessary to answer this question. Additionally, increasing the dimensionality of the landscape is likely going to provide qualitatively new perspectives of adaptation.

Methods

Calculating Hamming Distance

To calculate the Hamming distance between two sequences of equal length, we compare the sequences and count the number of dissimilar loci.
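As a brief illustration of this definition, a sketch in Python (the function name is chosen here for illustration):

```python
def hamming_distance(seq_a, seq_b):
    """Count the loci at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b))
```

For example, `hamming_distance("ACGTACGTA", "ACGTTCGTC")` compares two 9-base sequences that differ at their fifth and ninth loci.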

Construction of sequence spaces and landscapes

In order to construct an n-base pair sequence space from a parent p-base pair sequence space (p > n), we choose any p − n loci in the parent sequence space and find all variants in the parent space in which the selected loci are fixed (these loci contain the same sequence). Each of these selected variants is assigned its neighbours at a Hamming distance of one to construct a new n-base sequence space.

By repeating this process over all combinations of selecting the loci to be fixed and the 4^(p−n) permutations of choosing the fixed sequence in each case, we are able to break the parent sequence space into n-base pair sequence spaces (Supplement Table 2).
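The decomposition described above can be sketched as an enumeration over the choice of fixed loci and the fixed sequence placed at those loci. The generator name and its return format are assumptions for illustration. Each parent space of length p yields C(p, p − n) × 4^(p − n) sub-spaces, each containing 4^n sequences.

```python
from itertools import combinations, product

BASES = "ACGT"


def sub_sequence_spaces(p, n):
    """Yield (fixed_loci, fixed_bases, sequences) for each n-base sub-space.

    p: length of the parent sequence space; n: number of loci left free.
    Loci are 0-indexed.
    """
    for fixed_loci in combinations(range(p), p - n):
        free_loci = [i for i in range(p) if i not in fixed_loci]
        for fixed_bases in product(BASES, repeat=p - n):
            sequences = []
            for free_bases in product(BASES, repeat=n):
                seq = [""] * p
                for i, b in zip(fixed_loci, fixed_bases):
                    seq[i] = b
                for i, b in zip(free_loci, free_bases):
                    seq[i] = b
                sequences.append("".join(seq))
            yield fixed_loci, fixed_bases, sequences
```

Within one sub-space, two sequences are neighbours when they differ at exactly one of the free loci, so the Hamming-distance-one neighbourhood follows directly from the sequence list.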

For our study, we use a nine-base pair parent sequence space to generate n-base pair sequence spaces (1 ≤ n ≤ 9). Landscapes are generated by assigning each sequence in a sequence space the fitness value it has in the empirical fitness landscape generated by Papkou et al. for the nine-base pair region of folA.

In rare cases where fitness value was not known, we disregarded those variants in all studies.

Finding peaks in a fitness landscape

Number of Peaks in Landscape

To count the number of peaks in fitness landscapes, we find the number of variants that have a higher fitness value than all of their neighbours at a Hamming distance of one in that landscape.

Peak Probability

To quantify the probability of encountering a fitness peak in any landscape, we find the ratio of number of peaks found to the total number of sequences in that fitness landscape.
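The two quantities defined above, the number of peaks and the peak probability, can be sketched as follows. The landscape is assumed to be a dictionary from sequence to fitness that contains only the measured variants of the (sub-)landscape in question, so neighbours outside it or without a fitness value are simply ignored; that handling of missing neighbours is an assumption of this sketch.

```python
BASES = "ACGT"


def neighbours(seq):
    """Yield all sequences at a Hamming distance of one from seq."""
    for i, current in enumerate(seq):
        for base in BASES:
            if base != current:
                yield seq[:i] + base + seq[i + 1:]


def find_peaks(landscape):
    """Return variants fitter than all of their measured one-step neighbours."""
    peaks = []
    for seq, fitness in landscape.items():
        measured = [landscape[nb] for nb in neighbours(seq) if nb in landscape]
        # Strict inequality: ties with a neighbour do not count as a peak.
        # A variant with no measured neighbours is counted as a peak here.
        if all(fitness > f for f in measured):
            peaks.append(seq)
    return peaks


def peak_probability(landscape):
    """Ratio of the number of peaks to the number of sequences."""
    return len(find_peaks(landscape)) / len(landscape)
```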

Expected number of peaks

In our case (as we are dealing with a 4-letter genome), the expected number of peaks in a maximally rugged (uncorrelated) NK landscape is given by 4^n / (3n + 1) for an n-base pair landscape²⁰, since each sequence has 3n one-Hamming-distance neighbours. The number of peaks predicted for NK landscapes is rounded off to the nearest integer.
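
As a worked illustration, the sketch below assumes the standard result for an uncorrelated landscape: a sequence with D neighbours is a peak with probability 1 / (D + 1), and with a 4-letter alphabet D = 3n, so the expected number of peaks is 4^n / (3n + 1). The use of this exact expression here is our assumption, and the function name is illustrative.

```python
def expected_peaks_uncorrelated(n: int) -> int:
    """Expected number of peaks in an uncorrelated n-base landscape,
    rounded to the nearest integer (assumes 3n neighbours per sequence)."""
    return round(4 ** n / (3 * n + 1))
```

Under this assumption, a two-base landscape is expected to contain 16 / 7 ≈ 2.3 peaks, which rounds to 2.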

Fitness effect of mutations

To find the fitness effect of a mutation acting on a genotype, we start with the set of all genotypes in the empirical Papkou et al. fitness landscape on which the mutation would produce a new sequence. For each such genotype, we find the background fitness f_b and the fitness of the resulting mutant f_m. The fitness effect of this mutation in this background is determined as the fitness difference between the mutant and the background (s = f_m − f_b).
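
The calculation of s = f_m − f_b over all backgrounds can be sketched as below. The sketch assumes that a mutation is specified as a locus (0-based) and the base it introduces, and that the landscape is a dictionary from sequence to fitness; the function name and the toy values are illustrative.

```python
def fitness_effects(fitness, locus, new_base):
    """Return {background: s} for the mutation that sets `locus` to `new_base`.

    Backgrounds that already carry new_base at that locus are skipped, as
    are backgrounds whose mutant has no known fitness.
    """
    effects = {}
    for background, f_b in fitness.items():
        if background[locus] == new_base:
            continue  # the mutation would not produce a new sequence
        mutant = background[:locus] + new_base + background[locus + 1:]
        if mutant in fitness:
            effects[background] = fitness[mutant] - f_b  # s = f_m - f_b
    return effects

# Toy two-base landscape with made-up fitness values ("CC" is unknown)
toy = {"AA": 0.0, "CA": 0.25, "AC": -0.5}
```

In the toy example, the mutation introducing `C` at the first locus has a defined effect only on the background `AA`, where s = 0.25.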

Finding phenotypic DFE

We found the distribution of fitness effects (DFE) for variants lying in a narrow slice of fitness values (for our study, the width of this slice was 0.05). The DFE was constructed from the frequency of the fitness effects of all mutations acting on the backgrounds lying in the selected range.

Finding genotypic DFE

For any genotype of length 9, we found the 9 loci × 3 bases = 27 mutations that result in a new sequence. The frequencies of the fitness effects of these mutations constitute the distribution of fitness effects for that genotype.
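
Both the genotypic and the phenotypic DFE can be sketched with the same helper. The example below is a minimal illustration on a toy two-base landscape with made-up fitness values; the function names, the dictionary representation, and the half-open fitness window [low, low + width) are assumptions.

```python
from itertools import product

BASES = "ACGT"

def genotype_dfe(fitness, genotype):
    """Fitness effects of all single-base mutations of one genotype."""
    f_b = fitness[genotype]
    effects = []
    for locus, base in enumerate(genotype):
        for alt in BASES:
            if alt == base:
                continue
            mutant = genotype[:locus] + alt + genotype[locus + 1:]
            if mutant in fitness:  # skip mutants with unknown fitness
                effects.append(fitness[mutant] - f_b)
    return effects

def phenotypic_dfe(fitness, low, width=0.05):
    """Pool the effects of all mutations on backgrounds with fitness in [low, low + width)."""
    effects = []
    for background, f_b in fitness.items():
        if low <= f_b < low + width:
            effects.extend(genotype_dfe(fitness, background))
    return effects

# Toy complete two-base landscape: fitness increases in steps of 1/16
toy = {"".join(s): i / 16 for i, s in enumerate(product(BASES, repeat=2))}
```

A two-base genotype has 2 × 3 = 6 single mutants; for a 9-base genotype with complete data, the same function returns 27 effects.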

Non-parametric tests

We found the p-values of the two-sample Kolmogorov–Smirnov (KS) test and the Mann–Whitney U test using the Python scipy.stats library.

Linear regression

We found the pivot point fitness of individual mutations via linear regression of the selection coefficient of the mutation against background fitness, using the “LinearRegression” model from the Python sklearn.linear_model library.
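
The regression step can be illustrated with a dependency-free sketch. It uses closed-form ordinary least squares in place of sklearn's LinearRegression, and it assumes that the pivot point is the background fitness at which the fitted line predicts a selection coefficient of zero; both choices are ours, made for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def pivot_point(background_fitness, selection_coefficients):
    """Background fitness at which the fitted selection coefficient is zero.

    Raises ZeroDivisionError if the fitted slope is exactly zero.
    """
    slope, intercept = fit_line(background_fitness, selection_coefficients)
    return -intercept / slope
```

For made-up data in which s falls from 0.75 to 0 as background fitness rises from −1 to 0.5, the fitted slope is −0.5 and the pivot point is 0.5.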

Epistasis as function of genetic background

Classifying Epistasis

To quantify the epistasis present in a mutation pair acting on two different genetic loci, we compute the fitness effect of the individual mutations on a genetic background and the cumulative fitness effect of the two mutations on the same genetic background. If the fitness effects of the individual mutations are s₁ and s₂, and the cumulative effect of the two mutations is s₁₂, then we classify the epistasis into the following categories:

No Epistasis: If |s₁₂ − (s₁ + s₂)| < 0.05 i.e. we assume no epistasis was present if the cumulative fitness effect of the two mutations was about the same as the two mutations acting independently on the genetic background.
Sign Epistasis: If s₁₂ × (s₁ + s₂) < 0 i.e. we classify sign epistasis if the cumulative effect of the two mutations leads to a fitness effect of a different sign than the sum of the individual fitness effects of the two mutations on the same background. For example, the combined effect of the mutation pair is beneficial even though the sum of the fitness effects of the two mutations on the same background is deleterious, or vice versa.
We further classified this epistasis into three categories:
1. Reciprocal Sign Epistasis: If s₁ < 0, s₂ < 0 and s₁₂ > 0 i.e. if the cumulative effect of the two mutations leads the background to a higher fitness, but both shortest paths to this higher fitness point are blocked to Darwinian evolution.
2. Single Sign Epistasis: If exactly one of s₁ < 0 or s₂ < 0 holds, and s₁₂ > 0 i.e. if exactly one of the two shortest paths leading the background to higher fitness is blocked to Darwinian evolution.
3. Other Sign Epistasis: All other cases classified as sign epistasis.
Positive Epistasis: If s₁₂ > s₁ + s₂ i.e. if the combined fitness effect of the two mutations was more beneficial / less deleterious than the sum of individual effects of the mutation pair on the genetic background.
Negative Epistasis: If s₁₂ < s₁ + s₂ i.e. if the combined fitness effect of the two mutations was less beneficial / more deleterious than the sum of individual effects of the mutation pair on the genetic background.
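
The categories above can be sketched as a single decision function. The order in which the tests are applied (no epistasis first, then sign epistasis, then positive or negative) is an assumption based on the order of the list; the function name and the returned labels are illustrative.

```python
def classify_epistasis(s1, s2, s12, tol=0.05):
    """Classify the epistasis of a mutation pair on one background."""
    additive = s1 + s2
    if abs(s12 - additive) < tol:
        return "none"
    if s12 * additive < 0:
        # The combined effect has the opposite sign to the additive expectation
        if s12 > 0 and s1 < 0 and s2 < 0:
            return "reciprocal sign"
        if s12 > 0 and (s1 < 0) != (s2 < 0):
            return "single sign"
        return "other sign"
    if s12 > additive:
        return "positive"
    return "negative"
```

For example, two individually deleterious mutations with s₁ = −0.2 and s₂ = −0.3 that are jointly beneficial (s₁₂ = 0.4) are classified as reciprocal sign epistasis.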

Epistasis Change with Genetic Background

Having generated the epistasis classification for all mutation pairs, we compiled all cases of Positive, Negative, Sign and No Epistasis. For each of these cases, we select the genetic backgrounds and their one-mutant neighbours such that the locus at which they differ is distinct from the two loci involved in the epistasis (in our case, we can find (9 − 2) loci × 3 bases = 21 such neighbours for each background).

We then check the nature of epistasis in each of these 21 neighbours on the same mutation pair, and quantify the number of these cases in which nature of epistasis changes.
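
The neighbour enumeration and the count of changes can be sketched as below. The argument `classify` is a hypothetical callable that returns the epistasis class of the chosen mutation pair on a given background; it stands in for the classification step and is not part of the source. Loci are 0-based.

```python
BASES = "ACGT"

def neighbours_outside_pair(background, locus_a, locus_b):
    """One-mutant neighbours that differ at a locus other than the two epistatic loci."""
    out = []
    for locus, base in enumerate(background):
        if locus in (locus_a, locus_b):
            continue
        for alt in BASES:
            if alt != base:
                out.append(background[:locus] + alt + background[locus + 1:])
    return out

def count_epistasis_changes(background, locus_a, locus_b, classify):
    """Return (number of neighbours whose class differs, number of neighbours)."""
    reference = classify(background)
    nbrs = neighbours_outside_pair(background, locus_a, locus_b)
    changed = sum(classify(nb) != reference for nb in nbrs)
    return changed, len(nbrs)
```

For a 9-base background, `neighbours_outside_pair` returns (9 − 2) × 3 = 21 sequences.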

Finding paths in a sequence space

Set of all variants

We start by listing all the mutations required to convert the starting sequence A to the target sequence T. If ℎ denotes the minimum number of mutations required to change sequence A to T, then for each step s ∈ {1, …, ℎ − 1}, we list all variants in the sequence space which are at Hamming distance s from the starting variant and Hamming distance ℎ − s from the target sequence. The resulting set includes all variants involved at each step of a shortest traversal from A to T.

Having the set of all variants at each step in traversal of sequence A to T, we recursively find all ℎ! permutations of paths such that each step only allows one base change while leading the sequence to the target in minimum number of steps.
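
The enumeration of shortest paths can be sketched as follows. Instead of the recursive search described above, this illustration uses itertools.permutations over the loci at which the two sequences differ, which yields the same ℎ! orderings; the function name is ours.

```python
from itertools import permutations

def shortest_paths(start, target):
    """Return every shortest mutational path from start to target.

    Each path is a list of sequences in which consecutive entries differ
    by exactly one base.
    """
    differing = [i for i, (a, b) in enumerate(zip(start, target)) if a != b]
    paths = []
    for order in permutations(differing):
        seq = list(start)
        path = [start]
        for locus in order:
            seq[locus] = target[locus]  # fix one differing locus per step
            path.append("".join(seq))
        paths.append(path)
    return paths
```

Two sequences that differ at two loci are connected by 2! = 2 shortest paths; sequences that differ at four loci are connected by 4! = 24.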

Effect of neutral mutations on evolvability

To perform this study, we compiled all the 9 loci × 4 bases = 36 mutations that are possible among the sequences of the folA landscape. We then listed all the backgrounds on which these mutations change the genotype but have a neutral fitness effect (magnitude of fitness effect < 0.05).

We then allow a second mutation which changes both the background and the mutant genotype. If the selection coefficient of the second mutation on the background is s₁ and on the neutral mutant is s, then the relative increase in evolvability is quantified as: .

Consider a variant A and a neutral mutation X which results in a variant B (A ≠ B, |s_AB| < 0.05). For any mutation Y taking place on genotypes A and B, forming A′ and B′ such that A ≠ A′ and B ≠ B′, we quantify the relative change in evolvability of A from mutation Y due to a neutral mutation X as .

Finding synonymous mutations

We identified all synonymous codons, i.e. the codons that encode the same amino acid or termination signal. For each codon c_i, we then listed all synonymous codons that are at a Hamming distance of 1 from c_i.

Using this data, we were able to identify the set of synonymous mutations for each codon.
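
A minimal sketch of this lookup is given below. It assumes the standard genetic code, written as the usual 64-letter amino acid string with bases ordered T, C, A, G and stop codons marked `*`; the names `CODON_TABLE` and `synonymous_neighbours` are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...; '*' marks stop
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def synonymous_neighbours(codon):
    """Codons at Hamming distance one that encode the same amino acid or stop."""
    out = []
    for i, base in enumerate(codon):
        for alt in BASES:
            if alt == base:
                continue
            other = codon[:i] + alt + codon[i + 1:]
            if CODON_TABLE[other] == CODON_TABLE[codon]:
                out.append(other)
    return out
```

Under this table, ATG (methionine) has no synonymous neighbour, whereas the stop codon TAA has two (TAG and TGA).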

Change in mode of epistasis due to synonymous mutation in an extrinsic codon

For any background in DHFR gene, we tested all mutation pairs (A and B) and identified the type of epistasis exhibited. Since these two mutations can mutate a minimum of one and a maximum of two codons in the 9 base pair gene, at least one codon remains un-mutated. For the mutation pair, we found the un-mutated codon(s), and derived the set of all possible synonymous mutations on the codon(s) of the given background.

We noted the number of instances where introduction of a synonymous mutation (X) at a particular locus (on an un-mutated codon) does or does not change the nature of epistasis for the background and given mutation pair.

We did this analysis separately for all, high-fitness, and low-fitness backgrounds, with their possible mutation pairs (A and B) and their respective synonymous mutations (X), to identify the overall probability of epistasis change due to a synonymous mutation at each locus.
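
The first step of this analysis, finding the codon(s) left un-mutated by a mutation pair, can be sketched as below. The sketch assumes 0-based loci and a reading frame in which the three codons occupy loci 0 to 2, 3 to 5 and 6 to 8; the function name is illustrative.

```python
def untouched_codons(locus_a, locus_b, n_codons=3):
    """Indices of the codons that contain neither of the two mutated loci."""
    touched = {locus_a // 3, locus_b // 3}
    return [codon for codon in range(n_codons) if codon not in touched]
```

A pair of mutations within the first codon leaves two codons available for synonymous mutations, while a pair spanning the first and third codons leaves only the second.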

Codes

All code used in this work and the Supplement Data Files are available at: https://github.com/SainiSupreet/Ecoli-folA-DHFR. The “readme” file in the repository gives details of how to run the code.

Acknowledgements

We thank Christian Landry and Krishna Swamy for feedback on the manuscript.

Funding

This work was funded by a Wellcome Trust/DBT (India Alliance) grant (Award Number: IA/S/19/2/504632) to SS. NMR was funded by Prime Minister’s Research Fellowship (PMRF ID 1301163).

Additional files

Supplement figures and tables.

References

1. Johnson M. S., Reddy G., Desai M. M. (2023) Epistasis and evolution: recent advances and an outlook for prediction. BMC Biol 21:120. https://doi.org/10.1186/s12915-023-01585-3
2. Bank C. (2022) Epistasis and Adaptation on Fitness Landscapes. Annual Review of Ecology, Evolution, and Systematics 53.
3. Ostman B., Hintze A., Adami C. (2012) Impact of epistasis and pleiotropy on evolutionary adaptation. Proc Biol Sci 279:247–256. https://doi.org/10.1098/rspb.2011.0870
4. Good B. H., Desai M. M. (2015) The impact of macroscopic epistasis on long-term evolutionary dynamics. Genetics 199:177–190. https://doi.org/10.1534/genetics.114.172460
5. Johnson M. S., Desai M. M. (2022) Mutational robustness changes during long-term adaptation in laboratory budding yeast populations. Elife 11. https://doi.org/10.7554/eLife.76491
6. Wunsche A., et al. (2017) Diminishing-returns epistasis decreases adaptability along an evolutionary trajectory. Nat Ecol Evol 1. https://doi.org/10.1038/s41559-016-0061
7. Kryazhimskiy S., Rice D. P., Jerison E. R., Desai M. M. (2014) Microbial evolution. Global epistasis makes adaptation predictable despite sequence-level stochasticity. Science 344:1519–1522. https://doi.org/10.1126/science.1250939
8. Park Y., Metzger B. P. H., Thornton J. W. (2024) The simplicity of protein sequence-function relationships. Nat Commun 15:7953. https://doi.org/10.1038/s41467-024-51895-5
9. Starr T. N., Thornton J. W. (2016) Epistasis in protein evolution. Protein Sci 25:1204–1218. https://doi.org/10.1002/pro.2897
10. de Visser J. A., Cooper T. F., Elena S. F. (2011) The causes of epistasis. Proc Biol Sci 278:3617–3624. https://doi.org/10.1098/rspb.2011.1537
11. Greenbury S. F., Louis A. A., Ahnert S. E. (2022) The structure of genotype-phenotype maps makes fitness landscapes navigable. Nat Ecol Evol 6:1742–1752. https://doi.org/10.1038/s41559-022-01867-z
12. Srivastava M., Payne J. L. (2022) On the incongruence of genotype-phenotype and fitness landscapes. PLoS Comput Biol 18:e1010524. https://doi.org/10.1371/journal.pcbi.1010524
13. de Visser J. A., Krug J. (2014) Empirical fitness landscapes and the predictability of evolution. Nat Rev Genet 15:480–490. https://doi.org/10.1038/nrg3744
14. Blanquart F., Bataillon T. (2016) Epistasis and the Structure of Fitness Landscapes: Are Experimental Fitness Landscapes Compatible with Fisher’s Geometric Model? Genetics 203:847–862. https://doi.org/10.1534/genetics.115.182691
15. Fraisse C., Welch J. J. (2019) The distribution of epistasis on simple fitness landscapes. Biol Lett 15:20180881. https://doi.org/10.1098/rsbl.2018.0881
16. Diaz-Colunga J., et al. (2023) Global epistasis on fitness landscapes. Philos Trans R Soc Lond B Biol Sci 378:20220053. https://doi.org/10.1098/rstb.2022.0053
17. Park Y., Metzger B. P. H., Thornton J. W. (2024) The simplicity of protein sequence-function relationships. bioRxiv. https://doi.org/10.1101/2023.09.02.556057
18. Van Cleve J., Weissman D. B. (2015) Measuring ruggedness in fitness landscapes. Proc Natl Acad Sci U S A 112:7345–7346. https://doi.org/10.1073/pnas.1507916112
19. Meger A. T., et al. (2024) Rugged fitness landscapes minimize promiscuity in the evolution of transcriptional repressors. Cell Syst 15:374–387. https://doi.org/10.1016/j.cels.2024.03.002
20. Kauffman S., Levin S. (1987) Towards a general theory of adaptive walks on rugged landscapes. J Theor Biol 128:11–45. https://doi.org/10.1016/s0022-5193(87)80029-2
21. Hayashi Y., et al. (2006) Experimental rugged fitness landscape in protein sequence space. PLoS One 1:e96. https://doi.org/10.1371/journal.pone.0000096
22. Neidhart J., Szendro I. G., Krug J. (2014) Adaptation in tunably rugged fitness landscapes: the rough Mount Fuji model. Genetics 198:699–721. https://doi.org/10.1534/genetics.114.167668
23. Saakian D. B., Fontanari J. F. (2009) Evolutionary dynamics on rugged fitness landscapes: Exact dynamics and information theoretical aspects. Physical Review E 80.
24. Carneiro M., Hartl D. L. (2010) Colloquium papers: Adaptive landscapes and protein evolution. Proc Natl Acad Sci U S A 107:1747–1751. https://doi.org/10.1073/pnas.0906192106
25. Franke J., Klozer A., de Visser J. A., Krug J. (2011) Evolutionary accessibility of mutational pathways. PLoS Comput Biol 7:e1002134. https://doi.org/10.1371/journal.pcbi.1002134
26. de Visser J. A., Park S. C., Krug J. (2009) Exploring the effect of sex on empirical fitness landscapes. Am Nat 174:S15–30. https://doi.org/10.1086/599081
27. Hall D. W., Agan M., Pope S. C. (2010) Fitness epistasis among 6 biosynthetic loci in the budding yeast Saccharomyces cerevisiae. J Hered 101:S75–84. https://doi.org/10.1093/jhered/esq007
28. Wright S. (1932) The roles of mutation, inbreeding, crossbreeding, and selection in evolution. Proceedings of the Sixth International Congress on Genetics 1:355–366.
29. Weinreich D. M., Delaney N. F., Depristo M. A., Hartl D. L. (2006) Darwinian evolution can follow only very few mutational paths to fitter proteins. Science 312:111–114. https://doi.org/10.1126/science.1123539
30. Smith J. M. (1970) Natural selection and the concept of a protein space. Nature 225:563–564. https://doi.org/10.1038/225563a0
31. Malcolm B. A., Wilson K. P., Matthews B. W., Kirsch J. F., Wilson A. C. (1990) Ancestral lysozymes reconstructed, neutrality tested, and thermostability linked to hydrocarbon packing. Nature 345:86–89. https://doi.org/10.1038/345086a0
32. de Visser J., Hoekstra R. F., van den Ende H. (1997) Test of Interaction between Genetic Markers That Affect Fitness in Aspergillus Niger. Evolution 51:1499–1505. https://doi.org/10.1111/j.1558-5646.1997.tb01473.x
33. Kouyos R. D., Silander O. K., Bonhoeffer S. (2007) Epistasis between deleterious mutations and the evolution of recombination. Trends Ecol Evol 22:308–315. https://doi.org/10.1016/j.tree.2007.02.014
34. Hietpas R. T., Jensen J. D., Bolon D. N. (2011) Experimental illumination of a fitness landscape. Proc Natl Acad Sci U S A 108:7896–7901. https://doi.org/10.1073/pnas.1016024108
35. Starr T. N., Flynn J. M., Mishra P., Bolon D. N. A., Thornton J. W. (2018) Pervasive contingency and entrenchment in a billion years of Hsp90 evolution. Proc Natl Acad Sci U S A 115:4453–4458. https://doi.org/10.1073/pnas.1718133115
36. Gorter F. A., Aarts M. G. M., Zwaan B. J., de Visser J. (2018) Local Fitness Landscapes Predict Yeast Evolutionary Dynamics in Directionally Changing Environments. Genetics 208:307–322. https://doi.org/10.1534/genetics.117.300519
37. Schenk M. F., Szendro I. G., Salverda M. L., Krug J., de Visser J. A. (2013) Patterns of Epistasis between beneficial mutations in an antibiotic resistance gene. Mol Biol Evol 30:1779–1787. https://doi.org/10.1093/molbev/mst096
38. Buda K., Miton C. M., Tokuriki N. (2023) Pervasive epistasis exposes intramolecular networks in adaptive enzyme evolution. Nat Commun 14:8508. https://doi.org/10.1038/s41467-023-44333-5
39. Li C., Qian W., Maclean C. J., Zhang J. (2016) The fitness landscape of a tRNA gene. Science 352:837–840. https://doi.org/10.1126/science.aae0568
40. Papkou A., Garcia-Pastor L., Escudero J. A., Wagner A. (2023) A rugged yet easily navigable fitness landscape. Science 382:eadh3860. https://doi.org/10.1126/science.adh3860
41. Flynn J. M., et al. (2022) Comprehensive fitness landscape of SARS-CoV-2 M(pro) reveals insights into viral resistance mechanisms. Elife 11. https://doi.org/10.7554/eLife.77433
42. Wu N. C., Dai L., Olson C. A., Lloyd-Smith J. O., Sun R. (2016) Adaptation in protein fitness landscapes is facilitated by indirect paths. Elife 5. https://doi.org/10.7554/eLife.16965
43. Pokusaeva V. O., et al. (2019) An experimental assay of the interactions of amino acids from orthologous sequences shaping a complex fitness landscape. PLoS Genet 15:e1008079. https://doi.org/10.1371/journal.pgen.1008079
44. Karapanagioti F., Atlason U. A., Slotboom D. J., Poolman B., Obermaier S. (2024) Fitness landscape of substrate-adaptive mutations in evolved amino acid-polyamine-organocation transporters. Elife 13. https://doi.org/10.7554/eLife.93971
45. Flynn J. M., et al. (2020) Comprehensive fitness maps of Hsp90 show widespread environmental dependence. Elife 9. https://doi.org/10.7554/eLife.53810
46. Bershtein S., Choi J. M., Bhattacharyya S., Budnik B., Shakhnovich E. (2015) Systems-level response to point mutations in a core metabolic enzyme modulates genotype-phenotype relationship. Cell Rep 11:645–656. https://doi.org/10.1016/j.celrep.2015.03.051
47. Tamer Y. T., et al. (2019) High-Order Epistasis in Catalytic Power of Dihydrofolate Reductase Gives Rise to a Rugged Fitness Landscape in the Presence of Trimethoprim Selection. Mol Biol Evol 36:1533–1550. https://doi.org/10.1093/molbev/msz086
48. Matthews D. A., et al. (1977) Dihydrofolate reductase: x-ray structure of the binary complex with methotrexate. Science 197:452–455. https://doi.org/10.1126/science.17920
49. Benkovic S. J., Fierke C. A., Naylor A. M. (1988) Insights into enzyme function from studies on mutants of dihydrofolate reductase. Science 239:1105–1110. https://doi.org/10.1126/science.3125607
50. Schnell J. R., Dyson H. J., Wright P. E. (2004) Structure, dynamics, and catalytic function of dihydrofolate reductase. Annu Rev Biophys Biomol Struct 33:119–140. https://doi.org/10.1146/annurev.biophys.33.110502.133613
51. Toprak E., et al. (2011) Evolutionary paths to antibiotic resistance under dynamically sustained drug selection. Nat Genet 44:101–105. https://doi.org/10.1038/ng.1034
52. Phillips P. C. (2008) Epistasis--the essential role of gene interactions in the structure and evolution of genetic systems. Nat Rev Genet 9:855–867. https://doi.org/10.1038/nrg2452
53. Gonzalez C. E., Ostermeier M. (2019) Pervasive Pairwise Intragenic Epistasis among Sequential Mutations in TEM-1 beta-Lactamase. J Mol Biol 431:1981–1992. https://doi.org/10.1016/j.jmb.2019.03.020
54. Kimura M. (1968) Genetic variability maintained in a finite population due to mutational production of neutral and nearly neutral isoalleles. Genet Res 11:247–269. https://doi.org/10.1017/s0016672300011459
55. King J. L., Jukes T. H. (1969) Non-Darwinian evolution. Science 164:788–798. https://doi.org/10.1126/science.164.3881.788
56. Agashe D., Martinez-Gomez N. C., Drummond D. A., Marx C. J. (2013) Good codons, bad transcript: large reductions in gene expression and fitness arising from synonymous mutations in a key enzyme. Mol Biol Evol 30:549–560. https://doi.org/10.1093/molbev/mss273
57. Kristofich J., et al. (2018) Synonymous mutations make dramatic contributions to fitness when growth is limited by a weak-link enzyme. PLoS Genet 14:e1007615. https://doi.org/10.1371/journal.pgen.1007615
58. Lebeuf-Taylor E., McCloskey N., Bailey S. F., Hinz A., Kassen R. (2019) The distribution of fitness effects among synonymous mutations in a gene under directional selection. Elife 8. https://doi.org/10.7554/eLife.45952
59. Walsh I. M., Bowman M. A., Soto Santarriaga I. F., Rodriguez A., Clark P. L. (2020) Synonymous codon substitutions perturb cotranslational protein folding in vivo and impair cell fitness. Proc Natl Acad Sci U S A 117:3528–3534. https://doi.org/10.1073/pnas.1907126117
60. Johnson M. S., Martsul A., Kryazhimskiy S., Desai M. M. (2019) Higher-fitness yeast genotypes are less robust to deleterious mutations. Science 366:490–493. https://doi.org/10.1126/science.aay4199
61. Ardell S., Martsul A., Johnson M. S., Kryazhimskiy S. (2024) Environment-independent distribution of mutational effects emerges from microscopic epistasis. bioRxiv. https://doi.org/10.1101/2023.11.18.567655
62. Ahnert S. E. (2017) Structural properties of genotype-phenotype maps. J R Soc Interface 14. https://doi.org/10.1098/rsif.2017.0275
63. Mak H. C., Justman Q. (2017) Genotype-Phenotype Mapping Meets Single Cell Biology. Cell Syst 4:1–2. https://doi.org/10.1016/j.cels.2017.01.008
64. Pigliucci M. (2010) Genotype-phenotype mapping and the end of the ’genes as blueprint’ metaphor. Philos Trans R Soc Lond B Biol Sci 365:557–566. https://doi.org/10.1098/rstb.2009.0241
65. Alberch P. (1991) From genes to phenotype: dynamical systems and evolvability. Genetica 84:5–11. https://doi.org/10.1007/BF00123979
66. Costanzo M., et al. (2010) The genetic landscape of a cell. Science 327:425–431. https://doi.org/10.1126/science.1180823
67. Hinz A., Amado A., Kassen R., Bank C., Wong A. (2024) Unpredictability of the Fitness Effects of Antimicrobial Resistance Mutations Across Environments in Escherichia coli. Mol Biol Evol 41. https://doi.org/10.1093/molbev/msae086
68. Rauscher R., et al. (2021) Positive epistasis between disease-causing missense mutations and silent polymorphism with effect on mRNA translation velocity. Proc Natl Acad Sci U S A 118. https://doi.org/10.1073/pnas.2010612118
69. Khan A. I., Dinh D. M., Schneider D., Lenski R. E., Cooper T. F. (2011) Negative epistasis between beneficial mutations in an evolving bacterial population. Science 332:1193–1196. https://doi.org/10.1126/science.1203801
70. Weinreich D. M., Watson R. A., Chao L. (2005) Perspective: Sign epistasis and genetic constraint on evolutionary trajectories. Evolution 59:1165–1174.
71. Sailer Z. R., Harms M. J. (2017) Molecular ensembles make evolution unpredictable. Proc Natl Acad Sci U S A 114:11938–11943. https://doi.org/10.1073/pnas.1711927114
72. Morris S. C. (2010) Evolution: like any other science it is predictable. Philos Trans R Soc Lond B Biol Sci 365:133–145. https://doi.org/10.1098/rstb.2009.0154
73. Harms M. J., Thornton J. W. (2014) Historical contingency and its biophysical basis in glucocorticoid receptor evolution. Nature 512:203–207. https://doi.org/10.1038/nature13410
74. Miton C. M., Tokuriki N. (2016) How mutational epistasis impairs predictability in protein evolution and design. Protein Sci 25:1260–1272. https://doi.org/10.1002/pro.2876
75. Lassig M., Mustonen V., Walczak A. M. (2017) Predicting evolution. Nat Ecol Evol 1:77. https://doi.org/10.1038/s41559-017-0077
76. Sarkisyan K. S., et al. (2016) Local fitness landscape of the green fluorescent protein. Nature 533:397–401. https://doi.org/10.1038/nature17995
77. Starr T. N., et al. (2022) ACE2 binding is an ancestral and evolvable trait of sarbecoviruses. Nature 603:913–918. https://doi.org/10.1038/s41586-022-04464-z
78. Park Y., Metzger B. P. H., Thornton J. W. (2022) Epistatic drift causes gradual decay of predictability in protein evolution. Science 376:823–830. https://doi.org/10.1126/science.abn6895
79. Macken C. A., Perelson A. S. (1989) Protein evolution on rugged landscapes. Proc Natl Acad Sci U S A 86:6191–6195. https://doi.org/10.1073/pnas.86.16.6191
80. Provine W. B. (1986) Sewall Wright and Evolutionary Biology. Chicago University Press.
81. Despres P. C., et al. (2024) Compensatory mutations potentiate constructive neutral evolution by gene duplication. bioRxiv. https://doi.org/10.1101/2024.02.12.579783
82. Douglas S. M., Chubiz L. M., Harcombe W. R., Marx C. J. (2017) Identification of the potentiating mutations and synergistic epistasis that enabled the evolution of inter-species cooperation. PLoS One 12:e0174345. https://doi.org/10.1371/journal.pone.0174345
83. Ando H., Miyoshi-Akiyama T., Watanabe S., Kirikae T. (2014) A silent mutation in mabA confers isoniazid resistance on Mycobacterium tuberculosis. Mol Microbiol 91:538–547. https://doi.org/10.1111/mmi.12476
84. Kershner J. P., et al. (2016) A Synonymous Mutation Upstream of the Gene Encoding a Weak-Link Enzyme Causes an Ultrasensitive Response in Growth Rate. J Bacteriol 198:2853–2863. https://doi.org/10.1128/JB.00262-16
85. Kudla G., Murray A. W., Tollervey D., Plotkin J. B. (2009) Coding-sequence determinants of gene expression in Escherichia coli. Science 324:255–258. https://doi.org/10.1126/science.1170160
86. Goodman D. B., Church G. M., Kosuri S. (2013) Causes and effects of N-terminal codon bias in bacterial genes. Science 342:475–479. https://doi.org/10.1126/science.1241934
87. Bailey S. F., Alonso Morales L. A., Kassen R. (2021) Effects of Synonymous Mutations beyond Codon Bias: The Evidence for Adaptive Synonymous Substitutions from Microbial Evolution Experiments. Genome Biol Evol 13. https://doi.org/10.1093/gbe/evab141
88. Jiang Y., et al. (2023) How synonymous mutations alter enzyme structure and function over long timescales. Nat Chem 15:308–318. https://doi.org/10.1038/s41557-022-01091-z
89. Lan P. D., et al. (2024) Synonymous Mutations Can Alter Protein Dimerization Through Localized Interface Misfolding Involving Self-entanglements. J Mol Biol 436:168487. https://doi.org/10.1016/j.jmb.2024.168487
90. Deane C. M., Saunders R. (2011) The imprint of codons on protein structure. Biotechnol J 6:641–649. https://doi.org/10.1002/biot.201000329
91. Yu C. H., et al. (2015) Codon Usage Influences the Local Rate of Translation Elongation to Regulate Co-translational Protein Folding. Mol Cell 59:744–754. https://doi.org/10.1016/j.molcel.2015.07.018
92. Buhr F., et al. (2016) Synonymous Codons Direct Cotranslational Folding toward Different Protein Conformations. Mol Cell 61:341–351. https://doi.org/10.1016/j.molcel.2016.01.008
93. Sauna Z. E., Kimchi-Sarfaty C. (2011) Understanding the contribution of synonymous mutations to human disease. Nat Rev Genet 12:683–691. https://doi.org/10.1038/nrg3051
94. Hunt R. C., Simhadri V. L., Iandoli M., Sauna Z. E., Kimchi-Sarfaty C. (2014) Exposing synonymous mutations. Trends Genet 30:308–321. https://doi.org/10.1016/j.tig.2014.04.006
95. Supek F., Minana B., Valcarcel J., Gabaldon T., Lehner B. (2014) Synonymous mutations frequently act as driver mutations in human cancers. Cell 156:1324–1335. https://doi.org/10.1016/j.cell.2014.01.051
96. Sharma Y., et al. (2019) A pan-cancer analysis of synonymous mutations. Nat Commun 10:2569. https://doi.org/10.1038/s41467-019-10489-2
97. Nilsson G., Belasco J. G., Cohen S. N., von Gabain A. (1987) Effect of premature termination of translation on mRNA stability depends on the site of ribosome release. Proc Natl Acad Sci U S A 84:4890–4894. https://doi.org/10.1073/pnas.84.14.4890
98. Shi M., et al. (2015) Premature Termination Codons Are Recognized in the Nucleus in A Reading-Frame Dependent Manner. Cell Discov 1:15001. https://doi.org/10.1038/celldisc.2015.1
99. Kim J. H., et al. (2022) SMG-6 mRNA cleavage stalls ribosomes near premature stop codons in vivo. Nucleic Acids Res 50:8852–8866. https://doi.org/10.1093/nar/gkac681
100. Arribere J. A., Fire A. Z. (2018) Nonsense mRNA suppression via nonstop decay. Elife 7. https://doi.org/10.7554/eLife.33292
101. Garcia-Rodriguez R., et al. (2020) Premature termination codons in the DMD gene cause reduced local mRNA synthesis. Proc Natl Acad Sci U S A 117:16456–16464. https://doi.org/10.1073/pnas.1910456117
102. Romero Romero M. L., et al. (2024) Environment modulates protein heterogeneity through transcriptional and translational stop codon readthrough. Nat Commun 15:4446. https://doi.org/10.1038/s41467-024-48387-x
103. Payne J. L., Wagner A. (2019) The causes of evolvability and their evolution. Nat Rev Genet 20:24–38. https://doi.org/10.1038/s41576-018-0069-z
104. Pigliucci M. (2008) Is evolvability evolvable? Nat Rev Genet 9:75–82. https://doi.org/10.1038/nrg2278
105. Raynes Y., Gazzara M. R., Sniegowski P. D. (2011) Mutator dynamics in sexual and asexual experimental populations of yeast. BMC Evol Biol 11. https://doi.org/10.1186/1471-2148-11-158
106. van Nimwegen E., Crutchfield J. P., Huynen M. (1999) Neutral evolution of mutational robustness. Proc Natl Acad Sci U S A 96:9716–9720. https://doi.org/10.1073/pnas.96.17.9716
107. Tenaillon O., Toupance B., Le Nagard H., Taddei F., Godelle B. (1999) Mutators, population size, adaptive landscape and the adaptation of asexual populations of bacteria. Genetics 152:485–493. https://doi.org/10.1093/genetics/152.2.485
108. Wagner A. (2023) Evolvability-enhancing mutations in the fitness landscapes of an RNA and a protein. Nat Commun 14:3624. https://doi.org/10.1038/s41467-023-39321-8
109. Good B. H., Rouzine I. M., Balick D. J., Hallatschek O., Desai M. M. (2012) Distribution of fixed beneficial mutations and the rate of adaptation in asexual populations. Proc Natl Acad Sci U S A 109:4950–4955. https://doi.org/10.1073/pnas.1119910109
110. Good B. H., Desai M. M. (2014) Deleterious passengers in adapting populations. Genetics 198:1183–1208. https://doi.org/10.1534/genetics.114.170233
111. Hallatschek O. (2011) The noisy edge of traveling waves. Proc Natl Acad Sci U S A 108:1783–1787. https://doi.org/10.1073/pnas.1013529108

Article and author information

Author information

Sarvesh Baheti
Department of Chemical Engineering, Indian Institute of Technology Jodhpur, Jodhpur, India
Namratha Raj
Department of Chemical Engineering, Indian Institute of Technology Bombay, Mumbai, India
Supreet Saini
Department of Chemical Engineering, Indian Institute of Technology Bombay, Mumbai, India
ORCID iD: 0000-0001-6838-4619
- For correspondence: saini@che.iitb.ac.in

Author Notes

Competing Interest Statement: The authors have declared no competing interest.

Version history

Preprint posted: October 1, 2024
Sent for peer review: November 10, 2024
Reviewed Preprint version 1: February 3, 2025
Reviewed Preprint version 2: October 30, 2025

Cite all versions

You can cite all versions using the DOI https://doi.org/10.7554/eLife.104848. This DOI represents all versions, and will always resolve to the latest one.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
