Predictable properties of fitness landscapes induced by adaptational tradeoffs
Abstract
Fitness effects of mutations depend on environmental parameters. For example, mutations that increase fitness of bacteria at high antibiotic concentration often decrease fitness in the absence of antibiotic, exemplifying a tradeoff between adaptation to environmental extremes. We develop a mathematical model for fitness landscapes generated by such tradeoffs, based on experiments that determine the antibiotic dose-response curves of Escherichia coli strains, and previous observations on antibiotic resistance mutations. Our model generates a succession of landscapes with predictable properties as antibiotic concentration is varied. The landscape is nearly smooth at low and high concentrations, but the tradeoff induces a high ruggedness at intermediate antibiotic concentrations. Despite this high ruggedness, however, all the fitness maxima in the landscapes are evolutionarily accessible from the wild type. This implies that selection for antibiotic resistance in multiple mutational steps is relatively facile despite the complexity of the underlying landscape.
eLife digest
Drug resistant bacteria pose a major threat to public health systems all over the world. Darwinian evolution is at the heart of this drug resistance: a mutation that allows bacteria to divide in the presence of a drug appears initially in a single cell. This mutation makes this cell and its descendants more likely to survive, so they can end up taking over the population.
The evolution of resistance can be thought of in terms of ‘bacterial fitness landscapes’. These landscapes visualise the relationship between the mutations present in a population of bacteria and how quickly the bacteria divide or reproduce. They are called landscapes because they can be represented as a series of mountains and valleys. The peaks of this landscape represent combinations of mutations that give bacteria the greatest chance of dividing (the greatest fitness). In a landscape with multiple peaks, some peaks will be higher than others. If the landscape is smooth, bacteria can easily acquire mutations for drug resistance. However, in a rugged landscape, bacteria may get stuck at sub-optimal peaks, because the mutations that would enable them to reach a higher peak would first lead them to losing fitness.
Several studies on the evolution of antibiotic resistance exist for specific bacteria and specific drugs, but relatively little is known about the general properties of the underlying fitness landscapes. Do these landscapes have features that can help explain the rapid evolution of high levels of resistance?
Antibiotic resistance often comes at a cost – more resistant strains of bacteria tend to grow more slowly when the drug is absent. To build a model of antibiotic resistance landscapes, Das et al. performed growth experiments on several strains of Escherichia coli exposed to a drug called ciprofloxacin. They measured how the rate at which the bacteria divided changed at different antibiotic concentrations, and combined this with the observation about resistant strains growing slower to formulate a mathematical model of antibiotic resistance landscapes. The landscapes that resulted were found to be very rugged, but unexpectedly, the bacteria could still evolve to access all fitness peaks. This means that landscape ruggedness does not constrain the evolution of resistance.
Understanding how and when resistance evolves is important both for the design of new drugs and the development of treatment protocols. A specific prediction of the model is that resistance evolution in fitness landscapes where resistant strains divide more slowly is reversible. This implies that the bacteria could regain their susceptibility to treatment when the drug concentration decreases, but this would depend on the specific bacteria and drug in question. More broadly, the model provides a framework for addressing the evolution of resistance in clinical and environmental settings, where drug concentrations vary widely in time and space.
Introduction
Sewall Wright introduced the concept of fitness landscapes in 1932 (Wright, 1932), and for decades afterwards it persisted chiefly as a metaphor, due to lack of sufficient data. This has changed considerably in recent decades (de Visser and Krug, 2014; Hartl, 2014; Kondrashov and Kondrashov, 2015; Fragata et al., 2019). There are now a large number of experimental studies that have constructed fitness landscapes for combinatorial sets of mutations relevant to particular phenotypes, such as the resistance of microbial pathogens to antibiotics (Weinreich et al., 2006; DePristo et al., 2007; Marcusson et al., 2009; Lozovsky et al., 2009; Brown et al., 2010; Schenk et al., 2013; Goulart et al., 2013; Mira et al., 2015; Palmer et al., 2015; Knopp and Andersson, 2018), and the genomic scale of these investigations is rapidly growing (Wu et al., 2016; Bank et al., 2016; Domingo et al., 2018; Pokusaeva et al., 2019). Mathematical modeling of fitness landscapes has also seen a revival, motivated partly by the need to quantify and interpret the ruggedness of empirical fitness landscapes (Szendro et al., 2013; Weinreich et al., 2013; Neidhart et al., 2014; Ferretti et al., 2016; Blanquart and Bataillon, 2016; Crona et al., 2017; Hwang et al., 2018; Kaznatcheev, 2019; Crona, 2020). Conceptual breakthroughs, such as the notion of sign epistasis (where a mutation is beneficial in some genetic backgrounds but deleterious in others), have shed light on how ruggedness can constrain evolutionary trajectories (Weinreich et al., 2005; Poelwijk et al., 2007; Franke et al., 2011; Lobkovsky and Koonin, 2012; Zagorski et al., 2016).
Despite this progress, a limitation of current studies of fitness landscapes is that they focus mostly on (gene-gene) interactions, and little on (where stands for environment) interactions, that is on how changes in environment modify gene-gene interactions. A few recent studies have begun to address this question (Flynn et al., 2013; Taute et al., 2014; Gorter et al., 2018; de Vos et al., 2018). In the context of antibiotic resistance, it has been realized that the fitness landscape of resistance genes depends quite strongly on antibiotic concentration (Mira et al., 2015; Stiffler et al., 2015; Ogbunugafor et al., 2016). This is highly relevant to the clinical problem of resistance evolution, since concentration of antibiotics can vary widely in a patient’s body as well as in various non-clinical settings (Kolpin et al., 2004; Andersson and Hughes, 2014). Controlling the evolution of resistance mutants thus requires an understanding of fitness landscapes as a function of antibiotic concentration. Empirical investigations of such scenarios are still limited, and systematic theoretical work on this question is also lacking.
In the present work, we aim to develop a theory of interactions for a specific class of landscapes, with particular focus on applications to antibiotic resistance. The key feature of the landscapes we study is that every mutation comes with a tradeoff between adaptation to the two extremes of an environmental parameter. For example, it has been known for some time that antibiotic resistance often comes with a fitness cost, such that a bacterium that can tolerate high drug concentrations grows slowly in drug-free conditions (Andersson and Hughes, 2010; Melnyk et al., 2015). While such tradeoffs are not universal (Hughes and Andersson, 2017; Durão et al., 2018), they certainly occur for a large number of mutations and a variety of drugs.
Tradeoffs can also arise in complex scenarios involving multiple drugs. It has been reported in Stiffler et al., 2015 that certain mutations in TEM-1 -lactamase are neutral at low ampicillin concentration but deleterious at high concentration, and that a number of the latter mutations also confer resistance to cefotaxime. Therefore in a medium with cefotaxime and a moderately high concentration of ampicillin, it is possible that these mutations will be deleterious at low cefotaxime concentrations but beneficial at high cefotaxime concentration. Fitness landscapes with adaptational tradeoffs are therefore also of potential relevance to evolution in response to multi-drug combinations.
Our starting point for investigating fitness landscapes induced by tradeoffs is the knowledge of two phenotypes that are well studied – the drug-free growth rate (which we call the null-fitness) and the IC50 (the drug concentration that reduces growth rate by half), which is a measure of antibiotic resistance. These two phenotypes correspond to the two extreme regimes of an environmental parameter, that is zero and highly inhibitory antibiotic concentrations. The function that describes the growth rate of a bacterium for antibiotic concentrations between these two extremes is called the dose-response curve or the inhibition curve (Regoes et al., 2004). When tradeoffs are present, the dose-response curves of different mutants must intersect as the concentration is varied (Gullberg et al., 2011). This is schematically shown in Figure 1. The intersection of dose-response curves of the wild type and the mutant happens at point A, swapping the rank order between the two fitness values. The intersection point is known as the minimum selective concentration (MSC), and it defines the lower boundary of the mutant selection window (MSW) within which the resistance mutant has a selective advantage relative to the wild type (Khan et al., 2017; Alexander and MacLean, 2018).
When there are several possible mutations and multiple combinatorial mutants, a large number of such intersections occur as the concentration of the antibiotic increases. This leads to a succession of different fitness landscapes defined over the space of genotype sequences (Maynard Smith, 1970; Kauffman and Levin, 1987). Whenever the curves of two mutational neighbors (genotypes that differ by one mutation) intersect, there can be an alteration in the evolutionary trajectory towards resistance, whereby a forward (reverse) mutation now becomes more likely to fix in the population than the corresponding reverse (forward) mutation. These intersections change the ruggedness of landscapes and the accessibility of fitness maxima. In this way a rich and complex structure of selective constraints emerges in the MSW. To explore the evolutionary consequences of these constraints, here we construct a theoretical model based on existing empirical studies as well as our own work on ciprofloxacin resistance in E. coli. Specifically, we address two fundamental questions: (i) How does the ruggedness of the fitness landscape vary as a function of antibiotic concentration? (ii) How accessible are the fitness optima as a function of antibiotic concentration?
We find that even when the null-fitness and resistance values of the mutations combine in a simple, multiplicative manner, the intersections of the curves produce a highly epistatic landscape at intermediate concentrations of the antibiotic. This is an example of a strong interaction, where changes in the environmental variable drastically alter the interactions between genes. Despite the high ruggedness at intermediate concentrations, however, the topology of the landscapes is systematically different from existing oft-studied random landscape models, such as the House-of-Cards model (Kauffman and Levin, 1987; Kingman, 1978), the Kauffman NK model (Kauffman and Weinberger, 1989; Hwang et al., 2018) or the Rough Mt. Fuji model (Neidhart et al., 2014). For example, most fitness maxima have similar numbers of mutations that depend logarithmically on the antibiotic concentration. Importantly, all the fitness maxima remain highly accessible through adaptive paths with sequentially fixing mutations. In particular, any fitness maximum (including the global maximum) is accessible from the wild type as long as the wild type is viable. As a consequence, the evolution of high levels of antibiotic resistance by multiple mutations (Hughes and Andersson, 2017; Wistrand-Yuen et al., 2018; Rehman et al., 2019) is much less constrained by the tradeoff-induced epistatic interactions than might have been expected on the basis of existing models.
Results
Mathematical model of tradeoff-induced fitness landscapes
The chief goal of this paper is to develop and explore a mathematical framework to study tradeoff-induced fitness landscapes. We consider a total of mutations, each of which increases antibiotic resistance. A fitness landscape is a real-valued function defined on the set of genotypes made up of all combinations of these mutations. A genotype can be represented by a binary string of length , where a 1 (0) at each position represents the presence (absence) of a specific mutation. Alternatively, any genotype is uniquely identified as a subset of the mutations (the wild type is the null subset, that is the subset with no mutations).
In this paper, unless mentioned otherwise, we define the fitness as the exponential growth rate of a microbial population. The fitness is a function of antibiotic concentration. This function has two parameters of particular interest to us – the growth rate at zero concentration, which we refer to as the null-fitness and denote by , and a measure of resistance such as IC50 which we denote by . Each single mutation is described by the pair , where and are the null-fitness and resistance values respectively of the ith single mutant. We further rescale our units such that for the wild type, and . We consider mutations that come with a fitness-resistance tradeoff, that is a single mutant has an increased resistance () and a reduced null-fitness () compared to the wild type. To proceed we need to specify two things: (i) how the fitness of the wild type and the mutants depend on antibiotic concentration, and in particular if this dependence exhibits a pattern common to various mutant strains; (ii) how the and values of the combinatorial mutants depend on those of the individual mutations. To address these issues we take guidance from two empirical observations.
Scaling of dose-response curves
Marcusson et al., 2009 have constructed a series of E. coli strains with single, double and triple mutations conferring resistance to the fluoroquinolone antibiotic ciprofloxacin (CIP), which inhibits DNA replication (Drlica et al., 2009). In their study they measured MIC (minimum inhibitory concentration) values and null-fitness but did not report dose-response curves. Some of the present authors have recently shown that the dose-response curve of the wild-type E. coli (strain K-12 MG1655) in the presence of ciprofloxacin can be fitted reasonably well by a Hill function (Ojkic et al., 2019).
Here we expand on this work and determine dose-response curves for a range of single- and double-mutants with mutations restricted to five specific loci known to confer resistance to CIP (Marcusson et al., 2009) (see Materials and methods). Figure 2A shows the measured curves for the wild type, the five single mutants, and eight double-mutant combinations. The genotypes are represented as binary strings, where a 1 or 0 at each position denotes respectively the presence or absence of a particular mutation. If we rescale the concentration of CIP by IC50 of the corresponding strain, , and the growth rate by the null-fitness , the curves collapse to a single curve that can be approximated by the Hill function (Figure 2B). The precise shape of the curve is not important for further analysis in this paper. However, the data collapse suggests that we can assume that the dose-response curve of a mutant with (relative) null-fitness and (relative) resistance is
that is, it has the same shape as the wild-type curve except for a rescaling of the fitness and concentration axes. Similar scaling relations have been reported previously by Wood et al., 2014 and Chevereau et al., 2015. A good biological understanding of the conditions underlying this feature is presently lacking, but it seems intuitively plausible that the shape would be robust to changes that do not qualitatively alter the basic physiology of growth and resistance.
Limited epistasis in and
An interesting recent finding reported by Knopp and Andersson, 2018 is that chromosomal resistance mutations in Salmonella typhimurium mostly alter the null-fitness as well as the MIC of various antibiotics in a non-epistatic, multiplicative manner, that is if a particular mutation increases (decreases) the resistance (null-fitness) by a factor , and another mutation does the same with a factor , then the mutations jointly alter these phenotypes roughly by a factor of (with a few exceptions). We have done a similar comparison for the data on the null-fitness and MIC for E. coli strains in Marcusson et al., 2009. We have analyzed a subset of 4 mutations for which the complete data set for all combinatorial mutants is available from Marcusson et al., 2009. The data are shown in Table 1. Out of 11 multiple-mutants, only 2 show epistasis in and 4 show epistasis in . Moreover, in all cases where significant epistasis occurs it is negative, that is the effect of the multiple mutants is weaker than expected from the single mutation effects.
Formulation of the model
The above observations suggest a model where one assumes, as an approximation, that all the and values of individual mutations combine multiplicatively. A genotype with mutations has a null-fitness and a resistance value given by
Moreover, the dose-response curves of the genotypes are taken to be of the scaling form (Equation 1), where the function does not depend on the genotype. As indicated before, and without any loss of generality, we choose units such that, for the wild type, and . Therefore the dose-response curve of the wild type is with , and choosing IC50 as a measure of resistance, we have . Henceforth, we refer to simply as the concentration. We also recall that the condition of adaptational tradeoff means that and for all mutations.
If the and values combine non-epistatically, and if the shape of the dose-response curve is known, it is thus possible to construct the entire concentration-dependent landscape of size from just measurements (of the and values of the single mutants) instead of the measurement of fitness values at every concentration. In practice we do not expect a complete lack of epistasis among all mutations of interest, and the dose-response curve is also an approximation obtained by fitting a curve through a finite set of fitness values known only with limited accuracy. However, the fitness rank order of genotypes, and related topographic features such as fitness peaks, are robust to a certain amount of error in fitness values (Crona et al., 2017), and our model may be used to construct these to a good approximation.
Lastly, we require that the dose-response curves of the wild type and a mutant intersect at most once, which implies that the equation with and has at most one solution. This then also implies that the curves of any genotype and a proper superset of it (i.e. a genotype which contains all the mutations in and some more) intersect at most once. This property holds for all functions that have been used to represent dose-response curves in the literature, such as the Hill function, the half-Gaussian or the exponential function, as well as for all concave function with negative second derivate (see Materials and methods for details).
Properties of tradeoff-induced fitness landscapes
To understand the evolutionary implications of our model, we first describe how the fitness landscape topography changes with the environmental parameter represented by the antibiotic concentration. Next we analyze the properties of mutational pathways leading to highly fit genotypes.
Intersection of curves and changing landscapes
We start with a simple example of mutations and a Hill-shaped dose-response curve (Figure 3). At , the rank ordering is determined by the null-fitness. The wild type has maximal fitness, and the double mutant is less fit than the single mutants. As increases, the fitness curves start to intersect, and each intersection switches the rank of two genotypes. In the present example we find a total of six intersections and therefore seven different rank orders across the full range of . This is actually the maximum number of rank orders that can be found by scanning through for , see Materials and methods. The final fitness rank order (in the region G in Figure 3A) is the reverse of the original rank order at . Figure 3B depicts the concentration-dependent fitness landscape of the 2-locus system in the form of fitness graphs. A fitness graph represents a fitness landscape as a directed graph, where neighboring nodes are genotypes that differ by one mutation, and arrows point toward the genotypes with higher fitness (de Visser et al., 2009; Crona et al., 2013).
A fitness graph does not uniquely specify the rank order in the landscape (Crona et al., 2017). For example, the three regions C, D and E have different rank orders but the same fitness graph. Because selection drives an evolving population towards higher fitness, a fitness graph can be viewed as a roadmap of possible evolutionary trajectories. In particular, a fitness peak (marked in red in Figure 3B) is identified from the fitness graph as a node with only incoming arrows. Fitness graphs also contain the complete information about the occurrences of sign epistasis. Sign epistasis with respect to a certain mutation occurs when the mutation is beneficial in some backgrounds but deleterious in others (Weinreich et al., 2005; Poelwijk et al., 2007). It is easy to read off sign epistasis for a mutation from the fact that parallel arrows (i.e. arrows corresponding to the gain or loss of the same mutation) in a fitness graph point in opposite directions.
For example, in the graph for the region B there is sign epistasis in the first position, since the parallel arrows 00 → 10 and 01 ← 11 point in opposite directions. Notice that in the current example, we start with a smooth landscape at (as seen in the fitness graph for region A), and the number of peaks and the degree of sign epistasis both reach a maximum in the intermediate region C+D+E. This fitness graph displays reciprocal sign epistasis, which is a necessary condition for the existence of multiple fitness peaks (Poelwijk et al., 2011). Beyond the region E, the landscape starts to become smooth again, with only one fitness maximum and a lower degree of sign epistasis. In the last region G, the landscape is smooth with only one peak (the double mutant 11) and no sign epistasis.
These qualitative properties generalize to larger landscapes. To show this, we consider a statistical ensemble of landscapes with mutations, where the parameters , of single mutations are independently and identically distributed according to a joint probability density . Figure 4 shows the result of numerical simulations of these landscapes for . The mean number of fitness peaks with mutations reaches a maximum at where to leading order , independent of any further details of the system, as argued in Materials and methods. The asymptotic expression works well already for (see inset of Figure 4A). Figure 4B shows the mean number of mutations in a fitness peak. This is well approximated by the curve , showing that the mean number of mutations in a fitness peak grows logarithmically with the concentration. This is consistent with what we would expect from the variation in the number of peaks with mutations as shown in Figure 4A. The existence of a typical number of mutations in a fitness peak is one of the distinctive features of our landscape, a feature typically lacking in other well-studied random landscape models. This property arises from the existence of adaptational tradeoffs. Since a high number of mutations is beneficial at higher concentrations but deleterious at lower concentrations, it is clear that there must be an optimal number of mutations at some intermediate concentration.
As another indicator of ruggedness, we consider the number of backgrounds in which a mutation is beneficial as a function of . At , any mutation is deleterious in all backgrounds, whereas at very large it is beneficial in all backgrounds. Therefore there is no sign epistasis in either case. Sign epistasis is maximized when a mutation is beneficial in exactly 1/2 of all backgrounds. Figure 5 shows the mean number of backgrounds (with mutations each) in which the occurrence of a mutation is beneficial, for two different values of . The curves have a sigmoidal shape, starting from zero and saturating at , which is the total number of backgrounds with mutations. The blue curve shows the mean total number of backgrounds (with any ) in which a mutation is beneficial, which has a similar shape. Since every mutation in every background goes from being initially deleterious to eventually beneficial, there must be some at which every mutation is beneficial in exactly half the backgrounds. The inset of Figure 5 shows that for backgrounds with mutations, the average concentration at which a mutation is beneficial in 1/2 the backgrounds is given by , which is the same concentration at which the largest number of fitness peaks were found in Figure 4. A derivation of this relation is given in Materials and methods. Similarly, when summed over all mutation numbers , the fraction of beneficial backgrounds reaches 1/2 around the same concentration at which the total number of fitness peaks is maximal. Since the number of backgrounds is largest at for combinatorial reasons, this concentration is approximately given by .
Complementary to these results about the background dependence of the sign of mutational effects, it can be shown that any two distinct sets of mutations occurring in any genetic background show sign epistasis at some value of . This is a consequence of the rank ordering properties of the landscapes that are described in the next subsection (see Materials and methods for a proof). A special case is that any two single mutations occurring in the wild type background must exhibit pairwise sign epistasis at some concentration.
Accessibility of fitness peaks
Having shown that tradeoff-induced fitness landscapes display a large number of fitness peaks at intermediate concentrations, we now ask how these peaks affect the evolutionary dynamics. We base the discussion on the concept of evolutionary accessibility, which effectively assumes a regime of weak mutation and strong selection (Gillespie, 1984). In this regime the evolutionary trajectory consists of a series of fixation events of beneficial single-step mutations represented by a directed path in the fitness graph of the landscape (Weinreich et al., 2005; Franke et al., 2011). We say that a genotype is accessible from another genotype if a directed path exists from the initial to the final genotype.
The accessibility of peaks in a fitness landscape is determined by the rank ordering of the genotypes. We now show that the rank orders of tradeoff-induced fitness landscapes are constrained in a way that gives rise to unusually high accessibility. Consider two distinct sets of one or more mutations and that can occur on the genetic background , and the four genotypes and , where a concatenation of symbols represents the genotype which contains all the mutations referred to by the symbols. The ordering condition (derived in Materials and methods) says that whenever is the fittest among these four genotypes, must be the least fit, and whenever is the fittest, must be the least fit. For the case of two single mutations this situation is illustrated by the fitness graphs in Figure 3B, where the background genotype is the fittest in the first segment A and the genotype is the fittest in the last segment G. The ordering condition has the immediate consequence that for all environments , the fittest genotype is always accessible from the background genotype . If the fittest genotype is one of the single mutants (segments B, C, D and F), then it is of course accessible. If it is the double mutant (segment G), then the background genotype must be the least fit genotype (from the ordering condition), and therefore and should be fitter than . Then is accessible from the wild type through the path and the path .
To fully exploit the consequences of the ordering property we need to introduce some notation. Let be a genotype with mutations. We define a subset of as a genotype with mutations, , which are all contained in as well. Likewise, a superset of is a genotype with mutations, , that contains all the mutations in . With this, the ordering condition can be seen to imply that the superset of a fitness peak is accessible from its own supersets. For example, if is the fittest genotype, then is a superset of it, and because of the ordering condition, must be fitter than its superset , and therefore accessible from it. Similarly, it is easy to show that the subset of a fitness peak is accessible from its own subsets. This property can be generalized and constitutes our main result on accessibility of fitness peaks.
Accessibility property: Any genotype that is a superset of a local fitness peak is accessible from all the superset genotypes of . Similarly, any genotype that is a subset of a local fitness peak is accessible from all the subset genotypes of .
The proof is given in Materials and methods. Three particularly important consequences are:
Any fitness peak is accessible from all its subset and superset genotypes.
Any fitness peak is accessible from the wild type. This is because the wild type is a subset of every genotype.
For the same reason, when the wild type is a fitness peak (e.g., at ), it is accessible from every genotype, and is therefore also the only fitness peak in the landscape. The same holds for the all-mutant when is sufficiently large, since it is a superset of every genotype.
These properties are illustrated by the fitness graph in Figure 6. We assume for some environment that the landscape has (at least) two peaks at the genotypes 1001 (marked in red) and 0111 (marked in blue). The colored arrows point towards mutational neighbors with higher fitness and are enforced by the accessibility property.
The edges without arrowheads are not constrained by the accessibility property and the corresponding arrows (which are not shown in the figure) could point in either direction. Consider the genotype 0111 (marked in blue). It is accessible from all its subsets, namely 0000, 0010, 0010, 0001, 0110, 0101 and 0011, following the upward pointing blue arrows. These subsets are in turn accessible from their subsets. For example, 0011 is accessible from all its subsets – 0000, 0010, and 0001. The fitness peak is also accessible from its superset 1111. The same property holds for the other fitness peak. The subsets or supersets may access the fitness peaks using other (unmarked) paths as well, which would include one or more of the undirected lines in conjunction with some of the arrows. Moreover, other genotypes, which are neither supersets nor subsets, may also access these fitness peaks through paths that incorporate some of the undirected edges.
A fitness peak together with its subset and superset genotypes defines a sub-landscape with remarkable properties. It is a smooth landscape with only one peak which is accessible from any genotype via all direct paths, that is paths where the number of mutations monotonically increases or decreases. For example, the fitness peak 1001 is accessible from the all-mutant 1111 by the two direct paths – 1111→1101→1001 and 1111→1011→1001. Likewise, the peak 0111 is accessible from its subset 0001 via the paths 0001→0101→0111 and 0001→0011→0111. In general, a peak with mutations is accessible from a subset genotype with mutations by direct paths, and from a superset genotype with mutations by direct paths. This gives a lower bound on the total number of paths by which a fitness peak is accessible from a subset or superset genotype.
Importantly, the accessibility property formulated above holds under more general conditions than stipulated in the model. We show in Materials and methods that it holds whenever the null fitness and resistance values of the mutations, and , do not show positive epistasis. This is a weaker requirement than our original assumption of a strict lack of epistasis in these two phenotypes.
In this context it should be noted that the rank orderings forbidden by the ordering condition all show positive epistasis for the fitness values, whereas all the allowed orderings can be constructed without positive epistasis. Therefore, any landscape where positive epistasis in fitness is absent will also display the accessibility property. However, whereas the lack of positive epistasis is a sufficient condition, it is not necessary. In particular, our model does allow for cases of positive epistasis in the fitness values.
Reachability of the fittest and the most resistant genotype
The preceding analyses have shown that within the mutant selection window, where mutants with higher fitness than the wild type exist, every fitness peak is accessible from the wild type. This includes in particular the fittest genotype at a given concentration. However, in general there will be many peaks in the fitness landscape, and it is not guaranteed that evolution will reach the fittest genotype. One can ask for the probability that the fittest genotype is actually accessed under the evolutionary dynamics, which we call its reachability. We assume that the dynamics is in the strong selection weak mutation (SSWM) regime, and the population is large enough such that the fixation probability of a mutant with selection coefficient is for , and 0 for (Gillespie, 1984). In our setting the selection coefficient is , where is the growth rate of a mutant appearing in a population of cells with growth rate .
Figure 7 shows the numerically obtained reachability for , averaged over the distribution given in Equation (8). The reachability of the highest peak is 1 at very low and very high concentrations, since there is only peak, the wild type or the all-mutant, at these extremes. The reachability is lower at intermediate concentrations, where there are multiple peaks, all of which are accessible from the wild type. The dashed blue line is the mean of the reciprocal of the total number of fitness peaks, and is therefore the mean reachability of fitness peaks. The reachability of the highest peak follows the qualitative behavior of the mean reachability, but remains higher than the mean reachability everywhere. The green curve is the reachability of the most resistant genotype, that is the all-mutant. It is extremely low at low and moderate concentrations and grows steeply and saturates quickly at a very large concentration. The all-mutant genotype is less-than-average reachable everywhere except at very high concentration, when it is the only fitness peak and accessible from every other genotype.
We have compared the reachability to that of two other widely studied landscape models. One is the House-of-Cards (HoC) model (Kauffman and Levin, 1987; Kingman, 1978), where each genotype is independently assigned a fitness value drawn from a continuous distribution. The reachability is found to be around 0.018, an order of magnitude smaller than the lowest reachability seen in the tradeoff-induced landscape. The mean number of fitness maxima in the HoC landscape is , which in this case is approximately 93.1, much higher than the maximum mean number of peaks in the tradeoff-induced landscape (inset of Figure 7). We would therefore naturally expect a smaller fraction of adaptive walks to terminate at the fittest peak. A more illuminating comparison is with the NK model (Kauffman and Weinberger, 1989; Hwang et al., 2018). Here, once again, , and the mutations are divided into two blocks of 5 mutations each. As per the usual definition of the model, the fitness of a genotype is the sum over the contributions of each of the 10 mutations, and the contribution of each mutation depends only the state of the block to which it belongs. The fitness contribution of each mutation for any state of the block is an independent random number. The mean number of fitness maxima here is (Perelson and Macken, 1995; Schmiegelt and Krug, 2014), which is comparable to the maximum mean number in the tradeoff-induced landscapes (see inset of Figure 7). Nonetheless, the reachability of the fittest peak (dotted pink line) is found to be nearly 4 times smaller than the lowest reachability in our landscape. We found that in a fraction of about 0.64 of the landscapes, the fittest maximum is not reached in any of 32000 dynamical runs, indicating the absence of an accessible path in most of these cases (Schmiegelt and Krug, 2014; Hwang et al., 2018). In contrast, an evolutionary path always exists to any fitness peak in the tradeoff-induced landscapes, as we saw in the previous subsection. This endows the tradeoff-induced landscapes with the unusual property of being highly rugged and at the same time having a much higher evolutionary reachability of the global fitness maximum compared to other models with similar ruggedness.
Discussion
Fitness landscapes depend on the environment, and gene-gene-interactions can be modified by the environment. Systematic studies of such interactions are rare, but they are clearly of relevance to scenarios such as the evolution of antibiotic resistance, where the antibiotic concentration can vary substantially in space and time. In this paper we have explored the structure of such landscapes in the presence of tradeoffs between fitness and resistance. We summarize the main findings of our work.
We have shown experimental evidence that the dose-response curves of various mutant strains of E. coli to the antibiotic ciprofloxacin have the same shape, except for a rescaling of the fitness and concentration values. If this shape is known, the fitness of a strain can be estimated at any antibiotic concentration simply by measuring its null-fitness and IC50 (or MIC). This makes it possible to construct empirical fitness landscapes at any antibiotic concentration from a limited set of data.
Under the assumptions of our model the degree of epistasis, particularly sign epistasis, is low for zero and high antibiotic concentrations, but it is nevertheless high in the intermediate concentration regime. The number of local fitness peaks scales exponentially in the number of mutations at these concentrations. Epistasis is often discussed as a property intrinsic to mutations and their genetic backgrounds, with limited consideration of environmental parameters. But in the landscapes studied here, the environmental parameter is of paramount importance, since changes in it can dramatically alter gene-gene interactions.
The expected number of mutations in a fitness peak increases logarithmically with the antibiotic concentration. This implies that, at a given concentration, the highly fit genotypes that make up the fitness peaks carry an optimal number of mutations that arises from the tradeoff between fitness cost and resistance.
Despite the high ruggedness, the landscape displays strong non-random patterns. A rank ordering condition between sets of mutations holds at all concentrations. A remarkable and unexpected consequence of this is that any fitness peak is evolutionarily accessible from the wild type.
It is well known from experimental studies of antimicrobial resistance evolution that highly resistant genotypes often require multiple mutations which can be acquired along different evolutionary trajectories. Epistatic interactions constrain these trajectories and are generally expected to impede the evolution of high resistance. We find that strong and complex epistatic interactions inevitably arise in the mutant selection window, but at the same time the evolution of the most resistant genotype (the identity of which changes with concentration) remains facile and can occur along many different pathways.
All of these conclusions follow from three basic assumptions that are readily generalizable beyond the context of antimicrobial resistance evolution: the existence of tradeoffs between two marginal phenotypes that govern the adaptation at extreme values of an environmental parameter; the scaling property of the shape of the tradeoff function; and the condition of limited epistasis for the marginal phenotypes. How generally these assumptions are valid is a matter of empirical investigation. We have shown that they hold for certain cases, and the interesting evolutionary implications of our results indicate that more empirical research in this direction will be useful.
In the case of antimicrobial resistance, there can be fitness compensatory mutations (Levin et al., 2000; Brown et al., 2010; Durão et al., 2018) that do not exhibit any adaptational tradeoffs. These mutations are generally found in a population in the later stages of the evolution of antibiotic resistance, which implies that they emerge in a genetic background of mutations with adaptational tradeoffs. An understanding of tradeoff-induced landscapes is therefore a prerequisite for predicting the emergence of compensatory mutations. While compensatory mutations are expected to facilitate the evolution of high resistance (Hughes and Andersson, 2017), our study shows that the acquisition of multiple resistance mutations may readily occur even if compensatory mutations are absent.
In the formulation of our model we have assumed for convenience that the marginal phenotypes combine multiplicatively, but this assumption is in fact not necessary for all our results. As shown in Materials and methods, our key results on accessibility only require the absence of positive epistasis. These results therefore hold without exception for the combinatorially complete data set in Table 1, where epistasis is either absent or negative. More generally, our analysis remains valid in the presence of the commonly observed pattern of diminishing returns epistasis among beneficial mutations (Chou et al., 2011; Schoustra et al., 2016; Wünsche et al., 2017). We expect our results to hold approximately even when there is a small degree of epistasis (positive or negative) in and , but we do not explore that question quantitatively in this paper.
A strict absence of epistasis, while certainly not universal, can be expected to occur under certain generic circumstances. Assuming that we deal with a single antibiotic that has a single target enzyme, we can think of two situations that could lead to a multiplicative behavior of the IC50: (i) Single mutations occur in different genes that affect the concentration of the antibiotic-target enzyme complex through independent mechanisms. (ii) Single mutations occur in the same gene but their effect is multiplicative due to the nature of antibiotic-enzyme molecular interactions. An example of scenario (i) would be a combination of mutations in the target gene (reduction of the binding affinity), its promoter (increase in expression), genes regulating the activity of efflux pumps and porins (decrease in intracellular concentration of the antibiotic), or genes controlling the level (increase in concentration) or activity of drug-degrading enzymes. These mechanisms are ‘orthogonal’ to each other, in the sense that they modify independent pathways within the cell. If each of them affects the concentration of the antibiotic-target complex through first-order kinetics, their cumulative effect will be multiplicative in terms of the IC50s of single mutations.
In the case of ciprofloxacin and E. coli (Figure 2 and Table 1), we expect mutations in gyrA (target) to be orthogonal to mutations in acrR and marR (efflux pumps). This is borne out by the observed multiplicativity of the IC50 (Table 1). In contrast, we expect scenario (ii) to apply if the single mutations affect different parts of the antibiotic-enzyme binding site independently. This is not the case for two particular mutations in gyrA studied here – S83L and D87N (see cases of epistasis in Table 1). An example for scenario (ii) are the two mutations P21L and A26T in the gene encoding the enzyme dihydrofolate reductase, which increase the resistance to trimethoprim in a multiplicative way in the absence of other mutations (Palmer et al., 2015). If the antibiotic has more than one target, multiplicativity would not generally hold. In particular, topoisomerase IV (gene parC) is a secondary target for ciprofloxacin with much weaker affinity than gyrase. Therefore, mutations in parC do not contribute to resistance unless there is already a mutation in gyrA. As a consequence, in contrast to the mutants listed in Table 1, combinations containing parC display positive epistasis (Hughes and Andersson, 2017).
The co-existence of high ruggedness and high accessibility found in the tradeoff-induced landscapes studied here is counterintuitive, and to the best of our knowledge fitness landscape models with this property have not been described previously. The situation is depicted schematically in Figure 8. The first landscape is smooth with a single peak that must be accessible from everywhere else. The second landscape is rugged, and each fitness peak is typically accessible from a few genotypes only. This is the typical picture of a rugged fitness landscape with limited accessibility, as it would be predicted by simple statistical models such as the HoC, NK or rough Mt. Fuji models (Szendro et al., 2013; Neidhart et al., 2014; Hwang et al., 2018). The landscapes we describe here belong to a third type, where a high number of peaks are accessible from a high number of genotypes, creating overlapping ‘valleys’ from which a population may evolve towards different local fitness maxima. Moreover, not only are fitness peaks accessible from all their subset and superset genotypes, but there are many direct paths leading up to each peak. This appears contrary to the expectation that in landscapes with high epistasis, accessibility should be facilitated through mutational reversions, that is, indirect paths (DePristo et al., 2007; Palmer et al., 2015; Wu et al., 2016; Zagorski et al., 2016).
We conclude with some possible directions for future work. Our model provides a principled framework for predicting how microbial fitness landscapes vary across different antibiotic concentrations. This could be exploited to describe situations where the antibiotic concentration varies on a time scale comparable to the evolution of resistance, either due to the degradation of the drug or by an externally imposed treatment protocol (Marrec and Bitbol, 2018). In this context it would be of particular interest to include compensatory mutations that lack the tradeoff between growth and resistance, since such mutations are expected to strongly affect the extent to which resistance can be reversed (Andersson and Hughes, 2010). Significant extension of the theory is required if the drug concentration varies on a faster time scale comparable to the growth time of the microbial population, in which case the concept of a concentration-dependent fitness would need to be reconsidered.
From the broader perspective of evolutionary systems with adaptational tradeoffs mediated by an environmental parameter, our study makes the important conceptual point that it is impossible to have non-epistatic fitness landscapes for all environments. Using the terminology of Gorter et al., 2016, the tradeoffs enforce reranking interactions which in turn, as we have shown, induce sign-epistatic interactions at intermediate values of the environmental parameter. Notably, this general conclusion does not depend on the scaling property of the tradeoff function. It would nevertheless be of great interest to identify instances of scaling for other types of adaptational tradeoffs, in which case the detailed predictions of our model could be applied as well.
Materials and methods
Experiments
Bacterial strains
Request a detailed protocolWe used strains from Marcusson et al., 2009 (courtesy of Douglas Huseby and Diarmaid Hughes). The strains are isogenic derivatives of MG1655, a K12 strain of the bacterium E. coli, with specific point mutations or gene deletions in five different loci: gyrA:S83L, gyrA:D87N, parC:S80I, ΔmarR, and ΔacrR. There are 32 possible combinations of these alleles, but we only used the wild type, single mutants (five strains) and double mutants (8 strains of 10 possible combinations): LM179 (00000), LM378 (10000), LM534 (01000), LM792 (00100), LM202 (00010), LM351 (00001), LM625 (11000), LM862 (10100), LM421 (10010), LM647 (10001), LM1124 (01100), LM538 (01010), LM592 (01001), LM367 (00011). A binary sequence after the strain’s name represents the presence/absence of a particular mutated allele (order as in the above list of genetic alterations).
Growth media and antibiotics
Request a detailed protocolLB growth medium was prepared according to Miller’s formulation (10 g tryptone, 5 g yeast extract, 10 g NaCl per litre). The pH was adjusted to 7.2 with NaOH, and autoclaved at 121°C for 20 min. Ciprofloxacin (CIP) solutions were prepared from a frozen stock (10 mg/ml ciprofloxacin hydrochloride, pharmaceutical grade, AppliChem, Darmstadt, in sterile, ultra-pure water) by diluting into LB to achieve the desired concentrations.
Dose-response curves
Request a detailed protocolWe incubated bacteria in 96-well clear flat bottom micro-plates (Corning Costar) inside a plate reader (BMG LABTECH FLUOstar Optima with a stacker) starting from two different initial cell densities (half a plate for each), and measured the optical density (OD) of each culture every 2–5 min to obtain growth curves. Plates were prepared automatically using a BMG LABTECH CLARIOstar plate reader equipped with two injectors connected to a bottle containing LB and a bottle with a solution of CIP in LB. The injectors were programmed to create different concentrations of CIP in each column of the 96 well plate. The injected volumes of the CIP solution were 0, 20, 25, 31, 39, 49, 62, 78, 98, 124, 155, 195 µl, and an appropriate volume of LB was added to bring the total volume to 195 µl per well. Since different strains had MICs spanning almost two decades of CIP concentrations, we used a different maximum concentration of the CIP solution for each strain (approximately 1.5–2 times the expected MIC). Bacteria were diluted from a thawed frozen stock 103 and 104 times in PBS (phosphate buffered saline buffer), and 5µl of the suspension was added to each well (103 dilution to rows A-D, 104 dilution to rows E-H). We used one strain per plate and up to four plates per strain (typically 1–2). After adding the suspension of bacteria to each well, the plates were immediately sealed with a transparent film to prevent evaporation, and put into a stacker (37°C, no shaking), from which they would be periodically fed into the FLUOstar Optima plate reader (37°C, orbital shaking at 200 rpm for 10 s prior to OD measurement). We then used the time shift method to obtain exponential growth rates for each strain and different concentrations of CIP, see Ojkic et al., 2019 for further details.
Mathematical methods
Rank orders and fitness graphs
Request a detailed protocolThe total number of possible rank orders with mutations is , which is 24 for . Not all these rank orders, however, can be realized as one scans through . Since any two curves intersect at most once, the maximum number of distinct rank orders that can be reached is the rank order at plus the total number of possible intersections, which is . Thus the upper bound on the number of rank orders found by scanning through is , which is smaller than for .
It is also instructive to determine the number of fitness graphs that can be found by varying for a system with mutations. This can be computed as follows: At every mutation is deleterious, and every mutational neighbor with one less mutation is fitter; but due to the tradeoff condition, at sufficiently large every mutation is beneficial and any mutational neighbor with one less mutation is less fit. In order for this reversal of fitness order to happen, the dose-response curves of any two mutational neighbors must intersect at some . Therefore, the number of fitness graphs generated is equal to the number of distinct pairs of mutational neighbors, which is , and the number of distinct fitness graphs encountered is . For , this number is 5, as seen in the example in the main text.
Condition for two dose-response curves to intersect at most once
Request a detailed protocolConsider two DR curves characterized by and , where and . We need to show that for the commonly observed cases, the curves and intersect at most once. First, notice that it is sufficient to prove this for the case , because any rescaling of the and axes does not alter the number or ordering of intersection points. Therefore we require and .
Let us consider the case where the dose-response curve is of the form of a Hill function, that is , with . The intersection of curves happens at the solution of , which we denote by . In this case the solution is given by
which is positive and unique if ; otherwise no solution with exists. It is similarly easy to show that at most one intersection point exists for exponentials, stretched exponentials, and half-Gaussians.
The property also holds for any concave dose-response curve with . We prove this as follows. Any intersection point is the solution of
where . We will show that is monotonic and therefore the above equation has at most one solution. We have
and has the same sign as the numerator . Since is a decreasing function and , . When , we also have . Since , this implies , and . Therefore is monotonically decreasing.
Proof of the accessibility property
Request a detailed protocolTo derive the ordering condition, let us start with the simplest case of two single mutations occurring on the wild type background. There are correspondingly four different genotypes , , , , which are listed in decreasing order of fitness at . Let the intersection of the DR curves of two genotypes and occur at . Then is given by the solution of
and is given by the solution of
This last equation can be re-written as
where . Comparing this with the first equation above, we have
This equation tells us that whenever the double mutant is fitter than one of the single mutants, the wild type must be less fit than the other single mutant. Consequently, when the double mutant is fitter than both the single mutants, the WT must be less fit than both the single mutants. In other words, the number of single mutants fitter than the wild type cannot be less than the number of single mutants less fit than the double mutant. This is the ordering condition given in the main text. Any ordering that violates this condition is a forbidden ordering. For greater clarity, we list all the possible forbidden orderings (up to interchange of indices and ).
Although we showed this for two single mutations in the wild type background, the same arguments hold for any two sets of mutations in any background, since the succession of orderings is independent of the rescalings of the fitness and concentration axes. To put it more precisely, , and are any three non-overlapping sets of mutations, where and are non-empty sets.
Next we use this to prove the accessibility property. Let have mutations. It is sufficient to prove that (i) any superset of with or fewer mutations is accessible from all its own supersets with or fewer mutations, for all (the statement follows from the case ); and that (ii) any subset of with or more mutations is accessible from any of its own subsets with or more mutations, for all (the statement corresponds to ). We prove this by induction.
Firstly, we notice that the case is trivial, since is of accessible from itself. For the case of supersets, our base case is , and the assertion above holds because is a local fitness peak, and therefore accessible from all its supersets with mutations, which are of course accessible from themselves.
Now we prove the induction step. Assume that all supersets of that have or fewer mutations (where ) are accessible from all their supersets with or fewer mutations. Consider a superset of with mutations, and denote it by , where is the set of mutations in not present in . By assumption, is accessible from . In the following, we use the notation to indicate that a genotype is fitter than a genotype (we use the '<' and '=' signs in a similar way). Therefore, we have .
Now consider any superset of with mutations, where the additional mutation not contained in is denoted . Then this superset can be denoted by . We must have since is a local fitness peak. We now have the relation . Therefore we must have , for otherwise we violate the ordering condition. Now since , must be accessible from , proving that any superset with mutations is accessible from any of its supersets with mutations. This completes the proof of the induction step.
The proof for the case of subsets is essentially the same, utilizing the symmetry between the wild type and the double mutant in the ordering condition.
The accessibility property follows entirely from the ordering condition, and hence any landscape that obeys the ordering condition will obey the theorem. The ordering condition follows from , as obtained in Equation 3. However, this same inequality obtains under more general conditions. To see this, let us define the null-fitness of the double mutant as , and the resistance of the double mutant as . The dose-response curves of and intersect at , whereas the curves for and intersect at
Now it is easy to show that is a decreasing function of both and . Therefore holds if and .
Number of local fitness peaks
Request a detailed protocolWhen dealing with complex fitness landscapes with parameters that can vary across species and environments, a useful strategy is to model the fitness effects as random variables that are chosen from a probability distribution (Kauffman and Levin, 1987; Szendro et al., 2013; Hwang et al., 2018). In the limit of large system size , many properties emerge that are independent of the details of the system. In practice, even relatively small system sizes are often approximated well by results obtained in the asymptotic limit.
The mean number of peaks with mutations in the tradeoff-induced landscapes is
where is the total number of genotypes with mutations, and is the probability that a genotype with mutations is a fitness maximum at antibiotic concentration . Then the total number of peaks at is . Let the resistance of a genotype be , and likewise its null-fitness be . The genotype is a local fitness maximum if it is fitter than all its subsets with mutations and all its supersets with mutations.
To find the concentration at which the curves of and its neighboring genotypes intersect, we start with the simplest case of the dose-response curves of the wild type and a single mutant . These curves intersect at the solution of , which is a decreasing function of and . The wild type is fitter than the single mutant when . Now the intersection of the DR curves of a genotype with mutations and a subset with mutations that lacks the mutation occurs at the solution of
which is read off as . Likewise, the intersection of the DR curves of and a superset with mutations that contains the additional mutation occurs at . Therefore is a fitness maximum if
for all and with and . Alternatively,
Let us consider the regime where . Then ; if is smaller than , it is clear that the second inequality is almost certainly satisfied whereas the probability of the first inequality is vanishingly small. Both the probabilities are finite if . Thus the probability of being a fitness peak is maximized when , where and depends on the details of the distribution . Thus the mean number of fitness peaks with mutations is maximal at where to leading order , independent of any further details of the system.
The total number of genotypes with mutations is , and , where , and
The mean number of fitness maxima can be found by multiplying this with . One may expect to be exponentially small in , since a total of inequalities (as indicated in Equation 6) need to be satisfied. However, this is complicated by the fact that the probabilities of the inequalities being satisfied are not independent. The correlations between the inequalities would depend on the distribution of and the dose-response curve. If the correlations are sufficiently weak, one might still expect to find an exponential scaling in large . To leading order is itself exponential in , and if the probability that a genotype is a fitness peak is exponentially small in , we expect the mean number of peaks to be exponential in as well. This is supported by the scaling shown in the inset of Figure 4A.
For the simulation results shown in the main text we chose a joint distribution of the form
The conditional distribution is a shifted gamma distribution. The shift ensures that the curves of a background genotype and a mutant intersect.
Sign epistasis
Request a detailed protocolSign epistasis with respect to a certain mutation occurs when the mutation is beneficial in one background but deleterious in another. We first show that any two distinct sets of mutations on any genetic background display sign epistasis at some value of the scaled concentration . Consider a genetic background , and two distinct sets of mutations and (which share no mutations with each other or ). At we have and . As increases, must become less fit than either or before becomes fitter than either of these (by the ordering condition). Without loss of generality, let us assume that becomes less fit than before it becomes less fit than . At this point, we must have and . This means that, in the wild type background, is beneficial in the absence of but deleterious in the presence of , indicating pairwise sign epistasis.
To quantify the amount of sign epistasis for large and , we next ask for the number of backgrounds in which a mutation is beneficial at concentration . If one considers only those backgrounds that have mutations, then would depend both on and . In a statistical ensemble of landscapes, one may compute the probability that a mutation is beneficial in a background with mutations, and of course . In the limit of large and , exhibits some universal properties to leading order. When , we are in the regime of high concentration relative to , and we expect a mutation to be beneficial. We find that to leading order , with corrections that are exponentially small in . When , we are at concentrations that are too low to prefer additional mutations, and is exponentially small in . When , we are at the threshold concentration where a new mutation becomes beneficial. Here we find that . For large we therefore expect a steep transition from 0 to 1 as the concentration crosses the threshold value (see inset of Figure 5).
Consider a mutation in a background with mutations . The mutation is beneficial in this background if
Taking logarithms, we have
Define and , and . Then the above inequality becomes
Let the distribution of be , and let . Define the random variable , and denote its distribution . Then the probability that a mutation is beneficial in a background with mutations is
The mean number of backgrounds with mutations in which a mutation is beneficial is . Note that where . When , for and for , with a sharp transition from 1 to 0 that happens within a region of width of the origin. Also for large , is sharply peaked around over a region of width .
When , over this entire region, as observed before. Thus to leading order, . The mean number of backgrounds in which a mutation is beneficial is .
where is defined in Equation 7. Therefore
to leading order.
When , the dominant contribution to the integral in (Equation 12) comes from , since quickly drops from 1 to zero for . Further, since for (except for a region of width around , as observed before), we can approximate simply by the probability that . Then
where is the large deviation function of −µ, and
This implies that is reduced by a factor that is exponentially small in compared to Equation 14, and therefore the fraction of backgrounds in which a mutation is beneficial is very small.
Finally, when , that is , is centered at the origin and decays over a width . For , is 0 except over a much smaller width to the right of the origin, whereas for , it is 1 except for a small region of width left of the origin. Thus the dominant contribution to the integral in Equation 12 comes from , and as before, can be approximated by the probabilitythat . Due to the central limit theorem, is approximately Gaussian and therefore symmetric around , and therefore . Consequently, we should have
which is 1/2 times the total number of backgrounds given by Equation 13. This proves that the concentration where the mutation is beneficial in half of the backgrounds is given by or for large and .
Data availability
Experimental data have been deposited in Edinburgh DataShare.
-
Edinburgh DataShareExperimental data for the article Predictable Properties of Fitness Landscapes Induced by Adaptational Tradeoffs.https://doi.org/10.7488/ds/2756
References
-
Antibiotic resistance and its cost: is it possible to reverse resistance?Nature Reviews Microbiology 8:260–271.https://doi.org/10.1038/nrmicro2319
-
Microbiological effects of sublethal levels of antibioticsNature Reviews Microbiology 12:465–478.https://doi.org/10.1038/nrmicro3270
-
Compensatory mutations restore fitness during the evolution of dihydrofolate reductaseMolecular Biology and Evolution 27:2682–2690.https://doi.org/10.1093/molbev/msq160
-
The peaks and geometry of fitness landscapesJournal of Theoretical Biology 317:1–10.https://doi.org/10.1016/j.jtbi.2012.09.028
-
Exploring the effect of sex on empirical fitness landscapesThe American Naturalist 174 Suppl 1:S15–S30.https://doi.org/10.1086/599081
-
Empirical fitness landscapes and the predictability of evolutionNature Reviews Genetics 15:480–490.https://doi.org/10.1038/nrg3744
-
Mutational reversions during adaptive protein evolutionMolecular Biology and Evolution 24:1608–1610.https://doi.org/10.1093/molbev/msm118
-
Quinolones: action and resistance updatedCurrent Topics in Medicinal Chemistry 9:981–998.https://doi.org/10.2174/156802609789630947
-
Evolutionary mechanisms shaping the maintenance of antibiotic resistanceTrends in Microbiology 26:677–691.https://doi.org/10.1016/j.tim.2018.01.005
-
Measuring epistasis in fitness landscapes: the correlation of fitness effects of mutationsJournal of Theoretical Biology 396:132–143.https://doi.org/10.1016/j.jtbi.2016.01.037
-
Evolution in the light of fitness landscape theoryTrends in Ecology & Evolution 34:69–82.https://doi.org/10.1016/j.tree.2018.10.009
-
Evolutionary accessibility of mutational pathwaysPLOS Computational Biology 7:e1002134.https://doi.org/10.1371/journal.pcbi.1002134
-
Molecular evolution over the mutational landscapeEvolution 38:1116–1129.https://doi.org/10.2307/2408444
-
What can we learn from fitness landscapes?Current Opinion in Microbiology 21:51–57.https://doi.org/10.1016/j.mib.2014.08.001
-
Evolutionary trajectories to antibiotic resistanceAnnual Review of Microbiology 71:579–596.https://doi.org/10.1146/annurev-micro-090816-093813
-
Universality classes of interaction structures for NK fitness landscapesJournal of Statistical Physics 172:226–278.https://doi.org/10.1007/s10955-018-1979-z
-
Towards a general theory of adaptive walks on rugged landscapesJournal of Theoretical Biology 128:11–45.https://doi.org/10.1016/S0022-5193(87)80029-2
-
The NK model of rugged fitness landscapes and its application to maturation of the immune responseJournal of Theoretical Biology 141:211–245.https://doi.org/10.1016/S0022-5193(89)80019-0
-
A simple model for the balance between selection and mutationJournal of Applied Probability 15:1–12.https://doi.org/10.2307/3213231
-
Urban contribution of pharmaceuticals and other organic wastewater contaminants to streams during differing flow conditionsScience of the Total Environment 328:119–130.https://doi.org/10.1016/j.scitotenv.2004.01.015
-
Topological features of rugged fitness landscapes in sequence spaceTrends in Genetics 31:24–33.https://doi.org/10.1016/j.tig.2014.09.009
-
Compensatory mutations, antibiotic resistance and the population genetics of adaptive evolution in BacteriaGenetics 154:985–997.
-
The fitness costs of antibiotic resistance mutationsEvolutionary Applications 8:273–283.https://doi.org/10.1111/eva.12196
-
Adaptive landscapes of resistance genes change as antibiotic concentrations changeMolecular Biology and Evolution 32:2707–2715.https://doi.org/10.1093/molbev/msv146
-
Adaptive landscape by environment interactions dictate evolutionary dynamics in models of drug resistancePLOS Computational Biology 12:e1004710.https://doi.org/10.1371/journal.pcbi.1004710
-
Reciprocal sign epistasis is a necessary condition for multi-peaked fitness landscapesJournal of Theoretical Biology 272:141–144.https://doi.org/10.1016/j.jtbi.2010.12.015
-
Pharmacodynamic functions: a multiparameter approach to the design of antibiotic treatment regimensAntimicrobial Agents and Chemotherapy 48:3670–3676.https://doi.org/10.1128/AAC.48.10.3670-3676.2004
-
Mechanisms of ciprofloxacin resistance in Pseudomonas aeruginosa: new approaches to an old problemJournal of Medical Microbiology 68:1–10.https://doi.org/10.1099/jmm.0.000873
-
Patterns of epistasis between beneficial mutations in an antibiotic resistance geneMolecular Biology and Evolution 30:1779–1787.https://doi.org/10.1093/molbev/mst096
-
Evolutionary accessibility of modular fitness landscapesJournal of Statistical Physics 154:334–355.https://doi.org/10.1007/s10955-013-0868-8
-
Diminishing-returns epistasis among random beneficial mutations in a multicellular fungusProceedings of the Royal Society B: Biological Sciences 283:1376.https://doi.org/10.1098/rspb.2016.1376
-
Quantitative analyses of empirical fitness landscapesJournal of Statistical Mechanics: Theory and Experiment 2013:P01005.https://doi.org/10.1088/1742-5468/2013/01/P01005
-
Should evolutionary geneticists worry about higher-order epistasis?Current Opinion in Genetics & Development 23:700–707.https://doi.org/10.1016/j.gde.2013.10.007
-
Evolution of high-level resistance during low-level antibiotic exposureNature Communications 9:1599.https://doi.org/10.1038/s41467-018-04059-1
-
ConferenceThe roles of mutation, inbreeding, crossbreeding and selection in evolutionProc 6th Int Congress of Genetics. pp. 356–366.
-
Diminishing-returns epistasis decreases adaptability along an evolutionary trajectoryNature Ecology & Evolution 1:0061.https://doi.org/10.1038/s41559-016-0061
-
Beyond the hypercube: evolutionary accessibility of fitness landscapes with realistic mutational networksPLOS Computational Biology 12:e1005218.https://doi.org/10.1371/journal.pcbi.1005218
Article and author information
Author details
Funding
Deutsche Forschungsgemeinschaft (CRC 1310)
- Joachim Krug
H2020 European Research Council (ERC Consolidator Grant 682237 EVOSTRUC)
- Rosalind J Allen
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
We thank Douglas Huseby and Diarmaid Hughes for providing us with the E. coli strains of Marcusson et al., 2009, and Tobias Bollenbach, Michael Brockhurst and Kristina Crona for useful comments. The work of SGD and JK was supported by DFG within CRC 1310 Predictability in Evolution, and JK acknowledges the kind hospitality of the Scottish Universities Physics Alliance and the Higgs Center for Theoretical Physics during the completion of the project. SOLD and RJA acknowledge the support of the ERC Consolidator Grant 682237 EVOSTRUC.
Copyright
© 2020, Das et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
-
- 3,344
- views
-
- 446
- downloads
-
- 57
- citations
Views, downloads and citations are aggregated across all versions of this paper published by eLife.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading
-
- Computational and Systems Biology
- Microbiology and Infectious Disease
Bacterial membranes are complex and dynamic, arising from an array of evolutionary pressures. One enzyme that alters membrane compositions through covalent lipid modification is MprF. We recently identified that Streptococcus agalactiae MprF synthesizes lysyl-phosphatidylglycerol (Lys-PG) from anionic PG, and a novel cationic lipid, lysyl-glucosyl-diacylglycerol (Lys-Glc-DAG), from neutral glycolipid Glc-DAG. This unexpected result prompted us to investigate whether Lys-Glc-DAG occurs in other MprF-containing bacteria, and whether other novel MprF products exist. Here, we studied protein sequence features determining MprF substrate specificity. First, pairwise analyses identified several streptococcal MprFs synthesizing Lys-Glc-DAG. Second, a restricted Boltzmann machine-guided approach led us to discover an entirely new substrate for MprF in Enterococcus, diglucosyl-diacylglycerol (Glc2-DAG), and an expanded set of organisms that modify glycolipid substrates using MprF. Overall, we combined the wealth of available sequence data with machine learning to model evolutionary constraints on MprF sequences across the bacterial domain, thereby identifying a novel cationic lipid.
-
- Computational and Systems Biology
- Neuroscience
Perception can be highly dependent on stimulus context, but whether and how sensory areas encode the context remains uncertain. We used an ambiguous auditory stimulus – a tritone pair – to investigate the neural activity associated with a preceding contextual stimulus that strongly influenced the tritone pair’s perception: either as an ascending or a descending step in pitch. We recorded single-unit responses from a population of auditory cortical cells in awake ferrets listening to the tritone pairs preceded by the contextual stimulus. We find that the responses adapt locally to the contextual stimulus, consistent with human MEG recordings from the auditory cortex under the same conditions. Decoding the population responses demonstrates that cells responding to pitch-changes are able to predict well the context-sensitive percept of the tritone pairs. Conversely, decoding the individual pitch representations and taking their distance in the circular Shepard tone space predicts the opposite of the percept. The various percepts can be readily captured and explained by a neural model of cortical activity based on populations of adapting, pitch and pitch-direction cells, aligned with the neurophysiological responses. Together, these decoding and model results suggest that contextual influences on perception may well be already encoded at the level of the primary sensory cortices, reflecting basic neural response properties commonly found in these areas.