How the organization of genes on a chromosome shapes adaptation is essential for understanding evolutionary paths. Here, we investigate how adaptation to rapidly increasing levels of antibiotic depends on the chromosomal neighborhood of a drug-resistance gene inserted at different positions of the Escherichia coli chromosome. Using a dual-fluorescence reporter that allows us to distinguish gene amplifications from other up-mutations, we track in real-time adaptive changes in expression of the drug-resistance gene. We find that the relative contribution of several mutation types differs systematically between loci due to properties of neighboring genes: essentiality, expression, orientation, termination, and presence of duplicates. These properties determine rate and fitness effects of gene amplification, deletions, and mutations compromising transcriptional termination. Thus, the adaptive potential of a gene under selection is a system-property with a complex genetic basis that is specific for each chromosomal locus, and it can be inferred from detailed functional and genomic data.
https://doi.org/10.7554/eLife.25100.001
In the process of regulatory evolution, a finite set of genes are continuously combined to form new gene expression patterns and create a myriad of phenotypes (Carroll, 2000; Wittkopp et al., 2004; Wray, 2007). Acquiring mutations that increase the expression of a single gene can be sufficient to make an individual substantially fitter than its competitors. For example, increased expression of drug target or efflux genes is a common mechanism for the evolution of resistance to antibiotics (Li et al., 2015; Palmer and Kishony, 2014), chemotherapeutics (Cole et al., 1992), and insecticides (Devonshire and Field, 1991; Coderre et al., 1983). Increased expression of individual genes also provides access to new nutrient resources (Notebaart et al., 2014) and tolerance to diverse toxins (Soo et al., 2011). The fitness effect of increased expression of individual genes has mostly been determined in plasmid-based overexpression libraries (Notebaart et al., 2014; Soo et al., 2011). However, the large majority of genes reside on chromosomes, neighboring other genes, and thus mutations affecting gene expression occur in a specific chromosomal context. Unequal mutation rates along the genome (Foster et al., 2013; Anderson and Roth, 1981) imply that the chromosomal location can affect the adaptive potential of a gene, that is, the probability that adaptive mutations increasing expression of the gene will spread in a population under given selective conditions.
Adaptation by increased gene expression can result from mutations of different types (Blank et al., 2014; Lind et al., 2015): point mutations, promoter insertion by mobile elements (Mahillon and Chandler, 1998; Ellison and Bachtrog, 2013; Stoebel et al., 2009), promoter capture by chromosomal rearrangements (ar-Rushdi et al., 1983; Blount et al., 2012; Xiao et al., 2008), and gene duplication or amplification, which increases expression by way of gene dosage (Andersson and Hughes, 2009; Elliott et al., 2013). How the rate of each of these mutation types depends on chromosomal position has in part been determined experimentally (Foster et al., 2013; Hudson et al., 2002; Mahillon and Chandler, 1998; Craig, 1997; Touchon et al., 2009; Anderson and Roth, 1981; Seaton et al., 2012; Wahl et al., 1984). Despite considerable experimental data, we currently lack an understanding of how the position biases of the different mutation types combine across different chromosomal loci, and therefore of how the chromosomal context of a gene under selection affects overall adaptation.
Here, we investigate how the complex interplay of different mutation types and mutation rate biases gives rise to an effect of chromosome position on adaptation in Escherichia coli. To this end, we use a single chromosomal drug resistance gene as the target of selection and a two-color fluorescence reporter readout for adaptive mutations in evolution experiments. We quantify the effect of the chromosomal position of the selected gene on adaptation and identify the mutation types underlying this effect. We find that a strong effect of chromosome position on adaptation is largely explained by rate differences of gene duplications and fitness effect differences of two types of promoter co-opting mutations (promoter capture deletions and mutations that cause read-through across upstream transcriptional terminators). Both the observed rate differences and fitness effect differences depend on simple features of the chromosomal neighborhood of the gene under selection. This suggests that the adaptive potential of a gene can be estimated by looking for respective features of chromosomal neighborhoods in genomics data. Based on these results, we propose that the chromosomal context of a gene under selection is an important factor in adaptation.
We devised an evolution experiment with E. coli, in which we use a single target of selection embedded in a genetic cassette that serves as a reporter of adaptive potential and mutation types. The reporter cassette can be inserted at any chromosomal position (Figure 1A and Figure 1B), and it allows us to distinguish amplifications from other adaptive mutations in real time using two-color fluorescence measurements. The reporter cassette contains a promoterless, translational tetA-yfp gene fusion followed by a transcriptional terminator and a constitutively expressed cfp gene. Mutations that increase expression of the tetracycline efflux pump TetA-YFP can be selected with antibiotic and monitored through YFP fluorescence (Figure 1C, left). Due to the immediate proximity of the tetA-yfp and cfp genes, the large majority of tetA-yfp amplifications are expected to contain the cfp gene as well. Thus, adaptation by reporter cassette amplification is expected to be distinguishable from other up-mutations by a fluorescence increase of both YFP and CFP (Figure 1C, right). We integrated the reporter cassette at four different intergenic loci (A, B, C, and D) along the chromosome of an E. coli ΔtolC strain (Figure 1A), giving rise to four strains (strain A, B, C, and D). With the exception of locus D (described below), the loci were chosen to lie in intergenic regions between divergently transcribed genes in order to exclude transcription from upstream genes into the tetA-yfp gene (Figure 1B). Loci A and C are located approximately in the middle of the right and left replichore, respectively. Since we wanted to also include a locus close to the origin of replication, where no pair of divergently oriented genes is present, we chose a locus in the relatively large intergenic region between the co-oriented rsmG and atpI genes (locus D, Figure 1B), a locus previously used for large insertions (Kuhlman and Cox, 2010). Locus B was chosen based on its vicinity to several insertion sequences (IS).
We used a ΔtolC genetic background in order to constrain the spectrum of possible adaptive mutations to the reporter cassette locus. TolC is an outer membrane porin and an essential part of several E. coli multi-drug efflux pumps, which are a frequent target of selection during drug exposure (Li et al., 2015) and which cause low-level intrinsic resistance of E. coli to tetracyclines (Sulavik et al., 2001). By employing daily increasing levels of tetracycline (Figure 1D) and constant daily dilution, we created an experimental evolutionary rescue scenario (Carlson et al., 2014), in which populations of ancestral cells rapidly undergo extinction. Rescue from extinction requires the spread of adaptive mutations activating tetA-yfp expression in a race against population decline.
The probability of evolutionary rescue depends on the size and decline rate of an unadapted population, and on a combination of rate and fitness effect of adaptive mutations (Martin et al., 2013). We chose selective conditions such that the initial population size and decline rate are approximately equal for all strains. In this way, the probability of rescue (estimated by performing a large number of replicate rescue experiments) is expected to be informative about the strain-specific rate and fitness effect of adaptive mutations of all types. Specifically, we adjusted the tetracycline concentrations used in evolution experiments to strain-specific minimum inhibitory concentrations (MICs), which we measured precisely (Figure 1—figure supplement 1). Given the otherwise isogenic background of the strains, we interpret MICs as a proxy for initial expression of tetA-yfp. MIC measurements revealed locus-dependent differences in the initial sensitivity to tetracycline, and all strains showed an increased MIC compared to the cassette-free ancestor, which indicates low baseline expression of tetA-yfp. For evolution experiments, we used tetracycline concentrations starting at 50% of the strain-specific MICs (Figure 1D).
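The dependence of rescue on population size, decline rate, mutation rate, and fitness effect can be illustrated with a minimal sketch. It uses a generic Poisson approximation rather than the model of Martin et al. (2013), and every name and parameter value (`n0`, `daily_survival`, `mutation_rate`, `establishment_prob`) is a hypothetical placeholder, not a quantity measured in this study.

```python
import math


def rescue_probability(n0, daily_survival, days, mutation_rate, establishment_prob):
    """Approximate P(rescue) as 1 - exp(-expected number of successful mutants).

    Assumptions (illustrative only): the unadapted population declines
    geometrically, adaptive mutants arise at a constant per-cell rate,
    and each mutant establishes independently with a fixed probability.
    """
    # Unadapted population size on each day of the decline.
    sizes = [n0 * daily_survival ** day for day in range(days)]
    # Expected number of adaptive mutants that arise and establish.
    expected_successes = mutation_rate * establishment_prob * sum(sizes)
    return 1.0 - math.exp(-expected_successes)


# Hypothetical values: two loci that differ only in adaptive mutation rate.
p_low = rescue_probability(1e6, 0.5, 10, 1e-7, 0.1)
p_high = rescue_probability(1e6, 0.5, 10, 1e-5, 0.1)
print(f"low-rate locus: {p_low:.3f}, high-rate locus: {p_high:.3f}")
```

With population size and decline rate held equal across strains, as in the experimental design, differences in the fraction of rescued replicates in this toy model reflect only the product of mutation rate and establishment probability.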
We evolved 95 populations of each strain and measured optical density (OD600) and fluorescence daily. Populations yielding OD600 above a fixed threshold after ten days were regarded as rescued. Rescued populations were assigned to fluorescence phenotypes (YFP or YFP+CFP) based on the increase in fluorescence at the end of the experiment compared to the ancestor (Figure 1C). We performed qPCR on genomic DNA of populations displaying increased CFP fluorescence and found a good correlation between CFP fluorescence and the chromosomal copy number of the tetA-yfp gene (Figure 1E). Thus, CFP fluorescence is a valid proxy for the extent of high level amplification of the reporter cassette.
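The scoring rule described above can be sketched as a small classifier. The threshold values and field names below are hypothetical placeholders chosen for illustration; the text does not state the cut-offs that were actually used.

```python
from dataclasses import dataclass


@dataclass
class Population:
    final_od600: float  # optical density at the end of the experiment
    yfp_fold: float     # final OD-normalized YFP relative to the ancestor
    cfp_fold: float     # final OD-normalized CFP relative to the ancestor


def classify(pop, od_threshold=0.1, fold_threshold=1.5):
    """Assign a population to extinct, YFP, or YFP+CFP.

    Both thresholds are assumed values, not those of the study.
    """
    if pop.final_od600 <= od_threshold:
        return "extinct"
    if pop.yfp_fold >= fold_threshold and pop.cfp_fold >= fold_threshold:
        return "YFP+CFP"  # consistent with reporter cassette amplification
    if pop.yfp_fold >= fold_threshold:
        return "YFP"      # consistent with other up-mutations
    return "rescued, unclassified"


print(classify(Population(final_od600=0.8, yfp_fold=6.0, cfp_fold=5.0)))
print(classify(Population(final_od600=0.8, yfp_fold=6.0, cfp_fold=1.0)))
```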
The number of rescued populations differed significantly between strains (Figure 2A), showing that the chromosomal location of the tetA-yfp gene is critical for its adaptive potential. No rescue was observed without the reporter cassette (Figure 2—figure supplement 1), and all rescued populations displayed increased YFP fluorescence, suggesting that rescue depended on the presence and overexpression of tetA-yfp. To test if increased expression of tetA-yfp was indeed causative for rescue, we deleted the reporter cassette genes in single clones isolated from three different rescued populations. Deletions eliminated growth on tetracycline in all three cases (Figure 2B). A minority of populations went extinct despite transiently increased YFP fluorescence (37/290 extinct populations), illustrating how our experimental selection filters for mutations that increase tetA-yfp expression above a minimum level. Two sets of replicate experiments yielded qualitatively similar results (Figure 2C), although the number of rescued populations fluctuated considerably between replicates, which likely reflects both technical variability (for example, in the precise amount of transferred inoculum from day to day) as well as the inherent stochasticity of evolutionary rescue processes. Time-trajectories of OD600 and OD-normalized YFP and CFP fluorescence of all evolved populations are available in Supplementary file 1 and fluorescence phenotype classifications in Supplementary file 2.
We next set out to identify which mutation types were responsible for locus-dependent differences in the number of rescued populations. Strain B gave the highest number of rescued populations, and 76/77 rescued populations of this strain had reporter cassette amplifications (Figure 2A). Rescue by amplification in the other three strains was rare (Figure 2A and C), implying that large differences between strains were related to locus-specific amplification. According to the ‘canonical’ model, formation of amplifications is limited by the rate at which initial duplications are generated (Romero and Palacios, 1997). Rates of spontaneous duplication are elevated between homologous sequences such as rRNA operons or duplicate copies of insertion sequences (IS) due to frequent unequal crossing-over (Anderson and Roth, 1981; Andersson and Hughes, 2009). We found homologous copies of IS5 on either side of locus B (IS5H and IS5I), but no flanking homology in the chromosomal neighborhood of the other three loci. We verified the presence of IS5 at the boundary of the amplicon in rescued B populations by obtaining a PCR product of the expected junction in 16/16 tested clones of evolved populations (PCR products of three populations shown in Figure 2D). The junction was undetectable in the ancestor. Deleting one of the two flanking IS (strain BΔIS5I) gave greatly reduced numbers of rescued populations (Figure 2A), and only a minority (3/9) had increased CFP fluorescence, which was not connected to amplification between IS5H and IS5I (Figure 2D). These results confirm that flanking homology, through its effect on gene amplification, is a main factor by which the chromosomal neighborhood influences adaptation by increased gene expression.
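A search for flanking homology of this kind can be sketched in a few lines, assuming a list of annotated repeats is available. The coordinates below are hypothetical (only the 35 kb spacing between the two IS5 copies is taken from the text, and which copy lies on which side is chosen arbitrarily), `IS1x` is an invented placeholder, and the sketch ignores repeat orientation and chromosome circularity.

```python
def flanking_repeat_pairs(locus, repeats, max_span):
    """Return same-family repeat pairs that flank `locus`, smallest span first.

    repeats: list of (name, family, position) tuples on a linear coordinate.
    Simplifications (assumptions): repeat orientation is ignored and the
    chromosome is treated as linear.
    """
    left = [r for r in repeats if r[2] < locus]
    right = [r for r in repeats if r[2] > locus]
    pairs = []
    for l_name, l_family, l_pos in left:
        for r_name, r_family, r_pos in right:
            span = r_pos - l_pos
            if l_family == r_family and span <= max_span:
                pairs.append((l_name, r_name, span))
    return sorted(pairs, key=lambda pair: pair[2])


# Hypothetical relative coordinates around a locus-B-like insertion site.
repeats = [("IS5H", "IS5", 0), ("IS5I", "IS5", 35_000), ("IS1x", "IS1", 50_000)]
print(flanking_repeat_pairs(locus=20_000, repeats=repeats, max_span=100_000))
# [('IS5H', 'IS5I', 35000)]
```

A locus for which this search returns an empty list would, under the reasoning above, be expected to show a low duplication rate.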
Given the above result, we expected differences to disappear in the absence of IS, and we repeated the evolution experiments with four strains that had the reporter cassette integrated at the same four loci as before, but that are derived from a multiple deletion strain (MDS42) free of all IS elements (Pósfai et al., 2006). MDS42 lacks around 15% of the MG1655 chromosome, including all prophages and many nonessential genes. Apart from the absence of IS-related mutations, the rates of other mutation types in MDS42 are similar to those in MG1655 (Pósfai et al., 2006). Loci A-D are not immediately next to genes absent in MDS42; the chromosomal neighborhood at a larger scale, however, differs between the IS-wt and IS-free versions of the strains (Figure 3—figure supplement 1).
Despite the expected absence of frequent amplification of locus B in the IS-free genetic background, the fraction of rescued populations was still different among strains (p=3 × 10⁻⁵, Fisher’s exact test), and rescue was observed only in strains B and D (10 and 8 rescued populations, respectively). To explain these remaining differences, we identified candidate rescue mutations in strains with and without IS. Sequencing ~1 kb of DNA upstream of tetA-yfp revealed mutations of different types: point mutations (including small insertions and deletions), larger deletions, and insertions of mobile elements (Figure 3AB and Figure 3—figure supplement 2). The relative contribution of the different mutation types to adaptation differed between different chromosomal loci in both IS-containing and IS-free strains (p=10⁻⁹ and p=0.003, Fisher’s exact test). In several cases, mutations co-occurred with other mutations or amplifications (colored dots in Figure 3A), suggesting interactions between mutations, some of which we explored in more depth later (section ‘Chromosomal neighborhood influences adaptation by affecting the fitness cost of amplifications’).
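For readers who want to see how such a comparison works, the following is a minimal sketch of a two-sided Fisher's exact test on a 2×2 table, written with the standard library only. It is not the analysis behind the reported p-values, which presumably used the full table across strains. The example compares 10 rescued populations (the count reported for IS-free strain B) with 0 rescued populations, and assumes 95 replicate populations per strain as in the first experiment.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that x of the col1 "successes" fall in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / total

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


# Rescued vs. extinct: 10/95 in one strain, 0/95 in another (95 is assumed).
p_value = fisher_exact_2x2(10, 85, 0, 95)
print(f"p = {p_value:.4f}")
```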
We then continued to identify the mutation types responsible for the remaining differences in adaptation among strains, independent of neighborhood-dependent amplifications as described above. In order to test the effect of mutations on downstream expression independent of chromosomal locus, we constructed yfp reporter plasmids with all mutations found within the p0 region of clones from rescued populations of the first replicate set of evolution experiments (IS-wt strains, IS-free strains and strain BΔIS5I, Figure 3CDE). Five of six small mutations altering the sequence of p0 increased yfp fluorescence in plasmid reconstructions (Figure 3C), presumably by increasing the affinity of RNA polymerase to p0. One mutation (T-145C) did not affect fluorescence and likely did not contribute to adaptation. Instead, rescue of the respective population, which also displayed a YFP+CFP fluorescence phenotype, likely depended on amplification alone. In contrast, two other point mutations identified in conjunction with amplifications (C-31T and G-92T) did increase reporter fluorescence on plasmids, providing examples of a combined beneficial effect of amplifications and additional mutations. Two of four insertion sequences that we had found inserted into p0 increased reporter fluorescence on plasmids greatly (IS2 and IS3, Figure 3D), which is consistent with the delivery of outward-facing promoters within the termini of IS (Mahillon and Chandler, 1998). The two other IS (IS1 and IS5) had no or no strong effect on plasmid reporter fluorescence. Since some IS have been reported to contain partial outward-facing promoters that can drive downstream expression after insertion next to a resident complementary partial promoter site (Mahillon and Chandler, 1998), we tested IS1 and IS5 in the precise sequence context of p0 in which these IS were found in evolution experiments (Figure 3E).
In this sequence context, IS1 indeed increased reporter fluorescence, which depended on the 20 bp downstream of the insertion point within p0 (Figure 3E), consistent with the delivery of a half-promoter within the terminus of this IS. Insertion of IS5, which we repeatedly observed in evolution experiments, had very weak, but significant effects on downstream fluorescence on plasmids (Figure 3DE). To confirm the adaptive role of upstream IS5 insertions in the evolution experiments, we transduced one of the observed upstream IS5 insertions into the ancestral background, which restored growth on tetracycline and produced a marked increase in YFP fluorescence (Figure 3—figure supplement 3). Thus, in the chromosomal context, IS5 does increase expression of downstream genes, possibly due to effects on DNA bending (Zhang and Saier, 2009), which may not be recapitulated on the plasmid reconstruction. These results illustrate the diverse ways in which IS can adaptively affect gene expression, both dependent (IS1, IS5) and independent (IS2, IS3) of the insertion context. Given the reporter plasmid results and the fact that the same p0 sequence is part of the reporter cassette at all four chromosomal loci, point mutations and IS insertions likely were not responsible for the observed differences in the frequency of rescue between strains that are not explained by amplifications.
Whole genome sequencing of clones from three rescued populations with neither upstream genetic changes nor amplifications (Figure 3—figure supplement 4), as well as subsequent screening of other rescued populations, revealed another candidate type of adaptive mutations, which altered the protein sequence of rho (Figure 4—video 1). Unlike mutations of the other types, rho mutations occurred in trans with respect to the reporter cassette. The rho gene of E. coli is an essential gene that encodes a transcriptional termination factor estimated to be required for termination at around half of all termination sites in E. coli (Ciampi, 2006). In contrast to point mutations and IS insertions, which we found upstream of all four loci (Figure 3A and Figure 3—figure supplement 5), Rho mutations and upstream deletions were found only in evolved clones of strains with the reporter cassette at locus B or D, with the single exception of a Rho mutation co-occurring with an upstream IS insertion in strain A. Thus, upstream deletions and Rho mutations provide candidates for locus-dependent adaptive mutations. Comparing the upstream neighborhood of the four different loci revealed the basis of this locus-dependency (Figure 4A and Figure 4—figure supplement 1). The orientation and expression of upstream transcripts as determined in a different study (Conway et al., 2014) suggest that in strains B and D, active upstream promoters were co-opted to tetA-yfp, either by deletion of intervening genes, or by compromising Rho-dependent termination by partial-loss-of-function mutations in Rho that cause transcriptional read-through into tetA-yfp. At loci A and C, such adaptive mutations were not available because of two kinds of constraints from neighboring genes: either intervening genes were essential (constraining adaptive deletions, Figure 4A), or no upstream Rho-terminated transcripts were present (constraining adaptive Rho mutations, Figure 4A).
Since active transcripts shown in Figure 4A were experimentally determined under conditions different from our evolution experiments (Conway et al., 2014), and classification of termination sites as intrinsic or Rho-dependent was done only computationally (Kingsford et al., 2007; Conway et al., 2014), we experimentally assessed the effect of Rho mutations on transcriptional read-through across candidate upstream terminators at all four loci under experimental conditions approximating those in evolution experiments. We first confirmed the neighborhood-dependent effect of two different Rho mutations (S153F and M416I) on the phenotype of interest, that is, tetracycline resistance, by transduction into the ancestral IS-wt strains, which are isogenic except for the position of the reporter cassette (Figure 4B). Consistent with the presence of upstream Rho-terminated transcripts as shown in Figure 4A, an increased tolerance of Rho-mutants to tetracycline was observed only in strains with the reporter cassette at loci B and D, matching our observation that Rho-mutants were only found in rescued populations of these strains. We then performed PCR on cDNA prepared from a Rho-wt strain and a Rho mutant (M416I) strain grown in sub-inhibitory tetracycline (Figure 4C). We obtained PCR products consistent with read-through across candidate terminators upstream of locus B (downstream of yeeD) and locus D (mnmG), but not upstream of locus A (cysS) and locus C (xapR). A read-through transcript at locus D was detectable even in the Rho-wt background, which offers an explanation for the higher initial TetA-YFP expression observed in strain D (Figure 1—figure supplement 1). Mutations found in rescued populations of additional replicate experiments (fluorescence phenotypes in Figure 2C) are consistent with the above constraints on promoter co-opting mutations (Figure 3—figure supplement 5).
Thus, upstream deletions and trans mutations that compromise transcriptional termination are mutation types that depend on the chromosomal neighborhood of the gene under selection. Specifically, the orientation, expression, essentiality, and termination mode of neighboring genes shape the fitness effect of these promoter co-option mutations.
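The constraints summarized above can be written as a rule-based sketch that lists which promoter co-option routes a neighborhood permits. This is an illustrative simplification, not a method used in the study: the data structure, the gene names `geneX` and `geneY`, and the assumption that a promoter-capture deletion removes the donor gene together with all genes nearer to the selected gene are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UpstreamGene:
    name: str
    co_oriented: bool     # transcribed toward the selected gene
    expressed: bool       # promoter active under the conditions of interest
    essential: bool
    rho_terminated: bool  # terminator classified as Rho-dependent


def co_option_routes(upstream: List[UpstreamGene]) -> List[str]:
    """List available co-option routes; `upstream` is ordered nearest first."""
    routes = []
    # Read-through requires an active, co-oriented, Rho-terminated transcript
    # directly upstream of the selected gene.
    if upstream:
        nearest = upstream[0]
        if nearest.co_oriented and nearest.expressed and nearest.rho_terminated:
            routes.append(f"Rho read-through from {nearest.name}")
    # A deletion may not remove an essential gene (assumed model: the donor
    # gene and every nearer gene are deleted).
    for i, donor in enumerate(upstream):
        if any(gene.essential for gene in upstream[: i + 1]):
            break
        if donor.co_oriented and donor.expressed:
            routes.append(f"promoter-capture deletion at {donor.name}")
    return routes


permissive = [UpstreamGene("geneX", True, True, False, True)]
constrained = [UpstreamGene("geneY", True, True, True, False)]
print(co_option_routes(permissive))
print(co_option_routes(constrained))
```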
As seen from promoter co-opting mutations, chromosomal neighborhood may affect the adaptive potential of a gene by influencing not only mutation rates (as flanking homology does for duplications that can expand into amplifications), but also mutation fitness effects. We next asked if this applies to amplifications as well. Due to the instability of amplifications and related difficulties in detecting them, quantifying the fitness effect of amplifications is laborious (Adler et al., 2014) and has so far not been done on a genome-wide scale. The benefit of amplifying a selected gene is counteracted by a cost that arises in part due to dosage imbalances in the co-amplified neighboring genes. This cost limits the ability of amplifications to effectively expand at the population level as selection increases, an ability that comes from high rates of expansion of amplifications at the level of the individual chromosome by homologous recombination. The probability of an amplification to contain a costly gene is expected to increase with the length of the amplicon.
We used two-color fluorescence data to extract information on amplification cost and its effect on adaptation. For validating this approach, we used two strains (strain BΔIS5I, and the newly created strain E, Figure 5A) that we predicted to form amplifications of higher and lower cost, respectively, when compared to the IS-containing strain B, which serves as reference. The IS5 deletion in strain BΔIS5I, which reduces the rate of duplications that kick-start amplifications (see above), is also expected to increase the fitness cost of amplifications, since amplicons may be larger than the 35 kb between IS5I and IS5H. In strain E, we placed the reporter cassette between two copies of IS1, where duplications are expected to form frequently, and where amplifications, due to small amplicon size (11 kb), are expected to expand at low cost. In our experiment, if cost is negligible, amplifications are expected to expand continuously as tetracycline selection increases, resulting in rescue. In this case, YFP+CFP fluorescence increases in correlation with the level of tetracycline selection (Figure 5B, left). If amplifications are cost-limited, two outcomes are possible: (i) amplifications fail to expand beyond a certain level below that required for rescue, resulting in extinction – if the level of expansion before extinction is high enough, it will appear as a transient increase of CFP fluorescence in the fluorescence trajectories of extinct populations (Figure 5B, middle), (ii) amplifications allow rescue through interaction with other adaptive mutations that increase tetA-yfp expression – resulting in increased YFP/CFP fluorescence ratios in rescued populations (Figure 5B, right). We compared the numbers of extinct vs. rescued populations with (transiently) increased CFP fluorescence and found, as expected, that amplifications in strain BΔIS5I had a significantly higher extinction risk than amplifications in the reference strain B (Table 1, upper part), and rescued amplifications had significantly higher final YFP/CFP ratios (Figure 5C), confirming that low numbers of rescue in strain BΔIS5I were in part due to amplification costs. Contrariwise, populations with increased CFP fluorescence in strain E never went extinct (Table 1, upper part) and had consistently low final YFP/CFP ratios (Figure 5C), indicating the absence of a cost limitation.
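The three trajectory outcomes distinguished above can be expressed as a short decision rule. The sketch below is illustrative: the thresholds are hypothetical, and the YFP/CFP ratio is assumed to be normalized so that amplification alone gives a value near 1.

```python
def amplification_outcome(rescued, peak_cfp_fold, final_yfp_cfp_ratio,
                          cfp_threshold=1.5, ratio_threshold=1.5):
    """Interpret one population's trajectory in terms of amplification cost.

    peak_cfp_fold: highest OD-normalized CFP relative to the ancestor.
    final_yfp_cfp_ratio: final YFP/CFP ratio, normalized so that pure
    amplification gives about 1 (an assumption of this sketch).
    Both thresholds are placeholder values.
    """
    if peak_cfp_fold < cfp_threshold:
        return "no detectable amplification"
    if not rescued:
        # Transient CFP increase followed by extinction.
        return "cost-limited: amplification failed to expand"
    if final_yfp_cfp_ratio >= ratio_threshold:
        # Rescue required an additional mutation raising tetA-yfp expression.
        return "cost-limited: rescue with additional mutation"
    return "low cost: rescue by amplification alone"


print(amplification_outcome(rescued=True, peak_cfp_fold=8.0, final_yfp_cfp_ratio=1.0))
print(amplification_outcome(rescued=False, peak_cfp_fold=3.0, final_yfp_cfp_ratio=1.0))
print(amplification_outcome(rescued=True, peak_cfp_fold=3.0, final_yfp_cfp_ratio=4.0))
```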
Having validated extinction risk (Table 1) and final YFP/CFP ratios (Figure 5C) as indicators of amplification cost, we tested if neighborhood-dependent amplification costs had affected adaptation in strains A, C, and D. A significantly elevated extinction risk of populations with amplifications in strain A (Table 1, lower part), and significantly elevated final YFP/CFP ratios in connection with diverse additional mutations in strains A, C, and D (Figure 5D), support the conclusion that the costs of amplifications in these strains were higher than in strain B. Thus, amplification costs represent another neighborhood-dependent constraint on adaptation. In this perspective, the availability of neighborhood-dependent promoter co-option mutations, the most prevalent non-amplification mutation types at loci B and D, is an important determinant of adaptive potential not only in itself, but also in the interaction with amplifications.
We next investigated whether our observations of chromosomal neighborhood effects on adaptation transfer to different selective conditions. In particular, we tested the possibility that differences in rescue between strains were due to different population sizes and thus different chances for beneficial mutations to occur, rather than due to different mutation rates or fitness effects as we propose. Our experimental design corrects for population size differences between strains at the first day of selection (Figure 1—figure supplement 1, and first section of the results part), but not necessarily for population size differences at later days of the experiments. Therefore, we performed single-step plating experiments, in which approximately the same numbers of cells are plated for every strain. In these Luria-Delbrück-type experiments, we plated replicate cultures grown under non-selective conditions on solid medium with tetracycline at two-fold MIC levels. We scored the number of colonies on each plate after 2 days, when clearly visible colonies first appeared. These early colonies are expected to result mostly from pre-plating single-step mutations that increase tetA-yfp expression (point mutations, IS insertions, and promoter co-option mutations). As in evolution experiments, colony numbers in strains B and D were higher than in strains A and C (Figure 6, left), for both IS-wt and IS-free genetic backgrounds. This result is consistent with neighborhood-dependent availability of promoter co-option mutations as observed also in evolution experiments. High CFP fluorescence, indicative of amplifications, was observed only in a small fraction of early colonies (34 of 1661 across all strains and plates). During longer incubation, the number of colonies on plates of IS-wt strain B increased steadily (Figure 6—figure supplement 1) and almost all of these later colonies (1229/1304 on ten plates) showed high CFP fluorescence. 
Since tetracycline is bacteriostatic rather than bactericidal, the appearance of these late colonies can be explained by a continuous process of reporter cassette amplification expansion and increasing growth rates after plating on selective medium, starting from frequent duplications that have a slight growth advantage over single-copy cells (Andersson et al., 1998). After 5 days, colony counts on plates were qualitatively similar to rescue frequencies in evolution experiments, with IS-wt strain B giving the highest number of colonies (Figure 6, right). In all other tested strains, late colonies appeared at much lower rates (Figure 6—figure supplement 1) and did not show high CFP fluorescence in most cases (Figure 6, right, and Figure 6—figure supplement 2), reflecting the minor role of amplification in strains that lack flanking homology in the chromosomal neighborhood of the selected gene. The consistency between liquid-culture evolutionary rescue experiments and plating experiments supports the conclusion that strong effects of chromosomal neighborhood on the rate and fitness effect of adaptive mutations extend to different selective regimes.
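As a small worked example, the colony counts reported above translate into the following fractions of colonies with high CFP fluorescence. The helper name is arbitrary, and note that the two counts have different denominators in the text: early colonies are pooled across all strains and plates, whereas late colonies are from ten plates of IS-wt strain B.

```python
def cfp_high_fraction(cfp_high, total):
    """Fraction of scored colonies that show high CFP fluorescence."""
    if total <= 0:
        raise ValueError("total must be positive")
    return cfp_high / total


early = cfp_high_fraction(34, 1661)    # early colonies, all strains and plates
late = cfp_high_fraction(1229, 1304)   # late colonies, IS-wt strain B
print(f"early: {early:.1%}, late: {late:.1%}")
# early: 2.0%, late: 94.2%
```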
Our results reveal a complex genetic basis of strong effects of chromosomal position on the adaptive potential of a specific gene (Figure 7A). By combining time-resolved fluorescence data from the reporter cassette and end-point genetic analysis, we demonstrate how the relative contribution of previously known mutation types to adaptation (Figure 7A, bottom row) differs between chromosomal loci, how these differences arise, and how a layer of complexity is added by the interaction of mutation types. Thus, the concept of a one-dimensional mutation rate and a focus on point mutations can be misleading (Martinez and Baquero, 2000), even for the simple case of adaptation by increased expression of a single gene. Instead, the adaptive potential of a given gene is a system-level property shaped by the local chromosomal genetic neighborhood. Consequently, the organization of genes on a chromosome is both cause and consequence of evolutionary change.
Importantly, the effects that we describe arise from several properties (Figure 7A, top row) of different genetic elements that are present in the vicinity of the selected gene, rather than from more global factors such as distance to the origin of replication or chromosome macro-domain organization (Bryant et al., 2014). Therefore, we propose to refer to them as ‘chromosome neighborhood effects’ that determine the evolution of gene expression, as opposed to ‘chromosome position effects’ that modulate gene expression per se (Bryant et al., 2014; Levis et al., 1985; Akhtar et al., 2013).
In our experiments, chromosomal neighborhoods facilitate or constrain adaptation mainly via amplification and promoter co-option mutations, by affecting the rate of mutations (duplication-amplification) or the fitness effects of mutations (promoter co-option mutations and amplifications). For gene amplification, a strong effect of flanking homology as provided by IS, which are often present in multiple copies, has been known for a long time (Peterson and Rownd, 1985; Andersson and Hughes, 2009). Our data confirm that if flanking homology is present at a given locus, amplification is the main response to selection for increased gene expression. For loci lacking nearby flanking homology, which depending on the distribution of IS elements on a chromosome may be the majority of loci (Boyd and Hartl, 1997; Green et al., 1984), our data show that adaptation by amplification is limited on the level of duplication rate and fitness cost. For these loci, differences in the adaptive potential are largely due to the different availability of deletions and mutations compromising transcriptional termination, both of which co-opt upstream promoters to the selected gene. Such mutations also act in concert with amplifications and can alleviate amplification cost limitations by lowering the required fold-amplification to reach a certain level of expression of the selected gene (Figure 7A).
The multitude of mutations discovered in the termination factor Rho suggests that the function of this protein may be more ‘tunable’ than expected from it being an essential gene in E. coli. Our results may suggest that adaptation via trans mutations in Rho with potentially large pleiotropic effects is more likely than via local mutations that compromise upstream terminators in cis. Given that the sequence-dependence of Rho-dependent termination is poorly understood (Ciampi, 2006), there is no clear expectation of the nature and target size of mutations that would compromise Rho-dependent termination in cis. This makes it difficult to compare adaptation via mutations affecting Rho-dependent termination in cis versus trans. The adaptiveness of trans mutations in Rho despite their pleiotropic effects is supported by a previously characterized single amino-acid substitution in Rho, which was found to have large-scale effects on the E. coli transcriptome and to confer higher fitness in several environments (Freddolino et al., 2012). We found substitutions at 22 different amino acid residues mapping to various regions of the Rho protein structure (Skordalakes and Berger, 2003) (Figure 4—video 1 and Figure 4—video 1—source data 1), which largely expands the number of Rho residues found mutated in evolution experiments (Conrad et al., 2011). This supports the idea that operons delimitated by factor-dependent terminators may be rather fluid, providing a large source of variation for adaptation to changing environments. It remains to be seen whether different Rho alleles, by revealing ‘hidden’ transcriptional variation, serve as capacitors of adaptation (Masel, 2013) beyond laboratory evolution experiments.
For both amplifications and promoter co-opting mutations, the influence of the chromosomal neighborhood arises mechanistically from several simple properties of neighboring genes—their expression, orientation, transcriptional termination, essentiality, the presence or absence of flanking gene duplicates—and from the cost of neighboring gene co-amplification (Figure 7A, top row). If these properties are known at a genomic scale, inferring a chromosome-wide ‘map of adaptive potential’ becomes conceivable. An understanding of adaptive potential may help assess the risk of resistance evolution via overexpression of preexisting chromosomal genes (as opposed to via acquisition by horizontal transfer). Clearly, some properties of neighboring genes can be assessed on a genome-wide scale more easily (e.g. gene orientation) than others (e.g. gene essentiality or cost of genes when amplified). Once it becomes feasible to acquire data on all the main factors shaping adaptive potential, this data may improve efforts to predict specific adaptations.
As a first step toward this goal, we used published information on gene essentiality, and promoter and terminator locations (Conway et al., 2014) to assess how many E. coli genes (strain MG1655) are expected to reside in a chromosomal neighborhood associated with high adaptive potential (Figure 7B). Based on the most easily assessed properties (colored circles in Figure 7B), the chromosomal neighborhood of most genes (2295/4317) is expected to have a medium adaptive potential, comparable to that of locus D from this study.
Importantly, some properties of chromosomal neighborhoods are dynamic (rounded boxes in Figure 7A)—gene essentiality (Baba et al., 2006) and expression can be environment-dependent, and transposition causes rapid turnover of mobile element positions (Sawyer et al., 1987; Wagner, 2006). Therefore, the classification of chromosomal neighborhoods of genes according to adaptive potential as in Figure 7B needs to be understood as a snapshot in time reflecting particular conditions. Also, how adaptive potential translates into the actual likelihood of adaptation depends on population parameters and the precise selection scenario.
On evolutionary timescales, the dynamics of chromosomal neighborhood properties would rapidly degrade signals that neighborhood-dependent evolution leaves in genome sequences. Nevertheless, neighborhood-dependent evolution could offer mechanistic explanations for phenomena observed in genomic data such as operon organization (Reams and Neidle, 2004; Lawrence and Roth, 1996), reductive genome evolution by promoter capture-deletions as suggested previously (Lind et al., 2015), or the chromosomal position of horizontally transferred genes (Touchon et al., 2009). Since horizontally transferred genes carrying selective functions are often silenced after initial integration (Navarre et al., 2006; Cardinale et al., 2008), they depend on activating mutations to play out their benefit to the host and become stably maintained in the host chromosome. Thus, the evolutionary fate of horizontally transferred genes will be shaped by the new chromosomal neighborhood they find themselves in. For example, a drug resistance gene entering the genome at loci B or D via horizontal transfer will be more likely to enable survival of the host under drug selection, compared to insertion at loci A and C, both because of higher initial expression and the higher adaptive potential associated with these loci as described here. The common association of horizontally acquired genes with flanking mobile elements as in complex transposons and genomic islands (Dobrindt et al., 2004) may not only reflect the high transferability of such configurations, but also their high amplifiability, which may be of particular relevance for mis-expressed foreign genes.
Although our results reflect many specifics of prokaryote genome organization, the importance of promoter-capture mutations (ar-Rushdi et al., 1983), modulation of transcriptional read-through (Grosso et al., 2015) and gene amplification (Cole et al., 1992; Gajduskova et al., 2007) extends to cancer evolution and cases of rapid adaptation in higher organisms (Devonshire and Field, 1991). This implies that chromosomal neighborhood effects on evolution may be of wider significance and they could be investigated with similar reporter-based methods.
Unless noted otherwise, we obtained chemicals from Sigma-Aldrich (St. Louis, Missouri) and enzymes from New England Biolabs (Ipswich, Massachusetts). Evolution experiments and phenotyping tests were done in M9 medium supplemented with 2 mM MgSO4, 0.1 mM CaCl2, and 0.2% glucose and 0.2% casein hydrolysate as carbon sources (M9CG medium), unless noted otherwise. A list of oligonucleotides, strains, and plasmids is available in Supplementary file 3.
The reporter cassette (p0-RBS-tetA-yfp-pR-cfp) was assembled on a plasmid using a combination of standard cloning techniques, ligation chain reaction (Rouillard et al., 2004), and fusion PCR. For the p0 sequence upstream of tetA-yfp, we generated a random 188 bp nucleotide sequence matching the average GC content of E. coli (CCGGAAAGACGGGCTTCAAAGCAACCTGACCACGGTTGCGCGTCCGTATCAAGATCCTCTTAATAAGCCCCCGTCACTGTTGGTTGTAGAGCCCAGGACGGGTTGGCCAGATGTGCGACTATATCGCTTAGTGGCTCTTGGGCCGCGGTGCGTTACCTTGCAGGAATTGAGGCCGTCCGTTAATTTCC). We synthesized the sequence from oligonucleotides in a ligation chain reaction. The tetA sequence was taken from strain TKC (Sharan et al., 2009), and the yfp gene from plasmid pZA21-yfp (Lutz and Bujard, 1997). At the fusion point, we placed a 3xGS linker peptide between the C-terminus of TetA and the N-terminus of YFP. Between p0 and the start codon of tetA-yfp is a sequence containing a restriction site and a ribosomal binding site (GTCGACAGGAGGAATTCACC). We placed the p0-tetA-yfp sequence on plasmid pAH81-FRT-cfp (Haldimann and Wanner, 2001), upstream of the chloramphenicol resistance gene and the terminator-flanked pR-cfp gene. pR is a strong constitutive promoter originating from phage λ. We sequenced the full length of the reporter cassette on the resulting plasmid, pMS7. Replication of the pMS7 plasmid depends on the Pir protein and the plasmid was propagated in a pir-containing version of strain DH5α.
We moved the ΔtolC::kan allele from E. coli strain JW5503-1 into strain MG1655 using P1 transduction. For the IS-free genetic background, the same ΔtolC::kan allele was introduced into strain MDS42 (Pósfai et al., 2006) by recombineering (Thomason et al., 2014) with pKD13 (Datsenko and Wanner, 2000) as PCR template. kanR cassettes were removed using plasmid pCP20 (Datsenko and Wanner, 2000). We inserted the reporter cassette from plasmid pMS7 into the two ΔtolC strains by recombineering. Precise insertion points are given in Figure 1—source data 1. All reporter cassette genes point toward the terminus of replication. Recombinants were selected on LB agar with chloramphenicol (10 µg/mL). The chloramphenicol marker was subsequently removed (Datsenko and Wanner, 2000). We confirmed the presence of the full-length single copy insertion by PCR and verified the sequence of p0-tetA-yfp by sequencing. The presence of functional pR-cfp was confirmed by observing fluorescence. To obtain strain BΔIS5I, the camR cassette from pKD3 (Datsenko and Wanner, 2000) was recombineered into the IS5I element of strain B. Recombinants were selected with chloramphenicol (10 µg/mL) and confirmed by PCR. Deletion of the reporter cassette genes in evolved clones was done by recombineering the kanR cassette of pKD13 into the reporter cassette such that the coding regions of both tetA-yfp and cfp were disrupted. Deletions were confirmed by absence of fluorescence and PCR with flanking primers (Figure 2—figure supplement 2). For P1 transduction of rho mutations, we first transduced mutations S153F and M416I from rescued clones of populations of strain D into MG1655. As selective marker, we used a kanR cassette that we had inserted upstream of rho by recombineering. After sequence verification, we transduced rho mutations into IS-wt strains A-D.
Strains were pre-grown for 16 hr in M9CG medium without tetracycline and transferred to 96-well plates (200 µL/well). From there, we pin-diluted cultures with a VP408 pin replicator (V and P Scientific, San Diego, California, dilution factor ~1:820, tested with fluorescein) into fresh medium with different concentrations of tetracycline, incubated plates for 24 hr at 37°C on a Titramax plateshaker (Heidolph, Schwabach, Germany, 900 rpm), shook plates for 20 s at 1200 rpm and measured OD600 with an H1 platereader (Biotek, Winooski, Vermont). To obtain fine-scale MIC measurements, we tested tetracycline concentrations at intervals of 0.125 µg/mL. We defined MIC as the lowest drug concentration that yielded OD600 ≤ 0.075 (plate reader units) in three replicates performed on different days.
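The MIC definition above can be illustrated with a minimal Python sketch. It assumes that "in three replicates" means all replicates at a given concentration must stay at or below the OD600 cutoff. The function name and the example readings are hypothetical and are not taken from the study's data or scripts.

```python
from typing import Dict, List, Optional

OD_CUTOFF = 0.075  # OD600 threshold from the text (plate reader units)

def mic_from_replicates(od_by_conc: Dict[float, List[float]],
                        cutoff: float = OD_CUTOFF) -> Optional[float]:
    """Return the lowest concentration at which every replicate has OD600 <= cutoff."""
    for conc in sorted(od_by_conc):
        if all(od <= cutoff for od in od_by_conc[conc]):
            return conc
    return None  # growth was observed at every tested concentration

# Hypothetical readings: tetracycline (µg/mL) -> OD600 of three replicates
example = {
    0.750: [0.41, 0.39, 0.44],
    0.875: [0.12, 0.06, 0.05],  # one replicate still grows
    1.000: [0.05, 0.04, 0.06],
    1.125: [0.04, 0.04, 0.05],
}
print(mic_from_replicates(example))
```

With these made-up readings, 0.875 µg/mL is rejected because one replicate exceeds the cutoff, so the next concentration on the 0.125 µg/mL grid is reported.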
All precultures and evolution experiments were performed in M9CG medium. We transferred an overnight culture of every strain into 95 wells of clear flat bottom 96-well plates (200 µL/well), from where we diluted cultures into medium with tetracycline using VP408. One well contained a growth medium control. As initial concentration of tetracycline, we used half of the strain-specific MIC. For 10 days, we pin-diluted cultures with VP408 every 24 hr into medium with geometrically increasing tetracycline concentrations such that at day 10 the concentration was 10 times the initial concentration (Figure 1D). During the experiment, the maximum number of generations was set by the daily dilution factor (~1:820) and was ~97. A fresh tetracycline stock solution was prepared from powdered tetracycline-HCl every day. All incubations were done at 900 rpm on a plate shaker at 37°C in the dark and plates were wrapped in plastic bags to mitigate evaporation. Replicate evolution experiments were performed with two additional 96-well plates for each of strains A, B, C, and D (IS-wt). Each 96-well plate was started from a culture inoculated with a different colony. At the end of experiments, we froze all rescued populations.
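The arithmetic behind the generation count and the drug ramp can be sketched as follows. The sketch assumes that each daily 1:820 dilution allows at most log2(820) doublings, and that day 1 uses the initial concentration so that nine equal multiplicative steps reach a 10-fold increase on day 10. The second assumption is one reading of the schedule, and the function names are illustrative only.

```python
import math

DILUTION_FACTOR = 820  # approximate daily pin dilution (~1:820) from the text
DAYS = 10
FOLD_INCREASE = 10.0   # final concentration relative to the initial one

def max_generations(dilution: float = DILUTION_FACTOR, days: int = DAYS) -> float:
    """Upper bound on generations: log2(dilution) doublings per daily transfer."""
    return days * math.log2(dilution)

def tetracycline_schedule(mic: float, days: int = DAYS,
                          fold: float = FOLD_INCREASE) -> list:
    """Geometric ramp starting at half the MIC (assumes days - 1 equal steps)."""
    start = 0.5 * mic
    step = fold ** (1.0 / (days - 1))
    return [start * step ** day for day in range(days)]

print(round(max_generations()))  # about 97, matching the value in the text
for day, conc in enumerate(tetracycline_schedule(mic=1.0), start=1):
    print(day, round(conc, 3))
```

Under these assumptions the daily multiplication factor is 10^(1/9), roughly 1.29, so the concentration crosses the ancestral MIC between days 3 and 4.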
Every day during the evolution experiment, after using 24-hr-old cultures for inoculating fresh medium with a higher tetracycline concentration using VP408, we shook the old plates for 20 s at 1200 rpm to resuspend cells and measured OD600 and reporter fluorescence with an H1 Platereader (Biotek, Winooski, Vermont; excitation/emission: YFP 515/545 nm / gain 100; CFP 433/475 nm / gain 60).
Populations were classified as rescued if OD600 exceeded 0.075 (plate reader units) at the end of the experiment. Fluorescence values were normalized to OD600 and set to zero if OD600 fell below 0.075. As reference for calculating the fold-increase in fluorescence, we took the average OD-normalized fluorescence of 95 cultures of the respective ancestral strain, inoculated in the same way as described for the beginning of evolution experiments, and grown in 96-well plates for 24 hr without tetracycline. Rescued populations were classified as YFP or YFP+CFP if the observed fold increase in respective fluorescence over the ancestor was >2.77 at the end of the experiment. This threshold corresponds to the lowest observed increase in YFP fluorescence that was sufficient for rescue in the first set of replicate experiments (IS-wt strains A, B, C, and D). To identify populations that went extinct despite elevated YFP and/or CFP fluorescence we applied more stringent criteria, requiring increased fluorescence (fold increase >2.77) for at least 2 days at which OD was >0.3 (platereader units). These criteria were used to exclude extinct populations that were false positive for increased fluorescence due to low OD600 values prior to extinction. Rescued populations that met the more stringent criteria for elevated CFP fluorescence, but that did not show elevated CFP fluorescence at the end of the experiment (final fold increase <2.77), were counted as amplifications for cost analysis (Figure 5 and Table 1), but not for Figure 2. For calculating final YFP/CFP ratios of rescued amplifications, we used internal plate reader fluorescence units directly. A Matlab script used to perform the above analysis is available as a supplementary file along with the platereader raw data used as input for the script (Source code 1). Plots of fluorescence trajectories of every population can be found in Supplementary file 1 and phenotype classifications in Supplementary file 2.
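The end-point classification rules above can be summarized in a short Python sketch. This is not the Matlab script from Source code 1. The function names, the "other" label for rescued populations without elevated YFP, and the example numbers are assumptions made for illustration.

```python
OD_CUTOFF = 0.075      # rescue threshold on OD600 (plate reader units)
FOLD_THRESHOLD = 2.77  # minimum fold increase over the ancestor

def fold_increase(fluorescence: float, od600: float, ancestor_norm: float) -> float:
    """OD-normalized fluorescence relative to the ancestor; zero below the OD cutoff."""
    if od600 < OD_CUTOFF:
        return 0.0
    return (fluorescence / od600) / ancestor_norm

def classify_endpoint(od600: float, yfp: float, cfp: float,
                      anc_yfp_norm: float, anc_cfp_norm: float) -> str:
    """Classify a population from its final-day OD600 and fluorescence readings."""
    if od600 <= OD_CUTOFF:
        return "extinct"
    yfp_up = fold_increase(yfp, od600, anc_yfp_norm) > FOLD_THRESHOLD
    cfp_up = fold_increase(cfp, od600, anc_cfp_norm) > FOLD_THRESHOLD
    if yfp_up and cfp_up:
        return "YFP+CFP"  # consistent with amplification of the whole cassette
    if yfp_up:
        return "YFP"      # increased tetA-yfp expression only
    return "other"        # rescued without elevated YFP (label is hypothetical)

# Hypothetical readings; ancestor OD-normalized YFP = 100, CFP = 1000
print(classify_endpoint(0.5, 200.0, 600.0, 100.0, 1000.0))
print(classify_endpoint(0.5, 200.0, 2000.0, 100.0, 1000.0))
```

The stricter two-day criterion used for extinct populations is not covered here; it would require the full trajectory rather than the end-point values.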
We inoculated samples of all rescued populations that we had chosen for sequencing from the first set of replicate experiments and that had a YFP+CFP fluorescence phenotype. We inoculated 2 mL M9CG with 10 µL of populations that were frozen at the end of the evolution experiment. The large inoculum was used to maintain amplification-related population diversity. We added the same amount of tetracycline as on the last day of evolution experiments to maintain amplifications. From all cultures that were turbid after overnight incubation, we isolated genomic DNA (gDNA). Ancestor gDNA was isolated from cultures without tetracycline. We performed qPCR using the GoTaq qPCR mastermix (Promega, Madison, Wisconsin) and a C1000 instrument (Bio-Rad, Hercules, California). Using dilution series of one of the gDNA extracts as template, we confirmed that all primer pairs had an amplification efficiency >90%. We quantified the copy number of tetA in each sample with the ΔΔCq method implemented in the instrument software (Bio-Rad), taking amplification efficiency into account. As reference, we used loci equidistant from the origin of replication and compared ratios of the measured and reference locus to the ratio of the same two loci in the ancestral DNA. qPCR was done in three technical replicates.
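For orientation, an efficiency-corrected relative quantification of the kind described above can be sketched in Python. This is a generic calculation and not the algorithm of the instrument software, which we do not reproduce here. The function name, the example Cq values, and the expression of efficiency as an amplification factor (2.0 for 100%, so >90% corresponds to >1.9) are assumptions.

```python
from statistics import mean
from typing import Sequence

def relative_copy_number(target_sample: Sequence[float], ref_sample: Sequence[float],
                         target_anc: Sequence[float], ref_anc: Sequence[float],
                         eff_target: float = 2.0, eff_ref: float = 2.0) -> float:
    """Copy number of the target locus in an evolved sample relative to the ancestor.

    Each argument holds the Cq values of the technical replicates. The target/reference
    ratio of the sample is compared with the same ratio in ancestral gDNA.
    """
    d_target = mean(target_anc) - mean(target_sample)  # earlier Cq means more template
    d_ref = mean(ref_anc) - mean(ref_sample)
    return (eff_target ** d_target) / (eff_ref ** d_ref)

# Hypothetical Cq values: tetA crosses threshold 3 cycles earlier in the evolved sample
copies = relative_copy_number(
    target_sample=[17.0, 17.0, 17.0], ref_sample=[20.0, 20.0, 20.0],
    target_anc=[20.0, 20.0, 20.0], ref_anc=[20.0, 20.0, 20.0],
)
print(copies)
```

With perfect efficiency and an unchanged reference locus, a 3-cycle shift corresponds to an 8-fold higher tetA copy number.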
We searched 400 kb around loci A-D for homologous sequences on either side using REPuter (Kurtz et al., 2001) with the following search criteria: forward repeats ≥ 200 bp, Hamming distance ≤5.
We streaked all rescued populations of strains A, C, D (IS-wt), of strains B and D (IS-free), and of strain BΔIS5I for single colonies on LB agar. For IS-wt strain B, we analyzed one rescued population that had a YFP-only fluorescent phenotype, two YFP+CFP populations with unusual fluorescence trajectories and 11 randomly chosen populations from the remaining 74 YFP+CFP rescued populations, which had highly similar fluorescence trajectories. Colony-PCRs were performed on a single representative clone of each streak. We amplified at least 1.5 kb of the region upstream of the tetA start codon. The size of PCR products was checked for insertions or deletions on an agarose gel. Sequences were obtained using primer tetA_pseq1_f. If no PCR product was obtained, we performed arbitrary PCR with primer tetA_pseq2_f and a random primer, arb1 or arb6, for upstream binding. We then did a second PCR with a nested primer tetA_arb2 and primer arb2 using the first PCR product as template, and sequenced DNA extracted from the largest distinct band on an agarose gel. The full-length sequence of the rho gene was amplified and sequenced with primers rho_seq_f and rho_seq_r. For additional replicate evolution experiments, we sequenced clones of all rescued populations with a YFP fluorescence phenotype and with a YFP+CFP fluorescence phenotype showing high final YFP/CFP ratios. In four cases, we identified the exact same mutation in clones isolated from two populations that had been in neighboring wells during evolution experiments. In order to ensure that a potential cross-contamination between these two wells did not influence results, we excluded one of each pair of such neighboring populations from all analyses.
Colony PCR for amplification junctions was performed with primers IS5I_flank_f and IS5H_flank_r on single colonies of 16/16 evolved populations of strain B. For the data shown in Figure 2D, we used gDNA previously isolated from populations for qPCR to ensure a comparable amount of PCR template in all reactions.
We isolated gDNA from overnight cultures of single clones of four rescued D populations as well as of the ancestral D strain grown in LB. A whole genome library was prepared and sequenced by GATC biotech (Konstanz, Germany) on an Illumina sequencer (125 bp reads). Fastq files were analyzed with the breseq script (Barrick et al., 2014). We used the MG1655 genome (Genbank accession number U00096.3) as a reference for assembling the ancestral D genome, which then served as a reference for analyzing the genomes of the evolved clones. Fastq files are available at: http://dx.doi.org/10.15479/AT:ISTA:65.
For building the reference plasmid pAnc, which reports on expression from the ancestral p0 sequence, we exchanged the pLtetO-1 promoter and RBS of pZA21-yfp for the p0-RBS sequence upstream of tetA-yfp in the reporter cassette. Using a Q5 site-directed mutagenesis kit (New England Biolabs) with pAnc as template, we reconstructed small mutations (substitutions and small insertions and deletions, Figure 3C). We did the same with the terminal 50 bp of IS1 (5’ terminus) and IS5 (3’ terminus), which we put instead of the 50 bp of p0 in the exact position where insertions were found in the experiment (Figure 3E). To confirm the IS1-p0 hybrid promoter, we exchanged 20 bp of p0 downstream of the IS1 insertion point in the respective reporter plasmid. The 20 bp were replaced by a randomly shuffled sequence composed of the same nucleotides. For the other IS reporter plasmids (Figure 3D), we PCR-amplified the last 600 bp of IS and cloned them into the XhoI/EcoRI sites of pZA21-yfp. The orientation of the truncated IS corresponds to that found in sequenced clones. As autofluorescence control, we removed the YFP fragment between EcoRI and MfeI restriction sites of pZA21-yfp and obtained pZA21-empty by religation of compatible ends. All changes were sequence-verified. Cloning and reporter measurements were done in strain NEB 5 alpha (New England Biolabs).
We grew six replicate overnight cultures of the reporter plasmid strains in LB Kanamycin (50 µg/mL) in a 96-well plate and diluted them into M9CG supplemented with Kanamycin using a VP407 pin replicator (approximate dilution factor 1:100). Diluted cultures were shaken and incubated at 37°C in the platereader and OD600 and YFP fluorescence was monitored every 10 min (YFP gain 120). YFP readings were normalized to OD600 and averaged for each replicate at all timepoints at which OD600 was between 0.20 and 0.25 (platereader units, that is, mid-exponential phase).
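The normalization step can be written as a small Python function. It is a sketch under the assumption that the OD window bounds are inclusive. The function name and the example time series are hypothetical.

```python
from typing import Optional, Sequence

def mean_normalized_yfp(od600: Sequence[float], yfp: Sequence[float],
                        od_min: float = 0.20, od_max: float = 0.25) -> Optional[float]:
    """Average YFP/OD600 over all timepoints with OD600 inside [od_min, od_max]."""
    values = [f / od for od, f in zip(od600, yfp) if od_min <= od <= od_max]
    if not values:
        return None  # the culture never passed through the mid-exponential window
    return sum(values) / len(values)

# Hypothetical time series for one replicate (readings every 10 min)
od_series = [0.10, 0.20, 0.25, 0.30]
yfp_series = [10.0, 40.0, 50.0, 90.0]
print(mean_normalized_yfp(od_series, yfp_series))
```

Only the second and third timepoints fall inside the window in this example, so the result is the mean of their two OD-normalized readings.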
Clones and strains to be tested were pregrown overnight in M9CG and diluted as shown in Figure 2B and Figure 3—figure supplement 3. We spotted 2.5 µL of diluted cultures on M9CG agar plates. After 24 hr incubation at 37°C, we took YFP fluorescence images of plates using a lab-made macroscope (http://openwetware.org/wiki/Macroscope). The macroscope uses a Canon EOS 600D digital camera and a Canon EF-S 60 mm f/2.8 Macro USM lens (Canon, Tokyo, Japan). For illumination, we used a Cyan (505 nm) Rebel LED (Luxeon Star LEDs, Brantford, Canada) with a HQ500/20x excitation filter (Chroma, Bellows Falls, Vermont). As emission filter we used a camera-mounted D530/20 filter (Chroma).
Stationary cultures of MG1655 ∆tolC (rho-wt) and of the isogenic strain with the rho M416I mutation in LB were diluted 1:100 in M9CG supplemented with tetracycline (0.44 µg/mL, that is, 50% of the MIC of strain MG1655 ∆tolC) and grown overnight at 37°C with shaking. Total RNA was isolated using an Aurum Total RNA Mini kit (Bio-Rad) and DNA removed using an Ambion DNA-free kit (Life Technologies, Carlsbad, California). Isolated RNA was quantified using a Nanodrop spectrophotometer and integrity was checked on an agarose gel. cDNA was synthesized using an iScript cDNA synthesis kit (Bio-Rad) with 1 µg of total RNA as input in a 20 µL reaction. For the non-reverse-transcriptase (NRT) control reaction we used 0.5 µg of each of the two RNA samples.
After reverse transcription, cDNA samples and the NRT control sample were diluted by adding 150 µL of nuclease-free water. Endpoint PCR to test for the presence of transcripts resulting from possible read-through across Rho-dependent terminators was done with a OneTaq Quick-Load Mastermix (New England Biolabs), using 1 µL of diluted cDNA or NRT control as template in a 50 µL reaction. To detect rare transcripts, we used 45 amplification cycles. As a positive control template in PCR reactions, we used 1 µL of a colony of strain MG1655 ∆tolC resuspended in 25 µL water and heated to 95°C for 4 min. For agarose gel visualization, we loaded 15 µL of cDNA and NRT control PCR reactions and 2 µL of the positive control PCR reactions.
In several cases, fluorescence analysis and sequencing revealed two potentially adaptive mutations in the same clone/population (colored dots on top of bars in Figure 3A and Figure 5—figure supplement 1). To infer which mutation came first, we proceeded as follows. For amplifications that occurred in combination with point mutations, we examined sequence chromatograms obtained from single clones. In all three cases, point mutations appeared as mixed nucleotide peaks, indicating that amplifications were initiated before the point mutations occurred. In two cases of amplifications co-occurring with upstream IS insertions, insertions occurred first. This is evident since PCR products used for sequencing appear as single bands of larger size than expected on agarose gels, whereas later insertions are expected to give two bands: a smaller one for copies without the insertion and a larger one for copies having the insertion. In one case, the insertion of IS3 upstream of locus C was a prerequisite for amplification initiation, as we could show by PCR that the IS3 insertion was at the amplicon junction. Cases of co-occurrence of amplifications with deletions or Rho-mutations were decided based on fluorescence trajectories. YFP/CFP ratios that remained high and relatively constant throughout the experiment indicate that amplifications expanded only after the other mutation had occurred. YFP/CFP ratios that increase at an intermediate timepoint during the experiment indicate that amplifications were initiated first. Last, we assume that a Rho mutation in strain A was selected only after the insertion of an upstream IS5 element, since Figure 4B indicates that Rho mutations alone would not have been adaptive in strain A. Rather, we assume that the Rho mutation enhanced transcriptional read-through from IS5 into the reporter cassette.
Dots-on-bar color assignments in Figure 3—figure supplement 5 do not reflect the order of mutations, as we did not do such analysis for additional replicate experiments.
Essentiality data for upstream protein-coding genes (Figure 4A) were taken from a published dataset (Baba et al., 2006). We did not find data on the essentiality of the valU tRNA operon upstream of locus C in the literature. Therefore, we tested experimentally whether deletions of the complete valU operon are tolerated. We attempted to delete the operon using recombineering with pKD13 as template plasmid for a kanR cassette and primers valU_ko_f and valU_ko_r. The number of colonies on the valU knockout selection plate was more than tenfold lower than that of a control knockout of the neighboring xapR gene with primers xapR_ko_f and xapR_ko_r. To exclude the possibility that the low number of recombinants was due to a hairpin structure contained in the valU_ko_r primer, we repeated recombineering with a different reverse primer, valU_ko_r2, and obtained similar results. The low recombineering efficiency was not due to a smaller amount of PCR product used in transformations. Of six tested colonies obtained on the ΔvalU::kanR selection plate, only one colony gave a PCR product of the expected size in a test with flanking primers, showing that 5/6 colonies are not true valU knockouts. This suggests that valU deletion mutants require rare compensatory mutations to restore growth. Therefore, the valU operon was considered essential.
We inoculated 1 mL of LB with a single colony of strains to be tested. After overnight incubation, saturated cultures were diluted 1:1000 into experimental evolution medium without tetracycline, and then split into 10 wells of a 96-well plate (220 µL / well). The 96-well plate was incubated on a plate shaker at 37°C for 24 hr to obtain saturated cultures, of which 180 µL containing approximately 2 × 10^8 cells were plated on M9CG medium with tetracycline at a concentration two times the MIC of respective strains (cell numbers were determined by plating dilutions on non-selective medium). Plates were incubated at 37°C in the dark and colonies counted every 24 hr. After 2 days, we picked one colony from every plate that had at least one colony on it and inoculated 200 µL of M9CG medium in a 96-well plate with the picked colony. After 24 hr incubation at 37°C, we used the VP407 pinner to spot approximately 2 µL on M9CG agar plates. After another 24 hr incubation, we took CFP fluorescence images of plates with the macroscope (see ‘Tetracycline resistance phenotyping on solid medium’). For illumination, we used a Royal Blue (447.5 nm) Rebel LED (Luxeon Star LEDs) with a D436/20x excitation filter (Chroma). As emission filter we used a camera-mounted D480/40m filter (Chroma). The mean intensity of pixels of each spot was quantified. Spots with intensity six times greater than the mean intensity of all ancestor spots are considered to have amplifications (Figure 6—figure supplement 2).
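The amplification call from CFP spot intensities reduces to a single threshold comparison, sketched here in Python. Image segmentation and per-spot pixel averaging are not shown. The function name and the intensity values are hypothetical.

```python
from statistics import mean
from typing import Dict, Sequence

AMPLIFICATION_FACTOR = 6.0  # threshold relative to the ancestor mean, from the text

def call_amplifications(spot_intensity: Dict[str, float],
                        ancestor_intensity: Sequence[float],
                        factor: float = AMPLIFICATION_FACTOR) -> Dict[str, bool]:
    """Flag spots whose mean CFP intensity exceeds factor x the ancestor mean."""
    threshold = factor * mean(ancestor_intensity)
    return {name: value > threshold for name, value in spot_intensity.items()}

# Hypothetical mean pixel intensities (arbitrary camera units)
ancestors = [100.0, 110.0, 90.0]
colonies = {"plate1": 650.0, "plate2": 580.0}
calls = call_amplifications(colonies, ancestors)
print(calls)
```

In this example the ancestor mean is 100, so only the colony with intensity above 600 is scored as carrying an amplification.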
To test for homogeneity in the distribution of rescued vs. extinct populations, fluorescence phenotypes and mutation types, r × c Fisher’s exact test for Count Data was used (fisher.test function in R [R Core Team, 2012]). For testing the distribution of mutation types, we used types indicated in Figure 3A by bar color, not dot color. For testing 2 × 2 contingency tables, Fisher’s exact test was used with an alternative hypothesis of odds ratio ≠ 1. Permutation tests were performed with the perm package (Fay and Shaw, 2010) for R (permTS function, method=‘exact.mc’, 10^4 Monte Carlo replications, two-sided).
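To make the 2 × 2 case concrete, the following Python sketch computes a two-sided Fisher's exact p-value from the hypergeometric distribution, summing the probabilities of all tables that are no more likely than the observed one. It is an illustration only: the analyses in this study used R's fisher.test, the r × c case and the permutation tests are not covered, and the example counts are made up.

```python
from math import comb

def fisher_exact_2x2(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x counts in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_observed * (1 + 1e-7))
    return min(1.0, p_value)

# Hypothetical counts: rescued vs. extinct populations for two strains
print(fisher_exact_2x2(3, 1, 1, 3))
```

For the small table shown, the admissible top-left counts have probabilities 1/70, 16/70, 36/70, 16/70 and 1/70, and the observed table has probability 16/70, so the two-sided p-value is 34/70.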
We used the Profiling of E. coli Chromosome (PEC) database available at https://shigen.nig.ac.jp/ecoli/pec/genes.jsp (accession number UA00096.2) and included all 4317 genes (feature type ‘gene’) of E. coli MG1655 with essentiality information in our analysis, which excludes non-coding genes. The position and orientation of promoters was extracted from Table S2 of the same study used to identify candidate transcripts in Figure 4A (Conway et al., 2014). We only included promoters annotated as ‘primary’ promoters in our analysis. The ‘Promoter Confidence Score’ was not taken into account. The position, orientation, and termination mode (intrinsic or non-intrinsic) of all terminators was extracted from Table S3 of the same study (Conway et al., 2014). In order to identify all genes downstream of Rho-dependent terminators (green circle in Figure 7B), we identified the closest upstream co-oriented terminator of every gene and evaluated whether it was predicted to be an intrinsic terminator or not, in which case we assumed it is Rho-dependent. In order to identify all genes to which a co-oriented upstream promoter could be co-opted by deletion without disrupting an essential gene, we first identified the next essential upstream gene of every gene, and then evaluated if there is at least one co-oriented promoter and intervening co-oriented terminator between the gene of interest and the next upstream essential gene. If this was the case, the gene of interest was included in the respective set of genes (magenta circle in Figure 7B). In order to identify genes between flanking duplicates (blue circle in Figure 7B), we used the online REPuter tool (Kurtz et al., 2001) to find all forward repeats on the chromosome that satisfied the following criteria: repeat length ≥200 bp, Hamming distance ≤8, maximum distance between repeats 100 kb, minimum distance between repeats 200 bp. 
In this way, we identified four large regions of the MG1655 chromosome between flanking repeats: between IS1B and IS1C (containing 13 genes and locus E), between IS5H and IS5I (containing 42 genes and locus B), and between the ribosomal operons rrnA and rrnC (81 genes), and rrnB and rrnE (31 genes). We also obtained six genes between closely spaced repeats matching our criteria (ybfB, ybfL, yibA, ldrA, ldrB, and ldrC), which we did not include in the set ‘between flanking duplicates’, since the behavior of such closely spaced repeats might be different than those studied in our system. The Venn diagram was drawn in Matlab using the ‘ChowRodgers’ method for sizes of circles and intersection areas. A list of all included genes and their assignment to the three sets is available in Figure 7—source data 1.
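The rule for assigning genes to the set downstream of Rho-dependent terminators can be sketched in Python. The sketch assumes linear coordinates (wrap-around at the chromosome origin is ignored), treats every non-intrinsic terminator as Rho-dependent as described above, and uses hypothetical class names, gene names, and coordinates rather than the annotations of Conway et al. (2014).

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Gene:
    name: str
    start: int   # leftmost coordinate
    end: int     # rightmost coordinate
    strand: str  # '+' or '-'

@dataclass
class Terminator:
    pos: int
    strand: str
    intrinsic: bool

def closest_upstream_terminator(gene: Gene,
                                terminators: List[Terminator]) -> Optional[Terminator]:
    """Nearest co-oriented terminator upstream of the gene (no wrap-around)."""
    same_strand = [t for t in terminators if t.strand == gene.strand]
    if gene.strand == "+":
        upstream = [t for t in same_strand if t.pos < gene.start]
        return max(upstream, key=lambda t: t.pos, default=None)
    # On the minus strand, upstream means higher coordinates
    upstream = [t for t in same_strand if t.pos > gene.end]
    return min(upstream, key=lambda t: t.pos, default=None)

def downstream_of_rho_dependent(gene: Gene, terminators: List[Terminator]) -> bool:
    """True if the closest upstream co-oriented terminator is not intrinsic."""
    terminator = closest_upstream_terminator(gene, terminators)
    return terminator is not None and not terminator.intrinsic

# Hypothetical toy annotation
terms = [Terminator(900, "+", True), Terminator(1500, "+", False),
         Terminator(5200, "-", True)]
genes = [Gene("geneX", 2000, 2900, "+"), Gene("geneY", 4000, 5000, "-")]
print([(g.name, downstream_of_rho_dependent(g, terms)) for g in genes])
```

The same pattern, applied to promoters and to the next essential upstream gene, would give the set of genes to which an upstream promoter could be co-opted by deletion. That extension is not shown here.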
Fastq files of whole genome sequencing of IS-free strain D (ancestor) and of evolved clones from four evolved populations of this strain (A11, C08, C10, D08) have been deposited in the IST Data Repository, http://dx.doi.org/10.15479/AT:ISTA:65.
High fitness costs and instability of gene duplications reduce rates of evolution of new genes by duplication-divergence mechanisms. Molecular Biology and Evolution 31:1526–1535. https://doi.org/10.1093/molbev/msu111
Gene amplification and adaptive evolution in bacteria. Annual Review of Genetics 43:167–195. https://doi.org/10.1146/annurev-genet-102108-134805
Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Molecular Systems Biology 2:2006.0008. https://doi.org/10.1038/msb4100050
Nonrandom location of IS1 elements in the genomes of natural isolates of Escherichia coli. Molecular Biology and Evolution 14:725–732. https://doi.org/10.1093/oxfordjournals.molbev.a025812
Chromosome position effects on gene expression in Escherichia coli K-12. Nucleic Acids Research 42:11383–11392. https://doi.org/10.1093/nar/gku828
Rho-dependent terminators and transcription termination. Microbiology 152:2515–2528. https://doi.org/10.1099/mic.0.28982-0
Microbial laboratory evolution in the era of genome-scale science. Molecular Systems Biology 7:509. https://doi.org/10.1038/msb.2011.42
Gene amplification and insecticide resistance. Annual Review of Entomology 36:1–21. https://doi.org/10.1146/annurev.en.36.010191.000245
Genomic islands in pathogenic and environmental microorganisms. Nature Reviews Microbiology 2:414–424. https://doi.org/10.1038/nrmicro884
Copy number change: evolving views on gene amplification. Future Microbiology 8:887–899. https://doi.org/10.2217/fmb.13.53
Exact and asymptotic weighted Logrank tests for Interval Censored Data: The interval R package. Journal of Statistical Software, 36, 10.18637/jss.v036.i02, 25285054.
Effect of chromosome location on bacterial mutation rates. Molecular Biology and Evolution 19:85–92. https://doi.org/10.1093/oxfordjournals.molbev.a003986
Site-specific chromosomal integration of large synthetic constructs. Nucleic Acids Research 38:e92. https://doi.org/10.1093/nar/gkp1193
REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Research 29:4633–4642. https://doi.org/10.1093/nar/29.22.4633
The challenge of efflux-mediated antibiotic resistance in Gram-negative bacteria. Clinical Microbiology Reviews 28:337–418. https://doi.org/10.1128/CMR.00117-14
The probability of evolutionary rescue: towards a quantitative comparison between theory and evolution experiments. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 368:20120088. https://doi.org/10.1098/rstb.2012.0088
Opposing effects of target overexpression reveal drug mechanisms. Nature Communications 5:1–8. https://doi.org/10.1038/ncomms5296
R: A Language and Environment for Statistical Computing. Vienna, Austria.
Selection for gene clustering by tandem duplication. Annual Review of Microbiology 58:119–142. https://doi.org/10.1146/annurev.micro.58.030603.123806
Gene amplification and genomic plasticity in prokaryotes. Annual Review of Genetics 31:91–111. https://doi.org/10.1146/annurev.genet.31.1.91
Gene2Oligo: oligonucleotide design for in vitro gene synthesis. Nucleic Acids Research 32:W176–180. https://doi.org/10.1093/nar/gkh401
Antibiotic susceptibility profiles of Escherichia coli strains lacking multidrug efflux pump genes. Antimicrobial Agents and Chemotherapy 45:1126–1136. https://doi.org/10.1128/AAC.45.4.1126-1136.2001
Recombineering: genetic engineering in bacteria using homologous recombination. Current Protocols in Molecular Biology 106:1.16.1–1.1639. https://doi.org/10.1002/0471142727.mb0116s106
Cooperation is fleeting in the world of transposable elements. PLoS Computational Biology 2:e162. https://doi.org/10.1371/journal.pcbi.0020162
The evolutionary significance of cis-regulatory mutations. Nature Reviews. Genetics 8:206–216. https://doi.org/10.1038/nrg2063
Michael T Laub, Reviewing Editor; Massachusetts Institute of Technology, United States
In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.
Thank you for submitting your article "Complex chromosomal neighborhood effects determine the adaptive potential of a gene under selection" for consideration by eLife. Your article has been reviewed by three peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Diethard Tautz as the Senior Editor. The reviewers have opted to remain anonymous.
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
This paper uses a specially designed reporter to examine the nature of adaptation and the mutations that can drive tetracycline resistance under continual selection. The reviewers were generally enthusiastic about the work as such systematic analyses of mutational spectra during selection have the potential to inform us about evolution at a very detailed molecular level. In particular, this paper presents reasonably strong data to suggest that the local chromosomal context of a tet-resistance gene can influence its adaptation via differences in amplification potential (driven by IS elements) and read-through from other nearby genes. Despite some enthusiasm, the reviewers noted a few important weaknesses that would have to be remedied in a revised manuscript. These issues are detailed below and generally fall into three categories.
1) Some of the suggestions about how adaptation has occurred are incompletely substantiated. Additional experiments are needed to address these concerns; in most, if not all, cases, the necessary experiments are relatively straightforward.
– Results section; "All rescued populations displayed increased YFP fluorescence. Thus, rescue depended on the presence and overexpression of tetA-yfp." The second sentence doesn't logically follow from the first. The authors should test, at least in 1-2 cases, that selective deletion of TetA-YFP in the evolved clone eliminates rescue.
– "…or no upstream Rho-terminated transcripts were present (constraining adaptive Rho mutations, Figure 4A)." The notion that the Rho mutations allow read-through into tetA-YFP is interesting, but the claim that no upstream Rho-terminated transcripts exist in strains A and C needs to be substantiated. Is this information being pulled from some prior study of Rho and if so, how reliable are these data? Or are they simply predictions based on Rho site bioinformatics? If it's the latter, then qRT-PCR data supporting the claims should be generated and shown.
– Figure 3C: It wasn't clear how the evolved clones examined in a plasmid-based assay in Figure 3C were chosen. Additionally, it was unclear why 7 of the 14 mutations "reconstructed" on plasmids didn't recapitulate the evolved behavior. This point raises the important concern that the paper is ultimately about how genomic context can influence evolution, but the "reconstruction" experiments involve placing a limited genomic region onto a plasmid to drive YFP, not TetA-YFP, production. Could it be that 7 of the cases in Figure 3C failed because the genomic context has changed? This should be better discussed. In addition, the authors should, for at least a handful of the 'failed' cases in Figure 3C, do proper reconstructions by engineering the putatively causal mutation into the ancestral strain background.
2) The presentation of results could be substantially improved. There is a general lack of primary data provided and many figures lack highly relevant details on the exact nature of the genomic context of each locus, both in the original strains and the evolved isolates.
– The authors should say something in the main text, not just the Materials and methods section, about (i) the rationale for choosing the four sites of insertion that they chose and (ii) the nature/genomic context of the four insertion sites. Genomic diagrams of the four insertion sites with flanking genes, etc should also be provided in the main Figure 1. One half of these regions is shown in Figure 4, but the other half is omitted and these diagrams should come earlier in the paper.
– "We verified the presence of IS5 at the boundary of the amplicon in rescued B populations (Figure 2C)." Figure 2C appears to contain only a schematic of locus B, with no data supporting the claim that IS5 is at the boundary of the amplified region, nor is there really any primary data demonstrating or validating the notion that the cassette was amplified.
– Subsection “Properties of upstream genes determine the availability of two different types of adaptive promoter co-option mutations”: The use of a strain that is apparently free of all IS elements needs more explanation. The genome of the strain generated in Posfai et al., 2006, has actually been reduced by 15% and so lacks a wide range of genomic features. This could complicate the interpretation that the difference in mutational spectra in the two strains (Figure 3A) results from a lack of IS elements. The authors should more clearly discuss this issue and its impact on the interpretation of their results.
– In the same section: Without more details, it's difficult to assess or interpret the data in this section of the manuscript. It's not clear whether individual strains only have single mutations or multiple mutations and if it's the latter, were the individual mutations made? And while the nature of 'point mutations' is clear, it's not clear what the authors mean by mutations 'delivering full or partial promoters'. And in the cases where the reconstructed mutation does not increase the reporter, what's the explanation? Is it because there are other mutations?
3) There was a concern about whether the results depend on the precise way in which the selection was performed. The evolution is done with relatively small populations in microtiter plates that are diluted each day into fresh media whose tetracycline concentration increases by about 25% each day. Consequently, the populations are effectively in a race to increase the expression of tetA sufficiently fast to 'keep ahead' of the increasing tetracycline concentration. This suggests that even small differences in the initial expression levels of tetA at the different loci could have a large effect on the probability of survival. Thus, it seems plausible that the increased survival of clones with tetA at the B and D loci might be a result of higher initial expression at these loci. This might also explain puzzling observations like the fact that, in the IS-free case, rescue through point mutations in the promoter is observed 4 times at D but never at A or C, and the fact that rescue rates at the B locus vary across replicates even though CFP increase is observed almost without fail (Table 1). The authors should quantify initial expression levels at each locus and investigate the effect of this on the probability of rescue.
Related to this, the authors should include a panel of 96 figures (one for each well of the plate) showing the time courses of OD, YFP/OD, and CFP/OD across the 10 days. Right now the authors only report numbers of rescued and numbers with YFP or YFP+CFP increase, but it is hard to judge if such a simple classification does justice to the potentially complex dynamics that is observed across populations.
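The simple classification mentioned here (rescued or not, YFP increase, YFP+CFP increase) can be sketched for a single well as follows. This is only an illustration under stated assumptions, not the authors' analysis script: the function name `classify_population`, the OD cutoff `od_min`, and the fold-change cutoff `fold` are hypothetical, and comparing only the first and last time points deliberately ignores the intermediate dynamics that the requested time-course panels would reveal.

```python
from typing import Sequence


def classify_population(od: Sequence[float], yfp: Sequence[float],
                        cfp: Sequence[float], od_min: float = 0.1,
                        fold: float = 2.0) -> str:
    """Classify one well from its first and last daily measurements.

    od_min and fold are hypothetical thresholds chosen for illustration.
    Fluorescence is normalized by OD before computing the fold change.
    """
    if od[-1] < od_min:
        return "not rescued"
    yfp_fold = (yfp[-1] / od[-1]) / (yfp[0] / od[0])
    cfp_fold = (cfp[-1] / od[-1]) / (cfp[0] / od[0])
    if yfp_fold >= fold and cfp_fold >= fold:
        return "YFP+CFP increase"
    if yfp_fold >= fold:
        return "YFP increase only"
    return "rescued, no clear fluorescence increase"


# Made-up measurements (first day, last day) for three wells.
well_1 = classify_population([0.5, 0.6], [100, 600], [50, 300])
well_2 = classify_population([0.5, 0.6], [100, 600], [50, 60])
well_3 = classify_population([0.5, 0.01], [100, 1], [50, 1])
print(well_1, well_2, well_3, sep="\n")
```

A two-point comparison like this is exactly the kind of coarse summary the comment above questions; plotting the full OD, YFP/OD, and CFP/OD trajectories per well would show whether the categories hide more complex dynamics.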
Apart from the effect of initial expression, to what extent does the selection for 'fast increase in expression' bias the mutations that are observed to one type or another? To argue for the generality of the observations that the authors make, it would be useful to assess the rate of rescue in another setting where the type of selection is different, e.g. using a Luria-Delbrueck type experiment in which many cultures (with tetA at the different loci) are grown in media without tetracycline, with colonies counted on agar plates with tetracycline concentrations beyond the MIC. Does one again find mostly rescue of clones at B and D, even in the IS free case?
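For readers unfamiliar with this type of experiment, one standard way to analyze it is the P0 method of Luria and Delbrück, which estimates the expected number of mutations per culture from the fraction of cultures that yield no resistant colonies. The Python sketch below illustrates that textbook estimator only; it is not taken from the manuscript, the function name `p0_mutation_rate` and the example counts are hypothetical, and it assumes that each culture is plated in its entirety.

```python
import math
from typing import Sequence


def p0_mutation_rate(colony_counts: Sequence[int],
                     final_cells_per_culture: float) -> float:
    """Estimate mutations per cell per generation with the P0 method.

    The expected number of mutations per culture is m = -ln(p0), where p0
    is the fraction of cultures with zero resistant colonies. Dividing m
    by the final number of cells per culture gives the rate estimate.
    Assumes the whole culture was plated (an assumption of this sketch).
    """
    n_cultures = len(colony_counts)
    n_zero = sum(1 for count in colony_counts if count == 0)
    if n_zero == 0 or n_zero == n_cultures:
        raise ValueError("P0 method needs some, but not all, cultures without colonies")
    m = -math.log(n_zero / n_cultures)
    return m / final_cells_per_culture


# Made-up example: 20 parallel cultures, half of them without colonies.
counts = [0] * 10 + [3] * 10
rate = p0_mutation_rate(counts, final_cells_per_culture=1e8)
print(rate)
```

Comparing such estimates between strains carrying the cassette at different loci would address the question raised above without the daily dilution and increasing drug concentration of the main experiment.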
[Editors' note: further revisions were requested prior to acceptance, as described below.]
Thank you for submitting your article "Complex chromosomal neighborhood effects determine the adaptive potential of a gene under selection" for consideration by eLife. Your article has been reviewed by two peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Diethard Tautz as the Senior Editor. The reviewers have opted to remain anonymous.
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
The reviewers are still generally positive and there was a consensus that no additional experiments or analyses are critical at this stage. However, there was a continued concern, also raised in the first round of review, that the Introduction is somewhat disorienting and excessively abstract. While the reviewers appreciate the effort to place the work in a broad and general context, they also felt strongly that the paper would benefit from a more direct, concrete exposition of what specific questions were addressed, what was done, and what was learned. The use of simple, straightforward language is encouraged. It's not a matter of "dumbing" down the Introduction, but making it clear and accessible. The comments of one reviewer are provided in full below as they will be helpful as you further revise the manuscript.
As I suspected, the initial MICs of the different constructs (cassette at locus A, B, C or D) are not the same: even the initial B and D populations already have an elevated MIC. The authors explain that they attempt to normalize for this difference by treating each construct with a concentration profile that is the same relative to the original MIC of that construct. That is, the constructs at locus D will be subjected to time-dependent antibiotic concentrations that are about twice those to which the constructs at locus A are subjected (based on data presented in Figure 1—figure supplement 1).
I understand that this is the most reasonable way to correct for the differences in MIC of the initial populations, but I am not really convinced that it will now make the selective environments directly comparable. That is, it seems likely that, even if no beneficial mutations were to occur at all, substantial differences in the population sizes would remain and these might substantially affect the probability of generating beneficial mutants (i.e. just because of an elevated population size).
I think it would be good for the authors to add some discussion of this issue. I personally feel this issue is the main motivation for doing the Luria-Delbrueck like experiments (i.e. to check that the results are not specific to the precise way of performing the selection) and it might be worthwhile to mention this also.
Finally, regarding this point, although I feel that it still cannot be definitively said that the increased rescue of B and D constructs is solely due to the increased RATE of generating beneficial mutations rather than an increase in the size of the target population for such mutations, what one could definitely conclude is that, if this cassette were to be inserted in the genome through horizontal transfer, then rescue under selection with antibiotic is clearly highest when it is inserted into loci B or D, rather than at A or C.
The other comment I have is about presentation. I still find the introductory sections rather disorienting regarding what this paper is going to actually show. My impression is that this is due to a choice of presentation style which seems to aim to present the questions that the work addresses in as general and abstract terms as possible, leaving it typically up to the reader to connect the general abstract description to the actual concrete things that are being done. I find this rather tiresome to read. I believe the paper would become much more pleasant to read if the authors simply removed all attempts at phrasing large general abstract questions about evolution, and instead just say what exactly they do and what they find. The readers will then themselves immediately appreciate the generality of these findings.
(As an aside, I found the start of paragraph two in subsection “Different mutation types and their rate biases” a particularly striking example of general abstract phrasing that is almost impossible to parse, e.g. I have no idea what it means for mutation types to sustain adaptations.)
In the end it seems to me that the authors have a rather straight-forward and simple message: The rate at which genetic mutations occur that increase the expression level of a gene depends strongly on features of the genomic context of the gene, i.e. the rate at which the region is duplicated, the rate at which deletions or insertions cause promoter sequences to be placed upstream of the gene, and the rate of mutations causing read-through from neighboring genes. In contrast, apart from the read-through mutations, the EFFECT of the observed mutations, once introduced, is largely independent of this chromosomal context (as the experiments with the plasmids show).
Why not stress rather than obscure that these are the concrete results? https://doi.org/10.7554/eLife.25100.042
No external funding was received for this study.
We thank M Ackermann, N Barton, J Bollback, J Jäger, members of the Barton, Bollback, Bollenbach and Guet groups, and especially M Pleška for comments on earlier versions of the manuscript. We thank F Korč for assistance with the data analysis script.
- Michael T Laub, Reviewing Editor, Massachusetts Institute of Technology, United States
- Received: January 12, 2017
- Accepted: June 15, 2017
- Version of Record published: July 25, 2017 (version 1)
© 2017, Steinrueck et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.