Primary and promiscuous functions coexist during evolutionary innovation through whole protein domain acquisitions

  1. José Antonio Escudero
  2. Aleksandra Nivina
  3. Harry E Kemble
  4. Céline Loot
  5. Olivier Tenaillon
  6. Didier Mazel (corresponding author)
  1. Institut Pasteur, Unité de Plasticité du Génome Bactérien, Département Génomes et Génétique, France
  2. CNRS, UMR3525, France
  3. Molecular Basis of Adaptation, Departamento de Sanidad Animal, Facultad de Veterinaria, Universidad Complutense de Madrid, Spain
  4. VISAVET Health Surveillance Centre. Universidad Complutense Madrid. Avenida Puerta de Hierro, Spain
  5. Université Paris Descartes, Sorbonne Paris Cité, France
  6. Infection, Antimicrobials, Modelling, Evolution, INSERM, UMR 1137, Université Paris Diderot, Université Paris Nord, France
15 figures, 1 table and 7 additional files


Introduction and outline of this work.

(a) Diagram of an integron. The stable platform of integrons contains the integrase-coding gene (intI), the promoters dedicated to the expression of the integrase and the cassettes (Pint and Pc, respectively), and the attI site (blue triangle). The variable part is composed of cassettes, encoding genes of different functions (arrows facing right) and their cognate attC sites (red and turquoise triangles). By recombining different sites, integrases can incorporate (attI x attC), excise (attC x attC), and reshuffle cassettes (excision followed by integration reactions) within the platform. (b) Recombination reactions in integrons can follow two distinct pathways: a replicative one when at least one substrate is an attC site (the modern activity of integrases), or the classical pathway with a second strand exchange when recombining exclusively attI sites. This classical pathway is considered the ancestral activity of integrases because of its resemblance to that of other Y-recombinases. (c) Structural models of E. coli XerC and the class 1 integron integrase IntI1, in which the I2 domain is marked in red. Models were obtained using Phyre2. (d) Schematic representation of the setup followed in this work and the results (in the form of mutations and gain in attI x attI activity) obtained at each step.

First round of enrichment cycles.

Results are shown for intI1 (blue), alt1 (green), and alt2 (brown). (a) List of mutations found among the three alleles after enrichment cycles. Substitutions not reachable from the intI1 code are highlighted in red. Substitutions marked in blue are similar to those obtained with intI1, validating the new selection cycles. Mutations selected for the next round are marked with an asterisk (n.t.: not tested). (b) Fold increase in attI x attI recombination of all haplotypes. Bars represent average values of at least three independent experiments. Error bars show standard error. Activity of each mutant was compared to that of the parental allele. Statistically significant differences are indicated by * (alpha = 0.05) and non-significance by ‘ns’.

Construction of alleles with eight mutations.

(a) Fold change in recombination rates for attI x attI, attC x attC, and attI x attC of intermediate steps in the construction of intI18mut containing a variety of combinations of mutations. Bars represent average values of at least three independent experiments. Error bars show standard error. Mutations are indicated only on top of blue bars, but mutants are depicted in the same order in the three graphs. Arrows and the '+' symbol in 5x, 6x, and 8x mutants mean that the mutations shown are added to those in the previous mutant. Strain numbers are found in Supplementary file 2. (b) Negative correlation between the activity in attI x attI reactions and that in attC x attC reactions (above) (Spearman test, r = −0.945, P (two-tailed) < 0.0001; log-log regression R²: 0.973) and attI x attC reactions (below) (Spearman test, r = −0.654, P (two-tailed) = 0.0336; log-log regression R²: 0.13). (c) Expected and observed increases in recombination rates among mutants containing subsets of the mutations in intI18mut. Expected values are obtained from a multiplicative model in the absence of epistasis. The lower recombination rates observed are indicative of negative epistatic interactions among mutations.
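The multiplicative expectation mentioned in panel (c) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the fold-change values are hypothetical, not taken from the article, and the log10 ratio is just one common way to summarise the deviation from the no-epistasis expectation.

```python
import math

def expected_fold_change(single_effects):
    """Expected fold change of a multi-mutant if single-mutation
    effects simply multiply (no epistasis). Needs Python 3.8+ for math.prod."""
    return math.prod(single_effects)

def log10_deviation(observed, single_effects):
    """log10(observed / expected). Negative values mean the combined
    mutant does worse than the multiplicative model predicts."""
    return math.log10(observed / expected_fold_change(single_effects))

# Hypothetical fold increases for three single mutations (illustrative only).
singles = [4.0, 5.0, 2.5]
expected = expected_fold_change(singles)

# Hypothetical observed fold increase for the triple mutant.
observed = 20.0
deviation = log10_deviation(observed, singles)

print(f"expected {expected:g}-fold, observed {observed:g}-fold, "
      f"log10 deviation {deviation:.2f}")
```

With these made-up numbers the observed value falls below the expected one, which is the pattern the caption describes as negative epistasis.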

Second round of mutagenesis and enrichment cycles.

(a) List of mutations found among the three 8mut alleles after four cycles. Substitutions that are not reachable from the intI18mut code are highlighted in red. (b) Fold increase in recombination rates for the attI x attI reaction of all mutants compared to their parental 8mut allele. Bars represent average values of at least three independent experiments. Error bars show standard error. Statistically significant results are indicated by * (alpha = 0.05). (c) Analysis of the mutations found in the alt18mut mutant C122. Of the two non-synonymous mutations, only E103K produces a relevant increase in activity. (d) Mutational pathways in our experiments highlight the importance of broad sequence space exploration. Mutation S173R, conferring a high activity gain, was only found in the alt1 code because the intI1 and alt2 codes needed an additional mutation to access that codon. A glycine to aspartic acid substitution (G320D) appeared in the intI1 allele in the first round of experiments and was included in both alt8mut alleles using two different codons. It then re-evolved to asparagine (D320N), a residue that was not reachable from any of the starting codons with a single mutation.
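The notion of a residue being "reachable" from a codon by a single mutation can be made concrete with a short Python sketch. It assumes the standard genetic code and checks all glycine, aspartate, and serine codons generically. The actual codons used in the intI1, alt1, and alt2 alleles are not given here, so the codons below are illustrative, and `reachable_residues` is a hypothetical helper name.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def reachable_residues(codon):
    """Residues (or '*') encoded by codons exactly one nucleotide
    substitution away from `codon`, excluding synonymous changes."""
    residues = set()
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                residues.add(CODON_TABLE[mutant])
    residues.discard(CODON_TABLE[codon])
    return residues

gly_codons = [c for c, aa in CODON_TABLE.items() if aa == "G"]
asp_codons = [c for c, aa in CODON_TABLE.items() if aa == "D"]
ser_codons = [c for c, aa in CODON_TABLE.items() if aa == "S"]

# Asparagine (N) is one step from every aspartate codon but from no glycine codon.
asn_from_gly = [c for c in gly_codons if "N" in reachable_residues(c)]
asn_from_asp = [c for c in asp_codons if "N" in reachable_residues(c)]

# Arginine (R) is one step away only from some serine codons.
arg_from_ser = sorted(c for c in ser_codons if "R" in reachable_residues(c))

print("Gly codons reaching Asn:", asn_from_gly)
print("Asp codons reaching Asn:", asn_from_asp)
print("Ser codons reaching Arg:", arg_from_ser)
```

Under the standard code, this is consistent with the caption: G to D to N requires two successive steps, and whether S to R is a single step depends on which serine codon an allele happens to use.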

Epistasis purification.

(a) Fold change in recombination rates (relative to intI1) for the attI x attI, attC x attC, and attI x attC reactions of mutants obtained after epistasis purification (intI18mut is also shown for clarity). Gains in attI x attI activity entail asymmetrical losses for attC x attC. Bars represent average values of at least three independent experiments. Error bars show standard error. Statistically significant differences are indicated by * (alpha = 0.05) and non-significance by ‘ns’. (b) Allele-ID (identity of the residues at the 13 positions of interest) of the most prevalent mutants after epistasis purification.

Mutation dynamics in the library.

(a) Logos showing allele-IDs reflecting the residue composition at each position of interest. Logos are shown for the wild type protein (IntI1), the expected and observed composition of the library before the cycles, and the final composition after six enrichment cycles. (b) Fitness of single mutations compared to the background mutational composition of the library. (c) Fitness effects of the V315A and A321G mutations show sign epistasis between both mutations. (d) Fitness of double mutants. Plot representation of expected vs. observed fitness. The good linear correlation suggests a lack of diminishing returns among highly adaptive mutations. Purple dots represent double mutants in which epistasis values are statistically significant (red squares in panel e). (e) Heat map representing epistasis between mutations in all positions of interest. P-values for each analysis are indicated within squares. Statistical significance is represented with red boxes (95% confidence interval) and red, bold numbers (99% confidence interval).
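As a rough illustration of how pairwise epistasis and sign epistasis are typically quantified, the sketch below works on a log-fitness scale where the reference genotype sits at 0 and the no-epistasis expectation for a double mutant is the sum of the two single effects. The fitness measure actually used in the article is not specified in these captions, so the scale, the function names, and all numbers are assumptions made for the example.

```python
def pairwise_epistasis(w_a, w_b, w_ab):
    """Observed minus expected log-fitness of the double mutant,
    where the expectation is additive on the log scale."""
    return w_ab - (w_a + w_b)

def shows_sign_epistasis(w_a, w_b, w_ab):
    """True if either mutation is beneficial in one background and
    deleterious in the other (its effect changes sign)."""
    effect_a_alone, effect_a_with_b = w_a, w_ab - w_b
    effect_b_alone, effect_b_with_a = w_b, w_ab - w_a
    return (effect_a_alone * effect_a_with_b < 0
            or effect_b_alone * effect_b_with_a < 0)

# Hypothetical log-fitness values relative to the reference (illustrative only).
additive_case = (0.5, 0.25, 0.75)    # double mutant matches the expectation
sign_case = (0.5, 0.25, 0.125)       # double mutant is worse than either single

print(pairwise_epistasis(*additive_case), shows_sign_epistasis(*additive_case))
print(pairwise_epistasis(*sign_case), shows_sign_epistasis(*sign_case))
```

In the second toy case each mutation is beneficial alone but deleterious once the other is present, which is the qualitative pattern that sign epistasis refers to.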

Mutant levels of ancestral and modern activities.

Recombination frequency for attI x attI and attC x attC of mutants. Some haplotypes fall within the upper part of the graph, suggesting the possibility of asymmetric trade-offs between functions fostering innovation.

Appendix 1—figure 1
Diagram of the enrichment cycles used to select for integrase variants hyperactive for attI x attI.

(a) Cycles adapted from Demarre et al., 2007 used for the evolution experiments with intI1. Briefly, a randomized library of integrase-encoding genes cloned in a pBAD plasmid (A) is established in a toxin-resistant E. coli strain containing a pSU plasmid that encodes an attI site and a toxin (1). This strain acts as recipient in the conjugation of an attI-bearing suicide plasmid (2). Mutagenized integrases carry out the attI x attI recombination reaction, allowing the formation of a selectable cointegrate (3). pBAD plasmids are then purified in two steps: the plasmid preparation from recombinants (4) is digested to specifically degrade cointegrates (5) and then transformed into a toxin-sensitive strain (6). Plasmid extraction from this strain yields very pure pBAD plasmids containing a variety of integrase-coding genes of high activity (B). Higher selective pressure is applied by subjecting these plasmids to further cycles. (b) Novel enrichment cycles used for the evolution of the rest of the alleles. Briefly, the library of integrase mutants cloned in a pBAD plasmid (A) is established in a dap- E. coli strain that contains a dapA gene in the chromosome interrupted by two attI sites (1). Expression of the integrase leads to the recombination of both sites and the reconstitution of dapA, allowing recombinants to grow in media not supplemented with DAP (2). Plasmid preparations from recombinants (3) yield pure pBAD plasmids containing a mixture of hyperactive-integrase-coding genes (B) that can be used in subsequent cycles.

Appendix 1—figure 2
Recombination activity for the attI x attC reaction of evolved alleles from alternative codes, relative to their parental alt allele.
Appendix 1—figure 3
Methods used for evaluating the efficiency of integrase alleles.

(a) Scheme of conjugative and chromosomal tests. In both cases the recombination between attI sites produces a selectable phenotype. The dynamic range of each assay is different and so are the results for a given integrase allele. There is a drop of approximately two orders of magnitude between the conjugative and the chromosomal test. The chromosomal assay allowed us to determine the efficiency of highly active mutants that were close to the upper limit (10⁻¹) of the conjugative test. (b) Example of recombination rates for intI1 and intI18mut measured in both assays.

Appendix 1—figure 4
Recombination activity for the attI x attI reaction of subsets of mutations found in alleles C325 and C326.
Appendix 1—figure 5
Increase in recombination rates for the epistasis purification library.

Note that for technical reasons these data are not comparable to single mutant measurements.

Appendix 1—figure 6
Fold increase in double-strand exchange recombination of mutants from the epistasis purification experiment.
Appendix 1—figure 7
Haplotype dynamics during library cycles.

(a) Distribution of mutations along the intI1 sequence of all alleles in the final cycle. (b) Representation of the number of different alleles found at the initial time point and at cycles 3 and 6. The decrease in allele variety is proof of selection. (c) Frequency along the cycles of the 15 most frequent alleles at cycle 6. (d) Evolution of the residues at each position of interest during the experiment.
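The summaries in panels (b) and (c), the number of distinct alleles per cycle and the frequency of the most common ones, amount to simple counting. The sketch below shows one way to do it with `collections.Counter`. The allele-ID strings and cycle labels are invented toy data, and `allele_summary` is a hypothetical helper, not the analysis pipeline used in the article.

```python
from collections import Counter

def allele_summary(allele_ids, top_n=15):
    """Return (number of distinct alleles, [(allele, frequency), ...])
    for the `top_n` most frequent alleles in one sequenced sample."""
    counts = Counter(allele_ids)
    total = sum(counts.values())
    top = [(allele, n / total) for allele, n in counts.most_common(top_n)]
    return len(counts), top

# Toy allele-IDs per cycle (illustrative only; real allele-IDs span 13 positions).
samples = {
    "cycle 0": ["AAB", "ABB", "CAB", "AAC", "BBC", "CCA"],
    "cycle 3": ["AAB", "AAB", "ABB", "CAB", "AAB", "ABB"],
    "cycle 6": ["AAB", "AAB", "AAB", "AAB", "ABB", "AAB"],
}

summaries = {cycle: allele_summary(ids, top_n=2) for cycle, ids in samples.items()}
for cycle, (n_distinct, top) in summaries.items():
    print(cycle, "distinct alleles:", n_distinct, "top:", top)
```

In this toy data the number of distinct alleles shrinks from cycle to cycle while one allele rises in frequency, mirroring the kind of trend the caption interprets as selection.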

Appendix 1—figure 8
Difference in fitness of double mutants compared to the wild type.


Key resources table
Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information
Strain, strain background (Escherichia coli) | DH5α | Lab strain | | supE44 ΔlacU169 (Φ80lacZ’ ΔM15) ΔargF hsdR17 recA1 endA1 gyrA96 thi-1 relA1
Strain, strain background (Escherichia coli) | UB5120 | PMID:2157593 | | F- pro met recA56 gyrA [NalR]
Strain, strain background (Escherichia coli) | β2163 | PMID:15748991 | | MG1655::ΔdapA::(erm-pir) RP4-2-Tc::Mu [KmR]
Strain, strain background (Escherichia coli) | B36 | PMID:26961432 | | MG1655 ΔdapA recA269::Tn10 attB::attI1WT-attI1STOP-dapA [SpR]
Strain, strain background (Escherichia coli) | 4137 | PMID:16341091 | | β2163/pSW23T::attCaadA7
Strain, strain background (Escherichia coli) | 2714 | PMID:15748991 | | β2163/pSW23T::attI1
Strain, strain background (Escherichia coli) | H203 | Strain from this work. Plasmid p6944 from PMID:19730680 | | β2163/p6944 (pSW23T::attCaadA7-VCR2/1)
Recombinant DNA reagent | pBAD18 | PMID:7608087 | | pBAD18
Recombinant DNA reagent | p6944 | PMID:19730680 | | pSW23T::[Ptac]-attCaadA7lacIq-VCR2-pir116*(BOT)
Recombinant DNA reagent | p2714 | PMID:15748991 | | pSW23T::attI1
Recombinant DNA reagent | p929 | PMID:15716446 | | pSU38Δ::attI1

Additional files

Supplementary file 1

Mutations found in enrichment cycles.

Supplementary file 2

Bacterial strains used in this work.

Supplementary file 3

Plasmids and oligonucleotides used in this work.

Supplementary file 4

Distribution, in single and double mutants, of mutations in the remaining loci of interest.

Supplementary file 5

Observed and simulated values of fitness distribution for single mutants.

Supplementary file 6

Histograms representing, for every mutation pair, the apparent epistasis measured for simulated data (for 1000 simulations in the absence of epistasis). Red lines show epistasis measured from the real data.

Transparent reporting form

Cite this article

  1. José Antonio Escudero
  2. Aleksandra Nivina
  3. Harry E Kemble
  4. Céline Loot
  5. Olivier Tenaillon
  6. Didier Mazel
Primary and promiscuous functions coexist during evolutionary innovation through whole protein domain acquisitions
eLife 9:e58061.