Non-allelic gene conversion enables rapid evolutionary change at multiple regulatory sites encoded by transposable elements

  1. Christopher E Ellison
  2. Doris Bachtrog (corresponding author)
  1. University of California, Berkeley, United States
6 figures


Figure 1 with 1 supplement
TE-derived MSL recognition element (MRE) motifs from the neo-X chromosome of Drosophila miranda.

(A) The MSL recognition element (MRE) motif plus 20 basepairs of flanking sequence were extracted from all 77 ISX transposable elements located on the neo-X chromosome in the MSH22 reference genome assembly. The multiple sequence alignment of these 77 sequence regions (arranged from top to bottom in the order in which they are found on the chromosome) shows that there is sequence variation among elements both within and adjacent to the 21-basepair MRE motif. Each variant has been classified as ancestral or derived based on its frequency in the ISX progenitor element, ISY. The derived allele frequency for each variant in this region is shown for ISX as well as 139 ISY elements from the neo-X chromosome (see Figure 1—figure supplement 1 for ISY alignment). Red arrows point to the derived TT haplotype that is common among ISX elements but rare in ISY. (B) Barplot showing the frequencies of all haplotypes at the GA/TT sites, for ISY and ISX elements separately. Two haplotypes are present within ISX elements (GA and TT) and the two alleles within each haplotype are in perfect linkage disequilibrium. In contrast, the majority of ISY elements harbor the GA haplotype, but these two alleles are not in perfect linkage disequilibrium among ISY elements. Rather, five additional allelic combinations are present at low frequencies in this location among ISY but not ISX elements.
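As a rough illustration of the haplotype tally and linkage disequilibrium comparison described in panel B, the sketch below counts two-site haplotypes in an alignment and computes the standard r² statistic. The function names, column indices, and toy sequences are hypothetical and are not the authors' data or pipeline.

```python
from collections import Counter

def haplotype_frequencies(seqs, i, j):
    """Return {two-letter haplotype: frequency} for alignment columns i and j."""
    counts = Counter(s[i] + s[j] for s in seqs)
    total = sum(counts.values())
    return {hap: n / total for hap, n in counts.items()}

def r_squared(seqs, i, j, a, b):
    """LD r^2 between allele a at column i and allele b at column j.

    Assumes both sites are polymorphic; otherwise the denominator is zero.
    """
    n = len(seqs)
    p_a = sum(s[i] == a for s in seqs) / n
    p_b = sum(s[j] == b for s in seqs) / n
    p_ab = sum(s[i] == a and s[j] == b for s in seqs) / n
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Toy alignments (made up): columns 1 and 2 stand in for the GA/TT sites.
isx_like = ["CGAC", "CTTC", "CTTC", "CGAC"]                   # only GA and TT
isy_like = ["CGAC", "CGAC", "CGTC", "CTAC", "CGAC", "CTTC"]   # extra combinations

print(haplotype_frequencies(isx_like, 1, 2))
print(r_squared(isx_like, 1, 2, "T", "T"))   # perfect LD gives r^2 = 1
print(r_squared(isy_like, 1, 2, "T", "T"))   # mixed haplotypes give r^2 < 1
```

When only the GA and TT combinations occur, each allele at the first site predicts the allele at the second, so r² equals 1. Any additional combination (GT, TA, and so on) pulls r² below 1, which is the contrast the caption draws between ISX and ISY.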
Figure 1—figure supplement 1
Alignment of ISY elements from the D. miranda MSH22 genome assembly.

139 ISY elements from the MSH22 neo-X chromosome were identified and 200 basepairs from their 5′ flanks were aligned. The black arrows point to the sites where the derived ‘T’ alleles are common among ISX elements. In contrast, only a single ISY element from the neo-X chromosome harbors the TT haplotype.
Figure 2
The TT haplotype increases MSL binding affinity.

(A) MSL3 ChIP-seq data from D. miranda strain MSH22 show that the ISX insertions carrying the TT haplotype recruit significantly higher levels of MSL complex than those with the GA haplotype (Wilcoxon test p = 0.01). (B) Engineered ISX elements that differ only with respect to the TT haplotype bind different levels of MSL complex. There is an epistatic interaction between the two ‘T’ alleles such that separately, they decrease MSL complex binding relative to the ancestral allele, but together in the TT haplotype, they increase MSL complex binding (Wilcoxon test p = 0.028 for both comparisons [GT vs TT and TA vs TT]). The rectangles and error bars show the average and standard deviation of values from four biological replicates for each condition.
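The caption does not say which Wilcoxon variant was used for panel B. Assuming a two-sided exact rank-sum test on four versus four replicates with no ties, the sketch below enumerates every possible rank assignment. The function name and the binding values are hypothetical.

```python
from itertools import combinations

def exact_rank_sum_p(x, y):
    """Two-sided exact Wilcoxon rank-sum p-value, assuming no tied values."""
    pooled = sorted(x + y)
    rank = {v: r for r, v in enumerate(pooled, start=1)}
    n, m = len(x), len(y)
    observed = sum(rank[v] for v in x)
    expected = n * (n + m + 1) / 2          # mean rank sum under the null
    dev = abs(observed - expected)
    # Every way of assigning n of the n + m ranks to the first group.
    sums = [sum(c) for c in combinations(range(1, n + m + 1), n)]
    extreme = sum(abs(s - expected) >= dev for s in sums)
    return extreme / len(sums)

# Made-up binding values: four replicates per construct, fully separated.
single_t = [0.8, 0.9, 1.0, 1.1]
double_t = [2.0, 2.2, 2.4, 2.5]
print(exact_rank_sum_p(single_t, double_t))   # 2 of 70 assignments
```

Under these assumptions, complete separation of two groups of four gives 2/70 ≈ 0.0286, the smallest two-sided p-value this test can return at that sample size. That would be consistent with both comparisons in the caption reporting the same value.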
Figure 3 with 1 supplement
ISX variation among wild lines of D. miranda.

For each ISX insertion identified within the D. miranda MSH22 reference genome assembly (alignment shown at left, see also Figure 1), we characterized sequence variation across D. miranda individuals. The TT haplotype (magenta lines) was fixed across individuals at 30% of insertions (see example alignment, top right), polymorphic at 41% of insertions (example shown middle right), and absent at 29% of insertions (bottom right). Allele sharing between insertions occurs at sites other than the TT haplotype, but these sites tend to be shared across fewer insertions (see heatmap, bottom right). Figure 3—figure supplement 1 shows the population alignment across all ISX insertions on the neo-X.
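The fixed, polymorphic, and absent categories used above can be sketched as a simple per-insertion classification. The insertion names and haplotype calls below are hypothetical placeholders, not the study's genotypes.

```python
from collections import Counter

def classify_insertion(haplotypes, derived="TT"):
    """Label one insertion from the haplotypes observed across individuals."""
    n_derived = sum(h == derived for h in haplotypes)
    if n_derived == 0:
        return "absent"
    if n_derived == len(haplotypes):
        return "fixed"
    return "polymorphic"

# Hypothetical calls: insertion name -> haplotype carried by each individual.
calls = {
    "ins_01": ["TT", "TT", "TT"],
    "ins_02": ["TT", "GA", "TT"],
    "ins_03": ["GA", "GA", "GA"],
    "ins_04": ["GA", "TT", "GA"],
}

tally = Counter(classify_insertion(h) for h in calls.values())
shares = {label: n / len(calls) for label, n in tally.items()}
print(shares)
```

Applied to real genotype calls, the same tally would yield the proportions reported in the caption.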
Figure 3—figure supplement 1
Shared polymorphism across sixty-nine ISX insertions.

The 5′ 200 basepairs of the ISX element were assembled for each of 69 ISX insertions, from an average of 20 individuals per insertion. Each stripe corresponds to the population data for a given insertion and nucleotides are colored as in Figure 1. Solid lines point to columns of the alignment containing polymorphisms that are shared between multiple ISX insertions. For these columns, the heatmap is shaded to reflect the degree of allele-sharing, which ranges from 3% of insertions to 41% of insertions. The ‘T’ letters under the heatmap mark the location of the TT haplotype shown in Figure 1.
Figure 4 with 1 supplement
Population frequency of TT haplotype across ISX insertions.

The locations of all ISX elements on the D. miranda neo-X chromosome, as inferred from the MSH22 reference genome assembly, are shown by vertical green bars. The derived TT haplotype (frequency shown in red) is polymorphic at 27 of 66 ISX insertions, a pattern consistent with non-allelic gene conversion.
Figure 4—figure supplement 1
ISX genotype across insertions and individuals.

Heatmap showing each of the 27 ISX insertions (rows) where the TT haplotype is polymorphic among individuals. Columns show the genotype of each individual, for each of these insertions. Each individual has a mixture of TT and GA ISX insertions, suggesting that TT polymorphism among lines is not due to population subdivision.
Figure 5
Selection shapes patterns of variation at the TT haplotype.

(A) Haplotype diversity across all ISX sequences. Assembled ISX contigs were combined for all insertions and individuals. The 25 basepairs flanking each side of the TT region were extracted from a total of 1291 sequences and split into two groups based on whether they contained the TT or GA haplotype. Haplotype diversity was then calculated for each group. The difference between groups is significantly larger than expected by chance (resampling p < 0.001), with the sequences containing the TT haplotype having lower haplotype diversity than those containing the GA haplotype. (B) Nucleotide diversity across all ISX sequences. We compared nucleotide diversity for ISX insertions where all individuals carried the ancestral GA haplotype to those where the derived TT haplotype was fixed. ISX insertions that are fixed for the TT haplotype have significantly reduced nucleotide diversity compared to insertions fixed for the GA haplotype (one-sided Wilcoxon test p = 0.035). (C) Allele-frequency spectrum across ISX sequences. The allele frequency spectrum was calculated separately for TT- and GA-carrying ISX elements, across all insertions and individuals, using the first 200 basepairs of ISX sequence. Consistent with incomplete hitchhiking under positive selection, the TT frequency spectrum shows an excess of high frequency derived alleles, compared to the GA spectrum (resampling p = 0.027).
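The diversity statistics in panels A and B can be sketched with their standard definitions: Nei's haplotype diversity and mean pairwise differences per site. The resampling step shown here is a generic label-shuffling permutation test, which is an assumption, since the caption does not describe the authors' resampling scheme. All function names and sequences are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations

def haplotype_diversity(seqs):
    """Nei's H = n/(n-1) * (1 - sum(p_i^2)) over distinct sequences."""
    n = len(seqs)
    freqs = [c / n for c in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))

def nucleotide_diversity(seqs):
    """Mean pairwise differences per site among equal-length sequences."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * len(seqs[0]))

def permutation_p(group_a, group_b, stat, n_perm=2000, seed=0):
    """One-sided p: share of label shuffles with stat(b) - stat(a) >= observed."""
    rng = random.Random(seed)
    observed = stat(group_b) - stat(group_a)
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        a, b = pooled[:len(group_a)], pooled[len(group_a):]
        if stat(b) - stat(a) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)   # add-one correction keeps p above 0

# Made-up flanking sequences: the TT group is nearly uniform, the GA group is not.
tt_flanks = ["ACGT"] * 6 + ["ACGA"] * 2
ga_flanks = ["ACGT", "ACGA", "TCGT", "ACCT", "AGGT", "ACGG", "TCGA", "ACCA"]

print(haplotype_diversity(tt_flanks), haplotype_diversity(ga_flanks))
print(nucleotide_diversity(tt_flanks), nucleotide_diversity(ga_flanks))
print(permutation_p(tt_flanks, ga_flanks, haplotype_diversity))
```

The permutation test asks how often a random split of the pooled sequences produces a diversity gap at least as large as the one observed between the TT and GA groups. A small p-value indicates the reduced diversity in the TT group is unlikely to arise from grouping alone.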
Figure 6
Non-allelic gene conversion spreads refining mutations among TE-derived MSL recognition motifs.

Shared polymorphism of the TT haplotype among ISX insertions suggests a model where a mutation that refines regulatory activity arose once at a single TE-derived regulatory element, and spread across elements via non-allelic gene conversion. Over evolutionary time, such a mutation spreads in two dimensions: horizontally among TE-derived regulatory elements and vertically through the population, until it is fixed across elements and across individuals. The TT haplotype is at the midpoint of this process. Across ISX insertions, it is fixed, absent, and polymorphic, in approximately equal proportions.

  1. Christopher E Ellison
  2. Doris Bachtrog
Non-allelic gene conversion enables rapid evolutionary change at multiple regulatory sites encoded by transposable elements
eLife 4:e05899.